Finding the size of directory with multiply hardlinked files

Dean S. Messing deanm at sharplabs.com
Sun Mar 16 08:04:20 UTC 2008


Patrick O'Callaghan wrote:
>
< lots of stuff snipped; if interested, read the thread :-) >
>
> > I think the solution requires simultaneous knowledge of both
> > directories so that two files, one in backup_A and one in backup_B
> > that are hardlinked together are counted just once.
> 
> You're right of course. In fact just after posting I thought "shouldn't
> that be (du A+B)-(du A)?" but then I saw Roberto's solution so I left it
> at that.

I don't think (du A+B)-(du A) quite works either, unless by "A" and "B"
you mean each of the _individual files_ inside backup_A and backup_B.

As in my previous example, if backup_A and backup_B are entirely
independent directories with no cross-hardlinks, then the correct
result is sizeof[backup_A + backup_B].  When cross-links do exist, the
size of each cross-linked file must be subtracted once from that sum,
so that the file is counted once rather than twice.

So 'du -s -c backup_A backup_B' must, for each file found in backup_B, look to
see if it is multiply linked and then check whether another link to the same
inode is in backup_A.  Note that a file in backup_B might be "new" relative to A
but linked "forward" to a more recent backup_C that is not included on the
command line.  So merely looking at the number of links is insufficient.
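
To make that bookkeeping concrete, here is a minimal Python sketch. It is
not how `du' itself is implemented, just an illustration of the idea. The
function name combined_size is made up for this example. It also sums
apparent sizes (st_size) of non-directory entries only, whereas `du' reports
disk blocks and includes the directories themselves, so the numbers will not
match `du' exactly.

```python
import os

def combined_size(*roots):
    """Sum file sizes under the given roots, counting each inode once.

    Hypothetical helper for illustration: a file is identified by its
    (device, inode) pair, so hardlinks between the roots are counted
    only the first time they are seen.
    """
    seen = set()   # (st_dev, st_ino) pairs already counted
    total = 0
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                # lstat: do not follow symlinks out of the tree
                st = os.lstat(os.path.join(dirpath, name))
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue   # another link to a file already counted
                seen.add(key)
                total += st.st_size
    return total
```

Calling combined_size("backup_A", "backup_B") counts a file hardlinked
between the two directories once.  A file in backup_B whose only other link
lives in backup_C has a link count of 2, yet it is still counted in full,
because its inode was never seen under backup_A.  That is why the check is
on the inode actually encountered and not on st_nlink.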

`du' has some nice smarts.

Dean



