Sometimes checking the size of a whole directory is not enough and you need more detail. For example, the whole directory takes 3GB, but how much of that is taken by the .avi files inside? The most flexible way to match files (or directories) is the find command, so let’s build the solution on it. For the actual summing of file sizes we can use du.

Consider the following directory. The numbers in brackets are the sizes in bytes.

% tree -s filesizes    
|-- [        116]  config.xml
|-- [        290]  config.yml
|-- [       4096]  dir1
|   |-- [       4096]  backup
|   |   `-- [      20583]  backup.tar.gz
|   |-- [          5]  blah space.yml
|   |-- [       2858]  script.php
|   `-- [       2858]
|-- [       4096]  dir2
|   `-- [       4096]  backup
|       `-- [      20583]  backup.tar.gz
`-- [       4096]  dir3
    `-- [       4096]  backup
        |-- [       4096]  backup
        |   `-- [      20583]  backup.tar.gz
        `-- [      20583]  backup.tar.gz
7 directories, 9 files

Size of the whole directory (the -b option reports apparent sizes in bytes rather than disk usage in blocks):

% du -sb filesizes
121227   filesizes
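
To see what -b changes, here is a minimal sketch on a throwaway file. It assumes GNU du; the apparent_size helper and the five.txt file are made up for illustration and are not part of the example directory.

```shell
# Hypothetical helper: apparent size of one file in bytes (GNU du assumed).
apparent_size() {
    du -b "$1" | cut -f1
}

demo=$(mktemp -d)                    # scratch directory for the demo
printf 'hello' > "$demo/five.txt"    # a file holding exactly 5 bytes

apparent_size "$demo/five.txt"       # apparent size in bytes
du -B1 "$demo/five.txt" | cut -f1    # allocated disk space in bytes; depends on the filesystem
```

The first command reports the 5 bytes actually stored in the file, while the second typically reports a multiple of the filesystem block size. That is why the examples here stick to -b.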

Total size of selected files

Let’s calculate the total size of all the yml files. They are stored in different directories and one of them has a space in its name (blah space.yml). The find part of the command is straightforward:

% find filesizes -type f -name '*yml'
filesizes/dir1/blah space.yml
filesizes/config.yml

This finds all files (-type f) whose names end with yml (-name '*yml') under the filesizes directory. The directory is the first argument; if it is omitted, find starts from the current directory.
To combine it with du, we change the output of find so that file names are separated with the NUL character (-print0) instead of the default newline, which keeps names containing whitespace intact. We then pipe the list to du and tell it to read NUL-separated items from standard input (--files0-from=-):

% find filesizes -type f -name '*yml' -print0 | du -cb --files0-from=-
5    filesizes/dir1/blah space.yml
290  filesizes/config.yml
295  total

The -c switch makes du print the last line with the grand total.
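
If only the grand total is needed, for example in a script, the pipeline can be wrapped in a small function. This is a sketch assuming GNU find and GNU du; the function name total_size_of_files and the demo files are hypothetical.

```shell
# Hypothetical helper: print only the grand total (in bytes) of all regular
# files under directory $1 whose names match the pattern $2.
# Assumes GNU find and GNU du (-print0, -b, --files0-from).
total_size_of_files() {
    find "$1" -type f -name "$2" -print0 |
        du -cb --files0-from=- |
        tail -n 1 |   # keep only the "total" line
        cut -f1       # du separates size and name with a tab
}

# Demo on a scratch directory
demo=$(mktemp -d)
mkdir -p "$demo/dir1"
printf 'abcde'      > "$demo/dir1/blah space.yml"   # 5 bytes
printf '0123456789' > "$demo/config.yml"            # 10 bytes
printf 'xyz'        > "$demo/config.xml"            # 3 bytes, not matched

total_size_of_files "$demo" '*.yml'
```

The pattern is quoted so that the shell passes it to find unexpanded, and the file name containing a space goes through the pipeline intact thanks to the NUL separators.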

Total size of selected directories

Let’s try something a (little) bit trickier: calculating the total size of all sub-directories matched by a pattern. In our example, I would like to know the total size of all backup directories. Notice that dir3 has a backup sub-directory, which in turn has its own backup sub-directory. We don’t want to count those as two separate entries and add both to the sum by accident, as that would inflate the total (the backup directories could even appear bigger than the whole top-level directory!).

% find filesizes -type d -name backup -prune -print0 | du -csb --files0-from=-
24679    filesizes/dir1/backup
49358    filesizes/dir3/backup
24679    filesizes/dir2/backup
98716    total

This time we are finding only directories (-type d), and once we find a match we stop and don’t descend into its sub-directories (-prune). On the du side we’ve added the -s switch, which produces a single summarized total for each item (directory) it receives.
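
The same idea can be sketched as a reusable function and tried on a scratch directory with a nested match. GNU find and GNU du are assumed; the name size_of_dirs and the demo layout are hypothetical, and the byte counts printed will vary with the filesystem because directory entries themselves have a size.

```shell
# Hypothetical helper: summarized size of every directory under $1 whose name
# matches $2, plus a grand total. Matches nested inside another match are
# skipped by -prune, so they are only counted as part of their parent.
size_of_dirs() {
    find "$1" -type d -name "$2" -prune -print0 |
        du -csb --files0-from=-
}

# Demo: dir3/backup contains another backup directory
demo=$(mktemp -d)
mkdir -p "$demo/dir1/backup" "$demo/dir3/backup/backup"
printf 'aaaa' > "$demo/dir1/backup/a.bin"
printf 'bbbb' > "$demo/dir3/backup/backup/b.bin"

size_of_dirs "$demo" backup
```

The output has one line for dir1/backup, one for dir3/backup, and the total line; the inner dir3/backup/backup does not get a line of its own.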