Finding with find
find is a versatile utility for enumerating files and narrowing the list by file type, date, size, access time and a long list of other expressions. With the right switches, the output can be formatted as CSV.
My goal was to list the sizes and access times of all video files on the system. I knew there were over 3 TB of files, but not how recently they had been accessed or played. I also needed to know the rate at which files were being added to the system.
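As a rough illustration of that kind of narrowing, here is a minimal sketch that combines a type test, a size test and an access-time test. The function name list_large_stale_files and its default thresholds are made up for this example, and it assumes GNU find.

```shell
# Hypothetical helper, not from the original workflow.
# Usage: list_large_stale_files DIR MIN_KIB DAYS
list_large_stale_files() {
    dir=${1:-.}
    min_kib=${2:-1024}
    days=${3:-30}
    # -type f   : regular files only (skips directories and links)
    # -size +Nk : more than N KiB (sizes are rounded up to whole KiB)
    # -atime +N : last accessed more than N days ago
    find "$dir" -type f -size "+${min_kib}k" -atime "+${days}" -print
}
```

For example, list_large_stale_files /some/dir 102400 90 would list files over roughly 100 MiB that have not been read in more than 90 days.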
Here is a simple expression that lists every .mp4 file under the current directory, matching the extension case-insensitively.
find . -iname '*.mp4' -print
This prints one name per line. A good starting point for my list.
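Since the goal covers all video files and not only .mp4, the same expression can be widened by grouping several -iname tests with -o (logical OR). This is only a sketch: the function name list_videos and the particular list of extensions are assumptions, so adjust them to whatever formats are actually on the system.

```shell
# Hypothetical helper: list regular files whose extension looks like video.
# The extension list is an assumption for illustration only.
list_videos() {
    # The escaped parentheses group the OR-ed name tests so that
    # -type f applies to all of them.
    find "${1:-.}" -type f \
        \( -iname '*.mp4' -o -iname '*.mkv' -o -iname '*.avi' -o -iname '*.mov' \) \
        -print
}
```

Without the parentheses, -type f would bind only to the first -iname test, because the implicit AND between tests has higher precedence than -o.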
The -exec switch lets you run another utility on each file that find enumerates. The '{}' is replaced with the name of each file found, and the escaped semicolon (\;) marks the end of the command.
find . -iname '*.mp4' -exec stat --printf="%n, %s, %y, %x\n" '{}' \; | gawk '{ split($0, a, "_"); print $0, ",", a[4] }'
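The --printf option of stat lets me format the output: %n prints the name, %s prints the size in bytes, %y prints the last modification time and %x prints the last access time. Note that %y is not a creation time; GNU stat reports the birth time, where the filesystem records it, with %w.
The awk utility lets me split the line further on underscores, since the files follow a naming convention that identifies the different types of videos. Here I append the fourth underscore-separated piece to each line.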
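For comparison, GNU find can produce much the same columns by itself with its -printf action, which avoids starting one stat process per file. The sketch below is a variation, not the command I actually ran: the function name video_csv is made up, it prints dates only (no time of day), it assumes file names contain no commas, and it takes the fourth underscore-separated piece of the base name rather than of the whole line.

```shell
# Hypothetical helper: emit "name,size,mtime,atime,tag" for each .mp4 file.
# Assumes GNU find (for -printf) and file names without commas.
video_csv() {
    # %f = base name, %s = size in bytes,
    # %TY-%Tm-%Td = modification date, %AY-%Am-%Ad = access date
    find "${1:-.}" -type f -iname '*.mp4' \
        -printf '%f,%s,%TY-%Tm-%Td,%AY-%Am-%Ad\n' |
    awk -F, -v OFS=, '{
        # Split the base name on underscores and append the fourth piece.
        split($1, parts, "_")
        print $0, parts[4]
    }'
}
```

Redirecting the output, as in video_csv /some/dir > videos.csv, gives a file that a spreadsheet can open directly.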
Pretty powerful stuff - all in one line. Gotta know your shell utilities.
Once I had the information in a CSV file, I opened it in Excel and added a pivot table, grouping by month and quarter, to see the rate at which the files were added.
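The monthly grouping can also be approximated in the shell when a spreadsheet is not handy. This is a minimal sketch, not what I used: the function name videos_per_month is made up, it assumes GNU find, and it treats the modification time as a stand-in for the date a file was added, which will be wrong for files edited or copied later without preserving timestamps.

```shell
# Hypothetical helper: print "YYYY-MM,file_count,total_bytes" per month.
# Uses modification time as an assumed proxy for "date added".
videos_per_month() {
    find "${1:-.}" -type f -iname '*.mp4' -printf '%TY-%Tm %s\n' |
    awk '{
        count[$1]++        # files seen for this month
        bytes[$1] += $2    # running size total for this month
    }
    END {
        for (m in count)
            printf "%s,%d,%.0f\n", m, count[m], bytes[m]
    }' |
    sort
}
```

The final sort puts the months in chronological order, since awk does not guarantee any order when iterating over an array.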