list quantity of files by file types


 
# 1  
Old 12-03-2008
list quantity of files by file types

I'm trying to create a simple file inventory for a series of huge directories containing e-records. What I'm after is a list of all directories and sub-directories with just the number of each type of file in that directory/sub-directory. For example, the output would look like:
\home\erecords\unit5007b (45 gif, 78 jpg, 666 doc)

I'm fiddling with some Java programs I found that are a bit overkill (validating file types, outputting XML, etc.). I really just need the number of files by file type. Are there any Unix tools out there that will get me closer to this goal?
Thanks,
# 2  
Old 12-03-2008
Perhaps this will provide a start?

My test directory did not have good filenames, but this appeared to be a good start at what you were looking for. (Script could probably be cleaned up some; late in the day here!)

Code:
> cat determine_files 
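#! /usr/bin/bash

# Save the names of the visible subdirectories of the current directory, one per line.
ls -l | grep "^d" | awk '{print $9}' | grep -v "^\." >P_WORK_DIR
while read P_DIR
   do
#   echo $P_DIR
   # Take the text after the first "." of each name (names without a dot pass
   # through unchanged), count each distinct value, and join the counts with commas.
   P_WORK_LIST=$(ls $P_DIR | cut -d"." -f2 | sort | uniq -c | tr "\n" "," | tr -s " " )
   echo $P_DIR" ("$P_WORK_LIST")"
done <P_WORK_DIR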

> determine_files 
delete_extra ( 1 file100, 1 file101, 1 file102, 1 file103, 1 keep_two, 7 xml,)
files_with_spaces ( 1 f_list, 1 file 04, 1 file01, 1 see_blanks,)
indic_file ( 8 TXT, 1 makes, 1 sh,)
pass_var ( 1 file1, 1 sh,)
resnum ( 1 other0, 1 res0, 1 res1, 1 res19, 1 res2, 1 res4, 1 res666, 1 res9,)
scsi35 ( 1 35file, 1 dsk, 1 tmp,)

# 3  
Old 12-04-2008
Thank you so much!

That is exactly what I was trying to hammer together for several days!

I have two follow-ups. First, I want this to recurse into subdirectories as well. I experimented with your script a bit, adding -R to the initial ls command, but the results were not encouraging. Would it be better to have another sh script that walks the directory tree and calls this one at each directory?

Second, the directories I'm dealing with have many dir names with spaces in them, and this script runs ls on just the part of the dir name before the space. I did try some things, and I'm guessing that the change needed is in this bit, but I can't quite grasp it:

ls -l | grep "^d" | awk '{print $9}' | grep -v "^\." >P_WORK_DIR

This gets all the dir names, but what is the awk printing? And is that last grep removing the "\." from the name?
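
To see what those two stages do, here is a small sketch run against a made-up `ls -l` line (the listing line and the helper name `ninth_field` are illustrative and not from the thread). awk splits each line on whitespace and prints only the ninth field, which is the first word of the name. The final `grep -v "^\."` does not edit names at all; it drops whole lines that begin with a dot, i.e. hidden directories.

```shell
#!/bin/sh
# Hypothetical helper: print what awk '{print $9}' extracts from one ls -l line.
ninth_field() {
    printf '%s\n' "$1" | awk '{print $9}'
}

# A made-up listing line for a directory called "HAS Spaces".
sample='drwxr-xr-x 2 user group 4096 Dec  3 2008 HAS Spaces'

ninth_field "$sample"    # prints: HAS  (the name is cut at the first space)

# grep -v "^\." removes lines that begin with a dot; other lines pass through untouched.
printf '%s\n' '.hidden' 'visible' | grep -v '^\.'    # prints: visible
```

This is why a directory named "HAS Spaces" ends up in P_WORK_DIR as just "HAS".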
# 4  
Old 12-04-2008
Neat script. Wasn't aware of the -c option of uniq. Makes the task much easier.

I've made a version of the script that does recursion:

Code:
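# Visit every directory under $1 (quoted so a path with spaces stays in one piece).
find "$1" -type d | while IFS= read -r DIR ; do
  echo "$DIR"
  # Split each ls -l line on "." and print the second piece, skipping lines where it is empty.
  ls -l "$DIR" | awk -F. '$2 !~/^$/{print $2}' | sort | uniq -c | tr "\n" "," | tr -s " "
  echo
done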

Output looks like:
Code:
/home/nobody/var/www/manual/mod
 204 html,
/home/nobody/var/www/manual/mod/mod_python
 1 css, 99 html,
/home/nobody/var/www/manual/mod/mod_python/icons
 7 gif, 7 png,

Run it as: scriptname /directory
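
For the one-line-per-directory format asked for in the first post, a variation along these lines may work. It is a sketch rather than a drop-in replacement: the function name `count_by_ext` is made up here, it treats the text after the last dot as the type, it ignores files with no dot in the name, and it assumes no directory name contains a newline.

```shell
#!/bin/sh
# count_by_ext DIR: for DIR and every directory below it, print
#   DIR (N ext, N ext, ...)
# counting regular files by the text after the last dot in the name.
count_by_ext() {
    find "$1" -type d | while IFS= read -r dir; do
        counts=$(
            for f in "$dir"/*.*; do
                [ -f "$f" ] || continue      # skip directories and unmatched patterns
                name=${f##*/}                # strip the directory part
                printf '%s\n' "${name##*.}"  # keep what follows the last dot
            done | sort | uniq -c |
            awk '{ n = $1; sub(/^ *[0-9]+ /, ""); printf "%s%s %s", sep, n, $0; sep = ", " }'
        )
        printf '%s (%s)\n' "$dir" "$counts"
    done
}
```

Called as `count_by_ext /directory`, it prints one line per directory. Directories appear in whatever order find reports them, and a directory with no matching files shows an empty pair of parentheses.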

Last edited by Autocross.US; 12-04-2008 at 12:56 PM.. Reason: Updated to handle directories with spaces in the name
# 5  
Old 12-04-2008
Still fighting spaces in filenames

in this line:
Code:
ls -l | grep "^d" | awk '{print $9}' | grep -v "^\." >P_WORK_DIR

how can I add single quotes to either side of the dir name that is saved in P_WORK_DIR? So that in that list a dir name of /HAS Spaces/ will be saved as 'HAS Spaces'

I'm hoping that would do the trick in getting the ls command to work on directories with spaces in the name...
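
Adding quote characters to the saved names is unlikely to help: the shell treats quotes that arrive as data (from a file or a variable) as ordinary characters, so ls would look for a name that literally contains them. One alternative, sketched here with a made-up function name `list_subdirs`, is to let the shell's own `*/` pattern produce the directory names, which keeps each name whole however many spaces it has.

```shell
#!/bin/sh
# list_subdirs: print each visible subdirectory of the current directory,
# one per line, without going through ls or awk.
list_subdirs() {
    for d in */; do
        [ -d "$d" ] || continue     # nothing matched: the pattern is left as-is
        printf '%s\n' "${d%/}"      # drop the trailing slash the pattern adds
    done
}

# The names can then be read back whole, spaces included:
#   list_subdirs | while IFS= read -r P_DIR; do ls "$P_DIR"; done
```

The important parts are `IFS= read -r`, which reads a full line without trimming or backslash processing, and the double quotes around "$P_DIR" when it is passed to ls.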
# 6  
Old 12-04-2008
Script (let's call it filetype.sh):
Code:
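# Run file(1) on every entry in the current directory.
# A glob plus quoting keeps names with spaces in one piece.
for file in *
do
file "$file"
done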

Now:
Code:
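sh filetype.sh  > typeofeachfileindir.txt
# Keep only the description after "name: " and reduce it to the distinct types.
sed -n -e 's/.*: \(.*\)/\1/p' typeofeachfileindir.txt | sort | uniq > filetypesindir.txt

# Print each type with the number of lines that mention it.
# (Piping wc -l into echo would discard the count, because echo does not read stdin.)
while IFS= read -r filetype
do
echo "$filetype: $(grep -c -F -- "$filetype" typeofeachfileindir.txt)"
done < filetypesindir.txt
rm typeofeachfileindir.txt
rm filetypesindir.txt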

This should check the types of the files in the directory based on their content, not their extension.
I leave directory recursion up to you. Hope it helps.
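
One possible way to add the recursion is to let find hand the files to file(1) directly. This is only a sketch: `count_by_content` is a made-up name, it assumes a file(1) that supports the `-b` (brief) option, and it reports totals for the whole tree rather than per directory.

```shell
#!/bin/sh
# count_by_content DIR: count regular files under DIR, grouped by the
# description that file(1) reports. Assumes a file(1) that supports -b
# (brief mode: print the description without the file name).
count_by_content() {
    find "$1" -type f -exec file -b {} + | sort | uniq -c | sort -rn
}
```

Run as `count_by_content /directory`, it lists each reported type with its count, most common first. Because find passes the names straight to file, spaces in file or directory names need no special handling, and no temporary files are left behind.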
# 7  
Old 12-04-2008
Another one by extension:

Code:
find . -type d |
  perl -nle'
    map @_{/([^.]+?)$/}++, grep -f, glob "$_/*.*";
    print "\t---> $_\n", join ", ", map "$_: $_{$_}", keys %_;
	undef %_
    '


Last edited by radoulov; 12-05-2008 at 05:08 AM.. Reason: refactored