
counting a list of string in a list of txt files


 
# 1  
Old 08-13-2008

Hi there!

I have 150 txt files named chunk1, chunk2, ..., chunk150. I have a second file called string.txt with more than 1000 unique strings (house, dog, cat, ...). I want to know which command I should use to count how many times each string appears across the 150 files.

I have tried grep -c dog chunk*, but that gives me a separate count for each file, and I would have to run it once for each string.

The ideal solution would be an output saying:

dog 45
cat 69
house 92
song 45

Thanks a lot in advance.

Kind regards,
Pep
# 2  
Old 08-13-2008
Code:
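# Merge all chunk files into one temporary file.
cat chunk* > tmp.tmp
# Needs a POSIX awk (nawk, gawk); very old awks may reject the "in" test.
awk '   # first file: register each search string with a count of zero
        FILENAME=="string.txt" { arr[$0]=0 }
        # second file: count each whitespace-separated field that is a registered string
        FILENAME=="tmp.tmp"  { for(i=1; i<=NF; i++) {
             if ($i in arr) {arr[$i]++}
        }}
        # print every string with its total
        END { for (i in arr) { print i, arr[i]}} ' string.txt tmp.tmp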

# 3  
Old 08-14-2008
Jim,

Thanks a lot for the quick answer, but when I ran it I got the following errors.

awk: syntax error near line 3
awk: illegal statement near line 3
awk: syntax error near line 5
awk: bailing out near line 5

Do you know whether there is something wrong?
Thanks
Pep
# 4  
Old 08-14-2008
Hi.

Most versions of grep can read their patterns from a file, so standard *nix utilities can be used:
Code:
#!/bin/bash -

# @(#) s3       Demonstrate string count total from files.

echo
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) grep sort uniq
set -o nounset
echo

echo " strings file:"
cat strings

echo
echo " data files" data* ":"
cat -n data*

echo
echo " Results:"
grep -h -f strings data* |
sort |
uniq -c

exit 0

Producing:
Code:
% ./s3

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0
grep (GNU grep) 2.5.1
sort (coreutils) 5.2.1
uniq (coreutils) 5.2.1

 strings file:
dog
horse
cat

 data files data1 data2 data3 data4 :
     1  File 1
     2  monkey
     3  cat
     4  dog
     5  dog
     6  File 2
     7  horse
     8  sawhorse
     9  Files 3
    10  cat
    11  horse
    12  witch
    13  seven
    14  File 4
    15  spider
    16  hoarse
    17  horse
    18  horse
    19  horse
    20  cat

 Results:
      3 cat
      2 dog
      5 horse
      1 sawhorse

The files are filtered for the lines that contain strings of interest. Then, in order to count with uniq, we need to sort the result.
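
To see why the sort step matters: uniq only merges lines that are adjacent, so unsorted input gives several partial counts for the same word. Here is a minimal sketch with made-up sample words (not the data above); the helper name count_lines is hypothetical, and the trailing awk only swaps the columns into the "word count" layout asked for in the first post.

```shell
# Hypothetical helper: count identical lines arriving on standard input.
count_lines() {
    # sort puts identical lines next to each other so uniq -c counts each once;
    # awk swaps "count word" into "word count"
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Without sort, the two "cat" lines are not adjacent and are counted separately.
printf '%s\n' cat dog cat | uniq -c

# With sort, each distinct word is reported once with its total.
printf '%s\n' cat dog cat | count_lines
```

The awk step assumes single-word lines; for lines containing spaces, drop it and keep the plain uniq -c output.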

If you need better filtering, you may need to change the patterns in the strings file, or -- in some versions of grep -- use the "word" option "-w".
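
As a rough sketch of that tighter filtering, the following combines -w with -o and -F. It assumes a grep that supports -o (GNU and BSD grep do, but it is not required by POSIX), and the function name count_strings is made up for illustration. -F treats each line of the pattern file as a fixed string, -w keeps "horse" from matching inside "sawhorse", and -o prints each match on its own line, so the result counts occurrences of each string instead of matching lines.

```shell
# Hypothetical helper: count_strings PATTERN_FILE DATA_FILE...
count_strings() {
    patterns=$1
    shift
    # -h: no file name prefixes    -o: print only the matched text
    # -w: match whole words only   -F: patterns are fixed strings, not regexes
    grep -h -o -w -F -f "$patterns" "$@" |
    sort |
    uniq -c |
    awk '{ print $2, $1 }'   # "count word" -> "word count"
}

# Example call with the file names from the first post:
#   count_strings string.txt chunk*
```

Strings that never occur are simply missing from the output instead of being listed with 0, and the awk step assumes each string is a single word.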

Adjust as necessary for your environment according to your man pages ... cheers, drl
# 5  
Old 08-15-2008
Thanks a lot, it is working now!

Kind regards,

Pep
 
