The UNIX and Linux Forums  
#1
08-13-2008
Pep Puigvert, Registered User
Join Date: Aug 2008
Posts: 3
Counting a list of strings in a list of txt files

Hi there!

I have 150 txt files named chunk1, chunk2, ..., chunk150. I have a second file called string.txt with more than 1000 unique strings (house, dog, cat, ...). I want to know which command I should use to count how many times each string appears across the 150 files.

I have tried grep -c dog chunk*, but that gives me a separate count for each file, and I would have to run it again for each of the strings.

The ideal solution would be an output saying:

dog 45
cat 69
house 92
song 45

Thanks a lot in advance.

Kind regards,
Pep

#2
08-13-2008
jim mcnamara, Forum Staff
Join Date: Feb 2004
Location: NM
Posts: 5,643
Code:
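# Merge all chunk files into one temporary file.
cat chunk* > tmp.tmp
# Load each line of string.txt as a counter, then count every field of
# tmp.tmp that exactly matches one of those strings.
awk '   FILENAME=="string.txt" { arr[$0]=0 }
        FILENAME=="tmp.tmp"  { for(i=1; i<=NF; i++) {
             if ($i in arr) {arr[$i]++}
        }}
        END { for (i in arr) { print i, arr[i]}} ' string.txt tmp.tmp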

#3
08-14-2008
Pep Puigvert, Registered User
Join Date: Aug 2008
Posts: 3
Jim,

Thanks a lot for the quick answer, but when I run it I get the following errors:

awk: syntax error near line 3
awk: illegal statement near line 3
awk: syntax error near line 5
awk: bailing out near line 5

Do you know whether there is something wrong?
Thanks
Pep
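
A note on the errors above, offered as a guess: messages of this form are typical of older awk implementations, which predate the "indx in array" test used outside a for statement (as on line 3 of the awk program). The thread does not say which awk or system was used, so this is an assumption. If that is the cause, running the same logic under a newer awk (for example nawk or gawk, where installed) may help. The sketch below wraps it in a hypothetical function count_strings and reads the data files directly instead of going through tmp.tmp; the AWK variable is also just an illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count whitespace-separated
# fields that exactly equal one of the strings in a list file.
# Usage: count_strings STRING_FILE DATA_FILE...
# Set AWK to pick another interpreter, e.g. AWK=nawk (assumed to be installed).
count_strings() {
    ${AWK:-awk} '
        # First file: create one counter per string (assumes it is not empty).
        NR == FNR { arr[$0] = 0; next }
        # Remaining files: test every field against the list.
        {
            for (i = 1; i <= NF; i++)
                if ($i in arr) arr[$i]++
        }
        # Print "string count"; the order of a for-in loop is unspecified.
        END { for (s in arr) print s, arr[s] }
    ' "$@"
}
```

It could be called as count_strings string.txt chunk* and piped through sort if a stable order is wanted. Like the original script, it only counts fields that match a string exactly, so a string next to punctuation (such as "dog,") is not counted.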

#4
08-14-2008
drl, Forum Advisor, Registered User
Join Date: Apr 2007
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 699
Hi.

Most versions of grep can handle a file of patterns, so that standard *nix utilities can be used:
Code:
#!/bin/bash -

# @(#) s3       Demonstrate string count total from files.

echo
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) grep sort uniq
set -o nounset
echo

echo " strings file:"
cat strings

echo
echo " data files" data* ":"
cat -n data*

echo
echo " Results:"
grep -h -f strings data* |
sort |
uniq -c

exit 0
Producing:
Code:
% ./s3

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0
grep (GNU grep) 2.5.1
sort (coreutils) 5.2.1
uniq (coreutils) 5.2.1

 strings file:
dog
horse
cat

 data files data1 data2 data3 data4 :
     1  File 1
     2  monkey
     3  cat
     4  dog
     5  dog
     6  File 2
     7  horse
     8  sawhorse
     9  Files 3
    10  cat
    11  horse
    12  witch
    13  seven
    14  File 4
    15  spider
    16  hoarse
    17  horse
    18  horse
    19  horse
    20  cat

 Results:
      3 cat
      2 dog
      5 horse
      1 sawhorse
The files are filtered for the lines that contain any of the strings of interest. Then, in order to count with uniq, we need to sort the result. Note that whole matching lines are counted, which is why "sawhorse" shows up: it contains "horse".

If you need better filtering, you may need to change the patterns in the strings file, or, in some versions of grep, use the "word" option "-w".

Adjust as necessary for your environment according to your man pages ... cheers, drl
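
As a rough sketch of the "-w" suggestion, the pipeline above can be tightened so that it counts whole-word occurrences and prints them in the "string count" layout asked for in the first post. The function name count_words is made up for this example. It assumes the strings are plain single words (hence -F for fixed strings) and that your grep supports -o, which GNU and BSD grep do but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count whole-word occurrences
# of each string listed in a pattern file.
# Usage: count_words PATTERN_FILE DATA_FILE...
count_words() {
    patterns=$1
    shift
    # -h  suppress file names        -o  print each match on its own line
    # -w  match whole words only     -F  treat patterns as fixed strings
    # -f  read the patterns from a file
    grep -h -o -w -F -f "$patterns" "$@" |
    sort |
    uniq -c |
    awk '{ print $2, $1 }'    # swap "count string" into "string count"
}
```

For the original question this would be run as count_words string.txt chunk*. Strings that never occur are simply missing from the output rather than listed with a count of 0.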

#5
08-15-2008
Pep Puigvert, Registered User
Join Date: Aug 2008
Posts: 3
Thanks a lot, it is working now!

Kind regards,

Pep