How to find repeated string in a text file


 
# 1  
Old 10-25-2011
I have a text file in which I need to find the string ST*850*
This string is repeated several times in the file, and I need to know how many times it appears. This is the text file:
Code:
ISA*00* *00* *08*925485USNR *ZZ*IMSALADDERSP *110824*1631*:*00501*850001355*0*P*>~GS*PO*925485USNR*IMSALADDERSP*20110824*1631*850001355*X*005010~ST*850*2262~BEG*00*SA*31016446**20110824~CUR*BY*USD~REF*IA*541177~REF*19*01~REF*MR*ZUS1~REF*AFN*ZZ~PER*BD*Aleshia Simrell*TE*4792041336~ITD*08*3*1**40~ITD*01*3~DTM*996*20110919~N1*BT*WalMart Stores Inc.*UL*0078742061078~N1*ST*Hayes Retail Services TX 7862*UL*0078742066493~N1*SU*LOUISVILLE LADDER INC~PO1*00010*5*EA*71.01*LE*IN*100016092~PID*F****4 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742000589*5~N9*L1*SPECIAL INSTRUCTIONS~MTX**Color Length 0.000 Width 0.000 Height~MTX**0.000 Unit of Dim. Size Unit of Mea.EA Make ModelSA~AMT*1*355.05~PO1*00020*6*EA*90.79*LE*IN*100016093~PID*F****6 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742000589*6~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*544.74~CTT*2~AMT*TT*899.79~SE*29*2262~ST*850*2263~BEG*00*SA*31016447**20110824~CUR*BY*USD~REF*IA*541177~REF*19*01~REF*MR*ZUS1~REF*AFN*ZZ~PER*BD*Aleshia Simrell*TE*4792041336~ITD*08*3*1**40~ITD*01*3~DTM*996*20110919~N1*BT*WalMart Stores Inc.*UL*0078742061078~N1*ST*Hayes Retail Services TX 7862*UL*0078742066493~N1*SU*LOUISVILLE LADDER INC~PO1*00010*1*EA*127.06*LE*IN*100016094~PID*F****8 STEP FIBERGLASS LADDER~SDQ*EA*UL*0078742000589*1~N9*L1*SPECIAL INSTRUCTIONS~MTX**Color Length 0.000 Width 0.000 Height~MTX**0.000 Unit of Dim. Size Unit of Mea.EA Make ModelSA~AMT*1*127.06~CTT*1~AMT*TT*127.06~SE*24*2263~ST*850*2264~BEG*00*SA*31016448**20110824~CUR*BY*USD~REF*IA*541177~REF*19*01~REF*MR*ZUS1~REF*AFN*ZZ~PER*BD*Aleshia Simrell*TE*4792041336~ITD*08*3*1**40~ITD*01*3~DTM*996*20110919~N1*BT*WalMart Stores Inc.*UL*0078742061078~N1*ST*Hayes Retail Services TX 7862*UL*0078742066493~N1*SU*LOUISVILLE LADDER INC~PO1*00010*2*EA*90.79*LE*IN*100016093~PID*F****6 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742000589*2~N9*L1*SPECIAL INSTRUCTIONS~MTX**Color Length 0.000 Width 0.000 Height~MTX**0.000 Unit of Dim. 
Size Unit of Mea.EA Make ModelSA~AMT*1*181.58~PO1*00020*1*EA*127.06*LE*IN*100016094~PID*F****8 STEP FIBERGLASS LADDER~SDQ*EA*UL*0078742000589*1~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*127.06~PO1*00030*1*EA*191.72*LE*IN*100016096~PID*F****TWELVE STEP FIBERGLASS LADDER~SDQ*EA*UL*0078742000589*1~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*191.72~PO1*00040*10*EA*71.01*LE*IN*100016092~PID*F****4 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742000589*10~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*710.1~PO1*00050*5*EA*55*LE*IN*100016091~PID*F****2 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742000589*5~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*275~CTT*5~AMT*TT*1485.46~SE*44*2264~ST*850*2265~BEG*00*SA*31016449**20110824~CUR*BY*USD~REF*IA*541177~REF*19*01~REF*MR*ZUS1~REF*AFN*ZZ~PER*BD*Linda Cheek*TE*4792042014~ITD*08*3*1**40~ITD*01*3~DTM*996*20110829~N1*BT*WalMart Stores Inc.*UL*0078742061078~N1*ST*Hayes Retail Services TX 7862*UL*0078742066493~N1*SU*LOUISVILLE LADDER INC~PO1*00010*4*EA*71.01*LE*IN*100016092~PID*F****4 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742067391*4~N9*L1*SPECIAL INSTRUCTIONS~MTX**Color Length 0.000 Width 0.000 Height~MTX**0.000 Unit of Dim. Size Unit of Mea.EA Make ModelSA~AMT*1*284.04~PO1*00020*6*EA*90.79*LE*IN*100016093~PID*F****6 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742067391*6~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*544.74~CTT*2~AMT*TT*828.78~SE*29*2265~ST*850*2266~BEG*00*SA*31016450**20110824~CUR*BY*USD~REF*IA*541177~REF*19*01~REF*MR*ZUS1~REF*AFN*ZZ~PER*BD*Linda Cheek*TE*4792042014~ITD*08*3*1**40~ITD*01*3~DTM*996*20110829~N1*BT*WalMart Stores Inc.*UL*0078742061078~N1*ST*Hayes Retail Services TX 7862*UL*0078742066493~N1*SU*LOUISVILLE LADDER INC~PO1*00010*2*EA*90.79*LE*IN*100016093~PID*F****6 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742067391*2~N9*L1*SPECIAL INSTRUCTIONS~MTX**Color Length 0.000 Width 0.000 Height~MTX**0.000 Unit of Dim. 
Size Unit of Mea.EA Make ModelSA~AMT*1*181.58~PO1*00020*1*EA*127.06*LE*IN*100016094~PID*F****8 STEP FIBERGLASS LADDER~SDQ*EA*UL*0078742067391*1~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*127.06~PO1*00030*1*EA*191.72*LE*IN*100016096~PID*F****TWELVE STEP FIBERGLASS LADDER~SDQ*EA*UL*0078742067391*1~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*191.72~PO1*00040*10*EA*71.01*LE*IN*100016092~PID*F****4 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742067391*10~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*710.1~PO1*00050*5*EA*55*LE*IN*100016091~PID*F****2 STEP FIBERGLASS LADDER PLATFORM~SDQ*EA*UL*0078742067391*5~N9*L1*SPECIAL INSTRUCTIONS~AMT*1*275~CTT*5~AMT*TT*1485.46~SE*44*2266~GE*5*850001355~IEA*1*850001355~

Please encode data with code tags

Last edited by jim mcnamara; 10-25-2011 at 05:56 PM.. Reason: code tags
# 2  
Old 10-25-2011
Code:
awk '{ tmp=$0
         cnt=0
         i=index(tmp, "ST*850*")
         while(i>0)
         {
                cnt++;
                tmp=substr(tmp,i)
                i=index(tmp, "ST*850*")
          }
          END {print "Found ", cnt, " Times" } '  inputfilename

# 3  
Old 10-25-2011
Code:
grep -o "ST\*850\*" infile |wc -l

# 4  
Old 10-25-2011
Quote:
Originally Posted by cucosss
I have a text file in which I need to find the string ST*850*
This string is repeated several times in the file, and I need to know how many times it appears. This is the text file:

Store the strings you are looking for, one per line, in a file such as Strings.txt. Paste the script below into a second file named String_Freq.awk, then run it with the strings file first and the data file second:

awk -f String_Freq.awk Strings.txt inputfilename

Strings.txt
ST*850*
ISA*00*
BEG*00*

String_Freq.awk
Code:
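# First file (the strings file): remember every non-empty search string
NR==FNR {if (length($0)) words[++nwords]=$0; next}
# Data file: count each search string as a literal substring of the line
{for(w=1;w<=nwords;w++) {tmp=$0
  while((i=index(tmp,words[w]))>0) {freq[words[w]]++; tmp=substr(tmp,i+length(words[w]))}}}
END {for(w=1;w<=nwords;w++)
{if (freq[words[w]]+0>0) print "Instances of " words[w] " : " freq[words[w]]+0}}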

# 5  
Old 10-25-2011
Another way, using gawk:

Code:
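# Split each line on the literal string ST*850*; NF fields means NF-1 matches. Sum over all lines.
gawk -F "ST\\\*850\\\*" 'NF {n += NF-1} END {print n+0}' infile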

# 6  
Old 10-25-2011
jim mcnamara, your code gives errors:
Syntax Error The source line is 10.
The error context is
>>> END <<< {print "Found ", cnt, " Times" }
awk: 0602-502 The statement cannot be correctly parsed. The source line is 10.
awk: 0602-540 There is a missing } character.


rdcwayx,
my grep command does not have the -o option.
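
For readers on a system like this one, with no grep -o and an awk that rejects the script in post #2: the parse error is consistent with the main { ... } block never being closed before END. Separately, substr(tmp,i) restarts at the match itself, so the loop would never advance, and cnt=0 resets the total on every line. A minimal sketch that addresses all three, assuming a POSIX awk with -v support (count_substr is a hypothetical helper name, infile a placeholder):

```shell
# count_substr (hypothetical name): print how many times a literal string
# occurs, without overlaps, in the named files or on standard input.
count_substr() {
    pat=$1
    shift
    awk -v pat="$pat" '
    {
        tmp = $0
        i = index(tmp, pat)
        while (i > 0) {
            cnt++                                # running total across all lines
            # Resume just past this match so the loop always advances.
            tmp = substr(tmp, i + length(pat))
            i = index(tmp, pat)
        }
    }
    END { print cnt + 0 }' "$@"
}

# Usage, assuming the data is in a file called infile:
#   count_substr 'ST*850*' infile
```

Because it uses index rather than a regular expression, the * characters need no escaping.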
# 7  
Old 10-25-2011
Check my other reply; if your awk is gawk, it should work for you.
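
If gawk is not available either, one possible fallback (a sketch, not something posted in this thread) is to let gsub do the counting: it returns the number of substitutions it made, and replacing each match with "&" (the matched text) leaves the line unchanged. The function name count_st850 and the file name infile are placeholders.

```shell
# count_st850 (hypothetical name): count matches of the regex ST\*850\*
# in the named files or on standard input.
count_st850() {
    # gsub returns the number of replacements made on the current line;
    # "&" stands for the matched text, so the line itself is not altered.
    awk '{ n += gsub(/ST\*850\*/, "&") } END { print n + 0 }' "$@"
}

# Usage, assuming the data is in a file called infile:
#   count_st850 infile
```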
