Get nth occurrence of string from a file


 
# 8  
Old 04-10-2015
My txt file has data something like this,

Code:
01,0000000,xxxxxxx/                        
02,xxxxxxxx,yyyyyy/                                    
03,test1,41203016,,/      
01,0000000,xxxxxxx/                        
02,xxxxxxxx,yyyyyy/                                    
03,test2,41203017,,/     
01,0000000,xxxxxxx/     
04,xxxxx,00000,test2                   
02,xxxxxxxx,yyyyyy/                                    
03,test3,41203018,,/

I need to get the value of the 2nd column on each line whose first column is 03.
In this case that means test1, test2 and test3. The same string may exist on other lines too (for example, test2 also exists on the line starting with 04, which I don't want to replace).

So this is what I am expecting:
get the value 'test1', pass it to a program that returns the replacement string 'replace1', and then replace 'test1' with 'replace1' in the file.
Similarly for test2 and test3.

I used the code below to get each occurrence, put it in a variable and use sed to replace it. But this replaces the 'test2' string on both the line starting with 03 and the line starting with 04.

Code:
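# Print field 2 of the Nth line whose first field is X, then stop.
testvalue=$(awk -F"," -v X="03" -v N=2 '($1 == X) && ((++L) >= (N+0)) { print $2; exit }' "$1")

replacevalue='replace2'

# Replaces the first match on every line, whatever the line starts with.
sed "s/$testvalue/$replacevalue/" "$1" > temp.txt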

The result looks like this, with test2 also replaced on the line starting with 04:
Code:
01,0000000,xxxxxxx/                        
02,xxxxxxxx,yyyyyy/                                    
03,replace1,41203016,,/      
01,0000000,xxxxxxx/                        
02,xxxxxxxx,yyyyyy/                                    
03,replace2,41203017,,/     
01,0000000,xxxxxxx/     
04,xxxxx,00000,replace2                   
02,xxxxxxxx,yyyyyy/                                    
03,replace3,41203018,,/

So I am looking for a single command that gets the string and also replaces it.
# 9  
Old 04-10-2015
Try

Code:
awk -F, '$1=="03"{gsub(/test/,"replace",$2)}1' OFS="," file

# 10  
Old 04-10-2015
Quote:
Originally Posted by senhia83
Try

Code:
awk -F, '$1=="03"{gsub(/test/,"replace",$2)}1' OFS="," file

It works when I hard-code the values, but when I use variables like this it does not replace anything:

Code:
awk -F, '$1=="03"{gsub(/$l_test/,$l_replace,$2)}1' OFS="," file.txt > temp.txt

# 11  
Old 04-11-2015
Quote:
Originally Posted by r@v!7*7@
It works when I hard-code the values, but when I use variables like this it does not replace anything:
Code:
awk -F, '$1=="03"{gsub(/$l_test/,$l_replace,$2)}1' OFS="," file.txt > temp.txt

You haven't defined any variables in this awk script except FS and OFS. Because the script is inside single quotes, the shell does not expand $l_test or $l_replace; awk sees them literally. In the ERE /$l_test/, $ is the end-of-string anchor, so the pattern looks for the end of field 2 followed by the string l_test, which it will NEVER find. If it were found, each match would be replaced with the entire input line: l_replace is an undefined awk variable (undefined awk variables expand to 0 or an empty string depending on context), and in awk $ followed by a field number refers to the contents of that field, so $l_replace means $0. If you are saying that you have defined shell variables in your shell script:
Code:
l_test="test"
l_replace="replace"

and you want to use those variables as the extended regular expression search pattern and substitution replacement specification, respectively, inside your awk script, that could be done with something like:
Code:
l_test="test"
l_replace="replace"
awk -F, -v ERE="$l_test" -v rep="$l_replace" '$1=="03"{gsub(ERE,rep,$2)}1' OFS="," file.txt > temp.txt
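One thing to watch with the ERE approach: a pattern such as test2 also matches inside test20. If the value taken from the file should only match as a whole field, a plain string comparison on $2 may be closer to what the original post asks for. The following is only a sketch under that assumption; the names old and new and the shortened sample data (including the test20 line) are made up for illustration.

```shell
# Shortened sample input; the test20 line is invented to show the difference.
input=$(mktemp)
cat > "$input" <<'EOF'
01,0000000,xxxxxxx/
03,test2,41203017,,/
04,xxxxx,00000,test2
03,test20,41203018,,/
EOF

old="test2"      # hypothetical value to look for
new="replace2"   # hypothetical replacement

# Compare field 2 as a whole string instead of as a regular expression,
# so "test20" and the "test2" on the 04 line are left alone.
result=$(awk -F, -v old="$old" -v new="$new" \
    '$1 == "03" && $2 == old { $2 = new } 1' OFS="," "$input")
printf '%s\n' "$result"
rm -f "$input"
```

Only the line 03,test2,... should change here. Keep in mind that awk processes backslash escapes in values passed with -v, so values containing backslashes would need extra care.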

Note that if your search pattern occurs multiple times in field 2 in your input, each occurrence will be replaced. If you just want to replace the first occurrence of the search pattern, you should change:
Code:
gsub(ERE,rep,$2)

to:
Code:
sub(ERE,rep,$2)

# 12  
Old 04-11-2015

Thanks a lot. That was really helpful.
But even though I changed it to the code below, it is still replacing multiple occurrences. Am I missing something?
Code:
sub(ERE,rep,$2)


Last edited by r@v!7*7@; 04-11-2015 at 03:17 AM..
# 13  
Old 04-11-2015
Quote:
Originally Posted by r@v!7*7@
Thanks a lot. That was really helpful.
But even though I changed it to the code below, it is still replacing multiple occurrences. Am I missing something?
Code:
sub(ERE,rep,$2)

Using gsub(ERE,rep,$2) replaces every string matched by ERE in field 2 on each line with the replacement string indicated by rep.

Using sub(ERE,rep,$2) replaces the 1st string matched by ERE in field 2 on each line with the replacement string indicated by rep.

With the sample input you provided and the sample output you said you wanted, both of these do exactly what you said you wanted. You did not provide any sample input with the string test appearing two or more times in field 2 on a line where the 1st field on that line is the string 03.
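To see the difference, the input needs a line where the pattern appears twice in field 2. The line below is made up for that purpose (it is not from the original data), and the sketch sets OFS with -v only so that it is assigned before standard input is read.

```shell
line='03,test_test,41203016,,/'   # made-up line, not from the original data

# sub() replaces only the first match in field 2 of each line.
with_sub=$(printf '%s\n' "$line" |
    awk -F, -v OFS="," -v ERE="test" -v rep="replace" '$1=="03"{sub(ERE,rep,$2)}1')

# gsub() replaces every match in field 2 of each line.
with_gsub=$(printf '%s\n' "$line" |
    awk -F, -v OFS="," -v ERE="test" -v rep="replace" '$1=="03"{gsub(ERE,rep,$2)}1')

printf 'sub:  %s\ngsub: %s\n' "$with_sub" "$with_gsub"
```

With sub() field 2 should become replace_test, and with gsub() it should become replace_replace. Both still act on every line that starts with 03.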
# 14  
Old 04-11-2015
Quote:
Originally Posted by Don Cragun
Using gsub(ERE,rep,$2) replaces every string matched by ERE in field 2 on each line with the replacement string indicated by rep.

Using sub(ERE,rep,$2) replaces the 1st string matched by ERE in field 2 on each line with the replacement string indicated by rep.

With the sample input you provided and the sample output you said you wanted, both of these do exactly what you said you wanted. You did not provide any sample input with the string test appearing two or more times in field 2 on a line where the 1st field on that line is the string 03.
You are right. But is there a way to replace only the first occurrence of the string in the whole file, on a line starting with 03?
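One possible way, assuming "first occurrence" means the first line starting with 03 on which the pattern matches in field 2, is to set a flag after the first successful sub() and skip all later lines. The flag name seen and the shortened sample data are assumptions of this sketch, not something from the posts above.

```shell
# Shortened sample input based on the data from the first post.
input=$(mktemp)
cat > "$input" <<'EOF'
01,0000000,xxxxxxx/
03,test1,41203016,,/
04,xxxxx,00000,test2
03,test2,41203017,,/
EOF

# sub() returns the number of substitutions it made (0 or 1). Once it
# succeeds on an 03 line, seen is set and no later line is touched.
result=$(awk -F, -v ERE="test" -v rep="replace" \
    '$1 == "03" && !seen && sub(ERE, rep, $2) { seen = 1 } 1' OFS="," "$input")
printf '%s\n' "$result"
rm -f "$input"
```

Only the first 03 line (test1) should be changed; the later 03 line and the 04 line stay as they are. To target the Nth matching 03 line instead, a counter could be compared against N in place of the flag.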