Substitute first occurrence of keyword if occurrence between two other keywords


 
# 1  
Old 12-12-2015

Assume a string that contains one or more occurrences of three different keywords (abbreviated as "kw"). I would like to replace kw2 with some other string, say "qux". Specifically, I want to replace only the first occurrence of kw2 that is preceded by kw1 somewhere earlier in the string (kw1 need not be adjacent to kw2) and followed by kw3 somewhere later in the string (kw3 need not be adjacent to kw2).

Examples:
Code:
> echo "foo kw1 bar kw2 baz kw3 kw2 baz kw3" | sed ...
# Desired output: foo kw1 bar qux baz kw3 kw2 baz kw3

Code:
> echo "foo kw2 bar kw1 bar kw2 baz kw3" | sed ...
# Desired output: foo kw2 bar kw1 bar qux baz kw3

# 2  
Old 12-12-2015
Got Perl?

Code:
cat m_gruenstaeudl.examples
foo kw1 bar kw2 baz kw3 kw2 baz kw3
foo kw2 bar kw1 bar kw2 baz kw3
foo kw2 bar kw1 kw2 baz kw3
foo kw2 bar kw1 kw2 kw3

Code:
perl -ple 's/(kw1.*?)kw2(.*?kw3)/$1qux$2/' m_gruenstaeudl.examples
foo kw1 bar qux baz kw3 kw2 baz kw3
foo kw2 bar kw1 bar qux baz kw3
foo kw2 bar kw1 qux baz kw3
foo kw2 bar kw1 qux kw3

# 3  
Old 12-12-2015
Or take the scenic route with awk... :)

Code:
$ 
$ # Show the data file
$ cat f33
foo kw1 bar kw2 baz kw3 kw2 baz kw3
foo kw2 bar kw1 bar kw2 baz kw3
foo kw2 bar kw1 kw2 baz kw3
foo kw2 bar kw1 kw2 kw3
foo kw1 bar kw1 kw2 kw3 kw1 kw2 kw3
kw1 kw2 kw3 kw1 kw2 kw3
kw1 kw2 kw3
kw1 kw2 kw4
kw1 kw2 kw4 foo bar buzz
foo kw1 kw2 kw2 kw2 kw3 bar
$ 
$ # Show the awk script
$ cat f33.awk
BEGIN {TIMES = 0}
{
    for (i=1; i<=NF; i++) {
        if ($i == "kw1" && TIMES == 0) {
            # kw1 found: set IN, print it
            printf("%s ", $i)
            IN = 1
            j = 0
            TIMES++
        } else if (IN == 1) {
            # if we are here, TIMES=1 always, and we'll be here only once per "kw1..." pattern per line
            if ($i == "kw2") {
                # kw2 found: add to partial array, do not print
                j++
                partial[j] = $i
            } else if ($i == "kw3") {
                # kw3 found: overwrite 1st element (the first kw2) with qux, print array, clear IN, bump TIMES
                partial[1] = "qux"
                for (k=1; k<=j; k++) {
                    printf("%s ",partial[k])
                    delete partial[k]
                }
                printf("%s ", $i)
                TIMES++
                IN = 0
            } else if (j > 0) {
                # j > 0: partial array has been initialized; add to partial array, do not print
                j++
                partial[j] = $i
            } else {
                # j == 0 and field is something other than kw2 and kw3, print it
                printf("%s ", $i)
            }
        } else {
            # Either TIMES = 0 or TIMES > 1: print it
            printf("%s ", $i)
        }
    }
    # if partial array was initialized, because kw2 was found, but kw3 was never found,
    # then print the contents of the partial array and flush it
    if (length(partial) > 0) {
        for (k=1; k<=j; k++) {
            printf("%s ",partial[k])
            delete partial[k]
        }
    }
    # we are done with this line; reset variables and repeat
    printf("\n")
    IN = 0
    TIMES = 0
}
$ 
$ # Run the awk script
$ awk -f f33.awk f33
foo kw1 bar qux baz kw3 kw2 baz kw3 
foo kw2 bar kw1 bar qux baz kw3 
foo kw2 bar kw1 qux baz kw3 
foo kw2 bar kw1 qux kw3 
foo kw1 bar kw1 qux kw3 kw1 kw2 kw3 
kw1 qux kw3 kw1 kw2 kw3 
kw1 qux kw3 
kw1 kw2 kw4 
kw1 kw2 kw4 foo bar buzz 
foo kw1 qux kw2 kw2 kw3 bar 
$ 
$
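
For comparison, a shorter route through awk's string functions might look like the sketch below. It is not the script above: it assumes the keywords can be matched as plain substrings rather than whole fields, it leaves no trailing blank on each line, and the function name awk_replace is made up for the example.

```shell
# Hypothetical helper; matches keywords as substrings, not as whole fields.
awk_replace() {
    awk '{
        a = index($0, "kw1")                  # position of the first kw1 on the line
        if (a) {
            rest = substr($0, a + 3)          # text after that kw1
            b = index(rest, "kw2")            # first kw2 after kw1
            # Replace only if a kw3 follows that kw2
            if (b && index(substr(rest, b + 3), "kw3"))
                $0 = substr($0, 1, a + 2) substr(rest, 1, b - 1) "qux" substr(rest, b + 3)
        }
        print
    }' "$@"
}

printf '%s\n' 'foo kw1 kw2 kw2 kw2 kw3 bar' 'kw1 kw2 kw4 foo bar buzz' | awk_replace
```

Since the three keywords never overlap, taking the first kw1, then the first kw2 after it, then checking for any kw3 after that should pick the same kw2 as the non-greedy regex.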


Last edited by durden_tyler; 12-13-2015 at 01:57 AM..
# 4  
Old 12-13-2015
Hi M Gruenstaeudl,
Your specification is a little bit ambiguous. If all three keywords are found in order in an input line more than once (as in the line:
Code:
kw1 kw2 kw3 kw1 kw2 kw3

in durden_tyler's sample input), do you want kw2 replaced only in the first set of three keywords on the line, as in the output that Aia's perl script and durden_tyler's awk script produce:
Code:
kw1 qux kw3 kw1 kw2 kw3

or do you want the first occurrence of kw2 replaced in each set of three keywords:
Code:
kw1 qux kw3 kw1 qux kw3

?
# 5  
Old 12-13-2015
@Aia: Yes, perl seems to be the way to go here, not sed. Thanks for your answer. It answered my question to the point.
Code:
perl -pi -le 's/(kw1.*?)kw2(.*?kw3)/$1qux$2/' infile
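
For anyone who still needs sed: it has no non-greedy quantifier, but the same effect can probably be had by swapping kw2 for a single placeholder character first. A rough sketch, assuming the input never contains the byte \001 (the function name sed_replace and the variable sep are made up):

```shell
# Hypothetical helper: replace the first kw2 that follows a kw1 and precedes a kw3.
# Assumes the input never contains the byte \001, which is used as a placeholder.
sed_replace() {
    sep=$(printf '\001')
    # 1. turn every kw2 into the placeholder
    # 2. after kw1, [^placeholder]* cannot skip past the first kw2, so that one becomes qux
    # 3. turn the remaining placeholders back into kw2
    sed "s/kw2/$sep/g
s/\(kw1[^$sep]*\)$sep\(.*kw3\)/\1qux\2/
s/$sep/kw2/g" "$@"
}

printf '%s\n' 'foo kw1 bar kw2 baz kw3 kw2 baz kw3' | sed_replace
```

The perl one-liner is still shorter and easier to read, so this is only worth it where perl is not available.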
