Grep multiple patterns(file) and replace whole line


 
# 1  
Old 06-19-2019

I am able to grep for multiple patterns that are stored in a file. However, how can I replace each whole matching line with either the pattern or a new string?

For example:
pattern_file: *The text in () is not part of the pattern file. It is the intended name that should replace the whole line once the pattern is found, and is listed here for reference only.
Code:
hot.*aaa.* (H_A)
cold.*bbb.* (C_B)
cold.*aaa.* (C_A)
(.. lots more)

input_file:
Code:
hot_temp_aaa_first
hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last

Expected Output:
Code:
H_A
C_A
C_B
H_A
C_A

The output I get, from which I cannot tell how many patterns were found:
Code:
hot_temp_aaa_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
cold_bake_aaa_last

How can I replace them with either the new name or the pattern name? The reason I want to replace them is that I later need to count how many patterns were found. Maybe using
Code:
sort -u | wc

.
I am stuck after grepping all the matches: I still do not know how many patterns were found.
Code:
less input_file | grep -f pattern_file | ... | sort -u | wc

Thank you very much.

Moderator's Comments:
Mod Comment edit by bakunin: please use CODE-tags not only for code but also data and terminal output. Thank you.

Last edited by bakunin; 06-19-2019 at 06:15 AM..
# 2  
Old 06-19-2019
One solution using awk, without converting the original input lines into an intermediate format. It splits each line on "_" and counts each combination of the first and third fields.

Code:
awk -F"_" '{++a[$1$3]} END{for(i in a){print i" "a[i]}}' input_file

Output:
Code:
hotbbb 2
coldbbb 1
hotaaa 2
coldaaa 2

# 3  
Old 06-19-2019
Quote:
Originally Posted by wxboo
How can I replace them with either the new name or the pattern name? The reason I want to replace them is that I later need to count how many patterns were found. Maybe using
Code:
sort -u | wc

.
I am stuck after grepping all the matches: I still do not know how many patterns were found.
Code:
less input_file | grep -f pattern_file | ... | sort -u | wc

OK, first: if you want to change something, grep is not the right tool for it. You should use sed. grep is for finding things - but only finding, not changing them.
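As a minimal sketch of the sed approach (not necessarily what the poster had in mind), the three sample patterns from the first post can be written out as one substitution each. The labels H_A, C_B and C_A come from the first post; the temporary directory and the variable names are illustrative only.

```shell
# Sample input copied from the first post.
workdir=$(mktemp -d)
cat > "$workdir/input_file" <<'EOF'
hot_temp_aaa_first
hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last
EOF

# -n suppresses sed's default output; the p flag prints a line only when
# its substitution succeeded, so lines matching no pattern are dropped.
# ^.* and .*$ around each pattern make the replacement cover the whole line.
labels=$(sed -n \
  -e 's/^.*hot.*aaa.*$/H_A/p' \
  -e 's/^.*cold.*bbb.*$/C_B/p' \
  -e 's/^.*cold.*aaa.*$/C_A/p' \
  "$workdir/input_file")

printf '%s\n' "$labels"
rm -rf "$workdir"
```

With many patterns, typing one -e expression per pattern becomes tedious, so this is only meant to show the mechanism of replacing a whole line.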

Second: before you start on a solution you should define your problem correctly. For instance, your sample input file has seven lines, your expected output has five. Are the two missing lines left out on purpose? If yes, say so. If not, how should they be handled? Maybe left unchanged?

So, let us first rephrase your task. I will make some assumptions here which might as well be wrong. Don't hesitate to correct them:

You have an input file containing certain text patterns and a pattern file which you want to apply to the input. When a pattern is matched, you want to replace the whole line in the input with a certain marker, which is defined distinctly for each pattern. Lines not matched by any pattern should be deleted from the result set. In a final step you want to count how many markers of each kind are found in the result set.

Is that correct?
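If those assumptions hold, the task as rephrased could be sketched with awk roughly as follows. The two-column file label_map (marker first, regex second) is a hypothetical layout chosen for this sketch, and the rule "first matching pattern wins" is likewise an assumption, since the thread does not say what should happen when a line matches several patterns.

```shell
workdir=$(mktemp -d)

# Hypothetical two-column table: marker, then the regex it stands for.
cat > "$workdir/label_map" <<'EOF'
H_A hot.*aaa.*
C_B cold.*bbb.*
C_A cold.*aaa.*
EOF

# Sample input copied from the first post.
cat > "$workdir/input_file" <<'EOF'
hot_temp_aaa_first
hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last
EOF

# First file (NR == FNR): store markers and regexes in file order.
# Second file: print the marker of the first regex that matches the line;
# lines matching no regex print nothing and so drop out of the result.
# sort | uniq -c counts each marker; the last awk normalises the layout.
counts=$(awk '
  NR == FNR { label[++n] = $1; regex[n] = $2; next }
  { for (i = 1; i <= n; i++) if ($0 ~ regex[i]) { print label[i]; break } }
' "$workdir/label_map" "$workdir/input_file" | sort | uniq -c | awk '{ print $2, $1 }')

printf '%s\n' "$counts"
rm -rf "$workdir"
```

Everything runs in one pipeline and the input file is never modified.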

I hope this helps.

bakunin
# 4  
Old 06-19-2019
Another guess at what you might want:
Code:
while IFS= read pat
do
  printf "%s match %s times\n" "$pat" $(grep -c "$pat" input_file)
done < pattern_file

Code:
hot.*aaa.* match 2 times
cold.*bbb.* match 1 times
cold.*aaa.* match 2 times

# 5  
Old 06-20-2019
Thanks, everyone, for the input.

--- Post updated at 09:11 AM ---


krishmaths, thank you very much for the input.
This is a useful command that combines grouping and counting. Afterwards I can filter out the groups that are not in the pattern_file and achieve my purpose.
However, the grouping seems to be limited to one particular input format. The input file might look like the sample below, where the position of the keywords is quite random:
Code:
defect_hot_temp_chk_aaa_first
line_chk_hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last
cold_bake_10hrs_aaa_last
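One way to make the awk grouping independent of the field position is to search for the keywords anywhere in the line instead of picking fixed fields. This is only a sketch: it assumes the keywords of interest are hot/cold and aaa/bbb, which is a guess based on the samples, and it does not check in which order they appear.

```shell
workdir=$(mktemp -d)

# The less regular sample input from this post.
cat > "$workdir/input_file" <<'EOF'
defect_hot_temp_chk_aaa_first
line_chk_hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last
cold_bake_10hrs_aaa_last
EOF

# match() sets RSTART and RLENGTH to the position and length of the
# leftmost match, so substr() can extract the keyword wherever it sits.
summary=$(awk '
  match($0, /hot|cold/) {
    temp = substr($0, RSTART, RLENGTH)
    if (match($0, /aaa|bbb/)) { ++count[temp substr($0, RSTART, RLENGTH)] }
  }
  END { for (key in count) print key, count[key] }
' "$workdir/input_file" | sort)

printf '%s\n' "$summary"
rm -rf "$workdir"
```

The final sort is there because awk does not guarantee any order for "for (key in count)".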

--- Post updated at 10:06 AM ---

bakunin, thank you very much for sorting this out.

My initial aim is to find out how many of the patterns can be found in an input file.
Let's say I have 50 lines of patterns and 1000 lines of input. How many of the patterns occur in these 1000 lines? Maybe 400 lines match, but they cover only 30 patterns. These 400 lines are all different, so my idea is to group them and count. That is how I arrived at the grep-and-replace-line workflow.

The point is not to overwrite the input. I do not need an output file either. Ideally everything is done in a pipe and I just get the count.
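The count described here (how many of the patterns occur at least once) can be sketched without replacing any lines, by testing each pattern on its own. The fourth pattern below, warm.*ccc.*, is a made-up entry added only so that the sketch has one pattern that matches nothing.

```shell
workdir=$(mktemp -d)

# Patterns from the first post plus one hypothetical non-matching pattern.
cat > "$workdir/pattern_file" <<'EOF'
hot.*aaa.*
cold.*bbb.*
cold.*aaa.*
warm.*ccc.*
EOF

# Sample input copied from the first post.
cat > "$workdir/input_file" <<'EOF'
hot_temp_aaa_first
hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last
EOF

found=0
total=0
# read -r keeps backslashes in a pattern intact.
while IFS= read -r pat; do
  total=$((total + 1))
  # grep -q prints nothing and exits 0 as soon as one line matches.
  if grep -q -e "$pat" "$workdir/input_file"; then
    found=$((found + 1))
  fi
done < "$workdir/pattern_file"

printf '%s of %s patterns matched at least one line\n' "$found" "$total"
rm -rf "$workdir"
```

This runs grep once per pattern, which should be acceptable for 50 patterns and 1000 lines but may become slow for much larger files.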

--- Post updated at 10:30 AM ---


MadeInGermany, thank you very much for this. This suits what I want to do.

For those who have a new label to assign to each pattern, below is my thinking:
Code:
while IFS= read pat; do printf "%s match %s times\n" $(grep "$pat" pattern_grp | awk '{print $1}') $(grep -c "$pat" input_file); done < pattern_file

Format of pattern_grp:
Code:
H_A hot.*aaa.*
C_B cold.*bbb.*
C_A cold.*aaa.*

Output:
Code:
H_A match 2 times
C_B match 1 times
C_A match 2 times

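I use grep one more time to count the patterns that matched nothing (the leading space in ' 0 times' keeps "10 times" or "20 times" from being counted as well):
Code:
while IFS= read pat; do printf "%s match %s times\n" $(grep "$pat" pattern_grp | awk '{print $1}') $(grep -c "$pat" input_file); done < pattern_file | grep -c ' 0 times'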

*I am not a programmer and have very limited knowledge, so I try to use what I have.

Last edited by wxboo; 06-20-2019 at 05:09 AM..
# 6  
Old 06-21-2019
Looks too complicated.
Why 3 input files?
What does your pattern_grp file look like?
Say it looks like
Code:
H_A hot.*aaa.*
C_B cold.*bbb.*
C_A cold.*aaa.*

The value pairs seem to be related?
Then you can read both whitespace-separated columns into two variables:
Code:
while read sp pat; do printf "%s alias %s match %s times\n" "$sp" "$pat" "$(grep -c "$pat" input_file)"; done < pattern_grp

But why do you do all the printing with aliases when at the end you throw the output away, in favor of the number of non-matches?
--
BTW each expression in command arguments should be in "quotes", because otherwise the shell performs word splitting and filename expansion on the expanded value.
So there should be quotes around the $pat argument of the grep command, and another pair around the $( ) argument of the printf command.
The $( ) runs a subshell, so the quotes inside and outside do not conflict. I forgot the outer quotes in my previous post.
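The effect of the quotes can be seen in a small sketch. The helper count_args and the sample value are made up for this illustration; the function simply reports how many arguments it received.

```shell
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { echo "$#"; }

value='two words'

# Unquoted: the shell splits the expansion at the space into 2 arguments.
unquoted=$(count_args $value)

# Quoted: the expansion is passed on as a single argument.
quoted=$(count_args "$value")

# Quotes inside $( ) are parsed separately from the quotes around it,
# so nesting them is safe and the result is still a single argument.
nested=$(count_args "$(printf '%s' "$value")")

printf 'unquoted=%s quoted=%s nested=%s\n' "$unquoted" "$quoted" "$nested"
```

The same applies to a pattern containing a space or a glob character such as *: unquoted, it may reach grep as several arguments or be expanded to file names.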

Last edited by MadeInGermany; 06-21-2019 at 05:31 AM..