Grep multiple patterns(file) and replace whole line


# 1  

I am able to grep for multiple patterns that are stored in a file. However, how can I replace each whole matching line with either the pattern or a new string?

For example:
pattern_file: *The text in () is not part of the pattern file. It is the intended name that should replace the whole line once the pattern is found, listed here for reference.
Code:
hot.*aaa.* (H_A)
cold.*bbb.* (C_B)
cold.*aaa.* (C_A)
(.. lots more)

input_file:
Code:
hot_temp_aaa_first
hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last

Expected Output:
Code:
H_A
C_A
C_B
H_A
C_A

The output I get, from which I cannot tell how many patterns were found:
Code:
hot_temp_aaa_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
cold_bake_aaa_last

How can I replace them with either the new name or the pattern name? The reason I want to replace them is that later I need to count how many patterns were found, maybe using
Code:
sort -u | wc

I am stuck after grepping all the matches: I do not know how many patterns were found.
Code:
less input_file | grep -f pattern_file | ... | sort -u | wc

Thank you very much.

Moderator's Comments:
Mod Comment edit by bakunin: please use CODE-tags not only for code but also data and terminal output. Thank you.

Last edited by bakunin; 4 Weeks Ago at 06:15 AM..
# 2  
One solution using awk, without converting the original input lines into an intermediate format.

Code:
awk -F"_" '{++a[$1$3]} END{for(i in a){print i" "a[i]}}' input_file

Output:
Code:
hotbbb 2
coldbbb 1
hotaaa 2
coldaaa 2

# 3  
Quote:
Originally Posted by wxboo
How can I replace them with either the new name or the pattern name? The reason I want to replace them is that later I need to count how many patterns were found, maybe using
Code:
sort -u | wc

I am stuck after grepping all the matches: I do not know how many patterns were found.
Code:
less input_file | grep -f pattern_file | ... | sort -u | wc

OK, first: if you want to change something, grep is not the right tool for it. You should use sed. grep is for finding things - but only finding, not changing them.
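To illustrate the difference, here is a minimal sketch of whole-line replacement with sed, assuming the three patterns and labels from the first post are written out by hand as sed commands (the variable names `input` and `labels` are only for illustration). It relies on no label matching a later pattern; with other data that would need checking.

```shell
# Sample data taken from the first post.
input='hot_temp_aaa_first
hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last'

# -n suppresses the default output. Each s command replaces the whole
# line with its label, and the p flag prints the line only when that
# substitution actually happened, so unmatched lines are dropped.
labels=$(printf '%s\n' "$input" | sed -n \
  -e 's/^.*hot.*aaa.*$/H_A/p' \
  -e 's/^.*cold.*bbb.*$/C_B/p' \
  -e 's/^.*cold.*aaa.*$/C_A/p')

printf '%s\n' "$labels"
```

Piping the result through `sort | uniq -c` would then give a count per label.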

Second: before you start on a solution you should define your problem correctly. For instance, your sample input file has seven lines, your expected output has five. Are the two missing lines left out on purpose? If yes, say so. If not, how should they be handled? Maybe left unchanged?

So, let us first rephrase your task. I will make some assumptions here which might as well be wrong. Don't hesitate to correct them:

You have an input file containing certain text patterns and a pattern file which you want to apply to the input. When a pattern is matched you want to replace the whole line in the input with a certain marker, which is defined distinctly for each pattern found that way. Lines not matched by any pattern should be deleted from the result set. In a final step you want to count how many markers of each kind are found in the result set.

Is that correct?

I hope this helps.

bakunin
# 4  
Another guess at what you might want:
Code:
while IFS= read pat
do
  printf "%s match %s times\n" "$pat" $(grep -c "$pat" input_file)
done < pattern_file

Code:
hot.*aaa.* match 2 times
cold.*bbb.* match 1 times
cold.*aaa.* match 2 times

# 5  
Thanks everyone for the input

--- Post updated at 09:11 AM ---

Quote:
Originally Posted by krishmaths
One solution using awk, without converting the original input lines into an intermediate format.

Code:
awk -F"_" '{++a[$1$3]} END{for(i in a){print i" "a[i]}}' input_file

Output:
Code:
hotbbb 2
coldbbb 1
hotaaa 2
coldaaa 2

krishmaths, thank you very much for the input.
It is a useful command that combines the grouping and the count. After that I can filter out the groups that are not in the pattern_file and achieve the purpose.
But the grouping seems to be limited to a certain input format. The input file might look like the one below, which is quite random:
Code:
defect_hot_temp_chk_aaa_first
line_chk_hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last
cold_bake_10hrs_aaa_last
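For input like this, grouping by regular expression rather than by field position may be more robust. The following is only a sketch: it assumes a two-column file of label and pattern (the same layout as the pattern_grp file shown further down in this thread), patterns without whitespace, and that each line is counted for the first pattern it matches. The temporary directory and the variable `counts` are illustrative.

```shell
tmp=$(mktemp -d)

# Label/pattern pairs, one per line (assumed layout).
cat > "$tmp/pattern_grp" <<'EOF'
H_A hot.*aaa.*
C_B cold.*bbb.*
C_A cold.*aaa.*
EOF

# The irregular input shown above.
cat > "$tmp/input_file" <<'EOF'
defect_hot_temp_chk_aaa_first
line_chk_hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last
cold_bake_10hrs_aaa_last
EOF

counts=$(awk '
  # First file: remember labels and patterns in file order.
  NR == FNR { label[++n] = $1; pat[n] = $2; next }
  # Second file: count the line for the first pattern it matches.
  {
    for (i = 1; i <= n; i++)
      if ($0 ~ pat[i]) { count[label[i]]++; break }
  }
  # Report every label, including those with no match.
  END { for (i = 1; i <= n; i++) print label[i], count[label[i]] + 0 }
' "$tmp/pattern_grp" "$tmp/input_file")

printf '%s\n' "$counts"
rm -rf "$tmp"
```

Because the match is on the whole line, the number and position of the underscore-separated fields no longer matter.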

--- Post updated at 10:06 AM ---

Quote:
Originally Posted by bakunin
OK, first: if you want to change something, grep is not the right tool for it. You should use sed. grep is for finding things - but only finding, not changing them.

Second: before you start on a solution you should define your problem correctly. For instance, your sample input file has seven lines, your expected output has five. Are the two missing lines left out on purpose? If yes, say so. If not, how should they be handled? Maybe left unchanged?

So, let us first rephrase your task. I will make some assumptions here which might as well be wrong. Don't hesitate to correct them:

You have an input file containing certain text patterns and a pattern file which you want to apply to the input. When a pattern is matched you want to replace the whole line in the input with a certain marker, which is defined distinctly for each pattern found that way. Lines not matched by any pattern should be deleted from the result set. In a final step you want to count how many markers of each kind are found in the result set.

Is that correct?

I hope this helps.

bakunin
bakunin, thank you very much for sorting this out.

My initial thinking is to find out how many patterns occur in an input file.
Let's say I have 50 lines of patterns and 1000 lines of input. How many patterns are there in these 1000 lines? Maybe 400 lines match, but only 30 patterns. These 400 lines are all different, so my idea is to group them and count. That's how I came to the grep-and-replace-line workflow.

The focus is not on overwriting the input. I do not need an output file either. Doing everything in a pipe and getting the count would be best.
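A sketch of that idea in a single pipe, under some assumptions: the pattern file holds one regular expression per line and is not empty, and the fourth pattern (`warm.*ccc.*`) is invented here only to show a pattern that never matches. The input is the sample from the first post; `matched` is an illustrative variable name.

```shell
tmp=$(mktemp -d)

# Three patterns from the first post plus one made-up pattern with no match.
printf '%s\n' 'hot.*aaa.*' 'cold.*bbb.*' 'cold.*aaa.*' 'warm.*ccc.*' \
  > "$tmp/pattern_file"

input='hot_temp_aaa_first
hot_temp_bbb_first
cold_temp_aaa_last
cold_temp_bbb_first
hot_bake_aaa_last
hot_bake_bbb_last
cold_bake_aaa_last'

# awk reads the pattern file first, then standard input ("-").
# A pattern is counted once, the first time any input line matches it.
matched=$(printf '%s\n' "$input" | awk '
  NR == FNR { pat[++n] = $0; next }
  {
    for (i = 1; i <= n; i++)
      if (!seen[i] && $0 ~ pat[i]) { seen[i] = 1; hits++ }
  }
  END { print hits + 0 }
' "$tmp/pattern_file" -)

echo "$matched patterns found"
rm -rf "$tmp"
```

Nothing is written back to the input and no intermediate output file is needed; the only result is the number of distinct patterns that occurred.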

--- Post updated at 10:30 AM ---

Quote:
Originally Posted by MadeInGermany
Another guess at what you might want:
Code:
while IFS= read pat
do
  printf "%s match %s times\n" "$pat" $(grep -c "$pat" input_file)
done < pattern_file

Code:
hot.*aaa.* match 2 times
cold.*bbb.* match 1 times
cold.*aaa.* match 2 times

MadeInGermany, thank you very much for this. This suits what I want to do.

For those who have a new label to assign, below is my thinking:
Code:
while IFS= read pat; do printf "%s match %s times\n" $(grep "$pat" pattern_grp | awk '{print $1}') $(grep -c "$pat" input_file); done < pattern_file

Format of pattern_grp:
Code:
H_A hot.*aaa.*
C_B cold.*bbb.*
C_A cold.*aaa.*

Output:
Code:
H_A match 2 times
C_B match 1 times
C_A match 2 times

I use grep one more time to count the patterns that were not found:
Code:
while IFS= read pat; do printf "%s match %s times\n" $(grep "$pat" pattern_grp | awk '{print $1}') $(grep -c "$pat" input_file); done < pattern_file | grep -c ' match 0 times$'

*I am not a programmer and have very limited knowledge, so I try to use what I know.

Last edited by wxboo; 3 Weeks Ago at 05:09 AM..
# 6  
Looks too complicated.
Why three input files?
What does your pattern_grp file look like?
Say it looks like
Code:
H_A hot.*aaa.*
C_B cold.*bbb.*
C_A cold.*aaa.*

The value pairs seem to be related?
Then you can read both whitespace-separated columns into two variables:
Code:
while read sp pat; do printf "%s alias %s match %s times\n" "$sp" "$pat" "$(grep -c "$pat" input_file)"; done < pattern_grp

But why do you do all the printing with aliases when, at the end, you throw the output away in favor of the number of non-matches?
--
BTW, each expansion in a command argument should be in "quotes", so that the shell does not perform word splitting or filename expansion on the result.
So there should be quotes around the $pat argument of the grep command, and another pair around the $( ) argument of the printf command.
The $( ) runs a subshell, so the quotes inside and outside do not conflict. I forgot the outer quotes in my previous post.
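A small demonstration of both points, assuming a POSIX-style shell (sh or bash) with the default IFS. The helper `count_args` and the variable names are made up for this example.

```shell
# Helper (hypothetical) that reports how many arguments it received.
count_args() { echo "$#"; }

value='1 2'   # stands in for command output that contains a space

# Unquoted: the result of $( ) is split on whitespace into two arguments.
unquoted=$(count_args $(printf '%s' "$value"))

# Quoted: the result stays a single argument.
quoted=$(count_args "$(printf '%s' "$value")")

# Quotes inside $( ) are parsed separately from the quotes around it,
# so nesting them is safe and the pattern arrives unchanged.
pat='hot.*aaa.*'
nested=$(printf '%s' "$(printf '%s' "$pat")")

echo "unquoted=$unquoted quoted=$quoted nested=$nested"
```

An unquoted $pat is additionally subject to filename expansion, so a pattern containing * could be replaced by matching file names in the current directory before grep ever sees it.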

Last edited by MadeInGermany; 3 Weeks Ago at 05:31 AM..