Match multiple patterns sequentially in order - grep or awk
Post 302950068 by Don Cragun on Monday 20th of July 2015 04:35:21 PM
I'm sorry that you feel that way.

In this forum, we try to help people learn how to use the common tools provided on their system to do what they're trying to do, not just hand over complete scripts.

When I make a suggestion on how to do something, I try to provide a script that takes the sample input provided by the submitter (there was none in this case) and produces output that exactly matches the desired output specified by the submitter. While I could make up sample input data and write a script that would produce the sample output you provided, it would not match the description you gave of what you wanted to do. The specification of the output you wanted for missing fields (with no example output for that case) was also ambiguous.

If you had answered any of my questions or had shown that you had tried to fix RudiC's suggested code to more exactly meet your requirements, I would have suggested that you try something like:
Code:
awk '
BEGIN {	# Set search pattern:
	pat = "^From |^From: |^Subject: |^Message-Id: |^Date: |^To: "
	# Extract mail message headers from search pattern...
	nh = split(pat, h, "|")
	for(i = 1; i <= nh; i++) {
		# Remove "^" and " " from the headers:
		gsub(/[ ^]/, "", h[i])
		# Set field to be printed for missing headers:
		b[i] = sprintf("%s \"Blank\"", h[i])
	}
}
function dump() {
	# Function to print headers from a mail message...
	# Use fname rather than FILENAME: when this is called from the
	# FNR == 1 rule, FILENAME already names the new file.
	printf("File: \"%s\" message #%d\n", fname, ++msgcnt)
	for(i = 1; i <= nh; i++)
		if(h[i] in d) {
			printf("%s%s", d[h[i]], i == nh ? "\n" : "\t")
			delete d[h[i]]
		} else	printf("%s%s", b[i], i == nh ? "\n" : "\t")
}
FNR == 1 {
	# 1st line of new file found, print final results from previous file
	# and reset counters for this file.
	if(found)
		# Print headers from last mail message in previous file...
		dump()
	found = msgcnt = 0
	# Remember the name of the file now being read...
	fname = FILENAME
}
/^From / && found++ {
	# Print headers from previous mail message...
	dump()
}
$0 ~ pat {
	# Gather data from current mail message...
	d[$1] = $0
}
END {	# Print headers from last mail message...
	if(found)
		dump()
}' inbox1 inbox2 inbox3...

This produces output that I believe matches what you described in post #1 (except that it also outputs a line showing the file from which each message came and the sequence number within that file, in case you want to process more than one file at a time), and it makes a guess at the output you wanted for missing fields. I could also produce a 1-liner version of it, just to show that it can be done, but giving people who are trying to learn how to write code something that looks like an entry in an obfuscated code contest would not help them:
Code:
awk 'BEGIN{e="^From |^From: |^Subject: |^Message-Id: |^Date: |^To: ";n=split(e,h,"|");for(i=1;i<=n;i++){gsub(/[ ^]/,"",h[i]);b[i]=sprintf("%s \"Blank\"",h[i])}}function p(){printf("File: \"%s\" message #%d\n",F,++c);for(i=1;i<=n;i++)if(h[i] in d){printf("%s%s",d[h[i]],i==n?"\n":"\t");delete d[h[i]]}else printf("%s%s",b[i],i==n?"\n":"\t")}FNR==1{if(f)p();f=c=0;F=FILENAME}/^From /&&f++{p()}$0~e{d[$1]=$0}END{if(f)p()}' inbox1 inbox2 inbox3...
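
For anyone puzzling over either version, here is a minimal sketch of the two idioms both of them rely on: deriving the header names from the search pattern with split() and gsub(), and using a pattern combined with a post-incremented counter so that nothing is flushed when the first message starts. The shortened three-header pattern and the sample input lines are made up for illustration; they do not come from this thread.

```shell
# Idiom 1: derive header names from an anchored alternation.
# (Shortened, hypothetical pattern; the script above uses six alternatives.)
names=$(awk 'BEGIN {
	n = split("^From |^From: |^To: ", h, "|")
	for (i = 1; i <= n; i++) {
		gsub(/[ ^]/, "", h[i])	# strip the "^" anchor and the trailing space
		print h[i]
	}
}')
printf '%s\n' "$names"

# Idiom 2: "pattern && found++" is false for the first matching line
# (the post-increment yields 0) and true for every later matching line.
# Lines that do not match never touch the counter (short-circuit &&).
flushes=$(printf 'From a\nbody\nFrom b\nFrom c\n' |
	awk '/^From / && found++ { print "flush before: " $0 }')
printf '%s\n' "$flushes"
```

The first command prints From, From: and To: on separate lines. The second prints a "flush before" line for From b and From c but not for From a, which is why the final message still has to be printed separately in the END rule.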

As I said before, if someone showed me code like the above 1-liner and asked me to help them fix it, I would tell them to find someone else to clean up their mess.

As always, with either of these scripts, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
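
To try this without a real mailbox, something like the following sketch could be used. The two inbox files, the addresses, and the message contents are invented for illustration (no sample input was supplied in this thread), so treat the result as a smoke test rather than proof that it meets the original requirements. The script is a copy of the multi-line version; it keeps the name of the file being read in a variable called fname so that the last message of inbox1 is reported under inbox1 rather than inbox2.

```shell
# Made-up sample data: two hypothetical mbox files in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1

cat > inbox1 <<'EOF'
From alice@example.com Mon Jul 20 10:00:00 2015
From: Alice <alice@example.com>
Subject: Hello
Message-Id: <1@example.com>
Date: Mon, 20 Jul 2015 10:00:00 -0400
To: bob@example.com

Body text.
From carol@example.com Mon Jul 20 11:00:00 2015
From: Carol <carol@example.com>
Date: Mon, 20 Jul 2015 11:00:00 -0400
To: bob@example.com

This message has no subject or message id.
EOF

cat > inbox2 <<'EOF'
From dave@example.com Mon Jul 20 12:00:00 2015
From: Dave <dave@example.com>
Subject: Lunch
To: bob@example.com
EOF

out=$(awk '
BEGIN {	pat = "^From |^From: |^Subject: |^Message-Id: |^Date: |^To: "
	nh = split(pat, h, "|")
	for(i = 1; i <= nh; i++) {
		gsub(/[ ^]/, "", h[i])			# "^From " -> "From"
		b[i] = sprintf("%s \"Blank\"", h[i])	# placeholder field
	}
}
function dump() {
	# Print one message: a File: line, then the tab-separated headers.
	printf("File: \"%s\" message #%d\n", fname, ++msgcnt)
	for(i = 1; i <= nh; i++)
		if(h[i] in d) {
			printf("%s%s", d[h[i]], i == nh ? "\n" : "\t")
			delete d[h[i]]
		} else	printf("%s%s", b[i], i == nh ? "\n" : "\t")
}
FNR == 1 {	# New file: flush the previous file, then reset.
	if(found)
		dump()
	found = msgcnt = 0
	fname = FILENAME
}
/^From / && found++ { dump() }	# A later message starts: flush the previous one.
$0 ~ pat { d[$1] = $0 }		# Remember each header line by its first field.
END {	if(found)
		dump()
}' inbox1 inbox2)
printf '%s\n' "$out"
```

With this input there should be three File: lines (two for inbox1, one for inbox2), each followed by one tab-separated line of headers, with fields such as Subject: "Blank" standing in for headers a message does not have.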
 
