Match multiple patterns sequentially in order - grep or awk


 
# 1  
Old 07-18-2015

Hello.

grep v2.21
Debian 8

I wish to search for and output these patterns in order;
Code:
"From " "To: " "Subject: " "Message-Id: " "Date: " "To: "

grep works, but not in strict order...

Code:
$ grep -a -E "^From |^Subject:|^From: |^Message-Id: |^Date: |^To: " Inbox

Result;
Code:
From - Wed Feb 18 17:18:31 2015
Subject: Re: Ram pictures
From: "Jonathan @ The Verge " <leasing@theverge.org>
Message-Id: <C9D1D4Z269-D35C-476D-8B5C@aol.com>
Date: Wed, 11 Feb 2015 20:20:31 -0800
To: deejai <Kris@sansibarr.com>

Problem 1;
The headers may occur in a different order in each message.
I wish to search each record (from one "^From " to the next "^From ")
and display the matches in search order, then move on to the next message ("^From ").

Problem 2;
I can't get grep or awk to output the matched lines for a message on one line, separated by tabs.

Problem 3;

If one of the patterns doesn't match, can a string "Blank" be used as a result ?

Desired Result;
Code:
From - Wed Feb 18 17:18:31 2015 <tab> From: howday <howday@netcom.com> Subject: Testing <tab> Message-Id: 20952083209 <tab> Date: Date: Tue, 17 Feb 2015 20:31:12 -0800 <tab> To: "better@joomz.org" <better@joomz.org>

Question;
Can this be done all in grep or awk or a combination of both?

I have tried and so far am stuck, any help would be much appreciated.

I also tried...,
Code:
awk '/^From / || /^Subject: / || /^From: / || /^Message-Id: / ||  /^Date: / || /^To: /' Inbox

It looks like this doesn't display the lines 'in order' either.
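A small illustration of why both attempts behave this way: grep and a pattern-only awk program are line filters, so they print matching lines in the order they occur in the input, whatever order the alternatives are written in. The two-line sample below is made up for the demonstration.

```shell
# Hypothetical sample: Subject comes before Date in the input.
sample='Subject: Testing
Date: Tue, 17 Feb 2015 20:31:12 -0800'

# Date is listed first in both patterns, yet Subject is still printed first,
# because each line is tested and printed as it is read.
grep_out=$(printf '%s\n' "$sample" | grep -E '^Date: |^Subject: ')
awk_out=$(printf '%s\n' "$sample" | awk '/^Date: / || /^Subject: /')

printf 'grep:\n%s\nawk:\n%s\n' "$grep_out" "$awk_out"
```

To get a fixed output order, the matched lines have to be stored per message and printed afterwards.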

Thank you.

Last edited by Don Cragun; 07-18-2015 at 03:26 PM.. Reason: Add CODE and ICODE tags.
# 2  
Old 07-18-2015
The order depends on the order in which the sender wrote the header lines. Try
Code:
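awk '
# Build the ordered key list and create an (empty) VALS entry for each key.
BEGIN           {for (MX=i=split ("From To: Subject: Message-Id: Date:", KEYS); i>0; i--) VALS[KEYS[i]]
                }

# Print the saved lines in key order, then clear them for the next message.
function prtit()        {printf "\n"
                         for (i=1; i<=MX; i++) {print VALS[KEYS[i]]; VALS[KEYS[i]]=""}
                        }

# Save each wanted line; a new "From" line first flushes the previous message.
$1 in VALS      {if ($1==KEYS[1] && NR>1) prtit()
                 VALS[$1]=$0}

END             {prtit()}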
' /home/...default/Mail/Local\ Folders/Inbox
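For Problems 2 and 3, the same store-then-print idea could be adapted roughly as below. This is only a sketch under assumptions that the thread does not settle: the key order is taken from the "Desired Result" line, a missing header is shown as the bare word Blank, header names are matched case-sensitively, and only the first occurrence of each header in a message is kept. The function name `mbox_summary` and the sample message are made up for illustration.

```shell
# Hypothetical helper: print one tab-separated line per message,
# with the headers in a fixed order and "Blank" for any that are missing.
mbox_summary() {
    awk '
    BEGIN { n = split("From From: Subject: Message-Id: Date: To:", keys, " ") }

    # Print the headers collected for one message, then forget them.
    function emit(   i, field, line) {
        if (!seen) return
        for (i = 1; i <= n; i++) {
            field = (keys[i] in val) ? val[keys[i]] : "Blank"
            line = (i == 1) ? field : line "\t" field
        }
        print line
        split("", val)      # portable way to empty an array
        seen = 0
    }

    # A "From " separator line starts a new message: flush the previous one.
    $1 == "From" { emit() }

    # Remember the first line seen for each wanted key.
    {
        for (i = 1; i <= n; i++)
            if ($1 == keys[i] && !(keys[i] in val)) { val[keys[i]] = $0; seen = 1 }
    }

    END { emit() }
    ' "$@"
}

# Made-up message with no Message-Id: line.
summary=$(printf '%s\n' \
    'From - Wed Feb 18 17:18:31 2015' \
    'Subject: Testing' \
    'From: howday <howday@netcom.com>' \
    'Date: Tue, 17 Feb 2015 20:31:12 -0800' \
    'To: "better@joomz.org" <better@joomz.org>' | mbox_summary)
printf '%s\n' "$summary"
```

With no file arguments the function reads standard input; given a file name it reads that file instead. Body lines that happen to start with one of the keys would also match, so a stricter version would stop collecting at the first empty line of each message.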

# 3  
Old 07-18-2015
Hi DSommers,
With the excellent starting point RudiC provided, can you show us how you might modify it to address your Problems 2 & 3?
# 4  
Old 07-19-2015
RudiC, Don. I appreciate your help. Sorry for not mentioning that I wish to achieve my objective on a single line, if at all possible. Thank you again. If this isn't possible, I'll pass for now. Cheers.
# 5  
Old 07-19-2015
Not sure I understand. Is your request satisfied, and are you happy with it? Or not, and if so, are you going to improve the proposal, or do you expect us to do so?
# 6  
Old 07-19-2015
First: you can write almost any awk script in one line. For anything this complex, however, I would never do that. I will take readable (and maintainable) over single-line every time.
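As a small illustration of that point, the sketch below runs the same trivial awk program twice, once squeezed onto one line and once spread out with comments. The sample lines (including the second "From " line) and the variable names are made up; the program just counts messages and Subject: lines.

```shell
# Hypothetical sample: two message separators, one Subject: header.
sample='From - Wed Feb 18 17:18:31 2015
Subject: Re: Ram pictures
From - Thu Feb 19 09:00:00 2015'

# One-line form.
one_line=$(printf '%s\n' "$sample" | awk '$1=="From"{m++} $1=="Subject:"{s++} END{print m+0, s+0}')

# The same program written for readability.
multi_line=$(printf '%s\n' "$sample" | awk '
    # Count message separator lines.
    $1 == "From"     { messages++ }

    # Count Subject: header lines.
    $1 == "Subject:" { subjects++ }

    # Adding 0 prints a number even when a counter was never set.
    END { print messages + 0, subjects + 0 }
')

printf 'one line: %s\nmulti line: %s\n' "$one_line" "$multi_line"
```

Both forms produce identical output; the only difference is how easy the code is to read and change later.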

Second: RudiC gave you a good starting point to reach your goal. If you are playing a game to see if you can get volunteers from the UNIX and Linux Forum to write code for you (with increasingly complex requirements as the thread goes on), that isn't what we're here for. We want to help you learn how to write your own code, not act as your unpaid programming staff.

Third: Your requirements are confusing and inconsistent. First you say:
Quote:
I wish to search for and output these patterns in order;
Code:
"From " "To: " "Subject: " "Message-Id: " "Date: " "To: "

Note that "From: " is not in this list. Note that "To: " is in this list twice. Then you say that the output you want is:
Quote:
Code:
From - Wed Feb 18 17:18:31 2015 <tab> From: howday <howday@netcom.com> Subject: Testing <tab> Message-Id: 20952083209 <tab> Date: Date: Tue, 17 Feb 2015 20:31:12 -0800 <tab> To: "better@joomz.org" <better@joomz.org>

which has output in the order:
Code:
"From " "From: " "Subject: " "Message-ID: " "Date: Date: " "To: "

I could assume that the 1st "To: " from the 1st quote above was a typo and that "From: " was the intended header. But why is "Date: Date: " expected to be part of your output? And why is there just a <space> between the From: data and the Subject: data instead of the <space><tab><space> that you apparently want between the other fields? Why do you want <space><tab><space> separators between fields instead of just a <tab>?

Then you say:
Quote:
Problem 3;
If one of the patterns doesn't match, can a string "Blank" be used as a result ?
But, you don't show us any example of what that should look like in the output. If, for example, there is no Subject: line in a mail message, are you hoping to get: Subject: "Blank", Subject: Blank, "Blank", or Blank for the Subject: section in your output?
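To make the four possibilities concrete, here is a sketch that renders a missing header in each style. The function name `render_missing` and the style names are invented for this example; nothing in the thread says which style is wanted.

```shell
# Hypothetical helper: print the placeholder for a missing header.
# $1 is the header label, $2 is one of the made-up style names below.
render_missing() {
    awk -v label="$1" -v style="$2" 'BEGIN {
        if      (style == "label-quoted") print label " \"Blank\""
        else if (style == "label-bare")   print label " Blank"
        else if (style == "quoted")       print "\"Blank\""
        else                              print "Blank"
    }'
}

# Show all four renderings for a missing Subject: line.
for style in label-quoted label-bare quoted bare; do
    render_missing 'Subject:' "$style"
done
```

Whichever style is chosen, it only changes the one place in the script where the placeholder is substituted, which is why the requirement needs to be stated precisely.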

Please take the suggestion RudiC provided and try to modify it to meet your requirements. If you run into problems, show us what you have tried and explain where you are stuck, and we'll be happy to help you. Remember to use CODE tags when presenting sample input, output, and code segments. And, be aware that if you present a single-line attempt to solve this problem, it will make it much harder (and less likely) for us to see the structure of your code, to understand the logic behind your attempt to solve your problem, and for us to figure out what needs to be fixed to turn code that doesn't quite do what you want into code that does exactly what you want.
# 7  
Old 07-19-2015
Don, understood. Thank you for your help. Take care.
Cheers.

Quote:
"First: you can write almost any awk script in one line. For anything this complex, however, I would never do that. I will take readable (and maintainable) over single-line every time."
[SOLVED]