String comparison: All lines in file but each within line


 
# 1  
Old 03-07-2010

Hi folks,
I'm trying to do a bit of extra monitoring on who is sending mail through our server and as I'm sure you'll understand, the log files aren't exactly small.
As a note, I'm working to sort out a solution to this myself, so I'll keep editing the post when I get a chance.

Here is an example Exim log line. The file also contains lines for other types of email, which are to be ignored completely.

Code:
2010-03-07 11:34:16 1NoEkR-000AJf-NV <= email@domain.tld H=11.22.33.44 (helo.domain.tld) [11.22.33.44] P=esmtpa A=fixed_login:username S=35249 id=messageid@domain.tld

There are three cases for valid mail through the system being sent with fixed_login.
  1. The username for fixed_login will match the email@domain.tld earlier in the line
  2. The username will be a substring of the domain specified in email@domain.tld earlier in the line
  3. There are some cases where multiple domains come from one server. These domains are all the same, but with different tlds. So, we have cases where the username is user@domain.co.uk but email comes from user@domain.com.

So what I need to put together is a script that does the following:

1) Scans through the logfile and ignores all lines that do not contain "fixed_login", i.e. "grep fixed_login mainlog". That part is sorted.

We then move onto processing and comparison which I know nothing about, except for how to plan it. Best I can tell, the following would produce a script that would catch all lines that don't match the criteria

2) Search the line for "fixed_login:username" and discard "fixed_login:" giving string A. If string A looks like an email address, discard the "email@" and ".tld" parts.

3) Search the same line for the first instance of anything that looks like an email address. Discard the "email@" and ".tld" parts thus giving string B "domain"

4) Convert strings A and B to lowercase and compare them. If they match, or A is a substring of B, then ignore the line and move on to the next line.

5) If they don't match we then need to take the third variable (message ID) and run a grep on the logfile for that ID. All lines thus relating to the message will be output before moving on to the next line.

6) The final output after all lines are checked will be emailed off (this will be run from cron so just standard output will do and cron will take care of the mail)
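The steps above can be sketched as a small shell script. This is only a sketch under a few assumptions that are not stated in the thread: the log is uncompressed, the "message ID" in step 5 is the Exim ID in the third whitespace-separated field, and the function names `suspect_ids` and `report` are made up for illustration. It finds the sender and the login by their content (`<=` and `A=fixed_login:`) rather than by a fixed field number.

```shell
#!/bin/sh
# Sketch only. suspect_ids and report are hypothetical helper names.

# Steps 1 to 4: print the ID of every fixed_login line whose login
# does not match (or is not a substring of) the sender domain.
suspect_ids() {
    awk '
    # Reduce "user@domain.co.uk" or "user+domain.com" to "domain".
    # A plain username is returned unchanged, apart from lowercasing.
    function core(s) {
        s = tolower(s)
        if (s ~ /[@+]/) {
            sub(/^.*[@+]/, "", s)   # drop the local part
            sub(/\..*$/, "", s)     # drop everything from the first dot
        }
        return s
    }
    /fixed_login/ {
        tag = "A=fixed_login:"
        sender = ""; login = ""
        # Locate fields by content, since positions vary between lines.
        for (i = 1; i <= NF; i++) {
            if ($i == "<=" && i < NF) sender = $(i + 1)
            else if (index($i, tag) == 1) login = substr($i, length(tag) + 1)
        }
        if (sender == "" || login == "") next
        a = core(login); b = core(sender)
        # Assumption: field 3 is the Exim message ID.
        if (index(b, a) == 0) print $3
    }' "$1"
}

# Step 5: print every log line that mentions a suspect ID.
report() {
    suspect_ids "$1" | sort -u | while read -r id; do
        grep -F -- "$id" "$1"
    done
}
```

Calling `report` with the path to the log as its only argument writes to standard output, so cron can mail the result as in step 6. It runs one grep per suspect ID, which is simple but may be slow on very large logs.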

I did use awk some years back for basic string processing but can't remember any of it now! I can't use $9 or things like that because there are a few different types of lines that put different bits of information in and change the placement.

So far I have a sed line that strips things down, but it's getting a bit clumsy.

Code:
zgrep fixed_login mainlog.0.gz | sed 's/.* \(.*\) <= \(.*\) H=.*A=fixed_login:\(.*\) S.*/\1 \2 \3/g' | sed 's/\(.*\) .*@\(.*\)\..* /\1 \2 /g' | sed 's/\(.*\) .*@\(.*\)\..*$/\1 \2/g'

I had to use two sed instances because I needed the first output to also match lines where the second and third variables don't have an email@domain.tld. I then had to add a third sed because parameter 2 ends on a space and parameter 3 ends at EOL. Sticking a space on the end of the whole string and trying to use one sed produced completely the wrong results!

There are, however, two problems with this. Firstly, if the address is email@domain.co.uk or has some other double-barrelled TLD, it only strips the last part (.uk). Secondly, some of the users have written their usernames as email+domain.tld, so I need to add @|+ into the 2nd and 3rd seds. I'm having trouble figuring out how to put the pipe in and have it affect only the character immediately either side of it rather than the whole expression. I'd rather fix that than go silly and put a couple of extra seds on!
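One possible way around both problems, offered as a sketch rather than a tested fix for the real logs: a bracket expression `[@+]` matches exactly one character that is either `@` or `+`, so no pipe is needed, and matching the domain as "everything up to the first dot" drops a multi-part TLD such as .co.uk in one go. The function name `strip_addresses` is made up for illustration, and the input is assumed to be the space-separated "ID sender login" output of the first sed.

```shell
#!/bin/sh
# Sketch only. strip_addresses is a hypothetical helper name.
# Replace each address-like token (local part, then @ or +, then a
# dotted domain) with the domain label before its first dot.
# Tokens without @ or +, such as a plain username, are left alone.
strip_addresses() {
    sed 's/[^ ]*[@+]\([^ .@+]*\)\.[^ ]*/\1/g'
}

# Example: both address forms reduce to the bare domain label.
printf '%s\n' '1NoEkR-000AJf-NV email@domain.co.uk user+domain.com' |
    strip_addresses
# prints: 1NoEkR-000AJf-NV domain domain
```

Because the same expression handles a token that ends on a space and one that ends at EOL, this single sed could stand in for the 2nd and 3rd seds in the pipeline above.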

Last edited by beddo; 03-07-2010 at 09:57 AM..
# 2  
Old 03-09-2010
In my opinion, it would be easier if you posted a bigger sample of your input data (the Exim log) and an example of the desired output.
# 3  
Old 03-11-2010
I'm not sure that would work, as the output depends on the content. That is why I have described the different stages. Maybe I can break the stages apart into examples of input and output, but the main problem is that I want to take the output of some stages and use it as input to other stages!