Need help with a shell script for finding a certain pattern from log file


 
# 1  
Old 02-08-2011

Hello Guys,

Being a newbie, I need some help finding a certain string in a log file of around 30K lines and writing the output to a separate file.

Below is a sample line from the file:
Code:
10.155.65.5 - - [20/Jan/2011:07:41:58 +0100] "POST /cas/login?post=true&service=http://test.domain.ca:8000/psp/ppf88prd/?cmd=login%26languageCd=CAN%26userid=VP1%26pwd=TEST123 HTTP/1.1" 200 888

What I am after is the part that follows "&service=": for now I need to capture the http:// URL there.

Please note that I need the output in a separate file with each match on its own line (basically neatly arranged).
Another problem is that the log file also contains other entries such as these:
Code:
19.489.50.8 - - [25/Jan/2011:00:00:11 +0100] "GET /cas/themes/testdomains/fondCas.jpg HTTP/1.1" 200 29659
17.538.23.034 - - [25/Jan/2011:00:00:12 +0100] "GET /cas/status.jsp HTTP/1.0" 200 104

I do not need to check those, since I am only looking for log lines that contain "&service=" and only wish to capture the URL after this pattern, one per line, in a separate file.

I have looked through a lot of threads doing similar things, with lots of very helpful small commands using grep, sed and awk being offered. Being a novice in all of these, though, I find it almost impossible to tweak them to my requirement, and so I have posted here.

I would really appreciate it if someone could guide me on this.

Thanks,
Andy


Moderator's Comments:
Please use code tags when posting data and code samples, thank you.

Last edited by Franklin52; 02-08-2011 at 03:38 AM..
# 2  
Old 02-08-2011
Something like this?
Code:
sed -n 's!.*service=\(http://[^:]*\):.*!\1!p' logfile > newfile

# 3  
Old 02-08-2011
Truly amazing Frank!!!
You just did an awesome job and it did work for me.

Guess I got really excited since this was my first post though I have been visiting this forum for a while now.

If you don't mind, can you please tell me a bit about those regular expressions?
I assume ^ stands for first line though not sure about !, ', ], p etc..

I am learning all of this, though it will be at least a couple of months before I get close to this level. One last favour: I already have plenty of online sites, content and ebooks to go through on general Linux and Unix topics, but if you can refer me to any good books for scripting beginners on both shell and Perl, that would be really great. This is not very urgent, though.

Thanks once again,
Andy.
# 4  
Old 02-08-2011
Code:
sed -n 's!.*service=\(http://[^:]*\):.*!\1!p' logfile > newfile

A pattern enclosed in \( and \), such as \(.*\), saves the substring it matches, which can be recalled with \1.

Code:
\(http://[^:]*\)

This substring contains the part after "service=" and it starts with "http://".

[^:]* means any run of characters that are not colons.

:.* after the substring matches the next colon and the rest of the line.
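
To see the pieces of the command working together, here is a small sketch that runs it on two of the lines quoted in post #1. The file names sample.log and urls.txt are only illustrative, and the comments cover the remaining symbols (!, -n and p). Note that the pattern relies on a colon (the port separator in the sample) following the host name, so lines whose service URL has no port may not match as intended.

```shell
# Build a small sample log from two lines quoted in post #1.
# "sample.log" and "urls.txt" are illustrative names, not from the thread.
cat > sample.log <<'EOF'
10.155.65.5 - - [20/Jan/2011:07:41:58 +0100] "POST /cas/login?post=true&service=http://test.domain.ca:8000/psp/ppf88prd/?cmd=login%26languageCd=CAN%26userid=VP1%26pwd=TEST123 HTTP/1.1" 200 888
19.489.50.8 - - [25/Jan/2011:00:00:11 +0100] "GET /cas/themes/testdomains/fondCas.jpg HTTP/1.1" 200 29659
EOF

# -n        : do not print lines automatically
# s!A!B!    : substitute A with B; "!" is used as the delimiter instead of
#             the usual "/", so the slashes in "http://" need no escaping
# \( ... \) : save the text matched inside; \1 recalls it
# [^:]*     : any run of characters that are not ":"
# p         : print the line only when the substitution succeeded
sed -n 's!.*service=\(http://[^:]*\):.*!\1!p' sample.log > urls.txt

cat urls.txt
```

With this input, only the first line contains "service=", so urls.txt should hold the single line http://test.domain.ca and the GET line is skipped.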

Here you can find some tutorial links:

https://www.unix.com/answers-frequent...tutorials.html

An excellent book for sed and awk:

http://oreilly.com/catalog/9781565922259
# 5  
Old 02-08-2011
One more thing I wanted was to remove any duplicate lines from output that runs to hundreds of lines.

For example, if I have the following:

portal-test-domain-com
portal-test-domain-com
portal-test-domain-com

I just need them to be only one line saying portal-test-domain-com instead of 3.

Also I need to remove certain parts of each line.

For example:
abbbs://porta-test-domain.com/wps/urportal/weber/localbr&ticket=ST-385945-1cdbuEe1o57neMuMgIWa-1681078B-DFC2-7B56-E58B-AA15B18411AD&pgtUrl=https

I need to remove the leading abbbs and everything after domain.com.


If someone can help me, that would be really great, since I am still learning these things and it will be at least a couple of weeks before I can do them on my own.

Thanks in advance,
Andy.

---------- Post updated at 05:05 PM ---------- Previous update was at 04:30 PM ----------

Ok, I have just managed to remove the leading http:// part from ALL lines of the log file and have also removed the https part from ALL lines, which gives much better output for now.

The only part remaining for now is to find and remove all duplicate URLs as mentioned above. Will post a fix here if I find any; till then all suggestions are welcome.

Thanks,
Andy
# 6  
Old 02-08-2011
To remove duplicates from your output you can do something like:
Code:
<commands> | awk '!a[$0]++'
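
As a rough sketch of how this could fit together with the trimming asked about in post #5: the helper names trim_url and dedupe, the second and third sample lines, and the reading of "remove abbbs" as "remove the scheme and the ://" are all assumptions made for illustration, not something from the thread.

```shell
# Hypothetical helper: strip a leading "scheme://" and everything from the
# first "/" onward, leaving only the host part of each line.
trim_url() {
    sed -e 's!^[a-z]*://!!' -e 's!/.*!!'
}

# Print each distinct line once, keeping first-seen order.
# a[$0]++ is 0 (false) the first time a line appears, so !a[$0]++ is true
# only then, and awk's default action for a true pattern is to print.
dedupe() {
    awk '!a[$0]++'
}

# Illustrative input: shortened variants of the line quoted in post #5.
printf '%s\n' \
    'abbbs://porta-test-domain.com/wps/urportal/weber/localbr&ticket=ST-385945' \
    'abbbs://porta-test-domain.com/wps/urportal' \
    'abbbs://other-test-domain.com/wps/urportal' |
    trim_url | dedupe
```

Unlike uniq, which only collapses adjacent duplicates, the awk one-liner catches repeats anywhere in the input, and unlike sort -u it keeps the original order of first appearance.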

# 7  
Old 02-08-2011
Sorry for bothering you, I just did that and tried a few sed and uniq options to play around as well.

Thanks Frank for all your help. It is amazing how much one can learn in less than 24 hours; this was very helpful for a newbie like me.

Best Regards,
Andy