Getting info from a huge log file


 
# 1  
Old 09-10-2011

Hello everyone.

I am having trouble parsing data from a huge log file. It is an application log of around 5 GB, and it rotates every midnight.

When the application encounters a certain issue, it sends an email with specific info but no further details. So I need to log in to the server and grep the log file, but I am having a hard time getting accurate results.

Since I don't have the exact log file at hand, I am providing a sample log taken from /var/log/messages on my Linux box.

The email I receive usually carries a specific timestamp, which I need in order to find the exact location of the info.

"06:10:28 mymachine kernel: This is it:: BIOS INFO 123: Get Bios info on the upper part of the log"

Based on that info above I need to get some details from the log.

06:10:28 mymachine kernel: BIOS Name: 123456
06:10:28 mymachine kernel: BIOS Date: 789123
06:10:28 mymachine kernel: BIOS QUALITY: 34567890


That is the info I need, but it sits higher up in the log, and I don't know how to search upwards once I have found the string "BIOS INFO 123: Get Bios info on the upper part of the log".

So my requirement is to grep by timestamp for the string "BIOS INFO 123: Get Bios info on the upper part of the log", then get the info from the upper part of the log and print the 3 items I need. By the way, I can't tell in advance how many lines above the matched string the 3 items are located.

Here's a sample log. I hope someone can help me. Thanks a lot.


Code:
06:10:28 mymachine syslogd 1.4.1: restart.
06:10:28 mymachine syslog: syslogd startup succeeded
06:10:28 mymachine kernel: klogd 1.4.1, log source = /proc/kmsg started.
06:10:28 mymachine kernel: Linux version 2.4.22-1.2115.nptlsmp (gcc version 3.2.3 20030422 (Red Hat Linux 3.2.3-6)) #1 SMP Wed Oct 29 15:30:09 EST 2003
06:10:28 mymachine kernel: BIOS-provided physical RAM map:
06:10:28 mymachine kernel: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
06:10:28 mymachine kernel: BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
06:10:28 mymachine kernel: BIOS-e820: 0000000000100000 - 000000003ff70000 (usable)
06:10:28 mymachine kernel: BIOS-e820: 000000003ff70000 - 000000003ff72000 (ACPI NVS)
06:10:28 mymachine kernel: BIOS-e820: 000000003ff72000 - 000000003ff93000 (ACPI data)
06:10:28 mymachine kernel: BIOS-e820: 000000003ff93000 - 0000000040000000 (reserved)
06:10:28 mymachine kernel: BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
06:10:28 mymachine kernel: BIOS Name: 123456
06:10:28 mymachine kernel: BIOS Date: 789123
06:10:28 mymachine kernel: BIOS QUALITY: 34567890
06:10:28 mymachine kernel: BIOS-e820: 000000003ff93000 - 0000000040000000 (reserved)
06:10:28 mymachine kernel: BIOS-e820: 00000000fecf0000 - 00000000fecf1000 (reserved)
06:10:28 mymachine kernel: BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
06:10:28 mymachine kernel: BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
06:10:28 mymachine kernel: BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
06:10:28 mymachine kernel: 127MB HIGHMEM available.
06:10:28 mymachine kernel: 896MB LOWMEM available.
06:10:28 mymachine kernel: found SMP MP-table at 000fe710
06:10:28 mymachine kernel: hm, page 000fe000 reserved twice.
06:10:28 mymachine kernel: Intel machine check reporting enabled on CPU#0.
06:10:28 mymachine kernel: CPU0: Intel(R) Pentium(R) 4 CPU 3.20GHz stepping 09
06:10:28 mymachine random: Initializing random number generator: succeeded
06:10:28 mymachine kernel: per-CPU timeslice cutoff: 1462.76 usecs.
06:10:28 mymachine kernel: task migration cache decay timeout: 10 msecs.
06:10:28 mymachine kernel: enabled ExtINT on CPU#0
06:10:28 mymachine kernel: ESR value before enabling vector: 00000040
06:10:28 mymachine kernel: ESR value after enabling vector: 00000000
06:10:28 mymachine kernel: Booting processor 1/1 eip 3000
06:10:28 mymachine kernel: Initializing CPU#1
06:10:28 mymachine kernel: masked ExtINT on CPU#1
06:10:28 mymachine kernel: ESR value CPU1: found SMP MP table
06:10:28 mymachine kernel: ESR value before enabling vector: 00000000
06:10:28 mymachine kernel: ESR value after enabling vector: 00000000
06:10:28 mymachine kernel: Calibrating delay loop... 6383.20 BogoMIPS
06:10:28 mymachine kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
06:10:28 mymachine kernel: CPU: L2 cache: 512K
06:10:28 mymachine kernel: CPU: Physical Processor ID: 0
06:10:28 mymachine kernel: This is it:: BIOS INFO: Get Bios info on the upper part of the log
06:10:28 mymachine kernel: Intel machine check reporting enabled on CPU#1.
06:10:28 mymachine kernel: CPU1: Intel(R) Pentium(R) 4 CPU 3.20GHz stepping 09
06:10:28 mymachine kernel: Total of 2 processors activated (12753.30 BogoMIPS).
06:10:28 mymachine rc: Starting pcmcia: succeeded
06:10:28 mymachine kernel: ENABLING IO-APIC IRQs
06:10:28 mymachine kernel: Setting 2 in the phys_id_present_map
06:10:28 mymachine kernel: ...changing IO-APIC physical APIC ID to 2 ... ok.
06:10:28 mymachine netfs: Mounting other filesystems: succeeded
06:10:28 mymachine kernel: ..TIMER: vector=0x31 pin1=2 pin2=0
06:10:28 mymachine kernel: testing the IO APIC.......................
06:10:28 mymachine autofs: automount startup succeeded


Last edited by cwiggler; 09-10-2011 at 09:39 AM..
# 2  
Old 09-10-2011
Code:
awk -F: '/BIOS Name:|BIOS Date:|BIOS QUALITY:/ {x=x$(NF-1)":"$(NF)"\n";} /BIOS INFO:/{print x}' logfile

Something like this?

--ahamed

Last edited by ahamed101; 09-10-2011 at 10:01 AM..
# 3  
Old 09-10-2011
Or like this:
Code:
awk -F': ' '
$2 == "BIOS Name" {
  name = $0; getline; date = $0; getline; quality = $0;
}
$3 == "BIOS INFO" {
  printf "%s\n%s\n%s\n", name, date, quality
  exit
}
' INPUTFILE


Last edited by yazu; 09-10-2011 at 09:55 AM.. Reason: minor changes
# 4  
Old 09-10-2011
Hi Ahamed and Yazu.

Thank you for the replies and solutions. I haven't tried them yet, but how do I add the grep of the log file first and then get the details?

The log file contains many entries with the same info, but what I need is the one at the specific timestamp.

Thanks


# 5  
Old 09-10-2011
Oh, yes. But because your file is really huge, it's better to embed this check in awk:
Code:
awk -v t="$time" -F': ' '
$0 ~ "^" t && $2 == "BIOS Name" {
   name = $0; getline; date = $0; getline; quality = $0;
}
name && $3 == "BIOS INFO" {
  printf "%s\n%s\n%s\n", name, date, quality
  exit
}' LOGFILE

===

You shouldn't use grep for this: it searches the whole file, whereas you want to quit as soon as you have your lines (just imagine that your information is in the first 100 KB). But I'm afraid the awk solution above may be slow, because the regex is built from a string at run time. Instead, you can embed the time variable (a variable makes further automation easy) in an awk regex literal:
Code:
time=06:10:28
awk  -F': ' '
$0 ~ /^'"$time"'/ && $2 == "BIOS Name" {
   name = $0; getline; date = $0; getline; quality = $0;
}
name && $3 == "BIOS INFO" {
  printf "%s\n%s\n%s\n", name, date, quality
  exit
}' LOGFILE
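
If the marker line must carry the requested timestamp as well, one possible variant is sketched below. It assumes the layout of the sample in post #1 (timestamp at the start of each line, fields separated by ": "). The temporary file and the extra 06:10:27 and 06:10:29 lines are made-up test data, not the real application log. Instead of a regex it uses index() for a plain prefix comparison, and it picks up each BIOS field by name rather than relying on getline to fetch the next two lines.

```shell
# Made-up sample data for illustration only; LOGFILE is a temporary placeholder.
LOGFILE=$(mktemp)
cat > "$LOGFILE" <<'EOF'
06:10:27 mymachine kernel: BIOS Name: 000000
06:10:27 mymachine kernel: BIOS Date: 000000
06:10:27 mymachine kernel: BIOS QUALITY: 000000
06:10:28 mymachine kernel: BIOS Name: 123456
06:10:28 mymachine kernel: BIOS Date: 789123
06:10:28 mymachine kernel: BIOS QUALITY: 34567890
06:10:28 mymachine kernel: CPU: L2 cache: 512K
06:10:28 mymachine kernel: This is it:: BIOS INFO: Get Bios info on the upper part of the log
06:10:29 mymachine kernel: BIOS Name: 999999
EOF

time=06:10:28
result=$(awk -v t="$time" -F': ' '
# index() == 1 is a plain prefix test, so no regex is built from t.
index($0, t " ") == 1 {
    if ($2 == "BIOS Name")    name = $0
    if ($2 == "BIOS Date")    date = $0
    if ($2 == "BIOS QUALITY") quality = $0
    # Stop at the first marker line that has the same timestamp.
    if ($3 == "BIOS INFO" && name != "") {
        printf "%s\n%s\n%s\n", name, date, quality
        exit
    }
}' "$LOGFILE")
printf '%s\n' "$result"
rm -f "$LOGFILE"
```

Because of the exit, awk stops reading as soon as the marker is found, so the rest of the file is never scanned.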


Last edited by yazu; 09-10-2011 at 11:15 AM..
# 6  
Old 09-10-2011
Hi Yazu,

Thank you. So $time is the variable holding the timestamp that I want to look up, right?

Yazu, what will happen if there are two entries of BIOS name, date and quality with the same timestamp? Will both show up?

# 7  
Old 09-10-2011
Quote:
Yazu, what will happen if there are two entries of BIOS name, date and quality with the same timestamp? Will both show up?
The last values replace the previous ones. I believe this is what you need. If not, it is easy to modify the script.
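
As a rough sketch of such a modification, the variant below collects every BIOS Name, Date and QUALITY line seen at the timestamp and prints them all once the marker is reached. It assumes the same line layout as the sample in post #1. The second set of values (654321 and so on) and the temporary file are invented purely to show two entries under one timestamp.

```shell
# Made-up sample with two BIOS entries under the same timestamp.
LOGFILE=$(mktemp)
cat > "$LOGFILE" <<'EOF'
06:10:28 mymachine kernel: BIOS Name: 123456
06:10:28 mymachine kernel: BIOS Date: 789123
06:10:28 mymachine kernel: BIOS QUALITY: 34567890
06:10:28 mymachine kernel: BIOS Name: 654321
06:10:28 mymachine kernel: BIOS Date: 321987
06:10:28 mymachine kernel: BIOS QUALITY: 9876543
06:10:28 mymachine kernel: This is it:: BIOS INFO: Get Bios info on the upper part of the log
EOF

time=06:10:28
all=$(awk -v t="$time" -F': ' '
# Remember every matching detail line instead of overwriting the previous one.
index($0, t " ") == 1 &&
($2 == "BIOS Name" || $2 == "BIOS Date" || $2 == "BIOS QUALITY") {
    seen[++n] = $0
}
# At the first marker after at least one stored line, print them in order and stop.
n && $3 == "BIOS INFO" {
    for (i = 1; i <= n; i++) print seen[i]
    exit
}' "$LOGFILE")
printf '%s\n' "$all"
rm -f "$LOGFILE"
```

The stored lines are held in memory until the marker appears, which should be fine for a handful of entries per timestamp.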

Last edited by yazu; 09-10-2011 at 11:04 AM.. Reason: Mention of easy modification