Parsing a large log


 
# 8  
Old 05-30-2008
The fastest solution would be to write a program in a compiled language such as C.

With awk you can do something like this:
Code:
awk -v from="$(date --date=yesterday +'%D')" \
    -v   to="$(date +'%D')" '
$1 == from {              # yesterday: keep entries from 13:00 on
   if (int($2) >= 13)
      print;
   next;
}
$1 == to  {               # today: keep entries before 13:00
   if (int($2) < 13) {
      print;
      next;
   } else
      exit;               # past the window; stop reading
}
' inputfile

Input file:
Code:
05/29/08 01:56:53 nsrexecd: select() error: Invalid argument
05/29/08 01:56:53 nsrexecd: select() error: Invalid argument
05/29/08 01:56:53 nsrexecd: select() error: Invalid argument
05/29/08 12:59:50 not selected
05/29/08 13:00:00 selected 1
05/29/08 23:59:59 selected 2
05/30/08 00:00:01 selected 3
05/30/08 12:59:59 selected 4
05/30/08 13:00:00 not selected
06/01/08 00:00:01 not selected

Output (current date is 05/30/08):
Code:
05/29/08 13:00:00 selected 1
05/29/08 23:59:59 selected 2
05/30/08 00:00:01 selected 3
05/30/08 12:59:59 selected 4

Jean-Pierre.
# 9  
Old 06-02-2008
Thanks a lot.
But my problem is that my log is large: 300-400 MB.
I am unable to use awk, sed, grep, etc.
I need a solution in Perl or shell for parsing the log for the current date (24 hours)
and then searching for the string.
# 10  
Old 06-02-2008
None of the tools you mentioned are sensitive to the file size. Other things being equal, they read the file one line at a time and print each line if certain conditions are met. (Of course you can write an awk or sed script which consumes memory for every line, but in this case I don't think you need to.)
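To illustrate, here is a minimal sketch with a made-up two-line sample in the thread's log format (substitute your real log path):

```shell
# Streaming tools hold only the current line in memory, so file
# size does not matter. Demo on a tiny sample:
cat > /tmp/sample.log <<'EOF'
05/29/08 23:59:59 old entry
05/30/08 00:00:01 new entry
EOF

# awk prints matching lines as it reads; a 400 MB file is handled
# the same way, just more slowly.
awk '$1 == "05/30/08"' /tmp/sample.log
```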
# 11  
Old 06-03-2008
Quote:
Originally Posted by era
None of the tools you mentioned are sensitive to the file size. Other things being equal, they read the file one line at a time and print each line if certain conditions are met. (Of course you can write an awk or sed script which consumes memory for every line, but in this case I don't think you need to.)
Please help!
But my problem is that my log is large: 300-400 MB.
I am unable to use awk, sed, grep, etc.
I need a solution in Perl or shell for parsing the log for the current date (24 hours)
and then searching for the string.
# 12  
Old 06-03-2008
I'm sorry, no offense, but I cannot type this any more slowly: grep and sed and awk do not care what size the file is. They only read it one line at a time, just like cat.

Perl is unlikely to be any faster than grep. Here is a Perl script anyway.

Code:
perl -ne 'print if m!^05/(29/08 (1[3-9]|2[0-3])|30/08 (0|1[0-2]))!' file

Notice the similarity to the egrep solution I posted before. This one is probably going to be slower, and in any event will not be much faster.

Please answer the following questions:
  • What have you tried?
  • Have you tried the solutions various people have posted to this thread?
  • How long did it take to complete?
  • How long would you like it to take?
  • How quickly can you simply cat the file?
  • If you extract just one day's worth from the file, how long does that take to cat?
# 13  
Old 06-03-2008
Quote:
Originally Posted by era
I'm sorry, no offense, but I cannot type this any more slowly: grep and sed and awk do not care what size the file is. They only read it one line at a time, just like cat.

Perl is unlikely to be any faster than grep. Here is a Perl script anyway.

Code:
perl -ne 'print if m!^05/(29/08 (1[3-9]|2[0-3])|30/08 (0|1[0-2]))!' file

Notice the similarity to the egrep solution I posted before. This one is probably going to be slower, and in any event will not be much faster.

Please answer the following questions:
  • What have you tried?
  • Have you tried the solutions various people have posted to this thread?
  • How long did it take to complete?
  • How long would you like it to take?
  • How quickly can you simply cat the file?
  • If you extract just one day's worth from the file, how long does that take to cat?

What have you tried? -- I have tried "cat file | /bin/awk '$1 ~ /^$date/'"
Have you tried the solutions various people have posted to this thread? -- Yes, but as I have mentioned, even a simple cat is timing out.
How long did it take to complete? -- More than 5 minutes; I quit before it completed.
How long would you like it to take? -- A normal time, as it takes for cat or grep.
How quickly can you simply cat the file? -- I am unable to cat the file; it is not opening at all.
If you extract just one day's worth from the file, how long does that take to cat? -- I am unable to extract with awk or grep; I am only able to use the tail and head commands.
As mentioned earlier in the thread about using chunks of the file: I am unable to create the logic for chunking and finding the last 24 hours of the log.
# 14  
Old 06-03-2008
Quote:
Originally Posted by asth
What have you tried? --I have tried "cat file |/bin/awk '$1 ~ /^$date/'"
The cat is useless; simply run awk '$1 ~ /^06\/01\//' file
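A sketch of the same command with the date parameterised, so the one-liner works on any day (the sample log and path below are made up for illustration):

```shell
# Build a tiny sample log (stand-in for the real 300-400 MB file).
cat > /tmp/big.log <<'EOF'
05/31/08 23:59:59 yesterday
06/01/08 08:15:00 today
EOF

# Pass the date in with -v instead of hard-coding it; a string
# comparison on field 1 avoids having to escape the slashes.
d="06/01/08"                  # in practice: d=$(date +'%D')
awk -v d="$d" '$1 == d' /tmp/big.log
```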

Quote:
Have you tried the solutions various people have posted to this thread? -- Yes, but as I have mentioned, even a simple cat is timing out.
You have not mentioned this very explicitly. I think there may be an unrelated problem here.

Quote:
How long did it take to complete? -- More than 5 minutes; I quit before it completed.
How long would you like it to take? -- A normal time, as it takes for cat or grep.
How quickly can you simply cat the file? -- I am unable to cat the file; it is not opening at all.
If you extract just one day's worth from the file, how long does that take to cat? -- I am unable to extract with awk or grep; I am only able to use the tail and head commands.
So if you, say, run tail -n 10000 file | grep '^06/01/' do you get roughly what you want? How long does it take? Too long still?
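To check whether raw I/O itself is the bottleneck, one could time each stage separately (sample file and path below are hypothetical stand-ins for the real log):

```shell
# Build a small stand-in log (the real file would be 300-400 MB).
cat > /tmp/timing.log <<'EOF'
05/31/08 23:59:59 yesterday
06/01/08 08:15:00 today
EOF

# Compare raw read speed against the filtered pipeline; if the two
# times are close, grep adds almost nothing over plain reading.
time tail -n 10000 /tmp/timing.log > /dev/null
time tail -n 10000 /tmp/timing.log | grep '^06/01/' > /dev/null
```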

Last edited by era; 06-03-2008 at 09:43 AM.. Reason: Minor edit of regexes