extract certain parts from a file


 
# 1  
Old 10-16-2011

I have a log file from which I need to extract certain lines based on the time, but the problem is that the timestamps are not the same every day.

Input file:

Mon 12:34:56 abvjingjgg
Mon 12:34:57 ofjhjgjhgh
.
.
.
Mon 22:30:00 kkfng
.
.
.
Mon 23:12:23 kjgsdafhkljf
.
.
.
Tue 01:04:54 ldkjaoper

I need to extract only the data written from 22:00:00 until the end of the file, but there may be no entry at exactly 22:00:00.
The input file also contains data from previous days, but I need only the part that was updated last, i.e. the data I need is always at the end of the file.

Desired output file:
Mon 22:30:00 kkfng
.
.
.
Mon 23:12:23 kjgsdafhkljf
.
.
.
Tue 01:04:54 ldkjaoper

Last edited by gpk_newbie; 10-16-2011 at 11:36 PM..
# 2  
Old 10-16-2011
As long as you want everything after the first timestamp of 22:00 or later, then this should work:

Code:
awk ' snarf || $2+0 >= 22 { snarf = 1; print; }' log-file-name

It does require that a timestamp between 22:00 and 23:59:59 be present.
# 3  
Old 10-16-2011
Thanks agama. Let me try.

---------- Post updated at 08:20 AM ---------- Previous update was at 08:17 AM ----------

Thanks a lot, agama. It works. But can you briefly explain what it does?
# 4  
Old 10-17-2011
The basic format of an awk programme is

Code:
condition { action }

such that action statements are executed if condition evaluates to true.

In this case, the condition is

Code:
snarf || $2+0 >= 22

Awk treats snarf much as C would: it evaluates to true if it is non-zero or a non-empty string. An uninitialised variable is both zero and the empty string, so snarf is false when the programme starts. The second part evaluates to true when the hour of the timestamp is greater than or equal to 22. This makes use of an awk trick: adding zero ($2+0) converts the leading numeric portion of a string to a number, so "22:30:00" becomes 22 and can be compared to the integer 22.

Once the expression evaluates to true (time stamp is good) then we set snarf to 1 such that the expression always is true and all lines after the first good timestamp are printed.
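
To see both pieces in isolation, here is a small sketch. The sample lines are made up in the thread's format; the first command shows what $2+0 yields for each line, the second runs the one-liner itself.

```shell
# Made-up sample data in the same "Day HH:MM:SS text" layout.
sample='Mon 12:34:56 aaa
Mon 22:30:00 bbb
Mon 23:12:23 ccc
Tue 01:04:54 ddd'

# $2+0 keeps only the leading digits of field 2, i.e. the hour.
hours=$(printf '%s\n' "$sample" | awk '{ print $2+0 }')
printf '%s\n' "$hours"

# Once snarf is set, the left side of || is true for every later line,
# so the Tue 01:04:54 line is printed even though its hour is below 22.
picked=$(printf '%s\n' "$sample" | awk 'snarf || $2+0 >= 22 { snarf = 1; print }')
printf '%s\n' "$picked"
```

The first command prints 12, 22, 23 and 1; the second prints the last three sample lines.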

Some suggested reading on awk:
Awk - A Tutorial and Introduction - by Bruce Barnett
# 5  
Old 10-17-2011
Thanks a lot, agama. But I still have a doubt: will this pick up the latest update in the log file? There may be entries from previous days in the log file as well.

---------- Post updated at 09:24 AM ---------- Previous update was at 09:04 AM ----------

I tried it, and it did not work when the log file also contains data from previous days. It finds the first occurrence of 22:00:00 and displays everything that follows, whereas I need only the data from the last 22:00:00 onward, until the end of the file.
# 6  
Old 10-17-2011
Quote:
Originally Posted by gpk_newbie
Thanks a lot agama. but still i have a doubt will this check for the latest update in the log file because there may be updates in logfile for previous days also.
You are correct; my original post indicated that it would snarf from the first occurrence of the timestamp until the end. I didn't catch the part in your original post that indicated you only wanted the last day -- sorry about that.

This is along the same lines, but it keeps only the lines from the start of the last run of timestamps at 22:00:00 or later through the end of the file; everything before that is dropped. It does assume that every line in the file has a timestamp.

Code:
awk  ' 
    BEGIN { i = 0; }
    $2+0 < 22 { roll = 1; }     # rolled to next day -- signal reset needed

    snarf || $2+0 >= 22 {
        if(  $2+0 >= 22 && roll )  # reset on first timestamp after roll
        {
            roll = 0;
            delete capture;
            i = 0;
        }

        snarf = 1; 
        capture[i++] = $0; 
    }

    END {      # after all of the file has been read, print the lines from the last timestamp of 22:00 or later
        for( j = 0; j < i; j++ )
            print capture[j];
    }' input-file


Last edited by agama; 10-17-2011 at 01:11 AM.. Reason: clarification
# 7  
Old 10-17-2011
Great, this works fine.

---------- Post updated at 10:24 AM ---------- Previous update was at 09:45 AM ----------

Hi agama, I use the command below to copy the 7 lines after the pattern from file1 to file2, but the problem is that I am not able to include the line containing the pattern itself in file2.

gawk 'c-->0;/pattern/{c=7}' file1 > file2
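
One possible way to include the matching line as well, as a minimal sketch: set the counter on the match before the counting rule runs, and use 8 instead of 7 so that the match plus the 7 following lines are printed. The input below is made up for illustration, and plain awk is assumed to behave the same as gawk here.

```shell
# Hypothetical input: one line before the match, the match, then 8 more lines.
input=$(printf '%s\n' before pattern l1 l2 l3 l4 l5 l6 l7 l8)

# The /pattern/ rule comes first, so c is already 8 when "c-- > 0" is tested
# on the matching line; that line and the next 7 are printed.
result=$(printf '%s\n' "$input" | awk '/pattern/ { c = 8 } c-- > 0')
printf '%s\n' "$result"
```

With the file names from the post this would be run as awk '/pattern/ { c = 8 } c-- > 0' file1 > file2. If the pattern matches again within those 7 lines, the counter simply restarts from that line.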

---------- Post updated at 10:46 AM ---------- Previous update was at 10:24 AM ----------

Sorry again, but one more small doubt: if the time I'm looking for is 22:15:00 instead of 22:00:00, how should I change the command?
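
A sketch of one way to handle a cutoff such as 22:15:00, assuming field 2 is always a zero-padded HH:MM:SS timestamp: compare the whole field as a string instead of taking only the hour with $2+0. The variable name cutoff and the sample log are made up for illustration; the structure otherwise follows the script in post #6.

```shell
# Made-up log covering two evenings; only the last window should survive.
log='Sun 22:20:00 old1
Sun 23:40:00 old2
Mon 01:00:00 old3
Mon 22:05:00 early
Mon 22:30:00 keep1
Mon 23:12:23 keep2
Tue 01:04:54 keep3'

last_window=$(printf '%s\n' "$log" | awk -v cutoff="22:15:00" '
    BEGIN { n = 0 }
    $2 < cutoff { roll = 1 }             # before the cutoff: the next hit starts a new window

    snarf || $2 >= cutoff {
        if ($2 >= cutoff && roll) {      # first line at or after the cutoff since the roll
            roll = 0
            n = 0                        # forget the older window
        }
        snarf = 1
        capture[n++] = $0
    }

    END { for (j = 0; j < n; j++) print capture[j] }
')
printf '%s\n' "$last_window"
```

Because zero-padded timestamps sort the same way as strings and as times, "22:05:00" < "22:15:00" is true and the Mon 22:05:00 line is left out. Resetting n to 0 is enough to discard the older window, since only the first n entries are printed.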