take a section of a data with conditions


 
# 1  
Old 08-25-2012

I have a data file like below:
[input]
Code:
 
2011 0701 2015 21.2 L 37.692 46.202 18.0 Teh 4 0.3 2.1 LTeh 1
GAP=233 E
Iranian Seismological Center, Institute of Geophysics, University of Tehran 6
STAT SP IPHASW D HRMM SECON CODA AMPLIT PERI AZIMU VELO SNR AR TRES W DIS CAZ7
TBZ SN EPg 0 2015 31.19 -0.3 60.0 355
BST SZ EPg 0 2015 31.30 -0.3 61.0 89
 
2011 0702 0624 39.4 L 38.067 46.391 13.9 Teh 5 0.1 1.7 LTeh 1
GAP=157 E
Iranian Seismological Center, Institute of Geophysics, University of Tehran 6
STAT SP IPHASW D HRMM SECON CODA AMPLIT PERI AZIMU VELO SNR AR TRES W DIS CAZ7
SHB SZ EPg 0 0624 51.37 0.0 72.0 290
MRD SZ EPg 0 0624 54.83 0.1 94.0 320

I want to keep only the sections that meet this constraint:
Sections are separated by a blank line, and the first line of each section starts with "2011". If $12 > 3 on that first line, print the whole section. I wrote this awk command:
Code:
 
awk '/^201?/ {if ($12 > 3) {RS="";FS="\n"} print $0}' in > out

But the output is no different from the input!
# 2  
Old 08-25-2012
I think this does what you might be looking for:

Code:
awk '
    $1 == "#"  { if( snarf ) print " "; snarf = 0; next; }   # turn off section capture, write a trailing blank line
    snarf || (/^201?/ && $12+0 > 3.0) { snarf = 1; print; }  # print a record from the section
    ' input >output

You said "blank line", but the separator appears to be a line with a lone hash (comment symbol) at the start. I assumed you wanted every line printed from the 2011 line (with a value in field 12 greater than three) up to the next 'blank' line.
# 3  
Old 08-26-2012
agama said he found a # by itself on a line as the section separator. When I copied the sample input and fed it through od -cb, I found that the separator line contained the octal byte values 343, 200, and 200 terminated by the <newline> character.
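For reference, the bytes 343 200 200 (hex E3 80 80) are the UTF-8 encoding of U+3000, the ideographic space, which is why the separator looks blank without being empty. Below is a minimal sketch of one way to turn such lines into truly empty lines before further processing. The function name normalize_separators is made up for illustration, and LC_ALL=C is an assumption to make awk compare bytes rather than characters.

```shell
# Hypothetical helper: replace lines that consist solely of the three bytes
# 343 200 200 (U+3000 in UTF-8) with empty lines; pass everything else through.
normalize_separators() {
    LC_ALL=C awk '{ if ($0 == "\343\200\200") print ""; else print }' "$@"
}
```

It reads the named files, or standard input when none are given, for example: normalize_separators input > input.clean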

I believe the following meets the criteria specified, but nothing will be printed given the sample input because no section header in the sample input has $12 > 3.
Code:
awk 'BEGIN {line1 = 1} # The next non-"blank" line is a section header.
!/[0-9a-zA-Z]/ { # Found what is assumed to be a blank line.
        # The sample input had three bytes with octal values 343, 200, and 200
        #   followed by a <newline> as the separator between sections.
        #   The submitter described this as a "blank line".
        # This script will use empty lines as section separators no matter what
        #   section separator lines are found in input files.
        copy = 0 # Turn off copy mode.
        line1 = 1 # The next non-"blank" line is a section header.
        next
}
copy    {print;next} # Copy any lines found before the next "blank" line.
line1   {if(($1 ~ /^2011/) && ($12 > 3)) { 
                # The text in the first post in this thread said sections were
                #   to be printed only for the year 2011 and $12 is > 3.
                # The script in the first post was looking for years 2010-2019.
                # All entries in the sample input were for 2011, but no entries
                #   had $12 > 3 (the only entries had $12 set to 2.1 and 1.7),
                #   so no entries match the criteria.
                copy=1 # Turn on copy mode for the rest of the section.
                # Add an empty line as a section separator, except before the 1st
                #   section to be printed.
                if(found++ > 0) print ""
                print # Print the 1st line of the section.
        } 
        # Whether a match was found or not, do not look for another section
        #   header until we find another separator line.
        line1 = 0
}' input
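An alternative sketch, assuming the sections are separated by truly empty lines (not the three-byte separator found in the sample): setting RS to the empty string before any input is read puts awk in paragraph mode, so each section becomes one record. The command in the first post assigns RS inside an action, and an assignment made there can only affect records read afterwards, not the one already being processed. The function name filter_sections is made up for illustration.

```shell
# Hypothetical helper: print whole sections whose header line starts with
# the field 2011 and has a 12th field greater than 3.
filter_sections() {
    # RS=""  : paragraph mode, one blank-line-separated section per record.
    # ORS    : put an empty line back after each printed section.
    # With the default FS, newlines also separate fields in paragraph mode,
    # so $1 and $12 come from the header line as long as it has 12+ fields.
    awk -v RS= -v ORS='\n\n' '$1 == "2011" && $12 + 0 > 3' "$@"
}
```

Usage would be along the lines of: filter_sections input > output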

# 4  
Old 08-26-2012
How about this?
Code:
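#!/usr/bin/perl

$/ = "\n\n";    # read one blank-line-separated section per record

while (<DATA>) {
    chomp;      # strip the trailing section separator
    # Field 12 of the section is the 12th field of its header line.
    if ( ((split))[11] > 3 ) {
        print "$_\n\n";    # put back the separator removed by chomp
    }
}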



__DATA__
2011 0701 2015 21.2 L 37.692 46.202 18.0 Teh 4 0.3 2.1 LTeh 1
GAP=233 E
Iranian Seismological Center, Institute of Geophysics, University of Tehran 6
STAT SP IPHASW D HRMM SECON CODA AMPLIT PERI AZIMU VELO SNR AR TRES W DIS CAZ7
TBZ SN EPg 0 2015 31.19 -0.3 60.0 355
BST SZ EPg 0 2015 31.30 -0.3 61.0 89

2011 0702 0624 39.4 L 38.067 46.391 13.9 Teh 5 0.1 3.7 LTeh 1
GAP=157 E
Iranian Seismological Center, Institute of Geophysics, University of Tehran 6
STAT SP IPHASW D HRMM SECON CODA AMPLIT PERI AZIMU VELO SNR AR TRES W DIS CAZ7
SHB SZ EPg 0 0624 51.37 0.0 72.0 290
MRD SZ EPg 0 0624 54.83 0.1 94.0 320
