AWK exclude first and last record, sort and print


 
# 1  
Old 11-23-2011

Hi everyone,
I've really searched for a solution to this and this is what I found so far:

I need to sort a command's output (represented here by a "cat file" command) from the second line down to the second-to-last line, based on the second column, and then print ALL the output with only that section sorted. I'm almost there, BUT:

input:
Code:
~ $ cat test
ID     Done       Have  ETA           Up    Down  Ratio  Status       Name
   1     5%       None  Unknown      0.0     0.0   None  Idle         TEST 1
   2     0%       None  Unknown      0.0     0.0   None  Stopped      TEST 2
   3     9%       None  Unknown      0.0     0.0   None  Idle         TEST 3
   4    40%       10Mb  Unknown      0.0     0.0   None  Stopped      TEST 4
   5    25%       None  Unknown      0.0     0.0   None  Stopped      TEST 5  
Sum          10Mb             0.0     0.0

For the moment I have this:
Code:
~ $ 
~ $ cat test | awk 'NR==1; NR > 1 && NR < '$Num' {print $0 | "sort -k2nr"}'
ID     Done       Have  ETA           Up    Down  Ratio  Status       Name
   4    40%       10Mb  Unknown      0.0     0.0   None  Stopped      TEST 4
   5    25%       None  Unknown      0.0     0.0   None  Stopped      TEST 5  
   3     9%       None  Unknown      0.0     0.0   None  Idle         TEST 3
   1     5%       None  Unknown      0.0     0.0   None  Idle         TEST 1
   2     0%       None  Unknown      0.0     0.0   None  Stopped      TEST 2

where the Num variable is assigned the output of cat test | wc -l.

I would like to include the last line to have this:
Code:
ID     Done       Have  ETA           Up    Down  Ratio  Status       Name
   4    40%       10Mb  Unknown      0.0     0.0   None  Stopped      TEST 4
   5    25%       None  Unknown      0.0     0.0   None  Stopped      TEST 5  
   3     9%       None  Unknown      0.0     0.0   None  Idle         TEST 3
   1     5%       None  Unknown      0.0     0.0   None  Idle         TEST 1
   2     0%       None  Unknown      0.0     0.0   None  Stopped      TEST 2
Sum          10Mb             0.0     0.0

But trying
Code:
~ $ cat test | awk 'NR==1; NR > 1 && NR < '$Num' {print $0 | "sort -k2nr"}; END{print}'

I get this, with the last line jumping up to the second line:
Code:
ID     Done       Have  ETA           Up    Down  Ratio  Status       Name
Sum          10Mb             0.0     0.0    
   4    40%       10Mb  Unknown      0.0     0.0   None  Stopped      TEST 4
   5    25%       None  Unknown      0.0     0.0   None  Stopped      TEST 5  
   3     9%       None  Unknown      0.0     0.0   None  Idle         TEST 3
   1     5%       None  Unknown      0.0     0.0   None  Idle         TEST 1
   2     0%       None  Unknown      0.0     0.0   None  Stopped      TEST 2

The same thing happens with a non-reversed sort, using just sort -k2n.

I hope it's clear and not too long.
Thanks in advance!
# 2  
Old 11-23-2011
More detail is always good.

You're seeing the last line come out at the top because you're not closing the pipe to the sort command, so your print happens before sort finishes and prints its output. Here's my solution, which also doesn't require knowing the number of records beforehand:

Code:
awk '
    NR == 1 { print; next; }
    {
        if( last )
            print last | "sort -k2nr,2";
        last = $0;
    }

    END {
        close( "sort -k2nr,2" );   # finish the sort and let it print its output
        print last;
    }
' input-file

These 2 Users Gave Thanks to agama For This Post:
# 3  
Old 11-23-2011
Incredibly fast and accurate...
I'm really grateful, thanks mate.
Works like a charm!

But now I'm FORCED to really understand how it works!
# 4  
Old 11-23-2011
Quote:
Originally Posted by agama
More detail is always good.

You're seeing the last line come out at the top because you're not closing the pipe to the sort command, so your print happens before sort finishes and prints its output. Here's my solution, which also doesn't require knowing the number of records beforehand:

Code:
awk '
    NR == 1 { print; next; }
    {
        if( last )
            print last | "sort -k2nr,2";
        last = $0;
    }

    END {
        close( "sort -k2nr,2" );   # finish the sort and let it print its output
        print last;
    }
' input-file

Hmmm.... I don't quite follow how it works, but it does.
How does sorting only one line at a time end up sorting the whole list (not including the last line)?
Care to explain?
Also, some awks limit how many file handles (in this case, "sort" invocations) can be open at once; some set the limit to 9. So if your awk's limit is 9 and your file has more than 9 lines, your awk might bomb out....
# 5  
Old 11-23-2011
OK, here's a somewhat different approach that I can understand:
Code:
FNR == 1 { print; next; }
{
  if( last )
    all= (!all)?last:all ORS last
  last = $0
}
END {
  print all | "sort -k2nr,2"
  close( "sort -k2nr,2" )   # finish the sort and let it print its output
  print last
}

# 6  
Old 11-23-2011
Quote:
Originally Posted by vgersh99
Hmmm.... I don't quite follow how it works, but it does.
How does sorting only one line at a time end up sorting the whole list (not including the last line)?
Care to explain?
Also, some awks limit how many file handles (in this case, "sort" invocations) can be open at once; some set the limit to 9. So if your awk's limit is 9 and your file has more than 9 lines, your awk might bomb out....

It's a bit misleading, but a cool feature of awk....

When the pipe symbol is used, awk forks a single process and begins writing to its stdin or reading from its stdout, depending on where the command and pipe are placed. A more common example is to execute a command and read each record it generates:

Code:
function run_it( cmd )
{
    while( (cmd | getline) > 0 )   # fork cmd and read from its stdout
    {
        # do something with $0
    }
    close( cmd );
}
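
To make that pattern concrete, here is a minimal sketch of run_it in use. The command string "echo one; echo two", the counter n and the variable line are made up for illustration; the function just counts the lines the command produces. It is called twice to show why the close matters: with the pipe left open, the second call would find the stream already at end-of-file and read nothing.

```shell
result=$(awk '
function run_it(cmd,    n, line) {      # n and line are local variables
    n = 0
    while ((cmd | getline line) > 0)    # first use forks cmd, later uses read the same pipe
        n++                             # count each line read from cmd
    close(cmd)                          # without this, a second call would hit EOF at once
    return n
}
BEGIN {
    cmd = "echo one; echo two"          # hypothetical command, run via the shell
    print run_it(cmd)                   # first run of the command
    print run_it(cmd)                   # command is started again because it was closed
}')
printf '%s\n' "$result"
```

Both calls should report two lines, because each call starts the command afresh.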

As you point out, awk tends to have a limited number of file descriptors available, so it's very important to close them when finished, especially in a function like run_it that might be called many times.

When awk creates the child process, the real FD is mapped using the command string. That's why I tend to put the command into a variable: it's easier to pass to close, and it guards against someone modifying the original command without realising that the string given to close needs to be exactly the same.
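
As a sketch of that habit, here is the sort solution from this thread rewritten with the command held in a variable. The variable name cmd, the NR > 2 test (used in place of the if( last ) check) and the shortened sample data are assumptions for illustration, not part of the original post.

```shell
sample=$(mktemp)                        # temporary stand-in for the real input file
cat > "$sample" <<'EOF'
ID Done Name
1 5% TEST1
2 0% TEST2
3 9% TEST3
4 40% TEST4
Sum 10Mb
EOF

result=$(awk '
    BEGIN   { cmd = "sort -k2nr,2" }    # one string shared by print and close
    NR == 1 { print; next }             # header goes straight to stdout
    {
        if (NR > 2) print last | cmd    # send the previous line to sort
        last = $0                       # hold the current line back by one record
    }
    END {
        close(cmd)                      # same string, so this closes the same pipe
        print last                      # the held-back final line comes out last
    }
' "$sample")
rm -f "$sample"
printf '%s\n' "$result"
```

Changing the sort options now means editing a single line, and the close can never drift out of step with the print.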

For my earlier solution, the sort command is forked on the first execution of the statement, and after that awk maps the command string to an already open file descriptor and writes the additional records to it rather than starting another process. Thus, we get the records we want sorted, and the ones we don't (first and last) are just written to standard output.
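
A quick way to convince yourself that only one sort is started is a sketch like the following, where three separate print statements feed the same command string. The values and the "done" marker are arbitrary. If each print launched its own sort, the numbers would come out in the order they were printed.

```shell
result=$(awk 'BEGIN {
    cmd = "sort -n"
    print 3 | cmd       # first use: awk starts one sort process
    print 1 | cmd       # later uses write to that same process
    print 2 | cmd
    close(cmd)          # sort sees end of input and prints its sorted output
    print "done"        # printed only after sort has finished
}')
printf '%s\n' "$result"
```

The output is the three numbers in ascending order followed by the marker, so all three records reached a single sort.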

This is along the same lines as
Code:
   print foo >"/tmp/file";

Each time the statement is executed, the contents of the variable foo are printed to the tmp file; the open only happens on the first execution of the statement.
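
The same can be checked for files with a small sketch. The scratch file from mktemp and the strings written to it are placeholders. The file starts with old content so that the truncation on the first open is visible, while the second print shows that the file is not reopened.

```shell
out=$(mktemp)                       # hypothetical scratch file
printf 'stale\n' > "$out"           # old content, to make the truncation visible
awk -v out="$out" 'BEGIN {
    print "first"  > out            # first use opens the file and truncates it
    print "second" > out            # file is already open, so this line is added after it
    close(out)
}'
result=$(cat "$out")
rm -f "$out"
printf '%s\n' "$result"
```

The file ends up holding both new lines and none of the old content.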

@vgersh99 -- a bit more info than I think you needed, but from the perspective of someone wrestling with the concept of piping to a command from inside an awk programme, I decided to err on the side of too much.

Last edited by agama; 11-23-2011 at 03:48 PM.. Reason: clarification
This User Gave Thanks to agama For This Post:
# 7  
Old 11-23-2011
Just to add a reference to the relevant part of the manual:


Quote:
print items | command

It is possible to send output to another program through a pipe instead of into a file.
This redirection opens a pipe to command, and writes the values of items through this pipe to another process created to execute command.
The redirection argument command is actually an awk expression. Its value is converted to a string whose contents give the shell command
to be run.

For example, the following produces two files, one unsorted list of BBS names,
and one list sorted in reverse alphabetical order:

Code:
awk '{ 
    print $1 > "names.unsorted" 
    command = "sort -r > names.sorted" 
    print $1 | command 
    }' BBS-list

So, as already stated, it's like:

Code:
zsh-4.3.12[sysadmin]% print -l 3 1 2 | sort
1
2
3

These 2 Users Gave Thanks to radoulov For This Post: