awk print values between consecutive lines


 
# 1  
Old 12-05-2013

I have a file in the following format:

file01.txt

Code:
TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316   62.12   12042013   165143
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145

My desired output is:
Code:
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145

In this file, all rows are sorted by columns 2 and 5.

I've tried the following command:
Code:
awk '/^ABC/ {if (lastval != $5-1 ) { print line;print $0}  lastval = $5; line = $0 }' file01.txt

This prints an extra blank line at the beginning and also skips the last row:
Code:
                                                        
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141

I'm looking for help modifying the one-liner so that it will:

- print the last row

- add a count for each pair of rows and insert a blank line between pairs:
Code:
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147  4

ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145  5

- not split a group if a value between its first and last values is missing, e.g. if:
Code:
ABC     12316   62.12   12042013   165143

is missing from file01.txt.

In that case, the final output should be:
Code:
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147  4

ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145  4

Thank you in advance for your help
# 2  
Old 12-05-2013
So you want the first and last line of each group (as determined by $2) plus a count of how many lines there were in the group?

It will be difficult to make this a "one-liner" as printing the count requires it to read ahead, to know when the "group" ends.
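
The read-ahead does not have to be literal: awk can hold on to the previous row and print it only once the next row shows that a run has ended, with an END block covering the final run. Below is a minimal sketch of that pattern applied to the one-liner from post #1. It still splits on gaps in column 5 and does not add a count yet; the `seen` flag is an addition for illustration, and the sample is assumed to be saved as file01.txt.

```shell
# Recreate the sample input from post #1.
cat > file01.txt <<'EOF'
TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316   62.12   12042013   165143
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145
EOF

out=$(awk '
/^ABC/ {
    if (!seen)                  { print }              # very first data row
    else if ($5 != lastval + 1) { print line; print }  # gap in $5: end the old run, start a new one
    seen = 1; lastval = $5; line = $0                  # remember this row for later
}
END { if (seen) print line }                           # end the final run
' file01.txt)
printf '%s\n' "$out"
```

The `seen` guard removes the empty first line (nothing has been remembered yet when the first row arrives), and the END block prints the row that the original command never got to. Because the split is still driven by column 5, a missing value inside a group would still break that group in two.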
# 3  
Old 12-05-2013
Yes, that's correct.

It doesn't have to be a one-liner. I only used a one-liner for my own attempts.
# 4  
Old 12-05-2013
Code:
$ cat grp2.awk
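# Skip anything that is not a data row (the TERM lines).
!/^ABC/ { next }

# First row of a new group: finish the previous group (last row + count), then print this row.
!($2 in A)      {       if(LAST) print LAST,A[LID] ; print      }
# Every data row: count it under its $2 key and remember it as the latest row.
                {       A[$2]++; LAST=$0; LID=$2                }
# End of input: finish the final group.
END             {       if(LAST) print LAST, A[LID]             }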

$ awk -f grp2.awk data

ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147 4
ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145 5

$

Your input data includes five lines for 12316, not four.
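
The original request also asked for a blank line between pairs, which the script above does not emit. A possible variation, sketched under the assumption that the input stays sorted by column 2; the file name grp3.awk is made up for this example:

```shell
# Sample input from post #1.
cat > file01.txt <<'EOF'
TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316   62.12   12042013   165143
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145
EOF

# grp3.awk is a hypothetical name; the logic is grp2.awk plus a separator line.
cat > grp3.awk <<'EOF'
!/^ABC/ { next }                          # skip the TERM lines
!($2 in A) {                              # first row of a new group
    if (LAST) { print LAST, A[LID]; print "" }  # finish previous group, then a blank line
    print
}
{ A[$2]++; LAST = $0; LID = $2 }          # count the row and remember it
END { if (LAST) print LAST, A[LID] }      # finish the final group
EOF

out=$(awk -f grp3.awk file01.txt)
printf '%s\n' "$out"
```

The separator is printed only when a previous group exists, so there is no leading blank line and none after the final group.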
# 5  
Old 12-05-2013
Thank you, that worked very well.

The output with a count of 4 was for the case where
Code:
ABC     12316   62.12   12042013   165143

is missing. I tested your script and it works well even if a value is missing from a group.
# 6  
Old 12-05-2013
Quote:
Originally Posted by alex2005
Thank you, that worked very well.

The output with a count of 4 was for the case where
Code:
ABC     12316   62.12   12042013   165143

is missing. I tested your script and it works well even if a value is missing from a group.
How?

I also get the same result as Corona688, assuming the file is sorted:

Code:
$ awk '!/^ABC/{next}p!=$5-1{printf last ? last FS x[l]++ RS $0 RS : $0 RS}{p=$5;last=$0;l=$2;x[$2]++}END{print last FS x[l]++}' file
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147 4
ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145 5


Last edited by Akshay Hegde; 12-05-2013 at 03:59 PM..
# 7  
Old 12-05-2013
Hi,
Thank you for your reply.
I want to be able to use the script even when the values in column 5 are not consecutive.

For example, suppose this row is missing:
Code:
ABC     12316   62.12   12042013   165143

file01.txt would then become:
Code:
TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145

Here is the result of your one-liner:

Code:
awk '!/^ABC/{next}p!=$5-1{printf last ? last FS x[l]++ RS $0 RS : $0 RS}{p=$5;last=$0;l=$2;x[$2]++}END{print last FS x[l]++}' file02.txt

Code:
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147 4
ABC     12316    2.36   12052013   165141
ABC     12316   62.12   12042013   165143 3
ABC     12316  122.36   12052013   165145
ABC     12316  122.36   12052013   165145 5

My desired output would be:

Code:
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147  4
ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145  4

Sorry that I didn't describe this more accurately the first time.

Best Regards
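
One possible adjustment, offered as a sketch and not as a tested answer from the thread: start a new group when column 2 changes instead of when column 5 skips a value, so a gap inside a group no longer splits it. The variable names `key`, `n` and `last` are chosen for this example, and the input is assumed to be sorted by column 2 as stated in post #1.

```shell
# Sample input with the 165143 row removed, as in the post above.
cat > file02.txt <<'EOF'
TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145
EOF

# key = current $2 value, n = rows seen in the current group, last = latest row.
out=$(awk '/^ABC/ { if (!n || $2 != key) { if (n) print last, n; print; key = $2; n = 0 } n++; last = $0 } END { if (n) print last, n }' file02.txt)
printf '%s\n' "$out"
```

With this input the command prints four rows, and both closing rows carry a count of 4. The count is separated from the row by a single space (the default output separator), not the two spaces shown in the desired output.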