awk print values between consecutive lines

12-05-2013

I have a file in the following format:

file01.txt

Code:

TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316   62.12   12042013   165143
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145

My desired output is:

Code:

ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145

In this file, all rows are sorted by columns 2 and 5.

I've tried the following command:

Code:

awk '/^ABC/ {if (lastval != $5-1 ) { print line;print $0}  lastval = $5; line = $0 }' file01.txt

It adds an extra blank line at the beginning and skips the last row:

Code:

                                                        
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141

I would appreciate help modifying the one-liner so that it:

- prints the last row

- adds a count for each pair of rows and a blank line between pairs:

Code:

ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147  4

ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145  5

- does not split a group when a value between its first and last rows is missing. For example, suppose this row is missing from file01.txt:

Code:

ABC     12316   62.12   12042013   165143

The final output should be:

Code:

ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147  4

ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145  4

Thank you in advance for your help.

alex2005

12-05-2013

So you want the first and last line of each group (as determined by $2) plus a count of how many lines there were in the group?

It will be difficult to make this a "one-liner" as printing the count requires it to read ahead, to know when the "group" ends.

Corona688

12-05-2013

Yes, that's correct.

It doesn't have to be a one-liner. I only used a one-liner for my own attempts.

alex2005

12-05-2013

Code:

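$ cat grp2.awk
# Skip every line that is not a data row (e.g. TERM).
!/^ABC/ { next }

# First row of a group: print the previous group's last row with its count,
# then print this row.
!($2 in A)      {       if(LAST) print LAST,A[LID] ; print      }
# Every data row: count it and remember it as the latest row of its group.
                {       A[$2]++; LAST=$0; LID=$2                }
# End of input: print the final group's last row with its count.
END             {       if(LAST) print LAST, A[LID]             }

$ awk -f grp2.awk data

ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147 4
ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145 5

$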

Your input data includes five lines for 12316, not four.
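
The original request also asked for a blank line between pairs, which the script above does not print. A minimal sketch of one way to add it, assuming the groups are contiguous as in the sample data (the names `cnt`, `last` and `lastid` are illustrative, not from the script above):

```shell
# Sample rows from the first post, kept in a variable so the example is self-contained.
data='TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316   62.12   12042013   165143
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145'

out=$(printf '%s\n' "$data" | awk '
    !/^ABC/ { next }                        # ignore non-data rows such as TERM
    !($2 in cnt) {                          # first row of a group not seen before
        if (last != "") {                   # close the previous group, if any
            print last, cnt[lastid]
            print ""                        # blank separator between pairs
        }
        print                               # first row of the new group
    }
    { cnt[$2]++; last = $0; lastid = $2 }   # count the row and remember it
    END { if (last != "") print last, cnt[lastid] }
')

printf '%s\n' "$out"
```

With the sample input this should print the two pairs separated by one empty line, with counts 4 and 5. Note that a group consisting of a single row would be printed twice, once as its first row and once as its last.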

Corona688

12-05-2013

Thank you, it worked very well.

The output with a count of 4 was for the case where the row

Code:

ABC     12316   62.12   12042013   165143

was missing. I tested your script and it works well even when a value is missing from a group.

alex2005

12-05-2013

Quote:

Originally Posted by alex2005

Thank you, it worked very well.

The output with a count of 4 was for the case where the row

Code:

ABC     12316   62.12   12042013   165143

was missing. I tested your script and it works well even when a value is missing from a group.

How?

I get the same result as Corona688, assuming the file is sorted:

Code:

$ awk '!/^ABC/{next}p!=$5-1{printf last ? last FS x[l]++ RS $0 RS : $0 RS}{p=$5;last=$0;l=$2;x[$2]++}END{print last FS x[l]++}' file
ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147 4
ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145 5

Akshay Hegde

12-05-2013

Hi,
Thank you for your reply.
I want to be able to use the script even when the values in column 5 are not consecutive.

For example, if the row

Code:

ABC     12316   62.12   12042013   165143

is missing, file01.txt would become:

Code:

TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145

Here is the result of your one-liner:

Code:

awk '!/^ABC/{next}p!=$5-1{printf last ? last FS x[l]++ RS $0 RS : $0 RS}{p=$5;last=$0;l=$2;x[$2]++}END{print last FS x[l]++}' file02.txt

Code:

ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147 4
ABC     12316    2.36   12052013   165141
ABC     12316   62.12   12042013   165143 3
ABC     12316  122.36   12052013   165145
ABC     12316  122.36   12052013   165145 5

My desired output would be:

Code:

ABC     12315   68.53   12042013   165144
ABC     12315  582.18   12052013   165147  4
ABC     12316    2.36   12052013   165141
ABC     12316  122.36   12052013   165145  4

Sorry that I did not describe this more accurately the first time.

Best Regards

alex2005
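
The one-liner above starts a new pair whenever column 5 is not exactly one more than the previous value (the `p!=$5-1` test), so a gap inside a group splits it. A minimal sketch of an alternative that closes a pair only when column 2 changes, assuming rows of the same group are adjacent (the names `key`, `n` and `last` are illustrative):

```shell
# The sample file with the 165143 row removed, as described in the post above.
data='TERM
TERM
TERM
ABC     12315   68.53   12042013   165144
ABC     12315   62.12   12042013   165145
ABC     12315  122.36   12052013   165146
ABC     12315  582.18   12052013   165147
ABC     12316    2.36   12052013   165141
ABC     12316   68.53   12042013   165142
ABC     12316  122.36   12052013   165144
ABC     12316  122.36   12052013   165145'

out=$(printf '%s\n' "$data" | awk '
    !/^ABC/ { next }                            # ignore non-data rows such as TERM
    !n || $2 != key {                           # first data row, or column 2 changed
        if (n) printf "%s  %d\n", last, n       # close the previous group with its count
        print                                   # first row of the new group
        key = $2; n = 0
    }
    { n++; last = $0 }                          # count the row and remember it
    END { if (n) printf "%s  %d\n", last, n }   # close the final group
')

printf '%s\n' "$out"
```

Because column 5 is never compared, missing values inside a group only lower the count. On the data above this should give counts of 4 for both 12315 and 12316, matching the desired output.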
