Merging Adjacent Lines Using Gawk


 
# 1  

Hi all,

I have a text file consisting of 4 columns. What I am trying to do is see whether column 2 repeats multiple times, and collapse those repeats into one row. For example, here is a snippet of the file I am trying to analyze:

1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.174
1 Sure_Thing 15.043 0.39
1 Gamble_Loss 15.496 1.236
1 Gamble_Loss 16.982 0.402
1 Gamble_Loss 17.647 0.19
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371

Here is what I am trying to do: where "Sure_Thing" and "Gamble_Loss" repeat on adjacent lines, I want to collapse each run into a single line, keeping columns 1 to 3 of the first row and adding up column 4 over the repeats. So after I gawk it, I want it to look something like this:

1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.564
1 Gamble_Loss 15.496 1.828
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371
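
As a quick check of the arithmetic, the two merged values are simply the column 4 entries of each run added together. A minimal sketch, assuming the numbers copied from the sample above (`check_sums` is a made-up name used only for illustration):

```shell
# check_sums is a hypothetical helper name used only for this illustration.
check_sums() {
    awk 'BEGIN {
        printf "%.3f\n", 0.174 + 0.39            # Sure_Thing: two adjacent rows
        printf "%.3f\n", 1.236 + 0.402 + 0.19    # Gamble_Loss: three adjacent rows
    }'
}

check_sums
```

This prints 0.564 and 1.828, matching column 4 of the merged rows.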


Here is the code I have used to analyze it so far, but it only works for 2 adjacent repeats; I want to generalize it for multiple repeats:

Code:
igawk '
BEGIN {
    OFS = " "
    prevTrial = "-"
    prevTime = "0"
    prevDur = "0"
}

{
    if ($2 == prevTrial)
        print $1, prevTrial, prevTime, prevDur + $4
    else
        print $0
    prevTrial = $2; prevTime = $3; prevDur = $4
}
' "$@"

I appreciate any input!
# 2  
Is this close to what you want?

Code:
# awk '$2==t{s+=$4}$2!=t{if(NR>1)print x,s;x=$1" "$2" "$3;t=$2;s=$4}END{print x,s}' infile
1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.564
1 Gamble_Loss 15.496 1.828
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371
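
For readability, the same accumulate-then-flush idea can be spelled out with longer names. This is only a sketch: `merge_runs`, `head`, `trial` and `sum` are names made up for illustration, and it assumes whitespace-separated input with no blank lines.

```shell
# merge_runs is a hypothetical helper name; it reads the named files, or stdin if none are given.
merge_runs() {
    awk '
    # Same trial as the previous row: keep accumulating column 4.
    NR > 1 && $2 == trial { sum += $4; next }

    # A new run starts here: flush the finished run, then start the next one.
    {
        if (NR > 1) print head, sum
        head  = $1 " " $2 " " $3   # columns 1 to 3 of the first row in the run
        trial = $2
        sum   = $4
    }

    # Flush the last run (skipped when the input is empty).
    END { if (NR > 0) print head, sum }
    ' "$@"
}
```

It would be called as `merge_runs infile`. The key point is that nothing is printed for a run until a row with a different column 2 (or the end of the input) shows up, which is why it copes with any number of adjacent repeats.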

# 3  
You need to modify it as below (note that this keeps only the first row of each run and does not add up column 4):

Code:
gawk '
BEGIN {
    OFS = " "
    prevTrial = "-"    # sentinel that should not match a real trial name
}

{
    # Skip a row whose column 2 repeats the previous row,
    # so only the first row of each run is printed.
    if ($2 == prevTrial) next
    print $0
    prevTrial = $2
}
' infile.txt

Code:
Output:

1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.174
1 Gamble_Loss 15.496 1.236
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371

---------- Post updated at 19:03 ---------- Previous update was at 18:59 ----------

Even better, you can use the short code below.

Code:
gawk '($2==p){ next ; }{print $0 ; p=$2 }' infile.txt

Code:
Output:
1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.174
1 Gamble_Loss 15.496 1.236
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371

# 4  
Thanks Tytalus, that was exactly what I was looking for.
# 5  
Code:
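use strict;
use warnings;

my $val = "---";          # column 2 of the current run (sentinel to start)
my ($prefix, $suffix);    # columns 1-3 of the run, running sum of column 4
while (<DATA>) {
    my @tmp = split;
    next unless @tmp;     # ignore blank lines
    if ($val eq $tmp[1]) {
        # Same trial as the previous row: accumulate column 4.
        $suffix += $tmp[3];
    }
    else {
        # New run: flush the previous one (nothing to flush before the first row).
        print $prefix, " ", $suffix, "\n" if defined $prefix;
        $prefix = "$tmp[0] $tmp[1] $tmp[2]";
        $suffix = $tmp[3];
        $val    = $tmp[1];
    }
}
# Flush the final run.
print $prefix, " ", $suffix, "\n" if defined $prefix;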
__DATA__
1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.174
1 Sure_Thing 15.043 0.39
1 Gamble_Loss 15.496 1.236
1 Gamble_Loss 16.982 0.402
1 Gamble_Loss 17.647 0.19
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371
1 Arrow 18.203 0.371

