awk: switching lines and concatenating lines?


 
# 1  
Old 03-28-2010

Hello, I have only recently started with awk and need to write the following.
My input consists of lines that start with a few letters (the key), then a space and a number, followed by various other characters:

fiRcQ 9( [various data ])
klsRo 9( [various data ]) pause
fiRcQ 9( [various data ]) pause
klsRo continue 1
aPLnJ 62( [various data ])
fiRcQ continue 5
... and so on

I want output in which each pause line is followed, on the same line, by the continue line that has the same key identifier:

fiRcQ 9( [various data ])
klsRo 9( [various data ]) pause klsRo continue 1
fiRcQ 9( [various data ]) pause fiRcQ continue 5
aPLnJ 62( [various data ])

So the algorithm would be something along the lines of:

Code:
start on line = 1
search for "pause" [else increase line by 1 and repeat]
 next line, search for "continue" [else move to next line]
  compare \^[a-zA-Z]*\ from starting line with the same regexp on the current line [else move to next line]
    in case of match, take current line, add it to starting line and delete current line
    increase line by 1 and repeat.

I know what I want to do, but being a total newbie in awk, I have no idea what the syntax would look like.
# 2  
Old 03-28-2010
If the order of the lines in output doesn't matter, this will do what you want...
Code:
awk '/pause/ || /continue/{arr[$1]=arr[$1]" "$0;next}
{arr[$0]=arr[$0]" "$0}END{for(i in arr) print arr[i]}' infile
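
For readers new to awk, the same one-liner can be spread out with comments. This is only a sketch for reading along: the wrapper function name group_by_key is hypothetical (it does not appear in the thread), and the sample input is taken from the first post.

```shell
# group_by_key is a hypothetical wrapper name, not from the thread.
group_by_key() {
    awk '
    # Lines mentioning "pause" or "continue": append the whole line ($0)
    # to whatever is already stored under the first field ($1).
    /pause/ || /continue/ { arr[$1] = arr[$1] " " $0; next }

    # Every other line is stored under its own full text, so it stays separate.
    { arr[$0] = arr[$0] " " $0 }

    # After the last line, print each collected entry. "for (i in arr)"
    # visits the keys in an unspecified order, so input order is lost.
    END { for (i in arr) print arr[i] }
    ' "$@"
}

printf '%s\n' \
    'fiRcQ 9( [various data ])' \
    'klsRo 9( [various data ]) pause' \
    'fiRcQ 9( [various data ]) pause' \
    'klsRo continue 1' \
    'aPLnJ 62( [various data ])' \
    'fiRcQ continue 5' | group_by_key
```

Because each entry is built as the old value, a space, and the new line, every output line starts with a space.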

# 3  
Old 03-28-2010
Thanks

Well...what I meant was: when it finds a line with "pause", search one line after another from top down for a line with a matching first column (letter-id) that contains "continue". Then take the "continue" line and MOVE it to the end of the first one, keeping all the characters of both lines so instead of two separate lines you get one longer line in place of the first.

The input is already ordered so that, although more than two lines can share the same id, if a line with a given id contains "pause", then the next line with the same id will contain "continue", in the same way that interrupted processes work.

EDIT: Could you please explain how your piece of code works? I don't really understand it :-(

Last edited by Borghal; 03-28-2010 at 02:16 PM..
# 4  
Old 03-28-2010
Something like this?
Code:
awk '/pause/{ a[$1]=$0;next } a[$1]{ print a[$1] FS $0;next } 1' file
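
A commented expansion may make the one-liner easier to follow. One caveat with the original: a[$1] is never cleared, so a later non-pause line with the same key would be glued to the old pause line again. The sketch below guards against that with a `$1 in a` test and a `delete`; both are additions, not part of the original post, and join_pause_continue is a hypothetical function name.

```shell
# join_pause_continue is a hypothetical wrapper name, not from the thread.
join_pause_continue() {
    awk '
    # A line containing "pause": hold it back under its key ($1), print nothing yet.
    /pause/ { a[$1] = $0; next }

    # A later line whose key has a held pause line: print both on one line,
    # then forget the held line so it cannot be reused (the delete is an addition).
    $1 in a { print a[$1] FS $0; delete a[$1]; next }

    # Anything else is printed unchanged ("1" is an always-true pattern).
    1
    ' "$@"
}

printf '%s\n' \
    'fiRcQ 9( [various data ])' \
    'klsRo 9( [various data ]) pause' \
    'fiRcQ 9( [various data ]) pause' \
    'klsRo continue 1' \
    'aPLnJ 62( [various data ])' \
    'fiRcQ continue 5' | join_pause_continue
```

Note that each joined line is printed at the position of its continue line, not at the position of its pause line, so the output order can differ from the input order.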

# 5  
Old 03-28-2010
Hello, Borghal:

Welcome to the forums. I modified your sample data to include a pause-continue pair that reuses a key used by a previous pause-continue pair.

Code:
$ cat data
fiRcQ 9( [various data ])
klsRo 9( [various data ]) pause
fiRcQ 9( [various data ]) pause
klsRo continue 1
aPLnJ 62( [various data ])
fiRcQ continue 5
klsRo 9( [various data ]) pause
klsRo continue 21

$ awk 'NR==FNR {if ($2=="continue") c[$1,++c[$1,"i"]]=$0; next} $NF=="pause" {print $0,c[$1,++p[$1]]; next} $2!="continue"' data data
fiRcQ 9( [various data ])
klsRo 9( [various data ]) pause klsRo continue 1
fiRcQ 9( [various data ]) pause fiRcQ continue 5
aPLnJ 62( [various data ])
klsRo 9( [various data ]) pause klsRo continue 21
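
The one-liner is dense, so here is a commented sketch of the same two-pass idea. The function name two_pass_join and the separate counter array seen are hypothetical (the original keeps its counter inside c itself); the logic is otherwise intended to match.

```shell
# two_pass_join is a hypothetical wrapper name; it takes one file name
# and hands it to awk twice so the file is read in two passes.
two_pass_join() {
    awk '
    # Pass 1 (NR == FNR holds only while reading the first file argument):
    # store each continue line under its key and a per-key counter.
    NR == FNR {
        if ($2 == "continue") { n = ++seen[$1]; c[$1, n] = $0 }
        next
    }

    # Pass 2: the k-th pause line of a key gets the k-th continue line of that key.
    $NF == "pause" { print $0, c[$1, ++p[$1]]; next }

    # Continue lines were already attached above, so skip them; print the rest.
    $2 != "continue"
    ' "$1" "$1"
}

tmp=$(mktemp)
printf '%s\n' \
    'fiRcQ 9( [various data ])' \
    'klsRo 9( [various data ]) pause' \
    'fiRcQ 9( [various data ]) pause' \
    'klsRo continue 1' \
    'aPLnJ 62( [various data ])' \
    'fiRcQ continue 5' \
    'klsRo 9( [various data ]) pause' \
    'klsRo continue 21' > "$tmp"
two_pass_join "$tmp"
rm -f "$tmp"
```

Because the file is named twice on the command line, this approach needs a real file and cannot read from a pipe.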

Regards,
Alister

---------- Post updated at 02:03 PM ---------- Previous update was at 01:56 PM ----------

If you need to strictly preserve the order, I would recommend my solution over Franklin52's. If not, then most definitely use Franklin52's as it's simpler and could be significantly faster (since mine must read the data twice).

Alister

Last edited by alister; 03-28-2010 at 03:09 PM..
# 6  
Old 03-28-2010
Thanks, everyone. Unfortunately, I do need to preserve the order of the input file, and since I'm using awk as one stage of a pipeline (cat | awk | grep | sed ...), I don't think reading the data twice is an option.

This is proving more difficult than I thought it would be :-)

Could someone please explain what this line does?
(It's supposed to do what I want, but all it does is delete both of the lines concerned.)

Code:
awk '/pause$/ {array[$1$2]=$0; next}/continue/ {if($1$2 in array) print array[$1$2] $0; delete array[$0]; next}{ print $0 }'

As I read it: first it looks for a line ending in pause and stores that line in array[$1$2] (why $1$2?), then it moves on to the next line and starts looking for continue. Is that right?

Sorry, I must look really dumb, but I can't find any good tutorial that would help me understand how it works...

Last edited by Borghal; 03-28-2010 at 04:08 PM..
# 7  
Old 03-28-2010
For the best help possible, you should post the entire pipeline (your filter).

Your awk code appears to delete lines that end in "pause" or contain "continue" because, when a continue line is found and you look up $1$2 in the array, there is never a match. For a continue line, $1$2 is the key followed by the word "continue". No pause line is stored under such a key, because the second field of a pause line begins with a number. The pause lines are therefore stored but never printed, and the continue lines reach "next" without printing anything. You need to key on $1 alone, not on $1 and $2.
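
A quick way to see the mismatch is to print the key that each kind of line would produce. This is a small illustrative sketch; show_keys is a hypothetical helper name and the two input lines come from the sample data.

```shell
# show_keys is a hypothetical helper name; it prints the array key
# that the $1$2 version would compute for each input line.
show_keys() {
    awk '{ print $1 $2 }' "$@"
}

printf '%s\n' \
    'klsRo 9( [various data ]) pause' \
    'klsRo continue 1' | show_keys
# prints:
#   klsRo9(
#   klsRocontinue
```

The two keys differ, so the `in array` test on the continue line can never succeed.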

Tweaking your code:
Code:
awk '/pause$/ {array[$1]=$0; next}/continue/ {if($1 in array) print array[$1], $0; delete array[$1]; next}{ print $0 }'

... but, that's just a more verbose version of Franklin52's solution above.

This approach will reorder the lines a bit, because while the array holds a pause line until its continue is found, other lines may print, effectively moving a pause line further down the sequence.
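
If the original order matters and the data arrives on a pipe (so it can be read only once), one possible workaround is to buffer the output lines in memory and print them all at the end. The following is a sketch under assumptions, not something posted in this thread: the function name join_in_place and the array names are hypothetical, the input is assumed to fit in memory, and "pause" is assumed to be the last field and "continue" the second field, as in the sample data.

```shell
# join_in_place is a hypothetical wrapper name, not from the thread.
join_in_place() {
    awk '
    # A continue line whose key has a pending pause line: append it to the
    # buffered pause line at its original position and do not buffer it itself.
    $2 == "continue" && ($1 in pending) {
        n = pending[$1]
        line[n] = line[n] FS $0
        delete pending[$1]
        next
    }

    # Every other line is buffered in input order.
    { line[++count] = $0 }

    # Remember the buffer slot of a pause line so its continue can find it.
    $NF == "pause" { pending[$1] = count }

    # Nothing is printed until the whole input has been read.
    END { for (i = 1; i <= count; i++) print line[i] }
    ' "$@"
}

printf '%s\n' \
    'fiRcQ 9( [various data ])' \
    'klsRo 9( [various data ]) pause' \
    'fiRcQ 9( [various data ]) pause' \
    'klsRo continue 1' \
    'aPLnJ 62( [various data ])' \
    'fiRcQ continue 5' \
    'klsRo 9( [various data ]) pause' \
    'klsRo continue 21' | join_in_place
```

The trade-off is that later stages of the pipeline receive nothing until awk has seen the end of its input, and memory use grows with the size of the data.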

Alister

Last edited by alister; 03-28-2010 at 04:50 PM..