Merging multiple lines into single line based on one column

 
# 8  
Old 12-16-2016
And the inevitable:
Code:
awk -F. '{A[$1]=A[$1] q "," q $2} END{for(i in A) print q i A[i] q}' q=\' file

This is fine if the output order is not important (otherwise pipe the output through sort), and it may be helpful in particular if the input is not grouped.

Last edited by Scrutinizer; 12-16-2016 at 04:26 AM..
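
To illustrate the point about ungrouped input, here is a minimal sketch that runs the one-liner above on made-up sample data (the "key.value" records and the variable name merged are assumptions, not taken from the thread) and pipes the result through sort, since the order of for(i in A) is unspecified:

```shell
# Hypothetical sample data: "key.value" records, deliberately not grouped by key.
merged=$(printf 'a.1\nb.3\na.2\n' |
  awk -F. '{A[$1]=A[$1] q "," q $2} END{for(i in A) print q i A[i] q}' q=\' - |
  sort)

# Show the merged lines, one per key.
printf '%s\n' "$merged"
```

With this sample the two a records end up on one line although a b record sits between them.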
# 9  
Old 12-16-2016
Code:
awk -F. -vFMT="'%s'" '                          # split line at ".", use FMT variable for printing as single
                                                # quote handling is difficult within the awk script
$1 != LAST      {printf QRS FMT, $1             # on changing first field (LAST is the empty string in the first record
                                                # so pattern is TRUE) print it after a temporary record separator (which
                                                # is empty, too, on the first record)
                 QRS = ORS                      # now set temp sep to line feed for all remaining records
                 LAST = $1                      # keep track of the first field for the upcoming pattern checks
                }
                {printf "," FMT, $2             # now keep adding the quoted second fields until first field changes
                }
END             {printf ORS                     # final line feed 
                }
' file

# 10  
Old 12-16-2016
Just wondering whether the final END {printf ORS} might be slightly improved as END {printf QRS}, to suppress the empty line for empty input (QRS is still empty if no record was read).
This User Gave Thanks to ronaldxs For This Post:
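
The difference can be checked with a small sketch, assuming /dev/null as the empty input and counting the bytes each variant prints (the variable names with_ors and with_qrs are made up for this example):

```shell
# Original END action: prints one line feed even when there is no input.
with_ors=$(awk -F. -v FMT="'%s'" '
  $1 != LAST { printf QRS FMT, $1; QRS = ORS; LAST = $1 }
             { printf "," FMT, $2 }
  END        { printf ORS }
' /dev/null | wc -c)

# Suggested END action: QRS was never set, so nothing is printed.
with_qrs=$(awk -F. -v FMT="'%s'" '
  $1 != LAST { printf QRS FMT, $1; QRS = ORS; LAST = $1 }
             { printf "," FMT, $2 }
  END        { printf QRS }
' /dev/null | wc -c)

# $(( )) strips the padding some wc implementations add.
echo "bytes with ORS: $((with_ors)), bytes with QRS: $((with_qrs))"
```

For non-empty input both variants behave the same, because QRS is set to ORS as soon as the first record has been read.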
# 11  
Old 12-17-2016
Quote:
Originally Posted by raju2016
Hi Ravinder/Rudy,
Could you please explain what exactly this AWK is doing
Hello raju2016,

Could you please go through the following and let me know if this helps? Note that this is not the code in runnable form; it is only expanded for explanation purposes.
Code:
awk                                            #### Starting awk here.
-F"."                                          #### -F sets the field separator; awk lets us choose a custom delimiter as per our requirement, so here a DOT is the delimiter.
-vs1="'"                                       #### Defining a variable named s1 whose value is '. In awk we can define variables by using -v variable_name="value".
'FNR==NR                                       #### The FNR==NR condition. FNR and NR are both awk built-in variables that hold the number of the line (record) being read;
                                                    the major difference is that FNR is RESET each time a new Input_file is read, while NR keeps increasing
                                                    until the last Input_file has been read (awk accepts multiple Input_files). So FNR==NR is TRUE only while the first Input_file is being read.
{A[$1]=A[$1]?A[$1] OFS s1 $NF s1:s1 $NF s1;    #### Creating an array named A whose index is $1 (first field). If that index is NOT yet registered in array A, its value becomes s1 $NF s1,
                                               #### where s1, as mentioned before, is a variable with value ' and $NF denotes the last field of the line being read. If array A already
                                                    holds a value for that index (tested in the code by ?), the value becomes A[$1] OFS s1 $NF s1, which means appending the current value
                                                    to the previous value stored for the current index (which is $1 of each line).
next}                                          #### next is a built-in awk statement. It skips all further statements for the current line and moves on to the next line.
A[$1]{                                         #### A[$1] is a condition that is reached only when Input_file is read the 2nd time; if the first field is still present in array A, the following statements are executed.
print s1 $1 s1 OFS A[$1];                      #### Printing s1 $1 s1 OFS A[$1], where s1 is the variable mentioned above, $1 is the first field of the line, OFS is the output field separator (an awk built-in variable whose default value is a space), and A[$1] is the value of array A whose index is $1 of the current line.
delete A[$1]                                   #### Deleting the element of array A whose index is $1 so that an already printed $1 is not printed again, as we are reading Input_file twice.
}' OFS=,   Input_file  Input_file              #### Setting OFS, awk's built-in output field separator variable, to a comma (,), and mentioning Input_file 2 times here.
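
For reference, the expanded explanation above can be put back into runnable form roughly as follows. This is only a sketch: the sample records, the temporary file and the variable names input and two_pass are assumptions made for the example, not part of the original question.

```shell
# Hypothetical sample file: "key.value" records, not grouped by key.
input=$(mktemp)
printf 'a.1\nb.3\na.2\n' > "$input"

# First pass (FNR==NR) collects the quoted values per key in A;
# second pass prints each key once, in order of first appearance.
two_pass=$(awk -F. -v s1="'" '
  FNR == NR { A[$1] = A[$1] ? A[$1] OFS s1 $NF s1 : s1 $NF s1; next }
  A[$1]     { print s1 $1 s1 OFS A[$1]; delete A[$1] }
' OFS=, "$input" "$input")

rm -f "$input"
printf '%s\n' "$two_pass"
```

Because the file is read twice, this form needs a regular file rather than a pipe, but it keeps the keys in their original order without sort.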

Thanks,
R. Singh
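
The FNR==NR test described in the explanation above can also be watched in isolation. A minimal sketch, assuming a made-up two-line file that is passed to awk twice (the names tmp and passes are hypothetical):

```shell
# Hypothetical two-line sample file, read twice by awk.
tmp=$(mktemp)
printf 'a.1\nb.2\n' > "$tmp"

# Print NR, FNR and which pass is active for each record.
passes=$(awk '{ print NR, FNR, (FNR == NR ? "first pass" : "second pass") }' "$tmp" "$tmp")

rm -f "$tmp"
printf '%s\n' "$passes"
```

NR runs from 1 to 4 while FNR restarts at 1 for the second read, so the two are equal only during the first pass. Note that if the first file is empty, FNR==NR is also true for the second file, so the idiom relies on a non-empty first file.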