Merging multiple lines to columns with awk, while inserting commas for missing lines


 
# 8  
Old 09-02-2015
Hello RalphNY,

If the order of the output lines doesn't matter to you, you can try the following.
Code:
awk -F, '{A[$1 FS $2]=$1 FS $2;B[$1 FS $2]=$(NF-2) FS $(NF-1) FS $NF;C[$1 FS $2 FS $3]=$4;A[$1 FS $2]=A[$1 FS $2] FS C[$1 FS $2 FS "WE"] FS C[$1 FS $2 FS "MO"] FS C[$1 FS $2 FS "CE"] FS C[$1 FS $2 FS "ES"] FS B[$1 FS $2]} END{for(i in A){print A[i]}}' Input_file

Output will be as follows.
Code:
3333,Hortense Gable,,,25,14,Three Pinapple Ln,Somewhere,NY
1111,John Smith,105,23,,,Fourty Two Cherry Lane,Nowhere,NY
2222,Elane Boozler,70,24,30,8,Twelve Peach Court,Elsehere,NY

If you need the output in sorted order, then the following may help you.
Code:
awk -F, '{A[$1 FS $2]=$1 FS $2;B[$1 FS $2]=$(NF-2) FS $(NF-1) FS $NF;C[$1 FS $2 FS $3]=$4;A[$1 FS $2]=A[$1 FS $2] FS C[$1 FS $2 FS "WE"] FS C[$1 FS $2 FS "MO"] FS C[$1 FS $2 FS "CE"] FS C[$1 FS $2 FS "ES"] FS B[$1 FS $2]} END{for(i in A){print A[i]}}' Input_file | sort -k1

Output will be as follows.
Code:
1111,John Smith,105,23,,,Fourty Two Cherry Lane,Nowhere,NY
2222,Elane Boozler,70,24,30,8,Twelve Peach Court,Elsehere,NY
3333,Hortense Gable,,,25,14,Three Pinapple Ln,Somewhere,NY

EDIT: Adding a multi-line form of the same solution.
Code:
awk -F, '
{
    A[$1 FS $2] = $1 FS $2
    B[$1 FS $2] = $(NF-2) FS $(NF-1) FS $NF
    C[$1 FS $2 FS $3] = $4
    A[$1 FS $2] = A[$1 FS $2] FS C[$1 FS $2 FS "WE"] FS C[$1 FS $2 FS "MO"] FS C[$1 FS $2 FS "CE"] FS C[$1 FS $2 FS "ES"] FS B[$1 FS $2]
}
END {
    for (i in A) {
        print A[i]
    }
}' Input_file | sort -k1

Thanks,
R. Singh

Last edited by RavinderSingh13; 09-02-2015 at 03:17 AM.. Reason: Adding a non one-liner form for solution now. Removed a not used array named D in solution now.
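
A possible variation, in case the order of first appearance is wanted rather than either the arbitrary for(i in A) order or sorted order: the sketch below records each ID/name key the first time it is seen and prints in that order, so no sort is needed. The function name merge_in_order, the array names, and the sample lines are assumptions made for illustration; the sample input is reconstructed from the output shown above (ID, name, code, value, then three address fields), not taken from the real Input_file.

```shell
# Hypothetical helper: merge lines per ID/name, keeping first-seen order.
merge_in_order() {
    awk -F, '
    {
        key = $1 FS $2
        if (!(key in addr)) {          # first time this ID/name pair is seen
            order[++n] = key           # remember arrival order
            addr[key] = $(NF-2) FS $(NF-1) FS $NF
        }
        val[key FS $3] = $4            # value for this code (WE, MO, CE or ES)
    }
    END {
        for (i = 1; i <= n; i++) {
            k = order[i]
            # Codes that never appeared yield empty strings, i.e. bare commas.
            print k FS val[k FS "WE"] FS val[k FS "MO"] FS val[k FS "CE"] FS val[k FS "ES"] FS addr[k]
        }
    }' "$@"
}

# Assumed sample input, piped in on stdin.
printf '%s\n' \
    '3333,Hortense Gable,CE,25,Three Pinapple Ln,Somewhere,NY' \
    '3333,Hortense Gable,ES,14,Three Pinapple Ln,Somewhere,NY' \
    '1111,John Smith,WE,105,Fourty Two Cherry Lane,Nowhere,NY' \
    '1111,John Smith,MO,23,Fourty Two Cherry Lane,Nowhere,NY' |
    merge_in_order
```

With a real file it would be called as merge_in_order Input_file. Unlike the streaming approach, this does not require the lines for one ID to be adjacent in the input.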
# 9  
Old 09-02-2015
Don,
Thanks for pointing those things out! I hadn't noticed the latest version was matching the name instead of the ID. In the real file, there definitely would be some people with identical names, so it would have to match the ID.

Just changing the $8 to $7, to match the ID rather than the name, seems to have worked. Could you tell me if there is anything wrong with the code below?

Code:
awk -F, '
NR != 1 && p1 != $7 {
    print prev
    delete TMP
}
{
    p1 = $7
    TMP[$9] = $10
    prev = $2 FS $4 FS $7 FS $8 FS TMP["WE"] FS TMP["MO"] FS TMP["CE"] FS TMP["ES"] FS $11 FS $13
}
END {
    print prev
}' OFS="," file

I'm using cygwin awk / gnu awk. I don't see any errors when running with delete TMP, but that should be compatible with gnu awk, right? Thanks again : )
# 10  
Old 09-02-2015
I'm glad to have helped.

As I said before, the above code should be fine as long as it doesn't print any diagnostics. GNU awk is one of many implementations with the extension that accepts delete array_name in addition to the delete array_element form that is required by the standards. Just keep the element-by-element delete loop in the back of your mind in case you ever need to port your code to a system that doesn't support that extension.
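
For reference, the element-by-element loop mentioned above could look roughly like the sketch below. It is a standalone demonstration, not the script from this thread: the function name clear_array_demo and the two sample values are made up for illustration. It fills TMP, empties it one element at a time, and then counts what is left.

```shell
# Hypothetical demo: clear an awk array without relying on "delete TMP".
clear_array_demo() {
    awk 'BEGIN {
        TMP["WE"] = 105
        TMP["MO"] = 23

        # Portable replacement for "delete TMP": remove each element in turn.
        for (k in TMP)
            delete TMP[k]

        # Count the remaining elements to confirm the array is empty.
        n = 0
        for (k in TMP)
            n++
        print n
    }'
}

clear_array_demo
```

In the script above, the for (k in TMP) delete TMP[k] loop would take the place of the delete TMP line. Another idiom often used for the same purpose is split("", TMP), which also leaves TMP empty.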