Awk multiple lines with 4th column on to a single line | Unix Linux Forums | Shell Programming and Scripting

    #1  
09-30-2011
Vasan
Awk multiple lines with 4th column on to a single line

This is related to one of my previous posts. I have a huge file, and I am currently using a loop to read it and check each line to build a single record, which takes far too long. I thought there should be a way to do this in awk or sed.

I found this code in this forum and I think it's close to what I need.

Code:
nawk 'BEGIN {FS="|"}END{for(r in _)print r FS _[r]}{idx=$1 FS $2;_[idx]=_[idx]?_[idx] FS $3:$3}' myFile

I changed it to fit my needs, but I can't get it to work. Can anyone please help me with this? Much appreciated.

Input file:

Code:
XXXXXXXXXX1|07/24/2007|1|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/24/2007|2|sometxt
XXXXXXXXXX1|07/30/2007|1|some_random_text
XXXXXXXXXX1|07/30/2007|2|new_random.
XXXXXXXXXX1|09/27/2007|1|some_nre_random_test
XXXXXXXXXX1|09/27/2007|2|blabla
XXXXXXXXXX1|09/27/2007|3|fixed_text_random
XXXXXXXXXX1|11/14/2007|1|blabla
XXXXXXXXXX1|11/28/2007|1|junk_text
XXXXXXXXXX2|12/21/2007|1|Notes



I am looking for output like this:

Code:
XXXXXXXXXX1|07/24/2007|aaaaaaaaaaabbbbbbbbccccccccccccc|sometxt
XXXXXXXXXX1|07/30/2007|some_random_text|new_random.
XXXXXXXXXX1|09/27/2007|some_nre_random_test|blabla|fixed_text_random
XXXXXXXXXX1|11/14/2007|blabla
XXXXXXXXXX1|11/28/2007|junk_text
XXXXXXXXXX2|12/21/2007|Notes


    #2  
09-30-2011
ieth0

Code:
cat FILENAME|awk -F"|" '{print $1,$2,$4}' |tr " " "\|"

    #3  
09-30-2011
Vasan
Thanks.

But I am not getting the right output.

Here is what I got:

Code:
$ cat FILENAME|awk -F"|" '{print $1,$2,$4}' |tr " " "\|"
XXXXXXXXXX1|07/24/2007|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/24/2007|sometxt
XXXXXXXXXX1|07/30/2007|some_random_text
XXXXXXXXXX1|07/30/2007|new_random.
XXXXXXXXXX1|09/27/2007|some_nre_random_test
XXXXXXXXXX1|09/27/2007|blabla
XXXXXXXXXX1|09/27/2007|fixed_text_random
XXXXXXXXXX1|11/14/2007|blabla
XXXXXXXXXX1|11/28/2007|junk_text
XXXXXXXXXX2|12/21/2007|Notes


Am I missing something?

Thanks

    #4  
09-30-2011
ieth0
Quote:
Originally Posted by Vasan
Am I missing something?

No, I just misunderstood. So you need to merge all the records for each day into one line, separated by pipes. Maybe I can do it in a one-liner; I need some time.

---------- Post updated at 05:05 PM ---------- Previous update was at 05:02 PM ----------

Try this one:

Code:
awk 'BEGIN {FS="|"}END{for(r in _)print r FS _[r]}{idx=$1 FS $2;_[idx]=_[idx]?_[idx] FS $4:$4}' FILENAME |sort -k2
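
The one-liner above walks the array with `for (r in _)`, and awk leaves that iteration order unspecified, which is why the result is piped through sort. A minimal sketch of an order-preserving variant follows, assuming groups should be printed in the order their key first appears in the input. The function name `merge_by_key` and the array names `keys` and `val` are illustrative and do not come from the thread.

```shell
# merge_by_key [FILE...]: join the 4th field of all lines that share
# fields 1 and 2. Reads standard input when no file is given.
# Assumption: output order = order in which each key is first seen.
merge_by_key() {
    awk -F'|' '
        {
            key = $1 FS $2
            if (!(key in val)) {        # first time this key is seen
                keys[++n] = key         # remember arrival order
                val[key] = $4
            } else {
                val[key] = val[key] FS $4
            }
        }
        END {
            for (i = 1; i <= n; i++)
                print keys[i] FS val[keys[i]]
        }
    ' "$@"
}
```

It could be called as `merge_by_key FILENAME`. Like the original, it holds every merged record in memory until the end of the input.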

    #5  
09-30-2011
Vasan
Thanks.

I got results, and they look like they are grouped by date, but I would like the following output.

Sample input:

Code:
XXXXXXXXXX1|07/24/2007|1|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/26/2007|2|sometxt
XXXXXXXXXX1|07/30/2007|1|some_random_text
XXXXXXXXXX1|08/31/2007|2|new_random.
XXXXXXXXXX1|09/27/2007|3|some_nre_random_test

Required output:

1. First record would be:

Code:
XXXXXXXXXX1|07/24/2007|aaaaaaaaaaabbbbbbbbccccccccccccc|sometxt

or

Code:
XXXXXXXXXX1|07/26/2007|aaaaaaaaaaabbbbbbbbccccccccccccc|sometxt


2. The second record would be

Code:
XXXXXXXXXX1|07/30/2007|some_random_text|new_random.|some_nre_random_test

or

Code:
XXXXXXXXXX1|08/31/2007|some_random_text|new_random.|some_nre_random_test

or

Code:
XXXXXXXXXX1|09/27/2007|some_random_text|new_random.|some_nre_random_test

In the output, the date can be any date from the group.

E.g., for the second record the possible dates are 07/30, 08/31 and 09/27. We can have any one of these dates; the sequence (the next field) is what I am looking for.

I appreciate your help.

Thanks again.
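
One way to read the requirement above is that a new output record starts whenever the third field (the sequence number) is 1, regardless of the date. The following is only a sketch under that assumption; it also assumes the input begins with a sequence-1 line and keeps the first date of each group, which the post allows. The function name `merge_by_sequence` is hypothetical.

```shell
# merge_by_sequence [FILE...]: start a new record each time field 3 is 1,
# and append field 4 of every following line to that record.
# Assumptions: the first input line has sequence 1; the date of the
# first line in each group is the one that is kept.
merge_by_sequence() {
    awk -F'|' '
        $3 == 1 {                       # sequence restarted: new output record
            if (NR > 1) printf "%s", ORS
            printf "%s", $1 FS $2 FS $4
            next
        }
        { printf "%s", FS $4 }          # continuation: append the text only
        END { if (NR) printf "%s", ORS }
    ' "$@"
}
```

Because each line is written as soon as it is read, nothing is buffered, which may matter for a huge file.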

    #6  
09-30-2011
alister

Code:
awk -F\| 'd==$2 {printf("%s", FS$4); next} {d=$2; printf("%s", o$1FS$2FS$4)} NR==1 {o=ORS} END {printf("%s", ORS)}' file

d = the date from the previous line's second field, $2
o = empty for the first line; for all subsequent lines it is set to the output record separator, ORS.

If the current line's date matches the previous line's, just print the field separator, |, followed by the value of the fourth field.

Otherwise the current line's date is different: set d to the new date and print the current line's fields of interest. If it is not the first line printed, print the output record separator before the current line to terminate the previous record.

When done, print one last record separator to cap the output.

Regards,
Alister
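
The explanation above can also be laid out as a multi-line script. This is a sketch of the same idea rather than a replacement for the one-liner: the names `merge_by_date` and `prev` are illustrative, and, like the one-liner, it compares only the date in field 2, so it assumes lines that belong together are adjacent in the input.

```shell
# merge_by_date [FILE...]: merge consecutive lines that share the date in
# field 2, keeping fields 1 and 2 once and appending each 4th field.
merge_by_date() {
    awk -F'|' '
        $2 == prev {                        # same date as the previous line
            printf "%s", FS $4              # append only the text field
            next
        }
        {
            if (NR > 1) printf "%s", ORS    # terminate the previous record
            prev = $2
            printf "%s", $1 FS $2 FS $4
        }
        END { if (NR) printf "%s", ORS }    # final newline, only if there was input
    ' "$@"
}
```

Usage would be `merge_by_date file`, or the function can read from a pipe.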

Last edited by alister; 09-30-2011 at 10:15 PM.. Reason: To correct output format
    #7  
10-01-2011
Vasan
Thanks Alister

It works great.

Thanks Again.

---------- Post updated at 06:14 PM ---------- Previous update was at 08:09 AM ----------

Hi Alister,

A small clarification:

When a value is appended to the previous line, I do not want the field separator "|" to be added; it should appear only on the new line.

In other words, I need the field separator between col1 and col2 but not between the remaining columns.

Is that possible?

I am looking for something like this:


Code:
XXXXXXXXXX1|07/24/2007|aaaaaaaaaaabbbbbbbbccccccccccccc sometxt
XXXXXXXXXX1|07/30/2007|some_random_text new_random.some_nre_random_test


Thanks for your help.

---------- Post updated at 10:04 PM ---------- Previous update was at 06:14 PM ----------

I was able to figure it out.

Thanks.
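
The thread does not show the final solution. One possible sketch makes the joining string a parameter, so the appended values can be separated by a space, or by nothing at all, while col1 and col2 stay pipe-separated. The function name `merge_with_sep` and the awk variable `sep` are assumptions made for illustration; the sample output above uses a space in one place and no separator in another, so the intended separator is not certain.

```shell
# merge_with_sep SEP [FILE...]: like merging by date, but appended 4th
# fields are joined with SEP instead of the input field separator.
merge_with_sep() {
    sep=$1; shift
    awk -F'|' -v sep="$sep" '
        $2 == prev { printf "%s", sep $4; next }    # append with custom separator
        {
            if (NR > 1) printf "%s", ORS            # terminate the previous record
            prev = $2
            printf "%s", $1 FS $2 FS $4
        }
        END { if (NR) printf "%s", ORS }
    ' "$@"
}
```

For example, `merge_with_sep ' ' file` joins the texts with a single space, and `merge_with_sep '' file` joins them with nothing in between.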
