Awk multiple lines with 4th column on to a single line


 
# 1  
Old 09-30-2011

This is related to one of my previous posts. I have a huge file, and I am currently using a loop to read it and check each line to build each single record. Parsing the records this way takes far too long, so I thought there should be a way to do this in awk or sed.

I found this code in this forum, and I think it is close to what I need.
Code:
nawk 'BEGIN {FS="|"}END{for(r in _)print r FS _[r]}{idx=$1 FS $2;_[idx]=_[idx]?_[idx] FS $3:$3}' myFile

I changed it to fit my requirement, but I can't get it to work. Can anyone please help me with this? It would be much appreciated.

Input file:
Code:
XXXXXXXXXX1|07/24/2007|1|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/24/2007|2|sometxt
XXXXXXXXXX1|07/30/2007|1|some_random_text
XXXXXXXXXX1|07/30/2007|2|new_random.
XXXXXXXXXX1|09/27/2007|1|some_nre_random_test
XXXXXXXXXX1|09/27/2007|2|blabla
XXXXXXXXXX1|09/27/2007|3|fixed_text_random
XXXXXXXXXX1|11/14/2007|1|blabla
XXXXXXXXXX1|11/28/2007|1|junk_text
XXXXXXXXXX2|12/21/2007|1|Notes



I am looking for output something like this:

Out:
Code:
XXXXXXXXXX1|07/24/2007|aaaaaaaaaaabbbbbbbbccccccccccccc|sometxt
XXXXXXXXXX1|07/30/2007|some_random_text|new_random.
XXXXXXXXXX1|09/27/2007|some_nre_random_test|blabla|fixed_text_random
XXXXXXXXXX1|11/14/2007|blabla
XXXXXXXXXX1|11/28/2007|junk_text
XXXXXXXXXX2|12/21/2007|Notes



Last edited by Franklin52; 10-01-2011 at 05:56 AM.. Reason: Please use code tags for data and code samples, thank you
# 2  
Old 09-30-2011
Code:
cat FILENAME|awk -F"|" '{print $1,$2,$4}' |tr " " "\|"

# 3  
Old 09-30-2011
Thanks.

But I am not getting the right output.

Here is what I got:
Code:
$ cat FILENAME|awk -F"|" '{print $1,$2,$4}' |tr " " "\|"
XXXXXXXXXX1|07/24/2007|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/24/2007|sometxt
XXXXXXXXXX1|07/30/2007|some_random_text
XXXXXXXXXX1|07/30/2007|new_random.
XXXXXXXXXX1|09/27/2007|some_nre_random_test
XXXXXXXXXX1|09/27/2007|blabla
XXXXXXXXXX1|09/27/2007|fixed_text_random
XXXXXXXXXX1|11/14/2007|blabla
XXXXXXXXXX1|11/28/2007|junk_text
XXXXXXXXXX2|12/21/2007|Notes


Am I missing something?

Thanks

Last edited by Franklin52; 10-01-2011 at 05:56 AM.. Reason: Please use code tags for data and code samples, thank you
# 4  
Old 09-30-2011
Quote:
Originally Posted by Vasan
Am I missing something?
No, I just misunderstood.
So you need to merge all the records for each day into one line, separated by pipes. Maybe I can do it in one line, but I need some time.

---------- Post updated at 05:05 PM ---------- Previous update was at 05:02 PM ----------

try this one:
Code:
awk 'BEGIN {FS="|"}END{for(r in _)print r FS _[r]}{idx=$1 FS $2;_[idx]=_[idx]?_[idx] FS $4:$4}' FILENAME |sort -t'|' -k1,1 -k2,2
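
Note that for (r in _) visits the keys in an unspecified order, which is why the output has to be sorted afterwards. A minimal sketch of a variant that keeps the input order without sort, assuming it is acceptable to hold all merged records in memory. The names sample, ordered, val, order and n are made up for this illustration, and the rows are a subset of the input from the first post:

```shell
# Sample rows taken from the first post; in practice awk would read the real file instead.
sample='XXXXXXXXXX1|07/24/2007|1|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/24/2007|2|sometxt
XXXXXXXXXX1|07/30/2007|1|some_random_text
XXXXXXXXXX1|07/30/2007|2|new_random.
XXXXXXXXXX2|12/21/2007|1|Notes'

ordered=$(printf '%s\n' "$sample" | awk -F'|' '
    {
        idx = $1 FS $2                      # key: ID plus date
        if (idx in val) {
            val[idx] = val[idx] FS $4       # key seen before: append field 4
        } else {
            order[++n] = idx                # remember first-seen position of the key
            val[idx] = $4
        }
    }
    END {
        # Walk the keys in the order they first appeared in the input.
        for (i = 1; i <= n; i++) print order[i] FS val[order[i]]
    }
')
printf '%s\n' "$ordered"
```

Because the keys are stored in a second array, the records come out in the order they first appear in the file, at the cost of a little extra memory.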

# 5  
Old 09-30-2011
Thanks.

I got the results, and they look like they are grouped by date, but I would like to have the following output.

Sample Input
Code:
XXXXXXXXXX1|07/24/2007|1|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/26/2007|2|sometxt
XXXXXXXXXX1|07/30/2007|1|some_random_text
XXXXXXXXXX1|08/31/2007|2|new_random.
XXXXXXXXXX1|09/27/2007|3|some_nre_random_test

Required output:

1. The first record would be:
Code:
XXXXXXXXXX1|07/24/2007|aaaaaaaaaaabbbbbbbbccccccccccccc|sometxt

or
Code:
XXXXXXXXXX1|07/26/2007|aaaaaaaaaaabbbbbbbbccccccccccccc|sometxt


2. The second record would be:
Code:
XXXXXXXXXX1|07/30/2007|some_random_text|new_random.|some_nre_random_test

or
Code:
XXXXXXXXXX1|08/31/2007|some_random_text|new_random.|some_nre_random_test

or
Code:
XXXXXXXXXX1|09/27/2007|some_random_text|new_random.|some_nre_random_test

In the output, the date can be any date from the group.

E.g., for the second record the possible dates are 07/30, 08/31 and 09/27. We can have any one of these dates; the sequence (the next field) is what I am looking for.

I appreciate your help.

Thanks Again
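
One way to read this requirement is that a new output record starts whenever the sequence number in field 3 goes back to 1. A minimal sketch under that assumption (this is an illustration, not a command posted in this thread; the names sample, merged and key are made up, and the date kept is simply the first one in each group):

```shell
# Sample input from this post; in practice awk would read the real file instead.
sample='XXXXXXXXXX1|07/24/2007|1|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/26/2007|2|sometxt
XXXXXXXXXX1|07/30/2007|1|some_random_text
XXXXXXXXXX1|08/31/2007|2|new_random.
XXXXXXXXXX1|09/27/2007|3|some_nre_random_test'

merged=$(printf '%s\n' "$sample" | awk -F'|' '
    # Start a new record when the sequence number is 1 or the ID in field 1 changes.
    $3 == 1 || $1 != key {
        if (NR > 1) printf "\n"             # terminate the previous record
        key = $1
        printf "%s", $1 FS $2 FS $4
        next
    }
    # Any other line continues the open record: append field 4 only.
    { printf "%s", FS $4 }
    END { if (NR) printf "\n" }             # terminate the last record
')
printf '%s\n' "$merged"
```

This reads the file once and keeps only the current record in flight, so it should also suit a very large file.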

Last edited by Franklin52; 10-01-2011 at 05:58 AM.. Reason: Please use code tags for data and code samples, thank you
# 6  
Old 09-30-2011
Code:
awk -F\| 'd==$2 {printf("%s", FS $4); next} {d=$2; printf("%s", o $1 FS $2 FS $4)} NR==1 {o=ORS} END {if (NR) printf("%s", ORS)}' file

d = the date in the previous line's second field, $2
o = empty for the first line; for all subsequent lines it's set to the output record separator, ORS.

If the current line's date matches the previous', just print the field separator, |, followed by the value of the fourth field.

Otherwise, the current line's date is different: set d to store the new date and print the current line's fields of interest. If it is not the first line printed, print the output record separator before the current line, to terminate the previous record.

When done, print out one last record separator to cap the output.
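
The same steps can be laid out over several lines. This is only a sketch: comparing field 1 as well as the date is an addition that is not in the one-liner (it keeps two different IDs that share a date from being merged), and the names sample, grouped and id are made up for the illustration. The rows are the input from the first post:

```shell
# Input file contents from the first post; in practice awk would read the real file instead.
sample='XXXXXXXXXX1|07/24/2007|1|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/24/2007|2|sometxt
XXXXXXXXXX1|07/30/2007|1|some_random_text
XXXXXXXXXX1|07/30/2007|2|new_random.
XXXXXXXXXX1|09/27/2007|1|some_nre_random_test
XXXXXXXXXX1|09/27/2007|2|blabla
XXXXXXXXXX1|09/27/2007|3|fixed_text_random
XXXXXXXXXX1|11/14/2007|1|blabla
XXXXXXXXXX1|11/28/2007|1|junk_text
XXXXXXXXXX2|12/21/2007|1|Notes'

grouped=$(printf '%s\n' "$sample" | awk -F'|' '
    # Same ID and date as the previous line: append field 4 to the open record.
    $1 == id && $2 == d { printf "%s", FS $4; next }
    # Otherwise close the previous record (if any) and start a new one.
    {
        if (NR > 1) printf "%s", ORS
        id = $1; d = $2
        printf "%s", $1 FS $2 FS $4
    }
    # Cap the output with one last record separator.
    END { if (NR) printf "%s", ORS }
')
printf '%s\n' "$grouped"
```

Like the one-liner, this only compares each line with the one before it, so it relies on the input already being ordered by ID and date.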

Regards,
Alister

Last edited by alister; 09-30-2011 at 11:15 PM.. Reason: To correct output format
# 7  
Old 10-02-2011
Thanks Alister

It works great.

Thanks Again.

---------- Post updated at 06:14 PM ---------- Previous update was at 08:09 AM ----------

Hi Alister,

Small clarification

When a value is appended to the previous line, I do not want the field separator "|" to be added; I want it only for a new line.

In other words, I need the field separator between col1 and col2, but not between the rest of the columns.

Is it possible?

I am looking for something like:

Code:
XXXXXXXXXX1|07/24/2007|aaaaaaaaaaabbbbbbbbccccccccccccc sometxt
XXXXXXXXXX1|07/30/2007|some_random_text new_random.some_nre_random_test


Thanks for your help.

---------- Post updated at 10:04 PM ---------- Previous update was at 06:14 PM ----------

I was able to figure it out.

Thanks.
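
The command that was finally used is not shown here. A minimal sketch of one possibility, assuming the appended values should be joined by a single space while "|" is kept between the first two columns. The variable sep and the names sample and joined are made up for this illustration, and the rows come from the first post:

```shell
# Sample rows from the first post; in practice awk would read the real file instead.
sample='XXXXXXXXXX1|07/24/2007|1|aaaaaaaaaaabbbbbbbbccccccccccccc
XXXXXXXXXX1|07/24/2007|2|sometxt
XXXXXXXXXX1|07/30/2007|1|some_random_text
XXXXXXXXXX1|07/30/2007|2|new_random.'

joined=$(printf '%s\n' "$sample" | awk -F'|' -v sep=' ' '
    # Same date as the previous line: append field 4 after sep instead of FS.
    d == $2 { printf "%s", sep $4; next }
    # New date: end the previous record (if any) and print ID, date and field 4.
    { if (NR > 1) printf "\n"; d = $2; printf "%s", $1 FS $2 FS $4 }
    END { if (NR) printf "\n" }
')
printf '%s\n' "$joined"
```

Passing the joiner with -v sep=... makes it easy to switch to another character, or to an empty string, without touching the awk program.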

Last edited by radoulov; 10-02-2011 at 04:16 AM.. Reason: Code tags!