How to concatenate lines with specific pattern?


 
# 1  
Old 06-25-2013
I have data dumped from a table into a text file. In some cases a data row is split across two lines.

Example:

Code:
 
12345678|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|
12345679|Global Test2|Global Test Task2|My Request2|Date|Date|Date|2|2|
12345680|Global Test1232|Global Test Task2|My Request2|Date|Date|Date|3|3|
12345681|Global Test1232123455|Global Test Task2343212312334
|My Request 3|Date|Date|4|4|
12345682|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|

Expected Result:

Code:
 
12345678|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|
12345679|Global Test2|Global Test Task2|My Request2|Date|Date|Date|2|2|
12345680|Global Test1232|Global Test Task2|My Request2|Date|Date|Date|3|3|
12345681|Global Test1232123455|Global Test Task2343212312334|My Request 3|Date|Date|4|4|
12345682|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|

In the example above, lines #4 and #5 are one data row. The text file has 300K+ rows.

Basically, I want to concatenate the current line with the next line if the next line starts with '|'.

Last edited by nixtime; 06-25-2013 at 01:47 PM..
# 2  
Old 06-25-2013
Try:
Code:
perl -0pe 's/\n\|/|/g' file

This User Gave Thanks to bartus11 For This Post:
# 3  
Old 06-25-2013
An awk approach:
Code:
awk '/\|$/{ORS=RS}!/\|$/{ORS=""}1' file
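
A note on what the awk one-liner above does: it sets ORS to a newline after lines ending in '|' and to an empty string otherwise, so any line that does not end in '|' is glued to whatever follows, whether or not the next line starts with '|'. A minimal sketch that applies the rule exactly as stated in post #1 (join only when the next line starts with '|') might look like the following. The function name join_on_leading_pipe is made up for illustration and does not come from the thread.

```shell
# join_on_leading_pipe: hypothetical helper, not from the thread.
# Holds each line back until the next one has been read, and appends
# the next line only when it begins with '|'.
join_on_leading_pipe() {
  awk '
    NR == 1 { buf = $0; next }        # first line: just remember it
    /^[|]/  { buf = buf $0; next }    # continuation line: append to the held line
            { print buf; buf = $0 }   # normal line: flush the held line, hold this one
    END     { if (NR) print buf }     # flush the last held line
  ' "$@"
}
```

It reads the named file (or standard input), for example `join_on_leading_pipe file > file.joined`. A line that does not end in '|' is left alone unless the following line really starts with '|'.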

# 4  
Old 06-25-2013
Thank you Bartus11 and Yoda.

The suggestion made by Bartus11 worked as expected: it updated exactly the expected rows.
The suggestion made by Yoda also worked, but it updated about 2K additional rows.

I'm not a Perl or awk expert. Could you please explain in more detail what exactly the commands are doing? That would help me learn.

Thank you.



---------- Post updated at 03:34 PM ---------- Previous update was at 03:34 PM ----------

I have gained some understanding of the code provided by Bartus11:
Code:
perl -0pe 's/\n\|/|/g' file

But how do I change this to work when the continuation line has '|' as its 2nd character instead of its 1st?

Example:
Code:
12345678|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|
12345679|Global Test2|Global Test Task2|My Request2|Date|Date|Date|2|2|
12345680|Global Test1232|Global Test Task2|My Request2|Date|Date|Date|3|3|
12345681|Global Test1232123455|Global Test Task2343212312334
5|My Request 3|Date|Date|4|4|
12345682|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|

Result:
Code:
12345678|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|
12345679|Global Test2|Global Test Task2|My Request2|Date|Date|Date|2|2|
12345680|Global Test1232|Global Test Task2|My Request2|Date|Date|Date|3|3|
12345681|Global Test1232123455|Global Test Task23432123123345|My Request 3|Date|Date|4|4|
12345682|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|

Following the Perl suggestion, I could do the following:
Code:
perl -0pe 's/\na\|/a|/g' file
perl -0pe 's/\nb\|/b|/g' file
...
perl -0pe 's/\nz\|/z|/g' file
OR
perl -0pe 's/\n0\|/0|/g' file
...
perl -0pe 's/\n9\|/9|/g' file

This will work if I put them in a loop, but it is probably not the smartest way to do it.
Any suggestions?
# 5  
Old 06-25-2013
Try:
Code:
perl -0pe 's/\n([^\n]\|)/$1/g' file

This User Gave Thanks to bartus11 For This Post:
# 6  
Old 06-25-2013
Thank you, bartus11. That worked.

Using the same logic, I was able to test the below as well when the next line has '|' as the 3rd character.

Code:
perl -0pe 's/\n([^\n]([^\n]\|))/$1/g' file

Please let me know if this would not work in certain scenarios. Otherwise, thank you so much. This saved me a few hours today.
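
One scenario to watch for: both Perl patterns match any line whose 2nd (or 3rd) character is '|', so a genuine row whose first field is only one or two characters long would also be joined to the previous row. In the sample data the first field is 8 digits, so that does not happen there. As a rough sketch, the two fixed-position patterns could be folded into one awk filter that joins a line whenever its first '|' falls within the first few characters. The helper name join_short_prefix and its max argument are assumptions made for illustration, not something from the thread.

```shell
# join_short_prefix: hypothetical helper, not from the thread.
# Usage: join_short_prefix MAX [file...]
# Joins a line onto the previous one when its first '|' sits at
# position MAX or earlier (MAX=1 reproduces the "starts with |" rule).
join_short_prefix() {
  max=$1; shift
  awk -v max="$max" '
    { pos = index($0, "|") }                     # position of first pipe, 0 if none
    NR > 1 && pos > 0 && pos <= max { buf = buf $0; next }   # continuation line
    NR > 1 { print buf }                         # previous row is complete
    { buf = $0 }                                 # hold the current line
    END { if (NR) print buf }                    # flush the last held line
  ' "$@"
}
```

For the examples in this thread, `join_short_prefix 3 file` would cover continuation lines with '|' as the 1st, 2nd or 3rd character in one pass. MAX has to stay smaller than the length of the shortest genuine first field plus one.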
# 7  
Old 06-25-2013
Try also (independent of position of | in next line):
Code:
awk -F\| 'NF<10 {getline x; $0=$0 x}1' file
12345678|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|
12345679|Global Test2|Global Test Task2|My Request2|Date|Date|Date|2|2|
12345680|Global Test1232|Global Test Task2|My Request2|Date|Date|Date|3|3|
12345681|Global Test1232123455|Global Test Task23432123123345|My Request 3|Date|Date|4|4|
12345682|Global Test|Global Test Task|My Request|Date|Date|Date|1|1|
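
The field-count approach above treats any line with fewer than 10 '|'-separated fields as incomplete and appends the next line to it. A slightly more defensive sketch is shown below: it takes the expected field count as an argument and only appends when getline actually returned a line, so a short final line is printed as-is. The name join_short_rows and the want argument are hypothetical, introduced here only for illustration.

```shell
# join_short_rows: hypothetical helper, not from the thread.
# Usage: join_short_rows WANT [file...]
# WANT is the number of '|'-separated fields in a complete row
# (10 for rows with 9 pipes, as in the one-liner above).
join_short_rows() {
  want=$1; shift
  awk -F'|' -v want="$want" '
    # Short row: pull in the next line, but only if one was really read.
    NF < want && (getline nxt) > 0 { $0 = $0 nxt }
    { print }
  ' "$@"
}
```

Like the one-liner, this joins at most two lines per row, so a row split into three or more pieces would need a loop instead.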
