Need to concatenate spuriously separated lines


 
# 1  
Old 12-23-2011
Given the pattern below:

3113296571|NULL|NULL|NULL||N| 1| 0| 926667| 1001036| 0| 3076120438|NULL|NULL|NULL|NULL|DUE FOR NEW CONSENT!|N|NULL|
10198318|2011-07-25-12.34.02.786000|NULL|NULL|NULL| 0
3113336478|NULL|NULL|NULL||N| 1| 0| 931871| 1001315| 1000997|
3076140580|NULL|NULL|NULL|NULL|Cosmetic; Patient has no Defib
rillator or Pacemaker; Location: Eisenberg 7A; Dx/Indication: Cyst; Excision: 1-2 sites = 1 hour; Additional information (specify): Please put on my excisi
on calendar for excision of cyst on chest and back. Thanks!
|N|NULL| 11418290|2011-07-11-14.14.02.226000|NULL|NULL|NULL| 0
3113338127|NULL|NULL|NULL||N| 1| 0| 927074| 1001109| 0| 3076141370|NULL|NULL|NULL|NULL|Return Instructions (other): v
a only 1st available when patient returns from vacation - thanks!|C|1 month out, not 1 week| 15462510|2011-07-11-14.31.26.803000|NULL|NULL|NULL|
0


The bold-font records are "normal" (they start with an integer field and have NO embedded newlines within the record), whereas the non-bold-font entry is "abnormal": it has an embedded "\n" (newline) character [in this example, it appears after "Thanks!"], most probably entered by a data-entry clerk while inputting the record. I need to concatenate the lines starting with "^|" with the lines immediately preceding them (using sed/awk/Perl).

Your help is greatly appreciated.
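
A minimal sketch of the join described above, assuming a POSIX awk and assuming every spurious break is followed by a line that begins with "|". The function name join_on_leading_pipe is made up for illustration. It would not mend breaks that fall in the middle of a word, since those continuation lines do not start with "|".

```shell
# join_on_leading_pipe [FILE...]
# Hypothetical helper: glue every line that starts with "|" onto the
# line before it. Reads standard input when no file is given.
join_on_leading_pipe() {
    awk '
    # A line beginning with "|" is a continuation: append it to the held line.
    /^[|]/ { buf = buf $0; next }
    # Any other line starts a new output line; print the held one first.
    NR > 1 { print buf }
    { buf = $0 }
    # Print whatever is still held once the input ends.
    END { if (NR) print buf }
    ' "$@"
}
```

Usage would look like: join_on_leading_pipe myFile > myFile.fixed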
# 2  
Old 12-23-2011
It must be easier to fix the data while it is still in the database and while you have the whole field available.
What software wrote this file?
# 3  
Old 12-23-2011
Methyl,

Unfortunately, as the data is coming from a DB2 database (a database we don't own), it won't be easier to fix the data at the source; therefore, I was seeking a programmatic solution from the outside. Could you recommend one?

lemele
# 4  
Old 12-23-2011
Do you have control over the data extract program?

Referring to your original post, I'm having trouble seeing the record structure or even deciding where one record stops and the next record starts. Some records seem to end with a pipe character and some with something else. The number of fields in each record also varies.
Is your example data an accurate representation of the record format, i.e. a random number of pipe-delimited fields? It would be so much easier if there were a fixed number of fields in each record.
# 5  
Old 12-23-2011
Hard to see the record/line breaks withOUT the proper code tagging....
Code:
nawk -F'|' '$1~"^[0-9][0-9]*$" {printf("%s%s", (FNR==1)?"":ORS,$0);next}{printf("%s",$0)}END{if(NR)printf ORS}' myFile
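
The one-liner above treats any line whose first field is all digits as the start of a new record and appends every other line to the current one. Unrolled with comments, the same idea might look like the sketch below. The function name join_on_numeric_start is hypothetical, and plain awk is assumed to be a POSIX awk (substitute nawk where it is not). Note one limitation visible in the posted sample: a continuation line that itself begins with an integer field, such as the one starting 10198318, would be taken for a new record.

```shell
# join_on_numeric_start [FILE...]
# Hypothetical helper: a line whose first "|"-separated field is all
# digits starts a record; every other line is appended to the record.
join_on_numeric_start() {
    awk -F'|' '
    # First field is an unsigned integer: assume a new record begins here.
    $1 ~ /^[0-9]+$/ {
        if (started) printf "%s", ORS   # terminate the previous record
        printf "%s", $0                 # "%s" keeps any % in the data literal
        started = 1
        next
    }
    # Anything else is a continuation: print it with no separator.
    { printf "%s", $0; started = 1 }
    # Terminate the last record so the output ends with a newline.
    END { if (started) printf "%s", ORS }
    ' "$@"
}
```

Usage would look like: join_on_numeric_start myFile > myFile.fixed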

This User Gave Thanks to vgersh99 For This Post:
# 6  
Old 12-23-2011
Methyl,

Here is a re-posted set of records. They do have a fixed format: a fixed number of fields (25), "|"-separated. The only problem is that some of these records get disjointed when imported by AIX from DB2 (the data transport program is the 'isql' utility by Sybase). Records in bold type are intact (without a newline); the record in non-bold type has one embedded newline character that I need to get rid of, thus concatenating the disjointed lines. The anchor pattern is "^|".

3113296571|NULL|NULL|NULL||N| 1| 0| 926667| 1001036| 0| 3076120438|NULL|NULL|NULL|NULL|DUE FOR NEW CONSENT!|N|NULL|
10198318|2011-07-25-12.34.02.786000|NULL|NULL|NULL| 0

3113336478|NULL|NULL|NULL||N| 1| 0| 931871| 1001315| 1000997|
3076140580|NULL|NULL|NULL|NULL|Cosmetic; Patient has no Defibrillator or Pacemaker; Location: Eisenberg 7A; Dx/Indication: Cyst; Excision: 1-2 sites = 1 hour; Additional information (specify): Please put on my excision calendar for excision of cyst on chest and back. Thanks![Line splits here]
|N|NULL| 11418290|2011-07-11-14.14.02.226000|NULL|NULL|NULL| 0

3113338127|NULL|NULL|NULL||N| 1| 0| 927074| 1001109| 0| 3076141370|NULL|NULL|NULL|NULL|Return Instructions (other): v
a only 1st available when patient returns from vacation - thanks!|C|1 month out, not 1week| 15462510|2011-07-11-14.31.26.803000|NULL|NULL|NULL|0


Anxiously awaiting a solution

Lemele
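
Since the records are stated to have a fixed number of "|"-separated fields (25), another possible approach is to keep gluing lines together for as long as the result does not exceed that field count. The sketch below assumes a POSIX awk, assumes no text field contains a literal "|", and assumes every physical line at the start of a record contains at least one "|". The function name join_by_field_count is made up for illustration. Unlike the "^|" anchor, this would also mend breaks in the middle of a word or just before the last field, and it drops blank lines between records.

```shell
# join_by_field_count COUNT [FILE...]
# Hypothetical helper: rebuild records that should have COUNT
# "|"-separated fields from lines that were split by stray newlines.
join_by_field_count() {
    fields=$1
    shift
    awk -F'|' -v want="$fields" '
    {
        # A blank line has NF == 0 but still counts as one (empty) fragment.
        f = (NF > 0) ? NF : 1
        if (NR == 1) { buf = $0; n = f; next }
        # Joining merges the last held field with the first new one,
        # so the combined record has n + f - 1 fields.
        if (n + f - 1 <= want) { buf = buf $0; n += f - 1; next }
        # Otherwise the held record is complete: print it and start anew.
        print buf
        buf = $0; n = f
    }
    END { if (NR) print buf }
    ' "$@"
}
```

Usage would look like: join_by_field_count 25 myFile > myFile.fixed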
# 7  
Old 12-23-2011
I agree with vgersh99. I believe that the way the data sample has been posted into the original post has introduced more extra line terminators!


The original data fault is the sort of effect you get if people cut/paste data from another window (like from an email). It's all too common.