

Merging of rows


 
# 1  
Merging of rows

Hi guys,

Wishing you all a very Happy New Year!

Thanks in advance.

I want to read a file and merge the rows that are broken up by embedded '\n' characters.
The rows can be longer than 50,000 bytes. The script should merge all rows up to the next row that starts with the word 'Type|'.
Example input:
Type|Ticket|TradeID|42224A|SystemID|DDDD61266|SourceSystem|RMS|Version|5|LatestVersionYN|Y|Counterparty|OB|DBLegalEntity|London|Notes|cal events added|Salesperson|DHX|LastUpdated|DEC 2008 18:31:40|Action|Exp|DealCaptureSystem|Sales|CustomerSourceSystem|RMS|Customer|AUGS|ActionEffectiveDate|DEC 2008 00:00:00|BookingLocation|London|EnteredBy|GONU|MarkupAmt|100|MarkupCcy|KEX|PremiumCurrency|EUR|PremiumAmount|000|PremiumDate|DEC 2008|PremiumHasPaidYN|N|BookingDetails|9000/mihl
Client sells KEZUSD
st ref 1.3724
Leveraged Sele is at ST
52 sele, starting


Type|FxDiscreteDNT|TradeID|SSA32553H|SystemID|GGT04481|SourceSystem|RMS|SourceSystemBook|OT|Book|OT|BookBranchParent|Go|Version|8|LatestVersionYN|Y|Counterparty|DOB|Status|KnockedOut|IsLiveYN|N|DBLegalEntity|London|LastUpdated|JAN 2009 00:26:35|Action|KnockedOut|CounterpartySourcesystem|RMS|TradeDate|JUL 2006|ProductID|KKI99106|StructureID|DDS32553|DealCaptureSystem|RMS|CustomerSourceSystem|RMS|Customer|ESL|PricingSite|Option|UnderCurrency1|JPY|AccCurrency1|US|UpBarrier1|1.5|LowBarrier1|1|UnderCurrency2|JPY|AccCurrency2|CH|UpBarrier2|1.64|LowBarrier2|1.49|UnderCurrency3|EUR|AccCurrency3|JPY|UpBarrier3|175|LowBarrier3|105|LastFixed

The output should be as follows.
Each record below should be on one line.

Type|Ticket|TradeID|42224A|SystemID|DDDD61266|SourceSystem|RMS|Version|5|LatestVersionYN|Y|Counterparty|OB|DBLegalEntity|London|Notes|cal events added|Salesperson|DHX|LastUpdated|DEC 2008 18:31:40|Action|Exp|DealCaptureSystem|Sales|CustomerSourceSystem|RMS|Customer|AUGS|ActionEffectiveDate|DEC 2008 00:00:00|BookingLocation|London|EnteredBy|GONU|MarkupAmt|100|MarkupCcy|KEX|PremiumCurrency|EUR|PremiumAmount|000|PremiumDate|DEC 2008|PremiumHasPaidYN|N|BookingDetails|9000/mihl Client sells KEZUSD st ref 1.3724 Leveraged Sele is at ST 52 sele, starting

Type|FxDiscreteDNT|TradeID|SSA32553H|SystemID|GGT04481|SourceSystem|RMS|Version|8|LatestVersionYN|Y|Counterparty|DOB|DBLegalEntity|London|LastUpdated|JAN 2009 00:26:35|Action|KnockedOut|DealCaptureSystem|RMS|CustomerSourceSystem|RMS|Customer|ESL|PricingSite|Option|UnderCurrency1|JPY|AccCurrency1|US|UnderCurrency3|EUR|AccCurrency3|JPY|UpBarrier3|175|LowBarrier3|105|LastFixed

I tried the script below, but awk can't handle lines longer than 20,000 bytes. It also inserts the string 'LINE_BREAK' to mark where rows were merged.

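FileName=LINE_TEST.TXT
FileName1=O_LINE_TEST.TXT
OUTPUT_FILE=OUTPUT_WITHOUT_LINE_BREAK.tmp
if [[ -e $FileName ]]; then
    ## Replace '%' first: 'printf $0' below uses each line as the format string, so awk fails on a literal '%'.
    sed "s/%/percentage/g" "$FileName" > "$FileName1"

    ## Start a new output line at each record header; join every other line to the current one with LINE_BREAK.
    /usr/xpg4/bin/awk 'NR==1{printf $0;next}
    !/^TradeType/ {printf "LINE_BREAK" $0;next}
    {printf "\n" $0}
    END{print ""}
    ' "$FileName1" > "$OUTPUT_FILE"
fi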
# 2  
Try with nawk:

Code:
nawk '$1=$1' FS='\n' RS= ORS='\n\n' infile
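
A small check of what this command does, run with plain awk on a made-up sample (the sample data and the temporary file are assumptions, not taken from the real file). RS= switches awk to paragraph mode, so every block separated by blank lines is one record. FS='\n' makes each physical line a field. $1=$1 forces awk to rebuild the record with the default output field separator, a space. ORS='\n\n' puts the blank line back between records.

```shell
# Hypothetical sample: one record broken across three lines, a blank line,
# then a record that is already on a single line.
sample=$(mktemp)
printf 'Type|A|Notes|first\nsecond line\nthird line\n\nType|B|x\n' > "$sample"

# Same options as the nawk command above, run with whatever awk is installed.
merged=$(awk '$1=$1' FS='\n' RS= ORS='\n\n' "$sample")
printf '%s\n' "$merged"

rm -f "$sample"
```

Note that $1=$1 is also the pattern that decides whether a record is printed, so a record whose first line is empty or is just 0 would be dropped. It also relies on a blank line between records, as in the example data.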

# 3  
Merging of rows - error

I am getting the following error:

13460 Segmentation Fault(coredump)

Is it related to memory? The actual file size is 95532469 bytes.
# 4  
Try with Perl:

Code:
perl -00ple'tr.\n. .' infile

# 5  
Identify

How to identify which lines were merged while processing the file?
# 6  
Quote:
Originally Posted by ssachins
How to identify which lines were merged while processing the file?
Just append some string to identify them:

Code:
perl -00ple'tr.\n. . and $_.=" __FIXED__"' infile

# 7  
Merging of rows - resolved.

Hi Radoulov,

It's working!

Thank you very much for the quick reply. Much appreciated!

One last request:

Could you please explain how this script works?
