How to fix line breaks and format text for huge files?
Hi,
I need to correct line breaks in huge files (more than 1 million records per file) and then format them properly.
Except the header and trailer, each record starts with 'D'.
Requirement: Scan the whole file except the header and trailer records and check whether any record starts with anything other than 'D'. In such cases, merge the broken line into the preceding line, inserting a space at the end of the previous line first.
The input file is:
HEADER474687
D1356jkl ugbliuybikb 879870
898976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyh
kbygfluy9809
D8796870 kjlhuigiyig
TRAILER0008
Expected output file is:
HEADER474687
D1356jkl ugbliuybikb 879870 898976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyh kbygfluy9809
D8796870 kjlhuigiyig
TRAILER0008
I am using the following code to achieve it:
Received output:
<space>HEADER474687
D1356jkl ugbliuybikb 87987089 8976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyhkb ygfluy9809
D8796870 kjlhuigiyig TRAILER0008
After using the above code, I am facing the following issues:
Although the awk command uses the condition NR > 1 && NR < $RECORDCOUNT:
a) The header and trailer records are not excluded from the awk processing, so the trailer line is also merged into the previous line.
b) A space is inserted before the first line. The first line becomes:
<space>HEADER474687
Because of this, I have to run a separate sed command (given below) right after the awk step to delete the leading space from the first line, which adds to the execution time of the whole process.
I would really appreciate it if anyone could guide me in writing this piece of code using awk, sed, or perl (whichever is most suitable, keeping in mind the huge file size).
Thanks a lot in advance.
Last edited by kikionline; 01-10-2012 at 10:14 AM..
When I use the sed command, I receive a "sed: command garbled" error. I have checked the command but did not find any issue. Can you please help?
I have tried running it directly at the prompt and also as a ksh script (which takes the location and filename as parameters); the results are the same.
Thanks
---------- Post updated at 08:06 AM ---------- Previous update was at 07:56 AM ----------
Hi Birei,
Could this be because we are using different shells (although ideally that should not matter)?
Is there an alternative solution, such as awk or perl, that achieves the same thing?
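One possible awk-only approach is sketched below, assuming that line 1 is always the header, the last line is always the trailer, and every complete record in between starts with 'D'. The function name merge_broken_lines and the awk variables held, rec and have are made up for illustration. The sketch keeps each line one step behind, so the final line can be recognised as the trailer at END without knowing the record count in advance. As a side note, if the original awk program was written inside single quotes, $RECORDCOUNT would not be expanded by the shell; awk would treat it as a field reference, which may explain why the trailer was not excluded.

```shell
#!/bin/sh
# merge_broken_lines FILE
# Hypothetical helper (name and variables are illustrative, not from the thread).
# Assumes line 1 is the header, the last line is the trailer, and every
# complete record in between starts with "D".
merge_broken_lines() {
    awk '
        NR == 1 { print; next }            # header passes through unchanged
        NR > 2 {                           # "held" is a body line, never the last line
            if (held ~ /^D/ || !have) {    # start of a new record
                if (have) print rec        # flush the finished record
                rec = held
                have = 1
            } else {
                rec = rec " " held         # continuation: join with a single space
            }
        }
        { held = $0 }                      # keep the current line one step behind
        END {
            if (have) print rec            # flush the final record
            if (NR > 1) print held         # the last line is the trailer
        }
    ' "$1"
}
```

It could be called as merge_broken_lines input.dat > output.dat (both file names are placeholders). The file is read once and only the record being assembled is held in memory, so file size should not be a problem, and no separate sed step is needed because nothing is ever prepended to the header. On systems where the default awk is an older implementation, nawk or a POSIX awk may be needed instead.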