Problem working with Pipe Delimited Text file
Hello all:
I have the following text file, named inst1.txt:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|H|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
DTL|D|7032-1|2|02-02-2008|02-03-2008|1|M|yyy
DTL|D|7032-1|3|02-02-2008|02-03-2008|1|N|yyy
DTL|D|7032-1|4|02-02-2008|02-03-2008|1|N|yyy
DTL|H|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|9999-1|1|02-02-2008|02-03-2008|1|N|zzz
DTL|D|9999-1|2|02-02-2008|02-03-2008|1|N|zzz
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

Output needed in a new file:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

Criteria: check whether the 8th column is NULL. If the 8th column is NULL in the original file, write out all the matching records together with the File Header, File Tail, and Record Header, which are:

File Header:
HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ

File Tail:
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

Record Header:
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||

Record Detail:
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ

Record Header and Record Detail are distinguished by the 2nd column: H = Header, D = Detail.

Part of the solution:

nawk -F'|' '$8 == "" ' inst1.txt >null.txt

The above command checks the 8th column and writes all matching records to a new file, null.txt, which looks like this:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|H|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|H|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

The Record Headers for 7032-1 and 9999-1 (shown in red in the original post) belong to other records and should not appear, but they do because their 8th column is NULL too.

Any help, suggestion, or advice would be greatly appreciated.

thanks,
Ravi
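The bare `$8 == ""` test over-selects for two reasons: the record headers really do have an empty 8th field, and the HDR and TRL lines have fewer than eight fields, so awk treats their `$8` as an empty string as well. A minimal sketch to see this on a few of the lines above, assuming a POSIX `awk` is available (substitute `nawk` on Solaris); `show_col8` is a hypothetical helper name used only for illustration:

```shell
# Hypothetical helper: for each pipe-delimited record, report the record
# type ($1), the sub-type ($2), the field count, and whether field 8 is empty.
show_col8() {
  awk -F'|' '{ printf "%s %s NF=%d col8=%s\n", $1, $2, NF, ($8 == "" ? "empty" : "set") }'
}

# A few lines taken from the sample file in the post.
show_col8 <<'EOF'
HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ
EOF
```

Only the 7032-1 detail line should report `col8=set`; every other line passes the bare filter, which is why extra conditions on `$2` are needed.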
Hello Rubin
Hello Rubin:
Thanks, it worked partially. With the code you gave, the output is the following:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|H|9999-1|0|02-02-2008|02-03-2008||||F|||||||||

Basically it is missing the TAIL and instead has one extra Record Header, which is the header of the last Detail record.

thanks,
ravi
Thanks Rubin that works..another problem
Thanks Rubin, I really appreciate that. The command you sent worked for one set of files, but I just noticed that there is another text file with similar data and one small difference. I tried changing the code you sent but got stuck: earlier, D was constant for all Detail records, but now the detail records are numbered 1,2,3,4,5. I was using $2=="[1-20]+" but it doesn't work. My apologies for not noticing that there were two different kinds of files. The new file data:

Instead of 'H' it's '0', and instead of 'D' it's 1,2,3,4,5... (depending on how many dependents that parent record '0' has). Same criteria: need to check whether the 8th column is NULL.

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|0|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|2|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|3|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|4|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|0|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
DTL|2|7032-1|2|02-02-2008|02-03-2008|1|M|yyy
DTL|3|7032-1|3|02-02-2008|02-03-2008|1|N|yyy
DTL|4|7032-1|4|02-02-2008|02-03-2008|1|N|yyy
DTL|0|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|9999-1|1|02-02-2008|02-03-2008|1||zzz
DTL|2|9999-1|2|02-02-2008|02-03-2008|1||zzz
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

Request: could you add a couple of lines of explanation? I worked with it a lot but couldn't understand what the bolded parts of the following code are intended to do.

nawk -F'|' 'NR==1; $32=="" && $2=="D" && NR==n+1 {print s} $2=="H" {s=$0; n=NR} $32=="" && $2=="D"; {end=$0} END {print end}' inst1.txt > null.txt

thanks,
Ravi
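On why `$2=="[1-20]+"` does not work: in awk, `==` against a quoted string is a plain comparison, so it is true only when the field is literally the text `[1-20]+`. A pattern has to be applied with the match operator `~`. Note also that the bracket expression `[1-20]` means "one of the characters 1, 2, or 0", not "a number from 1 to 20". A minimal sketch, assuming a POSIX `awk` (substitute `nawk` on Solaris); `classify_col2` is a hypothetical helper name, and the shortened input lines (including the multi-digit value 12) are made up for illustration:

```shell
# Hypothetical helper: label the 2nd field of each pipe-delimited record.
classify_col2() {
  awk -F'|' '
    $2 == "0"             { print $2 " -> record header"; next }  # exact comparison
    $2 ~ /^[1-9][0-9]*$/  { print $2 " -> record detail"; next }  # regex match: any positive integer
                          { print $2 " -> other" }                # HDR, TRL, anything else
  '
}

# Shortened, illustrative records.
printf '%s\n' 'DTL|0|5464-1' 'DTL|3|5464-1' 'DTL|12|5464-1' 'HDR|ABCD|x' | classify_col2
```

Anchoring the pattern with `^` and `$` keeps values such as `1A` from being treated as detail records; whether such values can occur in the real files is not known from the post.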
You could do something like this,
Code:
nawk -F'|' 'NR==1; $8=="" && $2~/^[1-9]+/ && NR==n+1 {print s} $2==0 {s=$0; n=NR}
$8=="" && $2~/^[1-9]+/; {end=$0} END {print end}' input > output
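The command above prints a record header only when the detail line immediately after it has an empty 8th column. If a group could have a non-empty value in its first detail line and an empty one further down, a different approach is to hold the header back and print it the first time any detail line in its group qualifies. The following is a sketch of that alternative, not the method posted in this thread. It assumes a POSIX `awk` (substitute `nawk` on Solaris), that the file header is always line 1, and that the trailer can be recognised by `TRL` in the first column as in the sample data; `filter_groups` is a hypothetical name.

```shell
# Hypothetical helper: reads pipe-delimited records on stdin and keeps
# the file header, the trailer, and every detail line whose 8th column
# is empty, preceded once by the record header of its group.
filter_groups() {
  awk -F'|' '
    NR == 1      { print; next }                    # file header always passes
    $1 == "TRL"  { print; next }                    # trailer always passes (assumed marker)
    $2 == "0"    { hdr = $0; pending = 1; next }    # remember the record header, do not print yet
    $2 ~ /^[1-9][0-9]*$/ && $8 == "" {
        if (pending) { print hdr; pending = 0 }     # emit the held header once per group
        print                                       # emit the qualifying detail line
    }
  '
}

# Usage with the file names from the post:
#   filter_groups < input > output
```

Whether a group with only some empty detail lines should be reported at all is a requirement the post does not settle, so treat this as one possible reading.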
Quote:
$2=="H" {s=$0; n=NR} -> when a header is seen, store it (s=$0) and its record number (n=NR).
NR==n+1 -> if the record right after the header (NR==n+1) satisfies the other two conditions ($8=="" and $2=="D"), print the header s saved before. The current record and the other needed ones are printed later by the rule $8=="" && $2=="D".
{end=$0} -> the variable end stores the current record, overwriting the previous one, so in the end (END {...}) the last record is printed. With gawk you could simply do -> END{ print $0 }.
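To follow the explanation step by step, the same rules can be laid out one per line and run on a shortened copy of the first sample file. This is a sketch assuming a POSIX `awk` (substitute `nawk` on Solaris) and the 8th column as the one being tested; `filter_first_detail` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper around the rules explained above (H/D variant).
filter_first_detail() {
  awk -F'|' '
    NR == 1                                            # pattern only: print the file header
    $8 == "" && $2 == "D" && NR == n + 1 { print s }   # first detail after a header qualifies: print saved header
    $2 == "H" { s = $0; n = NR }                       # save each record header and its line number
    $8 == "" && $2 == "D"                              # pattern only: print every qualifying detail line
    { end = $0 }                                       # remember the most recent line
    END { print end }                                  # the last line seen is the trailer
  '
}

# Shortened copy of the sample data from the first post.
filter_first_detail <<'EOF'
HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|H|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ
EOF
```

On this input the HDR line, the 5464-1 header with its two detail lines, and the TRL line should pass through, while the 7032-1 header and detail are dropped. The order of the rules matters: the header is printed by the second rule before the fourth rule prints the detail line that triggered it.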