Problem working with Pipe Delimited Text file


 
# 1  
Old 01-02-2009

Hello all:
I have the following text file data, in a file named inst1.txt:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|H|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
DTL|D|7032-1|2|02-02-2008|02-03-2008|1|M|yyy
DTL|D|7032-1|3|02-02-2008|02-03-2008|1|N|yyy
DTL|D|7032-1|4|02-02-2008|02-03-2008|1|N|yyy
DTL|H|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|9999-1|1|02-02-2008|02-03-2008|1|N|zzz
DTL|D|9999-1|2|02-02-2008|02-03-2008|1|N|zzz
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

Output Needed in a new file is:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

Criteria: check whether the 8th column is NULL (empty).

If the 8th column is NULL in the original file, write all such records to a new file, including the File Header, File Tail and Record Header, which are:

File Header: HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
File Tail: TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

Record Header:
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||

Record Detail:
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ

Record Header and Record Detail are distinguished by the 2nd column: H = Header, D = Detail.

Part of the solution:
nawk -F'|' '$8 == "" ' inst1.txt >null.txt

The above command checks the 8th column and writes all matching records to a new file, null.txt, which looks like this:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|H|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|H|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

The Record Headers for 7032-1 and 9999-1 (shown in red in the original post) belong to other records and should not appear, but they do because the 8th column is NULL for them too.

Any help/suggestion/advice would be greatly appreciated.


thanks,
Ravi
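
A quick way to see why the plain `$8 == ""` filter also lets through the file header, the file tail and every record header is to print, for each record, how many fields it has and what column 8 holds. The following is only a diagnostic sketch, assuming a POSIX `awk` (on Solaris, `nawk` as in the post above); the function name `show_col8` and the shortened sample are made up for illustration.

```shell
# Hypothetical helper: for each record print its number, columns 1 and 2,
# the field count, and whether column 8 is empty.
show_col8() {
    awk -F'|' '{
        printf "%d %s %s NF=%d col8=%s\n", NR, $1, $2, NF, ($8 == "" ? "empty" : $8)
    }' "$1"
}

# Shortened copy of the sample data from the post above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|H|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ
EOF

show_col8 "$sample"
rm -f "$sample"
```

The HDR and TRL lines have fewer than 8 fields, so awk treats their `$8` as an empty string, and the `DTL|H` lines have an empty 8th field as well. That is why a test on `$8` alone cannot separate the wanted records from the unwanted headers; column 2 has to be part of the condition.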
# 2  
Old 01-03-2009
Code:
nawk -F'|' 'NR==1; $8=="" && $2=="D" && NR==n+1 {print s} $2=="H" {s=$0; n=NR}
                   $8=="" && $2=="D"; {end=$0} END {print end}'  inst1.txt > null.txt


Last edited by rubin; 01-03-2009 at 09:29 PM..
# 3  
Old 01-03-2009
Hello rubin

Hello Rubin:

Thanks, it worked partially. With the code you gave, the following is the output:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||

DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|H|9999-1|0|02-02-2008|02-03-2008||||F|||||||||



Basically it's missing the TAIL and instead ends with a Record Header, which is the header of the last Detail record.


thanks,
ravi
# 4  
Old 01-03-2009
My apologies,... my bad,

Code:
nawk -F'|' 'NR==1; $8=="" && $2=="D" && NR==n+1 {print s} $2=="H" {s=$0; n=NR}
                   $8=="" && $2=="D"; {end=$0} END {print end}'  inst1.txt > null.txt


Output from your sample:

Code:
HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|4|02-02-2008|02-03-2008|1||JJJ
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ

Previous code also edited.
# 5  
Old 01-04-2009
Thanks Rubin that works..another problem

Thanks rubin, I really appreciate that. The command you sent worked for one set of files, but I just noticed there is another text file with similar data and one small difference. I tried changing your code to handle it but I am stuck: earlier D was constant for all Detail records, but now the detail records are numbered 1,2,3,4,5. I was using $2=="[1-20]+" but it doesn't work. My apologies that I didn't notice there were two different kinds of files. The new file data:

Instead of 'H' it's '0' and instead of 'D' it's 1,2,3,4,5....
(1,2,3,4,5,... depending on how many dependents that parent record '0' has)

Same criteria: need to check if the 8th column is NULL.

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|0|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|2|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|3|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|4|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|0|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
DTL|2|7032-1|2|02-02-2008|02-03-2008|1|M|yyy
DTL|3|7032-1|3|02-02-2008|02-03-2008|1|N|yyy
DTL|4|7032-1|4|02-02-2008|02-03-2008|1|N|yyy
DTL|0|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|9999-1|1|02-02-2008|02-03-2008|1||zzz
DTL|2|9999-1|2|02-02-2008|02-03-2008|1||zzz
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ


Request: can you throw in a couple of lines of explanation? I worked with it a lot but couldn't understand what the following parts of the code (bolded in the original post) are intended to do.

nawk -F'|' 'NR==1; $32=="" && $2=="D" && NR==n+1 {print s} $2=="H" {s=$0; n=NR} $32=="" && $2=="D"; {end=$0} END {print end}' inst1.txt > null.txt


thanks,
Ravi
# 6  
Old 01-04-2009
You could do something like this,

Code:
nawk -F'|' 'NR==1; $8=="" && $2~/^[1-9]+/ && NR==n+1 {print s} $2==0 {s=$0; n=NR}
                   $8=="" && $2~/^[1-9]+/; {end=$0} END {print end}' input > output



Quote:
Originally Posted by ravi0435
...
Request: can you throw in a couple of lines of explanation? I worked with it a lot but couldn't understand what the following parts of the code (bolded in the original post) are intended to do.
....
thanks,
Ravi

$2=="H"{s=$0;n=NR} -> when a header is seen, store it (s=$0) and its record number (n=NR).

NR==n+1 -> If the record right after a header (NR==n+1) satisfies the other two conditions ( $8=="" and $2=="D" ), print the header s saved before.
The current record and the other needed ones are printed afterwards by the separate pattern ( $8=="" && $2=="D" ).

{end=$0} -> the variable end stores the current record, overwriting the previous one, so in the end ( END {...} ) the last record is printed.
With gawk you could simply do -> END{ print $0 }.
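
The explanation above can be sketched as a multi-line version of the same script, with one comment per rule. This is a restructured reading of the one-liner and not a different method: the header print is folded into the detail rule, which should behave the same because a record cannot be both H and D. It assumes a POSIX `awk` (substitute `nawk` on Solaris); the function name `filter_null_col8` and the shortened sample are made up for illustration.

```shell
# Hypothetical wrapper around the H/D one-liner, spread out with comments.
filter_null_col8() {
    awk -F'|' '
        NR == 1   { print }              # always keep the file header
        $2 == "H" { s = $0; n = NR }     # remember the record header and its line number
        $8 == "" && $2 == "D" {          # detail record with an empty 8th column
            if (NR == n + 1) print s     # first detail right after its header: emit the header
            print                        # emit the detail record itself
        }
        { end = $0 }                     # keep overwriting, so end holds the last line read
        END { print end }                # the last line is the file tail
    ' "$1"
}

# Shortened copy of the sample data from post #1.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|H|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|D|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|H|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|D|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ
EOF

filter_null_col8 "$sample"
rm -f "$sample"
```

One limitation worth keeping in mind: the header is only emitted when the very first detail record after it has an empty 8th column. If a group mixes empty and non-empty values and the first detail is non-empty, the later matching details would be printed without their header.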
# 7  
Old 01-05-2009
whats wrong in my code

Thanks Rubin ...really appreciate that.

I used this code because I don't know how far the numbers go (instead of ^[1-9]+ I used != "0"):

Code:
nawk -F'|' 'NR==1; $8=="" && $2!="0" && NR==n+1 {print s} $2=="0" {s=$0; n=NR} 
                   $8=="" && $2!="0"; {end=$0} END {print end}'  input> output

I get the following O/P (I am not pasting it twice; this is exactly what my code produced: a blank line after the 1st line, I don't know why, and the last line repeating):


HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|0|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|2|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|3|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|4|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|0|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|9999-1|1|02-02-2008|02-03-2008|1||zzz
DTL|2|9999-1|2|02-02-2008|02-03-2008|1||zzz
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ



O/P needed is:

HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|0|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|2|5464-1|2|02-02-2008|02-03-2008|1||JJJ
DTL|3|5464-1|3|02-02-2008|02-03-2008|1||JJJ
DTL|4|5464-1|4|02-02-2008|02-03-2008|1||JJJ
DTL|0|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|9999-1|1|02-02-2008|02-03-2008|1||zzz
DTL|2|9999-1|2|02-02-2008|02-03-2008|1||zzz
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ


Also, thanks for the explanation. With it I played around, changing NR to 0, 1, 2 and moving the code back and forth, but nothing worked. The O/P I pasted above is the closest I got to the needed O/P. What's wrong in my code? Could you help correct it? Thanks for all your time.


thanks ,
Ravi
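
A likely explanation for the output above, going only by how awk evaluates the posted script: the HDR and TRL lines have "ABCD" in column 2 and fewer than 8 fields, so they also satisfy `$8=="" && $2!="0"`. HDR is therefore printed by `NR==1` and again by the detail pattern, and TRL is printed by the detail pattern and again by `END`. The blank line comes from the first rule: on line 1 the variable n is still unset, so `n+1` is 1, `NR==n+1` is true, and `print s` prints the still-empty s. One possible fix, offered as a sketch and not as the thread's accepted answer, is to apply the detail tests only to lines whose first column is DTL. It assumes a POSIX `awk` (substitute `nawk` on Solaris); the function name `filter_null_col8_numeric` and the shortened sample are made up for illustration.

```shell
# Hypothetical variant for the 0/1/2/... layout: only DTL lines are treated
# as record headers or details, so HDR and TRL can no longer match.
filter_null_col8_numeric() {
    awk -F'|' '
        NR == 1 { print }                            # file header
        $1 == "DTL" && $2 == "0" { s = $0; n = NR }  # record header
        $1 == "DTL" && $2 != "0" && $8 == "" {       # detail with empty 8th column
            if (NR == n + 1) print s                 # emit the header once
            print
        }
        { end = $0 }
        END { print end }                            # file tail
    ' "$1"
}

# Shortened copy of the second sample file from post #5.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HDR|ABCD|10-13-2008 to 10-19-2008.txt|10-19-2008|XYZ
DTL|0|5464-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|5464-1|1|02-02-2008|02-03-2008|1||JJJ
DTL|0|7032-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|7032-1|1|02-02-2008|02-03-2008|1|M|yyy
DTL|0|9999-1|0|02-02-2008|02-03-2008||||F|||||||||
DTL|1|9999-1|1|02-02-2008|02-03-2008|1||zzz
TRL|ABCD|10-13-2008 to 10-19-2008.Txt|10-19-2008|170|XYZ
EOF

filter_null_col8_numeric "$sample"
rm -f "$sample"
```

Testing `$1 == "DTL"` also removes the dependence on n being unset at line 1, because the HDR line never reaches the `NR == n + 1` check.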
 