EDI File Parser


 
# 1  
Old 01-25-2015
Question EDI File Parser

I have one EDI file that needs to be parsed into 7 different files.

I managed to extract the segments required for one of those files (HEADER) into a separate file (sample3.dat), shown below.

Code:
$ cat sample3.dat
REF*EI*273543997~
REF*2U*HELLO~
REF*G2*77685|132~
CLM*1000*0.00***12>B>1*N*A*Y*I~
CN1*05~
SBR*P*18*HELLO******16~
AMT*D*0.00~
OI***Y***I~
NM1*IL*1*ABC*DEF*A***MI*1234A~
DTP*573*D8*99991231~
CLM*1001*0.00***12>B>1*N*A*Y*I~
CN1*05~
REF*F8*1000~
SBR*P*18*HELLO******16~
AMT*D*0.00~
OI***Y***I~
NM1*IL*1*ABC*DEF*A***MI*1234A~
DTP*573*D8*99991231~

The expected output is:
Code:
1000||||1234A|1234A|0.00|99991231|0.00|||||HELLO||12|1|||||||||||||05|HELLO|
1001||||1234A|1234A|0.00|99991231|0.00|||||HELLO||12|1|||||||||||1000||05|HELLO|F8

The second CLM segment is a child claim of the first CLM, and the REF*F8 segment appears under it. I wrote the script below to parse the file.
The commented-out line in the script throws a syntax error.

Code:
awk -F"*" '{OFS="|"}
/^CLM/ {CLM_NBR = $2}
/^NM1/ {SUB_ID = $10}
/^AMT/ {($2=="D")? CP_AMT = $3 : CP_AMT = $100}
/^AMT/ {($2=="A8")? DSALL_AMT = $3 : DSALL_AMT = $100}
/^AMT/ {($2=="F5")? PATPAID_AMT = $3 : PATPAID_AMT = $100}
/^AMT/ {($2=="A8")? NC = $3 : NC = $100}
/^AMT/ {($2=="EAF")? RPL = $3: RPL = $100}
/^SBR/ {GRP_POL = $4; GRP_NM = $5}
/^CLM/ {split($6,x,">") ; FREQ = x[3]}
##/^CLM/ {split($6,x,">");(x[3]=="7"||x[3]=="8")? (/^REF\*F8/ {REP_CLM = $3;REF_QUAL = $2}) : {REP_CLM = $100;REF_QUAL = $100}} 
REP_CLM = $100;REF_QUAL = $100  ## Temporarily assigned NULL value to be printed as above line throws syntax error
/^CN1/ {ICAP_IND = $2}
/^REF\*2U/ {($2=="2U")? HP = $3 : HP = $100}
/^DTP/ {($2 == "573")? PD_DATE = $4 : PD_DATE = $100
print CLM_NBR,$100,$100,$100,SUB_ID,SUB_ID,CP_AMT,
PD_DATE,"0\.00",DSALL_AMT,PATPAID_AMT,NC,RPL,
GRP_POL,GRP_NM,"12",FREQ,
$100,$100,$100,$100,$100,$100,$100,$100,$100,$100,
REP_CLM,$100,ICAP_IND,HP,REF_QUAL}' sample3.dat

The output after running the script is:
Code:
||.00|||||HELLO||12|1|||||||||100||||05~
||.00|||||HELLO||12|1|||||||||100||||05~

I don't understand where my script is going wrong. Can someone please help me understand this?
Thank you
# 2  
Old 01-25-2015
When I ran your script, I got:
Code:
1000||||1234A~|1234A~|0.00~|99991231~|0.00|||||HELLO||12|1|||||||||||||05~|HELLO~|
1001||||1234A~|1234A~|0.00~|99991231~|0.00|||||HELLO||12|1|||||||||||||05~|HELLO~|

What OS and version and what awk are you using?

The commented line has faulty syntax. Try replacing it with simple if statements.

I do not understand what the script is doing. For one, I do not get this $100 stuff. Is that another way of assigning an empty string?
# 3  
Old 01-25-2015
gawk on Windows 7 Enterprise edition.

I get an error when using an if statement as well.

$100 is used to assign NULL (empty) values. I forgot to add a command to strip the '~'; that is not a priority.
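
What $100 does here can be checked with a minimal sketch. A reference to a field beyond NF yields the empty string, just like a variable that was never assigned (unset_var below is an arbitrary, hypothetical name, not a keyword). The catch is that $100 stops being empty on any line that has 100 or more fields.

```shell
# One sample segment with only 3 fields; unset_var is never assigned anywhere.
out=$(printf 'AMT*D*0.00\n' |
    awk -F'*' -v OFS='|' '{ print "[" $100 "]", "[" unset_var "]", NF }')
echo "$out"   # prints []|[]|3
```

Reading $100 does not create the field, so NF stays at 3.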

Simple explanation of my code:

Code:
/^CLM/ {CLM_NBR = $2}

If a line begins with CLM, assign the value of its second field to CLM_NBR.

Code:
/^CLM/ {split($6,x,">");(x[3]=="7"||x[3]=="8")? (/^REF\*F8/ {REP_CLM = $3;REF_QUAL = $2}) : {REP_CLM = $100;REF_QUAL = $100}}

For a record starting with CLM, the 6th field is itself delimited by ">". That field is split and the pieces are held in the array x. If x[3] is either 7 or 8, then the line starting with REF*F8 should be considered.

Once a DTP line is encountered, print all the values stored in the variables.

Thanks.
Ashok
# 4  
Old 01-25-2015
Quote:
Originally Posted by ashokv3
gawk on Windows 7 Enterprise edition.
That may account for the actual output you showed; make sure there are no DOS <CR> line terminators in your data file.
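
One way to guard against that inside the script itself, as a sketch on a single sample line with a DOS line ending: strip a trailing carriage return (and the segment terminator) before any field is used.

```shell
# The sample line ends in CR LF, as a file saved on Windows would.
clean=$(printf 'NM1*IL*1*ABC*DEF*A***MI*1234A~\r\n' |
    awk -F'*' '{ sub(/\r$/, ""); sub(/~$/, ""); print $10 }')
echo "$clean"   # prints 1234A
```

Because sub() modifies $0, awk re-splits the fields, so $10 no longer carries the "~" or the carriage return.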

Quote:
I get error while using if statement also.
I don't think you would if you were using if correctly.

Quote:
Code:
/^CLM/ {split($6,x,">");(x[3]=="7"||x[3]=="8")? (/^REF\*F8/ {REP_CLM = $3;REF_QUAL = $2}) : {REP_CLM = $100;REF_QUAL = $100}}

...Commented portion of the script below throws syntax error.
And rightly so, I think, for syntactic as well as logical/semantic reasons:
- ... ? ... : ... is the conditional operator; it yields a value and can't be used for flow control. As Scrutinizer said, use if ... else ...
- ( ... ) can't group statements for flow control; use { ... }
- /^REF\*F8/ can't work as desired here
a) syntactically: a pattern { action } rule cannot appear inside an action. Use if (/.../) and it will fly
b) logically: that action runs on a line that starts with CLM, so ^REF will never match there. When a line starting with REF is encountered, this action is not entered.
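
The points above can be sketched on a two-line sample, assuming the claim frequency code has been set to 7 so that the REF*F8 branch is exercised (the flag name is_replacement is made up for illustration). The CLM rule only records a flag with if ... else; the REF line is tested later, in its own rule, with if (/.../).

```shell
# Sample data is an assumption: the frequency code in CLM field 6 is set to 7.
out=$(printf '%s\n' 'CLM*1001*0.00***12>B>7*N*A*Y*I~' 'REF*F8*1000~' |
    awk -F'*' '
        { sub(/~$/, "") }                  # drop the segment terminator
        /^CLM/ {
            split($6, x, ">")
            if (x[3] == "7" || x[3] == "8")
                is_replacement = 1         # remember the state for later lines
            else
                is_replacement = 0
        }
        /^REF/ {
            # A regex inside an action must be wrapped in if (...)
            if (/^REF\*F8/ && is_replacement)
                print $3 "|" $2
        }
    ')
echo "$out"   # prints 1000|F8
```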

---------- Post updated at 17:29 ---------- Previous update was at 17:02 ----------

Tried to prettify your script. First version delivered the same output from your sample input as did your script. Some modifications of both script and data file delivered the desired output (or pretty close, at least):
Code:
awk -F"*" -vOFS="|" '   {sub (/~$/, NUL)}
         /^CLM/         {CLM_NBR = $2}
         /^NM1/         {SUB_ID = $10}
         /^AMT\*D\*/    {CP_AMT = $3}
         /^AMT\*A8\*/   {DSALL_AMT = $3
                         NC = $3}
         /^AMT\*F5\*/   {PATPAID_AMT = $3}
         /^AMT\*EAF\*/  {RPL = $3}
         /^SBR/         {GRP_POL = $4
                         GRP_NM = $5}
         /^CLM/         {split($6,x,">")
                         FREQ = x[3]
                         FREQ78 = (x[3]=="7"||x[3]=="8")
                        }
         /^CN1/         {ICAP_IND = $2}
         /^REF\*2U\*/   {HP = $3}
         /^REF\*F8\*/ &&
            FREQ78      {REP_CLM = $3
                         REF_QUAL = $2}
         /^DTP/         {if ($2 == "573") PD_DATE = $4
                         print  CLM_NBR, NUL, NUL, NUL, SUB_ID, SUB_ID, CP_AMT, PD_DATE, "0.00",
                                DSALL_AMT, PATPAID_AMT, NC, RPL, GRP_POL, GRP_NM, "12", FREQ, NUL,
                                NUL, NUL, NUL, NUL, NUL, NUL, NUL, NUL, NUL, REP_CLM, NUL, ICAP_IND, HP, REF_QUAL
                         CLM_NBR = SUB_ID = CP_AMT = PD_DATE = DSALL_AMT = PATPAID_AMT = NC = RPL = GRP_POL = \
                                   GRP_NM = FREQ = REP_CLM = ICAP_IND = HP = REF_QUAL = NUL
                        }
        ' file
1000||||1234A|1234A|0.00|99991231|0.00|||||HELLO||12|1|||||||||||||05|HELLO|
1001||||1234A|1234A|0.00|99991231|0.00|||||HELLO||12|7|||||||||||1000||05||F8

Had to make some assumptions (e.g. CLM record appearing before REF F8 record) and had to set the FREQ to 7 in your sample data, but you get the gist (I hope).

# 5  
Old 01-25-2015
Thank you @RudiC.
Please explain why you used the code below after the print statement. I got the desired output when I removed it.
Code:
CLM_NBR = SUB_ID = CP_AMT = PD_DATE = DSALL_AMT = PATPAID_AMT = NC = RPL = GRP_POL = \
GRP_NM = FREQ = REP_CLM = ICAP_IND = HP = REF_QUAL = NUL

The sample file I used contained DOS carriage-return characters (^M), and the script worked once I removed them.

Thanks,
Ashok

# 6  
Old 01-26-2015
It replaces all those PATPAID_AMT = $100 style fallbacks in the conditional assignments above. It is relevant only if you have multiple records to process: it resets the variables to NUL (an unassigned variable, equivalent to the empty string and easier to understand than $100), so that values from one claim do not carry over into the next.
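
The effect of the reset can be seen in a cut-down sketch with made-up data: two claims, of which only the first has a REF*F8 segment. The short variable names (clm, rep) are for illustration only.

```shell
# Hypothetical, shortened segments: the second claim has no REF*F8 line.
input=$(printf '%s\n' 'CLM*1000~' 'REF*F8*999~' 'DTP*573*D8*99991231~' \
                      'CLM*1001~' 'DTP*573*D8*99991231~')

# Without a reset, the second claim inherits rep from the first.
no_reset=$(printf '%s\n' "$input" | awk -F'*' -v OFS='|' '
    { sub(/~$/, "") }
    /^CLM/      { clm = $2 }
    /^REF\*F8/  { rep = $3 }
    /^DTP/      { print clm, rep }')

# With a reset after each print, every claim starts from empty values.
with_reset=$(printf '%s\n' "$input" | awk -F'*' -v OFS='|' '
    { sub(/~$/, "") }
    /^CLM/      { clm = $2 }
    /^REF\*F8/  { rep = $3 }
    /^DTP/      { print clm, rep; clm = rep = "" }')

printf 'no reset:\n%s\nwith reset:\n%s\n' "$no_reset" "$with_reset"
```

Without the reset, the second line comes out as 1001|999; with it, the second line is 1001| because rep was cleared after the first claim was printed.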