Parse


 
# 1  
Old 05-14-2002

Does anybody know how to parse a file (e.g. a SIF file) into a delimited text file in UNIX?
# 2  
Old 05-14-2002
What's a SIF file?

Can you describe the layout of what it is now versus what you want it to look like?
# 3  
Old 05-14-2002
It is actually called a system integration file. In UNIX, we have SIF files that we receive from other systems.

The format looks like this:

XYZHEADER 20020503

AAAAAAAABBBBBBBBBBBBCCCCCCCCCDDDDDDDDDDDDDEEEEEEEFFFFFFFFFFFFFFFGGGGGGGGGHHHHHHH

AAAAAAAABBBBBBBBBBBBCCCCCCCCCDDDDDDDDDDDDDEEEEEEEFFFFFFFFFFFFFFFGGGGGGGGGHHHHHHH

AAAAAAAABBBBBBBBBBBBCCCCCCCCCDDDDDDDDDDDDDEEEEEEEFFFFFFFFFFFFFFFGGGGGGGGGHHHHHHH

XYZTRAILER 0000003000.

It has a header record with the date, a trailer record with the number of records in the file, and the actual records, which have fixed-length fields (as shown above, A is one field, B is another). The task is to split each record into its fields with a delimiter (anything: a comma, a colon, a tab, or spaces) and write the result to a text file.

Like this:

AAAAAAAA, BBBBBBBBBBBB, CCCCCCCCC, DDDDDDDDDDDDD, ...

Then we want to load this text file into tables and, from there, develop a front-end screen that shows a label for each field and displays the corresponding value.

As soon as we get the delimited text file, the other part will be done easily.

I know we could use a lot of cut, grep and awk... but I need something simple.
# 4  
Old 05-15-2002
Using awk IS the something simple!

The following awk script parses a file called "data.txt" which contains the data from your example. To modify the number and/or size of the fields just change the numbers in the "fieldCount = split ..." line. To change the field delimiter change the SECOND setting for fieldSep.

Code:
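awk '
    BEGIN {
        # Field widths, in order; their sum is the expected record length.
        fieldCount = split ("8,12,9,13,7,15,9,7", fieldWidth, ",")
        lineLength=0

        for (i=1; i<=fieldCount; i++)
            lineLength += fieldWidth[i]
    }
    # Process only lines of exactly the expected length; this skips the
    # header, the trailer and blank lines.
    (length ($0) == lineLength) {
        fieldSep=""     # no delimiter before the first field
        startPos=1

        for (i=1; i<=fieldCount; i++) {
            printf "%s%s", fieldSep, substr ($0, startPos, fieldWidth[i])
            startPos += fieldWidth[i]
            fieldSep = ", "     # delimiter placed before every later field
        }

        printf "\n"
    }
' data.txt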

Note the following assumptions:

1) each data line is assumed to be EXACTLY the number of characters required in length

2) the number of records is correct; no checks are done against the header or trailer

3) each line is short enough to be processed by awk!
# 5  
Old 05-15-2002
Thank you very much for your concern. Really appreciate it.
# 6  
Old 05-15-2002
Hi Kemisola,
I tried your script. It gives no errors, but I also see no output. I tried redirecting the output to another file and that did not work either. Where and how can I see the result file?
I tried the following command to check whether the printf statement works:
awk '{printf "%s%s", ",", substr ($0, 1, 8)}' data.txt
and it is working fine.
I do not understand why nothing is printed when the statement is inside the script...

Is it because of the (length ($0) == lineLength) condition?
# 7  
Old 05-15-2002
Yes, that would be it. The script will ignore all lines that are not the exact length it is expecting, which would be the sum of all the column widths as shown in the split command. In this case, each line would have to be exactly 80 characters. If you have even one trailing space on a line, that would cause the line to be ignored.
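
As a quick check of that 80, the widths from the split line can be summed on the command line. This is only an illustrative one-liner; the variable name expected is made up here and is not part of the original script.

```shell
# Sum the comma-separated field widths used in the split() call.
expected=$(echo "8,12,9,13,7,15,9,7" |
    awk -F, '{ for (i = 1; i <= NF; i++) total += $i; print total }')
echo "$expected"    # 8+12+9+13+7+15+9+7 = 80
```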

Depending on your requirements, there are options available ...

If there is a possibility of trailing spaces, the script could chop those off before checking the line length.

Instead of requiring exact line length, it could require a minimum line length, and ignore any excess.

The script could easily output two files: processed lines and non-processed lines/ignored excess.
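
A rough sketch of how those three options could be combined, assuming the same field widths as the script above. The output names good.txt and rejected.txt and the generated sample data are made up for illustration. Carriage returns are stripped along with spaces and tabs on the assumption that some files may arrive from DOS/Windows systems.

```shell
# Build an 80-character sample record (8 As, 12 Bs, ...) from the widths,
# then a sample file: header, clean record, record with trailing spaces, trailer.
rec=$(awk 'BEGIN {
    n = split("8,12,9,13,7,15,9,7", w, ",")
    for (i = 1; i <= n; i++)
        for (j = 1; j <= w[i]; j++)
            printf "%c", 64 + i      # 65 is "A", 66 is "B", ...
}')
printf '%s\n' 'XYZHEADER 20020503' "$rec" "$rec   " 'XYZTRAILER 0000003000.' > data.txt

awk '
    BEGIN {
        fieldCount = split("8,12,9,13,7,15,9,7", fieldWidth, ",")
        for (i = 1; i <= fieldCount; i++)
            minLength += fieldWidth[i]
    }
    {
        # Chop trailing spaces, tabs and carriage returns before measuring.
        sub(/[ \t\r]+$/, "")
    }
    length($0) < minLength {
        # Too short to hold every field: header, trailer, blank lines, etc.
        print > "rejected.txt"
        next
    }
    {
        # Long enough: split by width; anything past the last field is ignored.
        startPos = 1
        fieldSep = ""
        for (i = 1; i <= fieldCount; i++) {
            printf "%s%s", fieldSep, substr($0, startPos, fieldWidth[i])
            startPos += fieldWidth[i]
            fieldSep = ", "
        }
        printf "\n"
    }
' data.txt > good.txt
```

After running it, good.txt holds the delimited records and rejected.txt holds the header, the trailer and anything else that was too short. If the last field can legitimately end in spaces, drop the sub() line and rely on the minimum-length test alone, otherwise valid records would be trimmed below the minimum and rejected.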

To see your line lengths, do:
Code:
awk '{print length($0)}' data.txt
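
On a large file, a tally of how many lines have each length may be easier to read than one number per line. A small sketch along those lines; the function name line_lengths is hypothetical, not a standard command.

```shell
# Print "length count" pairs for the given file(s), shortest length first.
line_lengths() {
    awk '{ count[length($0)]++ }
         END { for (len in count) printf "%d %d\n", len, count[len] }' "$@" |
        sort -n
}
```

Running line_lengths data.txt on a healthy file should show one dominant length (80 here) plus a line or two for the header and trailer; any other lengths point at the lines the script is skipping.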

Jimbo