Dropping Records for unknown reason in awk script
Hi, I have written the following. It is pretty sloppy, but I don't see any reason why I should be losing 54 records from a 3.5 million line file after using it. What I am doing: I have a 3.5 million record file in which about 80,000 records need a correction. They are missing the last data from an append because they didn't have a match. I need to insert defaulted data on these records. My script worked as intended; however, I have 54 fewer output records than input records and I don't know why they were dropped. Code:
#!/bin/ksh
myFile="${1}"
myOutput="${2}"
awk '{
match_flag=substr($0,63,2);
if (NR == 1) insert_data=substr($0,41,22);
if (match_flag == " ") {strt=substr($0,1,40); print strt insert_data "\ \ \ \ \ \ \ \ \ \ \ NM\ X";}
else print $0;}' "${myFile}" >> "${myOutput}"
Basically what I am doing is appending a long string of data to any records that are missing a value in position 3064-3065. Since this file is so large I can't really provide sample data, but I'll attempt to reproduce a short version below. Code:
INPUT:
0001 Ronald McDonald 01 H81 0001256 0100111 V VEEEFKFS SP X
0002 Elmo St. Elmo 02 H82 0089621 001 10 11 01 1 0000WWDFCWWSP X
0003 Cookie Monster 01 H81 0887141 1 . 0 0 . 1 BBB000 QWFJSP X
0004 Tfer Harris 04 H84 0985512 0000000000000000000000BBE00122933NM X
0005 Oscar Grouche 03 H83 0364471 110.VVMWEWGODWFDA X
0006 Dumb Name 02 H82 0000233 111 00 1111 00000000F23202233FFDA X
0007 Butter Face 04 H84 0014666 1111111111111111111111M012291122FDA X
0008 Ford F150 01 H81 0000001 00111 110 110 0011 ..S1102234SSMSP X
0009 Bar Foo 03 H83 7741668 0 1 0 1 0 1 0 1 0 1 0 P019441MEWEDA X
0010 ChoCho Train 04 H84 0014669 1111111111111111111111POWA1224023OB X
0011 Stone Stone 04 H84 0014566 1111111111111111111111M12301MANWEOB X
0012 Problem Record 04 H84 0000000
OUTPUT:
0001 Ronald McDonald 01 H81 0001256 0100111 V VEEEFKFS SP X
0002 Elmo St. Elmo 02 H82 0089621 001 10 11 01 1 0000WWDFCWWSP X
0003 Cookie Monster 01 H81 0887141 1 . 0 0 . 1 BBB000 QWFJSP X
0004 Tfer Harris 04 H84 0985512 0000000000000000000000BBE00122933NM X
0005 Oscar Grouche 03 H83 0364471 110.VVMWEWGODWFDA X
0006 Dumb Name 02 H82 0000233 111 00 1111 00000000F23202233FFDA X
0007 Butter Face 04 H84 0014666 1111111111111111111111M012291122FDA X
0008 Ford F150 01 H81 0000001 00111 110 110 0011 ..S1102234SSMSP X
0009 Bar Foo 03 H83 7741668 0 1 0 1 0 1 0 1 0 1 0 P019441MEWEDA X
0010 ChoCho Train 04 H84 0014669 1111111111111111111111POWA1224023OB X
0011 Stone Stone 04 H84 0014566 1111111111111111111111M12301MANWEOB X
0012 Problem Record 04 H84 0000000 0000000000000000000000 NM X
File is fixed length, no delimiters.
Last edited by mkastin; 4 Weeks Ago at 11:28 AM. Reason: Fixing all examples and adjusting code to fit examples properly.
|
||||
|
It would be helpful if you rewrote your example code to work on the sample input you provide. At the moment there is no way of knowing what you are expecting in position 3064, although your assumption that it is two empty spaces may be at the root of your problem. I also don't understand your print statement: Code:
print strt insert_data "\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0NM\ X"
When I try: Code:
nawk ' BEGIN{
print "\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0NM\ X"
} '
I get: Code:
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0NM\ X
as my output, not: Code:
0000000000000000000000NMX
My advice is to scale down your example awk to work with your sample file, and maybe someone will reply. Simply reposting the same request without changing it at all seems to be getting you nowhere. Good luck
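One way to test the "two empty spaces" assumption is to tally what is actually in the flag columns, and how long each record really is. The following is only a sketch: the three-record sample file is made up for illustration, and columns 63-64 are taken from the `substr($0,63,2)` call in the posted script. Note that a record shorter than 63 characters yields an empty string from `substr`, which is not equal to two blanks.

```shell
#!/bin/sh
# Hypothetical sample: two full-width records and one short record.
tmp=$(mktemp)
printf '%-62s%s\n' 'rec1' 'SP' >  "$tmp"   # flag columns hold "SP"
printf '%-62s%s\n' 'rec2' '  ' >> "$tmp"   # flag columns hold two blanks
printf '%s\n'      'rec3'      >> "$tmp"   # record ends before column 63

# Tally record lengths; a truly fixed-width file reports a single length.
lengths=$(awk '{ n[length($0)]++ } END { for (l in n) print l, n[l] }' "$tmp" | sort -n)
echo "$lengths"

# Tally the contents of columns 63-64, bracketed so blanks are visible.
flags=$(awk '{ f[substr($0, 63, 2)]++ }
             END { for (k in f) printf "[%s] %d\n", k, f[k] }' "$tmp" | sort)
echo "$flags"

rm -f "$tmp"
```

On the real file, any length other than the expected one, or any flag value other than two blanks or a known code, would be a record worth inspecting.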
|
||||
|
Haha, wow, just realized how horrible my question was. Okay, I adjusted everything and it should hopefully be clearer now. Code:
$ awk ' BEGIN{
print "\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0NM\ X"
} '
awk: cmd. line:1: warning: escape sequence `\ ' treated as plain ` '
0NM X
This statement works fine for me, although the escape sequence isn't necessary.
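Since the backslashes aren't needed, the run of blanks could also be built with `sprintf` rather than typed out, which makes the width easy to check. This is just a sketch: the width of 11 is an assumption read off the padding in the posted script, and the variable names are made up.

```shell
#!/bin/sh
out=$(awk 'BEGIN {
    width = 11                          # assumed pad width, adjust to the real layout
    pad   = sprintf("%" width "s", "")  # a string of exactly `width` blanks
    tail  = pad "NM X"
    print "[" tail "]"                  # brackets make the blanks visible
    print length(tail)                  # pad width plus the 4 characters of "NM X"
}')
echo "$out"
```

With a width of 11 the second line printed is 15, so a wrong pad length shows up immediately instead of hiding in a long quoted string.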
|
||||
|
Is there a pattern for the missing records, e.g. at the end? Since your output format is the same as the input, do Code:
cmp -l infile outfile
Look for differences that don't look like your intended ones. The expected difference is that blanks in the input are replaced with fixed values in the output. Try to spot the unintended differences visually (or mechanically).
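For reference, here is roughly what `cmp -l` reports on a tiny made-up pair of files where two blanks were replaced with "NM". Each output line gives the byte offset (decimal) followed by the differing byte values from each file (octal). The file contents here are purely illustrative.

```shell
#!/bin/sh
a=$(mktemp); b=$(mktemp)
printf 'AB  \n' > "$a"    # "input": blanks in bytes 3 and 4
printf 'ABNM\n' > "$b"    # "output": blanks replaced with NM

# cmp exits non-zero when the files differ, so guard it with || true.
diffs=$(cmp -l "$a" "$b" || true)
echo "$diffs"

# Number of differing bytes.
n=$(printf '%s\n' "$diffs" | wc -l)
echo "$n"

rm -f "$a" "$b"
```

Keep in mind that once a record is missing from the output, every byte after that point is shifted, so the first unexpected difference is the interesting one; piping through `head` keeps the output manageable on a large file.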
|
||||
|
Another approach is to diff the input and output files and redirect the differences to a file. Then open the file and look to see why the matches in your awk fail for those lines. You can go to the character positions and confirm whether the patterns you are trying to match are what you expect. Good luck
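With about 80,000 intentionally changed records, a plain diff will be noisy. One possible variation, sketched below, is to diff only the leading columns that the script copies through unchanged (columns 1-40 according to the posted `substr($0,1,40)`), so that only records that are actually missing show up. The sample data and file names are made up.

```shell
#!/bin/sh
src=$(mktemp); dst=$(mktemp)
printf '%s\n' '0001 keep' '0002 lost' '0003 keep' > "$src"   # stand-in for the input file
printf '%s\n' '0001 keep' '0003 keep'             > "$dst"   # stand-in for the output file

# Keep only the part of each record the script never modifies.
cut -c1-40 "$src" > "$src.key"
cut -c1-40 "$dst" > "$dst.key"

# Lines starting with "<" exist in the input but not in the output.
missing=$(diff "$src.key" "$dst.key" | grep '^<')
echo "$missing"

rm -f "$src" "$dst" "$src.key" "$dst.key"
```

The records listed this way can then be pulled from the original input and checked by hand at the relevant character positions.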