Help using awk with a text file


 
# 8  
Old 04-02-2013
Using Awk with text file

This is a major step in the right direction, but the output file names still have not been specified. Do you really want leading spaces on the output file names when customer numbers are less than five digits? It is easy to put them in (as requested), but it will make handling these files harder for you. Would you prefer to have leading zeros added so the filename always starts with a 5 digit customer number?
I would not use leading spaces for the account number. I can have it either with or without leading zeros; I prefer without leading zeros.
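For comparison, the three naming options discussed above can be sketched with awk's printf. The customer number 9 and the 0413 suffix are made-up sample values, and the brackets are only there to make the leading spaces visible:

```shell
# Hypothetical sample values: customer number 9 as it might appear in a
# 5-character field, and a fixed month/year suffix.
names=$(awk 'BEGIN {
        cn = "    9"; my = "0413"
        printf("[%5d%s.txt]\n", cn, my)   # padded with leading spaces
        printf("[%05d%s.txt]\n", cn, my)  # padded with leading zeros
        printf("[%d%s.txt]\n", cn, my)    # no padding
}')
printf '%s\n' "$names"
```

The last form is the one that matches the stated preference (no leading zeros, no leading spaces).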

Where do the last 4 digits of the output file names come from? They don't come from the dates on lines 8, 18, or 19 in your 55 line statements. Are they always supposed to be 0313? Are they supposed to be the two digit month and two digit year corresponding to the date when the script is run?
The last 4 digits will come from the system date. The script will always be run on the last day of the month, so I would use the month and year. I usually use the following to come up with my date:
date '+ %c/%m/%d' > date.hold
y=`cat date.hold | cut -c24-25`
m=`cat date.hold | cut -c27-28`
Then I would use the variables $y and $m in naming the file.

Are the multiple occurrences of the strings "^L" and "^M" in your input file literal characters that you want kept in the output files? Are they a graphic representation of form feed and carriage return characters that you want kept in the output files? Or, are they a graphic representation of form feed and carriage return characters that you want to be stripped from the output?
As you noted, they are carriage return and form feed characters, and I would want them in the output files as well.

If there are multiple statements for a given customer number, are they always adjacent records in the input file?
Yes. If a statement is more than 1 page, the additional pages are adjacent to the first, so if account 9 had multiple pages, one would follow directly after the other.

What shell do you want to use and what operating system are you using?
SCO 5.0.7, using the Bourne shell.
# 9  
Old 04-02-2013
The following awk script seems to do what you want:
Code:
awk -v my="$(date "+%m%y")" '
{       out = out $0 "\n"
}
(NR % 55) == 10 {
        CN = substr($0, 1, 5)
        if(lCN != CN) {
                if(fn != "") close(fn)
                fn = sprintf("%d%s.txt", lCN = CN, my)
        }
        next
}
(NR % 55) == 0 {
        printf("%s", out) > fn
        out = ""
}
END {   if(fn != "") close(fn)
}' input

Note that the 1st line of your sample input file does not start with a form feed character. So, if you cat the output files to your printer, you will be missing a form feed before the file that comes from the first record of your input file. With your sample input and your choice not to have leading zeros in output file names, the last file printed will be missing the form feed.

This probably doesn't matter for an SCO system, but if someone else wants to try this script on a Solaris/SunOS system, they should use /usr/xpg4/bin/awk or nawk instead of awk.
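To see how the page buffering and file switching in the script above fit together, here is a small sketch of the same logic run on made-up input. The page length (4) and the customer number line (2) are illustrative assumptions passed in as the hypothetical awk variables len and cnline; the script in this thread hard-codes 55 and 10. The $(...) form assumes a POSIX shell.

```shell
# Assumptions for illustration only: 4-line pages with the customer number
# in columns 1-5 of line 2 of each page (the thread uses 55 and 10).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'hdr A' '  100 ACME' 'a1' 'a2' \
              'hdr B' '    9 BOLT' 'b1' 'b2' > input

awk -v my="0413" -v len=4 -v cnline=2 '
{       out = out $0 "\n"                 # buffer every line of the page
}
(NR % len) == cnline {                    # line holding the customer number
        CN = substr($0, 1, 5)
        if(lCN != CN) {                   # new customer: switch output file
                if(fn != "") close(fn)
                fn = sprintf("%d%s.txt", lCN = CN, my)
        }
        next
}
(NR % len) == 0 {                         # last line of the page: flush it
        printf("%s", out) > fn
        out = ""
}
END {   if(fn != "") close(fn)
}' input

ls
```

With this input the sketch should leave 1000413.txt and 90413.txt next to the input file, each holding one 4-line page. Note that a page is only written when its last line is reached, so the approach depends on every page having exactly the assumed number of lines.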
# 10  
Old 04-02-2013
Test on awk script

Well it is real close but I must be missing a quote in the awk script because it is using the literal of the month and year instead of populating with the real month and year. Did I miss a tick on the copy and paste?
933 Apr 2 17:08 100$(date +%m%y).txt
866 Apr 2 17:08 103$(date +%m%y).txt
866 Apr 2 17:08 105$(date +%m%y).txt
Code:
awk -v my="$(date "+%m%y")" '
{       out = out $0 "\n"
}
(NR % 55) == 10 {
        CN = substr($0, 1, 5)
        if(lCN != CN) {
                if(fn != "") close(fn)
                fn = sprintf("%d%s.txt", lCN = CN, my)
        }
        next
}
(NR % 55) == 0 {
        printf("%s", out) > fn
        out = ""
}
END {   if(fn != "") close(fn)
}' report10.txt

# 11  
Old 04-02-2013
Quote:
Originally Posted by ziggy6
Well it is real close but I must be missing a quote in the awk script because it is using the literal of the month and year instead of populating with the real month and year. Did I miss a tick on the copy and paste?

With the input you provided, I get the files:
Code:
1000413.txt
1030413.txt
1050413.txt
90413.txt

when I run this script on OS X.

Are you sure that the first line of the script you ran was:
Code:
awk -v my="$(date "+%m%y")" '

and not:
Code:
awk -v my='$(date '+%m%y')' '

?
If you used single quotes where I had double quotes in the first line, the command substitution would not be performed when assigning a value to the variable my and you would end up with:
Code:
100$(date +%m%y).txt
103$(date +%m%y).txt
105$(date +%m%y).txt
9$(date +%m%y).txt

as the output file names.
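The difference between the two quoting styles can be checked directly in the shell, without awk. A minimal sketch (the variable names expanded and literal are made up for this example, and the $(...) form assumes a POSIX shell):

```shell
# Double quotes: the shell runs date and substitutes its output.
expanded="$(date "+%m%y")"
# Single quotes: the text is kept literally and date is never run.
literal='$(date '+%m%y')'
printf 'double-quoted: %s\nsingle-quoted: %s\n' "$expanded" "$literal"
```

The first line printed should show a four digit month and year; the second should show the same literal $(date +%m%y) text that appeared in the file names.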
This User Gave Thanks to Don Cragun For This Post:
# 12  
Old 04-02-2013
Test on awk script

Yes I had the quotes. I modified it to this and it worked fine!
Code:
date '+ %c/%m/%d' >date.hold
y=`cat date.hold | cut -c22-25`
m=`cat date.hold | cut -c27-28`
awk  '
{       out = out $0 "\n"
}
(NR % 55) == 10 {
        CN = substr($0, 1, 5)
        if(lCN != CN) {
                if(fn != "") close(fn)
                fn = sprintf("%d%s.txt", lCN = CN, my)
        }
        next
}
(NR % 55) == 0 {
        printf("%s", out) > fn
        out = ""
}
END {   if(fn != "") close(fn)
}' report10.txt

---------- Post updated at 06:51 PM ---------- Previous update was at 06:50 PM ----------

A big thank you to Don for excellent assistance on the script, much appreciated!
# 13  
Old 04-02-2013
Quote:
Originally Posted by ziggy6
Yes I had the quotes. I modified it to this and it worked fine!
Note that you're setting the shell variables m and y, but you are not setting the awk variable my.

Even if your shell doesn't recognize $(...), it would still be much easier to set the month and year in a single invocation of date instead of using date once, two calls to cat, and two calls to cut. Please try changing the first line of my original script to:
Code:
awk -v my=`date +%m%y` '

if your shell doesn't recognize the $(...) form of command substitution. Note that it is important that you use the exact single quote and back quotes as shown.
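The point about shell variables versus awk variables can be seen in a small sketch. The customer number 100 and the variable names with and without are made up for this example; only the -v my=... assignment comes from the script above.

```shell
# Backquote form of command substitution, for shells without $(...).
my=`date +%m%y`

# Passed with -v, the value is visible inside awk as the variable my.
with=`awk -v my="$my" 'BEGIN { printf("%d%s.txt", 100, my) }'`

# Without -v, awk's my is unset and expands to an empty string,
# no matter which shell variables exist.
without=`awk 'BEGIN { printf("%d%s.txt", 100, my) }'`

echo "$with $without"
```

The second name comes out as 100.txt, which is what a script run without the -v assignment would produce for customer 100.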
# 14  
Old 04-03-2013
Test on awk script

Ok that new statement worked,
Code:
awk -v my=`date +%m%y` '

I did some more testing to check what would happen with an account that had more than 1 page, and it wants to break the statement up into separate files.

Here is an example of one account with 3 pages
Code:
    1  ^M                 ABC PARTS COMPANY             ^M
    2                   100 WILDFLOWER TRAIL          ^M
    3                   MT PLEASANT SC 29579          ^M
    4                          800-555-1212
    5
    6
    7
    8  ^M                                       02/05/13^M
    9  ^M
   10    960    FRONTIER CONSTRUCTION           ^M
   11           P O BOX 84                      ^M
   12           JONES MILLS PA 15646            ^M
   13                                           ^M
   14  ^M
   15  ^M
   16                                                    PAGE   1 OF   3^M
   17  ^M
   18  10/31/12         PB  1PB         8304.54                  8304.54 ^M
   19  10/31/12         PB  2PB         3123.15                 11427.69 ^M
   20  11/01/12         ID  146469       589.75                 12017.44 ^M
   21  11/03/12         ID  146502       275.49                 12292.93 ^M
   22  11/03/12         ID  146503        30.69                 12323.62 ^M
   23  11/05/12         ID  146509       417.85                 12741.47 ^M
   24  11/05/12         ID  146515        75.68                 12817.15 ^M
   25  11/05/12         ID  146516       598.90                 13416.05 ^M
   26  11/05/12         ID  146518       275.49                 13691.54 ^M
   27  11/05/12         ID  146519       158.95                 13850.49 ^M
   28  11/05/12         ID  146520        22.79                 13873.28 ^M
   29  11/05/12         ID  146525        11.13                 13884.41 ^M
   30  11/06/12         ID  146539        90.31                 13974.72 ^M
   31  11/06/12         IC  146547                   19.19-     13955.53 ^M
   32  11/07/12         ID  146552       401.37                 14356.90 ^M
   33  11/07/12         ID  146558       110.18                 14467.08 ^M
   34  11/08/12         ID  146583       615.84                 15082.92 ^M
   35  11/09/12         ID  146590       253.76                 15336.68 ^M
   36  11/12/12         ID  146612       122.72                 15459.40 ^M
   37  11/12/12         ID  146618       143.65                 15603.05 ^M
   38  11/13/12         ID  146620       691.25                 16294.30 ^M
   39  11/13/12         ID  146635        42.35                 16336.65 ^M
   40  11/14/12         ID  146646        16.94                 16353.59 ^M
   41  11/14/12         ID  146653       593.55                 16947.14 ^M
   42  11/14/12         ID  146656       221.88                 17169.02 ^M
   43  11/14/12         ID  146661        90.74                 17259.76 ^M
   44  11/14/12         ID  146662       200.34                 17460.10 ^M
   45  11/14/12         ID  146664        67.03                 17527.13 ^M
   46  11/15/12         ID  146681      2403.09                 19930.22 ^M
   47  11/15/12         ID  146682        35.37                 19965.59 ^L
                ABC PARTS COMPANY             ^M
   48                   100 WILDFLOWER TRAIL          ^M
   49                   MT PLEASANT SC 29579          ^M
   50                          800-555-1212
   51
   52
   53
   54  ^M                                       02/05/13^M
   55  ^M
   56    960    FRONTIER CONSTRUCTION           ^M
   57           P O BOX 84                      ^M
   58           JONES MILLS PA 15646            ^M
   59                                           ^M
   60  ^M
   61  ^M
   62                                                    PAGE   2 OF   3^M
   63  ^M
   64  11/15/12         ID  146684        74.19                 20039.78 ^M
   65  11/16/12         ID  146702       254.08                 20293.86 ^M
   66  11/16/12         ID  146703       299.34                 20593.20 ^M
   67  11/19/12         ID  146719       227.54                 20820.74 ^M
   68  11/19/12         ID  146722        38.01                 20858.75 ^M
   69  11/19/12         ID  146733       142.36                 21001.11 ^M
   70  11/20/12         ID  146740        64.43                 21065.54 ^M
   71  11/20/12         ID  146753       145.62                 21211.16 ^M
   72  11/20/12         ID  146757       314.13                 21525.29 ^M
   73  11/20/12         ID  146761        59.71                 21585.00 ^M
   74  11/20/12         ID  146763        31.84                 21616.84 ^M
   75  11/21/12         PC  CK6192                   16.11-     21600.73 ^M
   76  11/21/12         PC  CK4956                 8561.65-     13039.08 ^M
   77  11/23/12         ID  146789        46.80                 13085.88 ^M
   78  11/23/12         ID  146795        46.48                 13132.36 ^M
   79  11/23/12         ID  146805      1004.47                 14136.83 ^M
   80  11/26/12         ID  146812       100.40                 14237.23 ^M
   81  11/27/12         ID  146819       139.87                 14377.10 ^M
   82  11/27/12         ID  146820       279.29                 14656.39 ^M
   83  11/27/12         ID  146821        48.37                 14704.76 ^M
   84  11/27/12         ID  146822        72.81                 14777.57 ^M
   85  11/27/12         ID  146823        71.24                 14848.81 ^M
   86  11/27/12         ID  146826       156.46                 15005.27 ^M
   87  11/28/12         ID  146828        95.24                 15100.51 ^M
   88  11/28/12         ID  146829        76.79                 15177.30 ^M
   89  11/28/12         ID  146830       163.02                 15340.32 ^M
   90  11/28/12         ID  146842        13.78                 15354.10 ^M
   91  11/28/12         ID  146854       254.83                 15608.93 ^M
   92  11/28/12         ID  146856       915.88                 16524.81 ^M
   93  11/29/12         ID  146877       137.75                 16662.56 ^L
                ABC PARTS COMPANY             ^M
   94                   100 WILDFLOWER TRAIL          ^M
   95                   MT PLEASANT SC 29579          ^M
   96                          800-555-1212
   97
   98
   99
  100  ^M                                       02/05/13^M
  101  ^M
  102    960    FRONTIER CONSTRUCTION           ^M
  103           P O BOX 84                      ^M
  104           JONES MILLS PA 15646            ^M
  105                                           ^M
  106  ^M
  107  ^M
  108                                                    PAGE   3 OF   3^M
  109  ^M
  110  11/29/12         ID  146878        34.27                 16696.83 ^M
  111  11/29/12         ID  146879        33.92                 16730.75 ^M
  112  11/29/12         ID  146880       178.06                 16908.81 ^M
  113  11/29/12         ID  146881        75.83                 16984.64 ^M
  114  11/29/12         ID  146882       148.44                 17133.08
  115
  116
  117
  118
  119
  120
  121
  122
  123
  124
  125
  126
  127
  128
  129
  130
  131
  132
  133
  134
  135
  136
  137
  138
  139
  140  ^M^M
  141                                                           17133.08 ^M
  142  ^M
  143   14283.15      2849.93                                     ^M
  144  ^M
  145                                   STATEMENT MESSAGE HERE          ^M
  146                                   PLEASE RETURN TO US             ^M
  147  ^M

This resulted in two files:
2849 Apr 3 08:42 110413.txt
2787 Apr 3 08:42 9600413.txt
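One way to check whether the 55 line assumption holds for multi-page statements is to list the line numbers of the PAGE n OF m lines and look at the spacing. As numbered in the listing above, they fall on lines 16, 62, and 108, which is 46 lines apart rather than 55. A minimal sketch of such a check, run on a made-up input with 3-line pages rather than on the real report (the file name sample.txt is hypothetical):

```shell
dir=$(mktemp -d) || exit 1
# Made-up input: each page is 3 lines long and carries a PAGE line.
printf '%s\n' 'x' 'PAGE   1 OF   2' 'y' \
              'x' 'PAGE   2 OF   2' 'y' > "$dir/sample.txt"

# Print the line number of every PAGE line and its distance from the
# previous one.
gaps=$(awk '/PAGE +[0-9]+ +OF +[0-9]+/ {
        if (prev) printf("line %d (+%d)\n", NR, NR - prev)
        else      printf("line %d\n", NR)
        prev = NR
}' "$dir/sample.txt")
printf '%s\n' "$gaps"
```

If the distance reported for the real file is not 55, tests such as (NR % 55) == 10 will drift onto the wrong lines after the first page, which would be consistent with a file name like 110413.txt being built from a date line instead of the account line.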