awk Parse And Create Multiple Files Based on Field Value


 
# 1  
Old 02-12-2015

Hello:

I am working on parsing a large input file that needs to be split into multiple files based on the second field, in this case STORE.
The idea is to name each file after the corresponding store number, for example Report_$STORENUM_$DATETIMESTAMP, taking the $STORENUM value from the second field. Each store report ends with a 0x0C (form feed), which I am using as an end pattern. Perhaps there is a better way...

Here is my code and sample input file:
Code:
 awk  '$1 ~ /DATE/{f="Report_"++i} f{print > f"_"} /"\0x0C"/ {close (f); f=""}' $INPUT_FILE

INPUT_FILE Example:

Code:
      DATE   STORE  *-UPC NUMBER-*  TIME     TERM TRANS OPERATOR      RETAIL  RAIN CHECK   SAVINGS   QTY        DESCRIPTION
 2015-01-01     1           1234  00.24.03  1    11                   $1.00      $1.00      $1.00     1   VARIETY ITEM 6PK
 2015-01-01     1     9999019919  00.20.19  1    11                   $1.00      $1.00      $1.50     1   WATER SOFT 1PK

     DATE   STORE  *-UPC NUMBER-*  TIME     TERM TRANS OPERATOR      RETAIL  RAIN CHECK   SAVINGS   QTY        DESCRIPTION
 2015-01-01     2           1234  00.24.03  1    11                   $1.00      $1.00      $1.00     1   VARIETY ITEM 6PK
 2015-01-01     2     9999019919  00.20.19  1    11                   $1.00      $1.00      $1.50     1   WATER SOFT 1PK


# 2  
Old 02-12-2015
Unless there are several headers per store with "DATE" in it, you can drop the 0x0C test and close (f) just before defining a new file name.
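If a store report can contain several "DATE" headers, the form feed test is still needed. Here is a minimal sketch of that case, assuming each report really ends with a line containing 0x0C; the file name sample_ff.txt and the shortened columns are made up for illustration. In an awk regular expression the form feed is written \f, so /\f/ matches it, whereas /"\0x0C"/ does not.

```shell
# Hypothetical sample: store 1 has two headers, store 2 has one.
# Each store report ends with a form feed (0x0C, written \f by printf).
printf ' DATE STORE QTY\n 2015-01-01 1 1\n DATE STORE QTY\n 2015-01-02 1 3\n\f\n DATE STORE QTY\n 2015-01-01 2 1\n\f\n' > sample_ff.txt

awk '
    # Open a new numbered file only when no report is currently open.
    $1 == "DATE" && f == "" { i++; f = "Report_" i "_" }
    # Copy every line (repeated headers included) while a report is open.
    f != ""                 { print > f }
    # The form feed ends the report: close the file and wait for the next header.
    /\f/                    { if (f != "") close(f); f = "" }
' sample_ff.txt
```

With this sample, both headers of the first store should end up in Report_1_, and the form feed line itself is copied to the output before the file is closed.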
# 3  
Old 02-12-2015
A few adjustments:
Code:
awk '$1 ~ /DATE/ {if (i++) close (f); f="Report_" i "_"} {print > f}' $INPUT_FILE

Open files are closed automatically when awk exits, but an early close() keeps awk from running out of file descriptors when there are many output files.
# 4  
Old 02-12-2015
Thank you! Is there any way to incorporate the STORE number (field $2) into the name of each file?
# 5  
Old 02-12-2015
This works if the respective headers are identical (which is NOT the case for your sample! I had to edit it):
Code:
awk     'NR==1  {HD=$0}
         $0==HD {getline; FN="Report_" $2; print HD > FN}
                {print $0 > FN}
        ' file
Report_1:
      DATE   STORE  *-UPC NUMBER-*  TIME     TERM TRANS OPERATOR      RETAIL  RAIN CHECK   SAVINGS   QTY        DESCRIPTION
 2015-01-01     1           1234  00.24.03  1    11                   $1.00      $1.00      $1.00     1   VARIETY ITEM 6PK
 2015-01-01     1     9999019919  00.20.19  1    11                   $1.00      $1.00      $1.50     1   WATER SOFT 1PK

Report_2:
      DATE   STORE  *-UPC NUMBER-*  TIME     TERM TRANS OPERATOR      RETAIL  RAIN CHECK   SAVINGS   QTY        DESCRIPTION
 2015-01-01     2           1234  00.24.03  1    11                   $1.00      $1.00      $1.00     1   VARIETY ITEM 6PK
 2015-01-01     2     9999019919  00.20.19  1    11                   $1.00      $1.00      $1.50     1   WATER SOFT 1PK

# 6  
Old 02-12-2015
RudiC:

Thanks so much! It certainly works. Now I have something to work with!!!
# 7  
Old 02-13-2015
The following handles different headers and empty sections, and closes files.
Code:
awk '($1=="DATE") {hd=$0; next} (length(hd)) {if (f) close (f); f="Report_" $2 "_"; print hd > f; hd=""} {print > f}' $INPUT_FILE
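To get the Report_$STORENUM_$DATETIMESTAMP name asked for in the first post, one option is to pass a timestamp into awk with -v. This is only a sketch: the timestamp format, the variable name ts and the file name sample.txt are assumptions, not something given in the thread.

```shell
# Hypothetical sample with differently indented headers, as in the original input.
printf '   DATE STORE QTY\n 2015-01-01 1 1\n  DATE STORE QTY\n 2015-01-01 2 1\n' > sample.txt

# Assumed timestamp format; computed once in the shell and handed to awk.
ts=$(date +%Y%m%d%H%M%S)

awk -v ts="$ts" '
    $1 == "DATE" { hd = $0; next }            # remember the header, print it later
    length(hd)   { if (f) close(f)            # first data line after a header:
                   f = "Report_" $2 "_" ts    #   build the name from the store field
                   print hd > f; hd = "" }    #   and write the saved header first
                 { print > f }                # every data line goes to the current file
' sample.txt
```

This should produce one file per store, such as Report_1_ followed by the timestamp. Note that if the same store number appears in two separate sections, the second `>` after close() truncates the earlier file; using `>>` instead would append, at the cost of also appending on reruns with the same timestamp.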
