Split and add header and trailer from input file


 
# 8  
Old 06-03-2014
-v stamp=$(date +%Y%m%d%H%M%S) assigns the current date and time, in YYYYMMDDhhmmss format, to the stamp variable. Note that this is the time the process starts, so all files produced in this session will have the same timestamp, even if the process takes a long time to run.

typ=substr($0,34,3) extracts the 3 characters at positions 34-36 into the typ variable.

fname="xyz_" typ "_" stamp ".txt" builds the filename string: "xyz_" plus typ plus "_" plus the timestamp plus ".txt" (e.g. xyz_TOM_20140306135500.txt).

if(typ == "PAT") typ="TOM" puts PAT lines into the TOM file: both TOM and PAT lines end up in the same file with the xyz_TOM_... name. For this to take effect, the test has to run before fname is built.

if (!(fname in A)) {A[fname]; print header > fname} If this is the first line to go into this file (i.e. fname is not yet in the A[] array), store fname in the A[] array and output the header to the fname file.

close(fname) closes the file handle. It is re-opened the next time a line is appended. This is slower, because of all the closing and opening of file buffers, but awk is limited in the number of file buffers it can have open at once.

print $0 >> fname appends the current line to the fname file.

The END block writes the trailer line to the end of every file produced in this session, once the end of the input file is reached.
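
Putting those pieces together, the script might look roughly like the sketch below. Treat it as a reconstruction from the explanation, not the code as posted: the /^H/ and /^T/ rules that capture the header and trailer, the Test.txt sample records and the mktemp working directory are all assumptions made for illustration. It also calls close(fname) right after the print rather than before it, so that no file is left open between records.

```shell
#!/bin/sh
# Sketch rebuilt from the explanation above; NOT the code as originally posted.
# Test.txt and its records are made-up sample data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# %-33s pads each prefix to 33 characters so the type code sits at columns 34-36.
{
  printf '%s\n'      'H00000012345678900000000 xxxxxxxxxxxxxx'
  printf '%-33s%s\n' 'D00000012300000000000000 xxxxxxxx' 'TOMxxx'
  printf '%-33s%s\n' 'D00000078900000000000000 xxxxxxxx' 'PATxxx'
  printf '%-33s%s\n' 'D00000036500000000000000 xxxxxxxx' 'BOBxxx'
  printf '%s\n'      'T00000025800000000000000 xxxxxxxxxxxxxx'
} > Test.txt

stamp=$(date +%Y%m%d%H%M%S)     # taken once, so every output file shares it

awk -v stamp="$stamp" '
/^H/ {header=$0; next}          # assumed: the header is the line starting with H
/^T/ {trailer=$0; next}         # assumed: the trailer is the line starting with T
{
   typ=substr($0,34,3)
   if (typ == "PAT") typ="TOM"  # has to happen before fname is built
   fname="xyz_" typ "_" stamp ".txt"
   if (!(fname in A)) {A[fname]; print header > fname}
   print $0 >> fname
   close(fname)                 # closed after the write, so no file stays open
}
END {
   for (fname in A) print trailer >> fname
}' Test.txt

ls
```

With this sample input the TOM and PAT records land in one xyz_TOM_... file and the BOB record in its own xyz_BOB_... file, each wrapped in the header and trailer.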
This User Gave Thanks to Chubler_XL For This Post:
# 9  
Old 06-03-2014
Thank you again!

A small change: I need to group three patterns, i.e. TOM, PAT and SAM, into the same file, with the file name xyz_TOM_Datetimestamp.

Also, how can I capture records with spaces, and any records that do not match the pattern TOM/PAT/SAM, and put them in a file named error_records_datetimestamp?

Here is my updated input file

Code:
 
H00000012345678900000000 xxxxxxxxxxxxxx
D00000012300000000000000 xxxxxxxxTOMxxx
D00000045600000000000000 xxxxxxxxTOMxxx
D00000078900000000000000 xxxxxxxxPATxxx
D00000065000000000000000 xxxxxxxxPATxxx
D00000023100000000000000 xxxxxxxxPATxxx
D00000013200000000000000 xxxxxxxxSAMxxx
D00000036500000000000000 xxxxxxxxSAMxxx
D00000036500000000000000 xxxxxxxxBOBxxx
D00000036500000000000000 xxxxxxxxBOBxxx
D00000036500000000000000 xxxxxxxx   xxx
D00000036500000000000000 xxxxxxxx   xxx
T00000025800000000000000 xxxxxxxxxxxxxx

# 10  
Old 06-03-2014
Try this modification of Chubler_XL's solution:
Code:
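awk -v stamp="$(date +%Y%m%d%H%M%S)" '
         BEGIN          {fname1="xyz_TOM_" stamp ".txt"
                         fname2="error_records_" stamp ".txt"}
         # skip empty lines and lines made up only of spaces
         /^ *$/         {next}
         # header and trailer records go to both files
         /^(H|T)/       {print $0 >> fname1
                         print $0 >> fname2
                         next}
         # TOM, PAT and SAM records share the xyz_TOM file
         substr($0,34,3) ~ /PAT|SAM|TOM/    \
                        {print $0 >> fname1; next}
         # everything else, including a blank type field, is an error record
                        {print $0 >> fname2}
        ' file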

# 11  
Old 06-03-2014
Thanks!

This works well for the patterns TOM/PAT/SAM/spaces.

The moment I have a new pattern, I need to adjust the code. If so, where should my changes go in the code?

From the input file below:

TOM/PAT/SAM records should go in the file "xyz_TOM_Timestamp.txt"
BOB records should go in "xyz_BOB_Timestamp.txt"
KIM records should go in "xyz_KIM_Timestamp.txt"
Spaces and any records other than TOM/PAT/SAM/BOB/KIM should go in "xyz_error_Timestamp.txt"

Please advise

Code:
 
H00000012345678900000000 xxxxxxxxxxxxxx
D00000012300000000000000 xxxxxxxxTOMxxx
D00000045600000000000000 xxxxxxxxTOMxxx
D00000078900000000000000 xxxxxxxxPATxxx
D00000065000000000000000 xxxxxxxxPATxxx
D00000023100000000000000 xxxxxxxxPATxxx
D00000013200000000000000 xxxxxxxxSAMxxx
D00000036500000000000000 xxxxxxxxSAMxxx
D00000036500000000000000 xxxxxxxxBOBxxx
D00000036500000000000000 xxxxxxxxBOBxxx
D00000036500000000000000 xxxxxxxx   xxx
D00000036500000000000000 xxxxxxxxKIMxxx
D00000036500000000000000 xxxxxxxxKIMxxx
D00000036500000000000000 xxxxxxxx   xxx
D00000036500000000000000 xxxxxxxxDANxxx
D00000036500000000000000 xxxxxxxxDANxxx
T00000025800000000000000 xxxxxxxxxxxxxx

# 12  
Old 06-03-2014
How about this? It should be fairly easy for you to change the f (from) and t (to) variables on the command line to whatever you like:

Code:
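awk -v stamp="$(date +%Y%m%d%H%M%S)" \
    -v f="TOM PAT SAM BOB KIM" \
    -v t="TOM TOM TOM BOB KIM" '
BEGIN {
   # build the lookup table: CONV["PAT"]="TOM", CONV["BOB"]="BOB", ...
   split(f,from)
   for(i=split(t,to);i;i--) CONV[from[i]]=to[i]
}
/^H/ {header=$0 ; next}    # remember the header record
/^T/ {trailer=$0 ; next}   # remember the trailer record
{
   typ=substr($0,34,3)
   if(typ in CONV) fname="xyz_" CONV[typ] "_" stamp ".txt"
   else fname="xyz_error_" stamp ".txt"
   # first record for this file: register it and write the header
   if (!(fname in A)) {A[fname]; print header > fname}
   print $0 >> fname
   close(fname)            # stay under the open-file limit
}
END {
  # append the trailer to every file created in this run
  for (fname in A) print trailer >> fname
}' Test.txt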


If there are only a few of these from->to pairs the above should work fine, but if you find yourself with a large list it may be worth a different approach, like putting them in a separate translate.txt file and changing the code to load this first.
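
A rough sketch of that translate.txt idea, in case it helps. The file layout (one "from to" pair per line), the sample Test.txt records and the mktemp working directory are assumptions made for illustration. The mapping file is named first on the command line, so the NR==FNR rule fills CONV before any data record is read.

```shell
#!/bin/sh
# Sketch only: translate.txt layout and the sample records are assumptions.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical mapping file: type code as found in the record, then the file group.
cat > translate.txt <<'EOF'
TOM TOM
PAT TOM
SAM TOM
BOB BOB
KIM KIM
EOF

# %-33s pads each prefix to 33 characters so the type code sits at columns 34-36.
{
  printf '%s\n'      'H00000012345678900000000 xxxxxxxxxxxxxx'
  for typ in TOM PAT SAM BOB KIM '   ' DAN; do
    printf '%-33s%s\n' 'D00000036500000000000000 xxxxxxxx' "${typ}xxx"
  done
  printf '%s\n'      'T00000025800000000000000 xxxxxxxxxxxxxx'
} > Test.txt

stamp=$(date +%Y%m%d%H%M%S)

awk -v stamp="$stamp" '
# first file on the command line: load the from->to pairs, skip malformed lines
NR==FNR {if (NF >= 2) CONV[$1]=$2; next}
/^H/ {header=$0 ; next}
/^T/ {trailer=$0 ; next}
{
   typ=substr($0,34,3)
   if(typ in CONV) fname="xyz_" CONV[typ] "_" stamp ".txt"
   else fname="xyz_error_" stamp ".txt"
   if (!(fname in A)) {A[fname]; print header > fname}
   print $0 >> fname
   close(fname)
}
END {
  for (fname in A) print trailer >> fname
}' translate.txt Test.txt

ls
```

Adding a new pattern then only means adding a line to translate.txt. One caveat: NR==FNR cannot tell the two files apart if translate.txt is empty, so the sketch assumes the mapping file has at least one line.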
# 13  
Old 06-04-2014
Superb, works like a charm!

One small change: the error file has header and trailer records. Can we reduce it to carry only the actual records that have spaces in positions 34-36, as well as those records that are not part of the split condition?

Thanks
# 14  
Old 06-04-2014
Sure, this should avoid header/trailer records in error file:

Code:
awk -v stamp="$(date +%Y%m%d%H%M%S)" \
    -v f="TOM PAT SAM BOB KIM" \
    -v t="TOM TOM TOM BOB KIM" '
BEGIN {
   split(f,from)
   for(i=split(t,to);i;i--) CONV[from[i]]=to[i]
}
/^H/ {header=$0 ; next}
/^T/ {trailer=$0 ; next}
{
   typ=substr($0,34,3)
   if (typ in CONV) {
       fname="xyz_" CONV[typ] "_" stamp ".txt"
       # only mapped files are registered in A and get the header
       if (!(fname in A)) {A[fname]; print header > fname}
   }
   # the error file is never added to A, so it gets no header or trailer
   else fname="xyz_error_" stamp ".txt"
   print $0 >> fname
   close(fname)
}
END {
  # A holds only the mapped files, so the trailer skips the error file
  for (fname in A) print trailer >> fname
}' Test.txt
