Reformat file using nawk


 
# 1  
Old 11-23-2011

Hi all, I have a file with records that look something like this,

Code:
"Transaction ID",Date,Email,"Card Type",Amount,"NETBANX Ref","Root Ref","Transaction Type","Merchant Ref",Status,"Interface ID","Interface Name","User ID"
nnnnnnnnn,"21 Nov 2011 00:10:47",someone@hotmail.co.uk,"Visa Debit",nnnn,d8rkf93jspe840fj,,"Immediate Bill",n-nnnnnnnn-nnnnnnnnnnnnnnnn-n-aaa,n,nnnnn,,system
nnnnnnnnn,"21 Nov 2011 13:46:14","None Given","Visa Debit",nnnn,nanananananananana,nanananananananana,Refund,,n,nnnnn,,aaaaaaaa-24aaaaaa-e

I need to reformat this file to look like this,

Code:
nnnnnnnnn,21/11/2011,2011-11-21 00:10:47.000000,someone@hotmail.co.uk,"Visa Debit",nnnn,nananananananana,,"Immediate Bill",n,nnnnnnnn,nnnnnnnnnnnnnnnn,n,CSS,n-nnnnnnnn-nnnnnnnnnnnnnnnn-n-CSS,n,nnnnn,nnnnn,system,,n,,
nnnnnnnnn,21/11/2011,2011-11-21 13:46:14.000000,"None Given","Visa Debit",nnnnn,nanananananananana,nanananananananana,Refund,0,00000000,0000000000000000,0," "," ",n,nnnnn," ",stockley-24nnn-e,,0,,

I currently have a piece of code that splits the file into the 2 record types (with and without a Merchant Ref in field 9) and removes the header line containing the phrase "Transaction ID" in field 1, like this,

Code:
cat /netbanx/netbanx.txt | while read LINE
do
  field9=`echo $LINE | awk -F"," '{ print $9 }'`
  echo $field9 | grep '[A-Za-z0-9]' > /dev/null
  if [ $? -ne 0 ];then
  echo "$LINE" | sed 's/^M//g' >> /netbanx/netbanx_noMref.tmp
  else
  field1=`echo $LINE | awk -F"," '{ print $1 }'`
  echo $field1 | grep "Transaction ID" > /dev/null
    if [ $? -ne 0 ];then
    echo $LINE","$field9 | awk -F"," 'BEGIN {OFS=","} split($9,aa,"-") {$9=aa[1]","aa[2]","aa[3]","aa[4]","aa[5]} {print}' | sed 's/^M//g' >> /netbanx/netbanx2.tmp
    fi
  fi
done
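
For comparison, the same split can be sketched as a single awk pass over the whole file, so that no awk, grep or sed is started per line. This is only a sketch under assumptions: split_netbanx and its three arguments are hypothetical names, and it assumes that no field before the Merchant Ref contains a comma (true for the sample above).

```shell
# Hypothetical helper: split_netbanx INPUT NOMREF_OUT MREF_OUT
# Assumes no field before the Merchant Ref (field 9) contains a comma.
split_netbanx() {
  awk -F',' -v nomref="$2" -v mref="$3" '
    BEGIN { OFS = "," }
    { gsub(/\r/, "") }                            # strip carriage returns (the ^M)
    $9 !~ /[A-Za-z0-9]/ { print > nomref; next }  # no Merchant Ref: copy unchanged
    $1 ~ /Transaction ID/ { next }                # drop the header line
    {
      ref = $9                                    # keep the full Merchant Ref
      split(ref, aa, "-")
      $9 = aa[1] "," aa[2] "," aa[3] "," aa[4] "," aa[5]
      print $0, ref > mref                        # 5 parts in place, full ref appended
    }
  ' "$1"
}
```

It would be called as split_netbanx /netbanx/netbanx.txt /netbanx/netbanx_noMref.tmp /netbanx/netbanx2.tmp, and should produce the same two files as the loop above.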

Once this has been done the noMref file is reformatted using this code,

Code:
cat /netbanx/netbanx_noMref.tmp | while read LINE
do
field1=`echo $LINE | awk -F"," '{ print $1 }'`
field3tmp=`echo $LINE | awk -F"," '{ print $2 }' | sed 's/"//g'`
field4=`echo $LINE | awk -F"," '{ print $3 }'`
field5=`echo $LINE | awk -F"," '{ print $4 }'`
field6=`echo $LINE | awk -F"," '{ print $5 }'`
field7=`echo $LINE | awk -F"," '{ print $6 }'`
field8=`echo $LINE | awk -F"," '{ print $7 }'`
field9=`echo $LINE | awk -F"," '{ print $8 }'`
field10='0'
field11='00000000'
field12='0000000000000000'
field13='0'
field14="\" "\"
field15="\" "\"
field16=`echo $LINE | awk -F"," '{ print $10 }'`
field17=`echo $LINE | awk -F"," '{ print $11 }'`
field18tmp=`echo $LINE | awk -F"," '{ print $12 }'`
field19=`echo $LINE | awk -F"," '{ print $13 }'`
field20=''
field21='0'

typeset -RZ2 day=`echo $field3tmp | awk '{ print $1 }'`
tmpmonth=`echo $field3tmp | awk '{ print $2 }'`
month=`cat netbanx_datamart_load.sh | grep "#$tmpmonth" | awk '{ print $2 }'`
year=`echo $field3tmp | awk '{ print $3 }'`
timestamp=`echo $field3tmp | awk '{ print $4 }' | cut -c1-5`

field2=`echo $day/$month/$year`
#field3=`echo $day/$month/$year $timestamp`
field3=`echo $year-$month-$day $timestamp:00.000000`

echo $field18tmp | grep '[A-Za-z0-9]' > /dev/null
if [ $? -ne 0 ];then
  field18="\" "\"
else
  field18=$field18tmp
fi

echo "$field1,$field2,$field3,$field4,$field5,$field6,$field7,$field8,$field9,$field10,$field11,$field12,$field13,$field14,$field15,$field16,$field17,$field18,$field19,$field20,$field21,," >> /netbanx/netbanx_noMref.txt
done
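
The reformat of the noMref records could likewise be sketched as one awk pass. Assumptions to note: reformat_nomref is a hypothetical name, the month number is derived from the position of the month name in a lookup string rather than by grepping the "#Nov 11" style comments in netbanx_datamart_load.sh, and the seconds are forced to 00 exactly as the loop above does (the sample output earlier keeps the seconds, so adjust if that is what is really wanted).

```shell
# Hypothetical helper: reformat_nomref INPUT   (writes the reformatted records to stdout)
reformat_nomref() {
  awk -F',' '
    BEGIN { OFS = ","; blank = "\" \"" }       # blank is the literal 3 characters " "
    {
      d = $2; gsub(/"/, "", d)                 # strip the quotes around the date
      split(d, p, " ")                         # p[1]=day p[2]=month name p[3]=year p[4]=time
      # Jan sits at position 1, Feb at 4, ... so (position + 2) / 3 is the month number.
      mm = (index("JanFebMarAprMayJunJulAugSepOctNovDec", p[2]) + 2) / 3
      date  = sprintf("%02d/%02d/%s", p[1], mm, p[3])
      stamp = sprintf("%s-%02d-%02d %s:00.000000", p[3], mm, p[1], substr(p[4], 1, 5))
      ifname = ($12 ~ /[A-Za-z0-9]/) ? $12 : blank
      print $1, date, stamp, $3, $4, $5, $6, $7, $8,
            "0", "00000000", "0000000000000000", "0", blank, blank,
            $10, $11, ifname, $13, "", "0", "", ""
    }
  ' "$1"
}
```

Usage would be something like reformat_nomref /netbanx/netbanx_noMref.tmp >> /netbanx/netbanx_noMref.txt.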

and the netbanx2.tmp file is reformatted using this code,

Code:
cat /netbanx/netbanx2.tmp | while read LINE
do
field1=`echo $LINE | awk -F"," '{ print $1 }'`
field3tmp=`echo $LINE | awk -F"," '{ print $2 }' | sed 's/"//g'`
field4=`echo $LINE | awk -F"," '{ print $3 }'`
field5=`echo $LINE | awk -F"," '{ print $4 }'`
field6=`echo $LINE | awk -F"," '{ print $5 }'`
field7=`echo $LINE | awk -F"," '{ print $6 }'`
field8=`echo $LINE | awk -F"," '{ print $7 }'`
field9=`echo $LINE | awk -F"," '{ print $8 }'`
field10=`echo $LINE | awk -F"," '{ print $9 }'`
field11=`echo $LINE | awk -F"," '{ print $10 }'`
field12=`echo $LINE | awk -F"," '{ print $11 }'`
field13=`echo $LINE | awk -F"," '{ print $12 }'`
field14=`echo $LINE | awk -F"," '{ print $13 }'`
field15=`echo $LINE | awk -F"," '{ print $18 }'`
field16=`echo $LINE | awk -F"," '{ print $14 }'`
field17tmp=`echo $LINE | awk -F"," '{ print $15 }'`
field18=`echo $LINE | awk -F"," '{ print $16 }'`
field19=`echo $LINE | awk -F"," '{ print $17 }'`
field20=''
field21='0'

typeset -RZ2 day=`echo $field3tmp | awk '{ print $1 }'`
tmpmonth=`echo $field3tmp | awk '{ print $2 }'`
month=`cat netbanx_datamart_load.sh | grep "#$tmpmonth" | awk '{ print $2 }'`
year=`echo $field3tmp | awk '{ print $3 }'`
timestamp=`echo $field3tmp | awk '{ print $4 }' | cut -c1-5`

field2=`echo $day/$month/$year`
field3=`echo $year-$month-$day $timestamp:00.000000`

echo $field17tmp | grep '[A-Za-z0-9]' > /dev/null
if [ $? -ne 0 ];then
  field18="\" "\"
else
  field18=$field17tmp
fi

echo "$field1,$field2,$field3,$field4,$field5,$field6,$field7,$field8,$field9,$field10,$field11,$field12,$field13,$field14,$field15,$field16,$field17,$field18,$field19,$field20,$field21,," >> /netbanx/netbanx2.txt
done

The 2 output files are then merged together at the end.

This script works perfectly, but it runs for about 40 minutes on an input file of about 13000 lines. The worst offending parts are the file split (7 mins) and the second reformat (30 mins), although the latter is because most of the records end up in the netbanx2.tmp file.

I really need to make this script more efficient in both run time and CPU usage, so any assistance anyone can give would be very much appreciated.

I could do this more efficiently in perl but unfortunately for various reasons this needs to be in shell script.
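
A likely reason for the run time is that each loop above starts an echo | awk pipeline for nearly every field of every line, which over 13000 lines adds up to several hundred thousand short-lived processes. If the loops have to stay in plain shell, read can do the comma splitting itself when IFS is set for it. A minimal sketch, assuming the 13-column layout of the sample; show_fields and the variable names are made up for illustration:

```shell
# Hypothetical helper: print "Transaction ID|Transaction Type|User ID" for each record.
show_fields() {
  # IFS=, applies only to read, which splits the line into the 13 named variables
  # without starting awk, echo or sed. Empty fields are kept as empty strings.
  while IFS=, read -r id date email card amount nbref rootref ttype mref status ifid ifname userid
  do
    printf '%s|%s|%s\n' "$id" "$ttype" "$userid"
  done < "$1"
}
```

The input is redirected into the loop rather than piped from cat, which also saves one process per file.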
# 2  
Old 11-23-2011
Can you display a sample of your final output?
# 3  
Old 11-24-2011
Quote:
Originally Posted by Shell_Life
Can you display a sample of your final output?
I have, it's the second box down in my post. I've only given one of each record type as I have to manually obfuscate the data for each line that I want to post.