Reformatting Data in AWK


 
# 1  
Old 03-12-2009
Reformatting Data in AWK

Dear AWK Users,

I have a data set that is so large (gigabytes) that it cannot be opened in the vi editor in its entirety, but I can process the whole thing in AWK. It is formatted in a regular manner: the variable descriptions (names) come first, and the values then follow in discrete batches.

datatype 1 datatype 2 datatype 3 datatype 4 datatype 5
...
...
datatype N
x1_1 x2_1 x3_1 x4_1 x5_1
...
...
xN_1
x1_2 x2_2 x3_2 x4_2 x5_2
...
...
xN_2
x1_N x2_N x3_N x4_N x5_N
...
...
xN_N

I don't need every variable, so I would like to extract specific variables from the dataset as needed and print them in the following format.

datatype 1 datatype 2 datatype 4
x1_1 x2_1 x4_1
x1_2 x2_2 x4_2
...
...
x1_N x2_N x4_N

Any help would be appreciated.

Regards,


sda_rr
# 2  
Old 03-12-2009
Post a portion of the real data and the expected output from that portion.

Regards
# 3  
Old 03-12-2009
Franklin52,

Here is a sample of the original format of the data.

Thanks for your help.

Regards,

sda_rr

SET LOCATION ANGLE1 ANGLE2 TYPE
D2 probe101 probe201 probe301 probe401
probe501 probe102 probe201 probe301 probe401
probe502 probe103 probe201 probe301 probe401
probe503 probe104 probe201 probe301 probe401
probe504 probe105 probe201 probe301 probe401
probe505 probe106 probe201 probe301 probe401
probe506 probe107 probe201 probe301 probe401
probe507 probe108 probe201 probe301 probe401
probe508 probe109 probe201 probe301 probe401
probe509 probe110 probe201 probe301 probe401
probe510 cprobe101 cprobe102 cprobe103 cprobe104
1 50 5 0 1
16 1.00E+05 1.50E+05 1.00E+05 1.50E+05
1.13E+05 1.13E+05 1.13E+05 7.50E+04 1.13E+05
8.44E+04 8.44E+04 8.44E+04 5.63E+04 8.44E+04
6.33E+04 6.33E+04 6.33E+04 4.22E+04 6.33E+04
4.75E+04 4.75E+04 4.75E+04 3.16E+04 4.75E+04
3.56E+04 3.56E+04 3.56E+04 2.37E+04 3.56E+04
2.67E+04 2.67E+04 2.67E+04 1.78E+04 2.67E+04
2.00E+04 2.00E+04 2.00E+04 1.33E+04 2.00E+04
1.50E+04 1.50E+04 1.50E+04 1.00E+04 1.50E+04
1.13E+04 1.13E+04 1.13E+04 7.51E+03 1.13E+04
8.45E+03 8.45E+03 8.45E+03 5.63E+03 8.45E+03
2 100 10 0 2
1.18E+05 1.18E+05 1.18E+05 7.88E+04 1.18E+05
8.86E+04 8.86E+04 8.86E+04 5.91E+04 8.86E+04
6.64E+04 6.64E+04 6.64E+04 4.43E+04 6.64E+04
4.98E+04 4.98E+04 4.98E+04 3.32E+04 4.98E+04
3.74E+04 3.74E+04 3.74E+04 2.49E+04 3.74E+04
2.80E+04 2.80E+04 2.80E+04 1.87E+04 2.80E+04
2.10E+04 2.10E+04 2.10E+04 1.40E+04 2.10E+04
1.58E+04 1.58E+04 1.58E+04 1.05E+04 1.58E+04
1.18E+04 1.18E+04 1.18E+04 7.88E+03 1.18E+04
8.87E+03 8.87E+03 8.87E+03 5.91E+03 8.87E+03
2.10E+00 1.05E+02 1.05E+01 0.00E+00 2.10E+00
6 300 30 0 6
1.20E+05 1.20E+05 1.20E+05 8.03E+04 1.20E+05
9.04E+04 9.04E+04 9.04E+04 6.02E+04 9.04E+04
6.78E+04 6.78E+04 6.78E+04 4.52E+04 6.78E+04
5.08E+04 5.08E+04 5.08E+04 3.39E+04 5.08E+04
3.81E+04 3.81E+04 3.81E+04 2.54E+04 3.81E+04
2.86E+04 2.86E+04 2.86E+04 1.91E+04 2.86E+04
2.14E+04 2.14E+04 2.14E+04 1.43E+04 2.14E+04
1.61E+04 1.61E+04 1.61E+04 1.07E+04 1.61E+04
1.21E+04 1.21E+04 1.21E+04 8.04E+03 1.21E+04
9.05E+03 9.05E+03 9.05E+03 6.03E+03 9.05E+03
2.14E+00 1.07E+02 1.07E+01 0.00E+00 2.14E+00
# 4  
Old 03-12-2009
You haven't specified the desired output. Assuming you want the 1st, 2nd and 4th fields:

Code:
awk '{print $1, $2, $4}' file
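
If the wanted fields change from run to run, the field numbers could be passed in instead of being hard-coded. A minimal sketch, assuming every record sits on a single line (which is not the case for the wrapped sample above); the variable name cols and the file names sample.txt and picked.txt are invented for illustration, and on Solaris nawk or /usr/xpg4/bin/awk would be needed for -v:

```shell
# Tiny one-record-per-line sample (invented data).
printf '%s\n' 'a1 b1 c1 d1 e1' 'a2 b2 c2 d2 e2' > sample.txt

# cols is a comma-separated list of the field numbers to keep.
awk -v cols='1,2,4' '
BEGIN { n = split(cols, want, ",") }     # want[1..n] hold the field numbers
{
    line = $(want[1])
    for (i = 2; i <= n; i++)
        line = line OFS $(want[i])       # join the chosen fields with OFS
    print line
}' sample.txt > picked.txt

cat picked.txt
```

With the sample above this keeps fields 1, 2 and 4 of each line.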

Regards
# 5  
Old 03-12-2009
Reformatting in AWK

Franklin52,

Apologies for not sending the desired output format. I also noticed an error in the original dataset which could have been very misleading. Please find a corrected dataset below followed by the desired output format.

regards,

sda_rr


Code:
SET LOCATION ANGLE1 ANGLE2 TYPE
D2 probe101 probe201 probe301 probe401
probe501 probe102 probe202 probe302 probe402
probe502 probe103 probe203 probe303 probe403
probe503 probe104 probe204 probe304 probe404
probe504 probe105 probe205 probe305 probe405
probe505 probe106 probe206 probe306 probe406
probe506 probe107 probe207 probe307 probe407
probe507 probe108 probe208 probe308 probe408
probe508 probe109 probe209 probe309 probe409
probe509 probe110 probe210 probe310 probe410
probe510 cprobe101 cprobe201 cprobe301 cprobe401
1 50 5 0 1
16 1.00E+05 1.50E+05 1.00E+05 1.50E+05
1.13E+05 1.13E+05 1.13E+05 7.50E+04 1.13E+05
8.44E+04 8.44E+04 8.44E+04 5.63E+04 8.44E+04
6.33E+04 6.33E+04 6.33E+04 4.22E+04 6.33E+04
4.75E+04 4.75E+04 4.75E+04 3.16E+04 4.75E+04
3.56E+04 3.56E+04 3.56E+04 2.37E+04 3.56E+04
2.67E+04 2.67E+04 2.67E+04 1.78E+04 2.67E+04
2.00E+04 2.00E+04 2.00E+04 1.33E+04 2.00E+04
1.50E+04 1.50E+04 1.50E+04 1.00E+04 1.50E+04
1.13E+04 1.13E+04 1.13E+04 7.51E+03 1.13E+04
8.45E+03 8.45E+03 8.45E+03 5.63E+03 8.45E+03
2 100 10 0 2
1.18E+05 1.18E+05 1.18E+05 7.88E+04 1.18E+05
8.86E+04 8.86E+04 8.86E+04 5.91E+04 8.86E+04
6.64E+04 6.64E+04 6.64E+04 4.43E+04 6.64E+04
4.98E+04 4.98E+04 4.98E+04 3.32E+04 4.98E+04
3.74E+04 3.74E+04 3.74E+04 2.49E+04 3.74E+04
2.80E+04 2.80E+04 2.80E+04 1.87E+04 2.80E+04
2.10E+04 2.10E+04 2.10E+04 1.40E+04 2.10E+04
1.58E+04 1.58E+04 1.58E+04 1.05E+04 1.58E+04
1.18E+04 1.18E+04 1.18E+04 7.88E+03 1.18E+04
8.87E+03 8.87E+03 8.87E+03 5.91E+03 8.87E+03
2.10E+00 1.05E+02 1.05E+01 0.00E+00 2.10E+00
6 300 30 0 6
1.20E+05 1.20E+05 1.20E+05 8.03E+04 1.20E+05
9.04E+04 9.04E+04 9.04E+04 6.02E+04 9.04E+04
6.78E+04 6.78E+04 6.78E+04 4.52E+04 6.78E+04
5.08E+04 5.08E+04 5.08E+04 3.39E+04 5.08E+04
3.81E+04 3.81E+04 3.81E+04 2.54E+04 3.81E+04
2.86E+04 2.86E+04 2.86E+04 1.91E+04 2.86E+04
2.14E+04 2.14E+04 2.14E+04 1.43E+04 2.14E+04
1.61E+04 1.61E+04 1.61E+04 1.07E+04 1.61E+04
1.21E+04 1.21E+04 1.21E+04 8.04E+03 1.21E+04
9.05E+03 9.05E+03 9.05E+03 6.03E+03 9.05E+03
2.14E+00 1.07E+02 1.07E+01 0.00E+00 2.14E+00

General Output Format
Code:
SET LOCATION ANGLE1 ANGLE2 TYPE probe101 probe102 probe103 probe104 probe105 probe106 probe107 probe108 probennn
1 50 5 0 1 1.00E+05 1.13E+05 8.44E+04 6.33E+04 4.75E+04 3.56E+04 2.67E+04 2.00E+04 ...
2 100 10 0 2 1.18E+05 8.86E+04 6.64E+04 4.98E+04 3.74E+04 2.80E+04 2.10E+04 1.58E+04 ...
6 300 30 0 6 1.20E+05 9.04E+04 6.78E+04 5.08E+04 3.81E+04 2.86E+04 2.14E+04 1.61E+04 ...
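
One possible way to get from the wrapped layout to this output is to ignore the line breaks and treat the file as a stream of whitespace-separated values: the first n values are the names, and every following run of n values is one record. A minimal sketch under that assumption; n and want are illustrative variable names, the miniature data set and file names are invented, and for the sample above n would be 60 (12 lines of 5 values). On Solaris use nawk or /usr/xpg4/bin/awk.

```shell
# Miniature data set in the same wrapped layout: 7 names, then 7 values per record.
cat > wrapped.txt <<'EOF'
SET LOCATION TYPE p101 p201
p102 p202
1 50 1 1.0 1.5
1.1 1.6
2 100 2 2.0 2.5
2.1 2.6
EOF

awk -v n=7 -v want='SET LOCATION p101 p102' '
BEGIN { nw = split(want, w, " ") }           # w[1..nw] = names of the columns to keep
{
    for (i = 1; i <= NF; i++) {
        val[++k] = $i                        # collect values regardless of line breaks
        if (k < n) continue                  # record not complete yet
        if (!seen++)                         # first complete record is the name block
            for (j = 1; j <= n; j++) col[val[j]] = j
        line = val[col[w[1]]]
        for (j = 2; j <= nw; j++)
            line = line OFS val[col[w[j]]]
        print line                           # names first, then one line per record
        k = 0
    }
}' wrapped.txt > wide.txt

cat wide.txt
```

Only one record is held in memory at a time, so the size of the input file should not matter. A name in want that does not occur in the name block simply yields an empty column.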


Last edited by radoulov; 03-12-2009 at 12:40 PM.. Reason: added code tags
# 6  
Old 03-12-2009
Something like this (use nawk or /usr/xpg4/bin/awk on Solaris):

Code:
awk 'END { print _ }
$1 ~ /^[0-9]*$/ && $NF ~ /^[0-9]*$/ {
  print _; _ = ""
  }
{ _ = _ ? _ FS $0 : $0 }
' infile
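
For anyone new to awk: the same idea can be kept in a file and run with -f, in which case the file must hold only the program text between the single quotes, not the leading awk ' or the trailing ' infile. A commented sketch of that, lightly adapted (the variable _ is renamed rec and empty records are not printed); the file names are placeholders and the small sample is invented. On Solaris, replace awk with nawk or /usr/xpg4/bin/awk.

```shell
# The awk program only: no surrounding quotes, no input file name.
cat > arrange.awk <<'EOF'
# A line whose first and last fields are plain integers starts a new record:
# print what has been collected so far and start over.
$1 ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ { if (rec != "") print rec; rec = "" }
# Append every line to the record being collected.
{ rec = (rec == "" ? $0 : rec FS $0) }
# Flush the last record at end of input.
END { if (rec != "") print rec }
EOF

# Invented sample: a name block, then two records wrapped over two lines each.
cat > test.data <<'EOF'
SET TYPE
p1 p2
1 1
1.0E+05 1.5E+05
2 2
1.2E+05 1.6E+05
EOF

awk -f arrange.awk test.data > joined.txt
cat joined.txt
```

This joins each wrapped record onto one line. It relies on record-start lines being the only ones whose first and last values are both plain integers; a data line that happens to look like that would be mistaken for the start of a record.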

# 7  
Old 03-12-2009
Reformatting in AWK

Radoulov,

Thanks for your suggestion but I am new to AWK and therefore quite lost. I incorporated the code you sent by saving it as a file called 'arrange.awk.' Then on a Solaris system, I typed:

awk -f arrange.awk test.data

But this doesn't work. Can you please give me a bit more guidance?
Thanks.

Regards,

sda-rr