formatting data file with awk or sed


 
# 1  
Old 03-25-2010
Hi,

I have a (quite large) data file which looks like:
_____________
header part..
more header part..
x1 x2 x3 x4 x5 x6
x7 x8 x9 x10 x11 x12
x13 ...
... x59 x60
y1 y2 y3 y4...
... y100
______________

where x1, x2,...,x60 and y1, y2,...,y100 are 10-digit numbers (so each full line holds 6 numbers of 10 digits plus 5 spaces: 65 characters).
The header spans 80 lines. The real data starts at line 81.
I would like to have an output like this:
______________
x1 y1
x1 y2
x1 y3
x1 y4
...

x2 y1
x2 y2
x2 y3
...
...

x60 y98
x60 y99
x60 y100
______________

Can anybody tell me how I can get this? Maybe using sed, awk, or perl?
Any help would be much appreciated!
# 2  
Old 03-25-2010
Code:
awk 'NR==FNR && NR>=91 {for (i=1;i<=NF;i++) {y[++j]=$i}} 
     NR>FNR && FNR>80 && FNR<91 {for (k=1;k<=NF;k++) {for (x=1;x<=j;x++) print $k,y[x]}}' urfile urfile
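
The one-liner above reads the file twice: on the first pass (NR==FNR) it stores the y values, and on the second pass it prints every x value next to every stored y. Here is a minimal sketch of the same idea with the line counts passed in as parameters. The function name `cross_xy` and its arguments are hypothetical, chosen only for illustration:

```shell
# Hypothetical helper: cross_xy FILE HDR XLINES
#   HDR    = number of header lines (80 in the question)
#   XLINES = number of lines holding x values (10 in the question)
cross_xy() {
    awk -v hdr="$2" -v xl="$3" '
        # Pass 1 (NR==FNR): store every y value found after the x lines.
        NR == FNR {
            if (FNR > hdr + xl)
                for (i = 1; i <= NF; i++) y[++n] = $i
            next
        }
        # Pass 2: for each x value, print it next to every stored y.
        FNR > hdr && FNR <= hdr + xl {
            for (k = 1; k <= NF; k++)
                for (j = 1; j <= n; j++) print $k, y[j]
        }
    ' "$1" "$1"
}
```

For the file described in the first post this would be called as `cross_xy urfile 80 10`. Like the original, it prints no blank line between x groups; adding `print ""` after the inner loop would give the spacing shown in the requested output.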

# 3  
Old 03-25-2010
One way to do it with Perl -

Code:
perl -ne 'chomp; if (/^\d+/ && $.<=80){$i++<=9 ? push @x,split/ /,$_ : push @y,split/ /,$_} END{foreach $i(@x){foreach $j(@y){print "$i\t$j\n"}}}' yourfile

tyler_durden

---------- Post updated at 09:55 PM ---------- Previous update was at 09:45 PM ----------

Here's the test on a dummy file with similar structure, on my system:

Code:
$ 
$ cat -n data.txt
     1  header line 1
     2  header line 2
     3  header line 3
     4  123 456
     5  901 234
     6  000 111
     7  666 777
     8  334
     9  real data line 1
    10  real data line 2
$ 
$ perl -ne 'chomp; if (/^\d+/ && $.<=8){$i++<=1 ? push @x,split/ /,$_ : push @y,split/ /,$_}
            END {foreach $i(@x){foreach $j(@y){print "$i\t$j\n"}}}' data.txt
123     000
123     111
123     666
123     777
123     334
456     000
456     111
456     666
456     777
456     334
901     000
901     111
901     666
901     777
901     334
234     000
234     111
234     666
234     777
234     334
$ 
$

Line nos. 4 and 5 consist of x values (123, 456, 901, 234).
Line nos. 6, 7 and 8 consist of y values (000, 111, 666, 777, 334).
Line no. 8 in my file corresponds to line no. 80 in yours.
I test for $i++ <= 1 because only the first two matching lines contain x values. You'd test for $i++ <= 9 because the first 10 matching lines contain x values in your file.

HTH,
tyler_durden
# 4  
Old 03-26-2010
Thanks for your replies, but I couldn't get them to work with my file.
I attach here the file I need to work with (data.txt), and I'll try to explain exactly what I want.
I have a file which has this structure:

Quote:
header
more header
more header..
x1 x2 x3 x4 x5 x6
x7 x8 ...
...
... x59 x60
y1 y2 y3 y4 y5 y6
y7 y8 ...
...
... y99 y100
y1 y2 y3 y4 y5 y6
y7 y8 ...
...
... y99 y100
p1 p2 p3 p4 p5 p6
p7 p8 ...
...
... p59 p60
d1 d2 d3 d4 d5 d6
d7 d8 ...
...
...d59 d60
e1 e2 e3 e4 e5 e6
e7 e8 ...
...
... e59 e60
f1 f2 f3 f4 f5 f6
f7 f8 ...
...
... f59 f60
p61 p62 p63 p64 p65 p66
p67 p68 ...
...
...p119 p120
d61 d62 d63 d64 d65 d66
d67 d68 ...
...
...d119 d120
e61 e62 e63 e64 e65 e66
e67 e68 ...
...
... e119 e120
f61 f62 f63 f64 f65 f66
f67 f68 ...
...
... f119 f120
p121 p122 p123 ...
...
...
...
...f5999 f6000
What I need is this output file:

Quote:
x1 y1 p1 e1 d1 f1
x1 y2 p2 e2 d2 f2
x1 y3 p3 e3 d3 f3
...
x1 y100 p100 e100 d100 f100

x2 y1 p101 e101 d101 f101
x2 y2 p102 e102 d102 f102
x2 y3 p103 e103 d103 f103
...
x2 y100 p200 e200 d200 f200

x3 y1 p201 e201 d201 f201
x3 y2 p202 e202 d202 f202
...
...
x60 y1 p5901 e5901 d5901 f5901
...
x60 y100 p6000 e6000 d6000 f6000
In the attached file, the header spans 14 lines. Lines 15 to 24 (inclusive) hold the X values. Lines 25 to 41 hold the Y values, which are repeated on lines 42 to 58.
Lines 61 to 70 hold the first 60 P values, lines 71 to 80 the first E values, ...
Lines 101 to 110 hold the values P61, P62, ....

In total, there are 60 different values for X, 100 different values for Y, and 6000 different values for P,E,D, and F.

I hope I have explained my problem clearly.
Please help me!
# 5  
Old 03-26-2010
Can someone help me please?
# 6  
Old 03-26-2010
I finally got the solution!
After googling a lot and with your help, I solved my problem.

I attach the solution here ("phoenics2gnuplot.txt"). An example input file looks like the file ("data.txt") that I attached in an earlier post.
Thanks!
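
The attached script is not reproduced in this thread. For readers who land here, this is a rough awk sketch of one way to produce the layout requested in post #4. It is not the poster's solution. The function name `reshape` and its parameters are hypothetical, and it rests on several assumptions: X starts at line 15, Y at line 25, the P/D/E/F blocks start at line 61 and cycle in the order P, D, E, F (as in the quoted layout), everything between the first Y block and line 61 is ignored, and value number k = (i-1)*100 + j goes with x_i and y_j (as in the requested output).

```shell
# Hypothetical helper: reshape FILE [XS YS DS NX NY PER]
#   XS, YS, DS = first line of the X values, Y values and P/D/E/F blocks
#   NX, NY     = number of x and y values; PER = values per line
# Defaults follow the line numbers given in post #4 (assumptions).
reshape() {
    awk -v xs="${2:-15}" -v ys="${3:-25}" -v ds="${4:-61}" \
        -v nx="${5:-60}" -v ny="${6:-100}" -v per="${7:-6}" '
    BEGIN {
        xl = int((nx + per - 1) / per)   # lines per block of nx values
        yl = int((ny + per - 1) / per)   # lines holding the ny y values
    }
    NR >= xs && NR < xs + xl { for (i = 1; i <= NF; i++) x[++cx] = $i; next }
    NR >= ys && NR < ys + yl { for (i = 1; i <= NF; i++) y[++cy] = $i; next }
    NR >= ds {
        b = int((NR - ds) / xl) % 4      # 0=P 1=D 2=E 3=F (assumed block order)
        for (i = 1; i <= NF; i++) v[b, ++n[b]] = $i
    }
    END {
        for (i = 1; i <= nx; i++) {
            for (j = 1; j <= ny; j++) {
                k = (i - 1) * ny + j     # index used in the requested output
                # Columns: x y p e d f, as in the requested output
                print x[i], y[j], v[0, k], v[2, k], v[1, k], v[3, k]
            }
            print ""                     # blank line between x groups
        }
    }' "$1"
}
```

With the defaults, `reshape data.txt > out.txt` would be the call for a file laid out as assumed above. If the real block order or start lines differ, the positional parameters and the column order in the `print` statement need adjusting.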