awk logic and math help


 
# 1  
Old 12-05-2011

Hi,

My file has 2 fields and millions of lines.
Code:
variableStep chrom=Uextra span=25
201     0.5952
226     0.330693
251     0.121004
276     0.0736858
301     0.0646982
326     0.0736858
401     0.2952
426     0.230693
451     0.221004
476     0.2736858

Each line is either a header (variableStep . . . .) or 2 fields: a coordinate (each coordinate is 25 greater than the previous one) and the value associated with it. Notice that the coordinates start at 201; that is because the program I used doesn't output a coordinate ($1) when its value ($2) is 0. I'm writing a program that is supposed to take the file and insert the missing coordinates, with zeros as their values, in the gaps, starting from the first coordinate (1, as in the desired output below).

So far my program looks like this....
Code:
echo Enter the name of the file without the .wig
read NAME
echo Thanks, inserting zeros
awk -v cord=$cord '{if($1 == "variableStep") {cord = 1;{print $0}} else {if ($1 != cord) {print cord "\t" 0} else {print $1 "\t" $2} {cord += 25}}}' ${NAME}.wig > ${NAME}zeros.wig

and I end up with
Code:
variableStep chrom=Uextra span=25
1       0
26      0
51      0
76      0
101     0
126     0
151     0
176     0
201     0
226     0
251     0

What I want the program to do is output
Code:
variableStep chrom=Uextra span=25
1       0
26      0
51      0
76      0
101     0
126     0
151     0
176     0
201     0.5952
226     0.330693
251     0.121004
276     0.0736858
301     0.0646982
326     0.0736858
351     0
376     0
401     0.2952
426     0.230693
451     0.221004
476     0.2736858

I can do this in Excel, but it takes forever... please help!

Thanks


Last edited by Franklin52; 12-07-2011 at 05:29 AM.. Reason: Please use code tags for code and data samples, thank you
# 2  
Old 12-05-2011
Try this:

Code:
awk -v cord="${cord:-25}" '
    /variableStep/ {
        if( !printed++ )          # print the header only the first time it is seen
            print;
        next
    }
    {
        if( $1 > last+cord)      # current isn't in sequence, catch up
            for( i = last+cord; i < $1; i+= cord )
                printf( "%d 0\n", i );
        printf( "%s %s\n", $1, $2 );
        last = $1;
    }
' "${NAME}.wig" > "${NAME}zeros.wig"
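
One thing to watch with the loop above: `last` starts out as 0, so the leading gap is filled at 25, 50, ..., 200, while post #1 asked for the 1, 26, ..., 176 grid. The program in post #1 printed only zeros for a related reason: every time it printed a zero row it also used up one input line, so the real rows were thrown away. Here is a minimal sketch of one way to handle both. It assumes each header starts a block whose coordinates begin at 1, and that every header should be kept. `fill_gaps`, `step` and `expected` are names made up for this example.

```shell
# Hypothetical helper: fill missing coordinates with zero rows on a 1, 26, 51, ... grid.
# Assumption: every "variableStep" header starts a new block that begins at coordinate 1.
fill_gaps() {
    awk -v step=25 '
        BEGIN { expected = 1 }          # also covers input without a header
        $1 == "variableStep" {          # header: print it and restart the grid
            print
            expected = 1
            next
        }
        NF >= 2 {
            # emit zero rows until the coordinate on this line is reached
            while (expected < $1) {
                print expected "\t" 0
                expected += step
            }
            print $1 "\t" $2            # then the real row, tab separated
            expected = $1 + step
        }
    ' "$@"
}
```

It could be called as `fill_gaps "${NAME}.wig" > "${NAME}zeros.wig"`. The `while` loop is the important part: it keeps printing zero rows for the same input line until the grid has caught up, so no real row is lost.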


Last edited by agama; 12-05-2011 at 09:37 PM.. Reason: comments
# 3  
Old 12-06-2011
Thanks agama. Your code helped a lot. I should have been more specific about my file: it contains multiple variableStep headers, each appearing after a few hundred thousand lines of the two number fields. I also forgot to mention that I want the output separated by tabs instead of spaces. I made some minor changes and the program works pretty well, but for some reason the spacing produced by the tabs I put in with \t isn't consistent, and I would still like to make the fields line up. Here is the code I have now.
Code:
echo Enter the name of the file without the .wig
read NAME
echo Thanks, inserting zeros

awk -v cord=${cord:-25} '{if($1 == "variableStep") {print $0} else
    {
        if( $1 > last+cord)
            for( i = last+cord; i < $1; i+= cord )
                printf( "%d\t0\n", i"\t" );
        printf( "%s %s\n", $1, "\t" $2 );
        last = $1;
    }}
' ${NAME}.wig > ${NAME}zeros.wig

Code:
6466951     0.316587
6466976     0.5952
6467001     0.2976
6467026     0.2976
6467051     0.2976
6467076     0.437651
6467101    0
6467126    0
6467151    0
6467176    0
6467201    0
6467226    0
6467251    0
6467276    0
6467301    0
6467326    0
6467351    0

Edit
Okay, for some reason the forum spaces all the fields the same distance apart, but in the file the space between $1 and $2 isn't consistent.


Last edited by Franklin52; 12-07-2011 at 05:51 AM.. Reason: Please use code tags for code and data samples, thank you
# 4  
Old 12-06-2011
That's what code tags are for, try using them.

---------- Post updated at 01:53 PM ---------- Previous update was at 01:50 PM ----------

Code:
6466951     0.316587
6466976     0.5952
6467001     0.2976
6467026     0.2976
6467051     0.2976
6467076     0.437651
6467101    0
6467126    0
6467151    0
6467176    0
6467201    0
6467226    0
6467251    0
6467276    0
6467301    0
6467326    0
6467351    0

# 5  
Old 12-06-2011
What is a code tag?
# 6  
Old 12-06-2011
Quote my post and see.
# 7  
Old 12-06-2011
Quote:
awk -v cord=${cord:-25} '{if($1 == "variableStep") {print $0} else
{
if( $1 > last+cord)
for( i = last+cord; i < $1; i+= cord )
printf( "%d\t0\n", i"\t" ); #bad form to add tabs outside of format
printf( "%s %s\n", $1, "\t" $2 ); #your misalignment is because you added the tab to $2
last = $1;
}}
' ${NAME}.wig > ${NAME}zeros.wig

Code:
awk '
    /variableStep/ {
        if( !printed++ )
            print;
        next
    }
    {
        if( $1 > last+25 )
            for( i = last+25; i < $1; i+= 25 )
                printf( "%10d\t%10.5f\n", i, 0 );    # use field widths to help align columns
        printf( "%10d\t%10.5f\n", $1, $2 );
        last = $1;
    }
' "${NAME}.wig" > "${NAME}zeros.wig"

Sample output:
Code:
spot:[/home/scooter/src/test]t34 <t34.data3|more
        25    0.00000
        50    0.00000
        75    0.00000
       100    0.00000
       125    0.00000
       150    0.00000
       175    0.00000
       200    0.00000
       201    0.59520
       226    0.33069
       251    0.12100
       276    0.07369
       301    0.06470
       326    0.07369
       351    0.00000
       376    0.00000
       401    0.29520
       426    0.23069
       451    0.22100
       476    0.27369
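
Two side notes on the formatted version above. `%10.5f` rounds the values to five decimals (0.330693 shows up as 0.33069 in the sample output), and the fields are padded with spaces on top of the tab. If the goal is a plain tab-separated file with the values left exactly as they were, something along these lines may be closer. It is only a sketch: `to_tsv` is a name made up for this example, and it assumes each data row has the coordinate in $1 and the value in $2.

```shell
# Hypothetical helper: rewrite each data row as exactly "coordinate<TAB>value",
# leaving the value text untouched (no rounding, no padding).
to_tsv() {
    awk -v OFS='\t' '
        $1 == "variableStep" { print; next }   # headers pass through unchanged
        NF >= 2 { print $1, $2 }               # the comma inserts OFS, i.e. one tab
    ' "$@"
}

# To inspect the real separators rather than how a terminal or forum renders them,
# pipe a few rows through od -c, where a tab is shown as \t:
#   to_tsv "${NAME}zeros.wig" | head -n 5 | od -c
```

A tab only moves the cursor to the next tab stop, so columns can look ragged on screen even when there is exactly one tab between the fields. Checking the bytes with `od -c` tells you whether the file itself is consistent.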
