Make multiple awk files into an executable


 
# 1  
Old 03-05-2012

Hello everyone,

The following are my input files.

Quote:
1.txt
chr1 14765298 14766727 def
chr1 16759093 16760238 def
chr1 16759236 16760238 def
chr1 20782516 20784428 him
chr1 20989962 20991078 her
chr2 31672150 31673532 abc
chr2 33157721 33158124 abc
chr3 34542283 34542962 abc
chr3 38248682 38251416 abc
chr4 58562053 58567653 abc
............................................
............................................
Quote:
2.txt
chr1 21438731 21439423 26.12
chr1 33939851 33940673 34.76
chr1 36779864 36780494 20.16
chr1 36817091 36817917 27.22
chr2 36977015 36977908 19.27
chr3 40475125 40475885 21.58
chr3 40483838 40484616 15.3
chr4 40502827 40503675 10.61
chr4 40532299 40533156 14.78
chr5 43593022 43594143 24.33
...............................................
................................................
The following are my sequence of steps.

Quote:
This step writes a new file that pairs each record in 2.txt with every record in 1.txt. With 10 records in each file, 1_2.txt will contain 10*10=100 records.

awk 'NR==FNR{a[FNR]=$0;max2=FNR;next} {for (i=1;i<=max2;i++) print $0,a[i]}' 1.txt 2.txt > 1_2.txt
Quote:
This step keeps only the records in 1_2.txt where column1 matches column5, because I don't want records that have different chr names. Also, please note that I don't need column5 again in my output.

awk '{if($1==$5) {print $1"\t"$2"\t"$3"\t"$4"\t"$6"\t"$7"\t"$8}}' 1_2.txt > test.tmp && mv test.tmp 1_2.txt

Quote:
This step subtracts column5 from column2 of the previous output and prints the whole record, with the subtraction result in the last column, into another file.

awk '{(v=$2-$5); {print $0"\t"v}}' 1_2.txt > 1_2_distance.txt
Quote:
Now I want to check whether the last column, the subtraction result, is in the range of -5000 to 5000, and print the whole matching record into another file.

awk '{if($8>=-5000 && $8<=5000) {print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6"\t"$7"\t"$8}}' 1_2_distance.txt > 1_2_distance_5K.txt
Can someone please let me know how to make this bunch of steps into a single script, so that I start the script with 1.txt and 2.txt and, after execution, it gives me the final 1_2_distance_5K.txt file?

Thanks in advance.
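
One way the steps above could be chained is sketched below: the four awk commands are joined with pipes inside a shell function, so no intermediate files are written. The function name combine_steps, the limit argument, and the sample_*.txt file names are assumptions made for illustration and do not appear in the post.

```shell
#!/bin/sh
# Sketch only: the four awk steps chained with pipes.
# combine_steps and limit are hypothetical names.
#   Step 1: pair each record of the second file with every record of the first.
#   Step 2: keep pairs whose names match and drop the repeated name column.
#   Steps 3 and 4: append column2 - column5, keep rows inside the range.
combine_steps() {
    # $1 = first file, $2 = second file, $3 = largest allowed distance
    awk 'NR == FNR { a[FNR] = $0; max2 = FNR; next }
         { for (i = 1; i <= max2; i++) print $0, a[i] }' "$1" "$2" |
    awk -v OFS='\t' '$1 == $5 { print $1, $2, $3, $4, $6, $7, $8 }' |
    awk -v OFS='\t' -v limit="$3" '
        { v = $2 - $5 }
        v >= -limit && v <= limit { print $0, v }'
}

# A few records copied from the post, so the sketch runs on its own.
printf '%s\n' 'chr1 20782516 20784428 him' \
              'chr1 20989962 20991078 her' \
              'chr2 31672150 31673532 abc' > sample_1.txt
printf '%s\n' 'chr1 21438731 21439423 26.12' \
              'chr2 36977015 36977908 19.27' > sample_2.txt

# None of these sample pairs is within 5000, so the demo uses a wider limit.
combine_steps sample_1.txt sample_2.txt 1000000 > sample_distance.txt
cat sample_distance.txt
```

For the real files the call would presumably look like `combine_steps 1.txt 2.txt 5000 > 1_2_distance_5K.txt`. Note that pairing every record with every other record grows quadratically, so this may be slow on large inputs.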
# 2  
Old 03-05-2012
Hi jacobs.smith,

What would the output be for the input in your first post? Please paste the input and the output you get after your process; then it would be easy to accomplish the task and check that it is correct.
# 3  
Old 03-05-2012
Hi Birei,

Thanks for your post.

I somehow missed the output.

Here is the output. A small change: in my last step, instead of -5000 to 5000, I chose -1000000 to 1000000 to suit our sample input. In that case, my final output would be this one:

Quote:
output.txt
chr1 21438731 21439423 26.12 20782516 20784428 him 656215
chr1 21438731 21439423 26.12 20989962 20991078 her 448769
Basically, I want to grab those records where column2 of file1 and column2 of file2 are within a distance of 1000000 of each other, and print the whole record from both files.

Please feel free to post any comments you might have.

Thanks in advance.
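
As a quick check of the two distances in the output above, the subtraction can be reproduced directly. This is only a sketch of the arithmetic (column2 of the 2.txt record minus column2 of the 1.txt record); the variable names him and her are chosen here to match the last column of each record.

```shell
# Distance = column2 of the 2.txt record minus column2 of the 1.txt record.
him=$(awk 'BEGIN { print 21438731 - 20782516 }')
her=$(awk 'BEGIN { print 21438731 - 20989962 }')
echo "$him $her"
```

Both values are below 1000000, which is why these two pairs are kept with the wider range.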
# 4  
Old 03-05-2012
Try the following awk script:
Code:
$ cat 1.txt
chr1 14765298 14766727 def
chr1 16759093 16760238 def
chr1 16759236 16760238 def
chr1 20782516 20784428 him
chr1 20989962 20991078 her
chr2 31672150 31673532 abc
chr2 33157721 33158124 abc
chr3 34542283 34542962 abc
chr3 38248682 38251416 abc
chr4 58562053 58567653 abc
$ cat 2.txt
chr1 21438731 21439423 26.12
chr1 33939851 33940673 34.76
chr1 36779864 36780494 20.16
chr1 36817091 36817917 27.22
chr2 36977015 36977908 19.27
chr3 40475125 40475885 21.58
chr3 40483838 40484616 15.3
chr4 40502827 40503675 10.61
chr4 40532299 40533156 14.78
chr5 43593022 43594143 24.33
$ cat script.awk
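BEGIN {
        # Expect exactly two file arguments (ARGV[0] is the awk command name).
        if ( ARGC != 3 ) {
                print "Usage: awk -f script.awk <file1> <file2>"
                exit 1
        }
}

# First file: store every record, indexed by line number.
FNR == NR {
        f1_data[ FNR ] = $0
        f1_count = FNR    # counted by hand: length() on an array is not supported by every awk
        next
}

# Second file: compare the current record with each stored record.
FNR < NR {
        for ( i = 1; i <= f1_count; i++ ) {
                nf = split( f1_data[ i ], fields )
                # Skip only this stored record; "next" here would abandon the rest of file1.
                if ( fields[1] != $1 ) {
                        continue
                }
                subtraction = fields[2] - $2
                if ( subtraction >= -1000000 && subtraction <= 1000000 ) {
                        # Rebuild the file1 record without its first (chr) column.
                        f1_line = ""
                        for ( j = 2; j <= nf; j++ ) {
                                f1_line = (j > 2 ? f1_line " " : "") fields[j]
                        }
                        printf "%s %s %d\n", $0, f1_line, subtraction
                }
        }
}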
$ awk -f script.awk 1.txt 2.txt 
chr1 21438731 21439423 26.12 20782516 20784428 him -656215
chr1 21438731 21439423 26.12 20989962 20991078 her -448769
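
A possible variation on the script above, offered as a sketch rather than a replacement: the first file is indexed by its first column, so each record of the second file is compared only with records that share its name. The difference is taken as file2 minus file1, which gives the positive values shown in the expected output of post #3. The names near_pairs, limit, and the sample_*.txt files are hypothetical.

```shell
#!/bin/sh
# Sketch: group the first file by name, then scan only same-name records.
# near_pairs and limit are hypothetical names.
near_pairs() {
    # $1 = first file, $2 = second file, $3 = largest allowed distance
    awk -v limit="$3" '
        NR == FNR {                     # first file: remember records per name
            n[$1]++
            rec[$1, n[$1]] = $0
            next
        }
        {                               # second file: scan same-name records only
            for (i = 1; i <= n[$1]; i++) {
                split(rec[$1, i], f)
                d = $2 - f[2]           # file2 column2 minus file1 column2
                if (d >= -limit && d <= limit) {
                    rest = rec[$1, i]
                    sub(/^[^ \t]+[ \t]+/, "", rest)   # drop the repeated name column
                    print $0, rest, d
                }
            }
        }' "$1" "$2"
}

# A few records copied from the thread, so the sketch runs on its own.
printf '%s\n' 'chr1 20782516 20784428 him' \
              'chr1 20989962 20991078 her' \
              'chr2 31672150 31673532 abc' > sample_1.txt
printf '%s\n' 'chr1 21438731 21439423 26.12' \
              'chr2 36977015 36977908 19.27' > sample_2.txt

near_pairs sample_1.txt sample_2.txt 1000000
```

Passing the range as an argument avoids editing the script when switching between 5000 and 1000000.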

# 5  
Old 03-06-2012
Hi Birei,

Thanks for your time.

But it is not producing any output. All I get is blank output. I did exactly what you wrote.
# 6  
Old 03-06-2012
Sure?

Same input and same awk program? Try debugging with prints inside the script to see where it fails.

I can't help much because I can't reproduce your problem, but post your OS and awk version; perhaps other users will have an idea.
# 7  
Old 03-06-2012
Hi Birei,

I tried using gawk -f script.awk and it works, but only for the input files in this post. When I try it with other files, it doesn't do anything.

I suspect the input files in this post are space separated.

Mine are tab separated.

Any thoughts?

Thanks again.
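
On the separator question, a small sketch that may help narrow things down: with awk's default field splitting, runs of spaces and tabs both separate fields, so tab-separated input on its own would not normally change which field is $2. The second command shows how FS and OFS could be set explicitly when strictly tab-separated input and output are wanted. The variable names are illustrative only.

```shell
# Default splitting: spaces and tabs both separate fields.
space_field=$(printf 'chr1 14765298 14766727 def\n' | awk '{ print $2 }')
tab_field=$(printf 'chr1\t14765298\t14766727\tdef\n' | awk '{ print $2 }')
echo "$space_field $tab_field"

# Strictly tab-separated input and output: set FS and OFS explicitly.
tab_line=$(printf 'chr1\t14765298\t14766727\tdef\n' |
    awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3 - $2 }')
printf '%s\n' "$tab_line"
```

If both fields print the same value on the real files too, the separator is probably not the cause and something else in the data differs.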