Challenging Awk array problem


 
# 8  
Old 05-21-2010
The other condition that must be met isn't met: chr5 == chr1 (false)
# 9  
Old 05-21-2010
...doh!

time for bed!

Code:
$ tr -s ' ' <Edit1
607 687 174 0 0 chr1 3000001 3000156 -194195276 - L1_Mur2 LINE L1 -4310 1567 1413 1
607 917 214 114 45 chr1 3000237 3000733 -194194699 - L1_Mur2 LINE L1 -4488 1389 913 1
607 215 31 0 30 chr1 3000733 3000766 -194194666 + (TTTG)n Simple_repeat Simple_repeat 2 33 0 2
607 845 233 76 114 chr1 3000766 3000792 -194194640 - L1_Mur2 LINE L1 -6816 912 887 1
607 621 250 65 37 chr1 3001287 3001583 -194193849 - Lx9 LINE L1 -1596 6048 5742 3
607 1320 197 332 7 chr1 37600000 37676290 -194193427 - RLTR25A LTR ERVK 0 1028 625 4

$ tr -s ' ' <Edit2
4|17999 - gi|149361523|ref|NC_000074.5|NC_000074 chr1 3000072 TTTATCGTCATCGTC
28|3721 + gi|149352351|ref|NC_000069.5|NC_000069 chr3 154935392 GAGTTTTACAGTCCA
28|3721 + gi|149288852|ref|NC_000067.5|NC_000067 chr1 152633707 GAGTTTTACAGTCCA
28|3721 + gi|149361432|ref|NC_000073.5|NC_000073 chr7 86595415 GAGTTTTACAGTCCA
34|3145 - gi|149321426|ref|NC_000084.5|NC_000084 chr18 43464724 ACGGCTTACGA
34|3145 - gi|149354224|ref|NC_000071.5|NC_000071 chr1 37676290 ACGGCTTACGA

$ paste -d\\n Edit1 Edit2 |awk '{chr=$6; min=$7; max=$8; s=$11" "$12" "$13; getline; if (chr==$4 && $5>=min && $5<=max) print $0;}'
4|17999 - gi|149361523|ref|NC_000074.5|NC_000074 chr1  3000072  TTTATCGTCATCGTC
34|3145  - gi|149354224|ref|NC_000071.5|NC_000071 chr1  37676290 ACGGCTTACGA

# 10  
Old 05-21-2010
///Quick question for the OP...are the files sorted and the records are guaranteed in the same order? Otherwise, what's the key to tie the records? I ask since your initial evaluation seems to focus on flags like chr1...///

Thank you very much, guys. I did not expect such quick responses. Somehow I did not get email alerts either.

To answer your question, file 2 is sorted by 'chr' (ascending) and file 1 is sorted by field 1 (ascending). I have not tried the code given here yet. I will check it out.

Thanks a ton you guys - you rock.

---------- Post updated at 09:12 PM ---------- Previous update was at 09:08 PM ----------

Quote:
Originally Posted by polsum

To answer your question, file 2 is sorted by 'chr' (ascending) and file 1 is sorted by field 1 (ascending).
oopsy - file 2 is sorted by field 1 and file 1 is sorted by chr (chr1 to chr19).
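
Since the two files are sorted on different keys, the matching records are not guaranteed to sit on the same line number in both files. A minimal sketch of an order-independent variant follows. It assumes the goal is the one implied by the one-liner in this thread: print each file 2 record whose chromosome (field 4) and position (field 5) fall inside a file 1 interval (fields 6 to 8), with file 1 fields 11-13 appended. The sample rows are cut down from the listings above, and the array names are made up for illustration.

```shell
dir=$(mktemp -d)
# Two intervals from Edit1 and three reads from Edit2, deliberately in a different order.
printf '%s\n' \
  '607 687 174 0 0 chr1 3000001 3000156 -194195276 - L1_Mur2 LINE L1 -4310 1567 1413 1' \
  '607 1320 197 332 7 chr1 37600000 37676290 -194193427 - RLTR25A LTR ERVK 0 1028 625 4' > "$dir/Edit1"
printf '%s\n' \
  '34|3145 - gi|149354224|ref|NC_000071.5|NC_000071 chr1 37676290 ACGGCTTACGA' \
  '28|3721 + gi|149352351|ref|NC_000069.5|NC_000069 chr3 154935392 GAGTTTTACAGTCCA' \
  '4|17999 - gi|149361523|ref|NC_000074.5|NC_000074 chr1 3000072 TTTATCGTCATCGTC' > "$dir/Edit2"

matches=$(awk '
NR == FNR {                          # first file: remember every interval
    n++
    chr[n] = $6
    min[n] = $7 + 0                  # "+ 0" forces a numeric comparison later
    max[n] = $8 + 0
    tag[n] = $11 " " $12 " " $13
    next
}
{                                    # second file: test the read against each interval
    pos = $5 + 0
    for (i = 1; i <= n; i++)
        if (chr[i] == $4 && pos >= min[i] && pos <= max[i])
            print $0, tag[i]
}' "$dir/Edit1" "$dir/Edit2")
printf '%s\n' "$matches"
```

This scans every interval for every read, so it is only a starting point for large files; grouping the intervals by chromosome would cut the work considerably.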

---------- Post updated at 09:47 PM ---------- Previous update was at 09:12 PM ----------

Quote:
Originally Posted by curleb

$ paste -d\\n Edit1 Edit2 |awk '{chr=$6; min=$7; max=$8; s=$11" "$12" "$13; getline; if (chr==$4 && $5>=min && $5<=max) print $0;}'
Sorry for being so dumb, but when I execute this code on my XP computer I get all kinds of errors, like "The system cannot find the path specified" and, at s=$11^, "unexpected newline or end of string". Can you please tell me what I am doing wrong? Thanks.
# 11  
Old 05-21-2010
That code you're using is incomplete. When reformatting my code, curleb neglected (or chose) to omit ", s" after the "print $0". Take a look at the code as I posted it to see what I'm talking about: https://www.unix.com/302423713-post4.html Without that bit, the code will not append fields 11-13 when it should.
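
For reference, here is a sketch of the quoted command with ", s" restored. It assumes that ", s" is the only difference from the command quoted in this thread; the sample files are two-line extracts of the Edit1 and Edit2 listings above, written to a temporary directory.

```shell
dir=$(mktemp -d)
# Two-line extracts of the Edit1 / Edit2 listings shown earlier in the thread.
printf '%s\n' \
  '607 687 174 0 0 chr1 3000001 3000156 -194195276 - L1_Mur2 LINE L1 -4310 1567 1413 1' \
  '607 1320 197 332 7 chr1 37600000 37676290 -194193427 - RLTR25A LTR ERVK 0 1028 625 4' > "$dir/Edit1"
printf '%s\n' \
  '4|17999 - gi|149361523|ref|NC_000074.5|NC_000074 chr1 3000072 TTTATCGTCATCGTC' \
  '34|3145 - gi|149321426|ref|NC_000084.5|NC_000084 chr18 43464724 ACGGCTTACGA' > "$dir/Edit2"

# Same one-liner as quoted above, with ", s" put back after "print $0".
result=$(paste -d'\n' "$dir/Edit1" "$dir/Edit2" |
  awk '{chr=$6; min=$7; max=$8; s=$11" "$12" "$13; getline; if (chr==$4 && $5>=min && $5<=max) print $0, s}')
printf '%s\n' "$result"
```

Only the first pair matches here (chr1, position 3000072 inside 3000001 to 3000156), so the single output line ends in "L1_Mur2 LINE L1". The second pair fails the chr test (chr1 versus chr18).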

That shouldn't be the source of your errors, though. Have you run unix tools on this windows machine before? Can you confirm that you have paste and awk?

Regards,
Alister
# 12  
Old 05-21-2010
Neglected is more accurate. Intrigued by that use of paste. Sorry.
# 13  
Old 05-22-2010
Quote:
Originally Posted by alister
That shouldn't be the source of your errors, though. Have you run unix tools on this windows machine before? Can you confirm that you have paste and awk?

Regards,
Alister
I usually run simple scripts on my machine and they work fine. I generally run a program like this: gawk "code"

But I am a bit confused by
paste -d\\n file1 file2 |

I know I should not type "paste". Files 1 and 2 are in the install directory of gawk, so I guess I should not use \\n either. But then I still get the "s=$11^ unexpected newline or end of string" error.
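
One likely cause of that error, assuming the command is typed at the Windows cmd.exe prompt: cmd.exe does not treat single quotes as quoting characters, and if the program is wrapped in double quotes instead, the double quote right after $11 ends the argument early, which would fit the error appearing at s=$11. A common workaround is to keep the awk program in a file and run it with -f, so that no command-line quoting is involved. The sketch below shows the idea from a POSIX shell; the file name pairs.awk and the one-line sample files are made up for illustration.

```shell
dir=$(mktemp -d)
# pairs.awk is a hypothetical file name; the program is the thread's one-liner, reformatted.
cat > "$dir/pairs.awk" <<'EOF'
{
    # Line from file 1: remember chromosome, range, and fields 11-13.
    chr = $6; min = $7; max = $8
    s = $11 " " $12 " " $13
    # Replace the current record with the paired line from file 2; stop if there is none.
    if ((getline) <= 0) exit
    if (chr == $4 && $5 >= min && $5 <= max) print $0, s
}
EOF

# One sample pair taken from the Edit1 / Edit2 listings above.
printf '%s\n' '607 1320 197 332 7 chr1 37600000 37676290 -194193427 - RLTR25A LTR ERVK 0 1028 625 4' > "$dir/file1"
printf '%s\n' '34|3145 - gi|149354224|ref|NC_000071.5|NC_000071 chr1 37676290 ACGGCTTACGA' > "$dir/file2"

out=$(paste -d'\n' "$dir/file1" "$dir/file2" | awk -f "$dir/pairs.awk")
printf '%s\n' "$out"
```

With gawk the last step would be gawk -f pairs.awk. How the \\n after -d reaches paste also depends on the shell, so on Windows that argument may need adjusting for the particular port of paste in use.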
# 14  
Old 05-22-2010
What do you mean you should not type "paste"? Or that you should not use \\n? "paste" is the name of the command. You must type it. "\\n" is an option argument to paste that tells it to use a newline when merging the two files; it is crucial that it is used. You should enter the code exactly as posted: "paste -d\\n file1 file2 | awk....". If the data files aren't named file1 and file2, then change the filenames to point to the correct locations, but nothing else.
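
To make the role of paste concrete, here is a small sketch with two throwaway files (the contents a1, a2, b1, b2 are made up). With a newline as the delimiter, paste interleaves the files line by line, which is why the awk code can read a line from the first file and then fetch its partner from the second with getline.

```shell
dir=$(mktemp -d)
printf '%s\n' a1 a2 > "$dir/file1"
printf '%s\n' b1 b2 > "$dir/file2"

# -d'\n' hands paste the same two characters (backslash, n) that the unquoted -d\\n does in a POSIX shell.
merged=$(paste -d'\n' "$dir/file1" "$dir/file2")
printf '%s\n' "$merged"
```

The result is four lines in the order a1, b1, a2, b2: every odd line comes from file1 and every even line from file2.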