Problem with my loop and awk script


 
# 1  
Old 08-18-2014

Sorry if this is a super simple issue, but I am extremely new to this and am trying to teach myself as I go along. Can someone please help me out?

I have a data file similar to this, covering many samples and all chromosomes:
Code:
Sample      Chr     bp           p       roh
Sample1	1	49598178	0	1
Sample1	1	49598207	0	1
Sample1	1	49598209	0	1
Sample1	2	49974371	0	1
Sample1	2	49974670	0	1
Sample1	2	49974931	0	1
Sample1	2	50003025	0.000001	1
 etc etc

I would like to determine the maximum and minimum bp for each chromosome for each sample, and then output the distance between the min and max bp for each chromosome for each sample.

As I am super new to all this, and I have no one to help me, I was wondering if someone here could help me out?

I have so far written this:
Code:
for sample in `listofsamples.txt`
do
awk 'chr=$2; for (chr=1; chr<=25; chr++) { 
     NR==1 { min=$3; max=$3; length=0; next } 
           { max < $3 {max=$3} min > $3 {min=$3} } 
     END { roh=(max-min)/1000000; print "sample", "chr", "min", "max", "length"; print $sample, chr, min, max, length }}' awktestfile.txt
done

And I keep getting the following syntax error message:
Code:
'/file: line 2: syntax error near unexpected token `do
'/file: line 2: `do

I have no idea what it means - please help? Any advice would be greatly appreciated.

Best

V

Last edited by Franklin52; 08-18-2014 at 11:08 AM.. Reason: Please use code tags
# 2  
Old 08-18-2014
This line seems wrong
Code:
for sample in `listofsamples.txt`

The backticks imply that listofsamples.txt is a command. It seems you want
to process the contents of listofsamples.txt. This would make more sense:
Code:
for sample in `cat listofsamples.txt`

but I am not sure that is what you really need. What is in the file awktestfile.txt?
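A note on the error message itself: the stray `'` at the start of each line in post #1 is what bash typically prints when a script has Windows (CRLF) line endings, because `do` followed by a carriage return is not the keyword `do`. The following is a minimal sketch, assuming that is the cause here. The function names `strip_cr` and `each_sample` are made up for illustration. The second function reads the list with `while read` rather than `for ... in` with `cat`, so that each line is taken as one sample name.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of file $1 to $2 with all
# carriage returns removed (CRLF line endings become LF).
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Hypothetical helper: print every non-empty line of the list in $1,
# one sample name per line, without word splitting or globbing.
# The "|| [ -n ... ]" part also handles a last line with no newline.
each_sample() {
    while IFS= read -r sample || [ -n "$sample" ]; do
        if [ -n "$sample" ]; then
            printf '%s\n' "$sample"
        fi
    done < "$1"
}
```

Something like `strip_cr file file.fixed` followed by `sh file.fixed` would show whether the syntax error goes away, and `each_sample listofsamples.txt` shows the names the loop would see.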
# 3  
Old 08-18-2014
Thanks blackrageous. The awktestfile.txt is the data file I am trying to analyse.

Last edited by rbatte1; 08-19-2014 at 01:39 PM..
# 4  
Old 08-18-2014
For the sample input shown, what output are you hoping to produce?

Are all lines for a given Sample/Chr pair on adjacent lines in your input file? And, if not, how big are your input files?
# 5  
Old 08-19-2014
Thanks Don. I have a single data file that simply lists all base pairs for all chromosomes for the samples, one under the other, so the file is quite large (there are over 3000 samples). Ideally I would like to output, for each sample, the max and min bp for each chromosome, e.g.
Code:
Sample1	1	49598178	49598209
Sample1	2	49974371	50003025

From this I think I should be able to continue my analysis.

Thanks

Last edited by Franklin52; 08-19-2014 at 05:58 AM.. Reason: Fixed code tags
# 6  
Old 08-19-2014
Try
Code:
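awk     'NR==1          {next}                  # skip the header line
                        {samchr=$1" "$2}        # key: sample and chromosome
         # on a new key, report the previous group (there is none yet on the first data line)
         samchr != osc  {if (NR > 2) print osc, MIN, MAX, (MAX-MIN)/10000; osc=samchr; MIN=1E24; MAX=-1E24}
         $3 > MAX       {MAX = $3}
         $3 < MIN       {MIN = $3}
         END            {print samchr, MIN, MAX, (MAX-MIN)/10000}   # report the last group
        ' file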
Sample1 1 49598178 49598209 0.0031
Sample1 2 49974371 50003025 2.8654
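The script above relies on all lines for a Sample/Chr pair being adjacent in the input. If that is not guaranteed, a sketch along the following lines keeps a running minimum and maximum per pair in awk arrays instead. It assumes whitespace-separated columns and a single header line, as in the sample data. The wrapper name `bp_range` is made up for illustration, and the divisor 10000 is simply copied from the script above.

```shell
#!/bin/sh
# Hypothetical wrapper: print "sample chr min max span" for every
# Sample/Chr pair in the file given as $1, whatever the line order.
bp_range() {
    awk 'NR == 1 { next }                      # skip the header line
         {
             key = $1 " " $2                   # sample and chromosome
             # "+ 0" forces a numeric comparison of the stored values
             if (!(key in min) || $3 + 0 < min[key] + 0) min[key] = $3
             if (!(key in max) || $3 + 0 > max[key] + 0) max[key] = $3
         }
         END {
             # array order is unspecified in awk, so sort afterwards
             for (key in min)
                 print key, min[key], max[key], (max[key] - min[key]) / 10000
         }' "$1" | sort -k1,1 -k2,2n
}
```

Called as `bp_range awktestfile.txt`, this holds one min and one max per pair in memory, so with about 3000 samples and 25 chromosomes the arrays stay small even for a large input file.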

# 7  
Old 08-19-2014
Thanks RudiC, that's amazing. I always seem to overcomplicate it.