Count lines


 
# 1  
Old 05-17-2013
Count lines

Hello,

I have a file with two columns like the following

FILE1:

Code:
chr1	61042
chr1	61153
chr1	61446
chr1	61457
chr1	61621
chr10	61646
chr10	61914
chr10	62024
chr10	62782

Also, I have another file:

FILE2:

Code:
chr1	61150	61600
chr10	61675	62200

Now, for each line in FILE2 that specifies a range, I want to count the number of lines in FILE1 that fall within that range (same chromosome, position between the two bounds).

Sample output:

Code:
chr1	61150	61600	3
chr10	61675	62200	2

Could anyone suggest how to solve this?

Thanks in advance,

Last edited by Scrutinizer; 05-18-2013 at 02:05 AM.. Reason: code tags
# 2  
Old 05-17-2013
rkk,
Try this :

Code:
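while read -r line
do
  # Split the range line into chromosome, lower bound and upper bound
  fst=`echo "$line" | awk '{print $1}'`
  min=`echo "$line" | awk '{print $2}'`
  max=`echo "$line" | awk '{print $3}'`
  cnt=0
  while read -r line1
  do
    chr=`echo "$line1" | awk '{print $1}'`
    val=`echo "$line1" | awk '{print $2}'`
    # Count only positions on the same chromosome that fall inside the range
    if [[ $chr == "$fst" && $val -le $max && $val -ge $min ]]
    then
      cnt=`expr $cnt + 1`
    fi
  done < file1
  echo "$fst $min $max $cnt" >> output
done < file2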

Thanks,
Vijay
# 3  
Old 05-17-2013
An awk approach:
Code:
awk -F'\t' '
        NR == FNR {
                A[$1] = $2 "," $3
                next
        }
        $1 in A {
                n = split( A[$1], V, "," )
                if ( $2 >= V[1] && $2 <= V[2] )
                        R[$1 OFS V[1] OFS V[2]]++
        }
        END {
                for ( k in R )
                        print k, R[k]
        }
' file2 file1

---------- Post updated at 15:33 ---------- Previous update was at 15:29 ----------

@vmenon, if you are writing a shell script, always use shell built-ins wherever possible.

There is no need to call awk to split each line into fields. You can read each field into its own variable directly in the while loop's read statement:
Code:
while read f1 f2 f3
do
       ....
done < file2
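
For what it's worth, the built-ins-only idea above might be carried through to a complete loop roughly like this. It is only a sketch: the function name count_with_builtins and its two arguments (ranges file first, positions file second) are made up for illustration, and it assumes whitespace-separated columns and integer positions. Note that it also compares the chromosome name, not just the position.

```shell
# Sketch only: count_with_builtins RANGES POSITIONS (hypothetical helper name).
# Uses nothing but shell built-ins: read, test, arithmetic expansion, printf.
count_with_builtins() {
    while read -r chr min max
    do
        # Skip blank lines in the ranges file
        [ -n "$chr" ] || continue
        cnt=0
        while read -r c pos
        do
            # Same chromosome and position inside the inclusive range
            if [ "$c" = "$chr" ] && [ "$pos" -ge "$min" ] && [ "$pos" -le "$max" ]
            then
                cnt=$((cnt + 1))
            fi
        done < "$2"
        printf '%s\t%s\t%s\t%s\n' "$chr" "$min" "$max" "$cnt"
    done < "$1"
}
```

It would be called as count_with_builtins file2 file1. It still rereads the positions file once per range, so for large inputs a single awk pass is likely to be much faster.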

This User Gave Thanks to Yoda For This Post:
# 4  
Old 05-17-2013
Use bedtools intersect; that will be easy.
# 5  
Old 05-17-2013
Another awk:

Code:
awk 'NR==FNR{aL[$1]=$2;aH[$1]=$3;next} ($1 in aL) {if ($2>=aL[$1] && $2<=aH[$1]) c[$1]++} END { for (i in c) {print i,c[i]}}' f2 f1
chr1 3
chr10 2

Edit: a more readable version.
Code:
awk '
	NR==FNR	{
		low[$1]=$2
		high[$1]=$3
		next} 
	($1 in low) {
		if ($2>=low[$1] && $2<=high[$1])
			c[$1]++} 
	END {
		for (i in c) {
			print i,c[i]}
	}
' f2 f1


Last edited by Jotne; 05-17-2013 at 05:53 PM..
# 6  
Old 05-17-2013
Thanks all..

Yoda's script gives me what I wanted. But I am missing something.

I think I need to give more details:

FILE1:

Code:
chr1	61042
chr1	61153
chr1	61446
chr1	61457
chr1	61621
chr1	100010
chr1	100138
chr1	100145
chr1	100150
chr1	100280
chr10	61646
chr10	61914
chr10	62024
chr10	62782

Also, I have another file:

FILE2:

Code:
chr1	61150	61600
chr1	100100	100200
chr10	61675	62200

Now, for each line in FILE2 that specifies a range, I want to count the number of lines in FILE1 that fall within that range. Note that the same chromosome can now appear in more than one range.

Sample output:

Code:
chr1	61150	61600	3
chr1	100100	100200	3
chr10	61675	62200	2

But from Yoda's code I got
Code:
chr1	100100	100200	3
chr10	61675	62200	2

I am missing the first line..

Can you suggest any modifications?

Thanks,

Last edited by Scrutinizer; 05-18-2013 at 02:06 AM.. Reason: code tags
# 7  
Old 05-17-2013
Adapting Jotne's approach, we get output in the format the original poster requested (shown here for the first sample files, with one range per chromosome):
Code:
awk     'NR==FNR        {low[$1]=$2
                         high[$1]=$3
                         next}
         $2 >= low [$1] &&
         $2 <= high[$1] {c[$1]++}
         END            {for (i in c) print i, low[i], high[i], c[i]}
        ' file2 file1
chr1 61150 61600 3
chr10 61675 62200 2
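
A note on the missing line reported in post #6: the awk scripts in this thread store the ranges in arrays keyed by chromosome name alone, so a second chr1 range in file2 overwrites the first one. One possible way around that, as a rough sketch rather than a tested production script, is to key each range by its line number in file2 and check every position against every range. The function name count_in_ranges is made up for illustration; whitespace-separated columns and integer positions are assumed.

```shell
# Sketch only: count_in_ranges RANGES POSITIONS (hypothetical helper name).
# Ranges are indexed by their line order, so several ranges may share a chromosome.
count_in_ranges() {
    awk '
        NR == FNR {                      # first file: the ranges
            if (NF >= 3) {
                n++
                chr[n] = $1; lo[n] = $2; hi[n] = $3
            }
            next
        }
        {                                # second file: the positions
            for (i = 1; i <= n; i++)
                if ($1 == chr[i] && $2 + 0 >= lo[i] + 0 && $2 + 0 <= hi[i] + 0)
                    cnt[i]++
        }
        END {                            # print ranges in input order, 0 if no hits
            for (i = 1; i <= n; i++)
                print chr[i], lo[i], hi[i], cnt[i] + 0
        }
    ' OFS='\t' "$1" "$2"
}
```

Called as count_in_ranges file2 file1 on the files from post #6, this should print all three ranges, each with its own count, in the order they appear in file2. The inner loop makes it O(ranges x positions), which is fine for small files; for genome-scale data a dedicated tool such as the bedtools suggestion above is probably the better fit.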
