Count lines


 
# 8  
Old 05-17-2013
Thanks Rudi,

I reposted my message with a few modifications. Could you please check and repost the code?

Best
# 9  
Old 05-17-2013
Here is the modified code:
Code:
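awk '
        # First file (file2): store each range as A["key,low"] = high
        NR == FNR {
                A[$1","$2] = $3
                next
        }
        # Second file (file1): test the record against every stored range
        {
                for ( k in A )
                {
                        split ( k, V, "," )
                        if ( $1 == V[1] )
                        {
                                if ( $2 >= V[2] && $2 <= A[k] )
                                        R[$1 OFS V[2] OFS A[k]]++
                        }
                }
        }
        # Print key, low, high and the number of matching records
        END {
                for ( k in R )
                        print k, R[k]
        }
' file2 file1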

# 10  
Old 05-17-2013
Hi Yoda,

Thanks for the code. Its complexity is very high for my data, as I have 9 million lines in file1 and 11,000 lines in file2.

Is there a faster way in awk?

Best
# 11  
Old 05-17-2013
I'm not sure there is another way that avoids a for loop visiting each record of file2 for the comparison.

Maybe someone else in this forum has a better idea.

By the way, what system are you on, and how long does it take to complete execution?
# 12  
Old 05-18-2013
You could try this approach (which avoids cycling through every record in file2):
Code:
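awk '
  # First file (file2): number the ranges per key and store their bounds
  NR==FNR {
    j=++Ranges[$1]
    Low[$1,j]=$2
    High[$1,j]=$3
    next
  }
  # Second file (file1): visit only the ranges that share this key
  $1 in Ranges {
    for(j=1; j<=Ranges[$1]; j++) if (Low[$1,j]<=$2 && $2<=High[$1,j]) Number[$1,j]++
  }
  END {
    # +0 prints 0 rather than an empty field for ranges without matches
    for(i in Ranges) for(j=1; j<=Ranges[i]; j++) print i, Low[i,j], High[i,j], Number[i,j]+0
  }
' OFS='\t' file2 file1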

I presumed that a range match is only valid if the key in field 1 matches as well.
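
For comparison, the same per-key idea can be sketched in Python. This is only an illustration, not a tested replacement: it assumes whitespace-separated fields and integer values in fields 2 and 3, and the function name `count_in_ranges` is made up for this example.

```python
from collections import defaultdict


def count_in_ranges(range_lines, data_lines):
    """Count, for each (key, low, high) range, the data values inside it."""
    # key -> list of [low, high, count]; several ranges may share one key
    ranges = defaultdict(list)
    for line in range_lines:
        fields = line.split()  # split on any run of whitespace
        if len(fields) < 3:
            continue  # skip blank or short lines
        ranges[fields[0]].append([int(fields[1]), int(fields[2]), 0])

    for line in data_lines:
        fields = line.split()
        if len(fields) < 2 or fields[0] not in ranges:
            continue  # no range uses this key
        value = int(fields[1])
        # Only the ranges stored under this key are visited
        for entry in ranges[fields[0]]:
            if entry[0] <= value <= entry[1]:
                entry[2] += 1

    # Flatten to (key, low, high, count) tuples in insertion order
    return [(key, low, high, count)
            for key, entries in ranges.items()
            for low, high, count in entries]
```

Both arguments can be open file objects, so the large file is read line by line and only the ranges are held in memory.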

Last edited by Scrutinizer; 05-20-2013 at 07:36 AM..
# 13  
Old 05-20-2013
awk or python

awk
Code:
gawk '{
        n=split($0,arr," ")
        if(NR==FNR){
         min[arr[1]]=arr[2]
         max[arr[1]]=arr[3]
         cnt[arr[1]]=0
        }
        else{
         if(arr[2] >= min[arr[1]] && arr[2] <= max[arr[1]])
           cnt[arr[1]]++
        }
}
END{
 for (i in cnt)
  print i" "min[i]" "max[i]" "cnt[i]
}' b a

python
Code:
dic={}
cnt=0
with open("b.txt") as f:
 for line in f:
  line=line.replace("\n","")
  words = line.split(" ")
  dic[words[0]]={'MIN':words[1],'MAX':words[2],'CNT':0}
print(dic)
with open("a.txt") as f:
 for line in f:
  line=line.replace("\n","")
  words = line.split(" ")
  if words[1]>=dic[words[0]]['MIN'] and words[1]<=dic[words[0]]['MAX']:
   dic[words[0]]['CNT']+=1
for i in dic:
 print(i,dic[i]['MIN'],dic[i]['MAX'],dic[i]['CNT'])

# 14  
Old 05-20-2013
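@summer_cherry: the gawk / awk script does not produce the correct output for multiple ranges with the same $1 (see post #6). The min, max and cnt arrays are keyed on $1 alone, so a later range with the same key overwrites the earlier one.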

---
The Python script produces (Python 2.7.2):
Code:
line 7, in <module>
    dic[words[0]]={'MIN':words[1],'MAX':words[2],'CNT':0}
IndexError: list index out of range
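
One possible cause, assuming the input is tab-separated or contains blank lines (the files themselves are not shown in this thread): `line.split(" ")` splits on single space characters only, so such a line yields fewer than three fields. The script also compares the fields as strings, which orders "9" after "10". A hedged sketch of a more tolerant parser, where `parse_range` is a hypothetical helper name:

```python
def parse_range(line):
    """Return (key, low, high) with integer bounds, or None for a short line."""
    fields = line.split()  # no argument: split on any run of whitespace
    if len(fields) < 3:
        return None  # blank or malformed line: skip it instead of raising
    return fields[0], int(fields[1]), int(fields[2])
```

Converting the bounds with `int()` makes the later `>=` and `<=` tests numeric rather than lexical.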
