Count number of pattern matches per line for all files in directory
Post 302898771 by Don Cragun on Wednesday 23rd of April 2014 09:46:58 PM
Assuming that I am correct in believing that the desired bonus output you provided:
Code:
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   1   1   1
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   3   4   2
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   5   2   2
ACYPI000008-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   1   2   1
ACYPI000008-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   1   1   1

should have been:
Code:
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   1   1   1
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   3   4   2
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   5   2   2
ACYPI000008-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   1   2   1
ACYPI000008-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt   3   1   1

and with the sets of three spaces changed to tabs, the following script (using awk instead of perl) seems to also do what you want:
Code:
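#!/bin/ksh
# For every line with at least one field matching comp[0-9], print the file
# name, the line number, the number of matching fields, and the longest run
# of adjacent matching fields.
awk '
{	nm = nc = ncM = 0	# matches, current run, longest run
	for(i = 1; i <= NF; i++)
		if(match($i, /comp[0-9]/)) {
			nm++
			if(++nc > ncM)
				ncM = nc
		} else	nc = 0	# a non-matching field ends the current run
	if(nm)	printf("%s\t%d\t%d\t%d\n", FILENAME, FNR, nm, ncM)
}' $(cat IDs)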

producing the output:
Code:
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt	1	1	1
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt	3	4	2
ACYPI55796-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt	5	2	2
ACYPI000008-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt	1	2	1
ACYPI000008-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt	3	1	1

If someone wants to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
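
For anyone who wants to check the counting logic without the real data files, here is a minimal sketch that wraps the same awk program in a shell function and runs it on a made-up input file. The function name count_comp and the sample lines are assumptions for illustration only; they are not from the original thread. The file names are passed as quoted arguments instead of being read from IDs.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for this example): count_comp FILE...
# Prints, tab-separated: file name, line number, number of fields matching
# comp[0-9], and the longest run of adjacent matching fields.
count_comp() {
	awk '
	{	nm = nc = ncM = 0
		for(i = 1; i <= NF; i++)
			if(match($i, /comp[0-9]/)) {
				nm++
				if(++nc > ncM)
					ncM = nc
			} else	nc = 0
		if(nm)	printf("%s\t%d\t%d\t%d\n", FILENAME, FNR, nm, ncM)
	}' "$@"
}

# Made-up sample input; the real files from the thread are not shown here.
tmp=$(mktemp) || exit 1
printf '%s\n' 'comp1 x y' 'a b c' 'comp1 comp2 x comp3 comp4' > "$tmp"

# Drop the (random) temporary file name and show only the three counters.
count_comp "$tmp" | cut -f2-
rm -f "$tmp"
```

With this sample, line 2 has no matching field and is skipped, line 1 has one match, and line 3 has four matches in two runs of two, so the counters printed are 1, 1, 1 for line 1 and 3, 4, 2 for line 3.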