Sponsored Content
Top Forums Shell Programming and Scripting Count and print the number of occurences Post 302835113 by arch on Sunday 21st of July 2013 11:33:46 PM
Old 07-22-2013
Count and print the number of occurences

I have some files as shown below

Code:
GLL  ALM  654-656  654  656 
SEM  LYG  655-657  655  657  
SEM  LYG  655-657  655  657  
ALM  LEG  656-658  656  658  
ALM  LEG  656-658  656  658  
ALM  LEG  656-658  656  658  
LEG  LEG  658-660  658  660 
LEG  LEG  658-660  658  660

The value of GLL is 654. The value of ALM is 656. In the same way, 4th column represents the values of first column. 5th column represents the values of second column.

I tried the following program to count the occurrences of each number in the fourth and fifth column.

Code:
 for i in folder1/*.pdb;
do
awk '
BEGIN {
    path=sprintf("%s", "/home/arch/Desktop/folder2/")
}
!s[1":"$4":"$5]++{sU[$4]++;tot++} 
!s[2":"$4":"$5]++{sU[$5]++;tot++} 
END { 
    sub(/.*\//,"",FILENAME)
    for (x in sU) 
        print x, sU[x], sU[$1] > path FILENAME;       
}'  $i;
 done

The above program prints as follows

Code:
660 1 
654 1 
655 1 
656 2 
657 1 
658 2

Desired Output:-

Code:
 660 LEG 1
 654 GLL 1
 655 SEM 1
 656 ALM 2
 657 LYG 1
 658 LEG 2

your suggestions would be appreciated!!

Last edited by arch; 07-22-2013 at 02:24 AM.. Reason: code tags
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to count the number of occurences of this pattern?

Hi all, I have a pattern like this in a file: 123 4 56 789 234 5 67 789 121 3 56 789 222 4 65 789 321 6 90 100 478 8 40 789 243 7 80 789 How can I count the number of occurences of '789' (4th column) in this set...? Thanks for all your help! K (7 Replies)
Discussion started by: kripssmart
7 Replies

2. Shell Programming and Scripting

Perl - Count occurences

I have enclosed the script. I am able to find the files that contain my search string but when I try to count the occurences within the file I get zero always. Any help on this. #!/usr/bin/perl my $find = $ARGV; my $replace = $ARGV; my $glob = $ARGV; @filelist = <*$glob>; # process each... (22 Replies)
Discussion started by: TimHortons
22 Replies

3. UNIX for Dummies Questions & Answers

Count number of occurences of a word

I want to count the number of occurences of say "200" in a file but that file also contains various stuff including dtaes like 2007 or smtg like 200.1 so count i am getting by doing grep -c "word" file is wrong Please help!!!!! (8 Replies)
Discussion started by: shikhakaul
8 Replies

4. Shell Programming and Scripting

Count number of occurences of a character in a field defined by the character in another field

Hello, I have a text file with n lines in the following format (9 column fields): Example: contig00012 149606 G C 49 68 60 18 c$cccccacccccccccc^c I need to count the number of lower-case and upper-case occurences in column 9, respectively, of the... (3 Replies)
Discussion started by: s052866
3 Replies

5. Shell Programming and Scripting

to count the number of occurences of a column value

im trying to count the number of occurences of column 2 value(starting from KKK*) of the below file, file.txt using the code cat file.txt | awk ' BEGIN { print "Category Counts"} {FS=","} {NR > 2} { cats = cats + 1} END { for(c in cats) { print c, "=", cats} } ' but its returning as ... (6 Replies)
Discussion started by: michaelrozar17
6 Replies

6. Shell Programming and Scripting

Awk to count occurences

Hi, i am in need of an awk script to accomplish the following: Input table looks like: Student1 arts Student2 science Student3 arts Student4 science Student5 science Student6 science Student7 science Student8 science Student9 science Student10 science Student11 science... (8 Replies)
Discussion started by: saint2006
8 Replies

7. UNIX for Dummies Questions & Answers

Count pattern occurences

hi, I have a text..and i need to find a pattern in the text and count to the no of times the pattern occured. i have used grep command ..but the problem is , it shows the occurrences of the pattern but doesn't count no of times the pattern occuries. (5 Replies)
Discussion started by: nvnni
5 Replies

8. Shell Programming and Scripting

Count number of occurences using awk

Hi Guys, I have 2 files like below file1 xx yy file2 b yy b2 xx c1 yy xx yy Now I want an idea which can count occurences of text from file1 and file2 so outbout would be kind of (9 Replies)
Discussion started by: prashant2507198
9 Replies

9. Shell Programming and Scripting

Count the occurences of strings

I have some text files in a folder f1 with 10 columns. The first five columns of a file are shown below. aab abb 263-455 263 455 aab abb 263-455 263 455 aab abb 263-455 263 455 bbb abb 26-455 26 455 bbb abb 26-455 26 455 bbb aka 264-266 264 266 bga bga 230-232 230 ... (10 Replies)
Discussion started by: gomez
10 Replies

10. AIX

UNIX ksh - To print the PID number and repeat count

This question is asked in an interview today that I have to return output with each PID number and the count of each PID number logged today. Here is the script that I have written. Can you confirm if that would work or not. The interviewer didn't said if my answer is correct or not. Can someone... (5 Replies)
Discussion started by: Subodh Kumar
5 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 05:15 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy