How to count the number of strings?


 
# 1  
Old 05-29-2013

Hi,

I have a text file as shown below. For each person in the first and second columns, I would like to count the number of unique connections. The third column holds the ID numbers of the persons in the first column, and the fourth column holds the ID numbers of the persons in the second column.

Code:
susan   ali    156  294
susan   ali    156  294
susan   anna   156  67
rex     rex    432  564
rex     rex    432  564
philip  sama   543  22

For example, susan has two connections: ali and anna. susan's ID is 156; ali's and anna's IDs are 294 and 67, respectively. In the output, the last column is the number of connections of each person. The total is the sum of the connections of all persons.

Your help would be appreciated!



Output:
Code:
susan   156  :-  ali     294  anna  67   2
rex     432  :-  rex     564             1
philip  543  :-  sama    22              1
ali     294  :-  susan   156             1
anna    67   :-  susan   156             1
rex     564  :-  rex     432             1
sama    22   :-  philip  543             1

Total connections:-8


Last edited by Scott; 05-29-2013 at 07:48 AM.. Reason: Code tags
# 2  
Old 05-29-2013
Try this:

Code:
awk '{A[$1 FS $3 ":-" FS $2 FS $4]++;B++}END{for(i in A){print i,A[i]};print "Total connections:-"B}' file

# 3  
Old 05-29-2013
Hi pamu,

Thanks for your code. Its output differs from my desired output: susan's connections should be printed on the same line, and rex has two connections, but they are not printed. The total number of connections should be 8.
# 4  
Old 05-29-2013
Quote:
ali 294 :- susan 156 1
How do you get this, when ali and susan have two connections?
Code:
susan  ali      156   294
susan  ali      156   294

# 5  
Old 05-29-2013
Hi jotne,

Thanks for your comment. I also need to count the connections of the persons in the second column. susan has a connection with ali; in the same way, ali has a connection with susan. ali's ID is 294 and susan's ID is 156. The total no. of connections for ali is one. I need to count only the unique connections.
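
The counting rule described here can be sketched as a short pipeline. This is only a sketch and not code posted in the thread: it assumes the four-column layout of the sample, and the file name `connections.txt` and the function name `count_unique` are placeholders. Each row is emitted in both directions, `sort -u` drops the repeated rows, and the last awk counts how many distinct partners remain for each name/ID pair.

```shell
# Sample data from the first post, written to a scratch file.
cat > connections.txt <<'EOF'
susan   ali    156  294
susan   ali    156  294
susan   anna   156  67
rex     rex    432  564
rex     rex    432  564
philip  sama   543  22
EOF

count_unique() {
    # Step 1: print every row as "person ID partner partnerID" in both directions.
    # Step 2: sort -u keeps one copy of each directed pair.
    # Step 3: count the remaining pairs per person (name plus ID).
    awk 'NF >= 4 { print $1, $3, $2, $4; print $2, $4, $1, $3 }' "$1" |
        sort -u |
        awk '{ count[$1 " " $2]++ }
             END {
                 for (p in count) { print p, count[p]; total += count[p] }
                 print "Total connections:", total
             }'
}

count_unique connections.txt
```

For the sample data this reports 2 for `susan 156`, 1 for each of the other six name/ID pairs, and a total of 8. The per-person lines come out in whatever order awk iterates the array.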

Last edited by mohamad; 05-29-2013 at 09:11 AM..
# 6  
Old 05-29-2013
Here is an awk program:
Code:
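awk '
        {
                A[++c] = $0             # keep every input line for the END pass
                N[$1","$3]              # register the person "name,ID" from columns 1 and 3
                N[$2","$4]              # register the person "name,ID" from columns 2 and 4
        }
        END {
                for ( k in N )          # one output line per person; order is unspecified
                {
                        s = k ":- "
                        for ( i = 1; i <= c; i++ )
                        {
                                split ( A[i], V )
                                # k is the column-1 person: the partner is in columns 2 and 4
                                if ( k == V[1]","V[3] )
                                {
                                        # R marks person/partner pairs already counted;
                                        # the partner is identified by name only
                                        if ( !( (k","V[2]) in R ) )
                                        {
                                                s = s OFS V[2] OFS V[4]
                                                ++j
                                        }
                                        R[k","V[2]]
                                }
                                # k is the column-2 person: the partner is in columns 1 and 3
                                if ( k == V[2]","V[4] )
                                {
                                        if ( !( (k","V[1]) in R ) )
                                        {
                                                s = s OFS V[1] OFS V[3]
                                                ++j
                                        }
                                        R[k","V[1]]
                                }

                        }
                        print s, j      # partners found, then the number of connections
                        t += j          # running total over all persons
                        j = 0
                }
                print "Total Connections: " t
        }
' OFS='\t' file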

Produces output:
Code:
ali,294:-       susan   156     1
rex,432:-       rex     564     1
anna,67:-       susan   156     1
susan,156:-     ali     294     anna    67      2
philip,543:-    sama    22      1
sama,22:-       philip  543     1
rex,564:-       rex     432     1
Total Connections: 8

# 7  
Old 05-29-2013
Hi yoda,

Thanks for your code. When I run it on my original file (9 columns, around 1000 lines), I get an incorrect number of connections for some persons. The code works well on files with fewer than 30 lines.
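
The thread does not say what causes the wrong counts, so the following is only a sketch of one thing to try, not a confirmed fix. It builds the result in a single pass instead of rescanning every line for each person, and it identifies a partner by name and ID together, whereas the program above records the partner by name only. It assumes the names and IDs are still in columns 1 to 4 of the 9-column file, which the thread does not confirm. The function names `list_connections` and `add_link` and the file name `connections.txt` are placeholders.

```shell
# Sample data from the first post, written to a scratch file.
cat > connections.txt <<'EOF'
susan   ali    156  294
susan   ali    156  294
susan   anna   156  67
rex     rex    432  564
rex     rex    432  564
philip  sama   543  22
EOF

list_connections() {
    awk '
    # Record that person p is connected to partner q, once per distinct pair.
    function add_link(p, q,    key) {
        key = p SUBSEP q
        if (key in seen) return                 # duplicate row: already counted
        seen[key]
        if (!(p in cnt)) order[++n] = p         # remember first-seen order
        cnt[p]++
        partners[p] = partners[p] OFS q
    }
    NF >= 4 {
        a = $1 " " $3                           # column-1 person as "name ID"
        b = $2 " " $4                           # column-2 person as "name ID"
        add_link(a, b)
        add_link(b, a)
    }
    END {
        for (i = 1; i <= n; i++) {
            p = order[i]
            print p " :-" partners[p], cnt[p]
            total += cnt[p]
        }
        print "Total connections:", total
    }' "$1"
}

list_connections connections.txt
```

On the sample data the first line printed is `susan 156 :- ali 294 anna 67 2` and the last is `Total connections: 8`. Persons are listed in the order they first appear in the file.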

Last edited by mohamad; 05-30-2013 at 09:14 AM..