Post 302938478 by AshwaniSharma09 on Monday 16th of March 2015 10:30:09 PM
Count and print the most repeating string in each line

Hi all,

I have a file in which each string in column 1 is associated with one or more strings in column 2. For example, in the sample input below, Gene1 in column 1 is associated with two different strings in column 2 (BP1 and BP2). For every unique string in column 1, I need to print the string from column 2 that is associated with it most often, together with its count. If several strings tie for the highest count, all of them should be printed.


Input.txt:
Code:
Gene1   BP1
Gene1   BP1
Gene1   BP2
Gene1   BP1
Gene1   BP2
Gene2   BP3
Gene2   BP3
Gene2   BP3
Gene2   BP3
Gene3   BP7
Gene3   BP8
Gene3   BP7
Gene3   BP8

Output.txt:
Code:
Gene1   BP1   3
Gene2   BP3   4
Gene3   BP7   2   BP8   2

Here:
BP1 has the highest number of connections with Gene1 (3 out of 5).
BP3 has the highest number of connections with Gene2 (4 out of 4).
BP7 and BP8 have an equal number of connections with Gene3 (2 each), so both are printed.

Your time is much appreciated.
Thanks!
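
One possible approach, as a minimal sketch: count each (column 1, column 2) pair in awk, then for every column 1 string print the column 2 string(s) with the highest count. It assumes whitespace-separated input with two columns per line; the function name most_associated is hypothetical and chosen only for illustration. Keys and values are printed in the order they first appear in the input.

```shell
# Hypothetical helper: for each column-1 key, print the column-2 value(s)
# with the highest count, each followed by that count.
most_associated() {
    awk '
    NF < 2 { next }                            # skip blank or short lines
    {
        pair = $1 SUBSEP $2
        if (!(pair in count)) {                # first time this pair is seen
            if (!($1 in nvals))
                keys[++nkeys] = $1             # remember key order
            vals[$1, ++nvals[$1]] = $2         # remember value order per key
        }
        count[pair]++
    }
    END {
        for (i = 1; i <= nkeys; i++) {
            k = keys[i]
            max = 0
            for (j = 1; j <= nvals[k]; j++)    # find the highest count for k
                if (count[k, vals[k, j]] > max)
                    max = count[k, vals[k, j]]
            line = k
            for (j = 1; j <= nvals[k]; j++)    # append every value that ties
                if (count[k, vals[k, j]] == max)
                    line = line "   " vals[k, j] "   " max
            print line
        }
    }' "$@"
}
```

With the sample above, it could be called as most_associated Input.txt > Output.txt. Ties are all kept on the same line, which matches the Gene3 case in the desired output.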
 
