05-03-2010
Count the number of occurrences of a character in one field, where the character is given in another field
Hello,
I have a text file with n lines in the following format (9 column fields):
Example:
contig00012 149606 G C 49 68 60 18 c$cccccacccccccccc^c
I need to count the lower-case and upper-case occurrences, respectively, in column 9 of the character given in column 4.
In my example line the question would be: how many "C" and "c" are there in column 9 (c$cccccacccccccccc^c)? The answer would be "C"=0 and "c"=17.
If possible, characters preceded by a '^' (e.g. ^c) and characters followed by a '$' (e.g. c$) should not be counted. The answer would then instead be "C"=0 and "c"=15.
Thanks.
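One possible approach, as a minimal sketch rather than a definitive answer: let awk read the character from column 4, optionally strip the '^x' and 'x$' pairs from column 9, then walk the remaining characters and tally upper-case and lower-case matches. The function name count_base and its optional 0/1 argument are made up for illustration, and the sketch assumes whitespace-separated fields and a single character in column 4.

```shell
#!/bin/sh
# count_base (hypothetical helper name): for every input line, print how often
# the character from column 4 occurs in column 9, upper-case count first.
# Optional argument: 1 (default) skips characters preceded by "^" or followed
# by "$"; 0 counts every occurrence.
count_base() {
    awk -v strip="${1:-1}" '
    {
        base = $4
        seq  = $9
        if (strip) {
            gsub(/\^./, "", seq)   # drop "^" together with the character after it
            gsub(/.\$/, "", seq)   # drop "$" together with the character before it
        }
        upper = toupper(base)
        lower = tolower(base)
        nu = 0
        nl = 0
        # Compare character by character, so column 4 is never used as a regex.
        for (i = 1; i <= length(seq); i++) {
            ch = substr(seq, i, 1)
            if (ch == upper) nu++
            else if (ch == lower) nl++
        }
        print upper "=" nu, lower "=" nl
    }'
}
```

It reads from standard input, so it could be called as count_base < file, or as count_base 0 < file to count every occurrence. For the example line, the wanted results stated in the post are C=0 c=15 with the markers skipped and C=0 c=17 without skipping.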