In the first part, the input data a.txt is queried (see the attached file; I copied only the first 100 rows so that it includes the header with the sample names). Columns 1-9 do not contain relevant information. Columns 313-319 contain the target samples, against which all the test samples (columns 10-312) are compared.
The rows that survive the comparison operation are written to a1.txt (again, columns 1-9 don't contain relevant information).
Next, the header containing the sample names is extracted from a.txt, and your code is run to count the number of unique values.
Next, I manually add the values in the 0/1 and 1/1 columns to create a totals column, and then sort the totals column from high to low. The sample with the highest total is the one most similar to the target sample (column 313).
I like the way you transposed the result and would like to transpose the header with the sample names as well. So instead of COL1, COL2, ..., I would like the sample names from row 28 (columns 10-319), as shown below.
Moderator's Comments:
Please use CODE tags around sample input and output as well as around code segments.
I note that this output doesn't include any output for the fields that have the value ./.. Do you only want to display data in your output for the 1/1, 0/0, and 0/1 value counts?
Does the output order matter for the middle three columns?
You also said that your output should be sorted in decreasing order on the values in the last column, but your sample output appears to be unsorted. Why is that?
Are columns 313-319 supposed to be counted and printed along with the test samples, or are just columns 10-312 supposed to be counted and printed?
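One way the workflow described in this thread could be sketched in awk is shown here. It rests on answers to the questions above that the thread has not confirmed: only the test-sample columns are counted, ./. values are ignored, the output columns are the sample name, the 0/0, 0/1 and 1/1 counts, and a total of 0/1 plus 1/1, and the input is tab-separated with the header line first. The function name count_genotypes and its two arguments are made up for illustration.

```shell
# count_genotypes FIRST LAST  (hypothetical helper, not from the thread)
# Reads a tab-separated table on stdin. Line 1 is assumed to be the header
# holding the sample names; every following line is a data row.
# Prints: name, 0/0 count, 0/1 count, 1/1 count, total (0/1 + 1/1),
# sorted from the highest total to the lowest.
count_genotypes() {
    awk -F'\t' -v first="$1" -v last="$2" '
        NR == 1 { for (i = first; i <= last; i++) name[i] = $i; next }
        { for (i = first; i <= last; i++) cnt[i, $i]++ }
        END {
            for (i = first; i <= last; i++) {
                het = cnt[i, "0/1"] + 0
                hom = cnt[i, "1/1"] + 0
                printf "%s\t%d\t%d\t%d\t%d\n", name[i], cnt[i, "0/0"], het, hom, het + hom
            }
        }' | sort -t "$(printf '\t')" -k5,5nr
}

# Tiny made-up table: one irrelevant column followed by three samples.
printf 'x\tS1\tS2\tS3\nr\t0/1\t0/0\t1/1\nr\t1/1\t0/0\t1/1\nr\t0/1\t0/1\t./.\n' |
    count_genotypes 2 4
```

With the files from this thread, something like `{ sed -n '28p' a.txt; cat a1.txt; } | count_genotypes 10 312` might do the job, assuming a1.txt holds only data rows and the header really is on row 28 of a.txt.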
Hi Friends,
I am having some trouble reading into an array. Basically, I am trying to grep for a pattern, extract its value, and store it in an array. For example, if my input is:
<L:RECORD>name=faisel farooq,age=21,
company=TCS,project=BT</L:RECORD>
<L:RECORD>name=abc... (1 Reply)
Hi All,
I have a file with 3 columns (string string integer):
a b 1
x y 2
p k 5
y y 4
.....
.....
Question:
I want to get the unique values of column 2, sorted on column 2, together with the sum of the 3rd column over the corresponding rows. E.g. the above file should return the... (6 Replies)
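A minimal sketch of that kind of grouped sum, assuming whitespace-separated input and that lines with fewer than three fields (such as the ..... placeholders) should be skipped. The function name sum_by_second_column is hypothetical.

```shell
# sum_by_second_column (hypothetical helper): for each distinct value in
# column 2, add up column 3, then sort the result on the column-2 value.
sum_by_second_column() {
    awk 'NF >= 3 { sum[$2] += $3 }
         END { for (k in sum) print k, sum[k] }' | sort -k1,1
}

# The four sample rows quoted above.
printf 'a b 1\nx y 2\np k 5\ny y 4\n' | sum_by_second_column
```

The sort is done outside awk because the order of a `for (k in sum)` loop is unspecified.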
Could anybody help with this?
I have input below .....
david,39
david,39
emelie,40
clarissa,22
bob,42
bob,42
tim,32
bob,39
david,38
emelie,47
What I want to do is count how many different ages there are for each name, so the output would look like this:
david,2
emelie,2
clarissa,1... (3 Replies)
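This could be approached by counting each name,age pair only the first time it appears. The sketch below assumes comma-separated input exactly as quoted and prints names in order of first appearance; count_distinct_ages is a made-up name.

```shell
# count_distinct_ages (hypothetical helper): number of distinct ages per name.
count_distinct_ages() {
    awk -F, '
        !seen[$1, $2]++ {                       # first time this name,age pair is seen
            if (!($1 in n)) order[++k] = $1     # remember first-appearance order
            n[$1]++
        }
        END { for (i = 1; i <= k; i++) print order[i] "," n[order[i]] }'
}

printf 'david,39\ndavid,39\nemelie,40\nclarissa,22\nbob,42\nbob,42\ntim,32\nbob,39\ndavid,38\nemelie,47\n' |
    count_distinct_ages
```

For the sample input, the first three output lines agree with the ones quoted in the post (david,2, emelie,2, clarissa,1).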
Hi
I have the following info in a file -
<Cell id="25D"/>
<Cell id="26A"/>
<Cell id="26B"/>
<Cell id="26C"/>
<Cell id="27A"/>
<Cell id="27B"/>
<Cell id="27C"/>
<Cell id="28A"/>
I would like to know how you would go about counting all... (4 Replies)
Hi !
input:
A|B|C|D
A|F|C|E
A|B|I|C
A|T|I|B
As the title of the thread says, I would need to get:
1|3|2|4
I tried different variants of this command, but I can't get the result I need:
gawk 'BEGIN{FS=OFS="|"}{for(i=1; i<=NF; i++) a++} END {for (b in a) print b}' input
... (2 Replies)
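The attempt above increments a scalar and never records which value was seen in which column. One possible fix, offered as a sketch rather than the answer given in that thread, is to key the array on both the column number and the value. The function name count_unique_per_column is hypothetical.

```shell
# count_unique_per_column (hypothetical helper): for |-separated input,
# print the number of distinct values found in each column.
count_unique_per_column() {
    awk -F'|' '
        {
            for (i = 1; i <= NF; i++)
                if (!seen[i, $i]++) uniq[i]++   # first time value $i shows up in column i
            if (NF > maxnf) maxnf = NF
        }
        END {
            for (i = 1; i <= maxnf; i++)
                printf "%s%s", uniq[i] + 0, (i < maxnf ? "|" : "\n")
        }'
}

printf 'A|B|C|D\nA|F|C|E\nA|B|I|C\nA|T|I|B\n' | count_unique_per_column   # prints 1|3|2|4
```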
Hi, I have tab-delimited data similar to the following:
dot is-big 2
dot is-round 3
dot is-gray 4
cat is-big 3
hot in-summer 5
I want to count the frequency of each individual "unique" value in the 1st column. Thus, the desired output would be as follows:
dot 3
cat 1
hot 1
is... (5 Replies)
Hello Team,
I need your help on the following:
My input file a.txt is as below:
3330690|373846|108471
3330690|373846|108471
0640829|459725|100001
0640829|459725|100001
3330690|373847|108471
Here row 1 and row 2 of column 1 are identical but corresponding column 2 value are... (4 Replies)
Hello experts,
I am converting a number into its binary representation as follows:
read n
echo "obase=2;$n" | bc
I wish to count the maximum continuous occurrences of the digit 1.
Example :
1. The binary equivalent of 5 = 101. Hence the output must be 1.
2. The binary... (3 Replies)
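As a sketch of one possible approach: instead of parsing the string printed by bc, the bits can be tested directly with shell arithmetic. This swaps in a different technique from the one in the post and assumes a non-negative integer argument; max_run_of_ones is a made-up name.

```shell
# max_run_of_ones N (hypothetical helper): length of the longest run of
# consecutive 1 bits in the binary form of N. The body runs in a subshell
# so its variables do not leak into the caller.
max_run_of_ones() (
    n=$1
    run=0
    best=0
    while [ "$n" -gt 0 ]; do
        if [ $((n % 2)) -eq 1 ]; then
            run=$((run + 1))
            if [ "$run" -gt "$best" ]; then best=$run; fi
        else
            run=0
        fi
        n=$((n / 2))
    done
    echo "$best"
)

max_run_of_ones 5    # 5 is 101 in binary, prints 1
```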
Hi,
tab-separated input:
blabla_1 A,B,C,C
blabla_2 A,E,G
blabla_3 R,Q,A,B,C,R,Q
output:
blabla_1 3
blabla_2 3
blabla_3 5
After splitting $2 into an array, I am trying to store the number of unique elements in a variable, but I am having difficulty resetting the variable to 0 before... (6 Replies)
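A minimal sketch of the per-line reset, assuming tab-separated input with the comma-separated list in field 2. Both the counter and the lookup array are cleared at the start of every record; count_unique_elements is a hypothetical name.

```shell
# count_unique_elements (hypothetical helper): print field 1 and the number
# of distinct comma-separated items in field 2.
count_unique_elements() {
    awk -F'\t' -v OFS='\t' '
        {
            n = split($2, parts, ",")
            u = 0                 # reset the counter for this line
            split("", seen)       # portable way to empty the array
            for (i = 1; i <= n; i++)
                if (!seen[parts[i]]++) u++
            print $1, u
        }'
}

printf 'blabla_1\tA,B,C,C\nblabla_2\tA,E,G\nblabla_3\tR,Q,A,B,C,R,Q\n' | count_unique_elements
```

For the sample input this gives the counts 3, 3 and 5 shown in the post.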
Hello,
I am trying to count unique rows in my file based on 4 columns (2-5) and to output the frequency of each in a sixth column. My file is tab-delimited.
My input file looks like this:
Colum1 Colum2 Colum3 Colum4 Coulmn5
1.1 100 100 a b
1.1 100 100 a c
1.2 200 205 a d
1.3 300 301 a y
1.3 300... (6 Replies)