Inserting column data based on category assignment
Please help with the following.
I have data with 4 columns: instrument, category, variable and value. Each instrument belongs to one or more categories, and every instrument measures some variables (var1 and var2 in this example). The last column is the value an instrument outputs for a variable.
I have used some blank rows for ease of understanding; there are no blank rows in the actual dataset.
In this example, instruments (ab, bc, pt and ef) belong to cat1; instruments (cd, gh and pt) belong to cat2.
As you can see, there are some rows missing, like
I want to impute these rows if there is a consensus value above 60% within the same (category, variable) combination.
For example, in this part of the data
the (cat1, var1) combination has the value aa 2 out of 3 times (about 67%). Since this is greater than the cutoff of 60%, we can impute the value for the missing instrument (ef) in this category (cat1) and variable (var1) as aa.
This is my desired output; row order doesn't matter and blank rows are not needed.
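One possible way to approach the rule described above is sketched here in Python. It is only an illustration under some assumptions: the function name `impute` and the sample rows are hypothetical, an instrument is assumed to belong to a category if it appears with that category in at least one row, and the consensus share is computed over the values actually observed for a (category, variable) pair, as in the "2 out of 3" example.

```python
from collections import Counter, defaultdict

CUTOFF = 0.60  # consensus threshold taken from the question


def impute(rows, cutoff=CUTOFF):
    """Return the rows that can be imputed for missing instruments.

    rows: iterable of (instrument, category, variable, value) tuples.
    """
    members = defaultdict(set)    # category -> instruments seen in that category
    observed = defaultdict(dict)  # (category, variable) -> {instrument: value}
    for inst, cat, var, val in rows:
        members[cat].add(inst)
        observed[(cat, var)][inst] = val

    imputed = []
    for (cat, var), by_inst in observed.items():
        # Instruments of this category with no row for this variable.
        missing = members[cat] - set(by_inst)
        if not missing:
            continue
        # Most frequent observed value and how often it occurs.
        value, count = Counter(by_inst.values()).most_common(1)[0]
        # Impute only when the share is strictly above the cutoff.
        if count / len(by_inst) > cutoff:
            imputed.extend((inst, cat, var, value) for inst in sorted(missing))
    return imputed


# Hypothetical sample data, not the dataset from the question.
sample_rows = [
    ("ab", "cat1", "var1", "aa"),
    ("bc", "cat1", "var1", "aa"),
    ("pt", "cat1", "var1", "bb"),
    ("ab", "cat1", "var2", "cc"),
    ("ef", "cat1", "var2", "dd"),
]

for row in impute(sample_rows):
    print("\t".join(row))
```

With this sample, ef is missing for (cat1, var1), where aa occurs 2 out of 3 times, so that row is imputed. For (cat1, var2) the two observed values disagree, so nothing is imputed for bc and pt.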
I have a big tab-delimited file of the following format
aa 344 456
aa 34 67
bb 34 90
bb 23 100
bb 1 89
d 0 12
e 45 678
e 78 90
e 56 90
....
....
....
I would like to transpose the data based on the category on column one and get the output file in the following tab delimited... (8 Replies)
Hi guys,
I need to append new data at the end of each line of the files. This new data is based on a substring (3rd field) of the last column.
Input file xxx.csv:
U1234|1-5X|orange|1-5X|Act|1-5X|0.1 /sac/orange 12345 0
U5678|1-7X|grape|1-7X|Act|1-7X|0.1 /sac/grape 5678 0... (5 Replies)
Please consider the following file, I have many groups which can be of 3 types, T1 (Serial_Number 1) T2 (Serial_Number 2) and T1*T2 (all other Serial_Number).
I want to only consider groups that have both T1 and T2 present and their values are different from each other. In the example file,... (8 Replies)
Hi,
I have a data file with :
01/28/2012,1,1,98995
01/28/2012,1,2,7195
01/29/2012,1,1,98995
01/29/2012,1,2,7195
01/30/2012,1,1,98896
01/30/2012,1,2,7083
01/31/2012,1,1,98896
01/31/2012,1,2,7083
02/01/2012,1,1,98896
02/01/2012,1,2,7083
02/02/2012,1,1,98899
02/02/2012,1,2,7083
I... (1 Reply)
Started using unix commands recently.
I have 50 gzip files. I want to grep each of these files for a line count based on a particular category in column 3. How can I do that?
For example
Sr.No Date City Description Code Address
1 06/09 NY living here 0909 10st st nyc
2 ... (5 Replies)
Hi All,
I need some help in parsing out the first (top) data lines of each category from a big file (categories are based on the first column: a, b, c, d, e; see example file below)
a dfg 3 6 8 9
a fgh 5 7 0 9
a gkl 5 2 4 7
a glo 7 0 1 5
b ghj 9 0 4 2
b mkl 7 8 0 5
b jkl 9 0 4 5
c jkl 2... (1 Reply)
Hi,
I've shown an example of what I would like to achieve below. In the example file, I would like to sum the values in column 2 for each distinct category in column 3 (presumably making an array?) and print the sum as well as the category name and length (note:length always corresponds with... (8 Replies)
My input file:
AVI.out <detail>named as the RRM .</detail>
AVI.out <detail>Contains 1 RRM .</detail>
AR0.out <detail>named as the tellurite-resistance.</detail>
AWG.out <detail>Contains 2 HTH .</detail>
ADV.out <detail>named as the DENR family.</detail>
ADV.out ... (10 Replies)