awk code to process column pairs and find those with more than 1 set of possible values
Hi,
I have a very wide dataset with pairs of columns starting from column 7 onwards (see example below).
For each pair of columns (7&8, 9&10, 11&12, ...) I would like to produce an indicator of whether more than one distinct pair of values occurs in that pair of columns. A pair 0 0 represents a missing value and should be ignored (i.e. not counted as its own class).
So for the first pair of columns the possibilities are 0 0 and 1 2. However, because 0 0 represents a missing value, the only remaining combination is 1 2. Therefore there is only 1 possibility for the first pair of columns.
For the second pair of columns the possibilities are 0 0, 1 1 and 1 2. Ignoring 0 0, there are 2 possibilities for the second pair of columns.
The output for the above example would then be: 0 1 0 0 1 1, where a 0 means there is only 1 possibility for the pair of columns and a 1 means there is more than one.
I hope that makes sense - it is a bit difficult to explain.
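To make the rule above concrete, here is a minimal awk sketch of the indicator step. It assumes whitespace-separated fields, that the pairs run from column 7 to the last column, and that a missing pair is written exactly as 0 0. The function name pair_indicator is only an illustrative choice.

```shell
# pair_indicator FILE
# Prints one line with a 0/1 flag per column pair (7&8, 9&10, ...):
# 1 if the pair of columns holds more than one distinct non-missing value pair.
pair_indicator() {
    awk '
    {
        if (NF > maxnf) maxnf = NF                   # widest row seen so far
        for (i = 7; i < NF; i += 2) {
            if ($i == 0 && $(i + 1) == 0) continue   # 0 0 is missing: not a class
            key = i SUBSEP $i SUBSEP $(i + 1)
            if (!(key in seen)) {                    # new value pair for this column pair
                seen[key] = 1
                classes[i]++
            }
        }
    }
    END {
        sep = ""
        for (i = 7; i < maxnf; i += 2) {
            printf "%s%d", sep, (classes[i] > 1)     # 1 only when 2+ classes were seen
            sep = " "
        }
        print ""
    }' "$1"
}
```

It would be called as pair_indicator data.txt > filter.txt (both file names are placeholders). Only the distinct value pairs per column pair are held in memory, not the rows themselves, so the file is read once from start to finish.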
I would then also like to apply the filtering 0 1 0 0 1 1 to a second file which looks as follows (there is one row for every pair of columns in the first file):
The final output would then be a file containing the column 2 values from those rows of this second file whose corresponding entry in the filter is a 1.
eg: gen2
gen5
gen6
.
.
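A minimal sketch of this filtering step, assuming the 0/1 indicator has been saved as a single space-separated line in its own file and that row n of the second file corresponds to the n-th pair of columns. The function name apply_filter and the file names in the usage note are hypothetical.

```shell
# apply_filter FILTER_FILE SECOND_FILE
# Prints column 2 of every row in SECOND_FILE whose matching filter entry is 1.
apply_filter() {
    awk '
    NR == FNR {                        # first file: the single line of 0/1 flags
        for (i = 1; i <= NF; i++) keep[i] = $i
        next
    }
    keep[FNR] == 1 { print $2 }        # second file: row n is checked against flag n
    ' "$1" "$2"
}
```

It would be called as apply_filter filter.txt second_file.txt > selected.txt. With the filter 0 1 0 0 1 1 this keeps rows 2, 5 and 6, which matches the gen2, gen5, gen6 output listed above.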
Any help would be appreciated. I have code to do this in R and in J, but both are too resource-intensive, and I thought that something using awk might work a lot better.