Read Two Columns - Apply Condition on Six other columns
Hello All,
Here is my input
Here are the conditions
1. First, group the rows that share the same gene in column 6.
2. Within each gene, consider the row with the highest value in column 7.
3. In columns 10 through 14, if at least one column has a value equal to or greater than 0.5, print only that row and exclude all other rows of the same gene.
4. If the row with the highest column 7 value has no value equal to or greater than 0.5 in columns 10 through 14, go to the row with the next highest column 7 value and check the condition again.
5. For any gene, if no row has a value equal to or greater than 0.5 in columns 10 through 14, don't print any records for that gene.
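The five conditions above can be sketched as a sort followed by a single awk pass. This is only a rough sketch under assumptions not stated in the post: the columns are whitespace-separated, there is no header line, and column 7 is a plain decimal number. The function name select_rows and the array name kept are made up for illustration.

```bash
# Hypothetical helper: print at most one row per gene (column 6).
# Assumes whitespace-separated columns and no header line.
select_rows() {
    # Sort by gene, then by column 7 in descending numeric order, so the
    # rows of each gene arrive from highest to lowest column 7 value.
    LC_ALL=C sort -k6,6 -k7,7nr "$1" |
    awk '
        # Skip genes that already had a row printed (conditions 3 and 4).
        !($6 in kept) {
            for (i = 10; i <= 14; i++)
                if ($i + 0 >= 0.5) {   # one hit in columns 10..14 is enough
                    print
                    kept[$6] = 1
                    break
                }
        }
        # Genes with no qualifying row are never printed (condition 5).
    '
}
```

It would be called as select_rows file_in > file_out, where both file names are placeholders. Unlike '!x[$6]++', a gene is marked as done only after one of its rows passes the 0.5 test, so a lower column 7 row can still be picked when the top one fails.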
So my output will be
What did I do so far?
1. I was using the awk '!x[$6]++' command after sorting on column 6, and from there I was piping the output through repeated if statements in awk for columns 10 through 14. I just realized that most of the useful data is being thrown out this way, since that command keeps only the first row of each gene. After carefully looking at the data, I came up with this question.
Thanks in advance. Please comment if you have any questions.
Last edited by jim mcnamara; 12-05-2014 at 04:30 PM..
You can do this in bash using 'while read -a your_array; do ... done < file_in'. For each line/row, read -a splits the line on whitespace and stores the columns in the array, creating it if it does not already exist. Array indices start at 0, so column 6 is ${your_array[5]}. You can then test the columns and decide whether to echo out (reproduce) the row.
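As a rough illustration of that loop, here is a minimal sketch that covers only the 0.5 test on columns 10 through 14, not the per-gene selection by column 7. The function name print_if_hit is made up, and because bash has no floating-point comparison, the sketch assumes awk is available to do the numeric test.

```bash
# Hypothetical helper: echo every row with a value >= 0.5 in columns 10..14.
print_if_hit() {
    local -a your_array
    local i
    while read -r -a your_array; do
        # Bash arrays are zero-based: columns 10..14 are indices 9..13.
        for i in 9 10 11 12 13; do
            # awk exits with status 0 when the value is >= 0.5;
            # a missing column is treated as 0.
            if awk -v v="${your_array[i]:-0}" \
                   'BEGIN { exit !(v + 0 >= 0.5) }' < /dev/null; then
                echo "${your_array[*]}"   # row rejoined with single spaces
                break
            fi
        done
    done < "$1"
}
```

Starting one awk process per column is slow on large files, so this is better suited to small inputs. A single awk pass over the whole file does the same test much faster.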