I have a large file that contains 114 total columns with over 6,000 rows and a header; the final 27 columns are coded in A/T/G/C. There is also a reference column coded A/T/C/G.
e.g. OLD_file
I want to create a new file in which the first 26 columns are unchanged and the final 27 columns are recoded as 0/1. With the column labeled 'ref' as my reference column, I want to apply a logical test to each of those final columns against the reference column:
For the ith column (from column 27 to the end) and the jth row (for all rows in the file): if the (i,j)th entry equals the jth row's entry in the 'ref' column of the OLD_file, the (i,j)th entry in the NEW_file = 1; otherwise the (i,j)th entry in the NEW_file = 0.
So the NEW_file would be recoded:
e.g. NEW_file
I wrote a loop in R for this, but it was too slow. I was hoping gawk would be faster. Any ideas?
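The recoding rule described above amounts to a single pass over the file, one row at a time, so it does not need a nested loop over every cell. Below is a minimal sketch in Python (not gawk, which the post asks about), under several assumptions that the post does not confirm: the file is tab-delimited, the reference column's header is literally `ref`, and recoding starts at column 27 (the post mentions both "final 27 columns" and "column 27 to end", so `first_coded` is left as a parameter). The names `recode` and `recode_file` are hypothetical.

```python
import csv


def recode(rows, ref_name="ref", first_coded=27):
    """Yield rows with columns first_coded..end recoded to "1"/"0".

    rows        -- iterable of lists of strings; the first row is the header
    ref_name    -- header label of the reference column (assumed to be "ref")
    first_coded -- 1-based index of the first column to recode (assumption)
    """
    rows = iter(rows)
    header = next(rows)
    ref_idx = header.index(ref_name)  # position of the reference column
    start = first_coded - 1           # convert to a 0-based index
    yield header                      # the header is passed through unchanged
    for row in rows:
        ref = row[ref_idx]
        # "1" where the cell matches this row's reference allele, else "0"
        yield row[:start] + ["1" if cell == ref else "0" for cell in row[start:]]


def recode_file(old_path, new_path, delimiter="\t", **kwargs):
    """Stream old_path to new_path; the tab delimiter is an assumption."""
    with open(old_path, newline="") as src, open(new_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
        writer.writerows(recode(reader, **kwargs))


# Tiny made-up table, only to show the shape of the result.
demo = [
    ["id", "ref", "s1", "s2"],
    ["r1", "A", "A", "G"],
    ["r2", "C", "T", "C"],
]
for row in recode(demo, first_coded=3):
    print("\t".join(row))
```

For the real file, something like `recode_file("OLD_file", "NEW_file")` would stream the rows without loading the whole table into memory. The same row-by-row comparison carries over to a gawk script that loops from the start column to `NF` on each record.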
Gents
Is it possible to update the code to get the desired output files from the input list? I assigned the first column to a variable.
I need to treat the first column as the key and grep the values in the second column according to the desired request.
input list
(attached)
output1
... (12 Replies)
Guys,
May I know how we can dereference a code reference variable?
my $sum = sub { my ($x, $y) = @_; print "SUM:", ($x + $y), "\n"; };
$sum->(4, 5);    # call the code reference
How can we print the whole function?
Please advise.
Thanks for your time :)
Cheers,
Ranga :) (0 Replies)
I have a genotype.bim file that contains information about SNPs and genotypes. As a hypothetical example, let's say
genotype.bim
snp1 ... A G
snp2 ... G T
snp3 ... G T
snp4 ... G A
...
snpN ... C G
where the first column identifies each SNP and the 5th and 6th columns hold the genotype... (3 Replies)
I have a file that has been partially recoded so that data points that were formerly letter combinations are now -1, 0, or 1. I need to finish recoding the GG and CC data points. The file looks like this:
ID 1 2 3 4 5 6 7 8
83845676 0 0 0 0 CC -1 CC CC
838469. -1 -1 1 GG CC 0 CC 1
83847041... (10 Replies)
Hi, I am new to Unix. I have one input file.
Input file :
ID1~Name1~Place1
ID2~Name2~Place2
ID3~Name3~Place3
I need output in which only the first column changes to a fixed-width column 15 characters long.
Output File:
ID1<<12 spaces>>Name1~Place1
ID2<<12... (5 Replies)
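The fixed-width request above can be sketched in a few lines of Python. This assumes, based only on the sample shown, that `~` is the field separator and that the first `~` is replaced by the padding; the function name `pad_first_field` is hypothetical. Note that `str.ljust` pads but never truncates, so a first field longer than 15 characters would be left at its full length.

```python
def pad_first_field(line, width=15, sep="~"):
    """Pad the first sep-delimited field to `width` characters.

    The first separator is dropped, matching the sample output in the post;
    the remaining separators are kept as they are.
    """
    first, found, rest = line.partition(sep)
    if not found:
        # No separator on this line: pad the whole line as the first field.
        return line.ljust(width)
    return first.ljust(width) + rest


for record in ["ID1~Name1~Place1", "ID2~Name2~Place2", "ID3~Name3~Place3"]:
    print(pad_first_field(record))
```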
Not sure if this is a Linux issue or specific to SuSE Linux, but in their infinite wisdom the developers decided to do away with the dos2unix and unix2dos commands, which were very handy for handling the CR/LF issue between Unix and DOS/Windows files.
More to the point I've created a tr... (1 Reply)