I have a genotype.bim file that contains information about SNPs and their alleles. As a hypothetical example, let's say
genotype.bim
where the first column identifies each SNP and the 5th and 6th columns hold the genotype information.
The first step is to designate the first allele of each SNP in the bim file and recode it as 0, then recode the second allele as 1. So for snp1, A=0, G=1; for snp2, G=0, T=1; for snp3, G=0, T=1; and so forth. Then we apply these designations to the genotype.ped file.
genotype.ped
The first two columns are ID numbers (they are identical). The succeeding two columns (3rd, 4th) correspond to snp1, the next two (5th, 6th) correspond to snp2, etc.; each SNP occupies two columns of genotype information in the ped file. Now I want to recode the alleles in the same way as was done for the bim file. For snp1, A=0, G=1, so the 3rd and 4th columns of the first row will be 0 0 (A A), and the 5th and 6th columns will be 0 1 (G T), because for snp2, G=0, T=1.
The desired output would then look like
If you can contribute an idea for how to write a generalized script for this problem (I have thousands of SNPs and individuals), your help will be really appreciated. Thanks in advance!
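One possible approach to the recoding described above is a two-file awk pass: read the bim file first to remember each SNP's two alleles, then rewrite the allele columns of the ped file. The following is only a sketch under the layout stated in the post (alleles in bim columns 5 and 6, two ID columns in the ped file, SNP column pairs in the same order as the bim rows). The function name `recode_ped` and its optional third argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: recode .ped alleles as 0/1 using the allele order in the .bim file.
# Assumptions (taken from the post, not from any file-format specification):
#   - .bim column 5 is the first allele (-> 0), column 6 the second (-> 1)
#   - the .ped file starts with 2 ID columns, then one column pair per SNP,
#     in the same order as the rows of the .bim file
recode_ped() {
    local bim=$1 ped=$2 lead=${3:-2}   # lead = number of leading ID columns
    awk -v lead="$lead" '
        # First file (.bim): remember both alleles of the n-th SNP.
        NR == FNR { a0[FNR] = $5; a1[FNR] = $6; next }
        # Second file (.ped): rewrite every allele column in place.
        {
            for (i = lead + 1; i <= NF; i++) {
                snp = int((i - lead + 1) / 2)   # 1-based SNP index of column i
                if      ($i == a0[snp]) $i = 0
                else if ($i == a1[snp]) $i = 1
                # any other value (e.g. a missing-genotype code) is left as-is
            }
            print
        }
    ' "$bim" "$ped"
}
```

It could be called as `recode_ped genotype.bim genotype.ped > recoded.ped`, where `recoded.ped` is just a placeholder name. Modified lines are written with single spaces between fields. If the real ped file has more leading columns than the two described here, pass that count as the third argument.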
Moderator's Comments:
Please use code tags when posting data and code samples!
Last edited by vgersh99; 08-09-2012 at 11:07 AM..
Reason: code tags, please!
Hi,
I need to find out whether the first character in a line is a letter or a number. If it's a number, I should sort numerically. If it's a letter, I should sort by ASCII value. And if it is anything other than a letter or a number, I should also sort by ASCII value.
The code I used... (2 Replies)
Not sure if this is a Linux issue or specific to SuSE Linux, but the developers, in their infinite wisdom, decided to do away with the dos2unix and unix2dos commands, which were very handy for handling the CR/LF issue between Unix and DOS/Windows files.
More to the point I've created a tr... (1 Reply)
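For readers without dos2unix and unix2dos, a minimal stand-in can be built from tr and awk. This is a sketch, not the poster's truncated solution; the function names are hypothetical, and both functions read standard input and write standard output.

```shell
# Sketch: filter-style replacements for dos2unix / unix2dos.
# Note: dos_to_unix removes every carriage return, not only those before a newline.
dos_to_unix() {
    tr -d '\r'
}

# Strip any existing trailing CR first so lines never end in CR CR LF.
unix_to_dos() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```

Usage would look like `dos_to_unix < input.txt > output.txt`, with both file names being placeholders.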
I have a file that has been partially recoded so that data points that were formerly letter combinations are now -1, 0, or 1. I need to finish recoding the GG and CC data points. The file looks like this:
ID 1 2 3 4 5 6 7 8
83845676 0 0 0 0 CC -1 CC CC
838469. -1 -1 1 GG CC 0 CC 1
83847041... (10 Replies)
Hello,
I have a large file that contains 114 total columns with over 6,000 rows and a header; the final 27 columns are coded in A/T/G/C. There is also a reference column coded A/T/C/G.
e.g. OLD_file
col1 col2 3 ref ... 27 28 29 30 ...
1 r 22 A ... G A G A ...
2 f 22 C ... T T C T ...... (2 Replies)
I have a text file in the following format
CCCCCGCCCCCCCCCCcCCCCCCCCCCCCCCC
AAAATAAAAAAAAAAAaAAAAAAAAAAAAAAA
TGTTTTTTTTTTTTGGtTTTTTTTTTTTTTTT
TTTT-TTTTTTTTTCTtTTTTTTTTTTTTTTT
Each row/line will have 32 letters and each line will only have multiple occurrences of 2 letters out of a pool... (1 Reply)
Hello,
I have to find out whether the last character is a digit or a letter. I managed to extract the last character but would need some help if there is a one-liner available to test the above.
set x = WM
echo $x | sed 's/.*\(.$\)/\1/'
O/P
M
I would like a one liner code to test whether the... (1 Reply)
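One way to test the last character without extracting it first is a `case` pattern match. The snippet above uses csh syntax; the sketch below assumes a POSIX-style shell such as bash instead, and the function name `last_char_type` is hypothetical.

```shell
# Sketch: classify the last character of a string by pattern matching.
last_char_type() {
    case $1 in
        *[0-9])        echo digit ;;   # string ends in a digit
        *[[:alpha:]])  echo alpha ;;   # string ends in a letter
        *)             echo other ;;   # empty string or any other character
    esac
}
```

Called as `last_char_type "$x"`, it reports the class of the final character of `$x`.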
I wanted to know if there was a more efficient way to do this. I was going to set up a conditional for every letter of the alphabet, like so (I am parsing an array):
for i in "${arr[@]}"; do
if [[ $i == A* ]]; then
echo "$i starts with A"
else echo "$i does not start with A"
fi
done
I want to do this A-Z, is there... (6 Replies)
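Rather than one conditional per letter, the first character can be extracted once and reported directly. This is a sketch assuming bash; the function name `report_first_letters` is hypothetical, and it takes the array elements as arguments.

```shell
# Sketch: report the first character of each argument, avoiding 26 conditionals.
report_first_letters() {
    local item first
    for item in "$@"; do
        first=${item:0:1}                       # first character (empty if item is empty)
        if [[ $first == [[:alpha:]] ]]; then
            echo "$item starts with $first"
        else
            echo "$item does not start with a letter"
        fi
    done
}
```

With an array it would be called as `report_first_letters "${arr[@]}"`.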
Hi everyone, I can't make this script work,
#! /bin/bash
declare -A crypt=( ="A" ="a" ="B" ="b" ="C" ="c" =' ' ='!' )
encode () {
local word=$1
for ((i=0; i<${#word}; ++i)) ; do
local char=${word:$i:1}
printf %s' ' ${crypt}
done
... (5 Replies)