I wrote the following script in R, but I cannot run it because the data file is too big, so I need to rewrite it as a shell script. Could you please help me?
######################################
Code:
data=as.matrix(read.table("data.txt"))
file=as.matrix(read.table("file.txt"))
n1=dim(file)[1] # number of lines in file.txt
n2=dim(data)[1] # number of lines in data.txt
control=file[,3:4] # 3rd and 4th columns of file.txt (the two alleles of each SNP)
new=matrix(nrow=n1, ncol=1) # new matrix to store the output
count=0
for (j in 1:n1)
{
count=count+1
for (i in 1:n2)
{
# TRUE when either allele of individual i at SNP j is not one of the two alleles listed in file.txt
if (!all(data[i, ((2*j)-1):(2*j)] %in% control[j, ]))
{
new[count]=file[j,1]
}
}
}
################################
data.txt is genotype data and looks like
G A G A G A G G G A G A ...
G A G G G A A G G G G G ...
...
G A G A G A G A ...
file.txt looks like
snp1 265 G T
snp2 546 A G
snp3 905 A G
snp4 965 T G
...
new.txt, which is the output, should look like
snp1
snp4
...
So the algorithm compares a pair of columns from data.txt,
e.g. the 1st and 2nd columns
G A
G A
..
G A
with the 3rd and 4th columns of the 1st line of file.txt (G T), and if a row is not any of the combinations (G T, G G, T G, T T), it reports that SNP to new.txt
#!/usr/bin/ksh
cut -d' ' -f1,2 data.txt > data2.txt
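# Each line fed to the loop looks like: G A snp1 265 G T
# (row k of data2.txt pasted next to row k of file.txt)
paste data2.txt file.txt |
while read m1 m2 m3 m4 m5 m6; do
# Print the SNP name when the first genotype equals the two alleles exactly
if [[ "${m1}" = "${m5}" && "${m2}" = "${m6}" ]]; then
echo "${m3}"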
fi
done
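The loop above pairs row k of data.txt with row k of file.txt and only looks at the first genotype. A sketch that follows the algorithm described in the question more closely is below. It assumes whitespace-separated fields, that SNP number j occupies fields 2j-1 and 2j of every line in data.txt, and that file.txt is non-empty and small enough to hold in memory (data.txt is streamed line by line, so its size should not matter). The function name report_snps is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the names of SNPs for which at least one
# individual carries an allele that is not one of the two listed in file.txt.
# Usage: report_snps file.txt data.txt
report_snps() {
    awk '
        # First file (file.txt): remember the name and both alleles of SNP number FNR.
        NR == FNR { name[FNR] = $1; a1[FNR] = $3; a2[FNR] = $4; n = FNR; next }
        # Second file (data.txt): one individual per line, SNP j in fields 2j-1 and 2j.
        {
            for (j = 1; j <= n && 2 * j <= NF; j++) {
                if (bad[j]) continue            # already flagged, skip the comparison
                x = $(2 * j - 1); y = $(2 * j)
                if ((x != a1[j] && x != a2[j]) || (y != a1[j] && y != a2[j]))
                    bad[j] = 1
            }
        }
        # Print the flagged SNP names in file.txt order.
        END { for (j = 1; j <= n; j++) if (bad[j]) print name[j] }
    ' "$1" "$2"
}
```

It could be called as `report_snps file.txt data.txt > new.txt`. Only one flag per SNP is kept in memory, so the cost is one pass over data.txt.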