An algorithm to be written in linux command


 
# 1  
Old 10-19-2011

Hi All,

I wrote the following script in R. However, I cannot run it because the data file is too big, so I need to rewrite it as a shell script. Could you please help me?

######################################
Code:
# Read everything as character so that the allele "T" is not parsed as TRUE.
data <- as.matrix(read.table("data.txt", colClasses = "character"))
file <- as.matrix(read.table("file.txt", colClasses = "character"))
n1 <- nrow(file)  # number of lines in file.txt (one per SNP)
n2 <- nrow(data)  # number of lines in data.txt (one per sample)
control <- file[, 3:4, drop = FALSE]  # 3rd and 4th columns of file.txt
new <- matrix(nrow = n1, ncol = 1)    # SNP name if reported, NA otherwise
for (j in 1:n1)
{
  for (i in 1:n2)
  {
    # SNP j sits in columns 2j-1 and 2j of data.txt. The pair is one of the
    # four allowed combinations exactly when both alleles occur in control[j, ].
    if (!all(data[i, ((2 * j) - 1):(2 * j)] %in% control[j, ]))
    {
      new[j] <- file[j, 1]
      break  # one unexpected pair is enough to report the SNP
    }
  }
}

################################
data.txt is genotype data and looks like

G A G A G A G G G A G A ...
G A G G G A A G G G G G ...
...
G A G A G A G A ...

file.txt looks like

snp1 265 G T
snp2 546 A G
snp3 905 A G
snp4 965 T G
...

The output, new.txt, should look like

snp1
snp4
...

So the algorithm takes a pair of columns from data.txt,
e.g. the 1st and 2nd columns

G A
G A
..
G A

and compares it with the 3rd and 4th columns of the 1st line of file.txt (G T). If a pair is not one of the combinations (G T, G G, T G, T T), then the SNP name is written to new.txt.
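
As a rough sketch of the comparison described above, the shell function below uses awk to do the same check. The name find_unexpected_snps is made up for illustration, and the sketch assumes whitespace-separated fields, with SNP j stored in columns 2j-1 and 2j of data.txt. awk reads data.txt one line at a time, so only the SNP list is held in memory.

```shell
#!/bin/sh
# find_unexpected_snps SNP_FILE GENOTYPE_FILE   (hypothetical helper name)
# Prints the name of every SNP for which at least one line of GENOTYPE_FILE
# carries an allele that is not listed in columns 3 and 4 of SNP_FILE.
find_unexpected_snps() {
  awk '
    # First file (SNP list): remember the name and the two expected alleles.
    FILENAME == ARGV[1] {
      name[FNR] = $1; a1[FNR] = $3; a2[FNR] = $4; n = FNR
      next
    }
    # Second file (genotypes): SNP j occupies fields 2j-1 and 2j.
    {
      for (j = 1; j <= n && 2 * j <= NF; j++) {
        if (bad[j]) continue          # already reported, skip the check
        x = $(2 * j - 1); y = $(2 * j)
        # Both alleles must be among the expected ones, in any order.
        if ((x != a1[j] && x != a2[j]) || (y != a1[j] && y != a2[j])) {
          bad[j] = 1
        }
      }
    }
    # Report the flagged SNPs in the order they appear in the SNP list.
    END {
      for (j = 1; j <= n; j++) {
        if (bad[j]) print name[j]
      }
    }
  ' "$1" "$2"
}
```

It could be called as `find_unexpected_snps file.txt data.txt > new.txt`.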

Does that make sense?

Thanks in advance,



Last edited by senayasma; 10-19-2011 at 01:13 PM..
# 2  
Old 10-19-2011
So for each record in "data.txt", you want to compare $1 and $2 with $3 and $4 of the corresponding record in "file.txt".

If they are not the same, then display $1 from "file.txt".

Is this correct?
# 3  
Old 10-19-2011
Yes, this is correct.
# 4  
Old 10-19-2011
See if this works for you:
Code:
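#!/usr/bin/ksh
# Keep only the first genotype pair (columns 1 and 2) of each data line.
cut -d' ' -f1,2 data.txt > data2.txt
# Input for loop will be: G A snp1 265 G T
paste data2.txt file.txt |
while read -r m1 m2 m3 m4 m5 m6; do
  # Print the SNP name when the pair differs from columns 3 and 4 of file.txt.
  if [[ "${m1}" != "${m5}" || "${m2}" != "${m6}" ]]; then
    echo "${m3}"
  fi
done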
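
The loop above compares the pair with columns 3 and 4 of file.txt in the given order only, while the original post allows four combinations (for G T: G T, G G, T G, T T). A minimal sketch of that order-insensitive test follows. The function name is_expected_pair is hypothetical, and the pattern matching relies only on a plain `case` statement, so it should behave the same in ksh, bash and sh.

```shell
#!/bin/sh
# is_expected_pair G1 G2 A1 A2   (hypothetical helper name)
# Succeeds when the genotype "G1 G2" is one of the four combinations
# A1 A1, A1 A2, A2 A1 or A2 A2, and fails otherwise.
is_expected_pair() {
  case "$1 $2" in
    # Quoted patterns are matched literally, so alleles are compared as text.
    "$3 $3" | "$3 $4" | "$4 $3" | "$4 $4") return 0 ;;
    *) return 1 ;;
  esac
}
```

Inside the loop, the test could then be written as `is_expected_pair "$m1" "$m2" "$m5" "$m6" || echo "$m3"`. Note that the loop would still read only the first two columns of data.txt.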
