Problem with multiple grep in bash loop


 
# 1  
Old 02-21-2013

Hello,
I am trying to create a matrix of 0's and 1's depending on whether a gene and sample name are found in the same line in a file called results.txt. An example of the results.txt file is (tab-delimited):

Sample1 Gene1 ## Gene2 ##
Sample2 Gene2 ## Gene4 ##
Sample3 Gene3 ## Gene6 ##

The matrix should look like this:
Code:
        Sample1   Sample2   Sample3
Gene1   1         0         0
Gene2   0         1         0
Gene3   0         0         1

where 1 = 'match found for both gene and sample' and 0 for 'no match'.

My code reads in three files: GenesList.txt (gene names), SampleList.txt (Sample names), and the results.txt file. The output matrix should be Matrix.out.

The two greps piped together on a single line do not seem to be working properly for me, or maybe I have a problem with my if statement?

In the resulting matrix I expect any row to contain both 0's and 1's, but I end up with an entire matrix of only 1's or only 0's. What am I missing in this bash script?

Code:
for Gene in `cat GenesList.txt`; #Read all the genes in the file
do
      echo -n -e $Gene '\t' >> FusionMatrix.out # Echo gene name (designates the row in my table)
        for Sample in `cat SampleList.txt`;
        do
    grep_output= grep $Sample result.txt | grep $Gene

        #if ["$grep_output" == ""] #my first attempt
        if [[ $grep_output ]]; # second attempt
        then echo -n -e "1" '\t' >> Matrix.out
        else echo -n -e "0" '\t' >> Matrix.out
        fi # Value of 1 if gene involved in that sample, else 0
        done

    echo "">> Matrix.out #Move to next line
done

Thanks for any advice!!
# 2  
Old 02-22-2013
This code is certainly not portable, but if the row labels and the 0/1 columns are otherwise coming out correctly, the simple changes below (the grep_output assignment and the if test) may be enough to get this script to work on your system. Since I have no idea what your input files actually look like, I have no way of testing this suggestion.

Code:
for Gene in `cat GenesList.txt`; #Read all the genes in the file
do
    # Assumption: Matrix.out (not FusionMatrix.out) and results.txt (not result.txt),
    # to match the file names given in the description
    echo -n -e $Gene '\t' >> Matrix.out # Echo gene name (designates the row in my table)
    for Sample in `cat SampleList.txt`;
    do
        # Changed: capture the pipeline's output with $(...) and quote the patterns
        grep_output=$(grep "$Sample" results.txt | grep "$Gene")

        #if ["$grep_output" == ""] #my first attempt
        #if [[ $grep_output ]]; # second attempt
        # Changed: test for a non-empty string
        if [ "$grep_output" != "" ]
        then echo -n -e "1" '\t' >> Matrix.out
        else echo -n -e "0" '\t' >> Matrix.out
        fi # Value of 1 if gene involved in that sample, else 0
    done

    echo "">> Matrix.out #Move to next line
done

Note that the spaces before and after the opening square bracket and the space before the closing square bracket in the if statement are crucial.
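To see why the original assignment never stored anything, here is a small sketch. The variable name demo is made up for illustration. Writing `var= command` (with a space after the `=`) runs the command with var set to an empty string in that command's environment only, so the shell variable itself is never assigned; `var=$(command)` captures the command's output.

```shell
#!/usr/bin/env bash
# Illustration only; 'demo' is a made-up variable name, not from the script above.

unset demo
demo= printf 'hello\n' > /dev/null   # runs printf with demo="" in its environment only
echo "after 'demo= cmd':    ${demo-unset}"

demo=$(printf 'hello\n')             # command substitution captures the output
echo "after 'demo=\$(cmd)': $demo"

# The spaces around [ and before ] matter: [ is a command and ] is its last argument.
if [ "$demo" != "" ]; then
    echo "demo is non-empty"
fi
```

With the variable never set, a test such as `[[ $grep_output ]]` is always false, which would explain a matrix of all 0's.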

This could be made much more efficient by getting rid of the calls to cat and by using grep -c for your last grep or by replacing most of this shell code with an awk script, but before I would suggest the changes needed to do that, I would need some actual data to verify that my suggested changes would work.
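As a rough idea of what the awk route might look like, here is an untested-on-real-data sketch. It assumes results.txt is tab-delimited with the sample name in field 1 and gene names among the remaining fields, and that the two list files hold one name per line. The function name build_matrix is hypothetical. Note that it compares whole fields, so unlike the grep pipeline, Gene1 will not match Gene10.

```shell
#!/usr/bin/env bash
# Sketch only, under the assumptions stated above.

build_matrix() {   # hypothetical helper: build_matrix RESULTS SAMPLES GENES
    awk -F'\t' '
        FILENAME == ARGV[1] {          # results file: remember each (sample, gene) pair
            for (i = 2; i <= NF; i++) seen[$1, $i] = 1
            next
        }
        FILENAME == ARGV[2] {          # sample list: remember the column order
            if ($1 != "") samples[++ns] = $1
            next
        }
        $1 != "" {                     # gene list: emit one row per gene
            if (!printed++) {          # header row, printed once
                row = ""
                for (j = 1; j <= ns; j++) row = row "\t" samples[j]
                print row
            }
            row = $1
            for (j = 1; j <= ns; j++)
                row = row "\t" (((samples[j], $1) in seen) ? 1 : 0)
            print row
        }
    ' "$1" "$2" "$3"
}

# Usage (file names taken from the first post):
#   build_matrix results.txt SampleList.txt GenesList.txt > Matrix.out
```

This reads each file once instead of running two greps per gene/sample pair, and it needs no cat.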

Good luck...
# 3  
Old 02-22-2013
If I understand what you want to do, you can do something like the following:

Code:
$ cat t
Sample1 Gene1   ##      Gene2   ##
Sample2 Gene2   ##      Gene4   ##
Sample3 Gene3   ##      Gene6   ##

$ cat test.sh
s="Sample1"
g1="Gene1"
g2="Gene2"
grep -q "$s[[:cntrl:]]$g1\|$s[[:cntrl:]][[:alnum:]]*[[:cntrl:]]##[[:cntrl:]]$g1" t
if [ $? = 0 ]; then
  echo "Found First" 
else
  echo "Not Found First"
fi
grep -q "$s[[:cntrl:]]$g2\|$s[[:cntrl:]][[:alnum:]]*[[:cntrl:]]##[[:cntrl:]]$g2" t
if [ $? = 0 ]; then
  echo "Found second"
else
  echo "Not Found second"
fi

$ test.sh
Found First
Found second
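The same test could be wrapped in a small function so it can be called from a loop over the gene and sample lists. This is only a sketch: has_pair is a hypothetical name, and the pattern is my condensed version of the two alternatives above, written for grep -E with an optional group instead of `\|`. It still matches substrings, so Gene1 would also match a line containing Gene10.

```shell
#!/usr/bin/env bash
# Sketch only; has_pair is a hypothetical helper, not part of the script above.

has_pair() {   # has_pair SAMPLE GENE FILE: succeeds if both appear on one line
    local s=$1 g=$2 file=$3
    # SAMPLE<tab>GENE, or SAMPLE<tab>other<tab>##<tab>GENE
    grep -Eq "${s}[[:cntrl:]]([[:alnum:]]*[[:cntrl:]]##[[:cntrl:]])?${g}" "$file"
}

# Usage, testing the exit status directly instead of checking $? afterwards:
#   if has_pair "Sample1" "Gene2" t; then echo "Found"; else echo "Not Found"; fi
```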

 