Fixing a shell script


 
# 1  
Old 10-09-2015

I wrote this shell script to check whether an input file is empty, remove every line that starts with the sign ">" (without the quotation marks), and print the number of non-empty lines and empty lines in the file. The script then creates two output files, DNA.out and RNA.out.
First, I get an error message that says:

Code:
./script.sh: line 3: [: dna_input.txt: integer expression expected

but it gives me the results I want.
Here is the code:

Code:
#!/bin/bash                                                                                                                      
#check to see if there is an input file:                                                                                         
if [ $1 -lt 1 ]
then
  echo "Usage: $0 file ..."
  exit 1
fi

#Check if the file is empty or not                                                                                               
file=$1
if [[ -s $1 ]]
then
  echo ""
  echo "**** $file has data."
  echo "Number of non-empty lines:"
  grep -cve '^\s*$' $file
  echo "Number of empty lines:"
  grep -ce '^\s*$' $file
  #grep -cvP '^\s*$' $file -- above line originally was like this one                                                            
  echo ""
  cat $1 | sed 's/>/\n>/g' > temp1.txt
  cat temp1.txt | sed '/^>/ d' > temp2.txt
  #Remove duplicate empty lines:                                                                                                 
  awk '!NF{if(++n <=1) print; next}; {n=0; print}' < temp2.txt > DNA.out
  echo ""
  #convert to mRNA and remove temp1, temp2 to avoid confusion                                                                    
  tr ACGT UGCA < DNA.out > RNA.out
  rm temp1.txt temp2.txt

else
  echo "**** $1 has no data, or file does not exist."
  echo "**** done!"
  echo ""
fi;

After I get the two output files (DNA.out and RNA.out) I use another script to convert the contents of these two files into Amino Acids. The conversion script is:
Code:
#!/bin/sh                                                                                                                        
while read rna;do
  aawork=$(echo "${rna}" |sed -n -e 's/\(...\)/\1 /gp' | sed -f rna.sed)
  echo "$aawork" | sed 's/ //g'
  echo "$aawork" | tr ' ' '\012' | sort | sed '/^$/d' | uniq -c | sed 's/[ ]*\([0-9]*\) \(.*\)/\2: \1/'
done

This is how I use it:
Code:
./conversion.sh < RNA.out

where rna.sed is:
Code:
s/UUU /Phe /g
s/UUC /Phe /g
s/UUA /Leu /g
s/UUG /Leu /g
s/UCU /Ser /g
s/UCC /Ser /g
s/UCA /Ser /g
s/UCG /Ser /g
s/UAU /Tyr /g
s/UAC /Tyr /g
s/UAA /STOP /g
s/UAG /STOP /g
s/UGU /Cys /g
s/UGC /Cys /g
s/UGA /STOP /g
s/UGG /Trp /g
s/CUU /Leu /g
s/CUC /Leu /g
s/CUA /Leu /g
s/CUG /Leu /g
s/CCU /Pro /g
s/CCC /Pro /g
s/CCA /Pro /g
s/CCG /Pro /g
s/CAU /His /g
s/CAC /His /g
s/CAA /Gln /g
s/CAG /Gln /g
s/CGU /Arg /g
s/CGC /Arg /g
s/CGA /Arg /g
s/CGG /Arg /g
s/AUU /Ile /g
s/AUC /Ile /g
s/AUA /Ile /g
s/AUG /Met /g
s/ACU /Thr /g
s/ACC /Thr /g
s/ACA /Thr /g
s/ACG /Thr /g
s/AAU /Asn /g
s/AAC /Asn /g
s/AAA /Lys /g
s/AAG /Lys /g
s/AGU /Ser /g
s/AGC /Ser /g
s/AGA /Arg /g
s/AGG /Arg /g
s/GUU /Val /g
s/GUC /Val /g
s/GUA /Val /g
s/GUG /Val /g
s/GCU /Ala /g
s/GCC /Ala /g
s/GCA /Ala /g
s/GCG /Ala /g
s/GAU /Asp /g
s/GAC /Asp /g
s/GAA /Glu /g
s/GAG /Glu /g
s/GGU /Gly /g
s/GGC /Gly /g
s/GGA /Gly /g
s/GGG /Gly /g
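
For readers following along: the first sed in conversion.sh only cuts each line into three-letter groups, each followed by a space, which is why every pattern in rna.sed ends in a space. Below is a small sketch of that step in isolation. The function name split_codons is made up for illustration and is not part of the original scripts.

```shell
#!/bin/sh
# split_codons: hypothetical helper that isolates the first sed step
# of conversion.sh. It inserts a space after every complete triple.
split_codons() {
  printf '%s\n' "$1" | sed -n -e 's/\(...\)/\1 /gp'
}

split_codons AUGGCCUAA   # prints "AUG GCC UAA " (note the trailing space)
split_codons AUGGC       # prints "AUG GC"; the incomplete group is left as-is
split_codons AU          # prints nothing: with -n, p only prints after a substitution
```

Because an incomplete trailing group never gets a space appended, none of the rna.sed patterns can match it, so it passes through unchanged.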

Now I want to know if I can put these two scripts together in one file and, if possible, clean up the script.

My input file (sample) to be used with the first script (script.sh) is here:
Code:
>Header_Sequence_1
GTACGACGGAGTGTTATAAGATGGGAAATCGGATACCAGATGAAATTGTGGATCGGTGCAAAA
GTCGGCAGATATCGTTGAAGTCATAGGTGATTATGTTCAATTAAAGAAGCAAGGCCGAAACTAC
TTTGGACTCTGTCCTTTTCATGGAGAAAGCACACCTTCGTTTTCCGTATCGCCCGACAAACAGAT
TTTTCATTGCTTTGGCTGCGGAGCGGGCGGCAATGTTTTCTCTTTTTTAAGGCAGATGGAAGGCT
ATTCTTTTGCCGAGTCGGTTTCTCACCTTGCTGACAAATACCAAATTGATTTTCCAGATGATATAA
CAGTCCATTCCGGAGCCCGGCCAGAG

>Header_Sequence_2
TCTTCTGGAGAACAAAAAATGGCTGAGGCACATGAGCTCCTGAAGAAATTTTACCATCATTTGT
TAATAAATACAAAAGAAGGTCAAGAGGCACTGGATTATCTGCTTTCTAGGGGCTTTACGAAAGA
GCTGATTAATGAATTTCAGATTGGCTATGCTCTTGATTCTTGGGACTTTATCACGAAATTCCTTGT
AAAGAGGGGATTTAGTGAGGCGCAAATGGAAAAAGCGGGTCTCCTGATCAGACGCGAAGACGGAAGCGGATATTTCGACCGCTTCAGAAACC
GTGTCATGTTTCCGATCCATGATCATCACGGGGCTGTTGTTGCTTTCTCAGGCAGGGCTCTTGG

>Header_Sequence_3
CCGCTGTATTCTCAGCCAAGCGGTATAGTCTCCGCTGTATTCTCAGCCCCAGCCGTTCCACTCAG
AGGAACTTTAAAGGATGTTCCTGTTGAGGGCTCATCATCGTCATCGTCATCATCATCATCATCAT
CATCATCATCATCATCAACATCAACCGTCGCACCAGCAAATAAGGCAAGAACTGGAGAAGACGC
AGAAGGCAGTCAAGATTCTAGTGGTACTGAAGCTTCTGGTAGCCAGGGTTCTGAAGAGGAAGG
TAGTGAAGACGATGGCCAAACTAGTGCTGCTTCCCAACCCACTACTCCAGCTCAAAGTGAAGGC
GCAACTACCGAAACCATAGAAGCTACTCCAAAAGAAGAATGCGGCACTTCATTTGTAATGTGGT
TCGGAGAAGGTACCCCAGCTGCGACATTGAAGTGTGGTGCCTACACTATCGTCTATGCACCTAT
AAAAGACCAAACAGATCCCGCACCAAGATATATCTCTGGTGAAGTTACATCTGTAACCTTTGAA
AAGAGTGATAATACAGTTAAAATCAAGGTTAACGGTCAGGATTTCAGCACTCTCTCTGCTAATTC
AAGTAGTCCAACTGAAAATGGCGGATCTGCGGGTCAGGCTTCATCAAGATCAAGAAGATCACT
CTCAGAGGAAACCAGTGAAGCTGCTGCAACCGTCGATTTGTTTGCCTTTACCCTTGATGGTGGT
AAAAGAATTGAAGTGGCTGTACCAAACGTCGAAGATGCATCTAAAAGAGACAAGTACAGTTTG
GTTGCAGACGATAAACCTTTCTATACCGGCGCAAACAGCGGCACTACCAATGGTGTCTACAGGT
TGAATGAGAACGGAGACTTGGTTGATAAGGACAACACAGT

To sum up what I do:
Code:
./script.sh input.txt

Which generates: DNA.out and RNA.out (with the error I mentioned), then:
Code:
./conversion.sh < RNA.out

where conversion.sh uses rna.sed


I hope I have made my questions clear, and I appreciate your help.

Last edited by faizlo; 10-09-2015 at 05:15 AM.. Reason: minor edit
# 2  
Old 10-09-2015
This one creates three files: DNA.OUT, RNA.OUT, and AminoAcids from the input file, and prints the numbers of non-empty and empty lines in the file:
Code:
awk '
/^[     ]*$/    {EMP++
                 next
                }
/^>/    {$0=""
        }

        {print > "DNA.OUT"
         gsub (/A/, "U")
         gsub (/C/, "c")
         gsub (/G/, "C")
         gsub (/T/, "A")
         gsub (/c/, "G")
         print > "RNA.OUT"
         gsub (/.../, "& ")
         gsub (/UU[CU] /, "Phe ")
         gsub (/UA[UC] /, "Tyr ")
         gsub (/GC[ACGU] /, "Ala ")
         gsub (/GG[ACGU] /, "Gly ")
         gsub (/CC[ACGU] /, "Pro ")
         gsub (/AC[ACGU] /, "Thr ")
         gsub (/GU[ACGU] /, "Val ")
         gsub (/(CG[ACGU]|AG[AG]) /, "Arg ")
         gsub (/(CU[ACGU]|UU[AG]) /, "Leu ")
         gsub (/(UC[ACGU]|AG[CU]) /, "Ser ")
         print > "AminoAcids"
        }

END     {print "lines: ", NR-EMP
         print "empty: ", EMP
        }
' file

I did NOT convert ALL of the RNA-Amino combinations from your sed file; that should be easily doable for you with the samples given...
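
The gsub sequence above uses a lowercase c as a temporary placeholder so that the C-to-G and G-to-C swaps do not undo each other. Here is a minimal sketch of just that step, assuming any POSIX awk. The wrapper name complement is invented for this example.

```shell
#!/bin/sh
# complement: hypothetical wrapper; prints the mRNA complement of a DNA string.
complement() {
  printf '%s\n' "$1" | awk '{
    gsub(/A/, "U")   # A -> U
    gsub(/C/, "c")   # park C in a placeholder so the next step cannot touch it
    gsub(/G/, "C")   # G -> C
    gsub(/T/, "A")   # T -> A (safe: every original A is already U)
    gsub(/c/, "G")   # placeholder -> G
    print
  }'
}

complement ACGT   # prints UGCA
complement GTAC   # prints CAUG
```

This gives the same mapping as the tr ACGT UGCA call in the original script; the placeholder is only needed because gsub applies the substitutions one after another instead of all at once.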
# 3  
Old 10-09-2015
Hello Faizlo
Your error:
Code:
./script.sh: line 3: [: dna_input.txt: integer expression expected

is produced by comparing a string with an integer.
Code:
#!/bin/bash                                                                                                                      
#check to see if there is an input file:                                                                                         
if [ $1 -lt 1 ]
then
  echo "Usage: $0 file ..."
  exit 1
fi

You probably wanted either:
if [ -z "$1" ] (if it's empty, show the error)
or
if [ "$#" -eq 1 ] (if it's just one string/word; note that ${#1} would give the length of $1, not the number of arguments)

hth
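
Since $# (the number of arguments) and ${#1} (the length of the first argument) are easy to mix up, a quick check may help. This is a minimal sketch; the function show_args is a made-up helper used only to print both values.

```shell
#!/bin/sh
# show_args: hypothetical helper, used only to illustrate the difference.
show_args() {
  # $# is the number of arguments; ${#1} is the length of the first argument.
  echo "count=$# length_of_first=${#1}"
}

show_args dna_input.txt   # prints count=1 length_of_first=13
show_args                 # prints count=0 length_of_first=0
show_args a b c           # prints count=3 length_of_first=1
```

For a "was a file name given at all" check, the count ($#) or the emptiness test (-z "$1") is the value to look at.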
# 4  
Old 10-09-2015
I wonder if
Code:
#!/bin/bash                                                                                                                      
#check to see if there is an input file:                                                                                         
if [ $1 -lt 1 ]
then
  echo "Usage: $0 file ..."
  exit 1
fi

which is wrong, as the error shows (you can't compare a string with an integer, which is what -lt expects), should not be more like:
Code:
#!/bin/bash                                                                                                                      
#check to see if there is an input file:                                                                                         
if [ "$#" -lt 1 ]
then
  echo "Usage: $0 file ..."
  exit 1
fi

# 5  
Old 10-09-2015
Thank you all for your help. The error has gone.

RudiC: thank you for your awk script.

Can my scripts be combined into just one script? My script gives the frequency of each amino acid. I would also like to learn how to add a while loop to a script when the loop takes input of its own.

Thank you all again
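
On the while-loop question: a loop that reads lines does not have to take them from the script's standard input. Redirecting a file into done feeds just that loop. The following is a minimal sketch, assuming the file was written earlier in the same script. The function name read_rna is hypothetical.

```shell
#!/bin/sh
# read_rna: hypothetical helper that reads a file line by line inside a script.
read_rna() {
  while IFS= read -r rna; do
    # Each iteration sees one line of the file named in $1.
    printf 'line: %s\n' "$rna"
  done < "$1"        # the redirection applies only to this loop
}

# Usage inside a combined script, after RNA.out has been written:
#   read_rna RNA.out
```

With this form the combined script can still be started as ./script.sh input.txt, and the conversion loop reads RNA.out without a separate ./conversion.sh < RNA.out call.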

Last edited by faizlo; 10-09-2015 at 08:14 PM..
# 6  
Old 10-10-2015
I like my code to run in /tmp when I am using static files.
Code:
#!/bin/bash
# Copy each matching file to /tmp and report whether the copy worked.
for i in /home/*/Desktop/test.txt; do
  cp "$i" /tmp/test.txt && echo "Usage: $i file ... " || echo "File is not where you think it should be"
  sleep 3
done
exit 0

Consider trapping (I think that is what it is called; search for it) the file. After checking that the file exists with the code above, follow up with ...
Code:
cd /tmp
sudo ./test.txt

Then, within test.txt, add the lines below, which also check that the file was placed in /tmp before running your test.txt file.
This is called trapping, and it ensures the given command runs when the program exits. You can also run subroutines under the trap before running the main code body.
Code:
FILE=/tmp/$(basename $0)
trap 'cp /your/file/ /to/anydir && echo " ... "' EXIT
if [ -e $FILE ]; then
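
For reference, a trap on EXIT registers a command that the shell runs when the script (or subshell) ends. The command has to be quoted rather than placed in backticks, otherwise it is executed at the moment the trap is set. Below is a minimal sketch, assuming bash or another POSIX sh. The function name and the scratch file are invented for the demonstration.

```shell
#!/bin/sh
# run_with_cleanup: hypothetical demo; the body runs in a subshell ( ... ),
# so the EXIT trap fires when that subshell ends.
run_with_cleanup() (
  scratch=$(mktemp)
  # Single quotes delay execution until the EXIT event.
  trap 'rm -f "$scratch"; echo "cleaned up"' EXIT
  echo "working in a scratch file"
  exit 0
)

run_with_cleanup
```

The first line printed is the working message, and the trap's message follows once the subshell exits.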

# 7  
Old 10-10-2015
Quote:
Originally Posted by faizlo
.
.
.
Can my scripts be combined together in just one script?
Yes. Done.

Quote:
My script gives the frequencies of each Amino Acid.
.
.
.
That wasn't too clear from your spec. Try
Code:
awk  '
BEGIN           {split ("UU[CU] UA[UC] GC[ACGU] GG[ACGU] CC[ACGU] AC[ACGU] GU[ACGU] (CG[ACGU]|AG[AG]) (CU[ACGU]|UU[AG]) (UC[ACGU]|AG[CU])", TMP1)
                 for (i=split ("Phe Tyr Ala Gly Pro Thr Val Arg Leu Ser", TMP2); i>0; i--)      {AACID[TMP2[i]]
                                                                                                 BASES[TMP2[i]]=TMP1[i]
                                                                                                }
                 if (DEBUG) {for (t in TMP1) print TMP2[t], TMP1[t]}
                }


/^[     ]*$/    {EMP++
                 next
                }
/^>/            {$0=""
                }

                {print > "DNA.OUT"
                 gsub (/A/, "U")
                 gsub (/C/, "c")
                 gsub (/G/, "C")
                 gsub (/T/, "A")
                 gsub (/c/, "G")
                 print > "RNA.OUT"
                 gsub (/.../, "& ")
                 for (a in AACID) ACNT[a] += gsub (BASES[a], a)
                 print > "AminoAcids"
                }

END             {print "lines: ", NR-EMP
                 print "empty: ", EMP
                 for (a in ACNT) print a, ACNT[a]
                }
' file
lines:  28
empty:  2
Ser 54
Val 39
Tyr 16
Ala 18
Gly 25
Pro 30
Thr 48
Leu 45
Arg 36
Phe 37

Please note that there are several incomplete base triples in your file and that I did not cover all triples, just a few to show that the method works.
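
The counting in the script above relies on awk's gsub returning the number of substitutions it made. Here is a minimal sketch of that idea on its own, assuming any POSIX awk. The function name count_phe is illustrative only.

```shell
#!/bin/sh
# count_phe: hypothetical helper; counts Phe codons (UUU, UUC) in one RNA string.
count_phe() {
  printf '%s\n' "$1" | awk '{
    gsub(/.../, "& ")              # split into triples, each followed by a space
    n = gsub(/UU[CU] /, "Phe ")    # gsub returns how many matches it replaced
    print n
  }'
}

count_phe UUUUUCGCAUUU   # prints 3
count_phe UUUGC          # prints 1; the incomplete "GC" is never matched
```

Adding that return value to a per-amino-acid counter on every input line is what produces the frequency table in the END block.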