02-11-2013
RNA-seq analysis
I am processing RNA-seq data files that have been aligned using RUM. One of the output files is a *.sam file that includes:
Unique alignments
Non-unique alignments
original read files
I want to extract only the unique alignments by pulling out alignments that have "IH:i:1" (indicates this read had only one alignment).
I have tried:
grep "IH:i:1" file.sam > filtered.sam
but this also returns lines containing "IH:i:11", "IH:i:12", and so on. I have also tried "IH:i:1 " (with a trailing space), which returns nothing, presumably because the file is tab-delimited.
Any suggestions?
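One possible approach, as a minimal sketch rather than a definitive answer: since the plain grep pattern matches "IH:i:1" as a prefix of "IH:i:11", compare each tab-separated field against the whole tag instead. The function name filter_unique is hypothetical, and the sketch assumes the file follows the standard SAM layout (11 mandatory columns, optional tags from column 12 onward, header lines starting with "@").

```shell
# Hypothetical helper: print SAM header lines plus every alignment
# whose IH tag is exactly 1 (i.e. the read aligned to a single location).
filter_unique() {
    awk -F'\t' '
        /^@/ { print; next }              # keep header lines so the output is still valid SAM
        {
            # Optional tags start at column 12 in the SAM format (assumption: RUM follows this).
            for (i = 12; i <= NF; i++)
                if ($i == "IH:i:1") { print; next }
        }
    ' "$1"
}

# Usage with the file names from the question:
#   filter_unique file.sam > filtered.sam
```

Because the comparison is against the entire field, "IH:i:11" and "IH:i:12" no longer match. A grep-only variant such as `grep -E 'IH:i:1([[:space:]]|$)' file.sam` should behave similarly for the alignment lines, but it would drop the "@" header lines.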