Full Discussion: fast sequence extraction
UNIX for Dummies Questions & Answers, post 302665175 by Ygor on Monday 2nd of July 2012 05:37:22 AM
Try...
Code:
$ head file[12]
==> file1 <==
>someseq
GAACTTGAGATCCGGGGAGCAGTGGATCTC
CACCAGCGGCCAGAACTGGTGCACCTCCAG
GCCAGCCTCGTCCTGCGTGTC
>another seq
GGCATTTTTGTGTAATTTTTGGCTGGATGAGGT
GACATTTTCATTACTACCATTTTGGAGTACA
>seq3450
TTTTCCTGTTCACTGCTGCTTTTCTATAGACAGCA
GCAGCAAGCAGTAAGAGAAAGTA

==> file2 <==
someseq 5       10
another seq     1       12
seq3450 3       10

$ awk 'NR==FNR{if($0~/^>/){i=substr($0,2);getline};a[i]=a[i] $0;next}{print ">" $1 ORS substr(a[$1], $2, $3-$2+1)}' file1 FS=\\t file2
>someseq
TTGAGA
>another seq
GGCATTTTTGTG
>seq3450
TTCCTGTT

$
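
The one-liner above is dense, so here is a minimal sketch of the same idea spread over several lines. It is not the poster's exact code: it uses a plain if/else in place of the getline trick, and the names extract_regions, id and seq are made up for illustration. It assumes, as the original does, that file2 is tab-separated (needed because an ID such as "another seq" contains a space) and that start and end are 1-based and inclusive.

```shell
#!/bin/sh
# Sketch only: extract_regions, id and seq are illustrative names, not from the post.

# Print ">ID" and the requested 1-based, inclusive region for each line of the
# region file ($2). The FASTA file ($1) is read first and held in memory.
extract_regions() {
    awk '
    NR == FNR {                       # first file: FASTA records
        if (/^>/) {
            id = substr($0, 2)        # header: keep the ID without ">"
        } else {
            seq[id] = seq[id] $0      # sequence line: append to the current ID
        }
        next
    }
    {                                 # second file: ID <tab> start <tab> end
        print ">" $1
        print substr(seq[$1], $2, $3 - $2 + 1)
    }' "$1" FS='\t' "$2"
}

# Recreate the sample inputs from the session above in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' \
    '>someseq' \
    'GAACTTGAGATCCGGGGAGCAGTGGATCTC' \
    'CACCAGCGGCCAGAACTGGTGCACCTCCAG' \
    'GCCAGCCTCGTCCTGCGTGTC' \
    '>another seq' \
    'GGCATTTTTGTGTAATTTTTGGCTGGATGAGGT' \
    'GACATTTTCATTACTACCATTTTGGAGTACA' \
    '>seq3450' \
    'TTTTCCTGTTCACTGCTGCTTTTCTATAGACAGCA' \
    'GCAGCAAGCAGTAAGAGAAAGTA' > "$tmp/file1"
printf 'someseq\t5\t10\nanother seq\t1\t12\nseq3450\t3\t10\n' > "$tmp/file2"

extract_regions "$tmp/file1" "$tmp/file2"
```

Run on the sample data, this should print the same six lines as the session above. The whole FASTA file is kept in an awk array, so memory use grows with the size of file1.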

 

RE-PCR(1)                General Commands Manual                RE-PCR(1)

NAME
       re-PCR -- Find sequence tagged sites (STS) in DNA sequences

SYNOPSIS
       re-PCR [-hV] -p hash-file [-g gaps] [-n mism] [-lq] [primer ...]
       re-PCR [-hV] -P hash-file [-g gaps] [-n mism] [-l] [-m margin] [-O+|-] [-C batchcnt] [-o outfile] [-r+|-] [primers-file ...]
       re-PCR [-hV] -s hash-file [-g gaps] [-n mism] [-lq] [-m margin] [-o outfile] [-r+|-] [left right lo[-hi] [...]]
       re-PCR [-hV] -S hash-file [-g gaps] [-n mism] [-lq] [-m margin] [-O+|-] [-C batchcnt] [-o outfile] [-r+|-] [stsfile ...]

DESCRIPTION
       Implements reverse searching (called Reverse e-PCR) to make it feasible to search the human genome sequence and other large genomes by performing STS and primer searches.

OPTIONS
       -p=hash-file   Perform primer lookup using hash-file
       -P=hash-file   Perform primer lookup using hash-file
       -s=hash-file   Perform STS lookup using hash-file
       -S=hash-file   Perform STS lookup using hash-file
       -n=mism        Set max allowed mismatches per primer for lookup
       -g=gaps        Set max allowed indels per primer for lookup
       -m=margin      Set variability for STS size for lookup
       -l             Use presize alignments (only if gaps>0)
       -G             Print alignments in comments
       -d=min-max     Set default STS size
       -r=+|-         Enable/disable reverse STS lookup
       -O=+|-         Enable/disable syscall optimisation
       -C=batchcnt    Set number of STSes per batch
       -o=outfile     Set output file name
       -q             Quiet (no progress indicator)

EXAMPLE
       famap -tN -b genome.famap org/chr_*.fa
       fahash -b genome.hash -w 12 -f3 ${PWD}/genome.famap
       re-PCR -s genome.hash -n1 -g1 ACTATTGATGATGA AGGTAGATGTTTTT 120-200

       See famap(1) and fahash(1)

SEE ALSO
       /usr/share/doc/ncbi-epcr/README.txt
       bioperl(1), e-pcr(1), famap(1) and fahash(1)

AUTHORS
       This manual page was written by Andreas Tille <tille@debian.org> for the Debian system (but may be used by others). Permission is granted to copy, distribute and/or modify this document under the terms of the GNU General Public License, Version 2 any later version published by the Free Software Foundation.

       On Debian systems, the complete text of the GNU General Public License can be found in /usr/share/common-licenses/GPL.

April 2008                                                      RE-PCR(1)