Extract the part of sequences from a file
Post 302881037 by Klashxx on Thursday 26th of December 2013 08:37:01 AM
Use:
Code:
line = line.strip().split()

Instead of:
Code:
line = line.strip().split('\t')
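
To see why the change matters, here is a minimal sketch comparing the two calls. The sample lines are made up for illustration and are not from the original poster's file; the assumption is that the input mixes tabs and spaces as delimiters.

```python
# Two hypothetical input lines: one tab-delimited, one space-delimited.
tab_line = "seq1\t10\t25\n"
space_line = "seq1  10 25\n"

# split('\t') breaks only on tab characters, so the space-delimited
# line comes back as a single field.
print(tab_line.strip().split('\t'))    # ['seq1', '10', '25']
print(space_line.strip().split('\t'))  # ['seq1  10 25']

# split() with no argument breaks on any run of whitespace
# (spaces, tabs, or a mix), so both lines yield three fields.
print(tab_line.strip().split())        # ['seq1', '10', '25']
print(space_line.strip().split())      # ['seq1', '10', '25']
```

One caveat: split() with no argument also collapses empty fields, so "a\t\tb".split() gives two fields where split('\t') gives three. If the file is strictly tab-delimited and may contain empty columns, split('\t') is the safer choice.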

 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Extract a part of file name

Hi, I want to extract a part of filename and pass it as a parameter to one of the scripts. Could someone help. File name:- NLL_NAM_XXXXX.XXXXXXX_1_1.txt. Here i have to extract only XXXXX.XXXXXXX and the position will be constant. that means that i have to extract some n characters from... (6 Replies)
Discussion started by: dnat
6 Replies

2. Shell Programming and Scripting

How to extract certain part of log file?

Hi there, I'm having some problem with UNIX scripting (ksh), perhaps somebody can help me out? For example: ------------ Sample content of my log file (text file): -------------------------------------- File1: .... info_1 ... info_2 ... info_3 ... File2: .... info_1 ... info_2 ...... (10 Replies)
Discussion started by: superHonda123
10 Replies

3. Shell Programming and Scripting

extract part of text file

I need to extract the following lines from this text and put it in different files. From xxxx@gmail.com Thu Jun 10 21:15:46 2010 Return-Path: <xxxxx@gmail.com> X-Original-To: xxx@localhost Delivered-To:xxxx@localhost Received: from ubuntu (localhost ) by ubuntu (Postfix) with ESMTP... (11 Replies)
Discussion started by: waxo
11 Replies

4. Shell Programming and Scripting

Extract sequences based on the list

Hi, I have a file with more than 28000 records and it looks like below.. >mm10_refflat_ABCD range=chr1:1234567-2345678 tgtgcacactacacatgactagtacatgactagac....so on >mm10_refflat_BCD range=chr1:3234567-4545678... tgtgcacactacacatgactagtatgtgcacactacacatgactagta . . . . . so on ... (2 Replies)
Discussion started by: Diya123
2 Replies

5. Shell Programming and Scripting

Extract length wise sequences from fastq file

I have a fastq file from small RNA sequencing with sequence lengths between 15 - 30. I wanted to filter sequence lengths between 21-25 and write to another fastq file. how can i do that? (4 Replies)
Discussion started by: empyrean
4 Replies

6. Shell Programming and Scripting

Extract part of file

Hello All, I need to extract part of a file into a new file My file is Define schema xxxxxx Insert into table ( a ,b ,c ,d ) values ( 1, 2, 3, (15 Replies)
Discussion started by: nnani
15 Replies

7. Shell Programming and Scripting

Extract sequences of bytes from binary for differents blocks

Hello to all, I would like to search sequences of bytes inside big binary file. The bin file contains blocks of information, each block begins is estructured as follow: 1- Each block begins with the hex 32 (1 byte) and ends with FF. After the FF of the last block, it follows 33. 2- Next... (59 Replies)
Discussion started by: Ophiuchus
59 Replies

8. Shell Programming and Scripting

Extract sequences from a FASTA file based on another file

I have two files. File1 is shown below. >153L:B|PDBID|CHAIN|SEQUENCE RTDCYGNVNRIDTTGASCKTAKPEGLSYCGVSASKKIAERDLQAMDRYKTIIKKVGEKLCVEPAVIAGIISRESHAGKVL KNGWGDRGNGFGLMQVDKRSHKPQGTWNGEVHITQGTTILINFIKTIQKKFPSWTKDQQLKGGISAYNAGAGNVRSYARM DIGTTHDDYANDVVARAQYYKQHGY >16VP:A|PDBID|CHAIN|SEQUENCE... (7 Replies)
Discussion started by: nelsonfrans
7 Replies

9. Programming

Extract part of an archive to a different file

I need to save part of a file to a different one, start and end offset bytes are provided by two counters in long format. If the difference is big, how should I do it to prevent buffer overflow in java? (7 Replies)
Discussion started by: Tribe
7 Replies

10. Shell Programming and Scripting

Extract a part of variable/line content in a file

I have a variable and assigned the following values ***XYZ_201519_20150929140642_20150929140644_211_0_0_211 I need to read this variable from backward and stop read when I get first underscore (_) In this scenario I should get 211 Thanks Kris (3 Replies)
Discussion started by: mkris
3 Replies
Bio::Tools::Run::BEDTools(3pm)				User Contributed Perl Documentation			    Bio::Tools::Run::BEDTools(3pm)

NAME
       Bio::Tools::Run::BEDTools - Run wrapper for the BEDTools suite of programs *BETA*

SYNOPSIS
         # use a BEDTools program
         $bedtools_fac = Bio::Tools::Run::BEDTools->new( -command => 'subtract' );
         $result_file = $bedtools_fac->run( -bed1 => 'genes.bed', -bed2 => 'mask.bed' );

         # if IO::Uncompress::Gunzip is available...
         $result_file = $bedtools_fac->run( -bed1 => 'genes.bed.gz', -bed2 => 'mask.bed.gz' );

         # be more strict
         $bedtools_fac->set_parameters( -strandedness => 1 );

         # and even more...
         $bedtools_fac->set_parameters( -minimum_overlap => 1e-6 );

         # create a Bio::SeqFeature::Collection object
         $features = $bedtools_fac->result( -want => 'Bio::SeqFeature::Collection' );

DEPRECATION WARNING
       Most executables from BEDTools v>=2.10.1 can read GFF and VCF formats in addition to BED format. This requires the use of a new input file param, shown in the following documentation, '-bgv', in place of '-bed' for the executables that can do this.

       This behaviour breaks existing scripts.

DESCRIPTION
       This module provides a wrapper interface for Aaron R. Quinlan and Ira M. Hall's utilities "BEDTools" that allow for (among other things):

       o   Intersecting two BED files in search of overlapping features.

       o   Merging overlapping features.

       o   Screening for paired-end (PE) overlaps between PE sequences and existing genomic features.

       o   Calculating the depth and breadth of sequence coverage across defined "windows" in a genome.

       (see <http://code.google.com/p/bedtools/> for manuals and downloads).

OPTIONS
       "BEDTools" is a suite of 17 command-line executables. This module attempts to provide their commands and options comprehensively. You can browse the choices like so:

         $bedtools_fac = Bio::Tools::Run::BEDTools->new;

         # all BEDTools commands
         @all_commands = $bedtools_fac->available_parameters('commands');
         @all_commands = $bedtools_fac->available_commands; # alias

         # just for default command ('bam_to_bed')
         @btb_params = $bedtools_fac->available_parameters('params');
         @btb_switches = $bedtools_fac->available_parameters('switches');
         @btb_all_options = $bedtools_fac->available_parameters();

       Reasonably mnemonic names have been assigned to the single-letter command line options. These are the names returned by "available_parameters", and can be used in the factory constructor like typical BioPerl named parameters.

       A number of options are mutually exclusive, and the interpretation of intent is based on the last-passed option reaching BEDTools, with potentially unpredictable results. This module will prevent inconsistent switches and parameters from being passed.

       See <http://code.google.com/p/bedtools/> for details of BEDTools options.

FILES
       When a command requires filenames, these are provided to the "run" method, not the constructor ("new()"). To see the set of files required by a command, use "available_parameters('filespec')" or the alias "filespec()":

         $bedtools_fac = Bio::Tools::Run::BEDTools->new( -command => 'pair_to_bed' );
         @filespec = $bedtools_fac->filespec;

       This example returns the following array:

         #bedpe
         #bam
         bed
         #out

       This indicates that the bed ("BEDTools" BED format) file MUST be specified, and that the out, bedpe ("BEDTools" BEDPE format) and bam ("SAM" binary format) file MAY be specified. (Note that in this case you MUST provide ONE of bedpe OR bam; the module at this stage does not allow this information to be queried.) Use these in the "run" call like so:

         $bedtools_fac->run( -bedpe => 'paired.bedpe', -bgv => 'genes.bed', -out => 'overlap' );

       The object will store the program's STDERR output for you in the "stderr()" attribute:

         handle_bed_warning($bedtools_fac) if ($bedtools_fac->stderr =~ /Usage:/);

       For the commands 'fasta_from_bed' and 'mask_fasta_from_bed', STDOUT will also be captured in the "stdout()" attribute by default. All other commands can be forced to capture program output in STDOUT by setting the -out filespec parameter to '-'.

FEEDBACK
   Mailing Lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated.

         bioperl-l@bioperl.org                  - General discussion
         http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:

         http://redmine.open-bio.org/projects/bioperl/

AUTHOR - Dan Kortschak
       Email dan.kortschak adelaide.edu.au

CONTRIBUTORS
       Additional contributors names and emails here

APPENDIX
       The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _

   new()
        Title   : new
        Usage   : my $obj = new Bio::Tools::Run::BEDTools();
        Function: Builds a new Bio::Tools::Run::BEDTools object
        Returns : an instance of Bio::Tools::Run::BEDTools
        Args    :

   run()
        Title   : run
        Usage   : $result = $bedtools_fac->run(%params);
        Function: Run a BEDTools command.
        Returns : Command results (file, IO object or Bio object)
        Args    : Dependent on filespec for command.
                  See $bedtools_fac->filespec and BEDTools Manual.
                  Also accepts -want => '(raw|format|<object_class>)' - see want().
        Note    : gzipped inputs are allowed if IO::Uncompress::Gunzip
                  is available

        Command                <in>               <out>
        annotate               bgv ann(s)         #out
        graph_union            bg_files           #out
        fasta_from_bed         seq bgv            #out
        mask_fasta_from_bed    seq bgv            #out
        bam_to_bed             bam                #out
        bed_to_IGV             bgv                #out
        merge                  bgv                #out
        sort                   bgv                #out
        links                  bgv                #out
        b12_to_b6              bed                #out
        overlap                bed                #out
        group_by               bed                #out
        bed_to_bam             bgv                #out
        shuffle                bgv genome         #out
        slop                   bgv genome         #out
        complement             bgv genome         #out
        genome_coverage        bed genome         #out
        window                 bgv1 bgv2          #out
        closest                bgv1 bgv2          #out
        coverage               bgv1 bgv2          #out
        subtract               bgv1 bgv2          #out
        pair_to_pair           bedpe1 bedpe2      #out
        intersect              bgv1|bam bgv2      #out
        pair_to_bed            bedpe|bam bgv      #out

        bgv* signifies any of BED, GFF or VCF. ann is a bgv.

        NOTE: Replace 'bgv' with 'bed' unless $use_bgv is set.
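
The '#' prefix in the filespec listings marks a file argument that MAY be given, while an unprefixed name MUST be given. As a rough illustration of that convention only, here is a short Python sketch. The helper name split_filespec is hypothetical and is not part of the Perl module; the sketch also ignores the 'a|b' alternatives shown in the table.

```python
def split_filespec(filespec):
    """Separate filespec entries into required and optional names.

    Assumes the convention described in this man page: a leading '#'
    marks an optional file argument; no prefix means it is required.
    """
    required = [name for name in filespec if not name.startswith('#')]
    optional = [name[1:] for name in filespec if name.startswith('#')]
    return required, optional


# The filespec quoted in this man page for the 'pair_to_bed' command.
required, optional = split_filespec(['#bedpe', '#bam', 'bed', '#out'])
print(required)  # ['bed']
print(optional)  # ['bedpe', 'bam', 'out']
```

Note that this captures only the prefix rule. The additional constraint that one of bedpe or bam must be supplied for 'pair_to_bed' is not encoded in the filespec itself, as the man page points out.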
   want()
        Title   : want
        Usage   : $bedtools_fac->want( $class )
        Function: make factory return $class, or 'raw' results in file
                  or 'format' for result format
                  All commands can return Bio::Root::IO
                  commands returning:      can return object:
                  - BED or BEDPE           - Bio::SeqFeature::Collection
                  - sequence               - Bio::SeqIO
        Returns : return wanted type
        Args    : [optional] string indicating class or raw of wanted result

   result()
        Title   : result
        Usage   : $bedtools_fac->result( [-want => $type|$format] )
        Function: return result in wanted format
        Returns : results
        Args    : [optional] hashref of wanted type
        Note    : -want arg does not persist between result() calls when
                  specified in result(); for persistence, use want()

   _determine_format()
        Title   : _determine_format( $has_run )
        Usage   : $bedtools_fac->_determine_format
        Function: determine the format of output for current options
        Returns : format of BEDTools output
        Args    : [optional] boolean to indicate result exists

   _read_bed()
        Title   : _read_bed()
        Usage   : $bedtools_fac->_read_bed
        Function: return a Bio::SeqFeature::Collection object from a BED file
        Returns : Bio::SeqFeature::Collection
        Args    :

   _read_bedpe()
        Title   : _read_bedpe()
        Usage   : $bedtools_fac->_read_bedpe
        Function: return a Bio::SeqFeature::Collection object from a BEDPE file
        Returns : Bio::SeqFeature::Collection
        Args    :

   _validate_file_input()
        Title   : _validate_file_input
        Usage   : $bedtools_fac->_validate_file_input( -type => $file )
        Function: validate file type for file spec
        Returns : file type if valid type for file spec
        Args    : hash of filespec => file_name

   version()
        Title   : version
        Usage   : $version = $bedtools_fac->version()
        Function: Returns the program version (if available)
        Returns : string representing location and version of the program

perl v5.12.3							    2011-06-18					    Bio::Tools::Run::BEDTools(3pm)