Split file into multiple files using delimiter Post: 302791003

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Split a file into multiple files

I have a file ehich has multiple create statements as create abc 123 one two create xyz 456 four five create nnn 666 six four I want to separte each create statement in seperate files

2. UNIX for Dummies Questions & Answers

split a file into multiple files

Hi All, I have a file ABC.txt and I need to split this file on every 250 rows. And the file name should be ABC1.txt , ABC2.txt and so on. I tried with split command split -l 250 <filename> '<filename>' but the file name returned was ABC.txtaa ABC.txtab. Please...

3. Shell Programming and Scripting

Split a file into multiple files

Hi, i have a file like this: 1|2|3|4|5| 1|2|8|4|6| Trailer1||||| 1|2|3| Trailer2||| 3|4|5|6| 3|4|5|7| 3|4|5|8| Trailer2||| I want to generate 3 files out of this based on the trailer record. Trailer record string can be different for each file or it may be same for one or two. No...

4. Shell Programming and Scripting

renaming files using split with a delimiter

I have a directory of files that I need to rename by splitting the first and second halves of the filenames using the delimiter "-O" and then renaming with the second half first, followed by two underscores and then the first half. For example, natfinal1995annvol1_14.pdf -O filenum-20639 will be...

5. Shell Programming and Scripting

Split file into multiple files

Hi I have a file that has multiple sequences; the sequence name is the line starting with '>'. It looks like below: infile.txt: >HE_ER tttggtgccttgactcggattgggggacctcccttgggagatcaatcccctgtcctcctgctctttgctc cgtgaaaaggatccacctatgacctctagtcctcagacccaccagcccaaggaacatctcaccaatttca >M7B_Ho_sap...

6. Shell Programming and Scripting

Help- counting delimiter in a huge file and split data into 2 files

I’m new to Linux script and not sure how to filter out bad records from huge flat files (over 1.3GB each). The delimiter is a semi colon “;” Here is the sample of 5 lines in the file: Name1;phone1;address1;city1;state1;zipcode1 Name2;phone2;address2;city2;state2;zipcode2;comment...

7. Shell Programming and Scripting

split file into multiple files

Hi, I have a file of the following syntax that has around 120K records that are tab separated. input.txt abc def klm 20 76 . + . klm_mango unix_00000001; abc def klm 83 84 . + . klm_mango unix_0000103; abc def klm 415 439 . + . klm_mango unix_00001043; I am looking for an awk oneliner...

8. Shell Programming and Scripting

Split files based on row delimiter count

I have a huge file (around 4-5 GB containing 20 million rows) which has text like: <EOFD>11<EOFD>22<EORD>2<EOFD>2222<EOFD>3333<EORD>3<EOFD>44<EOFD>55<EORD>66<EOFD>888<EOFD>9999<EORD> Actually above is an extracted file from a Sql Server with each field delimited by <EOFD> and each row ends...

9. UNIX for Beginners Questions & Answers

Shell script to Split matrix file with delimiter into multiple files

I have a large semicolon delimited file with thousands of columns and many thousands of line. It looks like: ID1;ID2;ID3;ID4;A_1;B_1;C_1;A_2;B_2;C_2;A_3;B_3;C_3 AA;ax;ay;az;01;02;03;04;05;06;07;08;09 BB;bx;by;bz;03;05;33;44;15;26;27;08;09 I want to split this table in to multiple files: ...

LEARN ABOUT DEBIAN

bio::tools::run::phylo::paml::baseml

Bio::Tools::Run::Phylo::PAML::Baseml(3pm)		User Contributed Perl Documentation		 Bio::Tools::Run::Phylo::PAML::Baseml(3pm)

NAME

       Bio::Tools::Run::Phylo::PAML::Baseml - Wrapper aroud the PAML program baseml

SYNOPSIS

	 use Bio::Tools::Run::Phylo::PAML::Baseml;
	 use Bio::AlignIO;
	 my $alignio = Bio::AlignIO->new(-format => 'phylip',
					-file	=> 't/data/gf-s85.phylip');
	 my $aln = $alignio->next_aln;

	 my $bml = Bio::Tools::Run::Phylo::PAML::Baseml->new();
	 $bml->alignment($aln);
	 my ($rc,$parser) = $bml->run();
	 while( my $result = $parser->next_result ) {
	   my @otus = $result->get_seqs();
	   my $MLmatrix = $result->get_MLmatrix();
	   # 0 and 1 correspond to the 1st and 2nd entry in the @otus array
	 }

DESCRIPTION

       This is a wrapper around the baseml program of PAML (Phylogenetic Analysis by Maximum Likelihood) package of Ziheng Yang.  See
       http://abacus.gene.ucl.ac.uk/software/paml.html for more information.

       This module will generate a proper baseml.ctl file and will run the program in a separate temporary directory to avoid creating temp files
       all over the place and will cleanup after itself..

       The values you can feed to the configuration file are documented here.

	   'noisy'   => [ 0..3,9],
	   'verbose' => [ 0,1,2], # 0:concise, 1:detailed, 2:too much
	   'runmode' => [0..5],
	   # for runmode
	   # 0: use the provided tree structure(s) in treefile
	   # 1,2: mean heuristic search by star-decomposition alg
	   # 2: starts from star tree while 1 reads a multifurcating
	   # tree from treefile and ties to estimate the best
	   # bifurcating tree
	   # 3: stepwise addition
	   # 4: NNI perturbation with the starting tree
	   # Tree search DOES NOT WORK WELL so estimate a tree
	   # using other programs first
	   'model'   => '0',
	   # for model
	   # 0: JC69 (uncorrected)
	   # 1: K80  (transitions/transversion weighted differently)
	   # 2: F81
	   # 3: F84
	   # 4: HKY85
	   # 5: T92 (Tamura 92)
	   # 6: TN93 (Tajima-Nei) correct for multiple substitutions
	   # 7: REV (aka GTR)
	   # 8: UNREST
	   # 9: REVu
	   #10: UNRESTu
	   # See Yang 1994 JME 39:105-111

	   # model 8 special case of the REV model
	   # model 9 is special case of unrestricted model
	   # can also supply special rate parameters
	   # so for example (from pamlDOC.pdf
	   # $model  = '8 [2 (CT) (AG)]'; # TN93
	   # $model  = '8 [2 (TA AT TG CA CG) (AG)]'; # TN93
	   # $model  = '9 [1 (TC CT AG GA)]; # K80
	   # $model  = '9 [0]'; # JC69
	   # $model  = '9 [11 (TA) (TG) (CT) (CA) (CG) (AT) (AC) (AG) (GT) (GC) (GA)],

	   'outfile' => 'mlb',
	   'fix_kappa'=> [0,1], # 0:estimate kappa, 1:fix kappa
	   'kappa'    => '2.5', # initial or fixed kappa
	   'fix_alpha'=> [1,0], # 0: estimate gamma shape param
	   # 1: fix it at alpha
	   'alpha'    => '0', # initial of fixed alpha
	   # 0: infinity (constant rate)
	   'Malpha'   => [0,1], # different alphas for genes

	   'fix_rho'=> [1,0], # 0: estimate gamma shape param
						 # 1: fix it at alpha
	   'rho'    => '0', # initial of fixed alpha
	   # 0: infinity (constant rate)

	   'ncatG'    => '5', # number of categories in the dD,AdG, or nparkK models of rates
	   'nparK'    => [0..4], # rate-class models
	   # 1:rk 2:rk&fK
	   # 3:rK&MK(1/K) 4:rK&MK
	   'nhomo'    => [0..4], # 0 & 1: homogeneous,
	   # 2: kappa for brances
	   # 3:N1 4:N2
	   'getSE'    => [0,1],
	   'RateAncestor' => [1,0,2], # rates (alpha > 0) or
	   # ancestral states
	   'cleandata' => [1,0], # remove sites with
	   # ambiguity data (1:yes or 0:no)

	   'fix_blength' => [-1,0,1,2], # 0: ignore, -1: random,
	   # 1: initial, 2: fixed

	   # 'icode'	=> [ 0..10], # (with RateAncestor=1.
	   #try "GC" in data,model=4,Mgene=4)
	   'ndata'    => [5,1..10],
	   'clock'    => [0..3], # 0: no clock, 1: clock, 2: local clock, 3: CombinedAnalysis
	   'Small_Diff' => '1e-6', #underflow issues?

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the
       Bioperl mailing list.  Your participation is much appreciated.

	 bioperl-l@bioperl.org			- General discussion
	 http://bioperl.org/wiki/Mailing_lists	- About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address
       it. Please include a thorough description of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the
       web:

	 http://redmine.open-bio.org/projects/bioperl/

AUTHOR - Jason Stajich
       Email jason-at-bioperl.org

CONTRIBUTORS

       Sendu Bala - bix@sendu.me.uk

APPENDIX

       The rest of the documentation details each of the object methods.  Internal methods are usually preceded with a _

   program_name
	Title	: program_name
	Usage	: $obj->program_name()
	Function: holds the program name
	Returns:  string
	Args	: None

   program_dir
	Title	: program_dir
	Usage	: ->program_dir()
	Function: returns the program directory, obtained from ENV variable.
	Returns:  string
	Args	:

   new
	Title	: new
	Usage	: my $obj = Bio::Tools::Run::Phylo::PAML::Baseml->new();
	Function: Builds a new Bio::Tools::Run::Phylo::PAML::Baseml object
	Returns : Bio::Tools::Run::Phylo::PAML::Baseml
	Args	: -alignment => the L<Bio::Align::AlignI> object
		  -tree => the L<Bio::Tree::TreeI> object if you want to use runmode
			   0 or 1
		  -save_tempfiles => boolean to save the generated tempfiles and
				     NOT cleanup after onesself (default FALSE)

   run
	Title	: run
	Usage	: $yn->run();
	Function: run the Baseml analysis using the default or updated parameters
		  the alignment parameter must have been set
	Returns : 3 values,
		  $rc = 1 for success, 0 for errors
		  hash reference of the Yang calculated Ka/Ks values
			   this is a set of pairwise observations keyed as
			   sequencenameA->sequencenameB->datatype
		  hash reference same as the previous one except it for the
		  Nei and Gojobori calculated Ka,Ks,omega values
	Args	: optionally, a value appropriate for alignment() and one for tree()
	NB	: Since Baseml doesn't handle spaces in tree node ids, if a tree is
		  in use spaces will be converted to underscores in both the tree node
		  ids and alignment sequence ids.

   error_string
	Title	: error_string
	Usage	: $obj->error_string($newval)
	Function: Where the output from the last analysus run is stored.
	Returns : value of error_string
	Args	: newvalue (optional)

   alignment
	Title	: alignment
	Usage	: $baseml->alignment($aln);
	Function: Get/Set the L<Bio::Align::AlignI> object
	Returns : L<Bio::Align::AlignI> object
	Args	: [optional] L<Bio::Align::AlignI>
	Comment : We could potentially add support for running directly on a file
		  but we shall keep it simple
	See also: L<Bio::SimpleAlign>

   get_parameters
	Title	: get_parameters
	Usage	: my %params = $self->get_parameters();
	Function: returns the list of parameters as a hash
	Returns : associative array keyed on parameter names
	Args	: none

   set_parameter
	Title	: set_parameter
	Usage	: $baseml->set_parameter($param,$val);
	Function: Sets a baseml parameter, will be validated against
		  the valid values as set in the %VALIDVALUES class variable.
		  The checks can be ignored if on turns of param checks like this:
		    $baseml->no_param_checks(1)
	Returns : boolean if set was success, if verbose is set to -1
		  then no warning will be reported
	Args	: $paramname => name of the parameter
		  $value     => value to set the parameter to
	See also: L<no_param_checks()>

   set_default_parameters
	Title	: set_default_parameters
	Usage	: $baseml->set_default_parameters(0);
	Function: (Re)set the default parameters from the defaults
		  (the first value in each array in the
		   %VALIDVALUES class variable)
	Returns : none
	Args	: boolean: keep existing parameter values
	NB	: using this isn't an especially good idea! You don't need to do
		  anything to end up using default parameters: hence 'default'!

Bio::Tools::Run::Wrapper methods
   no_param_checks
	Title	: no_param_checks
	Usage	: $obj->no_param_checks($newval)
	Function: Boolean flag as to whether or not we should
		  trust the sanity checks for parameter values
	Returns : value of no_param_checks
	Args	: newvalue (optional)

   save_tempfiles
	Title	: save_tempfiles
	Usage	: $obj->save_tempfiles($newval)
	Function:
	Returns : value of save_tempfiles
	Args	: newvalue (optional)

   outfile_name
	Title	: outfile_name
	Usage	: my $outfile = $baseml->outfile_name();
	Function: Get/Set the name of the output file for this run
		  (if you wanted to do something special)
	Returns : string
	Args	: [optional] string to set value to

   tempdir
	Title	: tempdir
	Usage	: my $tmpdir = $self->tempdir();
	Function: Retrieve a temporary directory name (which is created)
	Returns : string which is the name of the temporary directory
	Args	: none

   cleanup
	Title	: cleanup
	Usage	: $baseml->cleanup();
	Function: Will cleanup the tempdir directory after a PAML run
	Returns : none
	Args	: none

   io
	Title	: io
	Usage	: $obj->io($newval)
	Function:  Gets a L<Bio::Root::IO> object
	Returns : L<Bio::Root::IO>
	Args	: none

perl v5.12.3							    2011-06-18				 Bio::Tools::Run::Phylo::PAML::Baseml(3pm)

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Split a file into multiple files

Discussion started by: glamo_2312

2. UNIX for Dummies Questions & Answers

split a file into multiple files

Discussion started by: kumar66

3. Shell Programming and Scripting

Split a file into multiple files

Discussion started by: pparthji

4. Shell Programming and Scripting

renaming files using split with a delimiter

Discussion started by: swimulator

5. Shell Programming and Scripting

Split file into multiple files

Discussion started by: jdhahbi

6. Shell Programming and Scripting

Help- counting delimiter in a huge file and split data into 2 files

Discussion started by: lv99

7. Shell Programming and Scripting

split file into multiple files

Discussion started by: jacobs.smith

8. Shell Programming and Scripting

Split files based on row delimiter count

Discussion started by: amvip

9. UNIX for Beginners Questions & Answers

Shell script to Split matrix file with delimiter into multiple files

Discussion started by: trymega

LEARN ABOUT DEBIAN

bio::tools::run::phylo::paml::baseml