Post 302366659 by patrick87 on Friday 30th of October 2009 04:49:39 AM
Split a huge data into few different files?!

Input file data contents:
Code:
>seq_1
MSNQSPPQSQRPGHSHSHSHSHAGLASSTSSHSNPSANASYNLNGPRTGGDQRYRASVDA
>seq_2
AGAAGRGWGRDVTAAASPNPRNGGGRPASDLLSVGNAGGQASFASPETIDRWFEDLQHYE
>seq_3
ATLEEMAAASLDANFKEELSAIEQWFRVLSEAERTAALYSLLQSSTQVQMRFFVTVLQQM
ARADPITALLSPANPGQASMEAQMDAKLAAMGLKSPASPAVRQYARQSLSGDTYLSPHSA
>seq_4
TTLPPAPVSPTTTTQAEDAAAAATLASQRAKLKASSRISAPANILLGASGADGVKSPLWS
EKERVVERRSPSPSGRNVERPKSTGSTGEPAQPNNSHAGMNLSQSTGPPSASFLRSPAPD
>seq_5
FDSQLSPIVGGNWASMVNTPLMPMFGSKGGGEGGSFGGLASPGLDGATAKLGSWATGTTT
GQAGIVLDDVRKFRRSARISGSGATGFGGGALGGMYDDQPAQASTNGQQQRRVSPSQLNS
>seq_6
AQQNAINLGLAGLQQQQQQHQQQLRSGAASPGLSSQQAAVAAQQNWRNGLGSPAVDSSDQ
YSQHGMGAFGMGSPANLSANAQLANLFALQQQMMQQQQMQQLNMAAAAGIALTPVQMMGL
QQQQQQAMLSPGGRGFGMGMNGMGMNGMMGMGMGGMGSPRRSPRQSDRSPGGKTNLPSTV
.
.
.
.

Output file 1 contents:
Code:
>seq_1
MSNQSPPQSQRPGHSHSHSHSHAGLASSTSSHSNPSANASYNLNGPRTGGDQRYRASVDA
>seq_2
AGAAGRGWGRDVTAAASPNPRNGGGRPASDLLSVGNAGGQASFASPETIDRWFEDLQHYE
>seq_3
ATLEEMAAASLDANFKEELSAIEQWFRVLSEAERTAALYSLLQSSTQVQMRFFVTVLQQM
ARADPITALLSPANPGQASMEAQMDAKLAAMGLKSPASPAVRQYARQSLSGDTYLSPHSA

Output file 2 contents:
Code:
>seq_4
TTLPPAPVSPTTTTQAEDAAAAATLASQRAKLKASSRISAPANILLGASGADGVKSPLWS
EKERVVERRSPSPSGRNVERPKSTGSTGEPAQPNNSHAGMNLSQSTGPPSASFLRSPAPD
>seq_5
FDSQLSPIVGGNWASMVNTPLMPMFGSKGGGEGGSFGGLASPGLDGATAKLGSWATGTTT
GQAGIVLDDVRKFRRSARISGSGATGFGGGALGGMYDDQPAQASTNGQQQRRVSPSQLNS
>seq_6
AQQNAINLGLAGLQQQQQQHQQQLRSGAASPGLSSQQAAVAAQQNWRNGLGSPAVDSSDQ
YSQHGMGAFGMGSPANLSANAQLANLFALQQQMMQQQQMQQLNMAAAAGIALTPVQMMGL
QQQQQQAMLSPGGRGFGMGMNGMGMNGMMGMGMGGMGSPRRSPRQSDRSPGGKTNLPSTV

If I have a long list of sequences in one file, how can I split it into several files?
I need three sequences in each file.
For example, my data source has 300 sequences.
I need to put 3 sequences in each file, so the desired output is 100 files that contain 3 sequences each.
Does anybody have an idea how to solve this?
Thanks a lot for your guidance.

Last edited by pludi; 10-30-2009 at 05:51 AM.. Reason: code tags, please...
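
One possible approach, given only as a rough sketch: treat every line starting with ">" as the start of a new record, then write the records out three at a time. The function names (read_records, split_fasta), the output_N.txt naming scheme and the per_file parameter are assumptions made for illustration; the post does not say what the output files should be called.

```python
import itertools
import os
import tempfile


def read_records(lines):
    """Yield each record as a list of lines: one '>' header plus its sequence lines."""
    record = []
    for line in lines:
        # A new header closes the record collected so far.
        if line.startswith(">") and record:
            yield record
            record = []
        if line.strip():  # skip blank lines
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield record


def split_fasta(in_path, out_dir, per_file=3, prefix="output_"):
    """Write `per_file` records to each output file and return the paths written.

    The file names (output_1.txt, output_2.txt, ...) are a hypothetical choice.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    with open(in_path) as handle:
        records = read_records(handle)
        for index in itertools.count(1):
            # Take the next `per_file` records; the last chunk may be shorter.
            chunk = list(itertools.islice(records, per_file))
            if not chunk:
                break
            path = os.path.join(out_dir, f"{prefix}{index}.txt")
            with open(path, "w") as out:
                for record in chunk:
                    out.writelines(record)
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Tiny made-up input: four records, so two output files are expected.
    sample = ">seq_1\nAAA\n>seq_2\nCCC\n>seq_3\nGGG\nTTT\n>seq_4\nAAA\n"
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "input.fasta")
        with open(src, "w") as f:
            f.write(sample)
        for path in split_fasta(src, os.path.join(tmp, "out")):
            print(os.path.basename(path))
```

The input is read one line at a time, so a huge file never has to fit in memory. With 300 sequences and per_file=3 this should give 100 files; if the total is not a multiple of three, the last file simply holds the remainder.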
 
