Post 302366659 by patrick87 on Friday 30th of October 2009 04:49:39 AM
Split a huge data file into several smaller files?

Input file data contents:
Code:
>seq_1
MSNQSPPQSQRPGHSHSHSHSHAGLASSTSSHSNPSANASYNLNGPRTGGDQRYRASVDA
>seq_2
AGAAGRGWGRDVTAAASPNPRNGGGRPASDLLSVGNAGGQASFASPETIDRWFEDLQHYE
>seq_3
ATLEEMAAASLDANFKEELSAIEQWFRVLSEAERTAALYSLLQSSTQVQMRFFVTVLQQM
ARADPITALLSPANPGQASMEAQMDAKLAAMGLKSPASPAVRQYARQSLSGDTYLSPHSA
>seq_4
TTLPPAPVSPTTTTQAEDAAAAATLASQRAKLKASSRISAPANILLGASGADGVKSPLWS
EKERVVERRSPSPSGRNVERPKSTGSTGEPAQPNNSHAGMNLSQSTGPPSASFLRSPAPD
>seq_5
FDSQLSPIVGGNWASMVNTPLMPMFGSKGGGEGGSFGGLASPGLDGATAKLGSWATGTTT
GQAGIVLDDVRKFRRSARISGSGATGFGGGALGGMYDDQPAQASTNGQQQRRVSPSQLNS
>seq_6
AQQNAINLGLAGLQQQQQQHQQQLRSGAASPGLSSQQAAVAAQQNWRNGLGSPAVDSSDQ
YSQHGMGAFGMGSPANLSANAQLANLFALQQQMMQQQQMQQLNMAAAAGIALTPVQMMGL
QQQQQQAMLSPGGRGFGMGMNGMGMNGMMGMGMGGMGSPRRSPRQSDRSPGGKTNLPSTV
.
.
.
.

Output file 1 contents:
Code:
>seq_1
MSNQSPPQSQRPGHSHSHSHSHAGLASSTSSHSNPSANASYNLNGPRTGGDQRYRASVDA
>seq_2
AGAAGRGWGRDVTAAASPNPRNGGGRPASDLLSVGNAGGQASFASPETIDRWFEDLQHYE
>seq_3
ATLEEMAAASLDANFKEELSAIEQWFRVLSEAERTAALYSLLQSSTQVQMRFFVTVLQQM
ARADPITALLSPANPGQASMEAQMDAKLAAMGLKSPASPAVRQYARQSLSGDTYLSPHSA

Output file 2 contents:
Code:
>seq_4
TTLPPAPVSPTTTTQAEDAAAAATLASQRAKLKASSRISAPANILLGASGADGVKSPLWS
EKERVVERRSPSPSGRNVERPKSTGSTGEPAQPNNSHAGMNLSQSTGPPSASFLRSPAPD
>seq_5
FDSQLSPIVGGNWASMVNTPLMPMFGSKGGGEGGSFGGLASPGLDGATAKLGSWATGTTT
GQAGIVLDDVRKFRRSARISGSGATGFGGGALGGMYDDQPAQASTNGQQQRRVSPSQLNS
>seq_6
AQQNAINLGLAGLQQQQQQHQQQLRSGAASPGLSSQQAAVAAQQNWRNGLGSPAVDSSDQ
YSQHGMGAFGMGSPANLSANAQLANLFALQQQMMQQQQMQQLNMAAAAGIALTPVQMMGL
QQQQQQAMLSPGGRGFGMGMNGMGMNGMMGMGMGGMGSPRRSPRQSDRSPGGKTNLPSTV

I have a file containing a long list of sequences. How can I split it into several smaller files?
I need three sequences in each file.
For example, suppose my source file contains 300 sequences.
I need it split three sequences per file, for a total of 100 output files containing 3 sequences each.
Does anybody have an idea how to solve this?
Thanks a lot for your guidance.
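
One possible approach, sketched in Python rather than taken from the thread: read the input line by line, treat every line starting with `>` as the start of a new sequence, and open a new output file each time three sequences have been written. The function name `split_fasta`, its parameters, and the `output_N.txt` naming scheme are assumptions made for illustration only.

```python
from pathlib import Path


def split_fasta(input_path, records_per_file=3, prefix="output_"):
    """Write every `records_per_file` sequences to a new numbered file.

    Returns the list of output paths created. The naming scheme
    (prefix + running number + ".txt") is an assumption, not from the post.
    """
    outputs = []
    out = None
    count = 0  # number of ">" header lines seen so far
    with open(input_path) as src:
        for line in src:
            if line.startswith(">"):
                # A header starts a new sequence; switch to a new
                # output file after every `records_per_file` sequences.
                if count % records_per_file == 0:
                    if out is not None:
                        out.close()
                    path = Path(f"{prefix}{count // records_per_file + 1}.txt")
                    out = open(path, "w")
                    outputs.append(path)
                count += 1
            if out is not None:  # ignore stray lines before the first header
                out.write(line)
    if out is not None:
        out.close()
    return outputs
```

Calling `split_fasta("input.txt")` (the input file name is hypothetical) on a file with 300 sequences should produce `output_1.txt` through `output_100.txt`. Sequences that span several lines stay together, because only header lines trigger a switch to a new file. If the number of sequences is not a multiple of three, the last file simply holds the remainder.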

Last edited by pludi; 10-30-2009 at 05:51 AM.. Reason: code tags, please...
 
