Split a huge data into few different files?! Post 302367199 by Scrutinizer on Sunday 1st of November 2009 11:10:48 PM
Old 11-02-2009
Quote:
Originally Posted by patrick87
Hi Scrutinizer, do you have any idea how to get my desired output?
I tried replacing the spaces in the header with "_" and ran your suggested code.
Unfortunately, it still doesn't work.
Thanks a lot for your advice.
Hi patrick87,

The problem is that I cannot reproduce it: I put random spaces and ":" characters inside the labels of the input examples you gave, and both scripts still work as expected. I have to assume that your real-world data does not match the input format you posted. Please take a small part (say 7 records) of an actual, anonymized file, run my scripts on it, and check whether they still produce the strange results. If they do, post that sample input here, together with the strange resulting file names and their contents, so I can have a look.

S.
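
One possible way to prepare such a sample is sketched below. It is only an illustration under stated assumptions: the input file is a hypothetical stand-in generated on the spot, one record is assumed to occupy one line (the real records may span several lines), and anonymizing the content is left to you. Printing the sample with sed's `l` command makes stray characters such as carriage returns or trailing spaces visible, which is a common reason for real data to differ from a posted example.

```shell
#!/bin/sh
# Hypothetical stand-in for the real data file; point "input" at your own file instead.
input=$(mktemp)
printf 'record %s\r\n' 1 2 3 4 5 6 7 8 9 10 > "$input"

# Keep only the first 7 lines (assumption: one record per line).
sample=$(mktemp)
head -n 7 "$input" > "$sample"

# Print the sample unambiguously: the "l" command shows non-printing
# characters as escapes and marks the end of every line with "$".
sed -n 'l' "$sample"
```

If the records span several lines, the `head` step would need to be replaced by something that counts record separators rather than lines.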