Post 303017073 by gimley on Tuesday 8th of May 2018 10:34:08 PM
Modification of Perl script to split a large file into chunks of 5000 characters

I have a Perl script which splits a large file into chunks. The script is given below:
Code:
use strict;
use warnings;

# No encoding layer is set on the input, so lengths are counted in bytes.
open (FH, "<monolingual.txt") or die "Could not open source file. $!";
my $i = 0;
while (1) {
    my $chunk;
    print "process part $i\n";
    open(OUT, ">part$i.log") or die "Could not open destination file";
    $i++;
    # Read a fixed block of up to 5000 characters.
    if (!eof(FH)) {
        read(FH, $chunk, 5000);
        print OUT $chunk;
    }
    # Then append whatever remains of the current line.
    if (!eof(FH)) {
        $chunk = <FH>;
        print OUT $chunk;
    }
    close(OUT);
    last if eof(FH);
}

I want the script to create chunks of 5000 characters, or a bit less, but never more than that.
How do I modify it so that no chunk exceeds 5000 characters? When I run it, some chunks are more than 5000 characters.
Many thanks for your kind help.
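
The extra characters come from the second `if` block in the script above: after `read` has taken 5000 characters, `$chunk = <FH>` appends the rest of the current line, so a part can be 5000 characters plus up to one more line. One possible fix, assuming the chunks should still end on line boundaries, is to build each chunk from whole lines and stop before the limit would be exceeded. The sketch below does that. The helper names `next_chunk` and `split_file` are made up for illustration, and cutting a single over-long line hard at the limit is an assumption about the desired behaviour, not something stated in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the next chunk of at most $limit characters
# read from $fh, or undef at end of input. $pending is a reference to a
# scalar holding text that was read but did not fit into the previous chunk.
sub next_chunk {
    my ($fh, $limit, $pending) = @_;
    my $chunk = '';
    while (1) {
        # Refill the carry-over buffer with the next line when it is empty.
        if ($$pending eq '') {
            my $line = <$fh>;
            last unless defined $line;
            $$pending = $line;
        }
        my $room = $limit - length $chunk;
        if (length($$pending) <= $room) {
            # The whole line fits: append it and keep going.
            $chunk .= $$pending;
            $$pending = '';
        }
        elsif ($chunk eq '') {
            # Assumption: a single line longer than the limit is cut at $limit.
            $chunk = substr($$pending, 0, $limit, '');
            last;
        }
        else {
            # The line does not fit: leave it for the next chunk.
            last;
        }
    }
    return length($chunk) ? $chunk : undef;
}

# Hypothetical driver: write part0.log, part1.log, ... as in the original
# script, and return the number of part files written.
sub split_file {
    my ($in_path, $limit, $prefix) = @_;
    open(my $in, '<', $in_path) or die "Could not open $in_path: $!";
    my ($pending, $i) = ('', 0);
    while (defined(my $chunk = next_chunk($in, $limit, \$pending))) {
        my $out_path = "$prefix$i.log";
        print "process part $i\n";
        open(my $out, '>', $out_path) or die "Could not open $out_path: $!";
        print {$out} $chunk;
        close($out) or die "Could not close $out_path: $!";
        $i++;
    }
    close($in);
    return $i;
}

# Only runs when an input file is given on the command line.
split_file($ARGV[0], 5000, 'part') if @ARGV;
```

Saved under any name (say `split_chunks.pl`, a placeholder), it could be run as `perl split_chunks.pl monolingual.txt`. Lengths are counted in bytes here because no encoding layer is set. If "characters" should mean Unicode characters in a UTF-8 file, opening the handles with `<:encoding(UTF-8)` and `>:encoding(UTF-8)` should make `length` and `substr` work on characters instead.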
 
