sed working slow on big files
Post 302497147 by Perderabo on Wednesday 16th of February 2011 01:16:57 PM
Some process must be writing the file. Rewrite that process so it omits the trailing spaces, or pipe its output through the sed command as the file is written. As for something faster, maybe perl:
Code:
perl -pe 's/ *$//'

For real speed a custom C program is needed. But not writing the spaces in the first place would be optimal.
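A minimal sketch of the pipe-through approach, assuming a stand-in writer: the `produce` function below is hypothetical (the post does not name the real process), and `strip_sed` and `strip_perl` are illustrative wrapper names. The idea is that the filter sits in the pipeline, so the padded lines never reach the disk and no second pass over a big file is needed.

```shell
#!/bin/sh
# Hypothetical stand-in for the process that writes the file:
# it emits lines padded with trailing spaces.
produce() {
    printf 'alpha   \n'
    printf 'beta\n'
    printf 'gamma with inner  spaces      \n'
}

# Strip trailing spaces with sed as the data streams past.
strip_sed() {
    sed 's/ *$//'
}

# The same substitution with perl, as suggested in the post.
strip_perl() {
    perl -pe 's/ *$//'
}

# Filter inside the pipeline; append a '|' to each line so the
# (now missing) trailing spaces are easy to see on a terminal.
produce | strip_sed | sed 's/$/|/'
```

Swapping `strip_sed` for `strip_perl` in the last line gives the perl variant. Which one is faster depends on the platform and the data, so it is worth timing both on a sample of the real file.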

IO::Async::Process(3pm) 				User Contributed Perl Documentation				   IO::Async::Process(3pm)

NAME
       "IO::Async::Process" - start and manage a child process

SYNOPSIS
        use IO::Async::Process;

        use IO::Async::Loop;
        my $loop = IO::Async::Loop->new;

        my $process = IO::Async::Process->new(
           command => [ "tr", "a-z", "n-za-m" ],
           stdin => {
              from => "hello world\n",
           },
           stdout => {
              on_read => sub {
                 my ( $stream, $buffref ) = @_;
                 # Consume one complete line at a time from the buffer
                 while( $$buffref =~ s/^(.*)\n// ) {
                    print "Rot13 of 'hello world' is '$1'\n";
                 }

                 return 0;
              },
           },

           on_finish => sub {
              $loop->stop;
           },
        );

        $loop->add( $process );

        $loop->run;

DESCRIPTION
       This subclass of IO::Async::Notifier starts a child process, and invokes a callback when it exits. The child process can either execute a given block of code (via fork(2)), or a command.

EVENTS
       The following events are invoked, either using subclass methods or CODE references in parameters:

   on_finish $exitcode
       Invoked after the process has exited by normal means (i.e. an exit(2) syscall from a process, or "return"ing from the code block), and has closed all its file descriptors.

   on_exception $exception, $errno, $exitcode
       Invoked when the process exits by an exception from "code", or by failing to exec(2) the given command. $errno will be a dualvar, containing both number and string values.

       Note that this has a different name and a different argument order from "Loop->open_child"'s "on_error".

       If this is not provided and the process exits with an exception, then "on_finish" is invoked instead, being passed just the exit code.

CONSTRUCTOR
   $process = IO::Async::Process->new( %args )
       Constructs a new "IO::Async::Process" object and returns it.

       Once constructed, the "Process" will need to be added to the "Loop" before the child process is started.

PARAMETERS
       The following named parameters may be passed to "new" or "configure":

       on_finish => CODE
       on_exception => CODE
           CODE reference for the event handlers.

       Once the "on_finish" continuation has been invoked, the "IO::Async::Process" object is removed from the containing "IO::Async::Loop" object.

       The following parameters may be passed to "new", or to "configure" before the process has been started (i.e. before it has been added to the "Loop"). Once the process is running these cannot be changed.

       command => ARRAY or STRING
           Either a reference to an array containing the command and its arguments, or a plain string containing the command. This value is passed into perl's exec(2) function.

       code => CODE
           A block of code to execute in the child process. It will be called in scalar context inside an "eval" block.

       setup => ARRAY
           Optional reference to an array to pass to the underlying "Loop" "spawn_child" method.

       fdn => HASH
           A hash describing how to set up file descriptor n. The hash may contain the following keys:

           via => STRING
               Sets how this file descriptor will be configured for the child process. Must be given one of the following mode names:

               pipe_read
                   The child will be given the writing end of a pipe(2); the parent may read from the other.

               pipe_write
                   The child will be given the reading end of a pipe(2); the parent may write to the other. Since an EOF condition of this kind of handle cannot reliably be detected, "on_finish" will not wait for this type of pipe to be closed.

               pipe_rdwr
                   Only valid on the "stdio" filehandle. The child will be given the reading end of one pipe(2) on STDIN and the writing end of another on STDOUT. A single Stream object will be created in the parent configured for both filehandles.

               socketpair
                   The child will be given one end of a socketpair(2); the parent will be given the other. The family of this socket may be given by the extra key called "family"; defaulting to "unix". The socktype of this socket may be given by the extra key called "socktype"; defaulting to "stream". If the type is not "SOCK_STREAM" then an IO::Async::Socket object will be constructed for the parent side of the handle, rather than "IO::Async::Stream".

               Once the filehandle is set up, the "fd" method (or its shortcuts of "stdin", "stdout" or "stderr") may be used to access the "IO::Async::Handle"-subclassed object wrapped around it.

               The value of this argument is implied by any of the following alternatives.

           on_read => CODE
               The child will be given the writing end of a pipe. The reading end will be wrapped by an "IO::Async::Stream" using this "on_read" callback function.

           into => SCALAR
               The child will be given the writing end of a pipe. The referenced scalar will be filled by data read from the child process. This data may not be available until the pipe has been closed by the child.

           from => STRING
               The child will be given the reading end of a pipe. The string given by the "from" parameter will be written to the child. When all of the data has been written the pipe will be closed.

       stdin => ...
       stdout => ...
       stderr => ...
           Shortcuts for "fd0", "fd1" and "fd2" respectively.

       stdio => ...
           Special filehandle to affect STDIN and STDOUT at the same time. This filehandle supports being configured for both reading and writing at the same time.

METHODS
   $pid = $process->pid
       Returns the process ID of the process, if it has been started, or "undef" if not. Its value is preserved after the process exits, so it may be inspected during the "on_finish" or "on_exception" events.

   $process->kill( $signal )
       Sends a signal to the process.

   $running = $process->is_running
       Returns true if the Process has been started, and has not yet finished.

   $exited = $process->is_exited
       Returns true if the Process has finished running, and finished due to normal exit(2).

   $status = $process->exitstatus
       If the process exited due to normal exit(2), returns the value that was passed to exit(2). Otherwise, returns "undef".

   $exception = $process->exception
       If the process exited due to an exception, returns the exception that was thrown. Otherwise, returns "undef".

   $errno = $process->errno
       If the process exited due to an exception, returns the numerical value of $! at the time the exception was thrown. Otherwise, returns "undef".

   $errstr = $process->errstr
       If the process exited due to an exception, returns the string value of $! at the time the exception was thrown. Otherwise, returns "undef".

   $stream = $process->fd( $fd )
       Returns the IO::Async::Stream or IO::Async::Socket associated with the given FD number. This must have been set up by a "configure" argument prior to adding the "Process" object to the "Loop".

       The returned object will have its read or write handle set to the other end of a pipe or socket connected to that FD number in the child process. Typically, this will be used to call the "write" method on, to write more data into the child, or to set an "on_read" handler to read data out of the child.

       The "on_closed" event for these streams must not be changed, or it will break the close detection used by the "Process" object and the "on_finish" event will not be invoked.

   $stream = $process->stdin
   $stream = $process->stdout
   $stream = $process->stderr
   $stream = $process->stdio
       Shortcuts for calling "fd" with 0, 1, 2 or "io" respectively, to obtain the IO::Async::Stream representing the standard input, output, error, or combined input/output streams of the child process.

EXAMPLES
   Capturing the STDOUT stream of a process
       By configuring the "stdout" filehandle of the process using the "into" key, data written by the process can be captured.

        my $stdout;
        my $process = IO::Async::Process->new(
           command => [ "writing-program", "arguments" ],
           # "into" takes a reference to the scalar that collects the output
           stdout => { into => \$stdout },
           on_finish => sub {
              print "The process has finished, and wrote:\n";
              print $stdout;
           }
        );

        $loop->add( $process );

       Note that until "on_finish" is invoked, no guarantees are made about how much of the data actually written by the process is yet in the $stdout scalar.

       See also the "run_child" method of IO::Async::Loop.

       To handle data more interactively as it arrives, the "on_read" key can instead be used, to provide a callback function to invoke whenever more data is available from the process.

        my $process = IO::Async::Process->new(
           command => [ "writing-program", "arguments" ],
           stdout => {
              on_read => sub {
                 my ( $stream, $buffref ) = @_;
                 # Consume one complete line at a time from the buffer
                 while( $$buffref =~ s/^(.*)\n// ) {
                    print "The process wrote a line: $1\n";
                 }

                 return 0;
              },
           },
           on_finish => sub {
              print "The process has finished\n";
           }
        );

        $loop->add( $process );

       If the code to handle data read from the process isn't available yet when the object is constructed, it can be supplied later by using the "configure" method on the "stdout" filestream at some point before it gets added to the Loop. In this case, "stdout" should be configured using "pipe_read" in the "via" key.

        my $process = IO::Async::Process->new(
           command => [ "writing-program", "arguments" ],
           stdout => { via => "pipe_read" },
           on_finish => sub {
              print "The process has finished\n";
           }
        );

        $process->stdout->configure(
           on_read => sub {
              my ( $stream, $buffref ) = @_;
              while( $$buffref =~ s/^(.*)\n// ) {
                 print "The process wrote a line: $1\n";
              }

              return 0;
           },
        );

        $loop->add( $process );

   Sending data to STDIN of a process
       By configuring the "stdin" filehandle of the process using the "from" key, data can be written into the "STDIN" stream of the process.

        my $process = IO::Async::Process->new(
           command => [ "reading-program", "arguments" ],
           stdin => { from => "Here is the data to send\n" },
           on_finish => sub {
              print "The process has finished\n";
           }
        );

        $loop->add( $process );

       The data in this scalar will be written until it is all consumed, then the handle will be closed. This may be useful if the program waits for EOF on "STDIN" before it exits.

       To have the ability to write more data into the process once it has started, the "write" method on the "stdin" stream can be used, when it is configured using the "pipe_write" value for "via":

        my $process = IO::Async::Process->new(
           command => [ "reading-program", "arguments" ],
           stdin => { via => "pipe_write" },
           on_finish => sub {
              print "The process has finished\n";
           }
        );

        $loop->add( $process );

        $process->stdin->write( "Here is some more data\n" );

AUTHOR
       Paul Evans <leonerd@leonerd.org.uk>

perl v5.14.2				2012-10-24				   IO::Async::Process(3pm)