sed multiple replace


 
# 1  
Old 10-11-2010

Hello! I'm using sed to perform a lot of replacements in one text file. I call it this way:
Code:
sed -f commands.txt in.txt > out.txt

commands.txt has about 1000 lines, each one is some variation of:
Code:
s/from/to/gI

And in.txt has about 300 000 lines. The problem is that the operation takes about 20 minutes to complete. Is there any way to optimize it? I suspect it is not working the way it should. Thanks.

Moderator's Comments:
Mod Comment Use code tags please, ty.

Last edited by zaxxon; 10-11-2010 at 05:55 AM..
# 2  
Old 10-11-2010
Command-wise it looks pretty optimal to me. Perhaps you could split "in.txt" into 4 or more parts, process them independently and in parallel in the background into 4 or more output files, and merge those afterwards.
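
A minimal sketch of that split-and-merge idea, in case it helps. The function name `parallel_sed` and its `ps_` variables are made up for illustration, and it assumes every rule in commands.txt works on one line at a time (no `N`, no line-number addresses), so the pieces can be processed independently. It also assumes a non-empty input file and that `mktemp -d` is available.

```shell
# parallel_sed SCRIPT INPUT OUTPUT [JOBS]
# Hypothetical helper: splits INPUT into about JOBS pieces on line boundaries,
# runs one background sed per piece, then concatenates the results in order.
parallel_sed() {
    ps_script=$1 ps_input=$2 ps_output=$3 ps_jobs=${4:-4}
    ps_tmp=$(mktemp -d) || return 1

    # Lines per piece, rounded up so that roughly JOBS pieces are created.
    ps_total=$(wc -l < "$ps_input")
    ps_per=$(( (ps_total + ps_jobs - 1) / ps_jobs ))
    [ "$ps_per" -gt 0 ] || ps_per=1

    split -l "$ps_per" "$ps_input" "$ps_tmp/part."

    # The glob is expanded once, before any .out file exists.
    for ps_f in "$ps_tmp"/part.*; do
        sed -f "$ps_script" "$ps_f" > "$ps_f.out" &
    done
    wait    # block until every background sed has finished

    # The glob expands in sorted order (part.aa, part.ab, ...), which keeps
    # the original line order.
    cat "$ps_tmp"/part.*.out > "$ps_output"
    rm -rf "$ps_tmp"
}
```

It would be called as `parallel_sed commands.txt in.txt out.txt 4`. The speed-up is limited by the number of free cores, so this only hides the per-rule cost rather than reducing it.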
# 3  
Old 10-11-2010
Quote:
Originally Posted by Scrutinizer
Command-wise it looks pretty optimal to me. Perhaps you could split "in.txt" into 4 or more parts, process them independently and in parallel in the background into 4 or more output files, and merge those afterwards.
Thanks for your reply, I'll try it out. Though I'm still convinced it works too slowly. I already tried reading from STDIN this way:
Code:
mysql -uUser -pPassword -Dbase --execute="SELECT id,str FROM in" | sed -f commands.txt > out.txt

And I still get the result file in 15-20 minutes.

---------- Post updated at 10:43 PM ---------- Previous update was at 08:31 PM ----------

Well, I divided in.txt into pieces of 50K lines each. That gave 7 files (each 1.5-4 MB). Then I ran several `sed` instances (separating them with &). The scheme works like a charm, but it still takes about 8 minutes to process all the pieces.

I just don't get it. The total size of the input text is just 20 MB. The file with sed commands is just 100 KB. These sizes are not that big, IMHO. So why is it so slow?

The computers are rather fast. I tried both Windows (2.6 GHz, 2 GB RAM) and Unix (Xeon 8x2.50 GHz, 1 GB RAM) servers.
# 4  
Old 10-11-2010
I think the problem is in the I-switch. I did a couple of quick tests. On average it took 10 times longer with the I-switch than it did without. Perhaps the I switch is not necessary everywhere, or perhaps you can prune the number of combinations of uppercase and lowercase search strings, for example, instead of:
Code:
s/from/to/gI

use
Code:
s/from/to/g
s/FROM/to/g

or
Code:
s/\(from\|FROM\)/to/g
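
If commands.txt really has about 1000 rules, rewriting them by hand is tedious, so here is a rough sketch that does the first rewrite (two case-sensitive rules per `gI` rule) with awk. The function name `expand_nocase` is made up, and it assumes the patterns are plain literal words with no `/` and no backslash escapes; any line that does not look like that is passed through unchanged.

```shell
# expand_nocase RULES: print RULES with every plain "s/from/to/gI" line
# replaced by two case-sensitive rules (all-lowercase and all-uppercase).
# Assumption: patterns are literal words without "/" or backslash escapes.
expand_nocase() {
    awk -F/ '
        NF == 4 && $1 == "s" && $4 == "gI" && $2 !~ /\\/ {
            lo = tolower($2); up = toupper($2)
            printf "s/%s/%s/g\n", lo, $3
            # Skip the second rule when case makes no difference (e.g. digits).
            if (up != lo) printf "s/%s/%s/g\n", up, $3
            next
        }
        { print }   # anything else is kept as-is
    ' "$1"
}
```

Usage would be `expand_nocase commands.txt > commands_fast.txt` followed by `sed -f commands_fast.txt in.txt > out.txt`. Note that this is not equivalent to the I-switch: mixed-case input such as "From" is no longer replaced, so it only fits if all-lowercase and all-uppercase are the only spellings that occur.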

# 5  
Old 10-11-2010
or try
Code:
sed 's/from\|FROM/to/g' file1 > file2

Code:
# time sed 's/from\|FROM/to/g' file1 > file2
real    0m0.269s
user    0m0.225s
sys     0m0.040s

# time sed 's/from/to/gi' file1 > file2
real    0m0.325s
user    0m0.289s
sys     0m0.033s

# time sed 's/\(from\|FROM\)/to/g' file1 > file2
real    0m0.791s
user    0m0.750s
sys     0m0.029s

# 6  
Old 10-11-2010
Thanks, ygemici. Confirmed: the \( \) are superfluous and take more time, so they should be left out.
# 7  
Old 10-11-2010
Thank you, Scrutinizer, for your excellent ideas in the forums too :)