Speeding up search and replace in a for loop


 
# 1  
Old 06-15-2012

Hello,
I am using sed in a loop to replace text in a 100MB file. I have about 55,000 entries to convert, stored in a CSV file with two fields per line. The following script searches file.txt for the first field of each line in conversion.csv and replaces it with the second field. It works, but it takes a long time to run. Does anyone have suggestions for speeding it up?

Code:
while read LINE
do
  OLD=$( echo $LINE | cut -d, -f1 )
  NEW=$( echo $LINE | cut -d, -f2 )
  sed -i "s#\"$OLD\"#\"$NEW\"#g" file.txt
done < conversion.csv

# 2  
Old 06-15-2012
You could try a sed command file: write all the substitute commands into one file, then call sed once, like this:
Code:
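# Split each CSV line on the comma; anything after the second field is ignored.
while IFS=, read -r OLD NEW _
do
  # One substitute command per entry, written without forking cut or echo.
  printf 's#"%s"#"%s"#g\n' "$OLD" "$NEW"
done < conversion.csv > commands.txt   # overwrite, so a rerun does not append duplicates
sed -i -f commands.txt file.txt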

But I am not sure whether sed can handle that many substitutions in one run. You will just have to try it.
# 3  
Old 06-15-2012
Post withdrawn. Major flaw found.

hergp's approach is essentially the same. Whether it works depends on the version of sed and how much free memory you have.

Last edited by methyl; 06-15-2012 at 09:45 AM..
# 4  
Old 06-15-2012
Similar strategy, different tools.

Code:
awk -F, '{print "%s/\""$1"\"/\""$2"\"/g"} END {print "x"}' conversion.csv | ex -s file.txt

or using sed/echo to generate the ex commands:

Code:
{ sed 's#,#"/"#; s#.*#%s/"&"/g#' conversion.csv; echo x; } | ex -s file.txt

Caveat: If the fields in the csv file contain regular expression metacharacters, none of these solutions (including your original) will work reliably (they could fail silently, as the metacharacters may not necessarily create invalid syntax).
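
To illustrate the caveat, here is a minimal sketch of one way to escape both fields before generating the sed commands. It is not from any post in this thread. The function name make_sed_script is hypothetical, and the sketch assumes '#' stays the delimiter, sed uses basic regular expressions, and mktemp and paste are available.

```shell
# Sketch: build a sed script from a two-field CSV with both fields escaped,
# so every entry is searched for and inserted literally.
# make_sed_script is a hypothetical helper name, not part of the thread.
make_sed_script() {
  csv=$1
  old=$(mktemp) || return 1
  new=$(mktemp) || return 1

  # Search side: escape BRE metacharacters, backslash and the '#' delimiter,
  # then wrap each line as   s#"OLD"#
  cut -d, -f1 "$csv" |
    sed -e 's/[]\#$*.^[]/\\&/g' -e 's/.*/s#"&"#/' > "$old"

  # Replacement side: only backslash, '&' and the delimiter are special,
  # then wrap each line as   "NEW"#g
  cut -d, -f2 "$csv" |
    sed -e 's/[\&#]/\\&/g' -e 's/.*/"&"#g/' > "$new"

  # Join the two halves line by line with no separator between them.
  paste -d '\0' "$old" "$new"
  rm -f "$old" "$new"
}

# Possible usage (file names as in the thread):
#   make_sed_script conversion.csv > commands.sed
#   sed -i -f commands.sed file.txt
```

This only removes the metacharacter problem. It still runs one substitution per CSV entry against every line, so it should not be expected to be faster than the solutions above.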

Regards,
Alister

Last edited by alister; 06-15-2012 at 10:50 AM.. Reason: Overlooked double quotes in original post.
# 5  
Old 06-17-2012
Thanks for the suggestions everyone. Here's how it went.
My original method: 99,640 CPU seconds
hergp's method: 207,613 CPU seconds
alister's awk method: 42,552 CPU seconds

Thanks alister for reducing the time by more than 50%!

Last edited by pbluescript; 06-17-2012 at 08:39 PM.. Reason: Forgot a word.
# 6  
Old 06-17-2012
Thank you for reporting back. Happy to help.

Regards,
Alister
# 7  
Old 06-18-2012
Wow, I did not expect ex to be so much more efficient than sed.

Do you have the total run-time in seconds for the three approaches too, pbluescript?
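
All three methods timed in post #5 apply roughly 55,000 substitutions to the whole file, which is likely where most of the CPU time goes. Below is a minimal sketch of a different technique that nobody in the thread proposed or timed: load conversion.csv into an awk array once, then look up each double-quoted string in file.txt. The function name replace_quoted is hypothetical. The sketch assumes that every target appears as a complete double-quoted string with balanced quotes, that fields contain no double quotes, and that conversion.csv is not empty.

```shell
# Sketch: single-pass, literal replacement of double-quoted strings.
# replace_quoted is a hypothetical helper; $1 = CSV (OLD,NEW), $2 = input file.
# The result is written to standard output; the input file is not modified.
replace_quoted() {
  awk -F, '
    # First file: load the lookup table (assumes the CSV is non-empty).
    NR == FNR { map[$1] = $2; next }

    # Second file: walk each line one double-quoted token at a time.
    {
      out = ""
      rest = $0
      while (match(rest, /"[^"]*"/)) {
        tok = substr(rest, RSTART + 1, RLENGTH - 2)   # text between the quotes
        if (tok in map) tok = map[tok]                # literal lookup, no regex
        out = out substr(rest, 1, RSTART - 1) "\"" tok "\""
        rest = substr(rest, RSTART + RLENGTH)
      }
      print out rest
    }
  ' "$1" "$2"
}

# Possible usage (file names as in the thread):
#   replace_quoted conversion.csv file.txt > file.new && mv file.new file.txt
```

Unlike the sed and ex versions, each quoted string is looked up exactly once, so a replacement is never re-matched by a later entry. If the conversion list relies on such chained replacements, the results will differ.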