Finding a text in files & replacing it with unique strings | Unix Linux Forums | Shell Programming and Scripting

#1
01-25-2013
gordom

Hello everyone.
I have to admit I'm illiterate when it comes to shell scripting. I need to find certain strings in several text files and replace each string with its own unique, corresponding text.
I prepared a CSV file with 3 columns: <filename>;<old_pattern>;<new_pattern>


Code:
dominik@dominik-VirtualBox:~/Pulpit/test/1$ cat file.csv
 file1.txt;abc;123
 file2.txt;XYZ;6789

A very kind guy (much more skilled than me) helped me with the script:


Code:
dominik@dominik-VirtualBox:~/Pulpit/test/1$ cat script.sh
 for i in `cat file.csv`; do
 file=`echo $i | cut -d ";" -f1`;
 pattern1=`echo $i | cut -d ";" -f2`;
 pattern2=`echo $i | cut -d ";" -f3`;
 sed -i "s/$pattern1/$pattern2/" $file;
 done

To find & replace text I run the script and grep:


Code:
dominik@dominik-VirtualBox:~/Pulpit/test/1$ bash script.sh
dominik@dominik-VirtualBox:~/Pulpit/test/1$ grep . file{1,2}*

With the above file.csv example the script works fine and does what's intended. The problem starts if <old_pattern> and <new_pattern> text have white spaces:


Code:
dominik@dominik-VirtualBox:~/Pulpit/test/1$ cat file.csv
 file1.txt;abc jkl nm;1 2 3
 file2.txt;XYZ rt;67 89

In that case the script returns errors. I tried to modify file.csv by putting the text between quotation marks, but it didn't help. How should the script be adjusted to work with text containing white space? I would appreciate any help from you. Thank you very much in advance. Regards,
gordom

Last edited by gordom; 01-28-2013 at 08:20 AM.. Reason: deleting text formatting
#2
01-25-2013
Don Cragun
How, exactly, do your current scripts fail? What diagnostic messages are being written? What output are you getting and what output do you want? Please also give us sample input files (and all of the desired outputs corresponding to those sample input files).

Does the 2nd field in file.csv contain fixed strings or regular expressions?
With your current scripts, it looks like you intentionally have leading spaces in file.csv that are not part of the file names and you have trailing spaces on some lines that you don't want to appear in the replacement strings. Is there some reason why you have leading and trailing spaces in file.csv?

Would you prefer to perform all of these actions in a single awk script instead of calling cut three times per file processed and sed once per file processed?
#3
01-25-2013
spacebar
Try it like this:

Code:
while read -r rec
do
  # Quote the expansions so fields containing spaces stay intact
  file=`echo "$rec" | cut -d ";" -f1`
  pattern1=`echo "$rec" | cut -d ";" -f2`
  pattern2=`echo "$rec" | cut -d ";" -f3`
  sed -i "s/${pattern1}/${pattern2}/" "$file"
done <file.csv

#4
01-25-2013
bakunin
First off, could you PLEASE stay away from the text formatting! Actually your text contained more formatting tags than actual text. If one tries to quote a part of it, like I did, it is hard work to sift through this endless stream of size, font, and whatnot tags.

Quote:
Originally Posted by gordom
Code:
 file=`echo $i | cut -d ";" -f1`;
 pattern1=`echo $i | cut -d ";" -f2`;
 pattern2=`echo $i | cut -d ";" -f3`;

The problem starts if <old_pattern> and <new_pattern> text have white spaces:
The problem is in the lines I quoted for you. All these lines are unquoted strings and therefore spaces are processed away by the shell. The shell has a so-called "internal field separator" (IFS), which by default consists of the space, tab, and newline characters. This way the shell understands that you give two arguments (and not one which contains a space char) to a command in the following line:


Code:
command arg1 arg2

If you don't want this behavior, you would have to quote:


Code:
command "arg1 arg2"
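
The difference can be checked with a tiny experiment. This is only a sketch: count_args is a hypothetical helper (not part of the original thread) that prints how many arguments it received.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() {
    echo "$#"
}

unquoted=$(count_args arg1 arg2)      # the shell passes two separate words
quoted=$(count_args "arg1 arg2")      # the quotes keep the space inside one word

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

The first call reports 2 arguments, the second reports 1.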

The same is the case with your lines: if a part of a variable contains a space char the shell would see the following:


Code:
pattern1=`echo $i | cut -d ";" -f2`;         # before processing
pattern1=abc def geh;         # after evaluating the subshell

In this case the interpretable command would be "pattern1=abc" and "def" and "geh" would be treated as (indecipherable) other commands, which perhaps leads to some error message, in addition to "pattern1" not having the value you expect it to have.
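
A quick experiment helps to see where the splitting actually takes place. The sketch below uses a sample record in the file.csv layout from the question; the variable names record and words are made up for illustration. On a POSIX shell, the result of a command substitution on the right-hand side of an assignment is not split into words, so the assignment itself keeps the spaces. A for loop over unquoted command output, on the other hand, gets one whitespace-separated word per iteration, which suggests that the loop header of the original script (for i in `cat file.csv`) is the place where records containing spaces fall apart.

```shell
#!/bin/sh
# Sample record with spaces, in the file.csv layout from the question.
record='file1.txt;abc jkl nm;1 2 3'

# Assignment from a command substitution: the whole second field is stored.
pattern1=$(echo "$record" | cut -d ';' -f2)
echo "pattern1=[$pattern1]"

# for loop over unquoted command output: one iteration per whitespace-separated word.
words=0
for i in $(echo "$record"); do
    words=$((words + 1))
    echo "word $words: [$i]"
done
```

Here pattern1 ends up as "abc jkl nm", while the loop runs five times, with the pieces "file1.txt;abc", "jkl", "nm;1", "2" and "3". Cutting fields out of such pieces is what produces wrong file names and broken sed expressions.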

First off, you really, really should not use backticks. Use them never, never ever, but use "$(....)" instead.

Further, in this case you should use neither because the use of "echo | cut" is completely unnecessary. The shell can do this well on its own and it is even shorter and a lot faster:


Code:
while IFS=";" read file pattern replacement ; do
     sed -i "s/${pattern}/${replacement}/" "$file"
done
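
To see how read splits a single record, here is a minimal sketch with one sample line fed in through a here-document. The record variable and the -r option (which stops read from treating backslashes specially) are additions for illustration and are not part of the loop above.

```shell
#!/bin/sh
record='file1.txt;abc jkl nm;1 2 3'

# IFS=";" applies only to this read: the line is split on semicolons,
# and the spaces stay inside the fields.
IFS=";" read -r file pattern replacement <<EOF
$record
EOF

echo "file=[$file]"
echo "pattern=[$pattern]"
echo "replacement=[$replacement]"
```

The three variables come out as "file1.txt", "abc jkl nm" and "1 2 3". Note that with IFS set to ";" only, a leading space in a record (as in the sample file.csv) would stay part of the file name.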

Another thing is: you really should NOT use "sed -i". The reasons are explained here in detail. Use a temporary file instead and delete it afterwards:


Code:
while IFS=";" read file pattern replacement ; do
     sed "s/${pattern}/${replacement}/" "$file" > tmpfile
     mv tmpfile "$file"
done


I hope this helps.

bakunin
#5
01-26-2013
RudiC
Little to add to bakunin's explanations, except for the data source to read from: redirect input of the entire loop to your file.csv: ... done < file.csv
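
Putting bakunin's loop and this redirection together, a complete script might look like the sketch below. Several details are assumptions rather than anything posted in the thread: the function name replace_from_csv, the -r option to read, the per-process temporary file name, and the stripping of leading spaces from the file name (the sample file.csv has a leading space on each line).

```shell
#!/bin/sh
# Sketch: apply "<filename>;<old_pattern>;<new_pattern>" records to the named files.
# replace_from_csv is a hypothetical name chosen for this example.
replace_from_csv() {
    csv=$1
    while IFS=";" read -r file pattern replacement; do
        # Strip leading spaces from the file name (assumption based on the sample file.csv).
        file=${file#"${file%%[! ]*}"}
        [ -n "$file" ] || continue          # skip empty lines

        # Temporary file next to the target; the name is an illustrative choice.
        tmpfile="${file}.tmp.$$"
        if sed "s/${pattern}/${replacement}/" "$file" > "$tmpfile"; then
            mv "$tmpfile" "$file"
        else
            rm -f "$tmpfile"                # leave the original untouched on failure
        fi
    done < "$csv"
}
```

It would be called as replace_from_csv file.csv. Keep in mind that sed still treats the second column as a regular expression, and that a "/" or "&" in either column would need escaping, so this sketch only suits plain strings like the ones in the example.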
#6
01-28-2013
gordom
Quote:
Originally Posted by bakunin
First off, could you PLEASE stay away from the text formatting!
Sorry about that, it was unintentional. I pasted the text from a word processor; my mistake. The original post has been corrected.

Quote:
Originally Posted by bakunin
In this case the interpretable command would be "pattern1=abc" and "def" and "geh" would be treated as (indecipherable) other commands, which perhaps leads to some error message, in addition to "pattern1" not having the value you expect it to have.
That is exactly what was happening.

Quote:
Originally Posted by bakunin
Code:
while IFS=";" read file pattern replacement ; do
     sed "s/${pattern}/${replacement}/" "$file" > tmpfile
     mv tmpfile "$file"
done

Quote:
Originally Posted by RudiC
Little to add to bakunin's explanations, except for the data source to read from: redirect input of the entire loop to your file.csv: ... done < file.csv
Finally, thanks to everyone for the help, especially bakunin & RudiC. The script seems to work perfectly now. Great!
Regards,
gordom