Need help with sed replace script to include spaces


 
# 1  
Old 02-05-2015

I've got a sed shell script I've been using for quite a while to do batch find/replace operations on .xml files containing various types of text entries. This is the script I use:
Code:
#!/bin/bash
while read text_old text_new; do
        sed_script+="s/$text_old/$text_new/g;"
done < trans_old_to_new
find . -type f -name \*.xml -exec sed -i.BAK "$sed_script" '{}' +

Accompanying the shell script is a text or xml file with a simple 2-column structure. The left column contains the text to find, the right column contains the text to replace it with, and the two columns are separated with a space.
Code:
00-11-22-33-44-55 AA-BB-CC-DD-EE-FF
11-22-33-44-55-66 BB-CC-DD-EE-FF-GG

Is there a way to modify this shell script, or the format of the 2-column replacement file, to accommodate text strings containing spaces? For example:
Code:
Replace this text containing spaces SEPARATOR With this text containing spaces

# 2  
Old 02-06-2015
Change the delimiter in the file trans_old_to_new to any character that is not present in your search or replacement strings:

Code:
#!/bin/bash
IFS="|" # Set IFS to delimiter used in trans_old_to_new
while read text_old text_new; do
        sed_script+="s/$text_old/$text_new/g;"
done < trans_old_to_new
find . -type f -name \*.xml -exec sed -i.BAK "$sed_script" '{}' +

# 3  
Old 02-06-2015
Excellent, thank you. I've set it up to use a pipe and it works for plain text entries.
One thing I should have included in my original post was that text_old and text_new may contain forward slashes and/or backslashes for paths, so trans_old_to_new might contain something like this:
Code:
\some\old\path|\a\new\path

and the actual xml file might contain a line similar to this:
Code:
<Result path="S:\output\series\some\old\path\2014-10-01.txt" /><Source value="S:\input\fieldset\list.txt" />

The \some\old\path part of the first path (shown in pink in the original post) is what needs to be changed, but it is not. I tried escaping the backslashes in the trans_old_to_new file, like so:
Code:
\\some\\old\\path|\\a\\new\\path

but then I get the following error when running the script:
Code:
sed: -e expression #1, char 146: unterminated `s' command

Is there some different way of escaping the slashes?
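
One likely contributor to the behaviour described above is that `read` without `-r` treats a backslash as an escape character and removes it before sed ever sees the text. With doubled backslashes in the file, plain `read` hands single ones to sed, which GNU sed may then interpret as escape sequences of its own (for example `\n` as a newline). A minimal sketch, assuming bash; the variable names follow the script above and the sample line is the one from this post:

```shell
#!/bin/bash
line='\some\old\path|\a\new\path'

# Plain read: backslashes act as escape characters and are removed.
IFS='|' read text_old text_new <<< "$line"
plain="$text_old -> $text_new"

# read -r: backslashes are kept as ordinary characters.
IFS='|' read -r text_old text_new <<< "$line"
raw="$text_old -> $text_new"

printf '%s\n' "$plain" "$raw"
```

Writing `IFS='|'` directly in front of `read` also limits the changed separator to that one command instead of the whole script.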
# 4  
Old 02-06-2015
Quote:
Originally Posted by Agreppa
One thing I should have included in my original post was that text_old and text_new may contain forward- and/or back-slashes for paths,
This problem is related to the first one, but different. The first problem (spaces) was that the character you wanted to use was special in some way to the shell. The solution was to move that special meaning to another character (pipe instead of space) by redefining "IFS" (the internal field separator). Note that if you ever need the pipe char in your expressions, you can move the special meaning again, to any other character you do not otherwise need.

Back to your problem at hand: the backslash (like many other characters) has a meaning to sed as special as the one the space char has to the shell, but in this case you cannot shift it around. These characters are used in "regular expressions" and are called "metacharacters". Here is a list of the most commonly used ones: ".*&\[{]}/"

The backslash is used to strip their special meaning from these characters and make them ordinary characters again. For example, the dot "." is used to denote one character, regardless of which it is. The sequence "\." instead means a single dot char, and in the same way "\\" means a literal backslash. (The reverse does not always hold: a backslash in front of an ordinary character can give it a special meaning, e.g. GNU sed reads "\n" as a newline.) The script below simply escapes every character prior to using it in an expression. The only character we leave out is the "|", which will be consumed by the shell anyway:

Code:
#! /bin/bash
set -o lastpipe

local sed_script=""
local text_old=""
local text_new=""

sed 's/[^|]/\\&/g' /path/to/trans_old_to_new |\
while IFS="|" read text_old text_new junk ; do
     sed_script+="s/${text_old}/${text_old}/g;"
done

find . -type f -name \*.xml -exec sed -i.BAK "$sed_script" {} +

Notice that I used a pipeline and set "lastpipe" so that "$sed_script" is not local to the while-loop. This is only possible from a certain bash version upwards (IIRC 4.2; I do not know for sure), but I find this way of writing a loop fed by some data stream easier to read.
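
The scoping issue mentioned above can be seen in a small experiment. This is a sketch, assuming bash run as a script with default options; the variable names are made up for the illustration. As far as the bash documentation goes, `lastpipe` is a `shopt` option (`shopt -s lastpipe`, bash 4.2 and later) rather than a `set -o` option, and it only takes effect when job control is off.

```shell
#!/bin/bash
# Pipeline: without lastpipe the loop runs in a subshell,
# so the variable it builds is lost when the loop ends.
piped=""
printf '%s\n' one two | while read -r word; do
    piped+="$word;"
done

# Redirection ("bottom-fed" loop): the loop runs in the current shell,
# so the variable survives.
redirected=""
while read -r word; do
    redirected+="$word;"
done < <(printf '%s\n' one two)

printf 'piped=[%s] redirected=[%s]\n' "$piped" "$redirected"
```

The second form works on older bash versions too, at the cost of putting the data source at the bottom of the loop.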

I also introduced a third variable "junk" into the read-statement to filter out unwanted artefacts that might be on some input lines.

I hope this helps.

bakunin
# 5  
Old 02-06-2015
Or use a plain-text substitution.
This is not possible with sed, but it is with perl and with awk:
Code:
#!/bin/bash
find . -type f -name \*.xml |
while IFS="" read -r targetfile
do
 cp -p "$targetfile" "$targetfile".BAK &&
 awk -F"|" '
  # First file: load the old|new pairs.
  NR==FNR {map[$1]=$2; next}
  # Other lines: replace each literal occurrence; i>li stops rescanning the same spot.
  {
    for (m in map)
      for (li=0; (i=index($0,m)) && i>li; li=i)
        $0=substr($0,1,i-1) map[m] substr($0,i+length(m))
    print
  }
 ' trans_old_to_new "$targetfile".BAK > "$targetfile"
done
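
A variant of the same plain-text idea, written so that the scan continues after each replacement and already replaced text is never searched again. This is a sketch under assumptions: `literal_replace` is a hypothetical helper name, any POSIX awk should do, and the pairs are assumed not to overlap, because awk's `for (old in map)` visits them in no particular order.

```shell
#!/bin/bash
# literal_replace MAPFILE TARGET: print TARGET with every "old|new" pair
# from MAPFILE applied as a plain-text (non-regex) substitution.
literal_replace() {
    awk -F'|' '
        # First file: load the pairs, ignoring lines with an empty first column.
        NR == FNR { if ($1 != "") map[$1] = $2; next }
        {
            for (old in map) {
                out = ""
                rest = $0
                # Copy the text before each match, append the replacement,
                # then continue after the match.
                while ((i = index(rest, old)) > 0) {
                    out = out substr(rest, 1, i - 1) map[old]
                    rest = substr(rest, i + length(old))
                }
                $0 = out rest
            }
            print
        }
    ' "$1" "$2"
}
```

It could be called as `literal_replace trans_old_to_new file.xml.BAK > file.xml` inside the loop above, in place of the inline awk program.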

# 6  
Old 02-06-2015
Bakunin, that script throws a bunch of errors:
Code:
/path/to/translate.sh: line 2: set: lastpipe: invalid option name
/path/to/translate.sh: line 4: local: can only be used in a function
/path/to/translate.sh: line 5: local: can only be used in a function
/path/to/translate.sh: line 6: local: can only be used in a function

Also, I think in your script you may have duplicated the column reference here:
Code:
sed_script+="s/${text_old}/${text_old}/g;"

Just to be clear, there is no file named text_old nor text_new. There are 3 files in total:
1. the xml file containing the text to be replaced
2. the translate.sh script containing the sed command
3. the trans_old_to_new file, containing two columns: old text and new text.
# 7  
Old 02-06-2015
Quote:
Originally Posted by Agreppa
Bakunin, that script throws a bunch of errors:
Sorry, copied that off from the terminal without double-checking: "text_old" should be "text_new" of course, and "local" is an alias in my environment - use "typeset" instead.

Quote:
Originally Posted by Agreppa
Code:
/path/to/translate.sh: line 2: set: lastpipe: invalid option name

This, I am afraid, I can do nothing about: as I said, only relatively recent bash versions know this option, and yours might not be recent enough. As I explained, I used it only to avoid the ugly bottom-steered while-loop, which you seem to have to use in this case.

I hope this helps.

bakunin
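
Pulling the suggestions in this thread together, one possible shape for the script is sketched below. It is not a tested drop-in replacement: the function names are hypothetical, GNU or POSIX sed is assumed, and only the characters that are special in a basic regular expression or in the replacement half of `s///` are escaped. `read -r` keeps backslashes intact, and the loop is fed from the bottom so no `lastpipe` is needed.

```shell
#!/bin/bash
# Sketch only: function names are illustrative, not from the thread.

# Escape the characters that are special in a sed basic regular expression,
# plus the "/" used as the s-command delimiter.
escape_pattern() {
    printf '%s\n' "$1" | sed 's|[\\/.*^$[]|\\&|g'
}

# Escape the characters that are special in the replacement half of s///.
escape_replacement() {
    printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

# Read "old|new" pairs from the file named in $1 and print one sed script.
build_sed_script() {
    local text_old text_new script=""
    while IFS='|' read -r text_old text_new; do
        [ -n "$text_old" ] || continue   # skip blank lines
        script+="s/$(escape_pattern "$text_old")/$(escape_replacement "$text_new")/g;"
    done < "$1"
    printf '%s\n' "$script"
}

# Usage, mirroring the original script:
#   sed_script=$(build_sed_script trans_old_to_new)
#   find . -type f -name '*.xml' -exec sed -i.BAK "$sed_script" {} +
```

With this approach the entries in trans_old_to_new stay unescaped, e.g. `\some\old\path|\a\new\path`, and the script adds the backslashes sed needs.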
 