Finding a text in files & replacing it with unique strings


 
# 1  
Old 01-25-2013

Hallo Everyone.
I have to admit I'm illiterate when it comes to shell scripting. I need to find certain strings in several text files and replace each of them with a unique, corresponding text.
I prepared a csv file with 3 columns: <filename>;<old_pattern>;<new_pattern>

Code:
dominik@dominik-VirtualBox:~/Pulpit/test/1$ cat file.csv
 file1.txt;abc;123
 file2.txt;XYZ;6789

A very kind guy (much more skilled than me) helped me with the script:

Code:
dominik@dominik-VirtualBox:~/Pulpit/test/1$ cat script.sh
 for i in `cat file.csv`; do
 file=`echo $i | cut -d ";" -f1`;
 pattern1=`echo $i | cut -d ";" -f2`;
 pattern2=`echo $i | cut -d ";" -f3`;
 sed -i "s/$pattern1/$pattern2/" $file;
 done

To find & replace text I run the script and grep:

Code:
dominik@dominik-VirtualBox:~/Pulpit/test/1$ bash script.sh
dominik@dominik-VirtualBox:~/Pulpit/test/1$ grep . file{1,2}*

With the above file.csv example the script works fine and does what's intended. The problem starts if <old_pattern> and <new_pattern> text have white spaces:

Code:
dominik@dominik-VirtualBox:~/Pulpit/test/1$ cat file.csv
 file1.txt;abc jkl nm;1 2 3
 file2.txt;XYZ rt;67 89

In that case the script returns errors. I tried to modify file.csv by putting the text between quotation marks, but it didn't help. How should the script be adjusted to work with text containing white space? I would appreciate any help from you. Thank you very much in advance. Regards,
gordom

Last edited by gordom; 01-28-2013 at 09:20 AM.. Reason: deleting text formatting
# 2  
Old 01-25-2013
How, exactly, do your current scripts fail? What diagnostic messages are being written? What output are you getting and what output do you want? Please also give us sample input files (and all of the desired outputs corresponding to those sample input files).

Does the 2nd field in file.csv contain fixed strings or regular expressions?
With your current scripts, it looks like you intentionally have leading spaces in file.csv that are not part of the file names and you have trailing spaces on some lines that you don't want to appear in the replacement strings. Is there some reason why you have leading and trailing spaces in file.csv?

Would you prefer to perform all of these actions in a single awk script instead of calling cut three times per file processed and sed once per file processed?
# 3  
Old 01-25-2013
Try it like this:
Code:
while read rec
do
  file=`echo "$rec" | cut -d ";" -f1`
  pattern1=`echo "$rec" | cut -d ";" -f2`
  pattern2=`echo "$rec" | cut -d ";" -f3`
  sed -i "s/${pattern1}/${pattern2}/" "$file"
done <file.csv

# 4  
Old 01-25-2013
First off, could you PLEASE stay away from the text formatting! Your text actually contained more formatting tags than actual text. If one tries to quote a part of it, like I did, it is hard work to sift through this endless stream of size, font and whatnot tags.

Quote:
Originally Posted by gordom
Code:
 file=`echo $i | cut -d ";" -f1`;
 pattern1=`echo $i | cut -d ";" -f2`;
 pattern2=`echo $i | cut -d ";" -f3`;

The problem starts if <old_pattern> and <new_pattern> text have white spaces:
The problem is in the lines I quoted for you. All these lines use unquoted strings, so the shell splits them at white space. The shell has a so-called "internal field separator" (IFS), which by default contains the space, tab and newline characters. This is how the shell understands that you are giving two arguments (and not one argument containing a space char) to a command in the following line:

Code:
command arg1 arg2

If you don't want this behavior, you would have to quote:

Code:
command "arg1 arg2"

The same is the case with your lines: if a part of a variable contains a space char the shell would see the following:

Code:
pattern1=`echo $i | cut -d ";" -f2`;         # before processing
pattern1=abc def geh;         # after evaluating the subshell

In this case the interpretable command would be "pattern1=abc" and "def" and "geh" would be treated as (indecipherable) other commands, which perhaps leads to some error message, in addition to "pattern1" not having the value you expect it to have.
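To watch the splitting happen, here is a small sketch, assuming bash or another POSIX shell and using one sample line from the question. With the original script, the words are already split in the list of the `for` loop, because the backtick substitution there is unquoted. A plain assignment such as `pattern1=...` keeps whatever the substitution returns, spaces included.

```shell
# Sample line taken from the file.csv in the question.
line='file1.txt;abc jkl nm;1 2 3'

# An unquoted command substitution in a "for" list is split on IFS
# (space, tab, newline), so the loop runs once per word, not once per line.
count=0
for i in $(printf '%s\n' "$line"); do
    count=$((count + 1))
    echo "word $count: $i"
done

# A plain assignment does not word-split the result of a command
# substitution, so the whole second field ends up in the variable.
pattern1=$(printf '%s\n' "$line" | cut -d ';' -f2)
echo "pattern1: $pattern1"
```

The loop body runs five times (for `file1.txt;abc`, `jkl`, `nm;1`, `2` and `3`), so `cut` never sees a complete record, while the standalone assignment yields `abc jkl nm`.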

First off, you really, really should not use backticks. Use them never, never ever, but use "$(....)" instead.
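One practical reason, shown as a minimal sketch: nesting. The path below is made up for illustration and does not need to exist, since `dirname` and `basename` only work on the string.

```shell
# Backticks need escaped inner backticks to nest ...
dir_old=`basename \`dirname /home/user/test/file.csv\``

# ... while $(...) nests without any escaping and can be quoted cleanly.
dir_new=$(basename "$(dirname /home/user/test/file.csv)")

echo "$dir_old $dir_new"
```

Both variables end up holding `test`, but only the second form stays readable once a third level of nesting is needed.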

Further, in this case you should use neither because the use of "echo | cut" is completely unnecessary. The shell can do this well on its own and it is even shorter and a lot faster:

Code:
while IFS=";" read file pattern replacement ; do
     sed -i "s/${pattern}/${replacement}/" "$file"
done

Another thing is: you really should NOT use "sed -i". The reasons are explained here in detail. Use a temporary file instead and delete it afterwards:

Code:
while IFS=";" read file pattern replacement ; do
     sed "s/${pattern}/${replacement}/" "$file" > tmpfile
     mv tmpfile "$file"
done
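One caveat with the loops above, assuming the CSV fields are meant as fixed strings rather than regular expressions: sed reads the second field as a basic regular expression, and `/`, `&` and `\` are special in the `s` command. The two helper functions below are hypothetical (they are not from this thread) and only sketch one way to escape both fields for sed without `-E`.

```shell
# Hypothetical helpers: escape a fixed string for use in the pattern
# or in the replacement part of sed's s/// command (basic regex syntax).
escape_pattern()     { printf '%s\n' "$1" | sed 's|[][\.*^$/]|\\&|g'; }
escape_replacement() { printf '%s\n' "$1" | sed 's|[\&/]|\\&|g'; }

pattern='a.c'        # as a regular expression this also matches "abc"
replacement='1&2'    # an unescaped & inserts the matched text

unescaped=$(printf 'abc a.c\n' | sed "s/${pattern}/${replacement}/")
escaped=$(printf 'abc a.c\n' |
    sed "s/$(escape_pattern "$pattern")/$(escape_replacement "$replacement")/")

echo "unescaped: $unescaped"
echo "escaped:   $escaped"
```

Without escaping, the first match is `abc` and the result is `1abc2 a.c`; with escaping, only the literal `a.c` is replaced and the result is `abc 1&2`.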


I hope this helps.

bakunin
# 5  
Old 01-26-2013
Little to add to bakunin's explanations, except for the data source to read from: redirect input of the entire loop to your file.csv:... done < file.csv
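Put together, the complete loop could look like the sketch below. The contents of file1.txt and file2.txt are made up for the demonstration (the thread does not show them), and everything runs in a scratch directory created with `mktemp -d`. `read -r` is an addition that keeps backslashes in the CSV fields intact.

```shell
# Scratch directory with the sample file.csv from the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'first abc jkl nm last\n' > file1.txt    # assumed file content
printf 'first XYZ rt last\n' > file2.txt        # assumed file content
printf '%s\n' 'file1.txt;abc jkl nm;1 2 3' 'file2.txt;XYZ rt;67 89' > file.csv

# The whole loop reads from file.csv via the redirection after "done".
while IFS=';' read -r file pattern replacement; do
    # Replace the original only if sed succeeded.
    sed "s/${pattern}/${replacement}/" "$file" > tmpfile &&
        mv tmpfile "$file"
done < file.csv

grep . file1.txt file2.txt
```

With this data, `grep` prints `file1.txt:first 1 2 3 last` and `file2.txt:first 67 89 last`.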
# 6  
Old 01-28-2013
Quote:
Originally Posted by bakunin
First off, could you PLEASE stay away from the text formatting!
Sorry for that - it was unintentional. I pasted text from a word processor - my mistake. The original post has already been corrected.

Quote:
Originally Posted by bakunin
In this case the interpretable command would be "pattern1=abc" and "def" and "geh" would be treated as (indecipherable) other commands, which perhaps leads to some error message, in addition to "pattern1" not having the value you expect it to have.
That's exactly what was happening.

Quote:
Originally Posted by bakunin
Code:
while IFS=";" read file pattern replacement ; do
     sed "s/${pattern}/${replacement}/" "$file" > tmpfile
     mv tmpfile "$file"
done

Quote:
Originally Posted by RudiC
Little to add to bakunin's explanations, except for the data source to read from: redirect input of the entire loop to your file.csv:... done < file.csv
In the end, thanks everyone for the help, especially bakunin & RudiC - the script seems to work perfectly now.
Regards,
gordom