How to replace the complex strings from a file using sed or awk?


 
# 8  
Old 02-24-2015
Hi, Badhrish.
Quote:
Could you please let me know where exactly to feed the input files in your code.
The heart of the solution is these lines:
Code:
diff -y --suppress-common-lines data? |
colordiff |
ansifilter -B   # -B bbcode; -H html; -L latex, etc.

The input files are provided just as you did with your diff, except that I called them data1 and data2, and I used the shell meta-character "?" to allow expansion of those filenames.
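
To make the filename expansion concrete, here is a small sketch. The temporary directory and the two-line sample files are made up for illustration; only the names data1 and data2 come from the post above. It assumes GNU diff, since -y and --suppress-common-lines are GNU options.

```shell
# Show how the shell expands "data?" before diff ever sees it.
# The temporary directory and file contents are illustrative only.
tmpdir=$(mktemp -d)
printf 'ABCD*DEFG~\nHI*JK~\n'  > "$tmpdir/data1"
printf 'ABCD*DEFG~\nHIH*JK~\n' > "$tmpdir/data2"

# "?" matches exactly one character, so data? matches both data1 and data2.
expanded=$(cd "$tmpdir" && echo data?)
echo "$expanded"

# diff therefore receives two file arguments, as if they had been typed out.
# diff exits with status 1 when the files differ, hence the "|| true".
side_by_side=$(cd "$tmpdir" && diff -y --suppress-common-lines data? || true)
echo "$side_by_side"

rm -rf "$tmpdir"
```

If more than two files match the pattern (say data1, data2 and data3), diff will complain about the extra operand, so the pattern should match exactly the two files to be compared.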

If you are just looking at the output at a terminal, then ansifilter is not required. I used it to produce bbcode markup to paste here. There are other uses, for example if you would be including the output in an HTML email message.
Quote:
... is there a way to get the text coloured exactly in the position where the discrepancy is? Something like below ...
Nothing occurs to me off-hand, but a Google search might be useful. If I get some time, I'll look into it.

Best wishes ... cheers, drl
# 9  
Old 02-25-2015
Thanks for your time DRL. Please let me know if you can crack it. I am also trying my best here.

I came across this error while running this code. Please suggest how to fix it.

Code:
$ ./newfilecompare.bash

-----
 Input data files data?:
==> File1.txt <==
ABCD*DEFG~
HI*JK~
LMN*OP~

==> File2.txt <==
ABCD*DEFG~
HIH*JK~
LMN*OP~
FGH*NM~

-----
 Results:
./newfilecompare.bash: line 21: colordiff: command not found
./newfilecompare.bash: line 22: ansifilter: command not found

---------- Post updated at 08:23 PM ---------- Previous update was at 01:11 PM ----------

Quote:
Originally Posted by RudiC
No surprise as you're not printing anything. That awk needs two files: first the result of diff -y --suppress-common-lines file[12], second the result of diff -y file[12].
Hi Rudi. This time my input is XML, so the font tags get bypassed in my HTML report. So I first converted my difference report into HTML using enscript; in its output, every "<" and ">" symbol is converted to the underlying entity code.

Input1:
Code:
<LIST>                                  
<ControlSegment                         
ISACONTROLNUMBER="58677398"             
GSCONTROLNUMBER="58677398"              
groupControlNumber="58677398"           
time="21:31:03.0130000-08:00"  />				
</LIST>

Input2:
Code:
<LIST>
<ControlSegment
ISACONTROLNUMBER="58677399"     
GSCONTROLNUMBER="58677399"      
groupControlNumber="58677399"   
time="21:31:03.2570000-08:00" /> 
entityIdentifierCode2=""
</LIST>



Code:
<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML>
<HEAD>
<TITLE>Enscript Output</TITLE>
</HEAD>
<BODY>
<A NAME="top">
<A NAME="file1">
<H1>xmlcompare.txt</H1>

<PRE>
&lt;LIST&gt;                                                          				&lt;LIST&gt;
&lt;ControlSegment                                                           				 &lt;ControlSegment
ISACONTROLNUMBER=&quot;58677398&quot;                                                           |     ISACONTROLNUMBER=&quot;58677399&quot;                     
GSCONTROLNUMBER=&quot;58677398&quot;                                                            |     GSCONTROLNUMBER=&quot;58677399&quot;        
groupControlNumber=&quot;58677398&quot;                                                         |       groupControlNumber=&quot;58677399&quot;                 
time=&quot;21:31:03.0130000-08:00&quot;  /&gt;                                                  |       time=&quot;21:31:03.2570000-08:00&quot; /&gt;      
												  &gt;    entityIdentifierCode2=&quot;&quot;
&lt;/LIST&gt;												&lt;/LIST&gt;

</PRE>
<HR>
<ADDRESS>Generated by <A HREF="http://www.iki.fi/~mtr/genscript/">GNU enscript 1.6.4</A>.</ADDRESS>
</BODY>
</HTML>

Then I fed the HTML file to the awk script. The output had every line wrapped in <font> tags (this time the symbols are not replaced by entity codes, so they appear as actual HTML tags), and when viewed in the browser the whole data appears red. Is there a better way to handle this? I am stuck here. Thank you.

Code:
awk     'FNR==NR {T[$0]; next}
         $0 in T {printf "\"<font color=\"red\">\"" $0 "\"</font>\""; next} 1 ' <(cat `pwd`/editedfile.html) <(cat `pwd`/editedfile.html)

# 10  
Old 02-25-2015
Hi.

The message colordiff: command not found means just that. It may be on your system, but you have not included its location in your PATH variable (the set of locations in which the shell looks for commands). Another reason might be that it is not installed on your system. Try running the command:
Code:
which colordiff

which on my main system produces:
Code:
/usr/bin/colordiff

On another system to which I have access, for example:
Code:
which colordiff

produces:
Quote:
/usr/bin/which: no colordiff in (/home/drl/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin)
because, although it is available for that system, I have not installed it from the repository. On that system:
Code:
yum info colordiff

produces:
Code:
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
 * base: mirrors.gigenet.com
 * centosplus: mirrors.cmich.edu
 * contrib: mirror.ubiquityservers.com
 * epel: ftp.osuosl.org
 * extras: mirror.us.leaseweb.net
 * updates: ftp.osuosl.org
Available Packages
Name        : colordiff
Arch        : noarch
Version     : 1.0.9
Release     : 3.el6
Size        : 23 k
Repo        : epel
Summary     : Color terminal highlighter for diff files
URL         : http://colordiff.sourceforge.net/
License     : GPLv2+
Description : Colordiff is a wrapper for diff and produces the same output but
            : with pretty syntax highlighting.  Color schemes can be customized.

You have not mentioned what system you are working on, so I cannot help further with this issue. In the worst case, the command may not be available at all for your hardware/software platform, or you may need to ask the system administrator (SA) to install it. On many Linux systems, if you are the SA, you may be able to install it yourself. Since it has been available on SourceForge, you may also be able to install it in your personal files and use it from there.
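
For a script that has to run on systems with and without colordiff, one option is to test for the command and fall back gracefully. This is only a sketch: the variable name colorizer and the choice of cat as the fallback are my assumptions, not something from the script discussed above.

```shell
# Pick a colorizer: use colordiff when it is on PATH, otherwise fall back
# to cat, which passes the diff output through unchanged.
# "command -v" is specified by POSIX and is a shell builtin, so it does not
# depend on an external "which" program being installed.
if command -v colordiff >/dev/null 2>&1; then
    colorizer=colordiff
else
    colorizer=cat        # hypothetical fallback: no color, same text
fi
echo "using: $colorizer"

# Intended use (file names are placeholders):
#   diff -y --suppress-common-lines data1 data2 | "$colorizer"
```

The same test could guard ansifilter, so the script degrades to plain text rather than stopping with "command not found".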

As I mentioned earlier, the ansifilter is not strictly necessary unless you intend to use the colored results other than on the terminal.

If all this is too much to take in or too much work for this task, then the other previous solutions may be a better use of your time.

I have found something which does illustrate character-level differences (insertions, deletions, replacements), but not in color. An advantage is that it is a shell script, so you would probably be able to use it easily, but no color is involved (although an enterprising person might be able to add color).
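
As a rough idea of what adding color at the character level could look like, here is a sketch in shell and awk. It is not the script mentioned above: the function name mark_diff is made up, and it only detects a single changed region (everything between the common prefix and the common suffix), which is far simpler than a real character-level diff.

```shell
# mark_diff OLD NEW: print NEW with the region that differs from OLD wrapped
# in red <font> tags. Only one contiguous changed region is detected
# (common prefix + common suffix); this is not a full diff algorithm.
mark_diff() {
    # Pass the strings through the environment so awk does not interpret
    # backslash sequences in them (as it would with -v assignments).
    OLD="$1" NEW="$2" awk 'BEGIN {
        a = ENVIRON["OLD"]; b = ENVIRON["NEW"]
        la = length(a); lb = length(b)
        min = (la < lb) ? la : lb

        # p = length of the common prefix
        p = 0
        while (p < min && substr(a, p + 1, 1) == substr(b, p + 1, 1)) p++

        # s = length of the common suffix, never overlapping the prefix
        s = 0
        while (s < min - p && substr(a, la - s, 1) == substr(b, lb - s, 1)) s++

        printf "%s<font color=\"red\">%s</font>%s\n",
               substr(b, 1, p), substr(b, p + 1, lb - p - s), substr(b, lb - s + 1)
    }'
}

marked=$(mark_diff 'HI*JK~' 'HIH*JK~')
printf '%s\n' "$marked"
```

For XML input, the text would still have to be HTML-escaped before the tags are added; the sketch leaves that out. If the two strings are identical, it prints an empty pair of font tags.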

Best wishes ... cheers, drl

( Edit 1: correct minor typos )

Last edited by drl; 02-25-2015 at 11:40 AM..
# 11  
Old 02-25-2015
What happens if you apply my UNALTERED script to the two files that contain the results of the two diff operations?
Don't cat the input files, don't cat the diff results; awk is well able to read those results.
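
A sketch of that workflow, with the two diff results written to files and awk reading them directly. The file names changed.txt and all.txt, the width of 80 and the temporary directory are assumptions for illustration; the sample data is the File1.txt/File2.txt content posted earlier, and the awk body follows the variant quoted in this thread, with printf given an explicit format. It assumes GNU diff.

```shell
tmpdir=$(mktemp -d)
printf 'ABCD*DEFG~\nHI*JK~\nLMN*OP~\n'           > "$tmpdir/File1.txt"
printf 'ABCD*DEFG~\nHIH*JK~\nLMN*OP~\nFGH*NM~\n' > "$tmpdir/File2.txt"

# Both diffs must use the same width; otherwise the lines of the first
# result will not be found verbatim in the second.
# diff exits with status 1 when the files differ, hence the "|| true".
diff -y --width=80 --suppress-common-lines \
     "$tmpdir/File1.txt" "$tmpdir/File2.txt" > "$tmpdir/changed.txt" || true
diff -y --width=80 \
     "$tmpdir/File1.txt" "$tmpdir/File2.txt" > "$tmpdir/all.txt" || true

# First file: remember every changed line. Second file: wrap the lines that
# were remembered, print all others unchanged. No cat is needed.
report=$(awk 'FNR==NR {T[$0]; next}
              $0 in T {printf "<font color=\"red\">%s</font>\n", $0; next}
              1' "$tmpdir/changed.txt" "$tmpdir/all.txt")
printf '%s\n' "$report"

rm -rf "$tmpdir"
```

One caveat: if the two input files are identical, changed.txt is empty and the FNR==NR test is also true for all.txt, so nothing is printed.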
# 12  
Old 02-26-2015
Hi DRL, sorry that I forgot to mention my system info. It is GNU/Linux, and I found that the colordiff package is not present. Since I am focusing on multi-platform compatibility (UNIX/Linux), I won't be able to use this command in my script. I really appreciate your help thus far.

---------- Post updated at 12:16 PM ---------- Previous update was at 11:58 AM ----------

Hi Rudi, your awk works perfectly for my XML files too. But when I convert my final text file into an HTML report using the enscript command, all "<" and ">" symbols are converted to the underlying codes "&lt;" and "&gt;". This also affects the font tags (which is not desired):
Code:
&lt;font color="red"&gt; and &lt;/font&gt;

Hence they appear as-is in the browser. That is why I applied the enscript command first and then ran awk over the resulting file, so as to get proper output like below; but this time, as I mentioned earlier, all the lines are affected.
Code:
<font color="red"> and </font>
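
One way around this ordering problem is to escape the data first and add the real markup afterwards, so that only the data is entity-encoded. The following is a minimal sketch that uses sed for the escaping instead of enscript; the function name html_escape and the sample line are illustrative assumptions.

```shell
# Escape the three HTML-significant characters. "&" must be handled first;
# otherwise the "&" of a freshly inserted "&lt;" would be escaped again.
html_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

line='<ControlSegment ISACONTROLNUMBER="58677399"'
escaped=$(printf '%s\n' "$line" | html_escape)

# Only now wrap the already-escaped text in real markup.
highlighted="<font color=\"red\">$escaped</font>"
printf '%s\n' "$highlighted"
```

With this order the browser shows the XML text literally, while the font tags remain real HTML. The surrounding page (for example a <PRE> block) then has to be written by the script itself rather than by enscript.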


Last edited by Badhrish; 02-26-2015 at 08:54 AM..
# 13  
Old 03-13-2015
Found Solution

Hello World, I've managed to encode the XML data (using sed replacements), which displays the XML data as-is in the browser while still applying the HTML attributes properly. I hope this will be useful for others who come across such a requirement. Thanks to Rudi and DRL for their code snippets.

Code:
awk 'FNR==NR {T[$0]; next}
     $0 in T {printf "<font color=\"red\">%s</font>", $0; next} 1 ' \
  <(diff --width=220 -y --suppress-common-lines FILE[12] |
    sed -e "s/\(.\)\([A-Za-z0-9]*\)\(>\)\(.\)/\1\2\&gt;\4/g" \
        -e "s/\(.\)\([A-Za-z0-9]*\)\(>\)/\1\2\&gt;/g" \
        -e "s/\(<\)\([A-Za-z0-9]*\)\(.*\)/\&lt;\2\3/g" \
        -e "s/\(.\)\(<\)\([A-Za-z0-9]*\)\(.\)/\1\&lt;\3\4/g") \
  <(diff --width=220 -y FILE[12] |
    sed -e "s/\(.\)\([A-Za-z0-9]*\)\(>\)\(.\)/\1\2\&gt;\4/g" \
        -e "s/\(.\)\([A-Za-z0-9]*\)\(>\)/\1\2\&gt;/g" \
        -e "s/\(<\)\([A-Za-z0-9]*\)\(.*\)/\&lt;\2\3/g" \
        -e "s/\(.\)\(<\)\([A-Za-z0-9]*\)\(.\)/\1\&lt;\3\4/g") |
  sed -r 's/([^^>])(<font>)/\1\n\2/g;s/(<\/font>)([^$>])/\1\n\2/g' > XMLTEMP1.txt
