How to replace the complex strings from a file using sed or awk?


 
# 1  
Old 02-20-2015

Dear All,

I need to find the differences between two files and generate a discrepancy report from them as an HTML page. I prefer diff -y file1 file2 because its side-by-side layout makes it easy to spot discrepancies within a record as well as records that exist in only one of the two files. Here is how it looks.
File1:
Code:
ABCD*DEFG~ 
HI*JK~
LMN*OP~

File2:
Code:
ABCD*DEFG~
HIH*JK~
LMN*OP~
FGH*NM~

Output is :

Code:
ABCD*DEFG~                                                   ABCD*DEFG~
HI*JK~                                                        |  HIH*JK~
                                                                  > XY*Z~
LMN*OP~                                                        LMN*OP~
                                                                  > FGH*NM~

I need to wrap each line that contains bad data in HTML tags (as prefix and suffix) without altering the indentation of the output:

Code:
ABCD*DEFG~             ABCD*DEFG~
<font color="red">HI*JK~  | HIH*JK~</font>
<font color="red">            > XY*Z~</font>
LMN*OP~                            LMN*OP~
<font color="red">            > FGH*NM~</font>

I cannot use | and > as the field separator (FS) or delimiter in awk or sed, since my actual files might also contain those characters. Please suggest the best way to overcome this.

Last edited by Badhrish; 02-20-2015 at 04:24 AM.. Reason: Please use [code][/code] tags.
# 2  
Old 02-20-2015
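With a recent bash that provides "process substitution", you could try:
Code:
awk     'FNR==NR {T[$0]; next}   # first input: remember every differing line
         $0 in T {print "<font color=\"red\">" $0 "</font>"; next}   # second input: wrap remembered lines
         1                       # second input: print every other line unchanged
        ' <(diff -y --suppress-common-lines file[12]) <(diff -y file[12])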
<font color="red">ABCD*DEFG~                               |    ABCD*DEFG~</font>
<font color="red">HI*JK~                                  |    HIH*JK~</font>
LMN*OP~                                LMN*OP~
<font color="red">                                  >    FGH*NM~</font>


Last edited by RudiC; 02-20-2015 at 06:34 AM..
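For a shell without process substitution, the same idea can be sketched with a temporary file. This is only an illustration, not RudiC's code: the function name mark_diff and the awk variable pass are made up here. It also swaps the FNR==NR test for explicit pass=1 / pass=2 assignments, because FNR==NR misfires when the first input is empty (that is, when the two files are identical).

```shell
#!/bin/sh
# mark_diff OLD NEW
# Hypothetical helper (the name is an assumption, not from the thread):
# prints the side-by-side diff of OLD and NEW and wraps every differing
# line in <font color="red"> ... </font>.
mark_diff() {
    only=$(mktemp) || return 2

    # First awk input: only the differing lines. diff exits with 1 when the
    # files differ, so only a status above 1 is treated as an error.
    diff -y --suppress-common-lines "$1" "$2" > "$only" ||
        [ $? -eq 1 ] || { rm -f "$only"; return 2; }

    # Second awk input: the full side-by-side diff, read from stdin ("-").
    # pass=1 and pass=2 are ordinary awk variable assignments; each takes
    # effect when awk reaches it in the argument list.
    diff -y "$1" "$2" |
    awk 'pass == 1 { T[$0]; next }     # remember each differing line
         $0 in T   { print "<font color=\"red\">" $0 "</font>"; next }
                   { print }           # common line: leave untouched
        ' pass=1 "$only" pass=2 -

    rm -f "$only"
}
```

Called as mark_diff file1 file2, it should mark the same lines as the command above, since both match whole lines instead of splitting on | or >.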
# 3  
Old 02-20-2015
Hi Rudi, thanks for the reply, but I am not getting any output from this code. Please check whether I am missing something.

My file: diffoutput.txt
Code:
ABCD*DEFG~                                                  ABCD*DEFG~
HI*JK~                                                        | HIH*JK~
                                                              > XY*Z~
LMN*OP~                                                         LMN*OP~
                                                              > FGH*NM~

Awk code: awktest.sh
Code:
#! /bin/bash

awk     'FNR==NR {T[$0]; next}
         $0 in T {print "<font color=\"red\">" $0 "</font>"; next} 1 ' diffoutput.txt


Last edited by Scrutinizer; 02-20-2015 at 07:16 AM.. Reason: icode -> code tags
# 4  
Old 02-20-2015
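No surprise, as you're not printing anything. That awk needs two inputs: first the result of diff -y --suppress-common-lines file[12], second the result of diff -y file[12]. With only one input, FNR==NR is true for every line, so each line is stored in T and skipped by next, and nothing ever reaches the rules that print.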
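The two-input behaviour can be seen on toy data. In this sketch the file names marked.txt and all.txt and their contents are invented for illustration; marked.txt stands in for the --suppress-common-lines output and all.txt for the full diff.

```shell
#!/bin/sh
# Toy data, made up for illustration only.
dir=$(mktemp -d) || exit 1
printf 'b\n'       > "$dir/marked.txt"
printf 'a\nb\nc\n' > "$dir/all.txt"

prog='FNR==NR {T[$0]; next}            # true only while reading the first file
      $0 in T {print "[" $0 "]"; next} # later file: line was seen in the first
      1                                # later file: any other line'

# Two inputs: the first fills T, the second is printed with "b" marked.
two=$(awk "$prog" "$dir/marked.txt" "$dir/all.txt")

# One input: FNR==NR holds for every line, so every line is stored
# and skipped, and nothing is printed.
one=$(awk "$prog" "$dir/all.txt")

printf 'two inputs:\n%s\n' "$two"
printf 'one input: "%s"\n' "$one"
rm -rf "$dir"
```

With two inputs this prints a, [b] and c on separate lines; with one input the result is empty, which matches what happened in awktest.sh.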
# 5  
Old 02-20-2015
My bad. I altered it and the command worked like a beauty. But it is messing up the indentation of the final output: when I view it in an HTML page, the formatting is gone.
Each record has to stay on a single line without spilling onto the next. I used --width=100 to keep each record in its position, but in vain.
Is there a way to preserve the layout? I hope I am not asking too much!

Code:
ABCD*DEFG~                                                      ABCD*DEFG~
<font color="red">HI*JK~                                                              | HIH*JK~</font>LMN*OP~                                                           LMN*OP~
<font color="red">                                                            > FGH*NM~</font>
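A likely cause, though nobody in the thread confirmed it: a browser collapses runs of spaces, tabs and newlines unless the text sits inside a <pre> element. The sketch below is one possible way around that, under these assumptions: GNU diff is used (for -y and --expand-tabs), and the names diff_to_html, esc and pass are invented here. It wraps the report in <pre>, asks diff for spaces instead of tabs, and escapes &, < and > in the data so that a > gutter marker is not read as markup.

```shell
#!/bin/sh
# diff_to_html OLD NEW
# Hypothetical helper (name and structure are assumptions): prints the
# side-by-side diff as an HTML fragment whose column layout survives in a
# browser, with differing lines wrapped in <font color="red">.
diff_to_html() {
    only=$(mktemp) || return 2

    # Differing lines only; diff exit status 1 just means "files differ".
    diff -y --expand-tabs --suppress-common-lines "$1" "$2" > "$only" ||
        [ $? -eq 1 ] || { rm -f "$only"; return 2; }

    printf '<pre>\n'        # <pre> keeps spaces and line breaks as they are
    diff -y --expand-tabs "$1" "$2" |
    awk 'function esc(s) {  # escape characters that are special in HTML
             gsub(/&/, "\\&amp;", s)
             gsub(/</, "\\&lt;", s)
             gsub(/>/, "\\&gt;", s)
             return s
         }
         pass == 1 { T[$0]; next }   # first input: remember differing lines
         $0 in T   { print "<font color=\"red\">" esc($0) "</font>"; next }
                   { print esc($0) } # common line: escaped but not coloured
        ' pass=1 "$only" pass=2 -
    printf '</pre>\n'

    rm -f "$only"
}
```

Something like diff_to_html file1 file2 > report.html would then give a fragment to paste into the page. The lookup in T uses the raw line and escaping happens only when printing, so matching is not affected by the escaping.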

# 6  
Old 02-20-2015
Hi.

Also a few available utilities:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate colorize diff output.
# ANSIfilter:
# André Simon - Startseite

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C diff colordiff ansifilter

pl " Input data files data?:"
head data?

pl " Results:"
diff -y --suppress-common-lines data? |
colordiff |
ansifilter -B   # -B bbcode; -H html; -L latex, etc.

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
diff (GNU diffutils) 2.8.1
colordiff diff (GNU diffutils) 2.8.1
ansifilter - ( local: ~/executable/ansifilter, 2014-01-28 )

-----
 Input data files data?:
==> data1 <==
ABCD*DEFG~ 
HI*JK~
LMN*OP~

==> data2 <==
ABCD*DEFG~
HIH*JK~
LMN*OP~
FGH*NM~

-----
 Results:
ABCD*DEFG~                            | ABCD*DEFG~
HI*JK~                                | HIH*JK~
                                      > FGH*NM~

Best wishes ... cheers, drl
# 7  
Old 02-24-2015
Where to feed data

I am not good with C. Could you please let me know where exactly to feed the input files into your code? Also, is there a way to colour the text at exactly the position where the discrepancy is? Something like the below:

Code:
ABCD*DEFG~                            | ABCDE*DEFG~
HI*JK~                                | HIH*JK~
                                      > FGH*NM~


Last edited by Badhrish; 02-24-2015 at 05:12 AM..
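On the first question, the script in post #6 takes no arguments: it picks up its inputs through the shell pattern data?, which matched data1 and data2 in the current directory. On the second, the thread gives no answer. A rough sketch of one way to colour only the changed part of a record is below. It is an assumption-laden illustration: the function name highlight_change is invented, it works on a pair of raw records instead of on diff -y output, it finds only a single changed region (by common prefix and suffix), and it does no HTML escaping.

```shell
#!/bin/sh
# highlight_change OLD NEW
# Hypothetical helper: prints NEW with the part that differs from OLD
# wrapped in <font color="red">. The two strings are read from ARGV so
# that awk does not interpret backslashes in them.
highlight_change() {
    awk 'BEGIN {
        a = ARGV[1]; b = ARGV[2]
        la = length(a); lb = length(b)

        # p = length of the common prefix
        p = 0
        while (p < la && p < lb && substr(a, p + 1, 1) == substr(b, p + 1, 1))
            p++

        # s = length of the common suffix, never overlapping the prefix
        s = 0
        while (s < la - p && s < lb - p && substr(a, la - s, 1) == substr(b, lb - s, 1))
            s++

        # prefix, coloured middle, suffix
        printf "%s<font color=\"red\">%s</font>%s\n", substr(b, 1, p), substr(b, p + 1, lb - p - s), substr(b, lb - s + 1)
    }' "$1" "$2"
}
```

For example, highlight_change 'HI*JK~' 'HIH*JK~' prints HI<font color="red">H</font>*JK~. When NEW only lacks characters that OLD has, the coloured part comes out empty.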