Format diff output


 
# 1  
Old 04-01-2010

I need to compare two directories of tab-separated files, and I'm using diff to do it. diff reports which lines differ, but not which column values differ. Is there any way to format the diff output so that it shows the differing columns? Thanks.

Code:
f1.txt
210	998877	phone	9981128209	add	111 nw st.
310	998877	usg	650	ex	11
310	998877	usg	850	ex	11
410	998877	web	1003		
210	998878	phone	9981128210	add	112 nw st.
310	998878	usg	750	ex	11
410	998878	web	930		
	

f2.txt
210	998877	phone	9981128209	add	111 nw st.
310	998877	usg	650	ex	11.00
310	998877	usg	750	ex	11
410	998877	web	1203		
210	998878	phone	9981128210	add	112 nw st.
310	998878	usg	750	ex	11
410	998878	web	850		

diff output -
 
 diff -b f1.txt f2.txt

2,4c2,4
< 310   998877  usg     650     ex      11
< 310   998877  usg     850     ex      11
< 410   998877  web     1003
---
> 310   998877  usg     650     ex      11.00
> 310   998877  usg     750     ex      11
> 410   998877  web     1203
7c7
< 410   998878  web     930
---
> 410   998878  web     850

I want to reformat this as (first two columns, name of the differing column, value in f1.txt, value in f2.txt):

310 998877 column6 11 11.00
310 998877 column4 850 750
410 998877 column4 1003 1203
410 998878 column4 930 850

# 2  
Old 04-01-2010
I put this together quickly at the command line and pasted it here once the results came back correct. You're better off saving the awk program in a file and running it with the -f option, i.e.

Code:
diff <diff options and files> | awk -f <awkfile>

I assumed that columns 1-3 always match. If that is not the case, modify the script to suit your needs.

Code:
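diff -b f1.txt f2.txt | awk '
# Collect changed lines: ">" lines come from f2.txt, "<" lines from f1.txt.
/^>/ { a[i++] = $0; maxi = i }
/^</ { b[j++] = $0; maxj = j }

END {
  # Pair the n-th "<" line with the n-th ">" line. This only works when
  # every hunk is a change ("c") with equally many lines on both sides.
  for (i = 0; i < maxi; i++) {
    split(a[i], c); split(b[i], d)   # c[1] and d[1] hold the ">" / "<" marker
    # Appending "" forces a string comparison, so 11 and 11.00 count as different.
    # Only the first differing column of each line is reported.
    if ((c[5] "") != (d[5] ""))
      printf "%s %s column4 %s %s\n", c[2], c[3], d[5], c[5]
    else if ((c[6] "") != (d[6] ""))
      printf "%s %s column5 %s %s\n", c[2], c[3], d[6], c[6]
    else if ((c[7] "") != (d[7] ""))
      printf "%s %s column6 %s %s\n", c[2], c[3], d[7], c[7]
  }
}'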
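
If the two files always hold the same records in the same order, the columns can also be compared directly, without going through diff. What follows is only a sketch under that assumption: coldiff and coldiff_dirs are hypothetical helper names, not existing commands. It splits on tabs, so empty columns keep their position, and it reports every differing column rather than just the first. Unlike diff -b, it does not ignore changes in the amount of whitespace.

```shell
#!/bin/sh
# coldiff OLD NEW: report each tab-separated column that differs, line by line.
# Hypothetical helper; assumes both files list the same records in the same order.
coldiff() {
  awk -F '\t' -v other="$2" '
    {
      # Read the matching line of the second file.
      if ((getline line < other) <= 0) {
        printf "line %d: no matching line in %s\n", FNR, other
        next
      }
      n = split(line, new, "\t")
      if (NF > n) n = NF
      for (k = 1; k <= n; k++) {
        # Appending "" forces a string comparison, so 11 and 11.00 differ.
        if (($k "") != (new[k] ""))
          printf "%s %s column%d %s %s\n", $1, $2, k, $k, new[k]
      }
    }
    END {
      # Count lines left over in the second file.
      while ((getline line < other) > 0) extra++
      if (extra) printf "%d extra line(s) in %s\n", extra, other
    }' "$1"
}

# coldiff_dirs DIR1 DIR2: run coldiff on every file of DIR1 that also exists in DIR2.
coldiff_dirs() {
  for f in "$1"/*; do
    name=${f##*/}
    if [ -f "$f" ] && [ -f "$2/$name" ]; then
      out=$(coldiff "$f" "$2/$name")
      # Print a header only for files that actually differ.
      if [ -n "$out" ]; then
        printf '== %s\n%s\n' "$name" "$out"
      fi
    fi
  done
}
```

Run it as coldiff f1.txt f2.txt for a single pair of files, or coldiff_dirs dir1 dir2 for two directories. Files that exist in only one of the directories are skipped silently.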

# 3  
Old 04-03-2010
Hi.

There are several schemes to draw attention to the parts of lines that differ. One such is colordiff. Colorized text is not easy to paste in here, but when colordiff is fed the output of wdiff, the result also marks up each line with the deletions and insertions. For example:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate coloration of diff output.
# colordiff - a tool to colorize diff output

# Infrastructure details, environment, commands for forum posts. 
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo ; echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
c=$( ps | grep $$ | awk '{print $NF}' )
version >/dev/null 2>&1 && s=$(_eat $0 $1) || s=""
[ "$c" = "$s" ] && p="$s" || p="$c"
version >/dev/null 2>&1 && version "=o" $p wdiff colordiff
set -o nounset
echo

echo " Samples of data files:"
specimen data1 data2 \
|| { head -5 $FILE ; echo " --" ; tail -5 $FILE; }

echo
echo " Results:"
wdiff -n data1 data2 |
colordiff

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0 
GNU bash 3.2.39
GNU wdiff 0.5
colordiff diff (GNU diffutils) 2.8.1

 Samples of data files:
Whole: 5:0:5 of 7 lines in file "data1"
210	998877	phone	9981128209	add	111 nw st.
310	998877	usg	650	ex	11
310	998877	usg	850	ex	11
410	998877	web	1003		
210	998878	phone	9981128210	add	112 nw st.
310	998878	usg	750	ex	11
410	998878	web	930		

Whole: 5:0:5 of 7 lines in file "data2"
210	998877	phone	9981128209	add	111 nw st.
310	998877	usg	650	ex	11.00
310	998877	usg	750	ex	11
410	998877	web	1203		
210	998878	phone	9981128210	add	112 nw st.
310	998878	usg	750	ex	11
410	998878	web	850	

 Results:
210	998877	phone	9981128209	add	111 nw st.
310	998877	usg	650	ex	[-11-]	{+11.00+}
310	998877	usg	[-850-]	{+750+}	ex	11
410	998877	web	[-1003-]	{+1203+}		
210	998878	phone	9981128210	add	112 nw st.
310	998878	usg	750	ex	11
410	998878	web	[-930-]	{+850+}

Although not shown here, the bracketed strings are also colored on the display. colordiff can also run diff itself, but I did not find that display as useful as the one built on the word-level diff from wdiff.
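
To list only the lines that changed, the wdiff markup can be filtered with grep. This is a minimal sketch, assuming GNU wdiff is installed and uses its default [-...-] and {+...+} delimiters; changed_lines is a hypothetical helper name.

```shell
#!/bin/sh
# changed_lines OLD NEW: print only the lines where wdiff marked a deletion or insertion.
# Assumes GNU wdiff is installed and its default delimiters are in use.
changed_lines() {
  # -n (--avoid-wraps) stops a marked region from spanning line breaks;
  # grep -F matches the opening delimiters literally.
  wdiff -n "$1" "$2" | grep -F -e '[-' -e '{+'
}
```

For example, changed_lines data1 data2 would drop the unchanged lines from the results above. A data line that itself contains [- or {+ would be listed as well, so treat the output as a hint rather than an exact report.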

colordiff was in the Debian repository I use, but you can also find it at the ColorDiff site for several flavors of *nix.

I think that Jeffrey Friedl wrote some Perl code that highlighted differences by inverting the display scheme for the parts of the strings that differed. However, I could not find that code with a quick Google search. It may be in one of his books on regular expressions, such as Mastering Regular Expressions, Third Edition (O'Reilly Media).

I'm sure that there are other solutions, likely found by searching with keywords such as highlighting, differences, etc.

Best wishes ... cheers, drl