Compare two files word by word
Post 302676183 by drl on Tuesday 24th of July 2012 07:43:51 AM
Hi.

A stylized display of word-by-word differences can be seen with wdiff:
Code:
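#!/usr/bin/env bash

# @(#) s1	Demonstrate word-by-word differences.

# Helpers: pe prints its arguments on one line, pl prints a "-----" separator
# followed by a title, db prints a debug message on stderr.
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
# Redefine db as a no-op to silence debugging; delete this line to enable it.
db() { : ; }
# Report the environment and tool versions if the local helper "context" exists.
C=$HOME/bin/context && [ -f "$C" ] && "$C" wdiff

# $FILES is left unquoted below on purpose so that the shell expands the glob.
FILES="data*"
pl " Input files data*:"
head $FILES

pl " Word-by-word differences:"
wdiff $FILES

pl " Word-by-word differences, avoid spanning newlines:"
wdiff -n $FILES

exit 0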

producing
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
wdiff GNU wdiff 0.5

-----
 Input files data*:
==> data1 <==
aaa bbb ccc
dddd efg hij

==> data2 <==
aaa bcd edgf
xxx yyy kkh

-----
 Word-by-word differences:
aaa [-bbb ccc
dddd efg hij-] {+bcd edgf
xxx yyy kkh+}

-----
 Word-by-word differences, avoid spanning newlines:
aaa [-bbb ccc-]
[-dddd efg hij-] {+bcd edgf+}
{+xxx yyy kkh+}
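
In the output above, [-...-] encloses words found only in the first file and {+...+} encloses words found only in the second. As a rough sketch (not part of the original script), the two kinds of change can be pulled apart with grep -o. The variable names sample, deleted and inserted are made up for illustration, and the patterns assume the -n form of the output, where no marked region spans a newline, and that the text itself contains no ] or } characters.

```shell
#!/usr/bin/env bash
# Sketch: separate deletions and insertions in "wdiff -n" output.
# The sample is the "avoid spanning newlines" output shown above.
sample='aaa [-bbb ccc-]
[-dddd efg hij-] {+bcd edgf+}
{+xxx yyy kkh+}'

# [-...-] marks text present only in the first file (deleted).
deleted=$(printf '%s\n' "$sample" | grep -o '\[-[^]]*-]')

# {+...+} marks text present only in the second file (inserted).
inserted=$(printf '%s\n' "$sample" | grep -o '{+[^}]*+}')

printf 'Deleted:\n%s\n' "$deleted"
printf 'Inserted:\n%s\n' "$inserted"
```

With real files, the here-text could be replaced by the output of wdiff -n data1 data2.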

See man wdiff for details. The utility colordiff might be of use if displaying on a terminal.
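
A minimal sketch of the colordiff suggestion, assuming both wdiff and colordiff are installed and that your colordiff version recognizes wdiff-style markers (check man colordiff). The scratch directory and the workdir variable are illustrative only; the two data files are recreated from the listing above.

```shell
#!/usr/bin/env bash
# Recreate the two sample inputs from the post in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'aaa bbb ccc' 'dddd efg hij' > "$workdir/data1"
printf '%s\n' 'aaa bcd edgf' 'xxx yyy kkh' > "$workdir/data2"

# Colorize only when both tools are available; otherwise say so and move on.
if command -v wdiff >/dev/null 2>&1 && command -v colordiff >/dev/null 2>&1; then
    # wdiff, like diff, returns a non-zero status when the files differ,
    # which is the expected case here.
    wdiff -n "$workdir/data1" "$workdir/data2" | colordiff
else
    echo "wdiff and/or colordiff not installed; skipping" >&2
fi
```

The -n option keeps each marked region on one line, which tends to be easier for a line-oriented colorizer to handle.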

Best wishes ... cheers, drl

DOCDIFF(1)                General Commands Manual                DOCDIFF(1)

NAME
       docdiff -- character/word-oriented diff

SYNOPSIS
       docdiff [options] file1 file2

DESCRIPTION
       This manual page documents briefly the docdiff commands.

       This manual page was written for the Debian distribution because the
       original program does not have a manual page. Instead, it has
       documentation in the HTML format; see below.

       docdiff is a program that compares two files and shows the difference.
       It can compare files word by word, char by char, or line by line. It
       has several output formats such as HTML/XHTML, tty, Manued, or
       user-defined markup. It supports several encodings and end-of-line
       characters, including ASCII, UTF-8, EUC-JP, Shift_JIS, CR, LF, and CRLF.

OPTIONS
       --resolution=RESOLUTION
              specify resolution (granularity) line|word|char (default is word)
       --line set resolution to line
       --word set resolution to word
       --char set resolution to char

       --encoding=ENCODING
              specify character encoding ASCII|EUC-JP|Shift_JIS|UTF-8|auto
              (default is auto)
       --ascii
              same as --encoding=ASCII
       --eucjp
              same as --encoding=EUC-JP
       --sjis same as --encoding=Shift_JIS
       --utf8 same as --encoding=UTF-8

       --eol=EOL
              specify end-of-line character CR|LF|CRLF|auto (default is auto)
       --cr   same as --eol=CR
       --lf   same as --eol=LF
       --crlf same as --eol=CRLF

       --format=FORMAT
              specify output format tty|manued|html|wdiff|user (default is
              html; user tags have to be described in config file)
       --tty  same as --format=tty
       --manued
              same as --format=manued
       --html same as --format=html
       --wdiff
              same as --format=wdiff

       --digest
              digest output, do not show all
       --cache
              use file cache
       --no-config-file
              do not read config files
       --verbose
              run verbosely
       --help show usage
       --version
              show version
       --license
              show license
       --author
              show author(s)

SEE ALSO
       /usr/share/doc/docdiff/readme.html.

AUTHOR
       This manual page was written by akira yamada akira@debian.org for the
       Debian system (but may be used by others).