Comparing lists when "diff" isn't sufficiently stringent


 
# 1  
Old 01-17-2013

Greetings.

I would like to compare two lists of numbers, A.txt and B.txt, to see which numbers are in B.txt but not in A.txt. I only need the "deletions" with reference to A.txt. The diff command doesn't work for me because it reports too many of the deletions as changes, since they differ by only one digit from a number in A.txt.

Input,

A.txt:
Code:
0113429
0113430
0113510
0113590
0113993

B.txt:
Code:
0113429
0113430
0113510
0113589
0113590

Output.txt:

Code:
added
0113993

deleted
0113589

I would like something that only accepts or rejects perfect matches; changes are not of interest.

Thanks!

Last edited by Twinklefingers; 01-17-2013 at 04:13 PM.. Reason: added input and output example
# 2  
Old 01-17-2013
Show the input you have, and the output you want.
# 3  
Old 01-17-2013
Done, thanks!
# 4  
Old 01-17-2013
diff works perfectly for this data, so it cannot be an example of your problem.

What would some data that doesn't work with diff look like? What digits are allowed to differ and still be considered the 'same'?
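
For reference, a small sketch of what diff reports on the sample from post #1. The scratch directory is only there to keep the example self-contained; the grep filter keeps just the lines diff marks with < (only in the first file) or > (only in the second file). Lines inside a "change" hunk are also listed with < and >, so they still show up here.

```shell
# Recreate the sample lists from post #1 in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 0113429 0113430 0113510 0113590 0113993 > A.txt
printf '%s\n' 0113429 0113430 0113510 0113589 0113590 > B.txt

# Keep only the per-line markers: "<" = only in A.txt, ">" = only in B.txt.
# diff exits non-zero when the files differ, hence the "|| true".
marked=$(diff A.txt B.txt | grep '^[<>]' || true)
printf '%s\n' "$marked"
```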
# 5  
Old 01-17-2013
The data there was just an illustration; the actual files are lists of 1501 and 451 numbers respectively. The difference in the number of lines is likely another issue.

A match must be 100% exact for an element to be considered non-unique and discarded. The output should contain only the unique elements; any change would be considered a new, unique element.
# 6  
Old 01-17-2013
The comm utility sounds like what you want, then. It outputs three columns: lines found only in file 1, lines found only in file 2, and lines found in both. It needs both files sorted the same way, but since you were using diff, it sounds like we can depend on that. You can suppress any of the columns with the -1, -2, and -3 options.

Code:
comm -2 -3 file1 file2 > onlyinfile1
comm -1 -3 file1 file2 > onlyinfile2
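
If the lists might not be sorted, something along these lines should work. It is only a sketch: A.txt and B.txt are the names from post #1, while the scratch directory and the *.sorted and only_in_* file names are made up for illustration.

```shell
# Recreate the sample lists from post #1 in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 0113429 0113430 0113510 0113590 0113993 > A.txt
printf '%s\n' 0113429 0113430 0113510 0113589 0113590 > B.txt

# comm expects both inputs in sorted order, so sort copies first.
sort A.txt > A.sorted
sort B.txt > B.sorted

# -2 -3 leaves column 1 (only in A); -1 -3 leaves column 2 (only in B).
comm -2 -3 A.sorted B.sorted > only_in_A.txt
comm -1 -3 A.sorted B.sorted > only_in_B.txt

cat only_in_A.txt   # 0113993
cat only_in_B.txt   # 0113589
```

comm compares whole lines, so a number that differs by a single digit is simply a different line.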


Last edited by Corona688; 01-17-2013 at 04:50 PM..
# 7  
Old 01-17-2013
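Using grep with the -f and -v options, plus -F and -x so that each line of the pattern file is treated as a fixed string that must match a whole line:
Code:
echo -e "deleted\n$( grep -Fxv -f A.txt B.txt )"
echo -e "added\n$( grep -Fxv -f B.txt A.txt )"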
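
A note on exact matching: with -f alone, each line of the pattern file is a regular expression that may match anywhere in a line, so a short entry can hide a longer one. The sketch below uses made-up data (not the lists from post #1) to show the difference that -F and -x make.

```shell
# Made-up data: "0113" is a substring of "0113589".
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 0113 0113590 > A.txt
printf '%s\n' 0113589 0113590 > B.txt

# Substring matching: "0113" matches every line of B.txt, so nothing is left.
# grep exits non-zero when it prints no lines, hence the "|| true".
loose=$(grep -v -f A.txt B.txt || true)

# Fixed-string, whole-line matching: only identical lines are dropped.
strict=$(grep -F -x -v -f A.txt B.txt || true)

printf 'loose:  [%s]\n' "$loose"
printf 'strict: [%s]\n' "$strict"
```

With this data the loose form reports nothing as unique, while the strict form reports 0113589.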
