Generate separate files with similar and dissimilar contents


 
# 8  
Old 07-26-2016
Quote:
Originally Posted by H squared
The output of scenario 1 (dissimilar elements) can be redirected to a file as :

Code:
 { grep -Fvf 2.txt 1.txt && grep -Fvf 1.txt 2.txt; } > 3.txt

For scenario 2, it is relatively easier as the contents output is the same :

Code:
grep -Fxf 1.txt 2.txt > 3.txt

Moderator's Comments:
Please use CODE tags (not ICODE tags) when displaying full-line and multi-line sample input, sample output, and code segments.
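As long as no line in either of your input files is a substring of a line in the other input file AND there is at least one line in 1.txt that is not present in 2.txt, you will get away with using the 3 grep commands above to do what you are trying to do. (Without -x, grep -F looks for each pattern as a substring anywhere in a line, and the && keeps the second grep from running at all when the first grep selects no lines.)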

If either of the conditions listed above is violated, you will not get the correct results with the above code. But the changes I suggested in post #6 for the first two grep commands:
Code:
(grep -Fxvf 2.txt 1.txt; grep -Fxvf 1.txt 2.txt) > 3.txt

would give you correct results.
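
To see why the -x option and the ; separator matter, here is a small sketch using made-up sample data in a scratch directory (the directory, the sample lines, and the loose.txt and strict.txt names are assumptions for illustration only, and a POSIX shell is assumed):

```shell
# Scratch directory with hypothetical sample data.
demo=$(mktemp -d)
printf '%s\n' 'apple pie' cherry > "$demo/1.txt"
printf '%s\n' apple cherry       > "$demo/2.txt"

# Substring matching plus &&: "apple" (from 2.txt) is a substring of
# "apple pie" (from 1.txt), so the first grep selects nothing, exits with
# status 1, and the second grep never runs. loose.txt ends up empty.
# The trailing "|| :" only keeps this demo going after the failure.
{ grep -Fvf "$demo/2.txt" "$demo/1.txt" && grep -Fvf "$demo/1.txt" "$demo/2.txt"; } > "$demo/loose.txt" || :

# Whole-line matching plus ;: both greps always run.
# strict.txt contains "apple pie" followed by "apple".
(grep -Fxvf "$demo/2.txt" "$demo/1.txt"; grep -Fxvf "$demo/1.txt" "$demo/2.txt") > "$demo/strict.txt" || :
```

With this data the first form silently loses both dissimilar lines, while the second form reports each of them.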

But, unless there are duplicated lines in one or both of your input files, the single awk script RavinderSingh13 suggested will be faster (only needing 1 process instead of 3 and only reading the 17,500 lines of input from your two files once instead of three times). If what you're saying is that you want an output file named 3.txt instead of dissimilar_ones.txt, I would assume that you understand that you can change the string "dissimilar_ones.txt" in Ravinder's awk script in two places to change the name of that output file.

If there are duplicated lines in your input files that need to be preserved, Ravinder's suggested awk script could be modified slightly to handle that case (which was not mentioned in your requirements and input samples) and still get the speed improvements afforded by executing fewer processes and only needing to read your input files once.
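
For illustration only, here is one way a single awk process could keep duplicated lines while reading each file once. This is a sketch written from scratch, not Ravinder's script and not a modification of it; the function name split_lines, the awk variable names, and the choice to write similar and dissimilar lines to two separate files are all assumptions:

```shell
# split_lines FILE1 FILE2 DIFF_OUT SAME_OUT   (hypothetical helper)
# DIFF_OUT gets lines found in only one of the two files, duplicates kept.
# SAME_OUT gets the lines of FILE2 that also appear in FILE1.
split_lines() {
    : > "$3"; : > "$4"          # start with empty output files
    awk -v diff_out="$3" -v same_out="$4" '
        # Lines of the first file: remember each one in input order.
        FILENAME == ARGV[1] { seen1[$0] = 1; line1[++n1] = $0; next }
        # Lines of the second file.
        { seen2[$0] = 1; line2[++n2] = $0 }
        END {
            for (i = 1; i <= n1; i++)
                if (!(line1[i] in seen2)) { print line1[i] > diff_out }
            for (i = 1; i <= n2; i++) {
                if (line2[i] in seen1) { print line2[i] > same_out }
                else                   { print line2[i] > diff_out }
            }
        }
    ' "$1" "$2"
}
```

It could be called as `split_lines 1.txt 2.txt 3.txt similar.txt`, where similar.txt is again just a made-up name. Both files are held in memory until the END block, which should be no problem for inputs of a few thousand lines.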

Of course, as always, if you're using a Solaris/SunOS system, you'd need to change awk in Ravinder's script to /usr/xpg4/bin/awk or nawk.
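
If the same script has to run on Solaris and on other systems, one possible approach is to pick the awk to use at the top of the script. This is only a sketch; the variable name AWK is an assumption, and it is up to you to use "$AWK" wherever the script would otherwise call awk:

```shell
# Choose an awk that supports the features used in this thread.
# AWK is a hypothetical variable name used only in this sketch.
case `uname -s` in
SunOS) AWK=/usr/xpg4/bin/awk ;;
*)     AWK=awk ;;
esac
```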