Sponsored Content
Top Forums UNIX for Dummies Questions & Answers getting unique lines from 2 files Post 302678939 by anurupa777 on Monday 30th of July 2012 05:14:50 AM
Old 07-30-2012
getting unique lines from 2 files

hi
i have used
Code:
comm -13 <(sort 1.txt) <(sort 2.txt)

option to get the unique lines that are present in file 2 but not in file 1. but some how i am getting the entire file 2. i would expect few but not all uncommon lines fro my dat. is there anything wrong with the way i used the command?

my each line in both files appears like this

Code:
HWI-1KL120:99:C0C9MACXX:6:2308:11994:161057     0       chr1    1098159 42      101M    *       0       0       GTTAGTTATCTTAAAAAGAAGTAGCCAAATAAGATACTATTTGAT
TCTAAACAAAAAATTTGGGGTAACTTACAAGCATAAAGAAATCTGTCATCCAGGAA        CCCFFFFFHHHHHIJIJIJJIHIIIIJIJJGIIJJJJI@HIGGGHIJJIIGIJJBFGIHG>BEBG;AGEHFCAE@DBDCCA@EECCC@C>CCC
DDC@A<AD        AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:101        YT:Z:UU HWI-1KL120:99:C0C9MACXX:6:2308:11994:161057     16      chrY    10985
21      23      88M3I10M        *       0       0       AACGCCTCAGTATTTCTGGAACAATATTAAGATGCAGAAGATAACGAAGGAGCCTACAAATAAAAAAAAATATTGTTTTGTTACATAAGCTGCTTTAAAAT
        B9><<::@DDDDDDDCDCCCDDCADDDDCECAAEEEFFC@EEHHHHHHD==GHDDC;JJHDIJIJJJJJJIHGIJJJIIIIIHJIIJIHHHGFDFFFFCB@   AS:i:-29        XN:i:0  XM:i:3  XO:i:1  XG:i:
3       NM:i:6  MD:Z:92G0T2G1   YT:Z:UU

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Comparing 2 files and return the unique lines in first file

Hi, I have 2 files file1 ******** 01-05-09|java.xls| 02-05-08|c.txt| 08-01-09|perl.txt| 01-01-09|oracle.txt| ******** file2 ******** 01-02-09|windows.xls| 02-05-08|c.txt| 01-05-09|java.xls| 08-02-09|perl.txt| 01-01-09|oracle.txt| ******** (8 Replies)
Discussion started by: shekhar_v4
8 Replies

2. Shell Programming and Scripting

how many unique lines in a file

I have a file, test.txt with approx 12,000 lines. Each line is a single word that looks like a hex address. There are many repeats. Over half of the lines are the same. I want to count how many UNIQUE lines there are. #>more test.txt 0x123456 0x56AF23 0x99ABC1 0x123456 0x123456 0x99ABC1... (3 Replies)
Discussion started by: ajp7701
3 Replies

3. Shell Programming and Scripting

Remove All Lines Between Two Unique Lines

Hi all! Im wondering if its possible to remove all lines between two lines. Im working with a document like this: data1 data2 <Remove> data3 data4 </Remove> data5 data6 I need it to end up like this if that possible: data1 data2 data5 data6 There are multiple instances of... (2 Replies)
Discussion started by: Grizzly
2 Replies

4. UNIX for Dummies Questions & Answers

Getting non unique lines from concatenated files

Hi All, Is there a way to get NON unique lines from 2 or more concatenated files? Basically I have several files which are very similar with the exception of few lines and I want to find out which lines are different in each file. Very simple example is file1 contains: 1 2 3 4 5file2... (122 Replies)
Discussion started by: pawannoel
122 Replies

5. UNIX for Advanced & Expert Users

In a huge file, Delete duplicate lines leaving unique lines

Hi All, I have a very huge file (4GB) which has duplicate lines. I want to delete duplicate lines leaving unique lines. Sort, uniq, awk '!x++' are not working as its running out of buffer space. I dont know if this works : I want to read each line of the File in a For Loop, and want to... (16 Replies)
Discussion started by: krishnix
16 Replies

6. Shell Programming and Scripting

Compare multiple files and print unique lines

Hi friends, I have multiple files. For now, let's say I have two of the following style cat 1.txt cat 2.txt output.txt Please note that my files are not sorted and in the output file I need another extra column that says the file from which it is coming. I have more than 100... (19 Replies)
Discussion started by: jacobs.smith
19 Replies

7. Shell Programming and Scripting

compare 2 files and return unique lines in each file (based on condition)

hi my problem is little complicated one. i have 2 files which appear like this file 1 abbsss:aa:22:34:as akl abc 1234 mkilll:as:ss:23:qs asc abc 0987 mlopii:cd:wq:24:as asd abc 7866 file2 lkoaa:as:24:32:sa alk abc 3245 lkmo:as:34:43:qs qsa abc 0987 kloia:ds:45:56:sa acq abc 7805 i... (5 Replies)
Discussion started by: anurupa777
5 Replies

8. Shell Programming and Scripting

Transpose lines from individual blocks to unique lines

Hello to all, happy new year 2013! May somebody could help me, is about a very similar problem to the problem I've posted here where the member rdrtx1 and bipinajith helped me a lot. https://www.unix.com/shell-programming-scripting/211147-map-values-blocks-single-line-2.html It is very... (3 Replies)
Discussion started by: Ophiuchus
3 Replies

9. UNIX for Dummies Questions & Answers

Print unique lines without sort or unique

I would like to print unique lines without sort or unique. Unfortunately the server I am working on does not have sort or unique. I have not been able to contact the administrator of the server to ask him to add it for several weeks. (7 Replies)
Discussion started by: cokedude
7 Replies

10. UNIX for Beginners Questions & Answers

Print number of lines for files in directory, also print number of unique lines

I have a directory of files, I can show the number of lines in each file and order them from lowest to highest with: wc -l *|sort 15263 Image.txt 16401 reference.txt 40459 richtexteditor.txt How can I also print the number of unique lines in each file? 15263 1401 Image.txt 16401... (15 Replies)
Discussion started by: spacegoose
15 Replies
comm(1) 							   User Commands							   comm(1)

NAME
comm - select or reject lines common to two files SYNOPSIS
comm [-123] file1 file2 DESCRIPTION
The comm utility reads file1 and file2, which must be ordered in the current collating sequence, and produces three text columns as output: lines only in file1; lines only in file2; and lines in both files. If the input files were ordered according to the collating sequence of the current locale, the lines written will be in the collating sequence of the original lines. If not, the results are unspecified. OPTIONS
The following options are supported: -1 Suppresses the output column of lines unique to file1. -2 Suppresses the output column of lines unique to file2. -3 Suppresses the output column of lines duplicated in file1 and file2. OPERANDS
The following operands are supported: file1 A path name of the first file to be compared. If file1 is -, the standard input is used. file2 A path name of the second file to be compared. If file2 is -, the standard input is used. USAGE
See largefile(5) for the description of the behavior of comm when encountering files greater than or equal to 2 Gbyte ( 2^31 bytes). EXAMPLES
Example 1 Printing a list of utilities specified by files If file1, file2, and file3 each contain a sorted list of utilities, the command example% comm -23 file1 file2 | comm -23 - file3 prints a list of utilities in file1 not specified by either of the other files. The entry: example% comm -12 file1 file2 | comm -12 - file3 prints a list of utilities specified by all three files. And the entry: example% comm -12 file2 file3 | comm -23 -file1 prints a list of utilities specified by both file2 and file3, but not specified in file1. ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of comm: LANG, LC_ALL, LC_COLLATE, LC_CTYPE, LC_MESSAGES, and NLSPATH. EXIT STATUS
The following exit values are returned: 0 All input files were successfully output as specified. >0 An error occurred. ATTRIBUTES
See attributes(5) for descriptions of the following attributes: +-----------------------------+-----------------------------+ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | +-----------------------------+-----------------------------+ |Availability |SUNWesu | +-----------------------------+-----------------------------+ |CSI |enabled | +-----------------------------+-----------------------------+ |Interface Stability |Standard | +-----------------------------+-----------------------------+ SEE ALSO
cmp(1), diff(1), sort(1), uniq(1), attributes(5), environ(5), largefile(5), standards(5) SunOS 5.11 3 Mar 2004 comm(1)
All times are GMT -4. The time now is 02:06 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy