Find distinct values


 
# 1  
Old 08-30-2012

Hi,

I have two files in the following format:

file1

Code:
chr1:345-456
chr2:123-456
chr2:455-678
chr3:456-789
chr3:444-555

file2

Code:
chr1:345-456
chr2:123-456
chr3:456-789


Desired output

Code:
chr2:455-678
chr3:444-555

This is just sample data. My file1 has 97K records and my file2 has 77K records.

I tried

Code:
join -v1 -v2 file1 file2

This one is giving me around 85K records. I think it is printing the common ones too.

Any thoughts on getting the output shown above would be highly appreciated.

Thanks in advance.

Last edited by jacobs.smith; 08-30-2012 at 05:38 PM.. Reason: forgot code tags for join command
# 2  
Old 08-30-2012
join is for joining files. If you want to find unique lines you can use… wait a minute… uniq!

Code:
sort file1 file2 | uniq -u
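
As a quick check, here is a minimal sketch that runs this one-liner on the sample data from post #1. The scratch directory, the `.sorted` file names and the `uniq_result` / `join_result` variables are assumptions made for the illustration. For comparison it also runs the original join on sorted copies: join expects both inputs to be sorted on the join field, and the sample file1 is not, which may (this is a guess) be part of why the join attempt printed unexpected records.

```shell
#!/bin/sh
# Recreate the sample files from post #1 in a scratch directory (assumed location).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' chr1:345-456 chr2:123-456 chr2:455-678 chr3:456-789 chr3:444-555 > file1
printf '%s\n' chr1:345-456 chr2:123-456 chr3:456-789 > file2

# Lines that occur exactly once across both files.
# Only valid while neither file repeats a line internally.
uniq_result=$(sort file1 file2 | uniq -u)
printf 'sort | uniq -u:\n%s\n' "$uniq_result"

# Same idea with join: sort both inputs first, using one locale for sort and join.
LC_ALL=C sort file1 > file1.sorted
LC_ALL=C sort file2 > file2.sorted
join_result=$(LC_ALL=C join -v 1 -v 2 file1.sorted file2.sorted)
printf 'join on sorted input:\n%s\n' "$join_result"
```

On the sample data both commands print chr2:455-678 and chr3:444-555.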

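sort file1 file2 | uniq -u does not work if a line is repeated within one of the files, because the repeated lines are dropped too: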
Code:
file1:
a
a
b
c
file2:
b
c
d
d
e
output:
e

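If you want "a" and "d" in the output as well, de-duplicate each file first (the brace group needs a space after "{" and a ";" before "}"):
Code:
{ sort -u file1; sort -u file2; } | sort | uniq -u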
output:
a
d
e
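
The two behaviours can be compared side by side with a small sketch. The scratch directory and the `naive` / `fixed` variable names are assumptions for the illustration; the data is the a/b/c/d/e example above. Note that in sh and bash a brace group needs whitespace after `{` and a `;` (or newline) before `}`.

```shell
#!/bin/sh
# Recreate the example with lines repeated inside each file (assumed scratch location).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' a a b c > file1
printf '%s\n' b c d d e > file2

# Naive version: "a" and "d" vanish because each repeats within its own file.
naive=$(sort file1 file2 | uniq -u)

# De-duplicate each file first, then keep lines that appear only once overall.
fixed=$( { sort -u file1; sort -u file2; } | sort | uniq -u )

printf 'naive:\n%s\n' "$naive"
printf 'fixed:\n%s\n' "$fixed"
```

The naive pipeline prints only e, while the de-duplicated one prints a, d and e.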


Last edited by delugeag; 08-30-2012 at 06:00 PM..
This User Gave Thanks to delugeag For This Post:
# 3  
Old 08-30-2012
Code:
cat file1 file2 | sort | uniq -u

--EDIT:

Or: "another way to misuse cat".
Sorry, I posted before reading delugeag's answer.
--
Bye
This User Gave Thanks to Lem For This Post: