Extract values of duplicate keys


 
# 1  
Old 01-15-2013

I have two related questions, so it would be great if you could help me with both!

Question 1:
I have a file A that looks like this:
Code:
a x
b y
b z
c w

I want to get something like:
Code:
a x
b y; z
c w

The keys (a, b, c) contain no spaces, but the values (the other letters) may contain spaces.

Question 2:
Next, I have a file B that contains:
Code:
x
y
q

I want to compare it with the values of the merged file A (the key column removed):
Code:
x
y; z
w

The goal is to count how many lines of B occur among those values. In this case the count is 2 (x and y).
# 2  
Old 01-15-2013
Answer 1:
Code:
awk '{x=$0;sub("^[^ ]+ ","",x);a[$1]=(a[$1])?a[$1]"; "x:x}END{for (i in a) print i,a[i]}' fileA
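One thing to keep in mind: `for (i in a)` walks the array in an unspecified order, so the merged lines may not come out in input order. Below is a minimal sketch of a variant that keeps first-seen order. It assumes the sample data from the question, written to a temporary file; the names `order`, `n`, `key` and `val` are my own and not part of the answer above.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'a x' 'b y' 'b z' 'c w' > "$tmp/fileA"

merged=$(awk '
{
    key = $1
    val = $0
    sub(/^[^ ]+ +/, "", val)        # drop the key and the spaces after it
    if (key in a) {
        a[key] = a[key] "; " val    # repeated key: append the value
    } else {
        order[++n] = key            # new key: remember its position
        a[key] = val
    }
}
END { for (i = 1; i <= n; i++) print order[i], a[order[i]] }
' "$tmp/fileA")

printf '%s\n' "$merged"
```

Testing `key in a` instead of the truth value of `a[$1]` also avoids treating a stored value such as `0` as "not seen yet".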

Answer 2:
Code:
awk '{x=$0;sub("^[^ ]+ ","",x);a[$1]=(a[$1])?a[$1]"; "x:x}END{for (i in a) print i,a[i]}' fileA | cut -d" " -f2- | grep -cf fileB -
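Note that `grep -f` treats every line of fileB as a regular expression that may match anywhere in a line, and `-c` counts the matching lines of the merged output rather than the lines of fileB. If "subset" is meant as an exact match against one of the values, something like the following sketch may be closer to the intent. It is only an illustration under that assumption: it reads the values straight from the original fileA (one value per line, so no merging is needed first), and `seen` and `hits` are names I made up.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'a x' 'b y' 'b z' 'c w' > "$tmp/fileA"
printf '%s\n' 'x' 'y' 'q' > "$tmp/fileB"

count=$(awk '
    NR == FNR {                     # first file: fileA
        val = $0
        sub(/^[^ ]+ +/, "", val)    # strip the key, keep the whole value
        seen[val] = 1
        next
    }
    ($0 in seen) { hits++ }         # second file: fileB, exact match only
    END { print hits + 0 }
' "$tmp/fileA" "$tmp/fileB")

echo "$count"
```

With the sample data this counts x and y but not q.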

# 3  
Old 01-15-2013
I ran this on the files foo and foo2:
Code:
cat > foo
x
y; z
w
cat > foo2 
x
y
q

Here's what I got:
Code:
awk '{x=$0;sub("^[^ ]+ ","",x);a[$1]=(a[$1])?a[$1]"; "x:x}END{for (i in a) print i,a[i]}' foo | cut -d" " -f2- | grep -cf foo2 -
1


Is there a way to get 2, since both x and y match?
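A likely explanation for the 1: the pipeline from post #2 expects the original key/value file. foo already holds only the values, so awk takes the first word of each line as a key (`y;` for the line `y; z`) and `cut` then throws it away, leaving `z` where `y; z` used to be. The sketch below reproduces this with the same foo and foo2 in a temporary directory and compares it with running grep on foo directly; the variable names `piped` and `direct` are mine.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'x' 'y; z' 'w' > "$tmp/foo"
printf '%s\n' 'x' 'y' 'q' > "$tmp/foo2"

# Pipeline from post #2 applied to foo: the "y" is cut away as part of a key
piped=$(awk '{x=$0;sub("^[^ ]+ ","",x);a[$1]=(a[$1])?a[$1]"; "x:x}END{for (i in a) print i,a[i]}' "$tmp/foo" |
    cut -d" " -f2- | grep -cf "$tmp/foo2" -)

# foo is already the values-only file, so grep can read it as it is
direct=$(grep -cf "$tmp/foo2" "$tmp/foo")

echo "piped=$piped direct=$direct"
```

So either run the full pipeline on the original file A, or run only the `grep -cf foo2 foo` step on the values-only file.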