Identifying dupes within a database and creating unique sub-sets
Post 302879975 by Chubler_XL on Monday 16th of December 2013 10:01:56 PM
You could try this, but I'm not sure how quick it will be:

Code:
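awk '
# Return the "="-separated list with empty and repeated keys removed,
# keeping the first occurrence of each key.
# have, num, keys, i and new are local variables.
function remove_dups(list,    have, num, keys, i, new) {
    have[""]                          # pre-mark the empty key so it is skipped
    num = split(list, keys, "=")
    for (i = 1; i <= num; i++) {
        if (!(keys[i] in have)) new = new "=" keys[i]
        have[keys[i]]
    }
    return substr(new, 2)             # drop the leading "="
}
# Merge one input line into the global tables:
#   List[master] = merged key list
#   Found[key]   = master of the list that currently holds key
function merge(list,    num, keys, i, new, master) {
    new = remove_dups(list)
    if (new == "") return             # blank line or only "=" signs
    num = split(new, keys, "=")
    master = keys[1]
    for (i = 1; i <= num; i++)
        if (keys[i] in Found) {
            # Key seen before: absorb its existing list, then retire that list.
            new = remove_dups(List[Found[keys[i]]] "=" new)
            delete List[Found[keys[i]]]
        }
    num = split(new, keys, "=")
    List[master] = new
    for (i = 1; i <= num; i++) Found[keys[i]] = master
}
{ merge($0) }
# Output order is whatever order awk walks the List array in.
END { for (l in List) print List[l] }' infile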
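
To see what the merge does, here is a condensed sketch of the same List/Found idea, wrapped in a shell function so it can be fed sample lines on standard input. The function name merge_groups and the sample data are made up for illustration, and the order of keys inside a merged line can differ from what the script above prints. The output is piped through sort only because awk does not promise any order for "for (m in List)".

```shell
#!/bin/sh
# merge_groups (hypothetical name): read "="-separated key groups on stdin
# and print one line per set of keys that are linked, directly or indirectly.
merge_groups() {
    awk -F= '
    # Append key to the global string "group" unless it is empty or already there.
    function add(key) {
        if (key != "" && !(key in seen)) { seen[key]; group = group "=" key }
    }
    NF {
        split("", seen); group = ""          # reset per-line state
        for (i = 1; i <= NF; i++) {
            add($i)
            if ($i in Found) {               # key already sits in an earlier group
                old = Found[$i]
                n = split(List[old], keys, "=")
                for (j = 1; j <= n; j++) add(keys[j])
                delete List[old]             # the earlier group is absorbed
            }
        }
        group = substr(group, 2)             # drop the leading "="
        n = split(group, keys, "=")
        if (n == 0) next                     # line held only "=" signs
        List[keys[1]] = group
        for (j = 1; j <= n; j++) Found[keys[j]] = keys[1]
    }
    END { for (m in List) print List[m] }'
}

# Sample input: a=b and c=d start as separate groups, b=c links them, e=f stays alone.
printf 'a=b\nc=d\nb=c\ne=f\n' | merge_groups | sort
# prints:
# b=a=c=d
# e=f
```

The third line is the interesting one: b was last seen with a and c with d, so both earlier groups are pulled into a single line and their old entries are deleted.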


Last edited by Chubler_XL; 12-16-2013 at 11:22 PM.. Reason: Standardise variable names
 
