Remove all instances of duplicate records from the file
Post 302150689 by radoulov on Wednesday 12th of December 2007 06:50:46 AM
Old 12-12-2007
Another sort/awk solution
(in case your files are not already sorted like the samples you posted):

Code:
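sort -t\| -k1,2 file1 file2|awk '{
	x[$1,$2]++	# count records per key; the index is $1 SUBSEP $2
	y[NR] = $0	# keep every record in sorted order
} END {
	# assumes a fixed-width key: the first 5 characters of a record are "$1|$2"
	for (i = 1; i <= NR; i++)
		print y[i] > ((x[substr(y[i],1,5)] > 1) ? "file3" : "file4")
}' SUBSEP="|" FS="|"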

Use nawk or /usr/xpg4/bin/awk on Solaris.
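
To see what ends up where, here is a small worked run. The sample records below are hypothetical (they are not the ones from the original question) and were chosen so that both key fields are two characters wide, which is what substr(y[i],1,5) relies on. The scratch directory is also just an assumption to keep the example self-contained.

```shell
#!/bin/sh
# Hypothetical sample data: both key fields are two characters wide,
# so "$1|$2" is always exactly five characters long.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '%s\n' 'aa|11|first' 'bb|22|second' > file1
printf '%s\n' 'aa|11|third' 'cc|33|fourth' > file2

sort -t\| -k1,2 file1 file2 | awk '{
    x[$1,$2]++     # count records per key
    y[NR] = $0     # remember every record in sorted order
} END {
    for (i = 1; i <= NR; i++)
        print y[i] > ((x[substr(y[i],1,5)] > 1) ? "file3" : "file4")
}' SUBSEP="|" FS="|"

echo "file3 (key seen more than once):"; cat file3
echo "file4 (key seen exactly once):";   cat file4
```

With this data, both aa|11 records land in file3, and the bb|22 and cc|33 records land in file4.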

P.S. For variable column widths, you should not use substr; use split instead, for example:

Code:
sort -t\| -k1,2 file1 file2|awk '{
	x[$1,$2]++
	y[NR] = $0
} END {
	for (i = 1; i <= NR; i++) {
		tmp = y[i]
		split(tmp,z)	# split on FS to recover the key fields, whatever their width
		print tmp > ((x[z[1],z[2]] > 1) ? "file3" : "file4")
	}
}' SUBSEP="|" FS="|"
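
A quick way to check the split variant is to run it on keys of different widths. Again, the records below are hypothetical sample data and the scratch directory is only there to keep the sketch self-contained.

```shell
#!/bin/sh
# Hypothetical sample data with key fields of different widths.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '%s\n' 'a|1|first' 'long-key|22|second' > file1
printf '%s\n' 'a|1|third' 'b|2|fourth' 'long-key|22|fifth' > file2

sort -t\| -k1,2 file1 file2 | awk '{
    x[$1,$2]++
    y[NR] = $0
} END {
    for (i = 1; i <= NR; i++) {
        tmp = y[i]
        split(tmp, z)    # re-split the saved record on FS
        print tmp > ((x[z[1],z[2]] > 1) ? "file3" : "file4")
    }
}' SUBSEP="|" FS="|"

echo "file3:"; cat file3
echo "file4:"; cat file4
```

Here the a|1 and long-key|22 records go to file3 and the single b|2 record goes to file4. With the substr(y[i],1,5) version, none of these keys is five characters long, so no lookup would match and every record would end up in file4.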


Last edited by radoulov; 12-12-2007 at 08:16 AM..
 
