Removing duplicate records in a file based on single column: explanation (Post: 302608637)

-F,	Set the input field separator to a comma.
NR==FNR	True only while the first file is being read (only then are FNR and NR equal).
C[$1]++	Create an (associative) array element with the first field as the index and increment its value by 1.
next	Skip the remaining rules and start reading the next record.
C[$1]==1	While reading the second file (which in this case is the first file read a second time): if the count is equal to 1, i.e. field 1 appears exactly once in the input file, print the record (line).
infile infile	Read infile twice: the first pass counts, the second pass prints.
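
Assembled, the pieces explained above would presumably form a one-liner along the lines of the sketch below. The exact command from the thread is not reproduced here, so treat this as a reconstruction from the explanation rather than the original poster's code. The temporary file and its sample records are assumptions made only for illustration.

```shell
#!/bin/sh
# Hypothetical sample data: the temporary file and its contents are
# assumptions for this sketch, not taken from the explanation itself.
infile=$(mktemp)
cat > "$infile" <<'EOF'
1,3000,5000
1,4000,6000
2,4000,600
2,5000,700
3,60000,4000
4,7000,7777
5,999,8888
EOF

# Pass 1 (NR==FNR): count how often each value of field 1 occurs, then
#                   skip to the next record.
# Pass 2:           print only records whose field-1 value occurred once.
result=$(awk -F, 'NR==FNR { C[$1]++; next } C[$1]==1' "$infile" "$infile")

printf '%s\n' "$result"
rm -f "$infile"
```

With this input, every record whose first field occurs more than once (keys 1 and 2) is dropped entirely, including its first occurrence, and only the records for keys 3, 4 and 5 are printed. Reading the file twice keeps the original record order and stores only one counter per distinct key instead of the records themselves.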