Find duplicate based on 'n' fields and mark the duplicate as 'D' Post: 302593698

Post 302593698 by machomaddy on Saturday 28th of January 2012 02:07:24 AM

Hi,

In a file, I have to mark duplicate records as 'D' and only the latest record as 'C'.

In the file below, records count as duplicates when they have the same Man_ID, Man_Dt and Ship_Id. Within each group of duplicates, I have to mark the record with the latest Ship_Dt as "C" and the others as "D" by adding a new field at the end of the record. Records that have no duplicate stay unchanged, as in the expected output.

Code:

File 1
====
Man_ID|Man_Dt|Ship_Id|Ship_Dt|ItemID|Noof ITEMS|ItemNam
001|2010-12-31|11|2010-12-31|111|2|Jackets
002|2010-12-31|12|2010-12-31|111|1|Caps
001|2010-12-31|11|2009-11-31|111|2|Jackets
001|2010-12-31|11|2011-12-31|111|2|Jackets
003|2010-11-01|13|2011-12-31|111|1|Shoes

Expected Output

File 1
=====
Man_ID|Man_Dt|Ship_Id|Ship_Dt|ItemID|Noof ITEMS|ItemNam
001|2010-12-31|11|2010-12-31|111|2|Jackets|D
002|2010-12-31|12|2010-12-31|111|1|Caps
001|2010-12-31|11|2009-11-31|111|2|Jackets|D
001|2010-12-31|11|2011-12-31|111|2|Jackets|C
003|2010-11-01|13|2011-12-31|111|1|Shoes

Last edited by machomaddy; 01-28-2012 at 06:37 AM. Reason: Edited wrong Input "2010-12-31" to "2011-12-31" in the 4th record
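
One possible approach, offered as a minimal sketch rather than a definitive answer, is to read the file twice with awk: the first pass counts how often each Man_ID|Man_Dt|Ship_Id key occurs and remembers the latest Ship_Dt for that key, and the second pass appends the flag. The function name mark_dups is hypothetical and not from the post. The sketch assumes the first line is a header, that Ship_Dt is always in YYYY-MM-DD form so plain string comparison orders the dates (which also means a value like 2009-11-31 is never validated), and that records tied on the latest Ship_Dt are all marked "C".

```shell
#!/bin/sh
# mark_dups FILE  (hypothetical helper name, not from the original post)
# Reads FILE twice: pass 1 collects per-key statistics, pass 2 prints the records.
mark_dups() {
    awk -F'|' '
        # Key on Man_ID, Man_Dt, Ship_Id (fields 1-3).
        { key = $1 FS $2 FS $3 }

        # Pass 1: count each key and track its latest Ship_Dt (field 4).
        NR == FNR {
            if (FNR > 1) {                      # skip the header line
                count[key]++
                # Appending "" forces a string comparison of the ISO dates.
                if (($4 "") > (latest[key] "")) latest[key] = $4
            }
            next
        }

        # Pass 2: header and records without duplicates are printed unchanged.
        FNR == 1 || count[key] < 2 { print; next }

        # Duplicated keys: latest Ship_Dt gets C, every other record gets D.
        { print $0 FS ($4 == latest[key] ? "C" : "D") }
    ' "$1" "$1"
}
```

With the sample saved as file1.txt, calling mark_dups file1.txt should print the records in their original order with |C or |D appended only to the three 001 records. Reading the file twice keeps the input order without sorting, at the cost of needing a regular file rather than a pipe.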
