Find duplicate based on 'n' fields and mark the duplicate as 'D'
Hi,
In a file, I have to mark duplicate records as 'D' and only the latest record as 'C'.
In the file below, I have to identify duplicate records based on Man_ID, Man_DT and Ship_ID. Within each set of duplicates, the record with the latest Ship_DT must be marked "C" and the others "D" (the "C" or "D" goes into a new field appended to the end of each record).
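One possible way to approach this, as a minimal sketch only. The sample input is not shown here, so the "|" delimiter and the field order (Man_ID, Man_DT, Ship_ID, Ship_DT as the first four fields) are assumptions, and mark_duplicates is a made-up name. It also assumes Ship_DT is in YYYY-MM-DD form, that a record with no duplicates gets "C", and that on a tie the later line wins.

```python
def mark_duplicates(lines, sep="|", key_cols=(0, 1, 2), date_col=3):
    """Append 'C' to the latest record of each key group and 'D' to the others.

    sep, key_cols and date_col are assumptions about the file layout.
    """
    rows = [line.rstrip("\n").split(sep) for line in lines if line.strip()]

    # For each key, remember the index of the row with the latest Ship_DT so far.
    latest = {}
    for i, row in enumerate(rows):
        key = tuple(row[c] for c in key_cols)
        # YYYY-MM-DD dates compare correctly as plain strings.
        if key not in latest or row[date_col] >= rows[latest[key]][date_col]:
            latest[key] = i

    # Second pass: keep the input order and append the flag as a new last field.
    marked = []
    for i, row in enumerate(rows):
        key = tuple(row[c] for c in key_cols)
        flag = "C" if latest[key] == i else "D"
        marked.append(sep.join(row + [flag]))
    return marked


if __name__ == "__main__":
    # Made-up records, only to show the call.
    sample = [
        "M1|2011-01-01|S1|2011-12-31",
        "M1|2011-01-01|S1|2010-12-31",
    ]
    for record in mark_duplicates(sample):
        print(record)
```

The whole file is held in memory so that the original record order can be preserved; for a real file, pass the open file object as lines and adjust sep, key_cols and date_col to match the actual layout.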
Last edited by machomaddy; 01-28-2012 at 06:37 AM..
Reason: Edited wrong Input "2010-12-31" to "2011-12-31" in the 4th record
I have an input file with the following formatting:
6000000901 ;36200103 ;h3a01f496 ;
2000123605 ;36218982 ;heefa1328 ;
2000273132 ;36246985 ;h08c5cb71 ;
2000041207 ;36246985 ;heef75497 ;
Each field is separated by a semicolon. Sometimes, the second files is... (6 Replies)
Hi:
I've been searching the net but didn't find a clue. I have a file in which, for some records, some fields coincide. I want to compare one (or more) of the dissimilar fields and retain the one record that fulfills a certain condition. For example, on this file:
99 TR 1991 5 06 ... (1 Reply)
I have a file from which I need to remove the duplicates. The problem is, I need to keep only the records that have a unique 3rd field. Here is a sample file:
xxx.xxx:x:CISCO1.CLEVE61W:ERIE.NET:x:x:x:x:
xxx.xxx:x:CISCO2.CLEVE62W:OHIO.NET:x:x:x:x:
xxx.xxx:x:CISCO2.CLEVE62W:NORTH.NET:x:x:x:x:... (1 Reply)
Hello,
Although I have found similar questions, I could not find advice that
could help with our problem.
The issue:
We have several hundred text files containing repeated blocks of text
(I guess back at the time they were prepared like that to optimize
printing).
The block of texts... (13 Replies)
Hi,
How can I remove duplicates from a file, grouped by another column? For example:
Test1|Test2|Test3|Test4|Test5
Test1|Test6|Test7|Test8|Test5
Test1|Test9|Test10|Test11|Test12
Test1|Test13|Test14|Test15|Test16
Test17|Test18|Test19|Test20|Test21
Test17|Test22|Test23|Test24|Test5
... (2 Replies)
Hi,
Sometimes I get duplicated values in my files:
bundle_identifier= B
Sometext=ABC
bundle_identifier= A
bundle_unit=500
Sometext123=ABCD
bundle_unit=400
I need to check whether there are duplicated values or not; if yes, I need to check if the value is A or B when Bundle_Identified ,... (2 Replies)
Dear community,
I have to remove duplicate lines from a file that contains a very large number of rows (millions?), based on the 1st and 3rd columns.
The data are like this:
Region 23/11/2014 09:11:36 41752
Medio 23/11/2014 03:11:38 4132
Info 23/11/2014 05:11:09 4323... (2 Replies)
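A rough sketch of how a streaming de-duplication on two columns could look. It assumes the columns are whitespace-separated (so the date and the time count as columns 2 and 3) and that the first occurrence of each key is the one to keep; dedupe_by_columns is a hypothetical name. Only the keys are held in memory, not the lines themselves, which matters with millions of rows.

```python
def dedupe_by_columns(lines, key_cols=(0, 2), sep=None):
    """Yield each line whose key columns have not been seen before.

    key_cols are 0-based, so (0, 2) means the 1st and 3rd columns.
    sep=None splits on any run of whitespace (an assumption about the file).
    """
    seen = set()
    for line in lines:
        fields = line.split(sep)
        if len(fields) <= max(key_cols):
            # Too few fields to build a key (e.g. a blank line): pass it through.
            yield line
            continue
        key = tuple(fields[c] for c in key_cols)
        if key not in seen:
            seen.add(key)
            yield line


if __name__ == "__main__":
    import sys

    # Reads from standard input and writes the surviving lines to standard output.
    sys.stdout.writelines(dedupe_by_columns(sys.stdin))
```

If the last occurrence should win instead of the first, the same idea works with a dictionary from key to line, at the cost of keeping one line per key in memory.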
Dear folks
I have a map file of around 54K lines in which some values in the second column occur more than once. I want to find those values and delete every line that contains them. I looked over the usual duplicate commands, but in my case I do not want to keep one of the duplicates. I want to remove all of the same... (4 Replies)
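A small sketch of the "remove every line whose value repeats" idea, using two passes: count the values first, then keep only the lines whose value was seen exactly once. It assumes whitespace-separated columns and that every non-blank line has at least two fields; drop_all_repeated is a made-up name.

```python
from collections import Counter


def drop_all_repeated(lines, col=1, sep=None):
    """Return only the lines whose value in column `col` (0-based) is unique."""
    rows = [line for line in lines if line.strip()]
    # First pass: count how often each value occurs in the chosen column.
    counts = Counter(line.split(sep)[col] for line in rows)
    # Second pass: keep a line only if its value occurred exactly once.
    return [line for line in rows if counts[line.split(sep)[col]] == 1]


if __name__ == "__main__":
    # Made-up rows, only to show the call.
    for line in drop_all_repeated(["a 10 x", "b 20 y", "c 10 z"]):
        print(line)
```

At around 54K lines, holding the file in memory for the second pass should not be a problem.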
Hi,
My input looks like this (tab-delimited):
grp1 name2 firstname M 55 item1 item1.0
grp1 name2 firstname F 55 item1 item1.0
grp2 name1 firstname M 55 item1 item1.0
grp2 name2 firstname M 55 item1 item1.0
Using awk, I am trying to discard the records with common fields 2, 4, 5, 6, 7... (4 Replies)
Discussion started by: beca123456
dupdb-admin
dupdb-admin(1) General Commands Manual dupdb-admin(1)
NAME
dupdb-admin - Manage the duplicate database for apport-retrace.
SYNOPSIS
dupdb-admin -f dbpath status
dupdb-admin -f dbpath dump
dupdb-admin -f dbpath changeid oldid newid
DESCRIPTION
apport-retrace(1) has the capability of checking for duplicate bugs (amongst other things). It uses an SQLite database for keeping track of
master bugs. dupdb-admin is a small tool to manage that database.
The central concept in that database is a "crash signature", a string that uniquely identifies a particular crash. It is built from the
executable path name, the signal number or exception name, and the topmost functions of the stack trace.
The database maps crash signatures to the 'master' crash id and thus can close duplicate crash reports with a reference to that master ID.
It also tracks the status of crashes (open/fixed in a particular version) to be able to identify regressions.
MODES
status Print general status of the duplicate db. For now, it only shows the time when the database was "consolidated" last, i.e. when the
bug states (open/fixed) in the SQLite database were updated to the actual states in the bug tracking system.
dump Print a list of all database entries.
changeid
Change the associated crash ID for a particular crash.
OPTIONS
-f path, --database-file=path
Instead of processing the new crash reports in /var/crash/, report a particular report in an arbitrary file location. This is useful
for copying a crash report to a machine with internet connection and reporting it from there. This defaults to ~/.apport_duplicates.db.
AUTHOR
apport and the accompanying tools are developed by Martin Pitt <martin.pitt@ubuntu.com>.
Martin Pitt August 01, 2007 dupdb-admin(1)