Find and remove duplicate record and print list


 
# 1  
Old 11-06-2012

Gents,

I need to delete duplicate records and keep only unique values, based on columns 2-27.

When duplicates are found, the last record should always be kept.

I need to produce two files: one clean file, and another containing the duplicate records that were removed.


Input:

Code:
S3033.0            7305.01               0     420123.8 2580723.8 151.9311    18
S3035.0            7305.01               0     420123.3 2580773.9 151.6311   130
S3355.0            7275.01               0     418624.2 2588774.2 156.0311   224
S3353.0            7275.01               0     418624.5 2588726.2 156.3311   336
S3033.0            7305.02               0     418623.9 2588674.7 156.8311   430
S3349.0            7275.01               0     418623.1 2588627.5 157.0311   542
S3349.0            7280.01               0     418874.1 2588631.6 156.0311   657
S3355.0            7275.02               0     418376.0 2588775.5 156.6311   733
S3349.0            7280.02               0     418874.4 2588677.4 156.1311   809
S3353.0            7270.01               0     418375.3 2588718.0 156.9311   846
S3353.0            7280.01               0     418874.8 2588727.6 156.3311   922
S3351.0            7270.01               0     418375.6 2588675.5 157.3311   958
S3355.0            7275.03               0     418875.1 2588777.6 156.4311  1034


Desired Output:
cleaned file.txt

Code:
S3035.0            7305.01               0     420123.3 2580773.9 151.6311   130
S3033.0            7305.02               0     418623.9 2588674.7 156.8311   430
S3349.0            7275.01               0     418623.1 2588627.5 157.0311   542
S3349.0            7280.02               0     418874.4 2588677.4 156.1311   809
S3353.0            7270.01               0     418375.3 2588718.0 156.9311   846
S3353.0            7280.01               0     418874.8 2588727.6 156.3311   922
S3351.0            7270.01               0     418375.6 2588675.5 157.3311   958
S3355.0            7275.03               0     418875.1 2588777.6 156.4311  1034


removed.txt

Code:
S3033.0            7305.01               0     420123.8 2580723.8 151.9311    18
S3355.0            7275.01               0     418624.2 2588774.2 156.0311   224
S3353.0            7275.01               0     418624.5 2588726.2 156.3311   336
S3355.0            7275.02               0     418376.0 2588775.5 156.6311   733


Thanks in advance.
# 2  
Old 11-06-2012
Could you please try sort -u or uniq -u?
# 3  
Old 11-06-2012
Dear Anshum,

Please let me know how to sort the input file to get the output file I requested.
# 4  
Old 11-06-2012
Code:
sort -u -t  "    " -k 2.27 filename


Last edited by Franklin52; 11-07-2012 at 06:53 AM.. Reason: Please use code tags for data and code samples
# 5  
Old 11-06-2012
Could you please let me know how I can get the output file removed.txt with the removed points?
Thanks.
# 6  
Old 11-06-2012
Code:
cut -d "    " -f3 filename|sort |uniq -d>removed.txt


Last edited by Franklin52; 11-07-2012 at 08:53 AM.. Reason: Please use code tags for data and code samples
# 7  
Old 11-07-2012
Thanks a lot, Anshuman.

Then I will have both files, the clean file and the removed file, as in my desired output.

---------- Post updated at 04:16 PM ---------- Previous update was at 02:46 PM ----------

Guys,
Is there any option to do this using awk? I don't want the output file sorted; I only want to remove the duplicated values and print both files as I described. Thanks for your help.

---------- Post updated 11-07-12 at 01:24 AM ---------- Previous update was 11-06-12 at 04:16 PM ----------

The sort commands do not work.

Could somebody please help me solve this issue? Thanks a lot.
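
For reference, one way to do this in awk without sorting is to read the same file twice: the first pass records the line number of the last record for each key, and the second pass writes each line to the clean file if it is that last record, or to the removed file otherwise. The sketch below is not from the thread. It assumes the key is character columns 2-27 of each line (substr($0, 2, 26)); the function name dedup_keep_last and the file names are only illustrative. If duplicates are defined on different columns, the key expression needs adjusting.

```shell
#!/bin/sh
# dedup_keep_last INPUT CLEAN REMOVED
# Hypothetical helper (not from the thread): keeps the last record for each
# key and writes earlier duplicates to a separate file, preserving input order.
dedup_keep_last() {
    # Start with empty output files so both exist even if one gets no lines.
    : > "$2"
    : > "$3"
    awk -v clean="$2" -v removed="$3" '
        # Assumed key: character columns 2-27 of the line.
        function key() { return substr($0, 2, 26) }

        # Pass 1 (first read of the file): remember where each key occurs last.
        NR == FNR { last[key()] = FNR; next }

        # Pass 2 (second read): route each line to the clean or removed file.
        {
            if (last[key()] == FNR) { print > clean } else { print > removed }
        }
    ' "$1" "$1"
}

# Example: dedup_keep_last input.txt cleaned_file.txt removed.txt
```

Because the input is read twice, it has to be a regular file rather than a pipe. The original line order is preserved in both output files.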