Removing duplicate records in a file based on single column explanation Post: 302608683

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Filtering records of a file based on a value of a column

Hi all, I would like to extract records of a file based on a condition. The file contains 47 fields, and I would like to extract only those records that match a certain value in one of the columns, e.g. COL1 COL2 COL3 ............... COL47 1 XX 45 ...

2. Linux

Need awk script for removing duplicate records

I have huge txt file having millions of trade data. For e.g Trade.txt (first 8 lines in the file is header info) COB_DATE,TRADE_ID,SOURCE_SYSTEM_TRADE_ID,TRADE_GROUP_ID, TRADE_TYPE,DEALER_NAME,EXTERNAL_COUNTERPARTY_ID, EXTERNAL_COUNTERPARTY_NAME,DB_COUNTERPARTY_ID,...

3. Shell Programming and Scripting

Find Duplicate records in first Column in File

Hi, Need to find a duplicate records on the first column, ANU4501710430989 0000000W20389390 ANU4501710430989 0000000W67065483 ANU4501130050520 0000000W80838713 ANU4501210170685 0000000W69246611...

4. Shell Programming and Scripting

Removing duplicate records from 2 files

Can anyone help me to removing duplicate records from 2 separate files in UNIX? Please find the sample records for both the files cat Monday.dat 3FAHP0JA1AR319226MOHMED ATEK 966504453742 SAU2010DE 3LNHL2GC6AR636361HEA DEUK CHOI 821057314531 KOR2010LE 3MEHM0JG7AR652083MUTLAB NAL-NAFISAH...

5. Shell Programming and Scripting

duplicate row based on single column

I am a newbie to shell scripting .. I have a .csv file. It has 1000 some rows and about 7 columns... but before I insert this data to a table I have to parse it and clean it ..basing on the value of the first column..which a string of phone number type... example below.. column 1 ...

6. Shell Programming and Scripting

Removing duplicate records in a file based on single column

Hi, I want to remove duplicate records including the first line based on column1. For example inputfile(filer.txt): ------------- 1,3000,5000 1,4000,6000 2,4000,600 2,5000,700 3,60000,4000 4,7000,7777 5,999,8888 expected output: ---------------- 3,60000,4000 4,7000,7777...

7. UNIX for Dummies Questions & Answers

Remove duplicate rows when >10 based on single column value

Hello, I'm trying to delete duplicates when there are more than 10 duplicates, based on the value of the first column. e.g. a 1 a 2 a 3 b 1 c 1 gives b 1 c 1 but requires 11 duplicates before it deletes. Thanks for the help Video tutorial on how to use code tags in The UNIX...

8. Shell Programming and Scripting

Removing duplicate lines on first column based with pipe delimiter

Hi, I have tried to remove dublicate lines based on first column with pipe delimiter . but i ma not able to get some uniqu lines Command : sort -t'|' -nuk1 file.txt Input : 38376KZ|09/25/15|1.057 38376KZ|09/25/15|1.057 02006YB|09/25/15|0.859 12593PS|09/25/15|2.803...

9. Shell Programming and Scripting

Filter duplicate records from csv file with condition on one column

I have csv file with 30, 40 columns Pasting just three column for problem description I want to filter record if column 1 matches CN or DN then, check for values in column 2 if column contain 1235, 1235 then in column 3 values must be sequence of 2345, 2345 and if column 2 contains 6789, 6789...

10. Shell Programming and Scripting

CSV File:Filter duplicate records from column1 & another column having unique record

Hi Experts, I have csv file with 30, 40 columns Pasting just 2 column for problem description. Need to print error if below combination is not present in file check for column-1 (DocumentNumber) and filter columns where value in DocumentNumber field is same. For all such rows, the field...

LEARN ABOUT XFREE86

comm

COMM(1) 							   User Commands							   COMM(1)

NAME

       comm - compare two sorted files line by line

SYNOPSIS

       comm [OPTION]... FILE1 FILE2

DESCRIPTION

       Compare sorted files FILE1 and FILE2 line by line.

       When FILE1 or FILE2 (not both) is -, read standard input.

       With  no  options,  produce three-column output.  Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and
       column three contains lines common to both files.

       -1     suppress column 1 (lines unique to FILE1)

       -2     suppress column 2 (lines unique to FILE2)

       -3     suppress column 3 (lines that appear in both files)

       --check-order
	      check that the input is correctly sorted, even if all input lines are pairable

       --nocheck-order
	      do not check that the input is correctly sorted

       --output-delimiter=STR
	      separate columns with STR

       --total
	      output a summary

       -z, --zero-terminated
	      line delimiter is NUL, not newline

       --help display this help and exit

       --version
	      output version information and exit

       Note, comparisons honor the rules specified by 'LC_COLLATE'.

EXAMPLES

       comm -12 file1 file2
	      Print only lines present in both file1 and file2.

       comm -3 file1 file2
	      Print lines in file1 not in file2, and vice versa.

AUTHOR

       Written by Richard M. Stallman and David MacKenzie.

REPORTING BUGS

       GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
       Report comm translation bugs to <http://translationproject.org/team/>

COPYRIGHT

       Copyright (C) 2017 Free Software Foundation, Inc.  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
       This is free software: you are free to change and redistribute it.  There is NO WARRANTY, to the extent permitted by law.

SEE ALSO

       join(1), uniq(1)

       Full documentation at: <http://www.gnu.org/software/coreutils/comm>
       or available locally via: info '(coreutils) comm invocation'

GNU coreutils 8.28						   January 2018 							   COMM(1)

1,3000,5000	C[1]==2	not printed
1,4000,6000	C[1]==2	not printed
2,4000,600	C[2]==2	not printed
2,5000,700	C[2]==2	not printed
3,60000,4000	C[3]==1	printed
4,7000,7777	C[4]==1	printed
5,999,8888	C[5]==1	printed