Filter duplicate records from csv file with condition on one column Post: 303010199

Post 303010199 by as7951 on Thursday 28th of December 2017 11:15:46 AM

I have a CSV file with 30-40 columns.
I am pasting just three columns to describe the problem.
I want to filter records where column 1 matches CN or DN, and then check the values in column 2:
if column 2 contains 1235, 1235 in a row, then the column 3 values must be the sequence 2345, 2345
and if column 2 contains 6789, 6789 in a row, then the column 3 values must be the sequence 7890, 7890
or if column 2 contains a duplicate value (1234, 1234) in rows 1-4 of a bundle, then column 3 must also contain a duplicate value (4567, 4567) in rows 1-4
or if column 2 contains a duplicate value (5678, 5678) in rows 5-8 of a bundle, then column 3 must also contain a duplicate value (4321, 4321) in rows 5-8
If none of the combinations above is present, a log entry with an error code and the line number must be written to another file.

Sample file.

Code:

CN	1234	4567
CN	1234	4567
CN	1234	4567
CN	1234	4567
CN	5678	4321
CN	5678	4321
CN	5678	4321
CN	5678	4321

Last edited by jim mcnamara; 12-28-2017 at 12:25 PM..
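
One possible reading of the rules above is a fixed lookup from column 2 to column 3 for CN and DN rows. The sketch below assumes that reading and a tab-separated file like the sample. The function name check_pairs, the log file argument, and the error codes E01 and E02 are made up for illustration, and the four-row bundle boundaries are not enforced.

```shell
#!/bin/sh
# Sketch only. check_pairs, the log file name, and the codes E01/E02 are
# hypothetical; the post does not define them.
check_pairs() {
    # $1 = input file (assumed tab-separated, as in the sample), $2 = error log
    : > "$2"    # start with an empty log
    awk -F '\t' -v log_file="$2" '
    BEGIN {
        # Expected column 2 -> column 3 pairs, as listed in the post.
        want["1234"] = "4567"; want["5678"] = "4321"
        want["1235"] = "2345"; want["6789"] = "7890"
    }
    $1 != "CN" && $1 != "DN" { print; next }              # not checked, passed through
    !($2 in want)  { print "E01", NR > log_file; next }   # column 2 value not in the table
    $3 != want[$2] { print "E02", NR > log_file; next }   # column 3 does not match column 2
    { print }                                             # pair is valid, keep the record
    ' "$1"
}

# Usage (file names are placeholders):
#   check_pairs input.tsv errors.log > filtered.tsv
```

Records that pass go to standard output; each failing record adds one line to the log, holding the error code and the input line number. For a comma-separated file, change -F '\t' to -F ','.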


10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Find Duplicate records in first Column in File

Hi, Need to find a duplicate records on the first column, ANU4501710430989 0000000W20389390 ANU4501710430989 0000000W67065483 ANU4501130050520 0000000W80838713 ANU4501210170685 0000000W69246611...

2. Shell Programming and Scripting

Apply condition on fixed width file and filter records

Dear members.. I have a fixed width file. Requirement is as below:- 1. Scan each record from this fixed width file 2. Check for value under field no "6" equals to "ABC". If yes, then filter this record into the output file Please suggest a unix command to achieve this, my guess awk might...

3. UNIX for Dummies Questions & Answers

CSV file:Find duplicates, save original and duplicate records in a new file

Hi Unix gurus, Maybe it is too much to ask for but please take a moment and help me out. A very humble request to you gurus. I'm new to Unix and I have started learning Unix. I have this project which is way too advanced for me. File format: CSV file File has four columns with no header...

4. Shell Programming and Scripting

Removing duplicate records in a file based on single column

Hi, I want to remove duplicate records including the first line based on column1. For example inputfile(filer.txt): ------------- 1,3000,5000 1,4000,6000 2,4000,600 2,5000,700 3,60000,4000 4,7000,7777 5,999,8888 expected output: ---------------- 3,60000,4000 4,7000,7777...

5. Shell Programming and Scripting

Removing duplicate records in a file based on single column explanation

I was reading this thread. It looks like a simpler way to say this is to only keep uniq lines based on field or column 1. https://www.unix.com/shell-programming-scripting/165717-removing-duplicate-records-file-based-single-column.html Can someone explain this command please? How are there no...

6. Linux

Filter a .CSV file based on the 5th column values

I have a .CSV file with the below format: "column 1","column 2","column 3","column 4","column 5","column 6","column 7","column 8","column 9","column 10 "12310","42324564756","a simple string with a , comma","string with or, without commas","string 1","USD","12","70%","08/01/2013",""...

7. Shell Programming and Scripting

Identify duplicate values at first column in csv file

Input 1,ABCD,no 2,system,yes 3,ABCD,yes 4,XYZ,no 5,XYZ,yes 6,pc,no Code used to find duplicate with regard to 2nd column awk 'NR == 1 {p=$2; next} p == $2 { print "Line" NR "$2 is duplicated"} {p=$2}' FS="," ./input.csv Now is there a wise way to de-duplicate the entire line (remove...

8. Shell Programming and Scripting

Filter file to remove duplicate values in first column

Hello, I have a script that is generating a tab delimited output file. num Name PCA_A1 PCA_A2 PCA_A3 0 compound_00 -3.5054 -1.1207 -2.4372 1 compound_01 -2.2641 0.4287 -1.6120 3 compound_03 -1.3053 1.8495 ...

9. Shell Programming and Scripting

CSV File:Filter duplicate records from column1 & another column having unique record

Hi Experts, I have csv file with 30, 40 columns Pasting just 2 column for problem description. Need to print error if below combination is not present in file check for column-1 (DocumentNumber) and filter columns where value in DocumentNumber field is same. For all such rows, the field...

10. UNIX for Beginners Questions & Answers

Filtering records of a csv file based on a value of a column

Hi, I tried filtering the records in a csv file using "awk" command listed below. awk -F"~" '$4 ~ /Active/{print }' inputfile > outputfile The output always has all the entries. The same command worked for different users from one of the forum links. content of file I was...

LEARN ABOUT LINUX

colrm

COLRM(1)						    BSD General Commands Manual 						  COLRM(1)

NAME

     colrm -- remove columns from a file

SYNOPSIS

     colrm [start [stop]]

DESCRIPTION

     The colrm utility removes selected columns from the lines of a file.  A column is defined as a single character in a line.  Input is read
     from the standard input.  Output is written to the standard output.

     If only the start column is specified, columns numbered less than the start column will be written.  If both start and stop columns are
     specified, columns numbered less than the start column or greater than the stop column will be written.  Column numbering starts with one,
     not zero.

     Tab characters increment the column count to the next multiple of eight.  Backspace characters decrement the column count by one.
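
The start/stop rule can be illustrated with cut(1), which is listed under SEE ALSO. This is only a sketch, not colrm itself: the helper name keep_outside is made up, and it assumes a start column of at least 2 and plain single-byte text without tabs or backspaces, since cut does not apply the tab and backspace counting described above.

```shell
#!/bin/sh
# Hypothetical helper: keep_outside START [STOP] applies the same column rule
# with cut(1). Assumes START >= 2 and input without tabs or backspaces.
keep_outside() {
    if [ $# -eq 1 ]; then
        # Start only: keep columns numbered less than START.
        cut -c "1-$(( $1 - 1 ))"
    else
        # Start and stop: keep columns below START or above STOP.
        cut -c "1-$(( $1 - 1 )),$(( $2 + 1 ))-"
    fi
}

printf 'abcdefgh\n' | keep_outside 3      # prints: ab
printf 'abcdefgh\n' | keep_outside 3 5    # prints: abfgh
```

Columns are counted from one, so a start of 3 keeps columns 1 and 2.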

ENVIRONMENT

     The LANG, LC_ALL and LC_CTYPE environment variables affect the execution of colrm as described in environ(7).

EXIT STATUS

     The colrm utility exits 0 on success, and >0 if an error occurs.

SEE ALSO

     awk(1), column(1), cut(1), paste(1)

HISTORY

     The colrm command appeared in 3.0BSD.

BSD
								  August 4, 2004							       BSD
