Sponsored Content
Top Forums Shell Programming and Scripting Identify duplicate values at first column in csv file Post 302957938 by deadyetagain on Friday 16th of October 2015 05:19:14 PM
Old 10-16-2015
Quote:
Originally Posted by vgersh99
depending on what your desired output should be:
Code:
awk -F, '!a[$2]++{next} {print "Line " NR " " $2 " is duplicated"}' myFile
OR
awk -F, '!a[$2]++' myFile

Does this remove the line when $2 is found to have been a duplicate?

---------- Post updated at 05:19 PM ---------- Previous update was at 05:17 PM ----------

Quote:
Originally Posted by Don Cragun
Your thread title says you are trying to find duplicates in the 1st field; your code prints lines in which the 2nd field on the line has been seen before. Note that it prints duplicates; it does not remove duplicates.

And, since there are no lines in your sample input where the 1st field is duplicated on any other line, I have no idea what you are trying to do. What additional logic are you talking about? What output are you hoping to produce from this sample input?
Thank you for the suggestion.
I tried to change the title but that doesn't seem to be an option once the submission is made...

---------- Post updated at 05:19 PM ---------- Previous update was at 05:19 PM ----------

Quote:
Originally Posted by Don Cragun
Your thread title says you are trying to find duplicates in the 1st field; your code prints lines in which the 2nd field on the line has been seen before. Note that it prints duplicates; it does not remove duplicates.

And, since there are no lines in your sample input where the 1st field is duplicated on any other line, I have no idea what you are trying to do. What additional logic are you talking about? What output are you hoping to produce from this sample input?
Thank you for the suggestion.
I tried to change the title but that doesn't seem to be an option once the submission is made...
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Check to identify duplicate values at first column in csv file

Hello experts, I have a requirement where I have to implement two checks on a csv file: 1. Check to see if the value in first column is duplicate, if any value is duplicate script should exit. 2. Check to verify if the value at second column is between "yes" or "no", if it is anything else... (4 Replies)
Discussion started by: avikaljain
4 Replies

2. Linux

Filter a .CSV file based on the 5th column values

I have a .CSV file with the below format: "column 1","column 2","column 3","column 4","column 5","column 6","column 7","column 8","column 9","column 10 "12310","42324564756","a simple string with a , comma","string with or, without commas","string 1","USD","12","70%","08/01/2013",""... (2 Replies)
Discussion started by: dhruuv369
2 Replies

3. Shell Programming and Scripting

Fetching values in CSV file based on column name

input.csv: Field1,Field2,Field3,Field4,Field4 abc ,123 ,xyz ,000 ,pqr mno ,123 ,dfr ,111 ,bbb output: Field2,Field4 123 ,000 123 ,111 how to fetch the values of Field4 where Field2='123' I don't want to fetch the values based on column position. Instead want to... (10 Replies)
Discussion started by: bharathbangalor
10 Replies

4. Shell Programming and Scripting

Remove the values from a certain column without deleting the Column name in a .CSV file

(14 Replies)
Discussion started by: dhruuv369
14 Replies

5. Shell Programming and Scripting

Filter file to remove duplicate values in first column

Hello, I have a script that is generating a tab delimited output file. num Name PCA_A1 PCA_A2 PCA_A3 0 compound_00 -3.5054 -1.1207 -2.4372 1 compound_01 -2.2641 0.4287 -1.6120 3 compound_03 -1.3053 1.8495 ... (3 Replies)
Discussion started by: LMHmedchem
3 Replies

6. Shell Programming and Scripting

Remove duplicate values in a column(not in the file)

Hi Gurus, I have a file(weblog) as below abc|xyz|123|agentcode=sample code abcdeeess,agentcode=sample code abcdeeess,agentcode=sample code abcdeeess|agentadd=abcd stereet 23343,agentadd=abcd stereet 23343 sss|wwq|999|agentcode=sample1 code wqwdeeess,gentcode=sample1 code... (4 Replies)
Discussion started by: ratheeshjulk
4 Replies

7. Shell Programming and Scripting

Find duplicate values in specific column and delete all the duplicate values

Dear folks I have a map file of around 54K lines and some of the values in the second column have the same value and I want to find them and delete all of the same values. I looked over duplicate commands but my case is not to keep one of the duplicate values. I want to remove all of the same... (4 Replies)
Discussion started by: sajmar
4 Replies

8. Shell Programming and Scripting

Filter duplicate records from csv file with condition on one column

I have csv file with 30, 40 columns Pasting just three column for problem description I want to filter record if column 1 matches CN or DN then, check for values in column 2 if column contain 1235, 1235 then in column 3 values must be sequence of 2345, 2345 and if column 2 contains 6789, 6789... (5 Replies)
Discussion started by: as7951
5 Replies

9. Shell Programming and Scripting

CSV File:Filter duplicate records from column1 & another column having unique record

Hi Experts, I have csv file with 30, 40 columns Pasting just 2 column for problem description. Need to print error if below combination is not present in file check for column-1 (DocumentNumber) and filter columns where value in DocumentNumber field is same. For all such rows, the field... (7 Replies)
Discussion started by: as7951
7 Replies
MSGUNIQ(1)								GNU								MSGUNIQ(1)

NAME
msguniq - unify duplicate translations in message catalog SYNOPSIS
msguniq [OPTION] [INPUTFILE] DESCRIPTION
Unifies duplicate translations in a translation catalog. Finds duplicate translations of the same message ID. Such duplicates are invalid input for other programs like msgfmt, msgmerge or msgcat. By default, duplicates are merged together. When using the --repeated option, only duplicates are output, and all other messages are discarded. Comments and extracted comments will be cumulated, except that if --use-first is specified, they will be taken from the first translation. File positions will be cumulated. When using the --unique option, duplicates are discarded. Mandatory arguments to long options are mandatory for short options too. Input file location: INPUTFILE input PO file -D, --directory=DIRECTORY add DIRECTORY to list for input files search If no input file is given or if it is -, standard input is read. Output file location: -o, --output-file=FILE write output to specified file The results are written to standard output if no output file is specified or if it is -. Message selection: -d, --repeated print only duplicates -u, --unique print only unique messages, discard duplicates Input file syntax: -P, --properties-input input file is in Java .properties syntax --stringtable-input input file is in NeXTstep/GNUstep .strings syntax Output details: -t, --to-code=NAME encoding for output --use-first use first available translation for each message, don't merge several translations -e, --no-escape do not use C escapes in output (default) -E, --escape use C escapes in output, no extended chars --force-po write PO file even if empty -i, --indent write the .po file using indented style --no-location do not write '#: filename:line' lines -n, --add-location generate '#: filename:line' lines (default) --strict write out strict Uniforum conforming .po file -p, --properties-output write out a Java .properties file --stringtable-output write out a NeXTstep/GNUstep .strings file -w, --width=NUMBER set output page width --no-wrap do not break long message lines, longer than the output page width, into several lines -s, --sort-output generate sorted output -F, --sort-by-file sort output by file location Informative output: -h, --help display this help and exit -V, --version output version information and exit AUTHOR
Written by Bruno Haible. REPORTING BUGS
Report bugs to <bug-gnu-gettext@gnu.org>. COPYRIGHT
Copyright (C) 2001-2007 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. SEE ALSO
The full documentation for msguniq is maintained as a Texinfo manual. If the info and msguniq programs are properly installed at your site, the command info msguniq should give you access to the complete manual. GNU gettext-tools 0.17 November 2007 MSGUNIQ(1)
All times are GMT -4. The time now is 01:50 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy