Shell Programming and Scripting: Post 302188995 by totus on Thursday 24th of April 2008 04:04:05 PM
Finding duplicates in columns and removing lines

I am trying to figure out how to scan a file like so:

1 ralphs office","555-555-5555","ralph@mail.com","www.ralph.com
2 margies office","555-555-5555","ralph@mail.com","www.ralph.com
3 kims office","555-555-5555","kims@mail.com","www.ralph.com
4 tims office","555-555-5555","tims@mail.com","www.ralph.com

and end up with this:

1 ralphs office","555-555-5555","ralph@mail.com","www.ralph.com
3 kims office","555-555-5555","kims@mail.com","www.ralph.com
4 tims office","555-555-5555","tims@mail.com","www.ralph.com

Specifically, I need to look for duplicates in column 3 of a CSV file. If a value in column 3 has already appeared on an earlier line, the whole line should be removed. In the example above, line two is removed (filtered out).

Does anyone know whether the Unix uniq command can be used for this, or Perl? uniq doesn't seem to have a delimiter flag; as far as I can tell it can only work by character count.

Thanks!
Totus

Last edited by totus; 04-24-2008 at 05:31 PM..
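
One possible approach to the question above, offered as a minimal sketch rather than a definitive answer: uniq only compares adjacent lines and has no option for a custom delimiter, so awk is often a better fit here. The sketch assumes that fields are separated by the literal three-character sequence "," (quote, comma, quote) exactly as in the sample, and that no field contains that sequence. The function name dedupe_by_col3 is made up for illustration.

```shell
#!/bin/sh
# dedupe_by_col3 (hypothetical name): print only the first line seen for
# each distinct value of field 3, preserving the original line order.
# Assumption: the field separator is the literal sequence "," as in the sample.
dedupe_by_col3() {
    # seen[$3]++ evaluates to 0 the first time a value appears, so
    # !seen[$3]++ is true only for the first occurrence; awk's default
    # action then prints that line. Reads the named files, or stdin if none.
    awk -F'","' '!seen[$3]++' "$@"
}

# Demo using the sample records from the post.
dedupe_by_col3 <<'EOF'
1 ralphs office","555-555-5555","ralph@mail.com","www.ralph.com
2 margies office","555-555-5555","ralph@mail.com","www.ralph.com
3 kims office","555-555-5555","kims@mail.com","www.ralph.com
4 tims office","555-555-5555","tims@mail.com","www.ralph.com
EOF
```

On the sample this keeps lines 1, 3 and 4. The array holds one entry per distinct column 3 value, so memory grows with the number of unique values. Real CSV with embedded delimiters or escaped quotes would need a proper CSV parser instead of a fixed separator.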
 
