CSV file: Find duplicates, save original and duplicate records in a new file
Hi Unix gurus,
Maybe this is too much to ask, but please take a moment to help me out. A very humble request to you gurus. I'm new to Unix and have just started learning it, and I have a project that is way too advanced for me.
File format: CSV file
File has four columns with no header
File Size is 120GB.
Here are a few sample rows:
There are duplicates in columns 1 and 4 (I know this for a fact).
I would like to find all the duplicates in columns 1 and 4. In the example above, I want rows 2 and 3 (since column 1 has duplicates) and also rows 4 and 5 (since column 4 has duplicates).
If this is too complicated, maybe I can look for duplicates in column 1 first and save a new file, and then look for duplicates in column 4. (Since I am new to Unix, maybe that's the way to go.)
I want to save all the duplicates with original records (as in the example above) in a new CSV file.
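One way to approach the request above is a two-pass scan: count the values in the key columns first, then copy every row whose key value occurs more than once. The sketch below is in Python rather than a shell one-liner, and the function name `write_duplicate_rows`, its parameters, and the file names are hypothetical. It assumes the distinct values of columns 1 and 4 fit in memory, which may not hold for a 120GB file; if they do not, a sort-based approach would be needed instead.

```python
import csv
from collections import Counter


def write_duplicate_rows(src_path, dst_path, key_columns=(0, 3)):
    """Copy every row whose value in any key column occurs more than once.

    key_columns are zero-based, so (0, 3) means columns 1 and 4.
    Returns the number of rows written.
    """
    # Pass 1: count how often each value occurs in each key column.
    counts = {col: Counter() for col in key_columns}
    with open(src_path, newline="") as src:
        for row in csv.reader(src):
            for col in key_columns:
                if col < len(row):
                    counts[col][row[col]] += 1

    # Pass 2: keep the original and all of its duplicates, in file order.
    written = 0
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if any(col < len(row) and counts[col][row[col]] > 1
                   for col in key_columns):
                writer.writerow(row)
                written += 1
    return written


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(write_duplicate_rows("input.csv", "duplicates.csv"))
```

Passing `key_columns=(0,)` or `key_columns=(3,)` handles one column at a time, which matches the simpler step-by-step plan of checking column 1 first and column 4 afterwards.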
---------- Post updated at 01:59 PM ---------- Previous update was at 01:56 PM ----------
For more clarity: My results would look like this:
Hi all
Please help me by providing a solution to my problem.
I have a text file that contains duplicate records.
Example:
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 ... (1 Reply)
Dear All,
I have one file which looks like :
account1:passwd1
account2:passwd2
account3:passwd3
account1:passwd4
account5:passwd5
account6:passwd6
You can see there are two records for account1. Is there any shell command which can find out that account1 is the duplicate record in... (3 Replies)
Hi,
I need to find duplicate records in the first column,
ANU4501710430989 0000000W20389390
ANU4501710430989 0000000W67065483
ANU4501130050520 0000000W80838713
ANU4501210170685 0000000W69246611... (3 Replies)
I have 2 files
"File 1" is delimited by ";" and "File 2" is delimited by "|".
File 1 below (3 records shown):
Doc1;03/01/2012;New York;6 Main Street;Mr. Smith 1;Mr. Jones
Doc2;03/01/2012;Syracuse;876 Broadway;John Davis;Barbara Lull
Doc3;03/01/2012;Buffalo;779 Old Windy Road;Charles... (2 Replies)
FILE_ID extraction from file name and save it in CSV file after looping through each folders
My files are located on a UNIX server. I want to extract file_id and file_name from each file and save them in a CSV file. How do I do that?
I have folders in unix environment, directory structure is... (15 Replies)
Hi, all
I want to sort a csv file based on timestamp from oldest to newest and save the output as csv file itself. Here is an example of my csv file.
test.csv
SourceFile,DateTimeOriginal
/home/intannf/foto/IMG_0739.JPG,2015:02:17 11:32:21
/home/intannf/foto/IMG_0749.JPG,2015:02:17 11:37:28... (10 Replies)
Hi,
I have another problem. I want to sort another csv file by the first field.
result.csv
SourceFile,Airspeed,GPSLatitude,GPSLongitude,Temperature,Pressure,Altitude,Roll,Pitch,Yaw
/home/intannf/foto5/2015_0313_090651_219.JPG,0.,-7.77223,110.37310,30.75,996.46,148.75,180.94,182.00,63.92 ... (2 Replies)
I have a CSV file with 30 to 40 columns.
Pasting just three columns to describe the problem.
I want to filter record if column 1 matches CN or DN then,
check for values in column 2 if column contain 1235, 1235 then in column 3 values must be sequence of 2345, 2345
and if column 2 contains 6789, 6789... (5 Replies)
Hi Experts,
I have a CSV file with 30 to 40 columns.
Pasting just 2 columns to describe the problem.
Need to print error if below combination is not present in file
check for column-1 (DocumentNumber) and filter columns where value in DocumentNumber field is same.
For all such rows, the field... (7 Replies)
Discussion started by: as7951