Sponsored Content
Top Forums Shell Programming and Scripting removing duplicate records comparing 2 csv files Post 302599797 by rajak.net on Saturday 18th of February 2012 01:09:10 PM
Old 02-18-2012
removing duplicate records comparing 2 csv files

Hi All,

I want to remove the rows from File1.csv by comparing a column/field in the File2.csv. If both columns matches then I want that row to be deleted from File1 using shell script(awk). Here is an example on what I need.

File1.csv:

RAJAK,ACTIVE,1
VIJAY,ACTIVE,2
TAHA,ACTIVE,3

File2.csv:

VIJAY
TAHA

Output:

RAJAK,ACTIVE,1

Above scenario I need to delete the records if col1 of File1=col2 of File2 and the output should be File1 after removing the duplicate records.

Can you please help me out in preparing a shell script for above.

Thanks in Advance.
 

10 More Discussions You Might Find Interesting

1. Linux

Need awk script for removing duplicate records

I have huge txt file having millions of trade data. For e.g Trade.txt (first 8 lines in the file is header info) COB_DATE,TRADE_ID,SOURCE_SYSTEM_TRADE_ID,TRADE_GROUP_ID, TRADE_TYPE,DEALER_NAME,EXTERNAL_COUNTERPARTY_ID, EXTERNAL_COUNTERPARTY_NAME,DB_COUNTERPARTY_ID,... (6 Replies)
Discussion started by: nmumbarkar
6 Replies

2. Shell Programming and Scripting

Removing duplicate records from 2 files

Can anyone help me to removing duplicate records from 2 separate files in UNIX? Please find the sample records for both the files cat Monday.dat 3FAHP0JA1AR319226MOHMED ATEK 966504453742 SAU2010DE 3LNHL2GC6AR636361HEA DEUK CHOI 821057314531 KOR2010LE 3MEHM0JG7AR652083MUTLAB NAL-NAFISAH... (4 Replies)
Discussion started by: zooby
4 Replies

3. Linux

Need awk script for removing duplicate records

I have log file having Traffic line 2011-05-21 15:11:50.356599 TCP (6), length: 52) 10.10.10.1.3020 > 10.10.10.254.50404: 2011-05-21 15:11:50.652739 TCP (6), length: 52) 10.10.10.254.50404 > 10.10.10.1.3020: 2011-05-21 15:11:50.652558 TCP (6), length: 89) 10.10.10.1.3020 >... (1 Reply)
Discussion started by: Rastamed
1 Replies

4. UNIX for Dummies Questions & Answers

CSV file:Find duplicates, save original and duplicate records in a new file

Hi Unix gurus, Maybe it is too much to ask for but please take a moment and help me out. A very humble request to you gurus. I'm new to Unix and I have started learning Unix. I have this project which is way to advanced for me. File format: CSV file File has four columns with no header... (8 Replies)
Discussion started by: arvindosu
8 Replies

5. Shell Programming and Scripting

Removing duplicate records in a file based on single column

Hi, I want to remove duplicate records including the first line based on column1. For example inputfile(filer.txt): ------------- 1,3000,5000 1,4000,6000 2,4000,600 2,5000,700 3,60000,4000 4,7000,7777 5,999,8888 expected output: ---------------- 3,60000,4000 4,7000,7777... (5 Replies)
Discussion started by: G.K.K
5 Replies

6. Shell Programming and Scripting

Removing duplicate records in a file based on single column explanation

I was reading this thread. It looks like a simpler way to say this is to only keep uniq lines based on field or column 1. https://www.unix.com/shell-programming-scripting/165717-removing-duplicate-records-file-based-single-column.html Can someone explain this command please? How are there no... (5 Replies)
Discussion started by: cokedude
5 Replies

7. Shell Programming and Scripting

Comparing 2 CSV files and sending the difference to a new csv file

(say) I have 2 csv files - file1.csv & file2.csv as mentioned below: file1.csv ID,version,cost 1000,1,30 2000,2,40 3000,3,50 4000,4,60 file2.csv ID,version,cost 1000,1,30 2000,2,45 3000,4,55 6000,5,70 ... (1 Reply)
Discussion started by: Naresh101
1 Replies

8. Shell Programming and Scripting

Removing specific records from files when duplicate key

Hello I have been trying to remove a row from a file which has the same first three columns as another row - I have tried lots of different combinations of suggestion on this forum but can't get it exactly right. what I have is 900 - 1000 = 0 900 - 1000 = 2562 1000 - 1100 = 0 1000 - 1100... (7 Replies)
Discussion started by: tinytimmay
7 Replies

9. Shell Programming and Scripting

Filter duplicate records from csv file with condition on one column

I have csv file with 30, 40 columns Pasting just three column for problem description I want to filter record if column 1 matches CN or DN then, check for values in column 2 if column contain 1235, 1235 then in column 3 values must be sequence of 2345, 2345 and if column 2 contains 6789, 6789... (5 Replies)
Discussion started by: as7951
5 Replies

10. Shell Programming and Scripting

CSV File:Filter duplicate records from column1 & another column having unique record

Hi Experts, I have csv file with 30, 40 columns Pasting just 2 column for problem description. Need to print error if below combination is not present in file check for column-1 (DocumentNumber) and filter columns where value in DocumentNumber field is same. For all such rows, the field... (7 Replies)
Discussion started by: as7951
7 Replies
OCR4GAMERA(1)															     OCR4GAMERA(1)

NAME
ocr4gamera - OCR system using the Gamera framework USAGE
ocr4gamera -x <traindata> [options] <imagefile> OPTIONS
-v <int>, --verbosity=<int> Set verbosity level to <int>. Possible values are 0 (default): silent operation; 1: information on progress; >2: segmentation info is written to PNG files with prefix debug_. -h, --help Display help and exit. -d, --deskew Do a skew correction (recommended). -f, --filter Filter out very large (images) and very small components (noise). -a, --automatic-group Autogroup glyphs with classifier. -x <file>, --xmlfile=<file> Read training data from <file>. -o <xml>, --output=<xml> Write recognized text to file <xml> (otherwise it is written to stdout). -c <csv>, --extra_chars_csvfile=<csv> Read additional class name conversions from file <csv>. <csv> must contain one conversion per line. -R <rules>, --heuristic_rules=<rules> Apply heuristic rules <rules> for disambiguation of some chars. <rules> can be roman (default) or none (for no rules). -D, --dictionary-correction Correct words using a dictionary (requires aspell or ispell). -L <lang>, --dictionary-language=<lang> Use <lang> as language for aspell (when option -D is set). -e <int>, --edit-distance=<int> Correct words only when edit distance not more than <int>. OCR4GAMERA(1)
All times are GMT -4. The time now is 08:18 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy