Help comparing 2 files to find deleted records


 
# 1  
Old 04-02-2007

Hi,

I need to compare today's file to yesterday's file to find deleted records.
I cannot use comm -23 file.old file.new, because a record may have a small change in it without really being a delete.

I have two delimited files. The first field in each file is static; all other fields may change. I want to know whether the static data has disappeared from the file. If I use comm -23, I also get the lines where the data in field 2 or 3 etc. has changed but the static data in field 1 is the same.

I did search this site and found a reply to another post that I used to solve my problem. I wrote a small script that does work, but it is very slow.

My old file has 7000 lines and my new file has 4000 lines. I ran my script and 30 minutes later it still was not done. I ran it on a UNIX box that has a lot of memory and processors, so it is not a hardware issue.

Any suggestions? Below is my code:

while read static
do
    found="no"
    data=`echo "$static" | cut -d'|' -f1`

    while read line
    do
        echo "$line" | grep "$data" >/dev/null
        if [ $? -eq 0 ]
        then
            found="yes"
            break
        fi
    done < file.new

    if [ $found = "no" ]
    then
        echo "$static"
    fi
done < file.old

I am trying to keep all the data on the deleted line. I could cut both files down to field 1 in smaller files and then run comm -23 on the smaller files, but then I lose all the data in the other fields.

Last edited by eja; 04-02-2007 at 09:36 PM..
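
The cut-then-comm idea in the post above does not have to lose the other fields: the deleted keys can be joined back against the old file. The following is only a sketch of that idea. It assumes field 1 is unique enough to act as a key, and the sample records, the temporary directory, and the helper file names (old.keys, new.keys, deleted.keys) are hypothetical.

```shell
#!/bin/sh
# Sketch: find deleted keys with comm, then recover the full old records with join.
# Sample data below is made up for illustration.
export LC_ALL=C                 # one collation order for sort, comm and join
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '%s\n' 'A1|red|10' 'B2|blue|20' 'C3|green|30' > file.old
printf '%s\n' 'A1|red|11' 'C3|grey|30' > file.new

# Keep field 1 only, sorted, one key per line.
cut -d'|' -f1 file.old | sort -u > old.keys
cut -d'|' -f1 file.new | sort -u > new.keys

# Keys that are in the old file but missing from the new one.
comm -23 old.keys new.keys > deleted.keys

# Join the deleted keys back to the old file to get the whole records.
deleted=$(sort -t'|' -k1,1 file.old | join -t'|' - deleted.keys)
printf '%s\n' "$deleted"        # prints: B2|blue|20
```

Each file is read a fixed number of times here, whereas the nested loop starts one grep per pair of lines, which is probably where the 30 minutes go.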
# 2  
Old 04-03-2007
Code:
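awk -F"|" '
BEGIN {
	# Load the key (field 1) of every record in file.new.
	# The "> 0" test ends the loop if file.new cannot be read.
	while( (getline < "file.new") > 0 )
	{ arr[$1]=1 }
	close("file.new")
}
# Print the old records whose key is not in file.new.
!($1 in arr) { print }
' file.old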
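
For comparison, the same lookup is often written with both files on the command line instead of getline in BEGIN. This is a sketch with hypothetical sample data, not a replacement for the answer above. One caveat of this form: if file.new is empty, NR == FNR stays true while file.old is read and nothing is printed.

```shell
#!/bin/sh
# Sketch of the two-file awk idiom; sample records are made up for illustration.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '%s\n' 'A1|red|10' 'B2|blue|20' 'C3|green|30' > file.old
printf '%s\n' 'A1|red|11' 'C3|grey|30' > file.new

# NR == FNR is only true while the first file (file.new) is being read.
deleted=$(awk -F'|' '
    NR == FNR { seen[$1]; next }   # file.new: remember each key
    !($1 in seen)                  # file.old: print records whose key is gone
' file.new file.old)
printf '%s\n' "$deleted"           # prints: B2|blue|20
```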

# 3  
Old 04-03-2007
anbu23,

Thank you very much. This works great. I assume that with a fixed-length file I could use awk's substr function?

I also have two fixed-length files where the first 20 characters are static, so I rewrote your helpful code as below. It also works. Do you see any issues with what I did? I am not that good with awk.

awk -F"|" '
BEGIN {
while( getline < "file.new" )
{ arr[substr($1,1,20)]=1 }
}
arr[substr($1,1,20)] != 1 { print }
' file.old
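
One possible issue with the fixed-length version: -F"|" is still set, so $1 is the whole record only when the line contains no | character. If a | can appear within the first 20 characters, substr($1,1,20) yields a shorter key than intended. A sketch that takes the key from $0 instead is shown below; the sample records and the variable names rec and seen are hypothetical.

```shell
#!/bin/sh
# Sketch: fixed-length records keyed on the first 20 characters of the line.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# printf pads each key to exactly 20 characters; the rest of the record follows.
printf '%-20s%s\n' 'ACME WIDGET' 'qty=5' 'BOLT|M8' 'qty=9' 'CABLE 2M' 'qty=1' > file.old
printf '%-20s%s\n' 'ACME WIDGET' 'qty=7' 'CABLE 2M' 'qty=1' > file.new

deleted=$(awk '
    BEGIN {
        # Read file.new into rec so the key does not depend on field splitting.
        while ((getline rec < "file.new") > 0)
            seen[substr(rec, 1, 20)]
        close("file.new")
    }
    # Print old records whose first 20 characters are not in file.new.
    !(substr($0, 1, 20) in seen)
' file.old)
printf '%s\n' "$deleted"
```

With this sample data only the BOLT|M8 record is printed, since its 20-character key no longer appears in file.new.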
 