Comparing two files and generating the report


 
# 1  
Old 07-15-2013

Hi All,

What I am trying to do is generate a report by comparing two files.

File A
-----------
111 22222 3333
222 55555 7777

File B
-----------
11A 22222 3333
333 55555 7778

Now the report should be as follows

Added:
333 55555 7778

Removed:
222 55555 7777

Modified
111 22222 3333

I have tried the diff and comm commands. The issue I am facing is getting the modified records: with comm -23 and comm -13 I can get the individual records that were added or removed (grep -Fvf file1 file2 and grep -Fvf file2 file1 give the same result), but not the records that were modified.

So kindly advise how I can get the modified records.
# 2  
Old 07-15-2013
How do you know a record is modified and not simply added?
# 3  
Old 07-15-2013
Sorry, I did not get your question.

For records which are added, I am using
comm -13 file1 file2 ... and this gives me only the lines that are in File2.
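
For reference, a minimal sketch of the comm approach on the sample data from the first post. The scratch directory and the `.sorted` file names are placeholders chosen for this illustration; the one hard requirement is that comm expects both inputs to be sorted. Note that the "modified" record shows up in both lists, which is why comm alone cannot separate a modification from a remove-plus-add pair.

```shell
# Scratch directory so nothing existing is overwritten (placeholder location).
workdir=$(mktemp -d)

# Sample data from the first post.
printf '%s\n' '111 22222 3333' '222 55555 7777' > "$workdir/file1"
printf '%s\n' '11A 22222 3333' '333 55555 7778' > "$workdir/file2"

# comm requires sorted input, so sort both files first.
sort "$workdir/file1" > "$workdir/file1.sorted"
sort "$workdir/file2" > "$workdir/file2.sorted"

# -23 suppresses columns 2 and 3: lines found only in file1.
only_in_file1=$(comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted")
# -13 suppresses columns 1 and 3: lines found only in file2.
only_in_file2=$(comm -13 "$workdir/file1.sorted" "$workdir/file2.sorted")

printf 'Only in file1:\n%s\n\n' "$only_in_file1"
printf 'Only in file2:\n%s\n' "$only_in_file2"
```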
# 4  
Old 07-15-2013
As I look at it,
all items in FileB are added while
all items in FileA are removed.

I do not understand how/why you say a record was changed.
Why is the 111_ modified to be 11A_ but the 222_ is not modified to 333_?

What is the rule to determine a modification?
# 5  
Old 07-15-2013
Ok...
Actually, the files are such that only the contents of one specific column will change (say column 5). This has to be reported as a modification.

If there are any additions or removals, the entire record will be added or removed.

I think the below will explain a bit more
File A
--------------------------------
AAA BBB CCC DDD EEEE FFF
111 222 333 444 555 666
XXX YY CDE GTY YSE TYU
File B
------------------------------
AAA BBB CCC ZZZ EEEE FFF
QQQ ZZZ GHJ SDF JJJJ KLK


As you can see, the first records in File A and File B differ only in column 4. This has to be reported as Modified.

The second row in File A is not present in File B, hence it is to be reported as Removed.

Similarly, the second row in File B is to be reported as Added.
# 6  
Old 07-15-2013
Still do not understand 'change'

In your first example, a single character at position 3 is different; and thus you call this "Modification".
In your 2nd example, three characters at positions 13-14-15 are different; and this is also a "Modification".

What if 20 characters are different? Is that simply another Modification? When is a Modification not a Change, but a delete and add?
# 7  
Old 07-16-2013
Please consider only the second example. Also, it is not about character positions, since a whole word changes. In fact these data come from two .xls files, so you can treat them as TAB-separated or comma-separated (if File A and File B are .csv).

So it is a change in one column that marks a record as a modification.
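
The rule as stated (same number of columns, exactly one column different) can be sketched in awk without relying on any key column. This is only an illustration under assumptions: fields are split on whitespace, the file names and scratch directory are placeholders, and every File B record is compared against every File A record, so it is only practical for small files. It reports modified pairs only; added and removed records would still need a separate pass.

```shell
workdir=$(mktemp -d)

# Second example from the thread.
printf '%s\n' 'AAA BBB CCC DDD EEEE FFF' '111 222 333 444 555 666' \
    'XXX YY CDE GTY YSE TYU' > "$workdir/FileA"
printf '%s\n' 'AAA BBB CCC ZZZ EEEE FFF' 'QQQ ZZZ GHJ SDF JJJJ KLK' > "$workdir/FileB"

pairs=$(awk '
    # First file: keep every record in memory.
    NR == FNR { a[FNR] = $0; na = FNR; next }
    # Second file: compare this record with each stored FileA record.
    {
        nb = split($0, fb, " ")
        for (i = 1; i <= na; i++) {
            # Records with a different column count cannot be a modification.
            if (split(a[i], fa, " ") != nb) continue
            diff = 0
            for (j = 1; j <= nb; j++) if (fa[j] != fb[j]) diff++
            # Exactly one differing column counts as a modification.
            if (diff == 1) print "Modified: " a[i] " -> " $0
        }
    }
' "$workdir/FileA" "$workdir/FileB")

printf '%s\n' "$pairs"
```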

---------- Post updated 07-16-13 at 02:47 AM ---------- Previous update was 07-15-13 at 08:14 AM ----------

All,

As a step forward, what I am trying to do is:
a) Take the output of comm -12 fileA fileB and append the output of comm -23 fileA fileB to it.
b) Use a column (the 3rd in my case) to see if its value is present in FileA and FileB.
c) If it is present in both files, it is a modification.
d) If it is present in only one of the files, then it is either Added or Removed.

I suppose awk with an if/else chain can be used for steps b, c and d. If someone can show how to achieve this, it would be of great help.

---------- Post updated at 03:41 AM ---------- Previous update was at 02:47 AM ----------

Hi All......
I am trying to traverse the difference file and segregate each record as either Added, Modified or Removed, using the code below.

Code:
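awk 'BEGIN { print "Start Generating the report" }
{
    # awk cannot run grep inside an if (); call it through system() instead.
    # system() returns the exit status of grep, and 0 means the key was found.
    inA = (system("grep -qF -- \"" $2 "\" FileA") == 0)
    inB = (system("grep -qF -- \"" $2 "\" FileB") == 0)
    if (inA && inB)  print "modified: " $0
    else if (inA)    print "removed: " $0
    else if (inB)    print "added: " $0
}
END { print "Report generated" }' difference.txt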

However, I am getting a syntax error. Could you please advise?

$2 is the Unique parameter which can be used in files
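
Since $2 is said to be unique, one possible approach is to skip the intermediate difference file and classify the records in a single awk pass over both files. The following is a sketch, not a tested solution for the real data: it assumes whitespace-separated fields, that column 2 really is unique within each file, and that FileA fits in memory. The scratch directory and the `report` variable are placeholders for this illustration.

```shell
workdir=$(mktemp -d)

# Second example from the thread.
printf '%s\n' 'AAA BBB CCC DDD EEEE FFF' '111 222 333 444 555 666' \
    'XXX YY CDE GTY YSE TYU' > "$workdir/FileA"
printf '%s\n' 'AAA BBB CCC ZZZ EEEE FFF' 'QQQ ZZZ GHJ SDF JJJJ KLK' > "$workdir/FileB"

report=$(awk '
    # First file: remember every record under its key (column 2).
    NR == FNR { old[$2] = $0; next }
    # Second file, key already known: modified if the content differs.
    ($2 in old) {
        if (old[$2] != $0) print "Modified: " old[$2]
        delete old[$2]
        next
    }
    # Second file, key never seen in FileA.
    { print "Added: " $0 }
    # Keys that were never matched exist only in FileA.
    END { for (k in old) print "Removed: " old[k] }
' "$workdir/FileA" "$workdir/FileB")

printf '%s\n' "$report"
```

The Modified line prints the FileA version of the record, matching the report layout in the first post. The order of the Removed lines is not guaranteed, because awk does not define the iteration order of `for (k in old)`; pipe the output through sort if a stable order matters.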