Need help regarding comparison between two files through UNIX script
Hi All,
I know the individual Unix commands, but I am not comfortable putting them together in a script. I need to compare two .txt files field by field and produce a mismatch report. As I am new to Unix scripting, any help with the scenario below would be really appreciated.
We have one source mainframe .txt file (readable, pipe-delimited format) and one target HDFS .txt file (also pipe-delimited). I need to compare the two files field by field, not as whole lines, like this:
f1.first field with f2.first field
f1.second field with f2.second field
and so on. Please find the sample source and target files below.
f1 :
f2 :
Once the field-wise comparison of the two files is complete, we need a mismatch report containing the source/target count validation, the field-level source and target mismatches, and the corresponding mismatch details.
It would be helpful if the mismatch report looked like the one below:
Source data might contain leading spaces or zeros and extra precision that the target data does not have. We can ignore these cases in the mismatch report.
If anyone can help me with the above scenario, that would be really beneficial. Thanks!
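The rule described above (compare field by field, ignoring leading spaces, leading zeros and trailing precision) can be sketched as follows. This is only an illustration in Python, not the awk or shell code posted in this thread; the names `normalize` and `field_mismatches` are made up for the example. It assumes "precision" means trailing zeros after the decimal point (`42.50` vs `42.5`), not rounding differences.

```python
from decimal import Decimal, InvalidOperation
from itertools import zip_longest


def normalize(field):
    """Return a canonical form so that ' 0042.50' and '42.5' compare equal."""
    text = field.strip()
    try:
        number = Decimal(text)
    except InvalidOperation:
        return text  # not numeric: compare as a trimmed string
    if not number.is_finite():
        return text  # leave values such as 'NaN' untouched
    # normalize() drops trailing zeros; the 'f' format avoids exponent notation
    return format(number.normalize(), "f")


def field_mismatches(src_line, tgt_line, sep="|"):
    """Return (field index, source value, target value) for each differing field."""
    src = src_line.rstrip("\r\n").split(sep)
    tgt = tgt_line.rstrip("\r\n").split(sep)
    result = []
    # zip_longest pads the shorter record, so a missing field shows up as ''
    for index, (s, t) in enumerate(zip_longest(src, tgt, fillvalue=""), start=1):
        if normalize(s) != normalize(t):
            result.append((index, s, t))
    return result
```

For example, `field_mismatches("001| 0042.50|ABC", "1|42.5|ABD")` reports only the third field, because the first two differ only in padding and precision.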
This will find the mismatches:
with some mismatches added, as the two samples you gave obviously don't differ except for the number of leading zeroes in some numbers. The mismatch report that you want needs some further info:
- Is the count lines (records) or fields?
- What is "field lebel Src & Tgt Mismatches"?
- What do you want to show up in the details section?
Hi RudiC/All,
Please find my comments below:
count is lines (records) or fields? - I mean the total record count in each file.
What is "field lebel Src & Tgt Mismatches"? - The total mismatch count between the source and target files.
what do you want to show up in the details section? - In the details section I need the mismatch details: row number / key column value, column name / index, and the corresponding source and target values.
Please find the two sample source and target files below:
Source file:
Target file:
Once the script has compared the two files, I need the analysis report to contain details like those below, based on the two sample source/target files above.
The representation can be different, but we need the above details in the analysis report.
Also, we can run the script with the local directory as parameter 1, the source file as parameter 2, and the target file as parameter 3. We can sort both files on the first column, and the first column's value should be unique.
If anyone can help me with the above scenario, that would be really beneficial. Thanks!
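A minimal sketch of a script matching this description (directory, source file and target file as the three parameters; records matched on a unique first column; a report with record counts, mismatch count and mismatch details). It is written in Python for illustration and is not the script posted by anyone in this thread. The report layout, the function names, and the choice to report a key found in only one file as a single row with column 0 are all assumptions.

```python
import sys
from decimal import Decimal, InvalidOperation
from itertools import zip_longest
from pathlib import Path


def normalize(field):
    """Trim spaces; for numbers also drop leading zeros and trailing precision."""
    text = field.strip()
    try:
        number = Decimal(text)
    except InvalidOperation:
        return text
    return format(number.normalize(), "f") if number.is_finite() else text


def load(lines, sep="|"):
    """Map normalized key (first column) -> list of fields, skipping blank lines."""
    records = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if line.strip():
            fields = line.split(sep)
            records[normalize(fields[0])] = fields
    return records


def compare(source, target):
    """Return mismatch rows as (key, column index, source value, target value)."""
    rows = []
    for key in sorted(source.keys() | target.keys()):
        src, tgt = source.get(key), target.get(key)
        if src is None or tgt is None:
            # Key present in only one file: reported once, as column 0 (assumption).
            rows.append((key, 0,
                         "<absent>" if src is None else "<present>",
                         "<absent>" if tgt is None else "<present>"))
            continue
        for col, (s, t) in enumerate(zip_longest(src, tgt, fillvalue=""), start=1):
            if normalize(s) != normalize(t):
                rows.append((key, col, s, t))
    return rows


def report(source, target, rows):
    """Build the report text: counts first, then one line per mismatch."""
    lines = [
        f"Source record count : {len(source)}",
        f"Target record count : {len(target)}",
        f"Mismatch count      : {len(rows)}",
        "",
        f"{'Key':<12}{'Column':>8}  {'Source':<20}{'Target':<20}",
    ]
    for key, col, s, t in rows:
        lines.append(f"{key:<12}{col:>8}  {s:<20}{t:<20}")
    return "\n".join(lines)


def main(argv):
    base = Path(argv[1])
    with open(base / argv[2]) as src_file, open(base / argv[3]) as tgt_file:
        source, target = load(src_file), load(tgt_file)
    print(report(source, target, compare(source, target)))


# Runs only when called as: script.py <directory> <source file> <target file>
if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv)
```

Because records are matched by key, the files do not need to be sorted beforehand. The record counts are counts of unique keys, so they equal the line counts only if the first column really is unique and the file has no header line.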
---------- Post updated 06-30-15 at 01:06 AM ---------- Previous update was 06-29-15 at 07:24 AM ----------
Hi All,
I am looking forward to a solution to the above scenario. If anyone can help me in this regard, that would be really helpful. Thanks!
Last edited by STCET22; 06-30-2015 at 03:08 AM..
Reason: Add missing CODE tags.
Hi Mannu2525/RudiC,
Thanks a ton for your reply.
@Mannu2525,
When I run your script as shown below on the sample source and target files mentioned below, we get a few issues in the analysis report.
1. The number of mismatches should be 4 instead of 8.
2. In the mismatch details, the 2nd, 4th, 6th and 8th rows should not appear in the analysis report. Please find the analysis report attached (Report.jpg) and look at the rows highlighted in red in the mismatch details section.
3. In the mismatch details section of the analysis report, the format is not coming out properly: the values of column name, source data and target data are all left-aligned.
If you could kindly help me with the three issues above, it would be really helpful.
@RudiC,
When I run your script mentioned below, we get a few issues.
1. The source count comes out as 5; it should be 4.
2. The number of mismatches should be 4 instead of 8.
3. In the mismatch details, the 2nd, 4th, 6th and 8th rows should not appear in the analysis report.
4. In the mismatch details section of the analysis report, the format is not coming out properly: the values of column name, source data and target data are all left-aligned.
It would be great if you could kindly look into the issues above. Thanks!
The script was tested with the data you provided in post#3 and worked correctly. So, please check your input data for line count and field count.
If you're not happy with the output format, wouldn't it be a great learning opportunity trying to adapt it yourself?
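One way to check an input file for line count and field count, as suggested above, is sketched below in Python. The function name `profile` is made up for the example, and the pipe delimiter is taken from the original question. A trailing blank line or a record with a different number of fields is only one possible explanation for an unexpected count, not a confirmed diagnosis of the files in this thread.

```python
from collections import Counter


def profile(lines, sep="|"):
    """Return (total lines, blank lines, Counter of fields per non-blank line)."""
    total = blank = 0
    widths = Counter()
    for line in lines:
        total += 1
        content = line.rstrip("\r\n")
        if not content.strip():
            blank += 1  # empty or whitespace-only line
        else:
            widths[content.count(sep) + 1] += 1
    return total, blank, widths
```

Calling it as `with open("source.txt") as handle: print(profile(handle))` (the file name is a placeholder) shows at a glance whether the file has more lines than records and whether every record has the same number of fields.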