Two folder comparison


 
# 1  
Old 04-22-2008
Two folder comparison

Hi,

I have a few files in a directory, and the same set of files in another directory. I need to remove the lines starting with the word 'HDR' or 'FTR', if present, from all the files in both directories. Then I need to sort the contents of every file in both directories and compare the two directories. If two files are not the same, I should report that the two files are different. If a file exists in only one of the directories, I should report that the file is present in one directory and not in the other. The columns in the files are either tab separated or | (pipe) separated.

One thing is that the file size may be huge. It might be in the hundreds of MB.

Can anyone help me by giving a script to do this job?
# 2  
Old 04-22-2008
This would be a lot simpler if you could make a temporary copy of both directories. Is this feasible, or are they too large?

Code:
mkdir /tmp/dir1 /tmp/dir2
for file in dir1/* dir2/*; do
  egrep -v '^(HDR|FTR)' "$file" | sort >/tmp/"$file"
done
diff -rad /tmp/dir1 /tmp/dir2 | egrep '^(diff|Only in )'

... assuming dir1 and dir2 are the original directories you want to compare. If they are in different places in the file system, perhaps it would be easiest to create symlinks for the duration of this script. And of course, once you are done, you can remove the copies in /tmp/dir1 and /tmp/dir2.

The output from diff is not particularly intuitive but it contains the information you request; you can post-process it with sed, or simply load it in your editor and massage it into something your manager can understand.
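As a rough illustration of the sed post-processing mentioned above, here is a minimal sketch. The function name explain_diff is made up for this example, and it assumes file names without spaces and the two line formats ("diff ..." and "Only in ...") that the egrep filter lets through.

```shell
#!/bin/sh
# Hypothetical helper: turn the filtered "diff -r" summary lines into
# plain sentences. Assumes no spaces in file names.
explain_diff() {
  sed -e 's/^diff .* \([^ ]*\) \([^ ]*\)$/Files differ: \1 and \2/' \
      -e 's/^Only in \(.*\): \(.*\)$/File \2 is present only in \1/'
}

# Usage (reads the filtered diff output on standard input):
#   diff -r /tmp/dir1 /tmp/dir2 | egrep '^(diff|Only in )' | explain_diff
```

The first expression keeps only the last two words of a "diff" line, which are the two file names; the second swaps the directory and file name of an "Only in" line into a sentence.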

I'm not sure I captured the sorting requirement correctly. This creates sorted copies of each file, after removing the HDR and FTR lines; if you want all the content in a single file, of course, that can be done, too (but running diff on a single huge file is going to be painful).

diff is really overkill for this problem, but it produces exactly the information you wanted. It might be easier on the hardware to run cmp and then separately check for files which exist in one directory but not in the other.
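The cmp approach could look roughly like the sketch below. It assumes the cleaned, sorted copies already exist in the two directories; compare_dirs is a made-up name, and the ${f##*/} expansion needs a POSIX shell such as bash or ksh.

```shell
#!/bin/sh
# Sketch: compare two directories of cleaned, sorted copies with cmp
# instead of diff. compare_dirs is a hypothetical name.
compare_dirs() {
  a=$1 b=$2
  for f in "$a"/*; do
    [ -f "$f" ] || continue            # skip subdirectories and unmatched globs
    name=${f##*/}                      # file name without the directory part
    if [ ! -f "$b/$name" ]; then
      echo "$name is present in $a but not in $b"
    elif ! cmp -s "$f" "$b/$name"; then
      echo "$name differs between $a and $b"
    fi
  done
  # Second pass: files that exist only in the second directory
  for f in "$b"/*; do
    [ -f "$f" ] || continue
    name=${f##*/}
    [ -f "$a/$name" ] || echo "$name is present in $b but not in $a"
  done
}

# Usage: compare_dirs /tmp/dir1 /tmp/dir2
```

cmp -s stops at the first differing byte and prints nothing, so it does less work than diff when all you need is a yes/no answer per file.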

Last edited by era; 04-22-2008 at 01:36 PM.. Reason: Oops, forgot the sort
# 3  
Old 04-23-2008
Two folder comparison

Hi era,

I am getting the message below; the temp directories are not created and the files are not compared.
bash: /usr/local/bin/..: is a directory

The code I used is:

mkdir /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/tmp/dir1 /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/tmp/dir2
for file in /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing/* /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing1/*; do
egrep -v '^(HDR|FTR)' "$file" | sort >/tmp/"$file"
done
diff -rad /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/tmp/dir1 /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/tmp/dir2 | egrep '^(diff|Only in )'


Why is this happening?
# 4  
Old 04-23-2008
The variable "file" contains the whole full path, and if the same path doesn't exist under /tmp it will complain about the redirection. That's what I was alluding to with the suggestion to create symbolic links to the directories.

Code:
ln -s /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing dir1
ln -s /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing1 dir2
mkdir /tmp/tmp1 /tmp/dir2
for file in dir1/* dir2/* ...
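An alternative to the symlinks, shown here only as a sketch, is to build the output name from the last path component so the full source path never has to exist under /tmp. The helper name out_path and the sample path are made up for illustration.

```shell
#!/bin/sh
# Sketch: derive the output file name from the last path component only.
# out_path is a hypothetical helper; the output directory must already exist.
out_path() {
  outdir=$1 file=$2
  echo "$outdir/${file##*/}"           # strip everything up to the last slash
}

# Example with a made-up source path:
#   out_path /tmp/dir1 /some/long/path/testing/file1.txt
# The loop body could then redirect to "$(out_path /tmp/dir1 "$file")"
# instead of /tmp/"$file".
```

With this, each source directory needs its own loop so that the right output directory is passed in.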

# 5  
Old 04-23-2008
Two folder comparison

Hi era,

I am new to Unix. I am getting this error when I run the script.

++ set -x
++ ln -s /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing dir1
++ ln -s /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing1 dir2
++ mkdir /tmp/dir1 /tmp/dir2
mkdir: Failed to make directory "/tmp/dir2"; File exists
++ egrep -v '^(HDR|FTR)' dir1/file1.txt
++ sort
++ egrep -v '^(HDR|FTR)' dir1/file2.txt
++ sort
++ egrep -v '^(HDR|FTR)' dir1/file3.txt
++ sort
++ egrep -v '^(HDR|FTR)' dir1/testing
++ sort
sort: missing NEWLINE added at end of input file STDIN
++ egrep -v '^(HDR|FTR)' dir2/file1.txt
++ sort
++ egrep -v '^(HDR|FTR)' dir2/file2.txt
++ sort
++ egrep -v '^(HDR|FTR)' dir2/file3.txt
++ sort
++ egrep -v '^(HDR|FTR)' dir2/testing1
++ sort
sort: missing NEWLINE added at end of input file STDIN
++ diff -rad /tmp/dir1 /tmp/dir2
++ egrep '^(diff|Only in )'
diff: illegal option -- a
usage: diff [-bitw] [-c | -e | -f | -h | -n] file1 file2
diff [-bitw] [-C number] file1 file2
diff [-bitw] [-D string] file1 file2
diff [-bitw] [-c | -e | -f | -h | -n] [-l] [-r] [-s] [-S name] directory1 directory2

The directories I need to compare are testing and testing1, which are in the path /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/

The code I used is:
ln -s /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing dir1
ln -s /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing1 dir2
mkdir /tmp/dir1 /tmp/dir2
for file in dir1/* dir2/*; do
egrep -v '^(HDR|FTR)' "$file" | sort >/tmp/"$file"
done
diff -rad /tmp/dir1 /tmp/dir2 | egrep '^(diff|Only in )'

But it's still failing. Can you give me the exact code?

Last edited by ragavhere; 04-23-2008 at 04:38 AM.. Reason: Typo
# 6  
Old 04-23-2008
You have a typo in the mkdir, you have "mkdir /tmp/dir11" with one 1 too many.

(Looking back, I had the wrong directory names there too, tmp1 and tmp2 instead of dir1 or dir2.)

Of course, once the symlinks and directories are in place, the code to create them won't need to be run again.

If you have the following:
  • symlink dir1 pointing to ...path to/testing
  • symlink dir2 pointing to ...path to/testing1
  • directory /tmp/dir1 exists
  • directory /tmp/dir2 exists
... then you should be ready to go with the for loop.

Looks like you will also need to drop the -a and -d options to diff, so just diff -r

(For the record, my diff has an -a option, which says to always treat all files as text, and a -d option, which basically means try a little harder to find a smaller set of changes. I guess you can cope without either of those.)
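If the same script has to run on systems with different diff implementations, one possible approach is to probe for the options first. This is only a sketch; pick_diff_opts is a made-up name, and comparing /dev/null with itself is just a harmless way to see whether diff rejects the flags.

```shell
#!/bin/sh
# Sketch: use -a and -d only when the local diff accepts them.
# pick_diff_opts is a hypothetical helper name.
pick_diff_opts() {
  if diff -a -d /dev/null /dev/null >/dev/null 2>&1; then
    echo "-rad"
  else
    echo "-r"                          # fall back to the portable option
  fi
}

# Usage: diff $(pick_diff_opts) /tmp/dir1 /tmp/dir2
```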

If testing and testing1 are subdirectories among the files, perhaps you want to skip those from the loop.

Code:
for file in dir1/* dir2/*; do
  test -d "$file" && continue  # skip if it's a directory
  egrep -v '^(HDR|FTR)' "$file" | sort >/tmp/"$file"
done
diff -r /tmp/dir1 /tmp/dir2 | egrep '^(diff|Only in )'


Last edited by era; 04-23-2008 at 04:48 AM.. Reason: diff options, skip subdirectories
# 7  
Old 04-23-2008
Two folder comparison

testing and testing1 are the directories containing the files which have to be compared.

This script is going to be automated and generic, and will be run from Java code, so I can't create symbolic links each and every time I need to compare two directories. Can the code be modified to suit this purpose?
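One way this could be done without symlinks is sketched below. It is not tested on the poster's system: the function names are made up, it assumes mktemp -d is available for the scratch directory, that grep accepts -E (the POSIX spelling of egrep), and that diff prints a "diff ..." line for each differing pair as in the earlier posts.

```shell
#!/bin/sh
# Sketch of a symlink-free variant; clean_dir and compare_extracts are
# hypothetical names. The two directories are passed as arguments.

# Strip HDR/FTR lines from every regular file in $1 and write a sorted
# copy with the same base name into $2.
clean_dir() {
  for file in "$1"/*; do
    [ -f "$file" ] || continue         # skip subdirectories
    grep -Ev '^(HDR|FTR)' "$file" | sort >"$2/${file##*/}"
  done
}

# Returns 0 if the cleaned directories match, 1 if differences are reported.
compare_extracts() {
  work=$(mktemp -d) || return 2        # scratch directory, removed at the end
  mkdir "$work/dir1" "$work/dir2"
  clean_dir "$1" "$work/dir1"
  clean_dir "$2" "$work/dir2"
  # Run diff inside the scratch directory so the report says dir1/dir2
  report=$(cd "$work" && diff -r dir1 dir2 | grep -E '^(diff|Only in )')
  rm -rf "$work"
  [ -z "$report" ] && return 0
  echo "$report"
  return 1
}

# Usage: compare_extracts /path/to/testing /path/to/testing1
```

In the report, dir1 stands for the first argument and dir2 for the second. The exit status gives the calling Java code a simple way to tell whether anything differed.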