I have a few files in a directory, and the same set of files in another directory. I need to remove the lines starting with 'HDR' or 'FTR', if present, from all the files in both directories. Then I need to sort the contents of all the files in both directories and compare the two directories. If two files are not the same, I should report that the two files are different. If a file exists in only one of the directories, I should report that it is present in one directory and not in the other. The columns in the files are separated by either tabs or | (pipe).
One thing to note is that the files may be huge, possibly hundreds of MB.
Can anyone help me by giving a script to do this job?
This would be a lot simpler if you could make a temporary copy of both directories. Is this feasible, or are they too large?
... assuming dir1 and dir2 are the original directories you want to compare. If they are in different places in the file system, perhaps it would be easiest to create symlinks for the duration of this script. And of course, once you are done, you can remove the copies in /tmp/dir1 and /tmp/dir2.
The output from diff is not particularly intuitive but it contains the information you request; you can post-process it with sed, or simply load it in your editor and massage it into something your manager can understand.
I'm not sure I captured the sorting requirement correctly. This creates sorted copies of each file, after removing the HDR and FTR lines; if you want all the content in a single file, of course, that can be done, too (but running diff on a single huge file is going to be painful).
diff is really overkill for this problem, but it produces exactly the information you wanted. It might be easier on the hardware to run cmp and then separately check for files which exist in one directory but not in the other.
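A rough sketch of the cmp idea, in case it helps. It assumes both directories already hold the cleaned, sorted copies (for example /tmp/dir1 and /tmp/dir2); the function name compare_cleaned and the wording of the messages are made up for illustration.

```shell
#!/bin/sh
# compare_cleaned DIR_A DIR_B   (hypothetical helper, not from the thread)
# Assumes DIR_A and DIR_B contain the already cleaned and sorted copies.
compare_cleaned() {
    a=$1
    b=$2
    # Pass 1: every file in A is either missing from B, different, or identical.
    for f in "$a"/*; do
        [ -f "$f" ] || continue            # skip subdirectories and empty globs
        name=${f##*/}                      # strip the directory part
        if [ ! -e "$b/$name" ]; then
            echo "Only in $a: $name"
        elif ! cmp -s "$f" "$b/$name"; then
            echo "Files $f and $b/$name differ"
        fi
    done
    # Pass 2: files that exist in B but not in A.
    for f in "$b"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        [ -e "$a/$name" ] || echo "Only in $b: $name"
    done
}

# Example call:
# compare_cleaned /tmp/dir1 /tmp/dir2
```

cmp -s stops at the first differing byte and prints nothing, which should be cheaper than a full diff on files of several hundred MB.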
Last edited by era; 04-22-2008 at 01:36 PM..
Reason: Oops, forgot the sort
The variable "file" contains the full path, and if the same path doesn't exist under /tmp, the shell will complain about the redirection. That's what I was alluding to with the suggestion to create symbolic links to the directories.
The directories I need to compare are testing and testing1, which are in the path /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/
The code I used is
ln -s /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing dir1
ln -s /gdw/shared/qa2/dev/scripts_regr_test/design3/extract_comparison/testing1 dir2
mkdir /tmp/dir1 /tmp/dir2
for file in dir1/* dir2/*; do
    egrep -v '^(HDR|FTR)' "$file" | sort >/tmp/"$file"
done
diff -rad /tmp/dir1 /tmp/dir2 | egrep '^(diff|Only in )'
But it's still failing. Can you give me the exact code?
Last edited by ragavhere; 04-23-2008 at 04:38 AM..
Reason: Typo
You have a typo in the mkdir, you have "mkdir /tmp/dir11" with one 1 too many.
(Looking back, I had the wrong directory names there too, tmp1 and tmp2 instead of dir1 or dir2.)
Of course, once the symlinks and directories are in place, the code to create them won't need to be run again.
If you have the following:
symlink dir1 pointing to ...path to/testing
symlink dir2 pointing to ...path to/testing1
directory /tmp/dir1 exists
directory /tmp/dir2 exists
... then you should be ready to go with the for loop.
Looks like you will also need to drop the -a and -d options to diff, so just use diff -r.
(For the record, my diff has an -a option which says to always treat all files as text, and -d basically means try a little harder to find a smaller set of changes. I guess you can cope without either of those.)
If testing and testing1 are subdirectories among the files, perhaps you want to skip those from the loop.
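One way to skip subdirectories is a [ -f ... ] test at the top of the loop. A minimal sketch follows; the function name clean_dir and its two arguments are made up here, and grep -E stands in for the egrep used in this thread (it should behave the same on a POSIX grep).

```shell
#!/bin/sh
# clean_dir SRC DST   (hypothetical helper, not from the thread)
# Writes a cleaned, sorted copy of each regular file in SRC to DST.
# DST must already exist.
clean_dir() {
    for file in "$1"/*; do
        # Anything that is not a regular file (e.g. a subdirectory) is skipped.
        [ -f "$file" ] || continue
        # Drop HDR/FTR lines, sort the rest, keep only the base file name.
        grep -Ev '^(HDR|FTR)' "$file" | sort >"$2/${file##*/}"
    done
}

# Example call:
# clean_dir dir1 /tmp/dir1
```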
Last edited by era; 04-23-2008 at 04:48 AM..
Reason: diff options, skip subdirectories
Testing and testing1 are the directories containing the files which have to be compared.
This script is going to be automated and generic, and it will be run from Java code, so I can't create symbolic links each and every time I need to compare two directories. Can the code be modified to suit this purpose?
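A possible way to do this without symlinks is to pass the two directories as arguments and write the cleaned copies into a throwaway directory created with mktemp. This is only a sketch under those assumptions: the function name compare_dirs, the temporary subdirectory names a and b, and the return codes are all invented for illustration, mktemp -d must be available on the system, and grep -E is used in place of egrep.

```shell
#!/bin/sh
# compare_dirs DIR_A DIR_B   (hypothetical function, not from the thread)
# Returns 0 if the cleaned, sorted files match, 1 if anything differs.
compare_dirs() {
    work=$(mktemp -d) || return 2
    mkdir "$work/a" "$work/b"

    # Cleaned, sorted copies of the first directory go to $work/a.
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        grep -Ev '^(HDR|FTR)' "$f" | sort >"$work/a/${f##*/}"
    done
    # Same for the second directory, into $work/b.
    for f in "$2"/*; do
        [ -f "$f" ] || continue
        grep -Ev '^(HDR|FTR)' "$f" | sort >"$work/b/${f##*/}"
    done

    # Keep only the "diff ..." and "Only in ..." lines, as earlier in the thread.
    # grep exits non-zero when nothing matches; that is not an error here.
    report=$(cd "$work" && diff -r a b | grep -E '^(diff|Only in )') || :
    rm -rf "$work"

    if [ -n "$report" ]; then
        printf '%s\n' "$report"
        return 1
    fi
    return 0
}

# Example call, with the two directories passed in by the caller:
# compare_dirs "$1" "$2"
```

In the report, a stands for the first directory and b for the second, since diff is run on the temporary copies. The Java side could check the exit status and read the report from standard output.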