Simple directory tree diff script (Shell Programming and Scripting), Post 302781729 by LMHmedchem on Sunday 17th of March 2013 06:51:31 PM
Well, the sorted find files differ by ~3000 lines. I take this to mean that ~3000 files are missing from one of the directories. The output of comm is 3091728 lines, which is the same number of lines as in the find output for the original directory. I presume this is because column 3 of the comm output lists the files that are in both trees, which is output I don't need to see. So I presume comm -3 gives the output I want, meaning files that are in one directory tree and not in the other?
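To make the column behaviour concrete, here is a minimal sketch on two tiny, made-up file lists (the names ./a to ./d are placeholders, not paths from the real trees). With no options comm prints all three columns, which is why the full output is about as long as the find list; -3 drops the column of lines common to both inputs.

```shell
# Sketch with made-up file names; not taken from the real directory trees.
work=$(mktemp -d)
printf '%s\n' ./a ./b ./c > "$work/list_1"
printf '%s\n' ./b ./c ./d > "$work/list_2"

# comm expects both inputs to be sorted the same way.
# Column 1: lines only in list_1
# Column 2 (indented by one tab): lines only in list_2
# Column 3: lines in both; -3 suppresses this column
only=$(comm -3 "$work/list_1" "$work/list_2")
printf '%s\n' "$only"

rm -r "$work"
```

Here ./a appears in column 1 and ./d, indented by a tab, in column 2; ./b and ./c are in both lists and are not printed.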

LMHmedchem

---------- Post updated at 05:28 PM ---------- Previous update was at 02:43 PM ----------

This is the final script that I used. I have brushed it up a bit so that it is more generalized and checks a few things.

Code:
#!/usr/bin/bash

# accepts path to two directories and compares the file lists in each
# path should begin with /cygdrive/driveletter/

# assign location for output
TMPDIR='/cygdrive/c/cygwin/tmp/dir_compare'

# assign directory trees to compare
TREE1=$1
TREE2=$2

# check arguments, print help unless exactly two arguments are passed
if [ $# -ne 2 ]
then
   echo "this script expects two arguments"
   echo "each argument should be the path to a directory"
   echo "each path should start with /cygdrive/, not a relative path"
   echo "the script will compare the list of files in the directory and subdirectories"
   echo "and will report any instance where a file exists in one directory but not the other"
   echo "output will be printed to $TMPDIR/comm.txt"
   exit 1
fi

# check if TREE1 exists
if [ ! -d "$TREE1" ]
then
   echo " "
   echo "directory $TREE1 not found"
   echo "exiting"
   exit 1
fi
# check if TREE2 exists
if [ ! -d "$TREE2" ]
then
   echo " "
   echo "directory $TREE2 not found"
   echo "exiting"
   exit 1
fi

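# clean tmp dir if it contains files
# stop if cd fails, otherwise rm would run in whatever directory we are in
cd "$TMPDIR" || { echo "cannot cd to $TMPDIR"; exit 1; }
shopt -s nullglob   # make * expand to nothing in an empty directory
FILES=(*)
if (( ${#FILES[@]} > 0 )) ; then
   rm -- "${FILES[@]}"
fi
shopt -u nullglob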

# echo some information
echo " "
echo "comparing file list of $TREE1"
echo "with file list of $TREE2"
echo " "

# cd to TREE1 and create file list for tree
cd "$TREE1" || exit 1
find . > "$TMPDIR/check_1" &

# cd to TREE2 and create file list for tree
cd "$TREE2" || exit 1
find . > "$TMPDIR/check_2" &

# wait for find to finish
wait

# sort output of find so that the file lists from both dir trees are in registration
sort "$TMPDIR/check_1" > "$TMPDIR/check_1_sorted"
sort "$TMPDIR/check_2" > "$TMPDIR/check_2_sorted"

# print the number of lines (files) in each directory tree
wc -l $TMPDIR'/check_1_sorted'
wc -l $TMPDIR'/check_2_sorted'

# compare the two files, only print instances where a file exists in one tree but not the other
comm -3  $TMPDIR'/check_1_sorted'  $TMPDIR'/check_2_sorted' > $TMPDIR'/comm.txt'

Running this script indicates that I have 6,189,828 files in each tree and the script does not find any difference in file names. I found that I had one extra directory in one of the trees. This came from some testing I was doing to see if a copy of my files had the same issue with the time stamps as the original. When I deleted this copy directory, the comm file was empty.

The only problem is that I still have a 3GB size discrepancy between the two partitions.

$ df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
E:          879G  502G  378G   58%   /cygdrive/e
I:          831G  499G  332G   61%   /cygdrive/i

The size of the E partition didn't change when I deleted the extra directory, even though the folder was quite large. I expected that to make the sizes the same. I'm not sure what else I can do to check that my copy has all of the data from the original. The results would imply that some of the files exist on both drives, but are not the same size. Is there a reasonable way to check that? It would seem like that would be a non-trivial addition to what I am doing. Is it possible for the same exact files to be on both drives but to take up different amounts of space?
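One possible way to check for same-named files of different size is to put the byte size next to each path in the file lists and run the same comm -3 comparison on those. The following is only a sketch under a few assumptions: it relies on GNU find's -printf (which Cygwin's findutils provides), the helper name list_sizes is made up for illustration, and the demo runs on two small throwaway trees rather than the real drives.

```shell
# Hypothetical helper: list every regular file under a tree as
# "relative path<TAB>size in bytes", sorted so comm can compare the lists.
# Assumes GNU find (-printf is not POSIX).
list_sizes() {
   ( cd "$1" && find . -type f -printf '%p\t%s\n' | LC_ALL=C sort )
}

# demo on two throwaway trees; b.txt has a different size in each
work=$(mktemp -d)
mkdir -p "$work/tree1/sub" "$work/tree2/sub"
printf 'same\n'   > "$work/tree1/sub/a.txt"
printf 'same\n'   > "$work/tree2/sub/a.txt"
printf 'longer\n' > "$work/tree1/b.txt"
printf 'short\n'  > "$work/tree2/b.txt"

list_sizes "$work/tree1" > "$work/sizes_1"
list_sizes "$work/tree2" > "$work/sizes_2"

# a file whose size differs shows up twice, once per column
size_diff=$(LC_ALL=C comm -3 "$work/sizes_1" "$work/sizes_2")
printf '%s\n' "$size_diff"

rm -r "$work"
```

Even if every path and byte size matches, the Used figures from df can still differ. df counts allocated blocks, so two filesystems with different cluster sizes or metadata overhead can report different usage for identical files. With GNU du, comparing du -s --apparent-size against plain du -s on each tree is one way to see that gap.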

LMHmedchem

---------- Post updated at 05:44 PM ---------- Previous update was at 05:28 PM ----------

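I see I had a typo in the script as I ran it: the second sort command read check_1 instead of check_2, so comm was comparing the first file list with itself and I wasn't doing the correct compare. I am running again with the corrected script.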

---------- Post updated at 06:51 PM ---------- Previous update was at 05:44 PM ----------

Running the corrected script, there are a few files that are different, but the total size is not much. I keep my browser profiles here and these are different because one is the browser I am using and one is a copy made yesterday.

There is nothing here that accounts for 3GB of data.

Any suggestions on what to do next? I suppose I could use the sorted find files to do a diff between each file pair, but that wouldn't exactly be speedy. The find files don't differentiate between files and directories and I don't know what happens if you feed diff a pair of directories instead of files.
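A content comparison restricted to regular files could look roughly like the sketch below. It is not tuned for speed (every file pair is read in full), the helper name compare_trees is made up for illustration, and it assumes both arguments are absolute paths, as the script above already requires. find -type f keeps directories out of the list, and cmp -s compares two files byte by byte without printing anything.

```shell
# Hypothetical helper: for every regular file in tree 1, report whether the
# file with the same relative path is missing from tree 2 or has different
# contents. Both arguments must be absolute paths.
compare_trees() {
   ( cd "$1" && find . -type f -exec sh -c '
        other=$1; shift
        for f in "$@"; do
           if [ ! -f "$other/$f" ]; then
              echo "missing: $f"
           elif ! cmp -s "$f" "$other/$f"; then
              echo "differs: $f"
           fi
        done' sh "$2" {} + )
}

# demo on two throwaway trees
work=$(mktemp -d)
mkdir -p "$work/tree1/sub" "$work/tree2/sub"
printf 'same\n' > "$work/tree1/sub/a.txt"
printf 'same\n' > "$work/tree2/sub/a.txt"
printf 'one\n'  > "$work/tree1/b.txt"
printf 'two\n'  > "$work/tree2/b.txt"
printf 'x\n'    > "$work/tree1/c.txt"

# find does not guarantee any order, so sort the report
report=$(compare_trees "$work/tree1" "$work/tree2" | LC_ALL=C sort)
printf '%s\n' "$report"

rm -r "$work"
```

As for feeding diff a pair of directories: it compares the same-named files directly inside them, and with -r it also descends into common subdirectories, so diff -r -q on the two trees is another option that only reports which files differ.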

LMHmedchem
 
