Search and compare files from two paths


 
# 1  
Old 03-27-2013

Hi All,

I have two paths. The oldfile path contains several subfolders, each holding a config file (a plain text file). The newfile path likewise contains subfolders, each holding a config file.

I need to read the files from the subfolders of both paths and run sdiff on each matching pair of files, i.e.,

Code:
sdiff /oldfile path/sub folderA/fileA.txt /newfile path/sub folderA/fileA.txt |egrep '>|<|\|' > folderA_fileA_result.txt

But before running sdiff, I need to sort both files, i.e.,

Code:
sort /oldfile path/sub folderA/fileA.txt > fileA_old.txt
sort /newfile path/sub folderA/fileA.txt > fileA_new.txt

The problem is that I am not sure how to search for the same subfolder name under both the oldfile path and the newfile path. If the folder names match, I need to do the following:

a. Sort the content of each file with the sort command.

b. Compare the files with sdiff and save the result file in an output folder.

c. Archive the result files with some versioning, so that when new config files arrive, the old files from the archive folder can be compared with the new files.

Note: All the config files are plain text files with several lines each.

---------- Post updated 27-03-13 at 09:29 AM ---------- Previous update was 26-03-13 at 03:50 PM ----------

OK, I tried comparing files with the same name from two different directories, but I am getting the error below even though I have put the same file in both folders. I am not sure how to skip the . and .. entries.

Code:
 
# Check the number of input parameters. If two parameters are given go ahead, else exit
if [ $# -eq 2 ]
then
OLDFILESDIR=$1
NEWFILESDIR=$2
else
echo "Usage: script.sh oldfilesdir newfilesdir"
exit
fi
# Validate from_directory
if [ ! -d "${OLDFILESDIR}" ]
then
echo "Directory ${OLDFILESDIR} does not exist!!"
exit
fi
# Validate to_directory
if [ ! -d "${NEWFILESDIR}" ]
then
echo "Directory ${NEWFILESDIR} does not exist!!"
exit
fi
cd ${OLDFILESDIR}
#for i in `find . -type f`
for i in `find . -name '*.txt'`
do
if [ ! -f ${NEWFILESDIR}/$i ]
then
echo "Same filename doesn't found in ${OLDFILESDIR}/$i and in ${NEWFILESDIR}/$i"
else
echo "Same filename found in ${OLDFILESDIR}/$i and in ${NEWFILESDIR}/$i"
sort ${OLDFILESDIR}/$i
sort ${NEWFILESDIR}/$i
sdiff ${OLDFILESDIR}/$i ${NEWFILESDIR}/$i |egrep '>|<|\|' > resultfile.txt
fi
done

Error:

Code:
Same filename doesn't found in oldfiles/./tcp.txt and in newfile/./tcp.txt
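
A possible cause, judging only from the relative paths in the message above: after `cd ${OLDFILESDIR}`, the relative name `newfile` is looked up inside `oldfiles`, where it does not exist, so the `-f` test fails. The `./` prefix that find prints is harmless. The sketch below resolves both arguments to absolute paths before any `cd`; `abs_dir` and the throwaway demo directories are made up for illustration and are not part of the original script.

```shell
#!/bin/sh
# Sketch: turn a directory argument into an absolute path before any cd,
# so later tests such as [ -f "$NEWFILESDIR/$i" ] still resolve.
# abs_dir is a hypothetical helper name.
abs_dir() {
    ( cd "$1" 2>/dev/null && pwd )
}

# Throwaway directories standing in for oldfiles and newfile.
demo=$(mktemp -d)
mkdir -p "$demo/oldfiles" "$demo/newfile"
echo "a" > "$demo/oldfiles/tcp.txt"
echo "a" > "$demo/newfile/tcp.txt"

cd "$demo" || exit 1
OLDFILESDIR=$(abs_dir oldfiles)   # relative arguments, as in the failing run
NEWFILESDIR=$(abs_dir newfile)

cd "$OLDFILESDIR" || exit 1
found=0
for i in *.txt
do
    if [ -f "$NEWFILESDIR/$i" ]
    then
        echo "Same filename found: $i"
        found=$((found + 1))
    else
        echo "No matching file for: $i"
    fi
done
```

Passing absolute paths on the command line would sidestep the same problem without changing the script.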

Last edited by vbe; 03-27-2013 at 12:02 PM..
# 2  
Old 03-27-2013
Seems like deja vu. Compare the file lists of the two directories, and the file content, using something like this:
Code:
diff -U0 <(
  cd head1
  find * -type f | sort | xargs -r cksum ) <(
  cd head2
  find * -type f | sort | xargs -r cksum ) | while read  diff_ind  cksum  sz  path
do
 case "$diff_ind" in
 (-)
  echo "Deleted file '$path'."
  ;;
 (+)
  echo "New file '$path':"
  cat head2/$path
  ;;
 (*)
  echo "Changed file '$path':"
  sdiff head1/$path head2/$path
  ;;
 esac
done

For stricter deleted/new checking, use 'comm' rather than 'diff -U0', and leave out '| xargs -r cksum' until later, once you know a file is present on both sides (in comm's output, no tab prefix means deleted, one tab means new, two tabs means both). You can report new/deleted files on stderr and pipe the rest on stdout to cksum and another 'while read' loop that compares the checksums before running sdiff.

Last edited by DGPickett; 03-27-2013 at 06:41 PM..
# 3  
Old 03-28-2013
Thanks DGPickett. I am not sure what changes are needed in the above script. I tried it after changing head1 to the oldfile path and head2 to the newfile path.

Could you please correct me?
Code:
diff -U0 <(
cd /usr/config_check/oldfiles/
find * -type f | sort ) <(
cd /usr/config_check/newfile/
find * -type f | sort ) | while read diff_ind cksum sz path
do
case "$diff_ind" in
(-)
echo "Deleted file '$path'."
;;
(+)
echo "New file '$path':"
cat head2/$path
;;
(*)
echo "Changed file '$path':"
sdiff /usr/config_check/oldfiles/$path /usr/config_check/newfile/$path
;;
esac
done

After running the above script, I get this error:
Code:
root@att02 # ./cfile.sh
Changed file '':
sdiff: Cannot open: /usr/config_check/oldfiles//newfile/

# 4  
Old 03-28-2013
Well, I do not have the test facility, so you need to check where the blank line is coming from. I assume diff -U0 will print only lines starting with +, -, |; so stick a tee after it, or "pg ;true |" and see what the first part delivers. Or put the word echo before sdiff to see what the whole command line is.

The idea is that the lists of files are identical, so any delete or add will be - or + and the checksum does not matter. If the files are identical, the checksum and size will be identical, and diff should toss them. If the files are different, sdiff should be able to show that.

The blank line from diff can be filtered if just a nuisance, using case (?*) to process only not empty lines and ignoring empty lines that fit (*).
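
A minimal sketch of that `(?*)` filter, on made-up input rather than real diff output; `count_nonempty` is a hypothetical name used only for this demo.

```shell
#!/bin/sh
# Sketch: ignore empty lines inside a while-read loop with a (?*) case pattern.
# count_nonempty is a hypothetical helper; it reads stdin and prints a count.
count_nonempty() {
    n=0
    while IFS= read -r line
    do
        case "$line" in
            (?*) n=$((n + 1)) ;;   # at least one character: process the line
            (*)  ;;                # empty line: skip it
        esac
    done
    echo "$n"
}

# Three input lines, the middle one empty.
printf 'a.txt\n\nb.txt\n' | count_nonempty
```

In the real loop the `(?*)` branch would hold the existing `-`, `+` and changed-file handling.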

The comm command is much stricter, as it is not designed for the eye but for bit by bit perfection. However, comm is just good for finding deleted and new; to get from both to changed still needs a comparison by cksum result compare or cmp. Comm demands sorted inputs, so if two files have different cksum, they would sort to non-adjacent places. So, a comm of file names is nice, followed by a cmp of files.
Code:
( export LC_ALL=C head1=... head2=... # LC_ALL controls sort order; some systems do not sort bytewise by default
comm <(
  cd "$head1" &&
  find * -type f | sort
 ) <(
  cd "$head2" &&
  find * -type f | sort
 ) | sed '
  s/^\t\t/both /
  t
  s/^\t/add /
  t
  s/^/del /
 ' | while read -r stat fn
 do
  case $stat in
   (add)
    echo "New: $fn"
    cat "$head2/$fn"
    ;;
   (del)
    echo "Deleted: $fn"
    ;;
   (*)
    # cmp -s is silent and returns non-zero when the files differ
    if ! cmp -s "$head1/$fn" "$head2/$fn"
    then
     echo "Different: $fn"
     sdiff "$head1/$fn" "$head2/$fn" 2>&1
    fi
    ;;
   esac
  done
 )

The \t needs to be a real tab, above. The comm command unifies the two file name lists and tells you whether each name is in a only, b only, or both, by tabbing the lines as if to put them in three columns. You can remove columns in comm using -1 (new and both), -2 (old and both), -23 (old only), -3 (old and new but no both), etc. It is robust set logic for shell scripting.
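
A small sketch of those column options on two made-up name lists (the file names and the temporary directory are assumptions for the demo only):

```shell
#!/bin/sh
# Sketch: comm on two sorted lists of file names (the names are made up).
LC_ALL=C; export LC_ALL
work=$(mktemp -d)
printf '%s\n' aa.txt bb.txt cc.txt | sort > "$work/old.lst"
printf '%s\n' bb.txt cc.txt dd.txt | sort > "$work/new.lst"

only_old=$(comm -23 "$work/old.lst" "$work/new.lst")   # old only: deleted
only_new=$(comm -13 "$work/old.lst" "$work/new.lst")   # new only: added
in_both=$(comm -12 "$work/old.lst" "$work/new.lst")    # both: candidates for cmp

echo "deleted: $only_old"
echo "added: $only_new"
echo "both:" $in_both
rm -rf "$work"
```

Only the names in the last list need a content comparison with cmp or cksum.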

Last edited by DGPickett; 03-28-2013 at 04:23 PM..
# 5  
Old 03-29-2013
Thanks DGPickett. I will certainly try out what you suggested.

I tried a very simple solution to compare *.txt files from two different directories. I was able to compare them and generate the sdiff result.

Code:
#!/bin/bash
# cmp_dir - program to compare two directories
# Check for required arguments
if [ $# -ne 2 ]; then
    echo "usage: $0 directory_1 directory_2" 1>&2
    exit 1
fi
# Make sure both arguments are directories
if [ ! -d $1 ]; then
    echo "$1 is not a directory!" 1>&2
    exit 1
fi
if [ ! -d $2 ]; then
    echo "$2 is not a directory!" 1>&2
    exit 1
fi
# Process each file in directory_1, comparing it to directory_2
find $1/ -name '*.txt' -print | while read src
do
for filename in $1/*.txt; do
    fn=$(basename "$filename")
    if [ -f "$filename" ]; then
        #if [ ! -f "$2/$fn" ]; then
            #echo "$fn is missing from $2"
            #missing=$((missing + 1))
        #fi
                sort $filename
                #echo $filename
                sort $2/$fn
                #echo $2/$fn
                sdiff $filename $2/$fn | egrep '>|<|\|' > resultfile.txt
    fi
done
done

When I execute the above script, i.e.,
./filecomp.sh oldfiles newfile

I get resultfile.txt, which contains the sdiff output (filtered with egrep). My problem is that I am not sure how to create a separate result file for each file as it is read.

I have two different folders:
a. oldfiles - contains several files (*.txt)
b. newfile - contains several files (*.txt)

Files in the two folders have the same filenames, i.e.,
oldfiles folder -
aa.txt
bb.txt

newfile folder -
aa.txt
bb.txt

So what I am trying to do in the above script is read aa.txt from the oldfiles folder and aa.txt from the newfile folder, run sort and sdiff on them, and then put the result file in an output folder under the name aa_result.txt,

i.e., the output folder will contain the results
aa_result.txt
bb_result.txt

This is where I am stuck: how to get a separate result file for each input file.

Any help would be greatly appreciated.
# 6  
Old 04-02-2013
Hi All,

Can anyone please give me an idea or a solution? I am still stuck, with no clue how to proceed.
# 7  
Old 04-05-2013
Does sdiff recurse like diff, for dirs ( sdiff -bw head1 head2 )?

Consider something very readable but not sdiff, like: diff -bwU99999 head1 head2
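
If sdiff is still wanted, here is a minimal sketch of the per-file result naming asked about earlier in the thread. It assumes flat directories of *.txt files as in that post; `compare_dirs`, the output folder argument and the demo data are hypothetical, and the sorted copies go to temporary files so the originals stay untouched.

```shell
#!/bin/sh
# Sketch: sort each pair of same-named .txt files, sdiff the sorted copies,
# and write one <name>_result.txt per pair into an output folder.
# compare_dirs is a hypothetical name: compare_dirs olddir newdir outdir
compare_dirs() {
    old=$1 new=$2 out=$3
    mkdir -p "$out" || return 1
    for oldfile in "$old"/*.txt
    do
        [ -f "$oldfile" ] || continue
        fn=$(basename "$oldfile")
        if [ ! -f "$new/$fn" ]
        then
            echo "$fn is missing from $new" >&2
            continue
        fi
        tmp_old=$(mktemp)
        tmp_new=$(mktemp)
        sort "$oldfile" > "$tmp_old"
        sort "$new/$fn" > "$tmp_new"
        # Keep only lines sdiff marks as changed, as in the thread's egrep filter.
        sdiff "$tmp_old" "$tmp_new" | grep -E '>|<|\|' > "$out/${fn%.txt}_result.txt"
        rm -f "$tmp_old" "$tmp_new"
    done
}

# Demo data: aa.txt differs between the folders, bb.txt is identical.
demo=$(mktemp -d)
mkdir "$demo/oldfiles" "$demo/newfile"
printf 'b=2\na=1\n' > "$demo/oldfiles/aa.txt"
printf 'a=1\nb=3\n' > "$demo/newfile/aa.txt"
printf 'x=1\n' > "$demo/oldfiles/bb.txt"
printf 'x=1\n' > "$demo/newfile/bb.txt"
compare_dirs "$demo/oldfiles" "$demo/newfile" "$demo/output"
ls "$demo/output"
```

The result name is built from the input name with `${fn%.txt}_result.txt`, so each pair gets its own file instead of overwriting resultfile.txt. For the versioning part of the question, one option would be to add a date stamp to the output folder name.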

Last edited by DGPickett; 04-05-2013 at 05:25 PM..