Help with file compare and move script


 
# 1  
Old 11-27-2018

I'm running Debian (the Raspbian version of it) and working on a script to compare files in 2 directories, source and target, move files with duplicate names to a 3rd directory, then move the remaining files in source to target. I can't get the syntax right: I keep getting syntax errors and can't get past the file comparison stage to start figuring out the move portion. I thought I'd print the results to start, to see if it's working.

This isn't intended to be a command line script. It's intended to run automatically on boot, so there shouldn't be any user intervention required. I found several scripts that require a user to input directories when they're run, then delete duplicates. I took the one that looked easiest to understand and am trying to modify it. I was mistaken about the simplicity.

Any advice or hints would be greatly appreciated.


Code:
#!/bin/bash
# Compare file names in source and target directories
# Move duplicates from source to duplicates directory
# Move remaining files in source to target directory
# Only care about files names, not upper lower case, checksum, date, time

dir1="/mnt/nas/source"
dir2="/mnt/nas/target"
dir3="/mnt/nas/Duplicates"

for file in $dir1 ;
  do [ -f $dir2 ${file} ] && echo ${file} ;
done

# 2  
Old 11-27-2018
Code:
-f ${dir2}/${file}

# 3  
Old 11-27-2018
The above won't work because the for loop supplies only the directory name, not the file names in it. If you append /* to the directory, the loop supplies full path names instead. You then have to either cd into the source dir, or strip off the path part:


Code:
cd "$dir1"
for FN in *; do ...; done

OR

Code:
for FN in "$dir1"/*; do echo mv "${dir2}/${FN##*/}" "$dir3"; done
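
A minimal sketch of the second form, assuming the goal at this stage is only to report which names in the source also exist in the target. The function name list_duplicates and the scratch directories are made up for the demo; nothing under /mnt/nas is touched.

```shell
#!/bin/bash
# list_duplicates SRC DST (hypothetical helper): print the names of regular
# files in SRC that also exist as regular files in DST.
list_duplicates() {
    local src="$1" dst="$2" path name
    for path in "$src"/*; do
        [ -f "$path" ] || continue       # skip non-files and an unmatched glob
        name="${path##*/}"               # strip the directory part
        [ -f "$dst/$name" ] && echo "$name"
    done
    return 0
}

# Demo in throwaway directories
demo=$(mktemp -d)
mkdir "$demo/source" "$demo/target"
touch "$demo/source/flowers.jpg" "$demo/source/space test.jpg" "$demo/target/flowers.jpg"
list_duplicates "$demo/source" "$demo/target"    # prints: flowers.jpg
rm -rf "$demo"
```

Because every expansion is double-quoted, a name such as "space test.jpg" stays one word throughout the loop.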

# 4  
Old 11-27-2018
Thanks for the advice above. It got me further along, but I've run into another wall or two. Here's where I'm at so far:
Code:
!/bin/bash
# Compare file names in source and target directories
# Move duplicates from source to duplicates directory
# Move remaining files in source to target directory
# Only care about files names, not upper lower case, checksum, date, time

cd /mnt/nas/source
dir1=$(ls *.*)
cd /mnt/nas/target
dir2=$(ls *.*)
cd / 
dir3="/mnt/nas/Duplicates"

for FN in $dir1;
do
    if [ -f "$dir2 -eq ${FN}" ];
        then echo "Duplicate "${FN};
        else echo "Unique "${FN};
    fi;
done

The first problem is that I can't get the IF statement to work no matter where I put quotes, parens, brackets, or curly brackets. Switching -eq for =, ==, or / makes no difference either. I get either a "too many parameters on line 16" error or it drops right through and declares every file unique when 10 out of 20 are duplicates.

Second problem is I have a file named 'space test dupe10.jpg' that I named to see how it would handle spaces in file names. I have a simplified version of the script (no IF statement) that just echoes the variables as it increments through them, and it appears to treat space, test, and dupe10.jpg as 3 different files.

This isn't a life or death situation, so I greatly appreciate any and all advice. I'm just updating an electronic picture frame that hangs on the living room wall and runs for 4 hrs per night, which makes adding pictures a pain since it has to be done while it's on. With this script I can put the pics on my NAS and they'll get copied over when the frame boots. I can do that now, but duplicate file names are a concern. I made this thing before you could buy them, from instructions in a physical Popular Mechanics magazine. 15 or so years and several thousand pics later, you can imagine how many times my wife (a photography hobbyist no less) has tried to load "flowers.jpg" on it.
# 5  
Old 11-27-2018
This demo program might help you to your solution:
Code:
dir1="holiday.jpg camping.jpg beach.jpg"
dir2="xmas.jpg holiday.jpg"

for FN in $dir1
do
    found=0
    for FO in $dir2
    do
        if [ $FN = $FO ]
        then
            found=1
        fi
    done

    if [ $found -eq 1 ]
    then
        echo "Duplicate $FN"
    else
        echo "Unique $FN"
    fi
done

Output:
Code:
Duplicate holiday.jpg
Unique camping.jpg
Unique beach.jpg

# 6  
Old 11-28-2018
First off: you are getting along fine. We all had about the same issues you are experiencing right now when we learned our trade, so don't worry. Keep trying and you will surely be giving the answers here instead of asking the questions.

Let us get to the first problem:
Quote:
Originally Posted by mattz40
Code:
    if [ -f "$dir2 -eq ${FN}" ];

The first problem is I can't get the IF statement to work no matter where I put quotes, parens, brackets, or curly brackets.
It helps to understand problems like this by picturing how the shell works: a script is basically a list of commands the shell will enter on your behalf, line by line. Therefore you can "debug" your code the same way: open a new terminal window and paste the relevant pieces in, line by line, then see what happens. Furthermore, there is a device you might find very useful: set -xv and set +xv. The first one switches on, and the second one switches off, a feature that shows every line as it is read (-v) and every command as it will be executed, after expansion (-x). So, you could insert into your script:

Code:
[...]
    set -xv
    if [ -f "$dir2 -eq ${FN}" ];
        then echo "Duplicate "${FN};
        else echo "Unique "${FN};
    fi;
    set +xv

to see exactly what happens between these lines and what the respective values for the variables are at that point.
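
As a small self-contained illustration of the tracing idea, here is a sketch that uses only set -x around a corrected test. The file name and the scratch directory are made-up sample values, not the real NAS paths.

```shell
#!/bin/bash
FN="space test dupe10.jpg"     # sample name containing spaces
dir=$(mktemp -d)               # empty scratch directory for the demo

set -x                         # start tracing; the trace goes to stderr
if [ -f "$dir/$FN" ]; then
    result="Duplicate"
else
    result="Unique"
fi
set +x                         # stop tracing

echo "$result $FN"             # prints: Unique space test dupe10.jpg
rmdir "$dir"
```

In the trace, bash shows the [ command after expansion, with the whole path (spaces included) as a single quoted word. That makes it easy to spot when a quote is in the wrong place and several values have been glued into one argument.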

The next thing is your quoting. Quoting in shell is necessary to prevent something called "field splitting": the shell splits every command line into fields ("words") before executing it. This is why, e.g., command -opt argument is interpreted as calling "command" with the two arguments "-opt" and "argument", and not with the single argument "-opt argument". By default, field splitting uses spaces, tabs, and newlines as delimiters. Quoting in fact has two purposes: first, to prevent this field splitting (this is what double quotes are mainly used for), and second, to revert characters with a special meaning to the shell back to normal characters. Try, e.g.:

Code:
var="foo bar"
echo $var

Notice, btw., the double quotes in the assignment: they prevent the field splitting. Without them the shell would read var=foo as an assignment that applies only to the command bar, then try to run bar, which most likely does not exist, hence a "command not found" complaint. This is field splitting at work. Now try:

Code:
var="foo bar"
echo '$var'

And notice the difference in output. This is because the single quotes have stripped the special meaning of "$" from the character, and therefore "$var" no longer means "expand this to the content of variable var" but is just a string consisting of the four characters "$", "v", "a" and "r". Btw.: it is a common misconception that quotes can be nested: "....'...'...". This is not the case at all. Inside a quoted string, the other kind of quote character is a normal character until the string is closed. The string before is just a double-quoted string with two characters that happen to be single-quote characters with no special meaning.
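
To make the splitting visible, here is a small sketch that counts how many arguments a command actually receives. The helper count_args is made up for this demo.

```shell
#!/bin/bash
# count_args (hypothetical helper): print the number of arguments received
count_args() { echo "$#"; }

var="foo bar"
count_args $var       # unquoted: split into two words, prints 2
count_args "$var"     # double quotes: one word, prints 1
count_args '$var'     # single quotes: the literal string $var, prints 1
echo "it's $var"      # the ' is an ordinary character here, prints: it's foo bar
```

The same mechanism explains why a file name with spaces turns into several loop items when a list of names is expanded without quotes.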

Now, in light of all this, let us look again at your codeline:

Code:
if [ -f "$dir2 -eq ${FN}" ];

It is obvious that the double quotes are misplaced, they should be surrounding the variables, which would protect the script from breaking when these variables contain spaces:

Code:
if [ -f "$dir2" -eq "${FN}" ];

But there is another issue with this line and it has nothing to do with quoting. Let us first examine the if-statement and how it works: if gets a command as argument, executes it and if this returns 0 (this is the UNIX way of programs saying they were successful) the then-part is executed, otherwise the else-part if there is one. Here is an example:

Code:
ls /etc ; echo "The return code is: $?"

This lists the content of the /etc directory, and because it exists ls will return 0 (or "TRUE"), which is confirmed in the following echo statement. If the directory did not exist, ls would return something else (anything else is "FALSE"):

Code:
ls /this/does/not/exist ; echo "The return code is: $?"

We could use this in the if-statement - we won't even need ls's output, just its return code:

Code:
if ls /etc > /dev/null ; then
     echo "ls returned 0"
else
     echo "ls returned something else"
fi

Plug in some non-existing directory instead of /etc and you will see how it works.
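
Any command can take the place of ls here. As a sketch, the following uses mkdir, which succeeds the first time and fails the second time because the directory then already exists. The scratch directory comes from mktemp and is only for the demo.

```shell
#!/bin/bash
dir=$(mktemp -d)      # scratch directory for the demo

# First attempt: the directory does not exist yet, so mkdir returns 0
if mkdir "$dir/Duplicates" 2>/dev/null; then
    first="created"
else
    first="failed"
fi

# Second attempt: mkdir returns non-zero because the directory now exists
if mkdir "$dir/Duplicates" 2>/dev/null; then
    second="created"
else
    second="failed"
fi

echo "first: $first, second: $second"    # prints: first: created, second: failed
rm -rf "$dir"
```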

Coming back to your line:

Code:
if [ -f "$dir2" -eq "${FN}" ] ; then

Which program is called here? It is - and you might not have guessed that - the program [. Yes, ridiculous as it seems, this is really the name of it and in fact this is another name for the program test. You see, when shells were first created and the mechanism described above of plugging any command into if was invented they invented test to do what usually if-statements in other programming languages do. So you had lines like:

Code:
if test "$x" == "$y" ; then

This worked (and in fact still does), but programmers were used to another style, like in C:

Code:
if( x == y)

So someone "invented" a link from /bin/test to /bin/[ and now lines looked like:

Code:
if [ "$x" == "$y" ; then

This resembled what they were used to much more, but opening a bracket that wasn't closed was among the things "good people won't do", so /bin/test was changed: if it was called as [ it would expect a ] as last argument! We had the syntax we know today:

Code:
if [ "$x" == "$y" ] ; then

So in fact this is a call to test with the arguments "$x", "==" and "$y" - and the last canonical argument "]". If you ever want to know something about "[" and how it works - consult the man page of "test"!
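
A quick way to convince yourself that both spellings are the same command is to run them side by side and look at the return code. This is only a sketch with made-up values. Note that in bash both test and [ are also available as shell builtins, which is what type reports here.

```shell
#!/bin/bash
x="flowers.jpg"
y="flowers.jpg"

test "$x" = "$y"; echo "test returned $?"    # prints: test returned 0
[ "$x" = "$y" ];  echo "[ returned $?"       # prints: [ returned 0

type [                                       # in bash: [ is a shell builtin
```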

Now let us, in light of this, examine what you fed poor test as arguments:

Code:
test -f "$dir2" -eq "${FN}"

-f expects a single argument after it and test will return TRUE if this second argument is a file and FALSE if not. Basically

Code:
if [ -f "$dir2" ] ; then

would say: "if dir2 exists and is a file then do...". -eq on the other hand expects two operands: one before and one after it. It is intended for NUMERICAL (only numerical!) comparisons and tests for equality, like the name suggests. Try this:

Code:
if [ 1 -eq 1 ] ; then
    echo "these are equal"
else
    echo "these are not"
fi

Change "1" to any other number and watch the result again.

So the problem is: you cannot mix two different conditions and most likely this was not what you intended anyway. What you probably wanted to compare was filenames which are basically strings. You cannot compare strings with the -eq (or the -lt, -le, -gt or -ge) operators because they only work on numbers - integers, to be precise. For strings there are the == and the != operators, which test for equality or non-equality.
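
The difference between the two kinds of comparison shows up as soon as two values are the same number but not the same string. A small sketch with made-up values follows. The single = used here is the form the POSIX test specification defines, and bash accepts == in the same place.

```shell
#!/bin/bash
a="007"
b="7"

# -eq converts both operands to integers before comparing
if [ "$a" -eq "$b" ]; then num="equal"; else num="different"; fi

# = compares the operands as strings, character by character
if [ "$a" = "$b" ]; then str="equal"; else str="different"; fi

echo "as numbers: $num, as strings: $str"
# prints: as numbers: equal, as strings: different
```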

You perhaps want to find out how to correctly phrase your condition yourself, so I won't spoil the fun. Happy programming.

I hope this helps.

bakunin

Last edited by bakunin; 11-28-2018 at 02:18 AM..
# 7  
Old 11-28-2018
Hi Chubler_XL,
Your example works fine for the example you've chosen, but, as mattz40 mentioned in post #4 in this thread, it won't work if one or more of the filenames in the lists in the expansions of $dir1 and $dir2 contain a <space> character.

Hi mattz40,
Maybe you would want to try something more like:
Code:
#!/bin/bash
# Compare file names in source and target directories
# Move duplicates from source to duplicates directory (not yet implemented)
# Move remaining files in source to target directory (not yet implemented)
# Only care about filenames; not checksum, date, time

dir1="/mnt/nas/source"
dir2="/mnt/nas/target"
dir3="/mnt/nas/Duplicates"

for Path1 in "$dir1"/*.*
do	File1="${Path1##*/}"
	found=0
	for Path2 in "$dir2"/*.*
	do	if [ "$File1" = "${Path2##*/}" ]
		then	found=1
			break
		fi
	done
	if [ $found -eq 1 ]
	then	echo "Duplicate: \"$File1\""
	else	echo "Unique: \"$File1\""
	fi
done

Note that the comment line in your sample code:
Code:
# Only care about files names, not upper lower case, checksum, date, time

has been modified because this code does care about case in filenames. You can modify the code if you want to allow case-insensitive matches, but that is not the way normal UNIX/Linux/BSD filesystems work.
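
If case-insensitive matching is what you want, one possible sketch is to lowercase both names before comparing them. This relies on the ${var,,} expansion, which is an assumption about your shell: it needs bash 4.0 or later. The function name same_name_nocase is made up for the example.

```shell
#!/bin/bash
# same_name_nocase A B (hypothetical helper): succeed if the two names
# differ at most in letter case. Needs bash 4.0+ for ${var,,}.
same_name_nocase() {
    [ "${1,,}" = "${2,,}" ]
}

if same_name_nocase "Flowers.JPG" "flowers.jpg"; then
    echo "match"        # this branch runs
else
    echo "no match"
fi
```

In a loop over two directories, the call would take the place of the plain = comparison between the two filenames.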

If the directory /mnt/nas/source contains the files:
Code:
birds and bees.jpg
flowers.jpg
hive.jpg

and the directory /mnt/nas/target contains the files:
Code:
beehive.jpg
birds and bees.jpg
flowers.jpg

then the above code will produce the output:
Code:
Duplicate: "birds and bees.jpg"
Duplicate: "flowers.jpg"
Unique: "hive.jpg"

Note also that the dir1 and dir2 variables now contain the full pathnames of the source and target directories; not lists of words contained in filenames in those directories. The Path1 and Path2 variables contain absolute pathnames of a file in the source and target directories, respectively and the File1 variable contains the filename of the last component of the pathname in the expansion of $Path1.
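
For the two steps that are not yet implemented, one possible shape is sketched below. It is an untested-on-your-NAS sketch under several assumptions: the function name sort_pictures is made up, it checks for a duplicate with a single existence test instead of an inner loop, it matches every regular file instead of only *.*, and it uses mv -n so that a file is left in the source if the same name is already waiting in the duplicates directory.

```shell
#!/bin/bash
# sort_pictures SRC DST DUP (hypothetical helper):
#   files in SRC whose name already exists in DST are moved to DUP,
#   all other regular files in SRC are moved to DST.
sort_pictures() {
    local src="$1" dst="$2" dup="$3" path name
    for path in "$src"/*; do
        [ -f "$path" ] || continue        # skip directories and an unmatched glob
        name="${path##*/}"                # filename without the directory part
        if [ -e "$dst/$name" ]; then
            mv -n -- "$path" "$dup/"      # duplicate name: set it aside
        else
            mv -- "$path" "$dst/"         # new name: deliver it
        fi
    done
}

# Intended call with the directories from this thread (commented out on purpose):
# sort_pictures /mnt/nas/source /mnt/nas/target /mnt/nas/Duplicates
```

Try it on copies in a scratch directory first, and put echo in front of both mv commands to see what would be moved before letting it run at boot.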

Note that bakunin gave an excellent explanation of what was wrong in your if [ ... ] expression. But I have to disagree with one point. The standard test expression and [ expression ] string equality operator is a single <equals-sign>, not the double <equals-sign> that is used in the C Language. Some shells will accept both, some shells will give you a syntax error if you use the double <equals-sign>, and some manual pages for some shells will say that the single <equals-sign> form is deprecated, but that is not what the shell standards say. I don't know of any shells that do not accept the single <equals-sign> form that is required by the standards.

(Some shells also have a [[ expression ]] in which [[ is a shell keyword; not the name of a utility used to evaluate expressions. In the shells that understand [[ expression ]], the string equality operator in expression in this form is the double <equals-sign> in all shells that I've used. And some shells accept a single <equals-sign> operator in this form with a similar, but not always identical, meaning.)
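
As a sketch of one practical difference in bash specifically: inside [[ ]] an unquoted right-hand side of == is treated as a pattern, while [ ] with = always compares literal strings. The file name is a made-up sample.

```shell
#!/bin/bash
name="flowers.jpg"

# Inside [[ ]] the unquoted right-hand side of == is a glob pattern
if [[ $name == *.jpg ]]; then echo "pattern matches"; fi

# Quoting the right-hand side forces a literal comparison
if [[ $name == "*.jpg" ]]; then echo "literal match"; else echo "no literal match"; fi

# In [ ] the = operator compares literal strings
if [ "$name" = "flowers.jpg" ]; then echo "same string"; fi
```

This prints "pattern matches", "no literal match" and "same string", in that order.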

Last edited by Don Cragun; 11-28-2018 at 02:56 AM.. Reason: Fix typo: s,!/bin/bash,#!/bin/bash,