Making a script to copy files not seen before (using md5sum)


 
# 22  
Old 09-04-2013
OK... I spoke too soon...

Bummer. My testing routine was not specific enough. And shortly after I posted my "it's all fixed" message, I realized it was probably not.

The "photos-backup" folder is supposed to house the "DCIM" folder from the phone, which in turn has subfolders (e.g. 100MEDIA, 101MEDIA, etc.).

So I put "set -x" back in the script:
Code:
#!/bin/bash
set -x

# The username variable passed by command line to this script
USER=$1

# The source directory where the photo folder on the phone is mirrored to
SRC=/hd1/home/$USER/.phonesync/photos-backup

# The destination directory where we want to copy only new photos we have not copied before
DST=/hd1/home/$USER/.phonesync/photos-new

# The MD5 list file that tracks which files we have copied before
MD5=/hd1/home/$USER/.phonesync/photos-backup.md5

# Check files against the MD5 list and then copy if not previously copied
# Then add the md5 for that file to the MD5 list
cd $SRC
for f in *
do
  FMD5=$(md5sum $f)
  grep -q $FMD5 $MD5
  if [[ $? -ne 0 ]]; then
    cp $f $DST
    md5sum $f >> $MD5
  fi
done

# In case this script gets run as root, redo the file ownership so users can access their photos
chown -R $USER:$USER $DST

And ran it... and here is the output.

Code:
nick@server ~/.phonesync$ ./photos-new.sh nick
+ USER=nick
+ SRC=/hd1/home/nick/.phonesync/photos-backup
+ DST=/hd1/home/nick/.phonesync/photos-new
+ MD5=/hd1/home/nick/.phonesync/photos-backup.md5
+ cd /hd1/home/nick/.phonesync/photos-backup
+ for f in '*'
++ md5sum DCIM
md5sum: DCIM: Is a directory
+ FMD5=
+ grep -q /hd1/home/nick/.phonesync/photos-backup.md5
^C

Had to CTRL+C to stop the script.
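
A note on why the script hung rather than failing: the trace shows FMD5 ended up empty, and because $FMD5 is unquoted the empty value disappears from the grep command line. grep then takes the list path as its pattern and waits for input on the terminal. The sketch below only counts arguments to show the effect; count_args is a hypothetical helper, not part of the script.

```shell
#!/bin/bash
# Hypothetical helper: report how many arguments a command would receive.
count_args() { echo "$#"; }

FMD5=""                                          # what was left after md5sum failed on DCIM
MD5=/hd1/home/nick/.phonesync/photos-backup.md5  # path taken from the trace above

count_args $FMD5 $MD5       # prints 1: the empty value vanishes, so the path becomes the pattern
count_args "$FMD5" "$MD5"   # prints 2: the empty pattern and the file are both passed
```

Quoting both expansions keeps grep's arguments in the intended positions even when md5sum produces nothing.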

The folder structure of photos-backup is...

Code:
nick@server ~/.phonesync$ ls * -a -R -l
-rw-r--r-- 1 nick nick   46 Sep  4 10:42 photos-backup.md5
-rwxr-xr-x 1 nick nick  841 Sep  4 11:11 photos-new.sh

photos-backup:
total 12
drwxr-xr-x 3 nick nick 4096 Sep  4 11:15 .
drwxr-xr-x 4 nick nick 4096 Sep  4 10:42 ..
drwxr-xr-x 3 nick nick 4096 Sep  4 11:09 DCIM

photos-backup/DCIM:
total 12
drwxr-xr-x 3 nick nick 4096 Sep  4 11:09 .
drwxr-xr-x 3 nick nick 4096 Sep  4 11:15 ..
drwxr-xr-x 2 nick nick 4096 Sep  4 11:10 100MEDIA

photos-backup/DCIM/100MEDIA:
total 4080
drwxr-xr-x 2 nick nick    4096 Sep  4 11:10 .
drwxr-xr-x 3 nick nick    4096 Sep  4 11:09 ..
-rw-r--r-- 1 nick nick 1243984 Jun 28  2012 IMAG0001.jpg
-rw-r--r-- 1 nick nick 1551828 Jun 28  2012 IMAG0002.jpg
-rw-r--r-- 1 nick nick 1369884 Jun 28  2012 IMAG0003.jpg

photos-new:
total 8
drwxr-xr-x 2 nick nick 4096 Sep  4 11:15 .
drwxr-xr-x 4 nick nick 4096 Sep  4 10:42 ..

So how do I modify this script for it to carry out the copies for all subfolders of "photos-backup"?
# 23  
Old 09-04-2013
I see. Basically, what it's saying is that you can't md5sum a directory.
Here's what I would do. Once you change directory to $SRC (cd $SRC):
  • Check if each file is a regular file or a directory.
  • If it's a regular file, continue with the current for loop.
  • If it's a directory, change into it and run another for loop that performs the same test on each file inside. Once done, return to the parent directory and resume the main loop.
Writing a for loop with a couple of conditionals is not that difficult. If you need help determining whether a certain file (the way Unix or Linux sees it) is a regular file, let us know.
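
The steps above can be sketched roughly as follows. This is only an illustration under my own assumptions: walk is a hypothetical function name, and it calls itself for each subdirectory instead of using a second hand-written loop, so it handles any depth. It uses the standard [ -f ] and [ -d ] tests to tell regular files from directories.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): walk a directory tree by hand,
# calling md5sum only on regular files and recursing into directories.
walk() {
  local dir=$1 entry
  for entry in "$dir"/*; do
    if [ -f "$entry" ]; then
      # Regular file: safe to checksum.
      md5sum "$entry"
    elif [ -d "$entry" ]; then
      # Directory: repeat the same test on its contents.
      walk "$entry"
    fi
    # Anything else (including the unmatched "*" of an empty directory) is skipped.
  done
}

# Example call, assuming a photos-backup directory exists in the current directory:
# walk photos-backup
```

Note that "$dir"/* does not match hidden files, so names starting with a dot would be skipped by this sketch.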
# 24  
Old 09-04-2013
Do you want to copy $SRC/DCIM/100MEDIA/IMAG000[1-3].jpg to $DST/DCIM/100MEDIA/IMAG000[1-3].jpg or to $DST/IMAG000[1-3].jpg?

Do all of the files you want to copy have names that end with .jpg?

Last edited by Don Cragun; 09-04-2013 at 04:01 PM.. Reason: Add 2nd question...
# 25  
Old 09-04-2013
Hi Don,

The phone stores both photos and videos in the DCIM structure, so I'd prefer not to limit to a single extension. If I have to call out extensions, I would want the flexibility to supply more than one to look for.

And I think I would prefer to copy $SRC/DCIM/100MEDIA/IMAG000[1-3].jpg to $DST/DCIM/100MEDIA/IMAG000[1-3].jpg rather than dump all files in one $DST location - just in case there are filename duplicates within a "100MEDIA" directory vs. a "101MEDIA" directory that might be under the DCIM directory.
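
If it ever became necessary to restrict the copy by extension, one possible way to accept several extensions is sketched below. The function name list_media and the extensions in the example call are my assumptions, not something stated in this thread, and -iname is a GNU/BSD find feature rather than a POSIX one.

```shell
#!/bin/bash
# Hypothetical helper: list regular files under a directory whose names end in
# one of the given extensions (case-insensitive). With no extensions given,
# every regular file is listed.
list_media() {
  local dir=$1; shift
  local args=() ext
  if [ "$#" -eq 0 ]; then
    find "$dir" -type f
    return
  fi
  for ext in "$@"; do
    # Build the expression: -iname '*.jpg' -o -iname '*.mp4' ...
    [ "${#args[@]}" -gt 0 ] && args+=(-o)
    args+=(-iname "*.$ext")
  done
  find "$dir" -type f \( "${args[@]}" \)
}

# Example call (the extensions are placeholders):
# list_media /hd1/home/nick/.phonesync/photos-backup jpg mp4 3gp
```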
# 26  
Old 09-04-2013
You could try changing:
Code:
cd $SRC
for f in *
do
  FMD5=$(md5sum $f)
  grep -q $FMD5 $MD5
  if [[ $? -ne 0 ]]; then
    cp $f $DST
    md5sum $f >> $MD5
  fi
done

in your script to something like:
Code:
cd $SRC
find . -type f | while IFS= read -r f
do
  FMD5=$(md5sum $f)
  grep -q $FMD5 $MD5
  if [[ $? -ne 0 ]]; then
    if [ ! -d $DST/${f%/*} ]; then
      mkdir -p $DST/${f%/*}
    fi
    cp $f $DST/$f
    md5sum $f >> $MD5
  fi
done
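
A more defensive variant of the same loop is sketched below, under a few assumptions of mine: copy_unseen is a hypothetical wrapper name, the destination and list arguments are absolute paths, and file names contain no newlines or backslashes. It quotes every expansion so names with spaces survive, and it compares only the hash against the list, so a file counts as seen whenever its content has been copied before.

```shell
#!/bin/bash
# Hypothetical wrapper around the loop above. The body runs in a subshell
# (note the parentheses), so the cd does not affect the caller.
# Usage: copy_unseen SRC DST LIST   (DST and LIST must be absolute paths)
copy_unseen() (
  src=$1 dst=$2 list=$3
  touch "$list"                       # make sure the list exists before grep reads it
  cd "$src" || exit 1
  find . -type f | while IFS= read -r f; do
    sum=$(md5sum "$f") || continue    # output looks like "<hash>  <name>"
    sum=${sum%% *}                    # keep only the hash
    if ! grep -qF -- "$sum" "$list"; then
      mkdir -p "$dst/${f%/*}"         # recreate the subfolder under DST
      # Record the hash only if the copy succeeded.
      cp -- "$f" "$dst/$f" && printf '%s  %s\n' "$sum" "$f" >> "$list"
    fi
  done
)

# Example call, using the paths from the script in this thread:
# copy_unseen "$SRC" "$DST" "$MD5"
```

mkdir -p succeeds when the directory already exists, so the separate [ ! -d ... ] test is not needed in this form.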

# 27  
Old 09-10-2013
OK... So I modified the code of "photos-new.sh" to be as follows:
Code:
#!/bin/bash
set -x

# The username variable passed by command line to this script
USER=$1

# The source directory where the photo folder on the phone is mirrored to
SRC=/hd1/home/$USER/.phonesync/photos-backup

# The destination directory where we want to copy only new photos we have not copied before
DST=/hd1/home/$USER/.phonesync/photos-new

# The MD5 list file that tracks which files we have copied before
MD5=/hd1/home/$USER/.phonesync/photos-backup.md5

# Check files against the MD5 list and then copy if not previously copied
# Then add the md5 for that file to the MD5 list

cd $SRC
find . -type f | while IFS= read -r f
do
  FMD5=$(md5sum $f)
  grep -q $FMD5 $MD5
  if [[ $? -ne 0 ]]; then
    if [ ! -d $DST/${f%/*} ]; then
      mkdir -p $DST/${f%/*}
    fi
    cp $f $DST/$f
    md5sum $f >> $MD5
  fi
done

# In case this script gets run as root, redo the file ownership so users can access their photos
chown -R $USER:$USER $DST

I ran the script and got the following output:
Code:
nick@server ~/.phonesync$ ./photos-new.sh nick
+ USER=nick
+ SRC=/hd1/home/nick/.phonesync/photos-backup
+ DST=/hd1/home/nick/.phonesync/photos-new
+ MD5=/hd1/home/nick/.phonesync/photos-backup.md5
+ cd /hd1/home/nick/.phonesync/photos-backup
+ find . -type f
+ IFS=
+ read -r f
++ md5sum ./DCIM/100MEDIA/IMAG0003.jpg
+ FMD5='c0bd05642752af82a79fef52fffb3120  ./DCIM/100MEDIA/IMAG0003.jpg'
+ grep -q c0bd05642752af82a79fef52fffb3120 ./DCIM/100MEDIA/IMAG0003.jpg /hd1/home/nick/.phonesync/photos-backup.md5
+ [[ 1 -ne 0 ]]
+ '[' '!' -d /hd1/home/nick/.phonesync/photos-new/./DCIM/100MEDIA ']'
+ mkdir -p /hd1/home/nick/.phonesync/photos-new/./DCIM/100MEDIA
+ cp ./DCIM/100MEDIA/IMAG0003.jpg /hd1/home/nick/.phonesync/photos-new/./DCIM/100MEDIA/IMAG0003.jpg
+ md5sum ./DCIM/100MEDIA/IMAG0003.jpg
+ IFS=
+ read -r f
++ md5sum ./DCIM/100MEDIA/IMAG0001.jpg
+ FMD5='ccf0730cdc59d92323465401905b9a79  ./DCIM/100MEDIA/IMAG0001.jpg'
+ grep -q ccf0730cdc59d92323465401905b9a79 ./DCIM/100MEDIA/IMAG0001.jpg /hd1/home/nick/.phonesync/photos-backup.md5
+ [[ 1 -ne 0 ]]
+ '[' '!' -d /hd1/home/nick/.phonesync/photos-new/./DCIM/100MEDIA ']'
+ cp ./DCIM/100MEDIA/IMAG0001.jpg /hd1/home/nick/.phonesync/photos-new/./DCIM/100MEDIA/IMAG0001.jpg
+ md5sum ./DCIM/100MEDIA/IMAG0001.jpg
+ IFS=
+ read -r f
++ md5sum ./DCIM/100MEDIA/IMAG0002.jpg
+ FMD5='9a0d8d0d82690ecf7c690fe386679ae3  ./DCIM/100MEDIA/IMAG0002.jpg'
+ grep -q 9a0d8d0d82690ecf7c690fe386679ae3 ./DCIM/100MEDIA/IMAG0002.jpg /hd1/home/nick/.phonesync/photos-backup.md5
+ [[ 1 -ne 0 ]]
+ '[' '!' -d /hd1/home/nick/.phonesync/photos-new/./DCIM/100MEDIA ']'
+ cp ./DCIM/100MEDIA/IMAG0002.jpg /hd1/home/nick/.phonesync/photos-new/./DCIM/100MEDIA/IMAG0002.jpg
+ md5sum ./DCIM/100MEDIA/IMAG0002.jpg
+ IFS=
+ read -r f
+ chown -R nick:nick /hd1/home/nick/.phonesync/photos-new

And my file structure shows the files were copied:

Code:
nick@server ~/.phonesync$ ls -a -l -R
.:
total 24
drwxr-xr-x 4 nick nick 4096 Sep  4 10:42 .
drwxr-xr-x 7 nick nick 4096 Sep  4 10:37 ..
drwxr-xr-x 3 nick nick 4096 Sep  4 11:15 photos-backup
-rw-r--r-- 1 nick nick  235 Sep 10 14:50 photos-backup.md5
drwxr-xr-x 3 nick nick 4096 Sep 10 14:50 photos-new
-rwxr-xr-x 1 nick nick  942 Sep 10 14:30 photos-new.sh

./photos-backup:
total 12
drwxr-xr-x 3 nick nick 4096 Sep  4 11:15 .
drwxr-xr-x 4 nick nick 4096 Sep  4 10:42 ..
drwxr-xr-x 3 nick nick 4096 Sep  4 11:09 DCIM

./photos-backup/DCIM:
total 12
drwxr-xr-x 3 nick nick 4096 Sep  4 11:09 .
drwxr-xr-x 3 nick nick 4096 Sep  4 11:15 ..
drwxr-xr-x 2 nick nick 4096 Sep  4 11:10 100MEDIA

./photos-backup/DCIM/100MEDIA:
total 4080
drwxr-xr-x 2 nick nick    4096 Sep  4 11:10 .
drwxr-xr-x 3 nick nick    4096 Sep  4 11:09 ..
-rw-r--r-- 1 nick nick 1243984 Jun 28  2012 IMAG0001.jpg
-rw-r--r-- 1 nick nick 1551828 Jun 28  2012 IMAG0002.jpg
-rw-r--r-- 1 nick nick 1369884 Jun 28  2012 IMAG0003.jpg

./photos-new:
total 12
drwxr-xr-x 3 nick nick 4096 Sep 10 14:50 .
drwxr-xr-x 4 nick nick 4096 Sep  4 10:42 ..
drwxr-xr-x 3 nick nick 4096 Sep 10 14:50 DCIM

./photos-new/DCIM:
total 12
drwxr-xr-x 3 nick nick 4096 Sep 10 14:50 .
drwxr-xr-x 3 nick nick 4096 Sep 10 14:50 ..
drwxr-xr-x 2 nick nick 4096 Sep 10 14:50 100MEDIA

./photos-new/DCIM/100MEDIA:
total 4080
drwxr-xr-x 2 nick nick    4096 Sep 10 14:50 .
drwxr-xr-x 3 nick nick    4096 Sep 10 14:50 ..
-rw-r--r-- 1 nick nick 1243984 Sep 10 14:50 IMAG0001.jpg
-rw-r--r-- 1 nick nick 1551828 Sep 10 14:50 IMAG0002.jpg
-rw-r--r-- 1 nick nick 1369884 Sep 10 14:50 IMAG0003.jpg
nick@server ~/.phonesync$

This is very encouraging. I'll do some more testing and report back.
# 28  
Old 09-10-2013
I'm glad to hear that. Keep up the good work! And let us know how it goes.
