Copy files into another directory

# 1  

I have a folder with a lot of documents (pdf, xls, doc etc.) which users have uploaded, but only 20% of them are currently linked from my html files. My goal is to copy only the files that are linked in my html files from my Documents directory into another directory.

E.g., my documents are in /web/Documents (with sub-folders) and my html files are in /web/html

My users were kind to me :) and made sure that they did both absolute linking and relative linking, meaning they used both <a href="Documents/***.doc"> and <a href="***.doc">

And of course, not everyone matched the case of the file names when linking.

Can someone help me figure out how to accomplish this?

Thanks in advance.

---------- Post updated at 02:11 PM ---------- Previous update was at 12:22 PM ----------

I am able to get a huge list of all the links (regardless of whether they are within the domain or not) by using:

perl -nle 'print " $&" if /(?<=href=")[^">]+/' *.html

This gives me a list of every link in the html files in that folder, including:
external website links
xxxx.html (other html documents within the domain)

How do I go further from here...
# 2  
Welcome to the forum.
1. Please post a few lines from the HTML file, i.e. lines containing both absolute and relative links (preferably covering all the possibilities that need to be parsed).
2. And please use code tags for code and data samples.
# 3  
Here are examples of links in an html file. I have changed the webpage names and wording, but this should give you the gist of what they look like.

The possible document types are pdf, doc, docx, ppt, pptx, xls, xlsx, jpg. I want to be able to copy any of these files into a separate directory (retaining the folder structure).

Absolute linking
<a href="">

Relative linking
<a href="Documents/PDF/handbook.pdf" target="_blank">
<a href="Documents/htb/1112.doc" title="test" target="_blank">
<a href="Documents/life/2011-12 HANDBOOK.pdf">
<a href="documents/science/oral06R2.pdf">
<a href="Documents/arts&amp;letters/F 11 FINAL.doc">
<a href="documents/b_office/Office%20Change%20Request.doc">
<a href="../../html/Documents/htb/211_diverse.xls">

Note: There are spaces in the names of the PDFs, and the links use both upper- and lower-case 'd' in "Documents".

External site (I don't need these, but they show up in my query)
<a href="">

Email link (I don't need these, but they show up in my query)
<a href="">

Link to a page within the site (I don't need these, but they show up in my query)
<a href="anotherpage.html">
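One way to narrow the link list down to the documents of interest is to filter on the file extensions listed above. This is only a sketch: the function name keep_document_links is made up for illustration, and it assumes the links have already been extracted one per line. It also drops anything that starts with a URL scheme (http:, mailto: and so on), on the assumption that those are the external and email links.

```shell
# keep_document_links (hypothetical name): reads one link per line on stdin
# and prints only the links that end in one of the listed document extensions.
keep_document_links() {
    # First grep: keep the wanted extensions, ignoring case (-i).
    # Second grep: drop links that begin with a URL scheme such as http: or
    # mailto: (optionally preceded by whitespace).
    grep -iE '\.(pdf|docx?|pptx?|xlsx?|jpg)$' |
        grep -viE '^[[:space:]]*[a-z][a-z0-9+.-]*:'
}

# Example: only the .pdf and .doc links should be printed.
printf '%s\n' \
    'Documents/PDF/handbook.pdf' \
    'anotherpage.html' \
    'documents/b_office/Office%20Change%20Request.doc' |
    keep_document_links
```

The output of the perl one-liner could be piped straight into this function.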

Ideally, I would like to create the directories and copy the files as well. E.g., if my list has Documents/PDF/document1.pdf, I want to copy it into my destination 'copy' folder as copy/Documents/PDF/document1.pdf

I am hoping to keep the directory hierarchy so I don't break any existing links in the html files.

Thank you so much for your help.
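The goal described in this post (copy a linked file into a 'copy' folder, keep the directory hierarchy, and tolerate links whose case does not match the file on disk) could be sketched roughly as follows. The function name copy_keep_tree and its three arguments are assumptions for illustration. It looks the link up case-insensitively among the real files and recreates the on-disk path under the destination, so it will misbehave on file names containing newlines.

```shell
# copy_keep_tree ROOT REL DEST   (hypothetical helper)
#   ROOT: directory that contains the Documents tree
#   REL:  link path relative to ROOT, e.g. documents/science/oral06R2.pdf
#   DEST: destination folder, e.g. copy
copy_keep_tree() {
    ckt_root=$1
    ckt_rel=$2
    ckt_dest=$3
    # List the real files relative to ROOT, then pick the one whose whole
    # path (-x) equals REL as a fixed string (-F), ignoring case (-i).
    ckt_match=$(cd "$ckt_root" && find . -type f | sed 's|^\./||' |
        grep -iFx -- "$ckt_rel" | head -n 1)
    if [ -z "$ckt_match" ]; then
        echo "not found: $ckt_rel" >&2
        return 1
    fi
    # Recreate the directory hierarchy under DEST, then copy the file.
    mkdir -p "$ckt_dest/$(dirname "$ckt_match")"
    cp "$ckt_root/$ckt_match" "$ckt_dest/$ckt_match"
}

# Example call (paths are placeholders):
# copy_keep_tree /web/html 'documents/science/oral06R2.pdf' copy
```

Because the copy uses the path as it exists on disk, a link written as documents/... ends up under copy/Documents/... if that is how the real directory is spelled.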
# 4  
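#! /bin/bash
# Split each input line on double quotes; the href value lands in $file.
while IFS='"' read -r a file c
do
    # Skip links that do not point into the Documents tree (any case).
    echo "$file" | grep -qi 'documents' || continue
    # Drop everything before "documents/" (the i flag needs GNU sed).
    file=$(echo "$file" | sed 's/.*\(documents\/.*\)/\1/i')
    # Recreate the directory hierarchy under copy/, then copy the file.
    mkdir -p "copy/$(dirname "$file")"
    cp "$file" "copy/$file"
done < inputfile.xml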

Last edited by balajesuri; 02-22-2012 at 11:49 PM..
# 5  
Thank you. It's creating the directories but not copying the files.

---------- Post updated at 11:55 AM ---------- Previous update was at 11:16 AM ----------

I guess it helps if I give the error message :)

cp: cannot stat `Documents/htb/able%20Pop%20Blocker.pdf': No such file or directory

My deduction is that the %20 should be a space. If I copied it manually, I would type something like
Documents/htb/able\ Pop\ Blocker.pdf

So how do I replace the %20 with \(space)?
# 6  
sed -i 's/%20/ /g' file.html
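If editing the html files in place is not wanted, the same substitution could instead be applied to each extracted link just before copying. The sketch below uses a made-up function name, decode_link, and only handles the two encodings that appear in the sample links in this thread (%20 and &amp;). It is not a general URL decoder.

```shell
# decode_link (hypothetical name): turn the two encodings seen in the sample
# links back into the characters used in the real file names.
decode_link() {
    # %20 -> space; &amp; -> & (the backslash keeps sed from treating & as
    # "the matched text" in the replacement).
    printf '%s\n' "$1" | sed -e 's/%20/ /g' -e 's/&amp;/\&/g'
}

decode_link 'documents/b_office/Office%20Change%20Request.doc'
decode_link 'Documents/arts&amp;letters/F 11 FINAL.doc'
```

Since the decoded name contains spaces, it needs to be quoted wherever it is used, for example cp "$file" "copy/$file".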

