Copy files into another directory


 
# 1  
Old 02-21-2012

I have a folder with a lot of documents (pdf, xls, doc, etc.) that users have uploaded, but only 20% of them are currently linked from my HTML files. My goal is to copy only the files that are linked from my HTML files out of my Documents directory into another directory.

Eg: My documents exist in /web/Documents (with sub-folders) and my html files exist in /web/html

My users were kind to me :) and used both absolute and relative links, meaning they wrote both <a href="Documents/***.doc"> and <a href="http://xxx.com/Documents/***.doc">

And of course, not everyone was consistent about upper and lower case when linking.

Can someone help me figure out how to accomplish this?

Thanks in advance.

---------- Post updated at 02:11 PM ---------- Previous update was at 12:22 PM ----------

I am able to get a huge list of all the links (regardless of whether they are within the domain or not) by using

Code:
perl -nle 'print " $&" if /(?<=href=")[^">]+/' *.html

This gives me a list of all the links in the HTML files in the folder.
eg:
Documents/....
mailto:
external website links
xxxx.html (other html documents within the domain)


How do I go further from here...
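
One possible next step, offered as a rough sketch rather than a tested solution: instead of listing every link, keep only the ones that point into the documents tree and cut each one down to a path that starts at Documents/. The function name doc_links is made up for illustration, and the sketch assumes a grep that supports -o and a sed that supports -E (GNU and BSD versions do).

```shell
#!/bin/bash
# Hypothetical helper: read HTML on stdin, print one document path per line.
doc_links() {
    # Print every href="..." attribute on its own line (a line may hold several).
    grep -oiE 'href="[^"]+"' |
    # Strip the leading href=" and the closing quote.
    sed -e 's/^[^"]*"//' -e 's/"$//' |
    # Keep only links into the documents tree, whatever the case of the "d".
    grep -iE '(^|/)documents/' |
    # Drop everything before "documents/" so absolute and ../ links look alike.
    sed -E 's#^.*/([Dd][Oo][Cc][Uu][Mm][Ee][Nn][Tt][Ss]/)#\1#' |
    # Remove duplicates.
    sort -u
}
```

It could be run as cat /web/html/*.html | doc_links > links.txt, where links.txt is just an example name. The entries are still raw href text, so things like %20 and &amp; are not yet translated.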
# 2  
Old 02-21-2012
Welcome to the forum.
1. Please post a few lines from the HTML file: lines containing both absolute and relative links (preferably covering all the cases that need to be parsed).
2. And please use code tags for code and data samples.
# 3  
Old 02-22-2012
Here are examples of the links in an HTML file. I have changed the page names and wording, but this should give you the gist of what they look like.

The possible document types are pdf, doc, docx, ppt, pptx, xls, xlsx and jpg. I want to be able to copy any of these files into a separate directory (retaining the folder structure).

Absolute linking
Code:
<a href="http://mywebsite.com/Documents/comm_ed/regform.pdf">

Relative linking
Code:
<a href="Documents/PDF/handbook.pdf" target="_blank">
<a href="Documents/htb/1112.doc" title="test" target="_blank">
<a href="Documents/life/2011-12 HANDBOOK.pdf">
<a href="documents/science/oral06R2.pdf">
<a href="Documents/arts&amp;letters/F 11 FINAL.doc">
<a href="documents/b_office/Office%20Change%20Request.doc">
<a href="../../html/Documents/htb/211_diverse.xls">


Note: There are spaces in the names of the files, and the links use both an upper-case and a lower-case 'd' in "Documents".
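
The mixed-case links matter because the file system is case-sensitive: a link to documents/... will not match a directory called Documents. One way to cope, sketched here with a made-up function name resolve_case, is to let find do a case-insensitive match. This assumes a find that supports -ipath, which GNU and BSD find do but POSIX does not require.

```shell
#!/bin/bash
# Hypothetical helper: print the real on-disk path that matches a link, ignoring case.
# Usage: resolve_case <root-directory> <relative-path-from-link>
resolve_case() {
    local root=$1 rel=$2
    # -ipath treats its argument as a glob pattern, so names containing
    # *, ? or [ would need escaping first (not handled in this sketch).
    find "$root" -type f -ipath "$root/$rel" | head -n 1
}
```

For example, resolve_case /web/html "documents/science/oral06R2.pdf" would print the path with whatever capitalisation exists on disk, or nothing if no file matches.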

External site (I don't really care about these, but they show up in my query)
Code:
<a href="http://yahoo.com">

Email link (I don't really care about these, but they show up in my query)
Code:
<a href="mailto:webmaster@mywebsite.com">

Link to another page within the site (I don't really care about these, but they show up in my query)
Code:
<a href="anotherpage.html">



Ideally, I would like to create the directories and copy the files as well. E.g., if my list has Documents/PDF/document1.pdf, I want to copy it to copy/Documents/PDF/document1.pdf under my destination 'copy' folder.

I am hoping to keep the directory hierarchy so that I don't break any existing links in the HTML files.
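
For what it's worth, the "create the directory, then copy" step on its own can be sketched like this. The function name copy_keep_tree and its three arguments are invented for the example. It quotes every variable so that names with spaces survive.

```shell
#!/bin/bash
# Hypothetical helper: copy one file and recreate its directory path under the destination.
# Usage: copy_keep_tree <source-root> <dest-root> <relative-path>
copy_keep_tree() {
    local src_root=$1 dest_root=$2 rel=$3
    # Create e.g. copy/Documents/PDF if it does not exist yet.
    mkdir -p "$dest_root/$(dirname "$rel")" &&
    # -p preserves mode and timestamps; quoting keeps names with spaces intact.
    cp -p "$src_root/$rel" "$dest_root/$rel"
}
```

Calling copy_keep_tree /web/html copy "Documents/PDF/document1.pdf" would then produce copy/Documents/PDF/document1.pdf, assuming the Documents folder is reachable under /web/html as the relative links suggest.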

Thank you so much for your help.
# 4  
Old 02-22-2012
Code:
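#! /bin/bash
# Split each input line on double quotes; the second field is the href value.
while IFS='"' read -r a file c
do
    # Skip links that do not mention the documents directory (case-insensitive).
    echo "$file" | grep -qi 'documents' || continue
    # Keep only the part from "documents/" onwards (the i flag is a GNU sed extension).
    file=$(echo "$file" | sed 's/.*\(documents\/.*\)/\1/i')
    # Recreate the directory hierarchy under copy/, then copy the file.
    mkdir -p "copy/$(dirname "$file")"
    cp "$file" "copy/$file"
done < inputfile.xml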


Last edited by balajesuri; 02-22-2012 at 11:49 PM..
# 5  
Old 02-24-2012
Thank you. It's creating the directories but not copying the files.

---------- Post updated at 11:55 AM ---------- Previous update was at 11:16 AM ----------

I guess it helps if I give the error message :)

cp: cannot stat `Documents/htb/able%20Pop%20Blocker.pdf': No such file or directory

My deduction is that the %20 should be a space. If I were copying the file manually, I would type something like
Documents/htb/able\ Pop\ Blocker.pdf

So how do I replace the %20 with an escaped space (backslash followed by a space)?
# 6  
Old 02-24-2012
Code:
sed -i 's/%20/ /g' file.html
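
An alternative sketch, if you would rather leave the HTML files untouched: decode each link just before using it as a file name. The function name decode_link is made up for this example. It relies on bash's built-in printf understanding \xHH escapes, and it only translates &amp; and percent-escapes. A link containing a literal backslash, or a % that is not followed by two hex digits, would not come out right.

```shell
#!/bin/bash
# Hypothetical helper: turn raw href text into the file name it refers to.
decode_link() {
    local s=$1
    # HTML entity first: "arts&amp;letters" is the directory "arts&letters".
    s=$(printf '%s' "$s" | sed 's/&amp;/\&/g')
    # Percent-decoding: rewrite %20 as \x20 and let printf %b expand the escape.
    printf '%b\n' "${s//%/\\x}"
}
```

With this, file=$(decode_link "$file") turns documents/b_office/Office%20Change%20Request.doc into a name with real spaces. As long as "$file" stays inside double quotes when passed to mkdir and cp, no backslash before the spaces is needed.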

 