Wget - how to ignore files in immediate directory?
Post 302906028 by vanessafan99 on Monday 16th of June 2014 05:46:04 PM
Thanks! That does seem to work, but your commands are more advanced than what I know. For example, I don't really understand exactly what the awk is doing, but it does seem to be getting the directories.

The first line saves the files into index.html, and the second one prints out a lot of ftp:// lines. How do I feed that into wget?

Thanks!
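
One way to close that gap: wget can read a URL list from a file with -i, or from standard input with -i -. A sketch (the awk program from the earlier reply is elided as '...' here; it just needs to print one ftp:// URL per line):

    # pipe the extracted ftp:// lines straight into wget
    $ awk '...' index.html | wget -i -

    # or collect them into a file first and point wget at it
    $ awk '...' index.html > urls.txt
    $ wget -i urls.txt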
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to ignore '.' files

I'm running Fedora Core 6 as an FTP server on a PowerMac G4... I'm trying to create a script to remove files older than 3 days... I'm able to find all data older than 3 days, but it also finds hidden files such as /home/ftp/goossens/.canna /home/ftp/goossens/.kde... (4 Replies)
Discussion started by: James_UK
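A minimal sketch of the usual fix (assuming the dot files sit directly under the FTP home directories): tell find to skip names that start with a dot.

    # delete regular files older than 3 days, leaving hidden files alone
    $ find /home/ftp -type f -mtime +3 ! -name '.*' -exec rm {} +

Note this still descends into hidden directories; add a -prune on '.*' directories if those should be skipped entirely.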

2. Solaris

How to ignore incomplete files

On Solaris, suppose there is a directory 'dir'. Log files of size approx 1MB are continuously being deposited here by scp command. I have a script that scans this dir every 5 mins and moves away the log files that have been deposited so far. How do I design my script so that I pick up *only*... (6 Replies)
Discussion started by: sentak
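A common approach, sketched here rather than taken from the thread, is to move only files that have sat unmodified for a few minutes, so in-flight scp transfers are left alone (note: -mmin is a GNU find extension; stock Solaris find lacks it):

    # move logs untouched for at least 5 minutes; newer ones may still be arriving
    $ find /path/to/dir -type f -mmin +5 -exec mv {} /path/to/archive/ \;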

3. UNIX for Advanced & Expert Users

Why is wget copying my directory tree with some files with "@"?

I'm using wget 1.11.4 on Cygwin 1.5.25. I'm trying to recursively download a directory tree, which is the root of a javadoc tree. This is approximately the command line I tried: wget -x -p -r http://<host>/.../apidoc When it finished, it seemed like it downloaded... (0 Replies)
Discussion started by: dkarr
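The @ characters are most likely wget's Windows-safe file naming: on Windows and Cygwin builds, --restrict-file-names defaults to windows mode, which writes @ in place of ? when saving URLs with query strings. A sketch of forcing the Unix behaviour:

    # keep ? and friends literally in the local file names
    $ wget --restrict-file-names=unix -x -p -r http://<host>/.../apidoc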

4. Shell Programming and Scripting

wget a directory structure question

Can you tell me how to download the directory tree just starting from "project1/" in this URL? "https://somesite.com/projects/t/project1/" This command does not seem to do what I want, as it also downloads files from the upper hierarchy: wget --no-check-certificate --http-user=user... (4 Replies)
Discussion started by: majormark
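The flag being asked for here is -np/--no-parent; a sketch built on the quoted command (the elided --http-user credentials are kept as '...'):

    # --no-parent keeps wget from ascending above .../project1/
    $ wget -r -np --no-check-certificate --http-user=user ... \
          https://somesite.com/projects/t/project1/

Adding -nH --cut-dirs=2 would additionally drop the projects/t/ layers from the local paths.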

5. Shell Programming and Scripting

Getting ls to ignore ~ and # files

Is there a way to customize ls to ignore files ending with ~ and #? (Those are Emacs backup and auto-save files.) I found the -B option, which only ignores the ~ files. (2 Replies)
Discussion started by: yaroslavvb
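With GNU ls, -B (--ignore-backups) hides the ~ files, and --ignore takes a shell glob for anything else; a sketch that also covers Emacs auto-save files, which look like #name#:

    # hide backup (~) and auto-save (#name#) files
    $ ls -B --ignore='#*#'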

6. Shell Programming and Scripting

Find: ignore directory completely

Hello, I know find can be prevented from recursing into directories with something like the following... find . -name .svn -prune -a -type d But how can I completely prevent directories of a certain name (.svn) from being displayed at all, the top level and the children? I really... (2 Replies)
Discussion started by: nwb123
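The standard idiom is to pair -prune with an explicit -print on the other branch, so the pruned names themselves are never printed:

    # neither descend into nor list any .svn directory
    $ find . -name .svn -prune -o -print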

7. Shell Programming and Scripting

Wget to ignore an IP address

Hello Unix Geeks, I am in a situation to use wget for crawling a site where the site resolves to 5 IP addresses. Out of the 5, 4 are accessible and 1 is unreachable because of a firewall problem. In this case, my wget gets stuck on that X.X.X.X and gives up. How can I ignore this IP and... (4 Replies)
Discussion started by: sathyaonnuix
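wget has no switch for skipping a single IP address; two common workarounds (a sketch, not from the thread's replies, with example.com standing in for the real site) are to cap how long wget waits so the dead address cannot stall the crawl, or to exclude the host by name if it is reachable under its own domain:

    # fail fast on the unresponsive address instead of hanging
    $ wget -r --timeout=10 --tries=2 http://example.com/

    # or skip a distinct hostname outright during recursion
    $ wget -r --exclude-domains bad.example.com http://example.com/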

8. Shell Programming and Scripting

Find command with ignore directory

Dear All, I am using the find command find /my_rep/*/RKYPROOF/*/*/WDM/HOME_INT/PWD_DATA -name rk*myguidelines*.pdf -print The problem I am facing here is that the directory after /my_rep/ could be mice001, mice002 or mice001_PO, mice002_PO, and I want to ignore the mice***_PO directories... (3 Replies)
Discussion started by: yadavricky
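A sketch using -path-style pruning to skip the *_PO trees while searching the rest (directory names per the post; adjust the glob as needed):

    # skip any mice*_PO directory, search everything else
    $ find /my_rep -type d -name 'mice*_PO' -prune -o \
          -name 'rk*myguidelines*.pdf' -print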

9. Shell Programming and Scripting

How to change wget download directory?

I have a cron that mirrors a site periodically: wget -r -nc --passive-ftp ftp://user:pass@123.456.789.0 I want to download this into a directory called /files, but when I do this, it always creates a new directory called "123.456.789.0" (the hostname) and puts everything into /files/123.456.789.0, but... (3 Replies)
Discussion started by: vanessafan99
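The two flags that fix this are -nH (--no-host-directories), which stops wget from creating the hostname directory, and -P (--directory-prefix), which picks the target; a sketch on top of the quoted cron command:

    # mirror into /files without the 123.456.789.0/ layer
    $ wget -r -nc --passive-ftp -nH -P /files ftp://user:pass@123.456.789.0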

10. UNIX for Advanced & Expert Users

AIX find ignore directory

I am using AIX. I would like to ignore the /u directory. I tried this, but it is not working. find / -type f -type d \( -path /u \) -prune -o -name '*rpm*' 2>/dev/null /u/appx/ls.rpm /u/arch/vim.rpm (4 Replies)
Discussion started by: cokedude
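Two things break the quoted command: -type f and -type d can never both match, and the keep branch has no -print, so find falls back to printing everything, pruned side included. A sketch of the usual form (keeping the poster's -path /u, which their find evidently supports):

    # prune /u entirely, print matching files from everywhere else
    $ find / -path /u -prune -o -type f -name '*rpm*' -print 2>/dev/null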
URIFIND(1p)             User Contributed Perl Documentation             URIFIND(1p)

NAME
    urifind - find URIs in a document and dump them to STDOUT.

SYNOPSIS
    $ urifind file

DESCRIPTION
    urifind is a simple script that finds URIs in one or more files (using
    "URI::Find"), and outputs them to STDOUT. That's it.

    To find all the URIs in file1, use:

        $ urifind file1

    To find the URIs in multiple files, simply list them as arguments:

        $ urifind file1 file2 file3

    urifind will read from "STDIN" if no files are given or if a filename
    of "-" is specified:

        $ wget http://www.boston.com/ -O - | urifind

    When multiple files are listed, urifind prefixes each found URI with
    the file from which it came:

        $ urifind file1 file2
        file1: http://www.boston.com/index.html
        file2: http://use.perl.org/

    This can be turned on for single files with the "-p" ("prefix") switch:

        $ urifind -p file3
        file1: http://fsck.com/rt/

    It can also be turned off for multiple files with the "-n" ("no
    prefix") switch:

        $ urifind -n file1 file2
        http://www.boston.com/index.html
        http://use.perl.org/

    By default, URIs will be displayed in the order found; to sort them
    ascii-betically, use the "-s" ("sort") option. To reverse sort them,
    use the "-r" ("reverse") flag ("-r" implies "-s").

        $ urifind -s file1 file2
        http://use.perl.org/
        http://www.boston.com/index.html
        mailto:webmaster@boston.com

        $ urifind -r file1 file2
        mailto:webmaster@boston.com
        http://www.boston.com/index.html
        http://use.perl.org/

    Finally, urifind supports limiting the returned URIs by scheme or by
    arbitrary pattern, using the "-S" option (for schemes) and the "-P"
    option. Both "-S" and "-P" can be specified multiple times:

        $ urifind -S mailto file1
        mailto:webmaster@boston.com

        $ urifind -S mailto -S http file1
        mailto:webmaster@boston.com
        http://www.boston.com/index.html

    "-P" takes an arbitrary Perl regex. It might need to be protected from
    the shell:

        $ urifind -P 's?html?' file1
        http://www.boston.com/index.html

        $ urifind -P '.org' -S http file4
        http://www.gnu.org/software/wget/wget.html

    Add a "-d" to have urifind dump the regexen generated from "-S" and
    "-P" to "STDERR". "-D" does the same but exits immediately:

        $ urifind -P '.org' -S http -D
        $scheme = '^(http):'
        @pats = ('^(http):', '.org')

    To remove duplicates from the results, use the "-u" ("unique") switch.

OPTION SUMMARY
    -s          Sort results.
    -r          Reverse sort results (implies -s).
    -u          Return unique results only.
    -n          Don't include filename in output.
    -p          Include filename in output (0 by default, but 1 if multiple
                files are included on the command line).
    -P $re      Print only lines matching regex '$re' (may be specified
                multiple times).
    -S $scheme  Only this scheme (may be specified multiple times).
    -h          Help summary.
    -v          Display version and exit.
    -d          Dump compiled regexes for "-S" and "-P" to "STDERR".
    -D          Same as "-d", but exit after dumping.

AUTHOR
    darren chamberlain <darren@cpan.org>

COPYRIGHT
    (C) 2003 darren chamberlain

    This library is free software; you may distribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    URI::Find

perl v5.14.2                        2012-04-08                        URIFIND(1p)
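
Tying this back to the question at the top of the page: a sketch (assuming urifind is installed from CPAN and index.html is the listing saved by the first wget command) that extracts only the ftp:// URIs and hands them to wget on stdin:

    # -S ftp limits output to ftp:// URIs; wget -i - reads the list from stdin
    $ urifind -S ftp index.html | wget -i -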