httpindex(1) [debian man page]

httpindex(1)						      General Commands Manual						      httpindex(1)

NAME
httpindex - HTTP front-end for SWISH++ indexer

SYNOPSIS
wget [ options ] URL... 2>&1 | httpindex [ options ]

DESCRIPTION
httpindex is a front-end for index++(1) to index files copied from remote servers using wget(1). The files (in a copy of the remote directory structure) can be kept, deleted, or replaced with their descriptions after indexing.

OPTIONS
wget Options

The wget(1) options that are required are: -A, -nv, -r, and -x; the ones that are highly recommended are: -l, -nh, -t, and -w. (See the EXAMPLE.)

httpindex Options

httpindex accepts the same short options as index++(1) except for -H, -I, -l, -r, -S, and -V. The following options are unique to httpindex:

-d     Replace the text of local copies of retrieved files with their descriptions after they have been indexed. This is useful to display file descriptions in search results without having to have complete copies of the remote files, thus saving filesystem space. (See the extract_description() function in WWW(3) for details about how descriptions are extracted.)

-D     Delete the local copies of retrieved files after they have been indexed. This prevents your local filesystem from filling up with copies of remote files.

EXAMPLE
To index all HTML and text files on a remote web server, keeping descriptions locally:

    wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 | httpindex -d -e'html:*.html,text:*.txt'

Note that you need to redirect wget(1)'s output from standard error to standard output in order to pipe it to httpindex.

EXIT STATUS
Exits with a value of zero only if indexing completed successfully; non-zero otherwise.

CAVEATS
In addition to the caveats for index++(1), httpindex does not correctly handle the use of multiple -e, -E, -m, or -M options, because the Perl script processes command-line options with the standard Getopt::Std package, which does not support repeated options. The last of any of those options "wins." The work-around is to give a single one of those options multiple values separated by commas. For example, if you want to do:

    httpindex -e'html:*.html' -e'text:*.txt'

do this instead:

    httpindex -e'html:*.html,text:*.txt'

SEE ALSO
index++(1), wget(1), WWW(3)

AUTHOR
Paul J. Lucas <pauljlucas@mac.com>

SWISH++                              August 2, 2005                              httpindex(1)
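
The two shell idioms behind the EXAMPLE and CAVEATS sections of httpindex(1) can be sketched as follows. This is an illustration only: join_patterns and noisy_fetch are hypothetical helpers invented for this sketch (they are not part of SWISH++ or wget), and noisy_fetch merely stands in for a program that, like wget, reports on standard error.

```shell
#!/bin/sh
# Sketch only: join_patterns and noisy_fetch are hypothetical helpers,
# not part of SWISH++ or wget.

# Join all arguments with commas so they can be passed as ONE option value,
# because only the last of several repeated -e options would take effect.
join_patterns() {
    # "$*" joins the arguments with the first character of IFS;
    # the subshell keeps the IFS change local.
    (IFS=,; printf '%s\n' "$*")
}

# Stand-in for a fetcher that writes its report to standard error.
noisy_fetch() {
    printf '%s\n' 'fetched one file' >&2
}

patterns=$(join_patterns 'html:*.html' 'text:*.txt')
printf 'httpindex -e%s\n' "'$patterns'"

# A pipe carries standard output only. Without 2>&1 the reader sees nothing
# (standard error is silenced here just to keep the demonstration quiet).
without=$(noisy_fetch 2>/dev/null | wc -l | tr -d ' ')

# With 2>&1 the report is duplicated onto standard output and reaches the pipe.
with=$(noisy_fetch 2>&1 | wc -l | tr -d ' ')

printf 'lines piped without 2>&1: %s\n' "$without"
printf 'lines piped with 2>&1: %s\n' "$with"
```

The first printf shows a single -e option carrying both patterns, which is the form the CAVEATS section recommends. The two counts show that the reader at the end of the pipe receives the report only when standard error is redirected.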

W3M(1)							      General Commands Manual							    W3M(1)

NAME
w3m - a text based Web browser and pager

SYNOPSIS
w3m [options] [URL or filename]

Use "w3m -h" to display a complete list of current options.

DESCRIPTION
w3m is a World Wide Web (WWW) text based client. It has English and Japanese help files and an option menu and can be configured to use either language. It will display hypertext markup language (HTML) documents containing links to files residing on the local system, as well as files residing on remote systems. It can display HTML tables and frames. In addition, it can be used as a "pager" in much the same manner as "more" or "less". Current versions of w3m run on Unix (Solaris, SunOS, HP-UX, Linux, FreeBSD, and EWS4800) and on Microsoft Windows 9x/NT.

OPTIONS
At start up, w3m will load any local file or remote URL specified at the command line. For help with runtime options, press "H" while running w3m. Command line options are:

-t tab              set tab width
-r                  ignore backspace effect
-l line             # of preserved line (default 10000)
-s                  Shift_JIS
-j                  JIS
-e                  EUC-JP
-B                  load bookmark
-bookmark file      specify bookmark file
-T type             specify content-type
-m                  internet message mode
-v                  visual startup mode
-M                  monochrome display
-F                  automatically render frame
-dump               dump formatted page into stdout
-cols width         specify column width (used with -dump)
-ppc count          specify the number of pixels per character (default 8.0). Larger values will make tables narrower.
-dump_source        dump page source into stdout
-dump_head          dump response of HEAD request into stdout
-dump_both          dump HEAD and source into stdout
-dump_extra         dump HEAD, source, and extra information into stdout
-post file          use POST method with file content
-header string      insert string as a header
+<num>              goto <num> line
-num                show line number
-no-proxy           don't use proxy
-no-mouse           don't use mouse
-pauth user:pass    proxy authentication
-S                  squeeze multiple blank lines
-W                  toggle wrap search mode
-X                  don't use termcap init/deinit
-o opt=value        assign value to config option
-config file        specify config file
-debug              DO NOT USE

EXAMPLES
To use w3m as a pager:

    $ ls | w3m

To use w3m to translate HTML files:

    $ cat foo.html | w3m -T text/html

or

    $ cat foo.html | w3m -dump -T text/html >foo.txt

NOTES
This is the w3m 0.2.1 Release. Additional information about w3m may be found on its Japanese language Web site located at:

    http://w3m.sourceforge.net/index.ja.html

or on its English version of the site at:

    http://w3m.sourceforge.net/index.en.html

ACKNOWLEDGMENTS
w3m has incorporated code from several sources. Hans J. Boehm, Alan J. Demers, Xerox Corp. and Silicon Graphics hold the copyright on the GC library that comes with the w3m package. Users have contributed patches and suggestions over time.

AUTHOR
Akinori ITO <aito@fw.ipsj.or.jp>

4th Berkeley Distribution                              Local                              W3M(1)