html2stx(1) General Commands Manual html2stx(1)
NAME
html2stx - convert HTML documents into Stx
SYNOPSIS
html2stx [ file ]
DESCRIPTION
html2stx takes the given file, which should contain an HTML document, and converts it to structured text (Stx). If no file is given,
standard input is read instead.
The program does not attempt to convert every piece of markup that could be expressed in Stx. For example, <font> tags are simply ignored.
This tends to produce a clean, uncluttered document. (If it does not, the source document probably does not contain enough information
to start with.)
OPTIONS
None.
DIAGNOSTICS
html2stx is a Python script and will raise an exception if something goes wrong. In that case, the exit status will be non-zero.
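EXAMPLES
The following is a minimal sketch, not part of html2stx itself, of how a calling program might feed HTML to the converter on standard input
and check the exit status. The function name convert_html and its command parameter are hypothetical; the sketch assumes that html2stx is
installed on the search path and writes the converted document to standard output.

```python
import subprocess


def convert_html(html_text, command=("html2stx",)):
    """Pipe html_text to the converter's standard input and return its output.

    convert_html and its `command` parameter are illustrative names only.
    Raises RuntimeError if the command exits with a non-zero status.
    """
    result = subprocess.run(
        list(command),
        input=html_text,       # no file argument, so the converter reads stdin
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # An uncaught Python exception leaves its traceback on standard error.
        raise RuntimeError(
            "converter failed with status %d:\n%s"
            % (result.returncode, result.stderr)
        )
    return result.stdout
```

A call such as convert_html("<p>Hello</p>") would then return the converted text, or raise RuntimeError carrying the traceback if the
conversion failed.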
SEE ALSO
stx2any(1), Stx-ref.html
BUGS
o The word wrapping algorithm is probably not very clever.
o Sometimes there are extra linebreaks in the output.
o Probably many others.
AUTHOR
This manual page was written by Panu A. Kalliokoski.
html2stx is derived from the html2text utility by Aaron Swartz. html2text is a utility for converting HTML into "Markdown" structured
text; the changes required to make it work for Stx were made by Panu Kalliokoski.
Panu A. Kalliokoski html2stx(1)