SGMLS(3pm) User Contributed Perl Documentation SGMLS(3pm)
NAME
SGMLS - class for postprocessing the output from the sgmls and nsgmls parsers.
SYNOPSIS
use SGMLS;
my $parse = new SGMLS(STDIN);
my $event = $parse->next_event;
while ($event) {
    SWITCH: {
        ($event->type eq 'start_element') && do {
            my $element = $event->data; # An object of class SGMLS_Element
            [[your code for the beginning of an element]]
            last SWITCH;
        };
        ($event->type eq 'end_element') && do {
            my $element = $event->data; # An object of class SGMLS_Element
            [[your code for the end of an element]]
            last SWITCH;
        };
        ($event->type eq 'cdata') && do {
            my $cdata = $event->data; # A string
            [[your code for character data]]
            last SWITCH;
        };
        ($event->type eq 'sdata') && do {
            my $sdata = $event->data; # A string
            [[your code for system data]]
            last SWITCH;
        };
        ($event->type eq 're') && do {
            [[your code for a record end]]
            last SWITCH;
        };
        ($event->type eq 'pi') && do {
            my $pi = $event->data; # A string
            [[your code for a processing instruction]]
            last SWITCH;
        };
        ($event->type eq 'entity') && do {
            my $entity = $event->data; # An object of class SGMLS_Entity
            [[your code for an external entity]]
            last SWITCH;
        };
        ($event->type eq 'start_subdoc') && do {
            my $entity = $event->data; # An object of class SGMLS_Entity
            [[your code for the beginning of a subdoc entity]]
            last SWITCH;
        };
        ($event->type eq 'end_subdoc') && do {
            my $entity = $event->data; # An object of class SGMLS_Entity
            [[your code for the end of a subdoc entity]]
            last SWITCH;
        };
        ($event->type eq 'conforming') && do {
            [[your code for a conforming document]]
            last SWITCH;
        };
        die "Internal error: unknown event type " . $event->type . "\n";
    }
    $event = $parse->next_event;
}
DESCRIPTION
The SGMLS package consists of several related classes: see "SGMLS", "SGMLS_Event", "SGMLS_Element", "SGMLS_Attribute", "SGMLS_Notation",
and "SGMLS_Entity". All of these classes are available when you specify
use SGMLS;
Generally, the only object which you will create explicitly will belong to the "SGMLS" class; all of the others will then be created
automatically for you over the course of the parse. Much fuller documentation is available in the ".sgml" files in the "DOC/" directory of
the "SGMLS.pm" distribution.
The "SGMLS" class
This class holds a single parse. When you create an instance of it, you specify a file handle as an argument (if you are reading the
output of sgmls or nsgmls from a pipe, the file handle will ordinarily be "STDIN"):
my $parse = new SGMLS(STDIN);
The most important method for this class is "next_event", which reads and returns the next major event from the input stream. It is
important to note that the "SGMLS" class deals with most ESIS events itself: attributes and entity definitions, for example, are collected
and stored automatically and invisibly to the user. The following list contains all of the methods for the "SGMLS" class:
"next_event()": Return an "SGMLS_Event" object containing the next major event from the SGML parse.
"element()": Return an "SGMLS_Element" object containing the current element in the document.
"file()": Return a string containing the name of the current SGML source file (this will work only if the "-l" option was given to sgmls or
nsgmls).
"line()": Return a string containing the current line number from the source file (this will work only if the "-l" option was given to
sgmls or nsgmls).
"appinfo()": Return a string containing the "APPINFO" parameter (if any) from the SGML declaration.
"notation(NNAME)": Return an "SGMLS_Notation" object representing the notation named "NNAME". With newer versions of nsgmls, all notations
are available; otherwise, only the notations which are actually used will be available.
"entity(ENAME)": Return an "SGMLS_Entity" object representing the entity named "ENAME". With newer versions of nsgmls, all entities are
available; otherwise, only external data entities and internal entities used as attribute values will be available.
"ext()": Return a reference to an associative array for user-defined extensions.
The "SGMLS_Event" class
This class holds a single major event, as generated by the "next_event" method in the "SGMLS" class. It recognises the following methods:
"type()": Return a string describing the type of event: "start_element", "end_element", "cdata", "sdata", "re", "pi", "entity",
"start_subdoc", "end_subdoc", or "conforming". See "SYNOPSIS", above, for the values associated with each of these.
"data()": Return the data associated with the current event (if any). For "start_element" and "end_element", returns an "SGMLS_Element"
object; for "entity", "start_subdoc", and "end_subdoc", returns an "SGMLS_Entity" object; for "cdata", "sdata", and "pi", returns a string;
and for "re" and "conforming", returns the empty string. See "SYNOPSIS", above, for an example of this method's use.
"key()": Return a string key to the event, such as an element or entity name (otherwise, the same as "data()").
"file()": Return the current file name, as in the "SGMLS" class.
"line()": Return the current line number, as in the "SGMLS" class.
"element()": Return the current element, as in the "SGMLS" class.
"parse()": Return the "SGMLS" object which generated the event.
"entity(ENAME)": Look up an entity, as in the "SGMLS" class.
"notation(NNAME)": Look up a notation, as in the "SGMLS" class.
"ext()": Return a reference to an associative array for user-defined extensions.
The "SGMLS_Element" class
This class is used for elements, and contains all associated information (such as the element's attributes). It recognises the following
methods:
"name()": Return a string containing the name, or Generic Identifier, of the element, in upper case.
"parent()": Return the "SGMLS_Element" object for the element's parent (if any).
"parse()": Return the "SGMLS" object for the current parse.
"attributes()": Return a reference to an associative array of attribute names and "SGMLS_Attribute" structures. Attribute names will be
all in upper case.
"attribute_names()": Return an array of strings containing the names of all attributes defined for the current element, in upper case.
"attribute(ANAME)": Return the "SGMLS_Attribute" structure for the attribute "ANAME".
"set_attribute(ATTRIB)": Add the "SGMLS_Attribute" object "ATTRIB" to the current element, replacing any other attribute structure with the
same name.
"in(GI)": Return "true" (i.e. 1) if the string "GI" is the name of the current element's parent, or "false" (i.e. 0) if it is not.
"within(GI)": Return "true" (i.e. 1) if the string "GI" is the name of any of the ancestors of the current element, or "false" (i.e. 0) if
it is not.
"ext()": Return a reference to an associative array for user-defined extensions.
The "SGMLS_Attribute" class
Each instance of an attribute for each "SGMLS_Element" is an object belonging to this class, which recognises the following methods:
"name()": Return a string containing the name of the current attribute, all in upper case.
"type()": Return a string containing the type of the current attribute, all in upper case. Available types are "IMPLIED", "CDATA",
"NOTATION", "ENTITY", and "TOKEN".
"value()": Return the value of the current attribute, if any. This will be an empty string if the type is "IMPLIED", a string of some sort
if the type is "CDATA" or "TOKEN" (if it is "TOKEN", you may want to split the string into a series of separate tokens), an
"SGMLS_Notation" object if the type is "NOTATION", or an "SGMLS_Entity" object if the type is "ENTITY". Note that if the type is "CDATA",
the value will not have escape sequences for 8-bit characters, record ends, or SDATA processed -- that will be your responsibility.
"is_implied()": Return "true" (i.e. 1) if the value of the attribute is implied, or "false" (i.e. 0) if it is specified in the document.
"set_type(TYPE)": Change the type of the attribute to the string "TYPE" (which should be all in upper case). Available types are
"IMPLIED", "CDATA", "NOTATION", "ENTITY", and "TOKEN".
"set_value(VALUE)": Change the value of the attribute to "VALUE", which may be a string, an "SGMLS_Entity" object, or an "SGMLS_Notation"
object, depending on the attribute's type.
"ext()": Return a reference to an associative array available for user-defined extensions.
The "SGMLS_Notation" class
All declared notations appear as objects belonging to this class, which recognises the following methods:
"name()": Return a string containing the name of the notation.
"sysid()": Return a string containing the system identifier of the notation, if any.
"pubid()": Return a string containing the public identifier of the notation, if any.
"ext()": Return a reference to an associative array available for user-defined extensions.
The "SGMLS_Entity" class
All declared entities appear as objects belonging to this class, which recognises the following methods:
"name()": Return a string containing the name of the entity, in mixed case.
"type()": Return a string containing the type of the entity, in upper case. Available types are "CDATA", "SDATA", "NDATA" (external
entities only), "SUBDOC", "PI" (newer versions of nsgmls only), or "TEXT" (newer versions of nsgmls only).
"value()": Return a string containing the value of the entity, if it is internal.
"sysid()": Return a string containing the system identifier of the entity (if any), if it is external.
"pubid()": Return a string containing the public identifier of the entity (if any), if it is external.
"filenames()": Return an array of strings containing any file names generated from the identifiers, if the entity is external.
"notation()": Return the "SGMLS_Notation" object associated with the entity, if it is external.
"data_attributes()": Return a reference to an associative array of data attribute names (in upper case) and the associated
"SGMLS_Attribute" objects for the current entity.
"data_attribute_names()": Return an array of data attribute names (in upper case) for the current entity.
"data_attribute(ANAME)": Return the "SGMLS_Attribute" object for the data attribute named "ANAME" for the current entity.
"set_data_attribute(ATTRIB)": Add the "SGMLS_Attribute" object "ATTRIB" to the current entity, replacing any other data attribute with the
same name.
"ext()": Return a reference to an associative array for user-defined extensions.
AUTHOR AND COPYRIGHT
Copyright 1994 and 1995 by David Megginson, "dmeggins@aix1.uottawa.ca". Distributed under the terms of the GNU General Public License
(version 2, 1991) -- see the file "COPYING" which is included in the SGMLS.pm distribution.
SEE ALSO
SGMLS::Output and SGMLS::Refs.
perl v5.8.8 2004-02-22 SGMLS(3pm)