Xapian(3pm) User Contributed Perl Documentation Xapian(3pm)
NAME
Search::Xapian - Perl XS frontend to the Xapian C++ search library.
SYNOPSIS
use Search::Xapian;
my $db = Search::Xapian::Database->new( '[DATABASE DIR]' );
my $enq = $db->enquire( '[QUERY TERM]' );
printf "Running query '%s'\n", $enq->get_query()->get_description();
my @matches = $enq->matches(0, 10);
print scalar(@matches) . " results found\n";
foreach my $match ( @matches ) {
    my $doc = $match->get_document();
    printf "ID %d %d%% [ %s ]\n", $match->get_docid(), $match->get_percent(), $doc->get_data();
}
DESCRIPTION
This module wraps most methods of most Xapian classes. The missing classes and methods should be added in the future. It also provides a
simplified, more 'perlish' interface to some common operations, as demonstrated above.
There are some gaps in the POD documentation for wrapped classes, but you can read the Xapian C++ API documentation at
<http://xapian.org/docs/apidoc/html/annotated.html> for details of these. Alternatively, take a look at the code in the examples and
tests.
If you want to use Search::Xapian and the threads module together, make sure you're using Search::Xapian >= 1.0.4.0 and Perl >= 5.8.7. As
of 1.0.4.0, Search::Xapian uses CLONE_SKIP to make sure that the perl wrapper objects aren't copied to new threads - without this the
underlying C++ objects can get destroyed more than once.
If you encounter problems, or have any comments, suggestions, or patches, please email the Xapian-discuss mailing list (details of which
can be found at <http://xapian.org/lists>).
EXPORT
None by default.
:db
DB_OPEN
Open a database, fail if database doesn't exist.
DB_CREATE
Create a new database, fail if database exists.
DB_CREATE_OR_OPEN
Open an existing database, without destroying data, or create a new database if one doesn't already exist.
DB_CREATE_OR_OVERWRITE
Overwrite database if it exists.
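As a rough sketch of how these mode constants might be used, the following opens (or creates) a database in a temporary directory and
adds one document. The directory, term and document data are made up for illustration; a real application would pass its own path.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Search::Xapian qw(:db);

# Hypothetical location: a throwaway directory that is removed on exit.
my $dir = tempdir( CLEANUP => 1 );

# DB_CREATE_OR_OPEN opens the database if present, otherwise creates it.
my $db = Search::Xapian::WritableDatabase->new( $dir, DB_CREATE_OR_OPEN );

my $doc = Search::Xapian::Document->new();
$doc->set_data('hello world');    # illustrative payload
$doc->add_term('hello');          # illustrative index term
$db->add_document($doc);
$db->flush();                     # commit pending changes

my $doccount = $db->get_doccount();
print "Documents in database: $doccount\n";
```

Swapping in DB_CREATE would make the constructor fail if a database already exists at that path, and DB_OPEN would make it fail if one
does not.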
:ops
OP_AND
Match if both subqueries are satisfied.
OP_OR
Match if either subquery is satisfied.
OP_AND_NOT
Match if left but not right subquery is satisfied.
OP_XOR
Match if the left or the right subquery is satisfied, but not both.
OP_AND_MAYBE
Match if left is satisfied, but use weights from both.
OP_FILTER
Like OP_AND, but only weight using the left query.
OP_NEAR
Match if the words are near each other. The window size can be specified as a parameter to "Search::Xapian::Query::Query"; if it is
omitted, it defaults to the number of terms in the list.
OP_PHRASE
Match as a phrase (all words in order).
OP_ELITE_SET
Select an elite set from the subqueries, and perform a query with these combined as an OR query.
OP_VALUE_RANGE
Filter by a range test on a document value.
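A minimal sketch of combining subqueries with these operators, assuming the ':ops' tag has been imported. The terms 'apple', 'pear' and
'plum' are placeholders, and the exact text returned by get_description() may differ between Xapian versions.

```perl
use strict;
use warnings;
use Search::Xapian qw(:ops);

# Both terms must be present.
my $both = Search::Xapian::Query->new( OP_AND, 'apple', 'pear' );

# Queries nest: match (apple AND pear), but exclude documents containing 'plum'.
my $query = Search::Xapian::Query->new( OP_AND_NOT, $both,
    Search::Xapian::Query->new('plum') );

my $description = $query->get_description();
print "$description\n";
```

Each call above passes either only terms or only query objects after the operator, which keeps the argument list unambiguous.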
:qpflags
FLAG_DEFAULT
This gives the QueryParser default flag settings, allowing you to easily add flags to the default ones.
FLAG_BOOLEAN
Support AND, OR, etc and bracketed subexpressions.
FLAG_LOVEHATE
Support + and -.
FLAG_PHRASE
Support quoted phrases.
FLAG_BOOLEAN_ANY_CASE
Support AND, OR, etc even if they aren't in ALLCAPS.
FLAG_WILDCARD
Support right truncation (e.g. Xap*).
FLAG_PURE_NOT
Allow queries such as 'NOT apples'.
Such queries require the use of a list of all documents in the database, which is potentially expensive, so this feature isn't enabled
by default.
FLAG_PARTIAL
Enable partial matching.
Partial matching causes the parser to treat the query as a "partially entered" search. This will automatically treat the final word as
a wildcarded match, unless it is followed by whitespace, to produce more stable results from interactive searches.
FLAG_SPELLING_CORRECTION
FLAG_SYNONYM
FLAG_AUTO_SYNONYMS
FLAG_AUTO_MULTIWORD_SYNONYMS
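As an illustration, flags can be combined with a bitwise OR and passed as the second argument to parse_query(). The sketch below starts
from FLAG_DEFAULT and additionally accepts lower-case operators; the query string is invented, and no database is attached because none
of the flags used here needs one.

```perl
use strict;
use warnings;
use Search::Xapian qw(:qpflags);

my $qp = Search::Xapian::QueryParser->new();

# Keep the default behaviour and also recognise "and", "or", etc. in lower case.
my $flags = FLAG_DEFAULT() | FLAG_BOOLEAN_ANY_CASE();

my $query = $qp->parse_query( 'apple and pear', $flags );
my $description = $query->get_description();
print "$description\n";
```

Flags such as FLAG_WILDCARD or FLAG_PARTIAL expand terms against an index, so the parser would also need a database set before they can
take effect.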
:qpstem
STEM_ALL
Stem all terms.
STEM_NONE
Don't stem any terms.
STEM_SOME
Stem some terms, in a manner compatible with Omega (capitalised words and those in phrases aren't stemmed).
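A short sketch of choosing a stemming strategy, assuming an English stemmer is wanted. The query text is illustrative only, and the way
stemmed terms appear in the description depends on the Xapian version in use.

```perl
use strict;
use warnings;
use Search::Xapian qw(:qpstem);

my $qp = Search::Xapian::QueryParser->new();

# A stemmer must be set for the strategy to have any effect.
$qp->set_stemmer( Search::Xapian::Stem->new('english') );
$qp->set_stemming_strategy(STEM_SOME);

my $query = $qp->parse_query('searching documents');
my $description = $query->get_description();
print "$description\n";
```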
:enq_order
ENQ_ASCENDING
docids sort in ascending order (default)
ENQ_DESCENDING
docids sort in descending order
ENQ_DONT_CARE
docids sort in whatever order is most efficient for the backend
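The docid order only decides between documents that otherwise rank equally. The sketch below makes that visible by using a boolean
weighting scheme, so every match has the same weight; the temporary database, the term 'fruit' and the document data are all invented
for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Search::Xapian qw(:db :enq_order);

# Throwaway database holding three documents that share one term.
my $db = Search::Xapian::WritableDatabase->new( tempdir( CLEANUP => 1 ),
    DB_CREATE_OR_OVERWRITE );
for my $text ( 'first', 'second', 'third' ) {
    my $doc = Search::Xapian::Document->new();
    $doc->set_data($text);
    $doc->add_term('fruit');
    $db->add_document($doc);
}
$db->flush();

my $enq = $db->enquire('fruit');

# Give every match the same weight so that only the docid order matters.
$enq->set_weighting_scheme( Search::Xapian::BoolWeight->new() );
$enq->set_docid_order(ENQ_DESCENDING);

my @docids = map { $_->get_docid() } $enq->matches( 0, 10 );
print "docids: @docids\n";
```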
:standard
Standard is db + ops + qpflags + qpstem
Version functions
major_version
Returns the major version of the Xapian C++ library being used. E.g. for Xapian 1.0.9 this would return 1.
minor_version
Returns the minor version of the Xapian C++ library being used. E.g. for Xapian 1.0.9 this would return 0.
revision
Returns the revision of the Xapian C++ library being used. E.g. for Xapian 1.0.9 this would return 9. In a stable release series,
Xapian libraries with the same minor and major versions are usually ABI compatible, so this often won't match the third component of
$Search::Xapian::VERSION (which is the version of the Search::Xapian XS wrappers).
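For example, the library version and the wrapper version could be reported side by side as follows. This is only a sketch; the functions
are called with their fully qualified names so that nothing needs to be imported.

```perl
use strict;
use warnings;
use Search::Xapian;

# Version of the underlying Xapian C++ library.
my $lib_version = join '.',
    Search::Xapian::major_version(),
    Search::Xapian::minor_version(),
    Search::Xapian::revision();

print "Xapian library:     $lib_version\n";
# Version of the Perl XS wrappers, which may differ in its third component.
print "Search::Xapian:     $Search::Xapian::VERSION\n";
```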
Numeric encoding functions
sortable_serialise NUMBER
Convert a floating point number to a string, preserving sort order.
This method converts a floating point number to a string, suitable for using as a value for numeric range restriction, or for use as a
sort key.
The conversion is platform independent.
The conversion attempts to ensure that, for any pair of values supplied to the conversion algorithm, the result of comparing the
original values (with a numeric comparison operator) will be the same as the result of comparing the resulting values (with a string
comparison operator). On platforms which represent doubles with the precision specified by IEEE 754, this will be the case. If the
representation of doubles is more precise, it is possible that two very close doubles will be mapped to the same string, and so will
compare equal.
Note also that both zero and -zero will be converted to the same representation: since these compare equal, this satisfies the
comparison constraint, but it's worth knowing this if you wish to use the encoding in some situation where this distinction matters.
Handling of NaN isn't (currently) guaranteed to be sensible.
sortable_unserialise SERIALISED_NUMBER
Convert a string encoded using sortable_serialise back to a floating point number.
This expects the input to be a string produced by sortable_serialise(). If the input is not such a string, the value returned is
undefined (but no error will be thrown).
The result of the conversion will be exactly the value which was supplied to sortable_serialise() when making the string on platforms
which represent doubles with the precision specified by IEEE 754, but may be a different (nearby) value on other platforms.
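The ordering property described above can be checked with a small sketch: sorting some sample numbers numerically should give the same
order as sorting them by their serialised strings. The sample values are arbitrary, and the functions are called with their fully
qualified names.

```perl
use strict;
use warnings;
use Search::Xapian;

my @numbers = ( 10, -3.5, 0, 2, 1e6 );

# Numeric order of the original values.
my @by_number = sort { $a <=> $b } @numbers;

# String order of the serialised forms, mapped back to the original values.
my @by_string =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] }
    map  { [ $_, Search::Xapian::sortable_serialise($_) ] } @numbers;

print "numeric order: @by_number\n";
print "string order:  @by_string\n";

# Round trip of a value that a double can represent exactly.
my $round_trip = Search::Xapian::sortable_unserialise(
    Search::Xapian::sortable_serialise(-3.5) );
print "round trip of -3.5: $round_trip\n";
```

In practice the serialised string would typically be stored as a document value, so that range restrictions and sorting can compare the
values as strings.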
TODO
Error Handling
Error handling for all methods liable to generate them.
Documentation
Add POD documentation for all classes, where possible just adapted from Xapian docs.
Unwrapped classes
The following Xapian classes are not yet wrapped: Error (and subclasses), ErrorHandler, standard ExpandDecider subclasses (user-defined
ones work), user-defined weight classes.
We don't yet wrap Xapian::Query::MatchAll, Xapian::Query::MatchNothing, or Xapian::BAD_VALUENO.
Unwrapped methods
The following methods are not yet wrapped: Enquire::get_eset(...) with more than two arguments, Query ctor optional "parameter"
parameter, Remote::open(...), static Stem::get_available_languages().
We wrap MSet::swap() and MSet::operator[](), but not ESet::swap() or ESet::operator[](). Is swap actually useful? Should we instead tie
MSet and ESet to allow them to just be used as lists?
CREDITS
Thanks to Tye McQueen <tye@metronet.com> for explaining the finer points of how best to write XS frontends to C++ libraries, James Aylett
<james@tartarus.org> for clarifying the less obvious aspects of the Xapian API, Tim Brody for patches wrapping ::QueryParser and ::Stopper,
and especially Olly Betts <olly@survex.com> for contributing advice, bugfixes, and wrapper code for the more obscure classes.
AUTHOR
Alex Bowley <kilinrax@cpan.org>
Please report any bugs/suggestions to <xapian-discuss@lists.xapian.org> or use the Xapian bug tracker <http://xapian.org/bugs>. Please do
NOT use the CPAN bug tracker or mail any of the authors individually.
SEE ALSO
Search::Xapian::BM25Weight, Search::Xapian::BoolWeight, Search::Xapian::Database, Search::Xapian::Document, Search::Xapian::Enquire,
Search::Xapian::MultiValueSorter, Search::Xapian::PositionIterator, Search::Xapian::PostingIterator, Search::Xapian::QueryParser,
Search::Xapian::Stem, Search::Xapian::TermGenerator, Search::Xapian::TermIterator, Search::Xapian::TradWeight,
Search::Xapian::ValueIterator, Search::Xapian::Weight, Search::Xapian::WritableDatabase, and <http://xapian.org/>.
perl v5.14.2 2012-05-09 Xapian(3pm)