Filter(3pm)          User Contributed Perl Documentation          Filter(3pm)

NAME
       Bloom::Filter - Sample Perl Bloom filter implementation

DESCRIPTION
       A Bloom filter is a probabilistic algorithm for doing existence tests
       in less memory than a full list of keys would require. The tradeoff to
       using Bloom filters is a certain configurable risk of false positives.
       This module implements a simple Bloom filter with configurable
       capacity and false positive rate. Bloom filters were first described
       in a 1970 paper by Burton Bloom, see
       <http://portal.acm.org/citation.cfm?id=362692&dl=ACM&coll=portal>.

SYNOPSIS
           use Bloom::Filter;

           my $bf = Bloom::Filter->new( capacity => 10, error_rate => .001 );

           $bf->add( @keys );

           while ( <> ) {
               chomp;
               print "Found $_\n" if $bf->check( $_ );
           }

CONSTRUCTORS
       new %PARAMS
           Create a brand new instance. Allowable params are "error_rate",
           "capacity".

       init
           Calculates the best number of hash functions and optimum filter
           length, creates some random salts, and generates a blank bit
           vector. Called automatically by constructor.

ACCESSORS
       capacity
           Returns the total capacity of the Bloom filter

       error_rate
           Returns the configured maximum error rate

       length
           Returns the length of the Bloom filter in bits

       key_count
           Returns the number of items currently stored in the filter

       on_bits
           Returns the number of 'on' bits in the filter

       salts
           Returns the list of salts used to create the hash functions

PUBLIC METHODS
       add @KEYS
           Adds the list of keys to the filter. Will fail, return "undef" and
           complain if the number of keys in the filter exceeds the
           configured capacity.

       check @KEYS
           Checks the provided key list against the Bloom filter, and returns
           a list of equivalent length, with true or false values depending
           on whether there was a match.

INTERNAL METHODS
       _calculate_shortest_filter_length CAPACITY ERR_RATE
           Given a desired error rate and maximum capacity, returns the
           optimum combination of vector length (in bits) and number of hash
           functions to use in building the filter, where "optimum" means
           shortest vector length.

       _get_cells KEY
           Given a key, hashes it using the list of salts and returns an
           array of cell indexes corresponding to the key.

AUTHOR
       Maciej Ceglowski <maciej@ceglowski.com>

CHANGELOG
       Feb 2007 big speedup by Dmitriy Ryaboy <dmitriy.ryaboy@ask.com>
       (thanks!)

COPYRIGHT AND LICENSE
       (c) 2004 Maciej Ceglowski

       This is free software, distributed under version 2 of the GNU Public
       License (GPL).

perl v5.12.3                      2011-05-13                      Filter(3pm)
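
The sizing and cell lookup described under INTERNAL METHODS can be illustrated with a rough, standalone sketch. It does not use Bloom::Filter and is not taken from the module's source. The helper names "sketch_filter_size" and "sketch_cells", the fixed salts, and the use of Digest::SHA are all assumptions made for illustration. The sizing uses the common textbook approximation for the shortest bit vector, which may differ from what the module actually computes.

```perl
use strict;
use warnings;
use POSIX qw(ceil);
use Digest::SHA qw(sha1);

# Textbook sizing (an assumption, not necessarily the module's method):
#   bits   = ceil( -capacity * ln(error_rate) / (ln 2)^2 )
#   hashes = round( (bits / capacity) * ln 2 )
sub sketch_filter_size {
    my ( $capacity, $error_rate ) = @_;
    my $ln2    = log(2);
    my $bits   = ceil( -$capacity * log($error_rate) / ( $ln2 * $ln2 ) );
    my $hashes = int( ( $bits / $capacity ) * $ln2 + 0.5 );
    $hashes = 1 if $hashes < 1;
    return ( $bits, $hashes );
}

# Hypothetical cell lookup: hash the key once per salt and reduce the
# first 32 bits of each digest to an index into the bit vector.
sub sketch_cells {
    my ( $key, $bits, @salts ) = @_;
    return map { unpack( 'N', sha1( $_ . $key ) ) % $bits } @salts;
}

my ( $bits, $hashes ) = sketch_filter_size( 10, 0.001 );

# Fixed salts keep the sketch repeatable; the module creates random ones.
my @salts = map { "salt$_" } 1 .. $hashes;

# Blank bit vector: touching the last bit allocates the whole string.
my $vector = '';
vec( $vector, $bits - 1, 1 ) = 0;

# Adding a key switches on every one of its cells.
vec( $vector, $_, 1 ) = 1 for sketch_cells( 'apple', $bits, @salts );

# A key matches only if none of its cells is off.
my $found = !grep { !vec( $vector, $_, 1 ) }
    sketch_cells( 'apple', $bits, @salts );

printf "%d bits, %d hash functions, apple %s\n",
    $bits, $hashes, $found ? 'found' : 'not found';
```

With a capacity of 10 and an error rate of .001, as in the SYNOPSIS, this approximation gives a vector of 144 bits and 10 hash functions. A key that was never added can still be reported as found when all of its cells happen to be switched on by other keys; that is the false positive risk mentioned in the DESCRIPTION.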