Parse::MediaWikiDump::Revisions(3pm) User Contributed Perl Documentation Parse::MediaWikiDump::Revisions(3pm)
NAME
Parse::MediaWikiDump::Revisions - Object capable of processing dump files with multiple revisions per article
ABOUT
This object is used to access the metadata associated with a MediaWiki instance and provide an iterative interface for extracting the
individual article revisions out of the same. To guarantee that there is only a single revision per article use the
Parse::MediaWikiDump::Pages object.
SYNOPSIS
$pmwd = Parse::MediaWikiDump->new;
$revisions = $pmwd->revisions('pages-articles.xml');
$revisions = $pmwd->revisions(*FILEHANDLE);
#print the title and id of each article inside the dump file
while(defined($page = $revisions->next)) {
    print "title '", $page->title, "' id ", $page->id, "\n";
}
STATUS
This software is being RETIRED - MediaWiki::DumpFile is the official successor to Parse::MediaWikiDump and includes a compatibility library
called MediaWiki::DumpFile::Compat that is 100% API compatible and is a near-perfect stand-in for this module. It is faster in all instances
where it counts and is actively maintained. Any undocumented deviation of MediaWiki::DumpFile::Compat from Parse::MediaWikiDump is
considered a bug and will be fixed.
METHODS
$revisions->new
Open the specified MediaWiki dump file. If the single argument to this method is a string it will be used as the path to the file to
open. If the argument is a reference to a filehandle the contents will be read from the filehandle as specified.
$revisions->next
Returns an instance of the next available Parse::MediaWikiDump::page object or returns undef if there are no more articles left.
$revisions->version
Returns the dump file format revision number as a plain text string.
$revisions->sitename
Returns a plain text string that is the name of the MediaWiki instance.
$revisions->base
Returns the URL to the instance's main article in the form of a string.
$revisions->generator
Returns a string containing 'MediaWiki' and a version number of the instance that dumped this file. Example: 'MediaWiki 1.14alpha'
$revisions->case
Returns a string describing the case sensitivity configured in the instance.
$revisions->namespaces
Returns a reference to an array of references. Each reference is to another array with the first item being the unique identifier of
the namespace and the second element containing a string that is the name of the namespace.
$revisions->namespaces_names
Returns a reference to an array of strings, one element for each namespace name.
$revisions->current_byte
Returns the number of bytes that have been processed so far.
$revisions->size
Returns the total size of the dump file in bytes.
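The metadata methods above return plain Perl values, so they are easy to post-process. The following is a minimal sketch, assuming the return shapes described above: it builds an id-to-name lookup from the namespaces structure and computes a progress percentage from current_byte and size. The helper names namespace_lookup and percent_done and the sample values are hypothetical and are not part of the module; in real use the inputs would come from the $revisions object.

```perl
use strict;
use warnings;

# Hypothetical helper: turn [ [id, name], ... ] into { id => name }.
sub namespace_lookup {
    my ($pairs) = @_;
    my %name_for;
    for my $pair (@$pairs) {
        my ($id, $name) = @$pair;
        $name_for{$id} = $name;
    }
    return \%name_for;
}

# Hypothetical helper: percentage of the dump read so far.
# Returns 0 when the size is zero or undefined to avoid dividing by zero.
sub percent_done {
    my ($current_byte, $size) = @_;
    return 0 unless $size;
    return 100 * $current_byte / $size;
}

# Sample values standing in for $revisions->namespaces,
# $revisions->current_byte and $revisions->size.
my $namespaces = [ [ -1, 'Special' ], [ 0, '' ], [ 1, 'Talk' ], [ 2, 'User' ] ];

my $name_for = namespace_lookup($namespaces);
print "namespace 1 is '$name_for->{1}'\n";
printf "%.1f%% processed\n", percent_done(2_500, 10_000);
```

Calling percent_done inside the $revisions->next loop would give a simple progress indicator for large dump files.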
EXAMPLE
Extract the article text of each revision of an article using a given title
#!/usr/bin/perl
use strict;
use warnings;
use Parse::MediaWikiDump;

my $file = shift(@ARGV) or die "must specify a MediaWiki dump of the current pages";
my $title = shift(@ARGV) or die "must specify an article title";
my $pmwd = Parse::MediaWikiDump->new;
my $dump = $pmwd->revisions($file);
my $found = 0;

binmode(STDOUT, ':utf8');
binmode(STDERR, ':utf8');

#this is the only currently known value but there could be more in the future
if ($dump->case ne 'first-letter') {
    die "unable to handle any case setting besides 'first-letter'";
}

$title = case_fixer($title);

while(my $revision = $dump->next) {
    if ($revision->title eq $title) {
        print STDERR "Located text for $title revision ", $revision->revision_id, "\n";
        my $text = $revision->text;
        print $$text;
        $found = 1;
    }
}

print STDERR "Unable to find article text for $title\n" unless $found;

#exit with a failure status only when no matching revision was found
exit($found ? 0 : 1);

#removes any case sensitivity from the very first letter of the title
#but not from the optional namespace name
sub case_fixer {
    my $title = shift;

    #check for namespace
    if ($title =~ /^(.+?):(.+)/) {
        $title = $1 . ':' . ucfirst($2);
    } else {
        $title = ucfirst($title);
    }

    return $title;
}
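The case_fixer routine above treats any text before the first colon as a namespace, so a title that merely contains a colon does not get its first letter upper-cased. As a rough sketch of one way around this, the variant below only treats the prefix as a namespace when it appears in a list of known names, such as the one returned by namespaces_names. The name case_fixer_ns and the sample list are hypothetical and are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical variant of case_fixer: the text before the first colon is
# treated as a namespace only when it is one of the known namespace names.
sub case_fixer_ns {
    my ($title, $known) = @_;
    my %is_namespace = map { $_ => 1 } @$known;

    if ($title =~ /^(.+?):(.+)/ && $is_namespace{$1}) {
        return $1 . ':' . ucfirst($2);
    }

    return ucfirst($title);
}

# Sample names; in real use: my $known = $revisions->namespaces_names;
my $known = [ 'Talk', 'User', 'Template' ];

print case_fixer_ns('talk page',           $known), "\n";
print case_fixer_ns('User:example',        $known), "\n";
print case_fixer_ns('apollo 13: the film', $known), "\n";
```

The comparison against the known names is case-sensitive here, so a prefix typed as "user" would not be recognised; whether that is acceptable depends on how titles reach the script.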
LIMITATIONS
Version 0.4
This class was updated to support version 0.4 dump files from a MediaWiki instance but it does not currently support any of the new
information available in those files.
perl v5.10.1 2010-12-05 Parse::MediaWikiDump::Revisions(3pm)