Full Discussion: Search Engine
Top Forums Web Development Search Engine Post 302486102 by pludi on Friday 7th of January 2011 03:22:33 AM
I'd say you need (at least) 3 components:
  1. A crawler that downloads pages, and follows links on those pages.
  2. An indexer that builds a list of words used on each page (maybe in relation to other words nearby), and saves that to a database.
  3. A front-end to query the database.
For the crawler you can use just about any language, since the main bottleneck is network speed rather than CPU. For the indexer I'd recommend either C/C++ (for speed) or a language geared towards text processing (like Perl). For the front-end you can again choose whatever language you're comfortable with.
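
To make the three components a bit more concrete, here is a minimal sketch in Python. It is only an illustration under several assumptions, not a recommended design: the "web" is a hypothetical in-memory dictionary (`FAKE_WEB`) so it runs without network access, the "database" is a plain in-memory inverted index, links are not resolved against a base URL, and the index records only which words appear on a page, not which words are nearby. All function and class names are made up for this example.

```python
import re
from collections import defaultdict, deque
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collects href targets and text content from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


def crawl(start_url, fetch, max_pages=100):
    """Component 1: breadth-first crawl.

    `fetch(url)` is a caller-supplied function (an assumption of this
    sketch) that returns the page's HTML, or None if it cannot be fetched.
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        pages[url] = " ".join(parser.text_parts)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Component 2: inverted index mapping word -> set of URLs."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Component 3: return URLs containing every query word (AND)."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical toy "web" so the sketch runs offline.
FAKE_WEB = {
    "a.html": '<p>Perl text processing</p><a href="b.html">next</a>',
    "b.html": '<p>C++ indexer speed</p><a href="a.html">back</a>',
}

pages = crawl("a.html", FAKE_WEB.get)
index = build_index(pages)
print(sorted(search(index, "indexer speed")))
```

For real pages you would swap `FAKE_WEB.get` for a function that downloads the URL (for example with `urllib.request`), resolve relative links with `urllib.parse.urljoin`, and write the index to an actual database instead of keeping it in memory.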
 

WWW::Wikipedia(3pm)					User Contributed Perl Documentation				       WWW::Wikipedia(3pm)

NAME
       WWW::Wikipedia - Automated interface to the Wikipedia

SYNOPSIS
         use WWW::Wikipedia;
         my $wiki = WWW::Wikipedia->new();

         ## search for 'perl'
         my $result = $wiki->search( 'perl' );

         ## if the entry has some text print it out
         if ( $result->text() ) {
             print $result->text();
         }

         ## list any related items we can look up
         print join( " ", $result->related() );

DESCRIPTION
       WWW::Wikipedia provides an automated interface to the Wikipedia
       <http://www.wikipedia.org>, which is a free, collaborative, online
       encyclopedia. This module allows you to search for a topic and return
       the resulting entry. It also gives you access to related topics which
       are also available via the Wikipedia for that entry.

INSTALLATION
       To install this module type the following:

           perl Makefile.PL
           make
           make test
           make install

METHODS
   new()
       The constructor. You can pass it a two letter language code, or
       nothing to let it default to 'en'.

           ## Default: English
           my $wiki = WWW::Wikipedia->new();

           ## use the French wiki instead
           my $wiki = WWW::Wikipedia->new( language => 'fr' );

       WWW::Wikipedia is a subclass of LWP::UserAgent. If you would like to
       have more control over the user agent (control timeouts, proxies ...)
       you have full access.

           ## set HTTP request timeout
           my $wiki = WWW::Wikipedia->new();
           $wiki->timeout( 2 );

       You can turn off the following of Wikipedia redirect directives by
       passing a false value to "follow_redirects".

   language()
       This allows you to get and set the language you want to use. Two
       letter language codes should be used. The default is 'en'.

           my $wiki = WWW::Wikipedia->new( language => 'es' );

           # Later on...
           $wiki->language( 'fr' );

   follow_redirects()
       By default, Wikipedia redirect directives are followed. Set this to
       false to turn that off.

   search()
       Performs the search and returns a WWW::Wikipedia::Entry object which
       you can query further. See WWW::Wikipedia::Entry docs for more info.

           $entry = $wiki->search( 'Perl' );
           print $entry->text();

       If there's a problem connecting to Wikipedia, "undef" will be returned
       and the error message will be stored in "error()".

   random()
       This method fetches a random Wikipedia page.

   error()
       This is a generic error accessor/mutator. You can retrieve any
       searching error messages here.

TODO
       o   Clean up results. Strip HTML.

       o   Watch the development of Special:Export XML formatting, eg:
           http://en.wikipedia.org/wiki/Special:Export/perl

SEE ALSO
       o   LWP::UserAgent

AUTHORS
       Ed Summers <ehs@pobox.com>

       Brian Cassidy <bricas@cpan.org>

COPYRIGHT AND LICENSE
       Copyright 2003-2011 by Ed Summers

       This library is free software; you can redistribute it and/or modify
       it under the same terms as Perl itself.

perl v5.10.1                      2011-04-05                WWW::Wikipedia(3pm)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.