

Search Engine in C


 
# 1  
Old 09-01-2009

Hello everybody,

I need help with this:
I need to design a CGI search engine in C, but I have no idea what to do or how to do it.

Do I have to open all the HTML files one by one and search for the given strings? I think this process will be slow and will take too much of the server's processing resources.

Please give me some examples, some source code I can study. All I have found are Perl scripts, a language which I don't like and don't understand.

Thanks!
# 2  
Old 09-01-2009
Obvious question first: Why re-invent the wheel? ht://Dig might do what you need.

Other than that, go by logic. If you have to search for something often, build a database of possible matches (an index) at regular intervals. That way you don't have to open all the files on every search.
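To make the "database of possible matches" idea concrete, here is a minimal sketch in C of an in-memory inverted index: each word maps to the list of documents that contain it. The names (`word_index`, `index_text`, `index_find`) and the fixed size limits are hypothetical and chosen only for illustration; this is not how ht://Dig stores its data, and a real index would live on disk and be rebuilt periodically.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TERMS 256   /* hypothetical limits, chosen only for this sketch */
#define MAX_WORD  32
#define MAX_DOCS  64

struct term {
    char word[MAX_WORD];
    int  docs[MAX_DOCS];   /* ids of the documents that contain the word */
    int  ndocs;
};

struct word_index {
    struct term terms[MAX_TERMS];
    int nterms;
};

/* Find a term by linear scan; returns NULL when the word is not indexed. */
struct term *index_find(struct word_index *idx, const char *word)
{
    for (int i = 0; i < idx->nterms; i++)
        if (strcmp(idx->terms[i].word, word) == 0)
            return &idx->terms[i];
    return NULL;
}

/* Record that document doc_id contains word.
 * Assumes documents are indexed one after another, so a repeated word
 * in the same document always matches the last stored id.
 * Returns 0 on success, -1 when a table is full. */
int index_add(struct word_index *idx, const char *word, int doc_id)
{
    struct term *t = index_find(idx, word);

    if (t == NULL) {
        if (idx->nterms == MAX_TERMS)
            return -1;
        t = &idx->terms[idx->nterms++];
        strncpy(t->word, word, MAX_WORD - 1);
        t->word[MAX_WORD - 1] = '\0';
        t->ndocs = 0;
    }
    if (t->ndocs > 0 && t->docs[t->ndocs - 1] == doc_id)
        return 0;               /* already recorded for this document */
    if (t->ndocs == MAX_DOCS)
        return -1;
    t->docs[t->ndocs++] = doc_id;
    return 0;
}

/* Split text into lowercase alphanumeric words and index each one.
 * Words longer than MAX_WORD - 1 characters are truncated. */
void index_text(struct word_index *idx, const char *text, int doc_id)
{
    char word[MAX_WORD];
    size_t len = 0;

    assert(idx != NULL && text != NULL);
    for (const char *p = text; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c)) {
            if (len < MAX_WORD - 1)
                word[len++] = (char)tolower(c);
        } else {
            if (len > 0) {          /* a word just ended */
                word[len] = '\0';
                index_add(idx, word, doc_id);
                len = 0;
            }
            if (c == '\0')
                break;
        }
    }
}
```

With something like this, the indexing pass calls `index_text()` once per file, and a search calls `index_find()` with the lowercased query word instead of reopening every file.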

And what do you not like about Perl? It was written for effective text processing, so it might just be the right tool for a job like this.
# 3  
Old 09-01-2009
Thank you very much for your reply,

I tried htdig, but it has a problem with the customized search/results pages: it doesn't load the ".css" file, and the page shows the results without the styles specified in the CSS file. That's why I wanted to create a search engine of my own.

Believe me, I'm not crazy. I've searched all over the internet and found no solutions. Do you know how to solve the .css problem?

And thanks A LOT again, man, I appreciate it.
# 4  
Old 09-01-2009
According to this, the configuration for the pages is pretty simple. Are you sure that the CSS file could be found?

But if that's your only problem, why start a new search engine? ht://Dig is open source, so you can modify it to your heart's desire. Or just use the code as a starting point for your own.
# 5  
Old 09-01-2009
Yes, the page is correctly configured. If I load it directly by its URL, it displays perfectly, but when it's loaded by htsearch.cgi, it doesn't show the CSS attributes...

I know it's kinda insane to "reinvent the wheel", but it's my desperate solution. I've tried everything with htdig: modified header.html, footer.html, wrapper.html and nomatch.html, changed the $(common_dir) variable, copied the .css header everywhere, everything!

I think it might be that htsearch.cgi doesn't recognize the HTML type="text/css" value, or something like that... I've set the link href="file.css" in hundreds of ways: tried the direct path "/srv/www/htdocs/htdig/file.css", the local path "file.css", etc. I copied it to the root directory, the cgi-bin directory, the htdig directory, pff...

And I can't use it as an example because it's programmed in C++, and I don't have a clue about it...

Thank you very much pludi.
# 6  
Old 09-02-2009
htsearch.cgi probably couldn't care less about any HTML tags, as they are meant to be interpreted by the browser. When you copied file.css around, did you try to access it directly from your browser? Did it load OK?

If you don't know C++, you can still use it as a starting point, as long as you can read it. You can at least get some ideas on how the search algorithm works and how the database is created/used.
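As one small illustration of how a search step can use such a database, here is a hedged sketch in C. It assumes (my assumption, not something taken from the ht://Dig source) that each word is stored with an ascending list of document ids, so that a two-word query can be answered by intersecting two lists. The function name `intersect_ids` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Intersect two ascending arrays of document ids.
 * Writes the ids present in both arrays to out, whose capacity must be
 * at least the smaller of na and nb, and returns how many were written. */
size_t intersect_ids(const int *a, size_t na,
                     const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, n = 0;

    assert(a != NULL && b != NULL && out != NULL);
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;                /* a[i] cannot be in b, skip it */
        } else if (a[i] > b[j]) {
            j++;                /* b[j] cannot be in a, skip it */
        } else {
            out[n++] = a[i];    /* id present in both lists */
            i++;
            j++;
        }
    }
    return n;
}
```

Because both lists are walked once, the cost grows with the length of the lists rather than with the number of files on the server.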
# 7  
Old 09-02-2009
If I were doing it, I would use wget or curl and a bit of logic.

Most search engines index the pages off-line, not in real-time as the pages are downloaded.
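An offline indexing pass over downloaded pages usually wants the visible text rather than the markup. The following is a minimal sketch in C of that preprocessing step, under the assumption that replacing every `<...>` tag with a space is good enough; it does not handle comments, scripts, or character entities. The function name `strip_tags` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Copy html into out, replacing every <...> tag with a single space.
 * Writes at most outsz bytes including the terminating '\0' and
 * returns the number of characters written (excluding the '\0').
 * Input that does not fit is silently truncated. */
size_t strip_tags(const char *html, char *out, size_t outsz)
{
    size_t n = 0;
    int in_tag = 0;

    assert(html != NULL && out != NULL && outsz > 0);
    for (const char *p = html; *p != '\0' && n + 1 < outsz; p++) {
        if (in_tag) {
            if (*p == '>') {        /* tag ends: emit a word separator */
                in_tag = 0;
                out[n++] = ' ';
            }
        } else if (*p == '<') {
            in_tag = 1;             /* tag starts: skip until '>' */
        } else {
            out[n++] = *p;          /* ordinary text, keep it */
        }
    }
    out[n] = '\0';
    return n;
}
```

The stripped text can then be split into words and stored in whatever index the search CGI reads at query time.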

It is not trivial to write text classification code that indexes well for search and retrieval. If it were that easy, Google would not be so successful and the competition would be much greater.
