Post 303042548 by Neo on Saturday 28th of December 2019 11:24:00 PM
Old 12-29-2019
Similar Threads for Man Pages - In Development

FYI,

I have been quietly updating the man page database, adding "similar threads" for man pages.

STEP 1: Full Text MySQL DB Search Matches

The first step, after creating the DB columns, was to process each of the nearly 400K man pages: run a full-text MySQL search, match and score it against each post in the DB, and keep the top 15 matched threadids (or fewer than 15, depending on the matches and scores).

That process took a few days and left around one third of the man page entries with similar thread entries (I forgot to record the exact stats at that point).
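
A rough sketch of what step 1 could look like, not the actual script. The table name `post` and the columns `threadid` and `pagetext` are assumptions for illustration, as is the helper `top_threads`; only the `MATCH ... AGAINST` full-text syntax is standard MySQL. The query scores posts, and the helper collapses post rows into the best distinct threads.

```python
# Hypothetical schema: table `post` with columns `threadid` and `pagetext`,
# plus a FULLTEXT index on `pagetext`. The %s placeholders are filled in
# by the MySQL driver with the man page text.
FULLTEXT_SQL = (
    "SELECT threadid, "
    "MATCH(pagetext) AGAINST (%s IN NATURAL LANGUAGE MODE) AS score "
    "FROM post "
    "WHERE MATCH(pagetext) AGAINST (%s IN NATURAL LANGUAGE MODE) "
    "ORDER BY score DESC"
)


def top_threads(rows, limit=15):
    """Collapse (threadid, score) post rows to the best-scoring distinct threads."""
    best = {}
    for threadid, score in rows:
        # Keep only the highest post score seen for each thread;
        # rows with a zero score are ignored.
        if score > best.get(threadid, 0.0):
            best[threadid] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [threadid for threadid, _ in ranked[:limit]]
```

Several posts from the same thread can match, so de-duplicating by thread before cutting to 15 avoids filling the list with one long thread.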

STEP 2: Cross Reference Similar Man Pages in Thread DB Back to Man Page Entries

Then, for the remaining man pages with no entries from step 1, I took the similarman entries for each thread (the similar man pages per thread, created a number of weeks ago) and did a simple boolean match on the man page ids they reference, building a list of thread matches for each man page ordered by the thread reply count in the DB. That process will complete today (in about 3 hours from now, give or take). Even so, a lot of man pages will remain with no matches from steps 1 and 2.
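
The cross-reference in step 2 can be pictured roughly like this. It is only a sketch under assumptions: the in-memory dictionaries stand in for the DB tables, and the function and argument names are made up for illustration.

```python
def threads_for_man_page(man_id, thread_similar_man, reply_counts, limit=15):
    """Find threads whose similar-man-page list contains man_id.

    thread_similar_man: dict of threadid -> set of man page ids (assumed shape)
    reply_counts:       dict of threadid -> reply count (assumed shape)
    """
    # Boolean match: the thread either lists this man page or it does not.
    matches = [
        threadid
        for threadid, man_ids in thread_similar_man.items()
        if man_id in man_ids
    ]
    # Most-replied threads first, then keep at most `limit` of them.
    matches.sort(key=lambda threadid: reply_counts.get(threadid, 0), reverse=True)
    return matches[:limit]
```

For example, with threads 10, 11 and 12 all listing man page 2 and reply counts of 4, 9 and 1, the result for man page 2 is `[11, 10, 12]`.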

STEP 3: Boolean Matches Man Page Name with Thread Tags

Then, I will take the remaining man pages without any similar threads and repeat the step 2 approach, this time matching the name of the man page (only the query, for example 'sshd') against the tags for each thread, ordering the matches by thread reply count and keeping up to 15 matches, as before.
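
One possible shape for the step 3 tag match, again only a sketch. The data structures and function names are hypothetical, and the case-insensitive comparison is an assumption, not something the plan above specifies. Building a tag index once avoids rescanning every thread for each man page name.

```python
from collections import defaultdict


def build_tag_index(thread_tags):
    """Map each lowercased tag to the threads carrying it.

    thread_tags: dict of threadid -> list of tag strings (assumed shape)
    """
    index = defaultdict(list)
    for threadid, tags in thread_tags.items():
        # A set avoids listing a thread twice if a tag repeats.
        for tag in {t.lower() for t in tags}:
            index[tag].append(threadid)
    return index


def threads_for_name(name, tag_index, reply_counts, limit=15):
    """Return up to `limit` threads tagged with the man page name."""
    candidates = tag_index.get(name.lower(), [])
    ranked = sorted(
        candidates, key=lambda threadid: reply_counts.get(threadid, 0), reverse=True
    )
    return ranked[:limit]
```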

After that, I will look at the man pages that still have no matching threads and decide what match to try next.

The purpose of all this is to create more relevant content for each man page in the DB by providing users with a list of discussion threads related to the man page; hence the name, "similar threads for man pages". In addition, this could help SEO, as Google is only including between 10% and 15% of our entire man page collection in its index. I would like to increase this percentage in 2020 to closer to 25% to 40%.

Currently, there are a few hours remaining for step 2:

Code:
1577593027 Time: 54 Inserts: 116 Floor: 6000 Limit: 300 ToDo: 64839 RemainingTime: 3.6 Hours QLoad: 1.06
1577593080 Time: 55 Inserts: 103 Floor: 6000 Limit: 300 ToDo: 64548 RemainingTime: 3.6 Hours QLoad: 1.17
1577593138 Time: 53 Inserts: 110 Floor: 6000 Limit: 300 ToDo: 64248 RemainingTime: 3.6 Hours QLoad: 1.27
1577593196 Time: 53 Inserts: 108 Floor: 6000 Limit: 300 ToDo: 63948 RemainingTime: 3.6 Hours QLoad: 1.23
1577593257 Time: 53 Inserts: 98 Floor: 6000 Limit: 300 ToDo: 63648 RemainingTime: 3.5 Hours QLoad: 1.04
1577593332 Time: 54 Inserts: 108 Floor: 6000 Limit: 300 ToDo: 63344 RemainingTime: 3.5 Hours QLoad: 1.01
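
For anyone wanting to read these progress lines programmatically, here is a small sketch. It assumes the first field is a Unix timestamp and that `ToDo` counts the entries still to process. The estimate below is just one plausible calculation from observed throughput; the formula the script itself uses for `RemainingTime` is not shown here.

```python
import re


def parse_progress(line):
    """Turn one progress line into a dict of numeric fields."""
    record = {"Timestamp": int(line.split()[0])}
    # Picks up pairs such as "ToDo: 64839" and "RemainingTime: 3.6".
    for key, value in re.findall(r"(\w+): ([\d.]+)", line):
        record[key] = float(value)
    return record


def estimate_hours(first, last):
    """Estimate hours left from the ToDo drop between two parsed lines."""
    elapsed = last["Timestamp"] - first["Timestamp"]
    done = first["ToDo"] - last["ToDo"]
    if elapsed <= 0 or done <= 0:
        return None  # not enough progress to estimate a rate
    rate = done / elapsed  # entries per second
    return last["ToDo"] / rate / 3600
```

Fed the first and last lines above, this estimate comes out at roughly 3.6 hours, in line with the logged values.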

After step 2 is done, I will start step 3 (but I will remember to record a few simple stats before I start step 3).

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Man pages

Hello , I just installed openssh in my system . I actually tried to man sshd but it says no entry , though there is a man directory in the installation which have the man pages for sshd . Can anyone tell me how should i install these man pages . DP (2 Replies)
Discussion started by: DPAI
2 Replies

2. UNIX for Dummies Questions & Answers

man pages

Hi, I've written now a man pages, but I don't knwo how to get 'man' to view them. Where have I to put this files, which directories are allowed?? THX Bensky (3 Replies)
Discussion started by: bensky
3 Replies

3. UNIX for Dummies Questions & Answers

man pages

Hi folks, I want to know all the commands for which man pages are available. How do i get it? Cheers, Nisha (4 Replies)
Discussion started by: Nisha
4 Replies

4. UNIX for Dummies Questions & Answers

man pages

When reading man pages, I notice that sometimes commands are follwed by a number enclosed in parenthesis. such as: mkdir calls the mkdir(2) system call. What exactly does this mean? (4 Replies)
Discussion started by: dangral
4 Replies

5. UNIX for Dummies Questions & Answers

how to read man pages

can anybody explain me how to read unix man pages? for example when i want to get information about ps command man ps gives me this output: *********************************** Reformatting page. Please wait... completed ps(1) ... (2 Replies)
Discussion started by: gfhgfnhhn
2 Replies

6. UNIX for Dummies Questions & Answers

Man pages on Solaris 10

Hi, I want to install man pages package from solaris 10. Solaris 10 has already been installed on my servor but I have to add the man pages packages. I search for a long time on internet this package but I didn't find a compatible one... So I downloaded Solaris 10 from Sun site to get this... (1 Reply)
Discussion started by: MasterapocA
1 Reply

7. Fedora

why do we have .1 extension in MAN PAGES?

Hello sir, I am using FEDORA 9. I wanted to know why do we have ".1" extension in the archives of man pages. I know we are giving format. I want to know the importance or purpose of this format. Can you please tell me :confused: (2 Replies)
Discussion started by: nsharath
2 Replies

8. Solaris

MAN PAGES

Hi everyone, I have a small query, in solaris the man pages get displayed on half of the terminal , can i get a full terminal or full screen display ?:) (2 Replies)
Discussion started by: M.Choudhury
2 Replies

9. HP-UX

Looking for some man pages.

Can anyone supply me with the man pages for: omnidatalist omnibarlist omnisap.exe I prefer the source man pages in nroff format. A clue about the software bundles which supply these man pages is fine as well. OS: HP-UX TIA (11 Replies)
Discussion started by: sb008
11 Replies

10. Shell Programming and Scripting

Commands for man pages

what command should i use for displaying the manual pages for the socket, read and connect system calls? (1 Reply)
Discussion started by: Nabeel Nazir
1 Reply
All times are GMT -4.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.