05-11-2019
Well, the many people I have discussed this with all seem to agree that Google is penalizing forums, in one way or another, with its algorithm changes.
I seriously doubt Google's algo development team wakes up in the morning and says "let's penalize forums", but I am confident they are making their algorithms more "AI-like" and we can only see the symptoms of what they are doing.
In the last week, I have directly seen perfectly good forum discussions marked by Google as "soft 404" and not indexed because of a single keyword like "error" in the title of the discussion. When I manually change the title and remove the keyword "error", it passes Google's algorithm with flying colors. I have seen this for other phrases like "not found" as well.
This does make sense if you look at it from a global AI perspective. AI is not intelligence, and neither are methods like Bayesian classifiers. It is collecting data globally, and I am sure many bad links on the net return responses that have "error" or "not found" in the text. So, speaking globally, Google then classifies links with "error" or "not found" in the title or the metadata as "bad", or in their case "soft 404", and does not index them.
So, if the website is a forum about dogs and cats, then that site probably does not have titles and metadata like "My Dog Has An Error" or "Please Help Me with My Cat Error". Based on my years of working with such classifiers (we also ran a Bayesian anti-spam classifier here at unix.com for many years), it is easy to see how a classifier could penalize a technology forum that deals with software errors as a matter of course. These are simply false positives in Google's algorithm.
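To make the false-positive idea concrete, here is a minimal sketch of a naive Bayes title classifier in Python. The training titles, the labels "ok" and "soft404", and the function names are all invented for illustration. This is not Google's algorithm, only a toy of the general kind of classifier I am talking about.

```python
import math
from collections import Counter

# Toy training data (invented for illustration): page titles labelled by
# whether the page was a real document ("ok") or an error page ("soft404").
TRAINING = [
    ("page not found", "soft404"),
    ("error 404 not found", "soft404"),
    ("server error", "soft404"),
    ("how to train your dog", "ok"),
    ("best food for my cat", "ok"),
    ("using grep to search files", "ok"),
]


def train(samples):
    """Count words per label and the number of titles per label."""
    word_counts = {}
    label_counts = Counter()
    for title, label in samples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(title.lower().split())
    return word_counts, label_counts


def classify(title, word_counts, label_counts):
    """Return the most likely label using naive Bayes with add-one smoothing."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_titles = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Start from the log prior, then add one log likelihood per word.
        score = math.log(label_counts[label] / total_titles)
        for word in title.lower().split():
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    wc, lc = train(TRAINING)
    for title in ("grep search", "grep error"):
        print(title, "->", classify(title, wc, lc))
```

With this toy data, "grep search" is labelled "ok" while "grep error" is labelled "soft404", even though both are legitimate questions. The single word "error" is enough to tip the score, because the classifier has only ever seen that word on bad pages.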
The same is true of "thin content".
If someone asks a short question about grep and they get a short but accurate reply, and even if that reply is very helpful to everyone, Google's classifier cannot score that. Google will just score on the content "thinness".
I spent most of the week looking at all the links on our site which Google has classified as "soft 404 errors" and in each case, either "thinness" or a keyword in the title or metadata like "error" or "not found" was the cause. In each case I confirmed it by double checking before and after I made the change.
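For anyone who wants to check their own threads the same way, here is a rough sketch in Python of the two checks described above. The phrase list holds only the two keywords I actually observed, and the 50-word "thin" threshold is an arbitrary assumption of mine, not a number from Google. The function name `audit` is hypothetical as well.

```python
import re

# Assumed trigger phrases (the two reported in this post) and an arbitrary
# word-count threshold for "thin" pages; neither value comes from Google.
RISKY_PHRASES = ("error", "not found")
MIN_WORDS = 50


def audit(title, body):
    """Return a list of reasons a page might be at risk; empty if none."""
    reasons = []
    lowered = title.lower()
    for phrase in RISKY_PHRASES:
        # Match whole words only, so "terror" does not count as "error".
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            reasons.append(f"title contains '{phrase}'")
    if len(body.split()) < MIN_WORDS:
        reasons.append("thin content")
    return reasons


if __name__ == "__main__":
    print(audit("grep error on Solaris", "short reply"))
```

This only flags candidates for a manual look. Whether a flagged page is really the cause still has to be confirmed by changing it and revalidating in Google Search Console.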
So, to help with the soft 404s on posts and discussion threads, I added summary text to "similar threads". Now those pass Google's classifier and are currently being validated as "looking good".
As for all the titles and metadata with keywords like "error" and "not found", that is a huge problem, and of course we cannot change 12K threads and make the titles and metadata senseless just to pass Google's classifiers.
Unfortunately, this is how the classifiers work. It is really a very poor design that classifies a technology forum's links as "soft 404" because they have "error" and "not found" in the metadata, but Google does not listen to me. In fact, since I left the US over 10 years ago and live on the seacoast in Thailand, very few people listen to me like they used to. People are mostly jealous! LOL
Anyway, I digress.
The good news is that I have made a lot of changes this week and learned a lot. The bad news is I cannot say with any assurance that the changes I make will have any immediate effect. Five or ten years ago, I could see changes have an effect very quickly; but as many have pointed out to me recently, the network is orders of magnitude larger now than it was back then and it is growing at breakneck speed.