05-11-2019
Join Date: Sep 2000
Last Activity: 15 July 2022, 8:51 AM EDT
Location: Asia Pacific, Cyberspace, in the Dark Dystopia
Posts: 19,118
Thanks Given: 2,351
Thanked 3,359 Times in 1,878 Posts
Well, the many people I have discussed this with all seem to agree that Google is penalizing forums, in one way or another, through their algorithm changes.
I seriously doubt Google's algorithm development team wakes up in the morning and says "let's penalize forums", but I am confident they are making their algorithms more "AI-like", and we can only see the symptoms of what they are doing.
In the last week, I have directly seen perfectly good forum discussions marked by Google as "soft 404" and not indexed because of a single keyword like "error" in the title of the discussion. When I manually change the title and remove the keyword "error", it passes Google's algorithm with flying colors. I have seen this for other phrases like "not found" as well.
This does make sense if you look at it from a global AI perspective. AI is not intelligence, and neither are methods like Bayesian classifiers. Such systems collect data globally, and I am sure many dead links on the net return responses with "error" or "not found" in the text. So, speaking globally, Google then classifies links whose title or metadata contain "error" or "not found" as "bad", or in their case "soft 404", and does not index them.
So, if a website is a forum about dogs and cats, that site probably does not have titles and metadata like "My Dog Has An Error" or "Please Help Me with My Cat Error". So, based on my years of working with such classifiers (we also ran a Bayesian anti-spam classifier here at unix.com for many years), it is easy to see how a classifier could penalize a technology forum that deals with software errors as a matter of course. These are simply false positives in Google's algorithm.
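To illustrate the false-positive mechanism, here is a minimal naive Bayes sketch in Python. The training titles, thresholds, and tie-breaking are entirely hypothetical toy data (Google's actual corpus and model are unknown); the point is only that a single loaded token like "error" can flip a legitimate technical title into the "bad" class.

```python
from collections import Counter
import math

# Hypothetical toy corpus, NOT Google's data:
# "bad" pages are dead links whose titles often say "error"/"not found";
# "good" pages are ordinary content titles.
bad_titles = [
    "404 error page not found",
    "error - page not found",
    "server error",
]
good_titles = [
    "my dog has fleas",
    "how to groom a cat",
    "best dog food",
]

def token_counts(titles):
    counts = Counter()
    for t in titles:
        counts.update(t.lower().split())
    return counts

bad_counts = token_counts(bad_titles)
good_counts = token_counts(good_titles)
bad_total = sum(bad_counts.values())
good_total = sum(good_counts.values())
vocab = set(bad_counts) | set(good_counts)

def log_likelihood(title, counts, total):
    # Laplace-smoothed per-token log probabilities.
    score = 0.0
    for tok in title.lower().split():
        score += math.log((counts[tok] + 1) / (total + len(vocab)))
    return score

def classify(title):
    # Equal class priors; ties break toward "good".
    bad = log_likelihood(title, bad_counts, bad_total)
    good = log_likelihood(title, good_counts, good_total)
    return "bad" if bad > good else "good"

# One token ("error") misclassifies a legitimate tech-forum thread.
print(classify("grep returns error on large files"))     # -> "bad" (false positive)
print(classify("grep returns exit code on large files"))  # -> "good"
```

With enough dead-link pages containing "error" in the training data, any title containing that word inherits the "bad" likelihood, regardless of the page's actual quality.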
The same is true of "thin content".
If someone asks a short question about grep and gets a short but accurate reply, then even if that reply is very helpful to everyone, Google's classifier cannot score the accuracy. Google will just score the content's "thinness".
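A length-only heuristic makes the problem obvious. This is a sketch under an assumed word-count cutoff (the threshold and the scoring rule are my invention, not Google's): a terse, correct answer and filler text look identical to it.

```python
# Hypothetical cutoff; real engines use far richer signals than word count.
THIN_WORD_THRESHOLD = 50

def is_thin(page_text: str) -> bool:
    # Length-only heuristic: cannot tell a terse correct
    # answer from low-value filler.
    return len(page_text.split()) < THIN_WORD_THRESHOLD

question = "How do I search recursively with grep?"
answer = "Use grep -r 'pattern' /path -- the -r flag recurses into subdirectories."

thread = question + " " + answer
print(is_thin(thread))  # -> True: helpful and accurate, yet flagged as thin
```

The reply fully answers the question, but any score derived only from length will mark the whole thread as thin content.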
I spent most of the week looking at all the links on our site which Google has classified as "soft 404 errors", and in each case either "thinness" or a keyword in the title or metadata like "error" or "not found" was the cause. In each case I confirmed it by double-checking before and after making the change.
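An audit like that can be approximated with a small script. This is a sketch, not the tool I used: the trigger list is just the two phrases discussed above, and the sample titles are made up for illustration.

```python
# Trigger phrases observed (in this discussion) to cause
# "soft 404" false positives when present in a title.
TRIGGERS = ("error", "not found")

def flagged(title: str) -> list:
    # Return the trigger phrases found in the title, if any.
    t = title.lower()
    return [kw for kw in TRIGGERS if kw in t]

titles = [
    "Segmentation fault error in my C program",
    "Command not found after editing PATH",
    "How to sort a file by the second column",
]

for title in titles:
    hits = flagged(title)
    status = f"suspect ({', '.join(hits)})" if hits else "ok"
    print(f"{status}: {title}")
```

Running it over an export of thread titles gives a quick list of candidates to check against Google's "soft 404" reports before and after a title change.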
So, to help with the soft 404s on posts and discussion threads, I added summary text to "similar threads". Now those pass Google's classifier and are currently being validated as "looking good".
As for all the titles and metadata with keywords like "error" and "not found", that is a huge problem; we obviously cannot change 12K threads and make the titles and metadata senseless just to pass Google's classifiers.
Unfortunately, this is how the classifiers work, and it is really a very poor design that classifies a technology forum whose links have "error" and "not found" in the metadata as "soft 404"; but Google does not listen to me. In fact, since I left the US over 10 years ago and moved to the seacoast in Thailand, very few people listen to me like they used to. People are mostly jealous! LOL
Anyway, I digress.
The good news is that I have made a lot of changes this week and learned a lot. The bad news is that I cannot say with any assurance that the changes I make will have any immediate effect. Five or ten years ago, I could see changes take effect very quickly; but as many have pointed out to me recently, the network is orders of magnitude larger now than it was back then, and it is growing at breakneck speed.