UNIX.com is getting crushed in google search these days
Post 303035012 by Neo on Friday 10th of May 2019 11:48:30 PM
Old 05-11-2019
Well, the many people I have discussed this with all seem to agree that Google is penalizing forums, in one way or another, through its algorithm changes.

I seriously doubt Google's algo development team wakes up in the morning and says "let's penalize forums", but I am confident they are making their algorithms more "AI-like" and we can only see the symptoms of what they are doing.

In the last week, I have directly seen perfectly good forum discussions marked by Google as "soft 404" and not indexed because of a single keyword like "error" in the title of the discussion. When I manually change the title and remove the keyword "error", it passes Google's algorithm with flying colors. I have seen this for other phrases like "not found" as well.

This does make sense if you look at it from a global AI perspective. AI is not intelligence, and neither are methods like Bayesian classifiers. It is collecting data globally, and I am sure many bad links on the net return responses that have "error" or "not found" in the text. So, speaking globally, Google's algorithm then classifies links with text like "error" or "not found" in the title or the meta data as "bad", or in their case "soft 404", and they do not index them.

So, if the website is a forum about dogs and cats, then that site probably does not have titles and meta data like "My Dog Has An Error" or "Please Help Me with My Cat Error". So, based on my years of working with such classifiers (we also ran a Bayesian anti-spam classifier here at unix.com for many years), it is easy to see how a classifier could penalize a technology forum dealing with software errors as a matter of course. These are simply false positives in Google's algorithm.
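A toy example may make the false-positive effect concrete. The sketch below is a minimal naive Bayes classifier with Laplace smoothing, trained on a handful of made-up page titles. The training data, the "bad"/"good" labels and the function names are all assumptions for illustration; nothing here is known about how Google's classifiers actually work.

```python
import math
from collections import Counter

# Hypothetical training data: titles of dead pages ("bad") and of
# healthy pages on a pet forum ("good"). Invented for illustration only.
TRAINING = [
    ("404 error page not found", "bad"),
    ("error the page was not found", "bad"),
    ("not found error", "bad"),
    ("best food for my dog", "good"),
    ("how to groom a cat", "good"),
    ("dog training tips", "good"),
]

def train(samples):
    """Count word occurrences per label and the number of titles per label."""
    word_counts = {"bad": Counter(), "good": Counter()}
    doc_counts = Counter()
    for title, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(title.lower().split())
    return word_counts, doc_counts

def classify(title, word_counts, doc_counts):
    """Return the label with the highest log-probability for this title."""
    vocab = set(word_counts["bad"]) | set(word_counts["good"])
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        # Start from the prior probability of the label.
        score = math.log(doc_counts[label] / total_docs)
        for word in title.lower().split():
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

word_counts, doc_counts = train(TRAINING)
print(classify("grep error in shell script", word_counts, doc_counts))
print(classify("grep in shell script", word_counts, doc_counts))
```

In this toy model the first title comes out as "bad" and the second as "good": the only word the model recognizes in a technology title is "error", so that single word decides the outcome. A title like "help with my dog error" still comes out as "good", because the pet vocabulary outweighs the one trigger word.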

The same is true of "thin content".

If someone asks a short question about grep and they get a short but accurate reply, and even if that reply is very helpful to everyone, Google's classifier cannot score that. Google will just score on the content "thinness".
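As a rough illustration, a "thinness" score that looks only at length might be as simple as the sketch below. The function name and the 50-word threshold are assumptions made up for this example, not anything Google has published.

```python
def is_thin(text, min_words=50):
    """Flag text as thin when it has fewer than min_words words.

    The threshold is a hypothetical value chosen for illustration.
    """
    return len(text.split()) < min_words

question = "How do I make grep ignore case?"
answer = "Use grep -i pattern file."

# A short but accurate exchange is flagged, however helpful it is.
print(is_thin(question + " " + answer))
```

A length-based score like this has no way to tell a short, correct answer from an empty page, which is the weakness described above.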

I spent most of the week looking at all the links on our site which Google has classified as "soft 404 errors", and in each case, either "thinness" or a keyword in the title or meta data like "error" or "not found" was the cause. In each case I confirmed it by double checking before and after I made the change.

So, to help with the soft 404s on posts and discussion threads, I added summary text to "similar threads". Now those pass Google's classifier and are currently being validated as "looking good".

As for all the titles and meta data with keywords like "error" and "not found", that is a huge problem, and of course we cannot change 12K threads and make the titles and meta data senseless just to pass Google's classifiers.
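To get a feel for the size of the problem, thread titles can be scanned for the trigger phrases. This is only a sketch: the sample titles are invented, and the pattern covers just the two phrases mentioned in this post.

```python
import re

# The two trigger phrases named in this post, matched as whole words.
TRIGGERS = re.compile(r"\b(error|not found)\b", re.IGNORECASE)

def flagged_titles(titles):
    """Return the titles that contain one of the trigger phrases."""
    return [title for title in titles if TRIGGERS.search(title)]

# Hypothetical thread titles, for illustration only.
titles = [
    "Syntax error in awk script",
    "Command not found when running cron job",
    "Sorting a file by column",
]

for title in flagged_titles(titles):
    print(title)
```

A list like this shows which threads are at risk, but it does not solve the real problem: the flagged titles are accurate descriptions of the questions being asked.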

Unfortunately, this is how the classifiers work. It is really a very poor design that classifies a technology forum's links as "soft 404" because they have "error" or "not found" in the meta data, but Google does not listen to me. In fact, since I left the US over 10 years ago and now live on the seacoast in Thailand, very few people listen to me like they used to. People are mostly jealous! LOL

Anyway, I digress.

The good news is that I have made a lot of changes this week and learned a lot. The bad news is I cannot say with any assurance that the changes I make will have any immediate effect. Five or ten years ago, I could see changes have an effect very quickly; but as many have pointed out to me recently, the network is orders of magnitude larger now than it was back then and it is growing at breakneck speed.
 
