I went back and had a look at the Yahoo Search Content Quality Guidelines the other day after Michael Campbell mentioned them in his ezine. They tell us that Yahoo wants to index:

  1. Original and unique content of genuine value
  2. Pages designed primarily for humans, with search engine considerations secondary
  3. Hyperlinks intended to help people find interesting, related content, when applicable
  4. Metadata (including title and description) that accurately describes the contents of a web page
  5. Good web design in general

Immediate Thoughts:

No revelations in (1) & (2).
(3) Seems to shout “anchor text,” but probably simply refers to linking in general, since Yahoo has said that for years.
(4) Makes it plain that the title and meta description tag ARE important.
(5) Reinforces that search engines like HTML that validates, with structured sites full of good old plain links.
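
Point (4) is easy enough to check on your own pages. Here is a minimal sketch, using only Python's standard-library `html.parser`, that pulls out the title and meta description so you can compare them with what the page is really about. The names `HeadTagExtractor` and `extract_head_tags` and the sample page are my own illustrations, not anything Yahoo publishes.

```python
from html.parser import HTMLParser


class HeadTagExtractor(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr_map = dict(attrs)
            # Only the description meta tag is of interest here.
            if (attr_map.get("name") or "").lower() == "description":
                self.description = (attr_map.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def extract_head_tags(html):
    """Return (title, meta description) for an HTML string."""
    parser = HeadTagExtractor()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.description


# Hypothetical sample page, for illustration only.
sample_page = """<html><head>
<title>Yahoo Search Content Quality Guidelines Reviewed</title>
<meta name="description" content="What Yahoo says it wants in its index.">
</head><body><p>Post text here.</p></body></html>"""

title, description = extract_head_tags(sample_page)
print(title)
print(description)
```

If either value comes back empty, or reads like a keyword list instead of a description of the page, that is the sort of thing point (4) warns about.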

So what is search engine spam to Yahoo? Well, here’s what Yahoo says it doesn’t want in its index:

  1. Pages that harm accuracy, diversity or relevance of search results
  2. Pages dedicated to directing the user to another page
  3. Pages that have substantially the same content as other pages
  4. Sites with numerous, unnecessary virtual hostnames
  5. Pages in great quantity, automatically generated or of little value
  6. Pages using methods to artificially inflate search engine ranking
  7. The use of text that is hidden from the user
  8. Pages that give the search engine different content than what the end user sees
  9. Excessively cross-linking sites to inflate a site’s apparent popularity
  10. Pages built primarily for the search engines
  11. Misuse of competitor names
  12. Multiple sites offering the same content
  13. Sites that use excessive pop-ups, interfering with user navigation
  14. Pages that seem deceptive, fraudulent, or provide a poor user experience

Immediate Thoughts:

(1) Misleading spam pages with nothing to do with the terms they are optimised for

(2) Redirects, which are especially prevalent amongst affiliate marketers, and Doorway Pages

(3) Duplicate content in any form, whether scrapes, articles or datafeeds

(4) “Virtual hostnames” actually means “maintaining more than one server on one Machine, as differentiated by their apparent hostname,” so I’m unsure whether Yahoo has its language confused and is referring to sites using many subdomains when a directory would do fine, or to the use of several variations on a domain name all pointing back to the same folder. In either case, the point is about doing it simply to get a greater presence in the search engines.

(5) Obviously auto-generated content, including vanilla datafeed sites and unmodified RSS feeds

(6) Too many examples to list!

(7) Yes, people DO still do this, only now most use CSS. But search engines are making inroads into reading CSS too.

(8) Cloaking done to inflate rankings, with a specially optimised page served to the search engine crawlers and a normal page to regular visitors.

(9) Link networks, mini nets, blog farms, etc.

(10) Well, who else would you build them for if you want them read? Sorry! :P Over-optimised to the point where usability and value are impaired.

(11) Registering misspellings of well-known names, piggy-backing on competitor names to rank, attempting to sabotage competitor rankings, etc.

(12) Again, to gain greater exposure in the SERPS, but not much of an issue with today’s duplicate content filters.

(13) Clear enough.

(14) Catch-all for anything else.
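
Yahoo doesn't say how it decides that pages have "substantially the same content," as in points (3) and (12). One common textbook way to measure overlap, and I stress this is my assumption and not Yahoo's documented method, is to split each page into overlapping runs of words ("shingles") and compare the two sets with Jaccard similarity. The function names and sample sentences below are made up for illustration.

```python
def shingles(text, size=3):
    """Return the set of overlapping word runs of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


# Hypothetical snippets: an original, a near-copy, and an unrelated page.
original = "cheap blue widgets shipped worldwide from our warehouse today"
scraped = "cheap blue widgets shipped worldwide from our warehouse now"
unrelated = "a review of the yahoo search content quality guidelines"

print(jaccard_similarity(original, scraped))    # 0.75
print(jaccard_similarity(original, unrelated))  # 0.0
```

A real duplicate content filter is surely far more sophisticated, but the sketch shows why changing a word or two in a scraped article is unlikely to fool anyone.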

Unfortunately though (and I’m guessing Michael doesn’t realise this), the Yahoo Search Content Quality Guidelines have not changed much in years. In fact, the current page is almost identical to the Content Guidelines published by Inktomi (the search engine Yahoo bought years ago because it didn’t have its own) back in June 2002! (The link sends you to the original page as stored in the Internet Archive’s Wayback Machine.)

It makes you wonder how accurate the Yahoo Search Content Quality Guidelines really are …

Matt Cutts talks about a new user-interface experiment on Google that enables searchers to remove results from the SERPS they view in Personalized Search.

The idea is that you can block spam pages from showing up in your search results.

Under each listing, a new “remove result” link is shown adjacent to the “view similar pages” link. Clicking it removes the listing and presents a link to “more options.”

These give the searcher the ability to choose between removing the particular page for the current search, every search, or blocking the entire site from appearing in any searches performed by the user.
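
To make the three choices concrete, here is a rough sketch of how one user's removals might be stored and applied to a list of results. Google hasn't published how it does this, so the names `Scope`, `Removal` and `filter_results`, and the example URLs, are purely my own assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlparse


class Scope(Enum):
    THIS_SEARCH = "page, current search only"
    EVERY_SEARCH = "page, every search"
    ENTIRE_SITE = "whole site, every search"


@dataclass(frozen=True)
class Removal:
    scope: Scope
    url: str
    query: str = ""  # only meaningful for THIS_SEARCH


def is_hidden(url, query, removals):
    """True if any of the user's removals applies to this result."""
    host = urlparse(url).netloc
    for removal in removals:
        if removal.scope is Scope.ENTIRE_SITE:
            if urlparse(removal.url).netloc == host:
                return True
        elif removal.scope is Scope.EVERY_SEARCH:
            if removal.url == url:
                return True
        elif removal.url == url and removal.query == query:
            return True  # THIS_SEARCH: same page and same query
    return False


def filter_results(results, query, removals):
    """Drop every result the user has asked not to see for this query."""
    return [url for url in results if not is_hidden(url, query, removals)]


# Hypothetical removals made by one Personalized Search user.
removals = [
    Removal(Scope.THIS_SEARCH, "http://a.example/page1", "blue widgets"),
    Removal(Scope.EVERY_SEARCH, "http://b.example/spam"),
    Removal(Scope.ENTIRE_SITE, "http://c.example/anything"),
]
results = [
    "http://a.example/page1",
    "http://b.example/spam",
    "http://c.example/other",
    "http://d.example/ok",
]

print(filter_results(results, "blue widgets", removals))
print(filter_results(results, "red widgets", removals))
```

Note that the first page comes back for "red widgets," because it was only removed for the one search.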

Matt says it’s too early to say whether Google will use the data to improve general search or even if the feature will become permanent.

I don’t claim to be a search engine expert, but to my mind it seems likely that the data would eventually be used to help Google spot spam.

Obviously there would have to be some sort of threshold or “score” in place to prevent purely vindictive attempts by competitors to get Google to drop certain websites, but if 5,000 or 10,000 Personalized Search users all block a page (or whatever a realistic threshold needs to be), that would seem to be a pretty good indicator that something is wrong somewhere. Where there’s smoke there’s usually fire.

Initially Google might have these “threshold breakers” inspected by its human spam watchers, to see how reliable such user input is before any further action is taken against a page or website. But if they find that 99% of the sites are indeed spam, Google could easily move to automatic filtering and reduce its own human input to monitoring accuracy with random samples.
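
Sketching the idea in code makes the two safeguards clearer: count each user only once per page, so a single vindictive competitor can't push a site over the line, and then hand a random sample of the flagged pages to human reviewers. Everything here is conjecture on my part, including the function names, the tiny threshold and the example URLs.

```python
import random
from collections import defaultdict


def pages_over_threshold(block_events, threshold):
    """block_events is an iterable of (user_id, url) pairs.

    Returns the URLs blocked by at least `threshold` distinct users.
    """
    users_by_url = defaultdict(set)
    for user_id, url in block_events:
        users_by_url[url].add(user_id)  # repeat blocks by one user count once
    return sorted(
        url for url, users in users_by_url.items() if len(users) >= threshold
    )


def sample_for_review(flagged, fraction, seed=None):
    """Pick a random fraction of flagged pages (at least one) for human checks."""
    if not flagged:
        return []
    rng = random.Random(seed)
    k = min(len(flagged), max(1, round(len(flagged) * fraction)))
    return rng.sample(flagged, k)


# Hypothetical data: three different users block one page, while a single
# rival blocks an honest site fifty times over.
events = [
    ("u1", "http://spam.example/"),
    ("u2", "http://spam.example/"),
    ("u3", "http://spam.example/"),
] + [("rival", "http://honest.example/")] * 50

flagged = pages_over_threshold(events, threshold=3)
print(flagged)
print(sample_for_review(flagged, fraction=0.1, seed=1))
```

The rival's fifty blocks count as one, so only the page blocked by three separate users is flagged. A real threshold would obviously be in the thousands, as suggested above.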

Since it is also likely that some users will remove pages that are not necessarily spam, but simply not relevant to their search, Google may also use the data to improve its understanding of what searchers are actually looking for when they enter a particular search string.

I would imagine pages removed for this reason to be far fewer in number and relatively easy for Google to spot, because they will be tied to specific kinds of searches instead of popping up all over the place. They should also form a pattern of removals in common with similarly themed sites, and probably attract few entire-site removals. And whilst different searchers may be using the same words but looking for different things, objections to spam run across the board.
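
Here is how that pattern-spotting might look as a very crude rule of thumb. It looks at all the removals recorded for one page and asks two questions: are they concentrated on one search, and how many of them were entire-site removals? The function name and both cut-off values are hypothetical numbers I picked for the example, not anything Google has described.

```python
from collections import Counter


def guess_removal_reason(removals, min_top_query_share=0.5, max_site_share=0.2):
    """removals is a list of (query, whole_site_removed) pairs for one page.

    Returns "not relevant", "spam" or "unclear".
    """
    if not removals:
        return "unclear"
    total = len(removals)
    # Share of removals that came from the single most common query.
    query_counts = Counter(query for query, _ in removals)
    top_query_share = query_counts.most_common(1)[0][1] / total
    # Share of removals where the user blocked the entire site.
    site_share = sum(1 for _, whole_site in removals if whole_site) / total

    concentrated = top_query_share >= min_top_query_share
    few_site_removals = site_share <= max_site_share
    if concentrated and few_site_removals:
        return "not relevant"
    if not concentrated and not few_site_removals:
        return "spam"
    return "unclear"


# Hypothetical removal logs for two pages.
off_topic_page = [("jaguar car", False)] * 8 + [("jaguar speed", False)] * 2
spammy_page = [
    ("cheap pills", True),
    ("blue widgets", True),
    ("car insurance", False),
    ("free ringtones", True),
    ("hotel deals", False),
]

print(guess_removal_reason(off_topic_page))  # not relevant
print(guess_removal_reason(spammy_page))     # spam
```

The first page is only ever removed for searches about one thing, while the second is removed all over the place and mostly as a whole site. Mixed signals are left as "unclear," which is where I'd expect the human spam watchers to come in.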

Matt mentions that the final format may change, so we could even see an option to select either “spam” or “not relevant” as the reason for removing a page. Alternatively, Google may see the mass removal of entire sites as a good indicator of spam.

Of course, this is all conjecture on my part, and I could be entirely wrong.