The Penguin Update: Google’s Webspam Algorithm Gets Official Name
Move over Panda, there’s a new Google update in town: Penguin. That’s the official name Google has given to the webspam algorithm that it released on Tuesday.
What’s An Update?
For those unfamiliar with Google updates, I’d recommend reading my Why Google Panda Is More A Ranking Factor Than Algorithm Update post from last year. It explains how Google has a variety of algorithms used to rank pages.
With Google Panda Update 2.2 upon us, it’s worth revisiting what exactly Panda is and isn’t. Panda is a new ranking factor. Panda is not an entirely new overall ranking algorithm that’s employed by Google. The difference is important for anyone hit by Panda and hoping to recover from it.
Google’s Ranking Algorithm & Updates
Let’s start with search engine optimization 101. After search engines collect pages from across the web, they need to sort through them in response to the searches that people do. Which are the best? To decide this, they employ a ranking algorithm. It’s like a recipe for cooking up the best results.
Like any recipe, the ranking algorithm contains many ingredients. Search engines look at words that appear on pages, how people are linking to pages, try to calculate the reputation of websites and more. Our Periodic Table Of SEO Ranking Factors explains more about this.
Google is constantly tweaking its ranking algorithm, making little changes that might not be noticed by many people. If the algorithm were a real recipe, this might be like adding in a pinch more salt, a bit more sugar or a teaspoon of some new flavoring. The algorithm is mostly the same, despite the little changes.
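If it helps to see the recipe idea in code, here’s a purely illustrative sketch (the factor names and weights are invented, not Google’s):

```python
# Toy ranking "recipe": each page has scores for a few ingredients,
# and the algorithm blends them with weights. All names and numbers
# here are invented for illustration.

WEIGHTS = {
    "keyword_relevance": 0.5,  # do the page's words match the query?
    "link_authority":    0.3,  # how are people linking to the page?
    "site_reputation":   0.2,  # overall trust in the website
}

def score(page_factors):
    """Blend per-factor scores (0-10) into a single ranking score."""
    return sum(WEIGHTS[name] * page_factors.get(name, 0) for name in WEIGHTS)

pages = {
    "page-a": {"keyword_relevance": 8, "link_authority": 3, "site_reputation": 5},
    "page-b": {"keyword_relevance": 6, "link_authority": 7, "site_reputation": 6},
}

# Sort pages best-first, like cooking up a results page.
print(sorted(pages, key=lambda p: score(pages[p]), reverse=True))
```

Nudge one weight slightly, a pinch more salt, and some results reshuffle even though the recipe is mostly the same.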
From time-to-time, Google does a massive overhaul of its ranking algorithm. These have been known as “updates” over the years. “Florida” was a famous one from 2003; the Vince Update hit in 2009; the Mayday Update happened last year.
Index & Algorithm Updates
Confusingly, the term “updates” also gets used for things that are not actual algorithm updates. Here’s some vintage Matt Cutts on this topic. For example, years ago Google used to do an “index update” every month or so, when it would suddenly dump millions of new pages it had found into its existing collection.
This influx of new content caused ranking changes that could take days to settle down, hence the nickname of the “Google Dance.” But the changes were caused by the algorithm sorting through all the new content, not because the algorithm itself had changed.
Of course, as I said, sometimes the core ranking algorithm itself is massively altered, almost like tossing out an old recipe and starting from scratch with a new one. These “algorithm updates” can produce massive ranking changes. But Panda, despite the big shifts it has caused, is not an algorithm update.
Instead, Panda — like PageRank — is a value that feeds into the overall Google algorithm. If it helps, consider it as if every site is given a “PandaRank” score. Those low in Panda come through OK; those high get hammered by the beast.
Calculating Ranking Factors
So where are we now? Google has a ranking algorithm, a recipe that assesses many factors to decide how pages should rank. Google can — and does — change some parts of this ranking algorithm and can see instant (though likely minor) effects by doing so. This is because it already has the values for some factors calculated and stored.
For example, let’s say Google decides to reward pages that have all the words someone has searched for appearing in close proximity to each other. It decides to give them a slightly higher boost than in the past. It can implement this algorithm tweak and see changes happen nearly instantly.
This is because Google has already gathered all the values relating to this particular factor. It has already stored the pages and made note of where each word is in proximity to other words. Google can turn the metaphorical proximity ranking factor dial up from, say, 5 to 6 effortlessly, because those factors have already been calculated as part of an ongoing process.
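As a rough sketch of why such a tweak takes effect instantly (again, with invented numbers), imagine the proximity score for every page already sits in the index; only the dial, the weight, changes:

```python
# Hypothetical illustration: proximity scores were computed at index
# time, so turning the proximity "dial" up requires no re-crawling.

stored_proximity = {"page-a": 0.9, "page-b": 0.1}  # precomputed at indexing
other_signals    = {"page-a": 5.0, "page-b": 9.5}  # everything else, combined

def total(page, proximity_weight):
    return other_signals[page] + proximity_weight * stored_proximity[page]

for weight in (5, 6):  # turning the dial from 5 up to 6
    ranked = sorted(stored_proximity, key=lambda p: total(p, weight), reverse=True)
    print(f"weight={weight}: {ranked}")  # the order flips between the two runs
```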
Automatic Versus Manual Calculations
Other factors require deeper calculations that aren’t done on an ongoing basis, what Google calls “manual” updates. This doesn’t mean that a human being at Google is somehow manually setting the value of these factors. It means that someone decides it’s time to run a specific computer program to update these factors, rather than it just happening all the time.
For example, a few years ago Google rolled out a “Google Bomb” fix. But then, new Google Bombs kept happening! What was up with that? Google explained that there was a special Google Bomb filter that would periodically be run, since it wasn’t needed all the time. When the filter ran, it would detect new Google Bombs and defuse those.
In recipe terms, it would be as if you were using a particular brand of chocolate chips in your cookies but then switched to a different brand. You’re still “inputting” chocolate chips, but these new chips make the cookies taste even better (or so you hope).
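To sketch the distinction in code (with hypothetical names; this is just the batch-versus-continuous pattern, not Google’s actual system):

```python
# Hypothetical contrast between an "always-on" factor and a "manual"
# one. A manual factor's stored value stays stale until someone
# decides it's time to run the batch job again.

factor_store = {"example.com": {"proximity": 0.7, "bomb_filter": 0.0}}

def on_recrawl(site, new_proximity):
    # Always-on factor: refreshed continuously as pages are crawled.
    factor_store[site]["proximity"] = new_proximity

def run_bomb_filter_batch(detected):
    # "Manual" factor: only updated when the special program is run.
    for site, penalty in detected.items():
        factor_store[site]["bomb_filter"] = penalty

on_recrawl("example.com", 0.8)                 # takes effect right away
# ...weeks may pass before anyone runs the filter again...
run_bomb_filter_batch({"example.com": -2.0})   # new Google Bombs defused now
```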
NOTE: In an earlier edition of this story, I’d talked about PageRank values being manually updated from time-to-time. Google’s actually said they are constantly being updated. Sorry about any confusion there.
The Panda Ranking Factor
Enter Panda. Rather than being a change to the overall ranking algorithm, Panda is more a new ranking factor that has been added into the algorithm (indeed, on our SEO Periodic Table, this would be element Vt, for Violation: Thin Content).
Panda is a filter that Google has designed to spot what it believes are low-quality pages. Have too many low-quality pages, and Panda effectively flags your entire site. Being Pandified, Pandification — whatever clever name you want to call it — doesn’t mean that your entire site is out of Google. But it does mean that pages within your site carry a penalty designed to help ensure only the better ones make it into Google’s top results.
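Purely as an illustration of that sitewide effect (the real scoring is unpublished; the names and numbers here are invented), the penalty might be thought of as a site-level score layered on top of ordinary page scoring:

```python
# Illustrative only: a site-wide "Panda" score, computed periodically,
# is folded into the score of every page on that site.

panda_score = {"thin-content-site.com": 0.8, "quality-site.com": 0.1}

def final_score(page_score, site):
    # A high Panda score drags down every page on the site,
    # without removing the site from the index entirely.
    return page_score * (1.0 - panda_score.get(site, 0.0))

print(f"{final_score(7.5, 'thin-content-site.com'):.2f}")  # 1.50 -- hammered
print(f"{final_score(7.5, 'quality-site.com'):.2f}")       # 6.75 -- mostly fine
```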
At our SMX Advanced conference earlier this month, the head of Google’s spam fighting team, Matt Cutts, explained that the Panda filter isn’t running all the time. Right now, it takes too much computing power to run this particular analysis of pages continuously.
Instead, Google runs the filter periodically to calculate the values it needs. Each new run so far has also coincided with changes to the filter, some big, some small, that Google hopes will improve its catching of poor-quality content. So far, the runs have come several weeks to a couple of months apart.
For anyone who was hit by Panda, it’s important to understand that the changes you’ve made won’t have any immediate impact.
For instance, if you started making improvements to your site the day after Panda 1.0 happened, none of those would have registered for getting you back into Google’s good graces until the next time Panda scores were assessed — which wasn’t until around April 11.
With the latest Panda round now live, Google says it’s possible some sites that were hit by past rounds might see improvements, if they themselves have improved.
The latest round also means that some sites previously not hit might now be impacted. If your site was among these, you’ve probably got a 4-6 week wait until any improvements you make might be assessed in the next round.
If you made changes to your site since the last Panda update, and you didn’t see improvements, that doesn’t necessarily mean you’re still doing something wrong. Pure speculation here, but part of the Panda filter might be watching to see if a site’s content quality looks to have improved over time. After enough time, the Panda penalty might be lifted.
In conclusion, some key points to remember:
Google makes small algorithm changes all the time, which can cause sites to fall (and rise) in rankings independently of Panda.
Google may update factors that feed into the overall algorithm, such as PageRank scores, on an irregular basis. Those updates can impact rankings independently of Panda.
So far, Google has confirmed when major Panda factor updates have been released. If you saw a traffic drop during one of these times, there’s a good chance you have a Panda-related problem.
Looking at rankings doesn’t paint an accurate picture of how well your site is performing on Google. Look at the overall traffic that Google has sent you. Losing what you believe to be a key ranking might not mean you’ve lost a huge amount of traffic. Indeed, you might discover that in general, you’re as good as ever with Google.
Google periodically changes these algorithms. When this happens, that’s known as an “update,” which in turn has an impact on the search results we get. Sometimes the updates have a big impact; sometimes they’re hardly noticed.
Who Names Updates?
Google also periodically creates new algorithms. When this happens, sometimes they’re given names by Google itself, as with the Vince update in 2009.
About a week ago, SEOs and webmasters began noticing a significant change in how Google returned results for a certain set of keywords. Many webmasters felt Google was giving “big brands” a push in the search results. Matt Cutts of Google then created a video that answered many questions about this “brand push.”
Let me first take you back to last week when on February 20th, a WebmasterWorld thread was created based on some SEOs noticing this change in Google. I then covered the thread at the Search Engine Roundtable on February 23rd, summarizing some of the discussion in the thread. Aaron Wall followed up that post on February 25th, with statistical data to show significant changes in the search results, pointing to evidence behind this brand push. Then we saw dozens of blog posts, discussion forum threads and tweets from SEOs and webmasters about Google changing its algorithm to give big brands a major push in the search results.
Matt Cutts addressed these concerns in a three-and-a-half-minute video, which I have embedded below. Matt said this change is not necessarily a Google “update,” but rather what he would call a “minor change.” In fact, he told us a Googler named Vince created this change, and they call it the “Vince change” at Google. He said it is not really about pushing brands to the front of the Google results; it is more about factoring trust into the algorithm for more generic queries. He said most searchers won’t notice, and it does not impact long tail queries. But for some queries, Google might now be weighing things like trust, quality, PageRank and other metrics that convey the importance and value of a page more heavily in the ranking algorithm. I guess big brands have earned more trust than smaller ones, which would explain all the recent chatter in our industry.
If Google doesn’t give a name, sometimes others such as Webmaster World may name them, as with the Mayday update in 2010.
Google made between 350 and 550 changes in its organic search algorithms in 2009. This is one of the reasons I recommend that site owners not get too fixated on specific ranking factors. If you tie construction of your site to any one perceived algorithm signal, you’re at the mercy of Google’s constant tweaks. These frequent changes are one reason Google itself downplays algorithm updates. Focus on what Google is trying to accomplish as it refines things (the most relevant, useful results possible for searchers) and you’ll generally avoid too much turbulence in your organic search traffic.
However, sometimes a Google algorithm change is substantial enough that even those who don’t spend a lot of time focusing on the algorithms notice it. That seems to be the case with what those discussing it at Webmaster World have named “Mayday”. Last week at Google I/O, I was on a panel with Googler Matt Cutts who said, when asked during Q&A, “this is an algorithmic change in Google, looking for higher quality sites to surface for long tail queries. It went through vigorous testing and isn’t going to be rolled back.”
I asked Google for more specifics and they told me that it was a rankings change, not a crawling or indexing change, which seems to imply that sites getting less traffic still have their pages indexed, but some of those pages are no longer ranking as highly as before. Based on Matt’s comment, this change impacts “long tail” traffic, which generally is from longer queries that few people search for individually, but in aggregate can provide a large percentage of traffic.
This change seems to have primarily impacted very large sites with “item” pages that don’t have many individual links into them, might be several clicks from the home page, and may not have substantial unique and value-added content on them. For instance, ecommerce sites often have this structure. The individual product pages are unlikely to attract external links and the majority of the content may be imported from a manufacturer database. Of course, as with any change that results in a traffic hit for some sites, other sites experience the opposite. Based on Matt’s comment at Google I/O, the pages that are now ranking well for these long tail queries are from “higher quality” sites (or perhaps are “higher quality” pages).
My complete speculation is that perhaps the relevance algorithms have been tweaked a bit. Before, pages that didn’t have high quality signals might still rank well if they had high relevance signals. And perhaps now, those high relevance signals don’t have as much weight in ranking if the page doesn’t have the right quality signals.
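Continuing that speculation in sketch form (the signal values and the gating rule are entirely invented), the idea is that relevance counts for less once quality drops below some bar:

```python
# Pure speculation, as above: a page's relevance signal is dampened
# when its quality signals are weak. All numbers are invented.

QUALITY_BAR = 0.5  # hypothetical threshold for "the right quality signals"

def mayday_score(relevance, quality):
    before = relevance                                   # old: relevance carries the page
    after = relevance * min(1.0, quality / QUALITY_BAR)  # speculated: quality-gated
    return before, after

b, a = mayday_score(relevance=0.9, quality=0.2)
print(f"thin item page: {b:.2f} -> {a:.2f}")  # 0.90 -> 0.36, drops in the long tail
b, a = mayday_score(relevance=0.9, quality=0.7)
print(f"quality page:   {b:.2f} -> {a:.2f}")  # 0.90 -> 0.90, unaffected
```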
What’s a site owner to do? It can be difficult to create compelling content and attract links to these types of pages. My best suggestion to those who have been hit by this is to isolate a set of queries for which the site now is getting less traffic and check out the search results to see what pages are ranking instead. What qualities do they have that make them seem valuable? For instance, I have no way of knowing how amazon.com has fared during this update, but they’ve done a fairly good job of making individual item pages with duplicated content from manufacturers’ databases unique and compelling by adding content such as user reviews. They have set up a fairly robust internal linking (and anchor text) structure with things like recommended items and lists. And they attract external links with features such as the my favorites widget.
From the discussion at the Google I/O session, this is likely a long-term change so if your site has been impacted by it, you’ll likely want to do some creative thinking around how you can make these types of pages more valuable (which should increase user engagement and conversion as well).
Update on 5/30/10: Matt Cutts from Google has posted a YouTube video about the change. In it, he says “it’s an algorithmic change that changes how we assess which sites are the best match for long tail queries.” He recommends that a site owner who is impacted evaluate the quality of the site and, if the site really is the most relevant match for the impacted queries, what “great content” could be added; determine if the site is considered an “authority”; and ensure that the page does more than simply match the keywords in the query and is relevant and useful for that query.
He notes that the change:
has nothing to do with the “Caffeine” update (an infrastructure change that is not yet fully rolled out).
is entirely algorithmic (and isn’t, for instance, a manual flag on individual sites).
impacts long tail queries more than other types.
was fully tested and is not temporary.
With Penguin, history is repeating itself, where Google is belatedly granting a name to an update after-the-fact. The same thing happened with Panda last year.
When the Panda Update was first launched in February 2011, Google didn’t initially release the name it was using internally.
In January, Google promised that it would take action against content farms that were gaining top listings with “shallow” or “low-quality” content. Now the company is delivering, announcing a change to its ranking algorithm designed to take out such material.
New Change Impacts 12% Of US Results
The new algorithm — Google’s “recipe” for how to rank web pages — started going live yesterday, the company told me in an interview today.
Google changes its algorithm on a regular basis, but most changes are so subtle that few notice. This is different. Google says the change impacts 12% (11.8% is the unrounded figure) of its search results in the US, a far higher impact than most of its algorithm changes. For now, the change only impacts results in the US; it may be rolled out worldwide in the future.
While Google has come under intense pressure in the past month to act against content farms, the company told me that this change has been in the works since last January.
Officially, Not Aimed At Content Farms
Officially, Google isn’t saying the algorithm change is targeting content farms. The company specifically declined to confirm that, when I asked. However, Matt Cutts — who heads Google’s spam fighting team — told me, “I think people will get the idea of the types of sites we’re talking about.”
Well, there are two types of sites “people” have been talking about in a way that Google has noticed: “scraper” sites and “content farms.” It mentioned both of them in a January 21 blog post:
We’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content. We’ll continue to explore ways to reduce spam, including new ways for users to give more explicit feedback about spammy and low-quality sites.
As “pure webspam” has decreased over time, attention has shifted instead to “content farms,” which are sites with shallow or low-quality content.
The key sections there, on copied content and on shallow or low-quality content, are what I’ll explore more next.
The “Scraper Update”
About a week after Google’s post, Cutts confirmed that an algorithm change targeting “scraper” sites had gone live:
This was a pretty targeted launch: slightly over 2% of queries change in some way, but less than half a percent of search results change enough that someone might really notice. The net effect is that searchers are more likely to see the sites that wrote the original content rather than a site that scraped or copied the original site’s content.
“Scraper” sites are those widely defined as not having original content but instead pulling content in from other sources. Some do this through legitimate means, such as using RSS files with permission. Others may aggregate small amounts of content under fair use guidelines. Some simply “scrape” or copy content from other sites using automated means — hence the “scraper” nickname.
In short, Google said in January that it was going after sites with low levels of original content, and it delivered a week later.
By the way, sometimes Google names big algorithm changes, such as in the case of the Vince update. Often, they get named by WebmasterWorld, where a community of marketers watches such changes closely, as happened with last year’s Mayday Update.
In the case of the scraper update, no one gave it any type of name that stuck. So, I’m naming it myself the “Scraper Update,” to help distinguish it from the “Farmer Update” that Google announced today.
But “Farmer Update” Really Does Target Content Farms
“Farmer Update?” Again, that’s a name I’m giving this change, so there’s a shorthand way to talk about it. Google declined to give it a public name, nor do I see one in the WebmasterWorld thread where members started noticing the algorithm change as it rolled out yesterday, before Google’s official announcement.
Postscript: Internally, Google told me this was called the “Panda” update, but they didn’t want that on-the-record when I wrote this original story. About a week later, they revealed the internal name in a Wired interview. “Farmer” is used through the rest of this story, though the headline has been changed to “Panda” to help reduce future confusion.
How can I say the Farmer Update targets content farms when Google specifically declined to confirm that? I’m reading between the lines. Google previously had said it was going after them.
Since Google originally named content farms as something it would target, some of the companies that get labeled with that term have pushed back, saying they are no such thing. Most notable has been Demand Media CEO Richard Rosenblatt, who previously told AllThingsD about Google’s planned algorithm changes to target content farms:
It’s not directed at us in any way.
I understand how that could confuse some people, because of that stupid “content farm” label, which we got tagged with. I don’t know who ever invented it, and who tagged us with it, but that’s not us…We keep getting tagged with “content farm”. It’s just insulting to our writers. We don’t want our writers to feel like they’re part of a “content farm.”
I guess it all comes down to what your definition of a “content farm” is. From Google’s earlier blog post, content farms are places with “shallow or low quality content.”
In that regard, Rosenblatt is right that Demand Media properties like eHow are not necessarily content farms, because they do have some deep and high quality content. However, they clearly also have some shallow and low quality content.
That content is what the algorithm change is going after. Google wouldn’t confirm it was targeting content farms, but Cutts did say again it was going after shallow and low quality content. And since content farms do produce plenty of that — along with good quality content — they’re being targeted here. If they have lots of good content, and that good content is responsible for the majority of their traffic and revenues, they’ll be fine. If not, they should be worried.
More About Who’s Impacted
As I wrote earlier, Google says it has been working on these changes since last January. I can personally confirm that several of Google’s search engineers were worrying about what to do about content farms back then, because I was asked about this issue and thoughts on how to tackle it, when I spoke to the company’s search quality team in January 2010. And no, I’m not suggesting I had any great advice to offer — only that people at Google were concerned about it over a year ago.
Since then, external pressure has accelerated. For instance, start-up search engine Blekko blocked sites that were most reported by its users to be spam, which included many sites that fall under the content farm heading. It gained a lot of attention for the move, even if the change didn’t necessarily improve Blekko’s results.
In my view, that helped prompt Google to finally push out a way for Google users to easily block sites they dislike from showing in Google’s results, via a Chrome browser extension for reporting spam.
Cutts, in my interview with him today, made a point to say that none of the data from that tool was used to make changes that are part of the Farmer Update. However, he went on to say that of the top 50 sites that were most reported as spam by users of the tool, 84% of them were impacted by the new ranking changes. He would not confirm or deny if Demand’s eHow site was part of that list.
“These are sites that people want to go down, and they match our intuition,” Cutts said.
In other words, Google crafted a ranking algorithm to tackle the “content farm problem” independently of the new tool, it says — and it feels the tool is confirming that it’s getting the changes right.
The Content Farm Problem
By the way, the definition of a content farm that I’ve been working on goes like this. A content farm:
Looks to see what are popular searches in a particular category (news, help topics)
Generates content specifically tailored to those searches
Usually spends very little time or money, perhaps as little as possible, to generate that content
The problem I think content farms are currently facing is with that last part — not putting in the effort to generate outstanding content.
For example, last night I did a talk at the University Of Utah about search trends and touched on content farm issues. A page from eHow ranked in Google’s top results for a search on “how to get pregnant fast,” a popular search topic. Its first tip: “Enjoyable Sex Is Key.”
The class laughed at that advice. Actually, the advice that you shouldn’t get stressed makes sense. But this page is hardly great content on the topic. Instead, it seems to fit the “shallow” category that Google’s algorithm change is targeting. And the page, there last night when I was talking to the class, is now gone.
Perhaps the new “curation layer” that Demand talked about in its earnings call this week will help in cases like these. On that call, Demand also again defended the quality of its content.
Will the changes really improve Google’s results? As I mentioned, Blekko now automatically blocks many content farms, a move that I’ve seen hailed by some. What I haven’t seen is any in-depth look at whether what remains is that much better. When I do spot checks, it’s easy to find plenty of other low quality or completely irrelevant content showing up.
Cutts tells me Google feels the change it is making does improve results according to its own internal testing methods. We’ll see if it plays out that way in the real world.
I knew it, but I wasn’t allowed to say what it was. Without an official name, I gave it an unofficial one of “Farmer,” since one of the reasons behind the update was to combat the low-quality content often associated with content farms.
In the end, I suspect Google didn’t want the update to sound like it was especially aimed at content farms, so it eventually let the “Panda” name go public, in a Steven Levy interview for Wired about a week after the update launched. Panda took its name from one of the key engineers involved.
LONG BEACH, California — Google announced a new update last week to its search engine that addressed the growing complaint that low-quality content sites (derisively referred to as content farms) were ranking above higher-quality sites that seemed more useful to users. This major change affects almost 12 percent of all search results, and the web is still buzzing about its implications, which include dramatic losses for some companies (Mahalo, Suite 101) and gains by some established sites known for high-quality information.
The change comes at a time when critics are wondering whether Google’s search quality has flagged. I delved into the mysteries of the search engine for my upcoming book, In the Plex, and this week had breakfast at the TED conference with the Google engineers who wrote the blog item announcing the change: the company’s search-quality guru Amit Singhal and Matt Cutts, Google’s top search-spam fighter.
Here’s an edited transcript.
Wired.com: What’s the code name of this update? Danny Sullivan of Search Engine Land has been calling it “Farmer” because its apparent target is content farms.
Amit Singhal: Well, we named it internally after an engineer, and his name is Panda. So internally we called it the big Panda. He was one of the key guys. He basically came up with the breakthrough a few months back that made it possible.
Say Hello To Penguin
Since Panda, Google’s been avoiding names. The new algorithm released in January, designed to penalize pages with too many ads above the fold, was simply called the “page layout algorithm.”
Do you shove lots of ads at the top of your web pages? Think again. Tired of doing a Google search and landing on these types of pages? Rejoice. Google has announced that it will penalize sites with pages that are top-heavy with ads.
Top Heavy With Ads? Look Out!
The change — called the “page layout algorithm” — takes direct aim at any site with pages where content is buried under tons of ads.
From Google’s post on its Inside Search blog today:
We’ve heard complaints from users that if they click on a result and it’s difficult to find the actual content, they aren’t happy with the experience. Rather than scrolling down the page past a slew of ads, users want to see content right away.
So sites that don’t have much content “above-the-fold” can be affected by this change. If you click on a website and the part of the website you see first either doesn’t have a lot of visible content above-the-fold or dedicates a large fraction of the site’s initial screen real estate to ads, that’s not a very good user experience.
Such sites may not rank as highly going forward.
Google also posted the same information to its Google Webmaster Central blog.
Sites using pop-ups, pop-unders or overlay ads are not impacted by this. It only applies to static ads in fixed positions on pages themselves, Google told me.
How Much Is Too Much?
How can you tell if you’ve got too many ads above-the-fold? When I talked with the head of Google’s web spam team, Matt Cutts, he said Google wasn’t going to provide any official tool for this, the way it does for telling whether your site is too slow (site speed is another ranking signal).
Instead, Cutts told me that Google is encouraging people to make use of its Google Browser Size tool or similar tools to understand how much of a page’s content (as opposed to ads) is visible at first glance to visitors under various screen resolutions.
But how far down the page is too far? That’s left for publishers to decide for themselves. However, the blog post stresses the change should only hit pages with an abnormally large number of ads above-the-fold, compared to the web as a whole:
We understand that placing ads above-the-fold is quite common for many websites; these ads often perform well and help publishers monetize online content.
This algorithmic change does not affect sites who place ads above-the-fold to a normal degree, but affects sites that go much further to load the top of the page with ads to an excessive degree or that make it hard to find the actual original content on the page.
This new algorithmic improvement tends to impact sites where there is only a small amount of visible content above-the-fold or relevant content is persistently pushed down by large blocks of ads.
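Google hasn’t published a formula for this, but conceptually the check is a ratio: how much of the first screenful is ads versus content? Here’s a hypothetical back-of-the-envelope version of the kind of measurement a publisher might make with a browser-size tool (the fold height and the layout numbers are assumptions):

```python
# A hypothetical above-the-fold check, inspired by the description
# above. Google's actual signals are not public; this just shows the
# kind of ratio a publisher might eyeball.

FOLD_HEIGHT = 600  # assumed pixels visible at a common screen resolution

def above_fold_ad_ratio(blocks):
    """blocks: list of (kind, top_px, height_px) for page elements."""
    ad = content = 0
    for kind, top, height in blocks:
        visible = max(0, min(top + height, FOLD_HEIGHT) - top)
        if kind == "ad":
            ad += visible
        else:
            content += visible
    return ad / max(1, ad + content)

layout = [("ad", 0, 250), ("ad", 250, 250), ("content", 500, 1200)]
print(f"{above_fold_ad_ratio(layout):.0%} of the first screenful is ads")  # 83%
```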
Impacts Less Than 1% Of Searches
Clearly, you’re in trouble if you have little-to-no content showing above the fold for commonly-used screen resolutions. You’ll know you’re in trouble shortly, because the change is now going into effect. If you suddenly see a drop in traffic today, and you’re heavy on the ads, chances are you’ve been hit by the new algorithm.
For those ready to panic, Cutts told me the change will impact less than 1% of Google’s searches globally, which today’s post also stresses.
Fixed Your Ads? Penalty Doesn’t Immediately Lift
What happens if you’re hit? Make changes, then wait a few weeks.
Similar to how last year’s Panda Update works, Google is examining sites it finds and effectively tagging them as being too ad-heavy or not. If you’re tagged that way, you get a ranking decrease attached to your entire site (not just particular pages) as part of today’s launch.
If you reduce ads above-the-fold, the penalty doesn’t instantly disappear. Instead, Google will make note of the change when it next visits your site. But it can take several weeks until Google’s next “push” or “update,” when the changes it has found are integrated into its overall ranking system, effectively removing penalties from sites that have improved and adding them to new ones that have been caught.
Google’s post explains this more:
If you decide to update your page layout, the page layout algorithm will automatically reflect the changes as we re-crawl and process enough pages from your site to assess the changes.
How long that takes will depend on several factors, including the number of pages on your site and how efficiently Googlebot can crawl the content.
On a typical website, it can take several weeks for Googlebot to crawl and process enough pages to reflect layout changes on the site.
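Mechanically, that’s the same batch pattern described for Panda earlier: a site-level flag that only flips on the next refresh. A minimal sketch, with invented names:

```python
# Sketch of the penalty lifecycle described above (names invented):
# fixing your pages doesn't lift the flag until the next update run.

site_flags = {"adheavy.example": True}  # tagged as too ad-heavy at launch
recrawl_notes = {}                      # what Googlebot observes over time

def recrawl(site, still_ad_heavy):
    recrawl_notes[site] = still_ad_heavy  # noted, but not yet acted on

def run_update():
    # The periodic "push": flags are re-set from the recrawled evidence.
    for site, heavy in recrawl_notes.items():
        site_flags[site] = heavy

recrawl("adheavy.example", still_ad_heavy=False)  # publisher fixed the layout
print(site_flags["adheavy.example"])   # True -- penalty still in place
run_update()                           # weeks later
print(site_flags["adheavy.example"])   # False -- penalty lifted
```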
Ironically, on the same day that Google’s web search team announced this change, I received a message from Google’s AdSense team encouraging me to put more ads on my site.
This was in relation to my personal blog, Daggle. The image in the email suggests that Google thinks content pretty much should be surrounded by ads.
Of course, the video that Google refers me (and others) to in the email promotes careful placement, urges that user experience be considered and, at one point, shows a page top-heavy with ads as an example of what not to do.
Still, it’s not hard to find sites using Google’s own AdSense ads that are definitely pushing content as far down their pages as they can, or trying to hide it. Those pages, AdSense or not, are subject to the new rules, Cutts said.
Pages Ad-Heavy, But Not Top-Heavy With Ads, May Escape
As a searcher, I’m happy with the change. But it might not be perfect. For example, here’s something I tweeted about last year:
Yes, that’s my finger being used as an arrow. I was annoyed that the actual download link I was after was surrounded by AdSense-powered ads telling me to download other stuff.
This particular site was heavily used by kids, who might easily click on an ad by mistake. That’s potentially bad ROI for those advertisers. Heck, as a net-savvy adult, I found it a challenge.
But the problem here wasn’t that the content was pushed “below the fold” by ads. It was that the ratio of ads was so high in relation to the content (a single link), plus the misleading nature of the ads around the content.
Are Google’s Own Search Results Top Heavy?
Another issue is that ads on Google’s own search results pages push the “content” — the unpaid editorial listings — down toward the bottom of the page. For example, here’s exactly what’s visible on my MacBook Pro’s 1680×1050 screen:
(Side note, that yellow color around the ads in the screenshot? It’s much darker in the screenshot than what I see with my eyes. In reality, the color is so washed-out that it might as well be invisible. That’s something some have felt has been deliberately engineered by Google to make ads less noticeable as ads).
The blue box surrounds the content, the search listings that lead you to actual merchants selling trash cans, in this example. Some may argue that the Google shopping results box is further pushing down the “real content” of listings that lead out of Google. But the shopping results themselves do lead you to external merchants, so I consider them to be content.
The example above is pretty extreme, showing the maximum of three ads that Google will ever show above its search results (with a key exception, below). Even then, there’s content visible, with it making up around half the page or more, if you include the Related Searches area as content.
My laptop’s screen resolution is pretty high, of course. Others would see less (Google’s Browser Size tool doesn’t work to measure its own search results pages). But you can expect Google will take “do as I say, not as I do” criticism on this issue.
Indeed, I shared this story initially with the main details, then started working on this section. After that was done, I could see this type of criticism already happening, both in the comments here and over on my Google+ and Facebook posts about the change.
Here’s a screenshot that Daniel Weadley shared in my Google+ post about what he sees on his netbook:
In this example, Google’s doing a rare display of four ads. That’s because it’s showing the maximum of three regular ads it will show with a special Comparison Ads unit on top of those. And that will just add fuel to criticisms that if Google is taking aim at pages top-heavy with ads, it might need to also look closer to home.
NOTE: About three hours after I wrote this, Google clearly saw the criticisms about ads on its own search results pages and sent this statement:
This is a site-based algorithm that looks at all the pages across an entire site in aggregate. Although it’s possible to find a few searches on Google that trigger many ads, it’s vastly more common to have no ads or few ads on a page.
Again, this algorithm change is designed to demote sites that make it difficult for a user to get to the content and offer a bad user experience.
Having an ad above-the-fold doesn’t imply that you’re affected by this change. It’s that excessive behavior that we’re working to avoid for our users.
Does all this talk about ranking signals and algorithms have you confused? Our video below explains briefly how a search engine’s algorithm works to rank web pages:
Today’s change is a new, significant ranking factor for our table, one we’ll add in a future update, probably as Va, for “Violation, Ad-Heavy site.”
Often when Google rolls out new algorithms, it gives them names. Last year’s Panda Update was a classic example of this. But Google’s not given one to this update (I did ask). It’s just being called the “page layout algorithm.”
Boring. Unhelpful for easy reference. If you’d like to brainstorm a name, visit our posts on Google+ and on Facebook, where we’re asking for ideas.
Now for the self-interested closing. You can bet this will be a big topic of discussion at our upcoming SMX West search marketing conference at the end of next month, especially on the Ask The Search Engines panel. So check out our full agenda and consider attending.
Postscript: Some have been asking in the comments about how Google knows what an ad is. I asked, and here’s what Google said:
We have a variety of signals that algorithmically determine what type of ad or content appears above the fold, but no further details to share. It is completely algorithmic in its detection — we don’t use any sort of hard-coded list of ad providers.
When Penguin rolled out earlier this week, it was called the “webspam algorithm update.”
Without a name for the new webspam algorithm, Search Engine Land was asking people for their own ideas at Google+ and Facebook, with the final vote making “Titanic” the leading candidate. A last check with Google got it to release its own official name of “Penguin.”