In the pool of underappreciated (or, at least, poorly understood) tools, Google Search Console tops the list. It began life as Google Webmaster Tools and gave technical webmasters and SEOs a window into their search presence. While Google Analytics offers rich data about users after they’ve decided to visit your site, GWT offered insight into the process that brought them there in the first place.
In the world of (Not Provided), this became an even more powerful tool, offering our only, or at least most accurate, glimpse into search query data direct from the source. Beyond keywords, however, the rebranded Google Search Console gave us valuable information about everything from mobile usability and site speed to robots.txt status and structured data. If you’re not using Search Console, you’re likely flying blind.
Let’s take a look at the new Google Search Console beta release.
One of the biggest frustrations with GSC for anyone looking for long-term insights and trends is the seemingly arbitrary and otherwise inexplicable 90-day data limit. Many marketers and webmasters, particularly in the SMB crowd, lack the time and resources to even warehouse years’ worth of data, much less to compile and analyze it. For those who do, processing it with R or dumping it into a big data visualization tool is certainly effective, but cumbersome, to say the least.
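If you do decide to keep your own archive, the Search Analytics API takes some of the pain out of the warehousing step. Here’s a minimal sketch in Python, assuming a service account that has been granted access to the property and the google-api-python-client and google-auth packages; the property URL, credentials file and CSV name are placeholders, not anything Google prescribes.

    import csv
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    SITE_URL = "https://www.example.com/"   # hypothetical property URL
    KEY_FILE = "service-account.json"       # hypothetical credentials file

    creds = service_account.Credentials.from_service_account_file(
        KEY_FILE, scopes=["https://www.googleapis.com/auth/webmasters.readonly"]
    )
    service = build("webmasters", "v3", credentials=creds)

    # Pull one month of query-level data; run this on a schedule and append
    # the rows to your own store so history outlives the interface's window.
    response = service.searchanalytics().query(
        siteUrl=SITE_URL,
        body={
            "startDate": "2018-01-01",
            "endDate": "2018-01-31",
            "dimensions": ["date", "query"],
            "rowLimit": 5000,
        },
    ).execute()

    with open("gsc_archive.csv", "a", newline="") as f:
        writer = csv.writer(f)
        for row in response.get("rows", []):
            date, query = row["keys"]
            writer.writerow([date, query, row["clicks"], row["impressions"],
                             row["ctr"], row["position"]])

It’s not glamorous, but a scheduled pull like this is usually all it takes to keep year-over-year history on hand.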
Finally, Google is giving us longer data ranges baked right into the Google Search Console interface: the default selections include three, six and 12 months, and… “full duration,” which has been reported by some sources as 16 months. I can anecdotally confirm this based on my own accounts, though the existence of the data makes me wonder whether Google could go back further if they wanted. Either way, year-on-year comparisons for impressions are now a thing, and that’s exciting!
Duplication Duplication (Pun Intended)
When someone in the SEO world talks about “duplication” or (gasp) “duplication penalties,” there is an immediate polarization — either they definitely know what they’re talking about, or they definitely don’t.
Back in reality, while there is no explicit duplication penalty, Google has suggested that if a web page is considered highly duplicative, it’s inherently less valuable to a user than a page that is truly unique. The exact mechanics are a secret, of course, but it parallels everyday life: we often seek customized and varied products, from food to cars to clothing, while fully accepting that there will inevitably be some level of overlap.
The Search Console beta now gives us a plethora of tools to determine exactly what Google considers duplicative, to what extent, and most importantly, whether it is affecting organic traffic. In a section Google calls “Index Coverage,” you can filter the report to show pages that have been crawled but not indexed, along with broad reasoning for that choice. Some are pretty straightforward, like “Excluded by ‘noindex’ tag” and “Blocked by robots.txt.” Others are much more intriguing, like “Duplicate page without a canonical tag” and “Google chose different canonical than user.”
In both of those cases, according to the new Search Console support page, Google is outright telling us that these pages are (too) similar to another page and therefore haven’t been indexed. Some are obvious (think parameters and user sessions), while others are strangely different from what you might expect. In my own digging, I’ve found that while I think page A should be the canonical for pages B, C and D, Google actually thinks it should be page C. Why does Google think the “Blue Widget” page should be the canonical? That’s a question for another time, but for now I can use this information to better leverage canonicals, internal linking and content, armed with the knowledge that page C is quietly winning this arms race.
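If you want to reconcile what Google chose with what your pages actually declare, a quick crawl of the affected URLs is a sensible first step. Here’s a rough sketch, assuming the requests and beautifulsoup4 packages; the URLs are hypothetical stand-ins for pages flagged under “Google chose different canonical than user.”

    import requests
    from bs4 import BeautifulSoup

    # Hypothetical URLs pulled from the Index Coverage exclusion report
    pages = [
        "https://www.example.com/blue-widget",
        "https://www.example.com/blue-widget?color=navy",
        "https://www.example.com/widgets/blue",
    ]

    for url in pages:
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        tag = soup.find("link", rel="canonical")
        declared = tag["href"] if tag else "(no canonical tag)"
        print(f"{url} -> declares canonical: {declared}")

Comparing that output against the canonical Google reports choosing often points straight at the canonical-tag or internal-linking fix worth making.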
Our Search Console Wishlist
The ability to see over a year’s worth of impression and click data from Google SERPs is invaluable, but occasionally, the visuals are less than spectacular. I’d love to see some smoothing features added to long-term graphs, similar to what we have in Google Analytics, with day, week and month groupings.
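Until something like that ships, a grouped or rolling view is easy enough to build yourself. Here’s a small sketch with pandas, assuming the hypothetical gsc_archive.csv from the earlier example (date, query, clicks, impressions, ctr, position columns):

    import pandas as pd

    # Load the (hypothetical) archive built earlier and smooth it for charting
    df = pd.read_csv(
        "gsc_archive.csv",
        names=["date", "query", "clicks", "impressions", "ctr", "position"],
        parse_dates=["date"],
    )

    daily = df.groupby("date")[["clicks", "impressions"]].sum()
    weekly = daily.resample("W").sum()          # week grouping, GA-style
    rolling = daily.rolling(window=7).mean()    # 7-day smoothing for long-term graphs

    print(weekly.head())
    print(rolling.tail())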
Within the “Index Coverage” report, you can overlay an “Impressions” line above the “Valid”/”Excluded”/”Error” bars, which is quite helpful. Unfortunately, this functionality goes away when you drill down any deeper, like into a specific exclusion type, and it doesn’t include clicks or position data. This may be an intentional limitation so as not to give away the secret sauce, but I’d love to see it nonetheless.
For any given site, as has always been the case, the secure and non-secure versions, as well as the www and non-www variants, are added as explicitly distinct properties. This makes sense, and is actually better in most cases, but the ability to view an aggregate could go a long way in reducing headaches (for me, at least). Case in point: a client moves to all secure URLs (great!), but now our historical GSC data is effectively segregated, preventing the year-on-year comparisons we raved about earlier. My guess is that this is another limitation based more on how and where the data is stored than on Google being mean… but still, it would be nice.
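In the meantime, one rough workaround is to pull each property through the API and sum the results yourself. A sketch, again assuming a service account with access to all the variants; the property URLs, credentials file and date range are placeholders:

    from collections import defaultdict
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_file(
        "service-account.json",  # hypothetical credentials file
        scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
    )
    service = build("webmasters", "v3", credentials=creds)

    # The separate properties Search Console keeps for one logical site (placeholders)
    properties = [
        "http://example.com/",
        "http://www.example.com/",
        "https://example.com/",
        "https://www.example.com/",
    ]

    totals = defaultdict(lambda: {"clicks": 0, "impressions": 0})
    for prop in properties:
        resp = service.searchanalytics().query(
            siteUrl=prop,
            body={"startDate": "2018-01-01", "endDate": "2018-12-31",
                  "dimensions": ["date"]},
        ).execute()
        for row in resp.get("rows", []):
            day = row["keys"][0]
            totals[day]["clicks"] += row["clicks"]
            totals[day]["impressions"] += row["impressions"]

    for day in sorted(totals):
        print(day, totals[day])

It’s a blunt instrument (impressions can overlap across variants), but it at least restores a rough year-on-year view across a protocol migration.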
Using and Understanding the New Search Console
Overall, the latest version of this venerable tool has a lot to offer for anyone willing to dig deep. It will be interesting to see whether someone applies some machine learning to the results and gleans a stronger understanding of how and why Google thinks the way it does. We may not be able to predict what RankBrain will do, but this may help make the giant black box just a little less opaque.
Have you checked out the Google Search Console beta? What insights did you find and what features are you hoping for? Share your thoughts with TSA!