Many people assume that Google leverages all of its information resources in every way possible. For example, the vast majority of people who think about it probably assume that Google uses its Chrome browser as a data source, and in particular that it uses Chrome to discover new URLs to crawl.
We decided to put that to the test.
Brief Note on Methodology
The test methodology is quite simple. In principle, all we needed to do was set up a couple of pages that Google didn't know about, have a bunch of people visit those pages from a Chrome browser, and then wait to see whether Googlebot would visit them. In practice, there was a bit more discipline involved in the process, so here's what we did:
1.- Four new articles were created as pages: two were used as test pages and two as control pages.
2.- The pages were uploaded directly to the web server via FTP. By that, I mean that we did not use any content management system (e.g. WordPress, Drupal, …) to upload them, to ensure that nothing in that process made Google aware of the pages.
3.- We waited a week to make sure nothing had gone wrong that would cause Googlebot to visit the site. During that week, we checked the site's log files every day to confirm there were no Googlebot visits.
4.- 27 people were enlisted to follow this process:
- Open Chrome.
- Go to settings and disable all extensions.
- Paste the URL of the first test page into the browser and then visit it.
- Paste the URL of the second test page into the browser and then visit it.
- Reactivate extensions after completing the steps above.
- View their IP address and send it to us, so we could verify which participants followed all the steps and which did not.
5.- Log files were checked every day for up to a week after the last participant completed these steps (see the log-check sketch below).
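For reference, this is roughly the kind of daily check involved. It is a minimal sketch rather than the exact script we used: it assumes a standard combined-format Apache/Nginx access log, and the page paths and log location are hypothetical placeholders.

```python
import re
import socket

# Hypothetical paths for the test and control pages; substitute your own URLs.
WATCHED_PATHS = {"/test-page-1.html", "/test-page-2.html",
                 "/control-page-1.html", "/control-page-2.html"}

# Matches the combined log format: client IP, request line, status, referrer, user agent.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'\d+ \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def is_real_googlebot(ip: str) -> bool:
    """Simplified reverse-DNS check: genuine Googlebot IPs resolve to googlebot.com or google.com."""
    try:
        host = socket.gethostbyaddr(ip)[0]
    except socket.herror:
        return False
    return host.endswith((".googlebot.com", ".google.com"))

def scan(log_path: str) -> None:
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            m = LOG_LINE.match(line)
            if not m or m.group("path") not in WATCHED_PATHS:
                continue
            agent = m.group("agent")
            if "Googlebot" in agent:
                verified = is_real_googlebot(m.group("ip"))
                print(f"Googlebot hit on {m.group('path')} from {m.group('ip')} "
                      f"(verified: {verified})")
            else:
                # Any other crawler (e.g. one triggered by a browser extension) shows up here too.
                print(f"Non-Google hit on {m.group('path')}: {agent}")

if __name__ == "__main__":
    scan("/var/log/nginx/access.log")  # hypothetical log location
```

A check like this also surfaces the non-Google crawler hits mentioned in the results below, not just Googlebot itself.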
Please note that the control pages were never visited by humans; that is what makes them controls. If something had gone wrong in the upload process, they could have been crawled, but that never happened. There were no visits to any of the control pages, by either humans or robots, which confirmed that our upload process worked.
What do people believe about this?
In anticipation of this test, Rand Fishkin of Moz posted a poll on Twitter to see what people believed about whether Google uses Chrome data for this purpose. Here is the result:
As you can see, a whopping 88% believe that Google sniffs out new URLs from Chrome data, and most of them are sure that it definitely does. Let's see how that compares to our test results.
The results
The results are quite simple: Googlebot never came to visit any of the test pages.
As it turned out, two people in the test did not disable their extensions, and this resulted in hits from Open Site Explorer (one person had MozBar installed and enabled) and Yandex (from a different person, though I'm not sure which extension they had installed and enabled).
Summary
This is a remarkable result. Google has access to a huge amount of Chrome data, and it's hard to believe it doesn't use it in some way.
However, keep in mind that this test was specifically about whether Google uses Chrome to discover new URLs. My guess is that Google is still of the mindset that if no one has linked to a page, it doesn't have enough value to rank.
This doesn't mean that Google doesn't use Chrome data for other things, like collecting aggregated user behavior data or other metrics. Google's Paul Haahr confirmed that they use data like CTR as a quality check in highly controlled tests to measure search quality.
Note: he didn't say it was a live ranking factor, but rather a way to validate that the other ranking factors were doing their job well. That said, perhaps Chrome is a data source in such tests. That data could easily be made truly anonymous, and it could add a ton of information about user satisfaction with the search results.
In any case, this part of the conversation is all speculation and, for now, we have shown that Google does not seem to use simple Chrome visits to new web pages as a way to discover URLs for crawling.