How Machine Learning Affects the Need for Quality Content

As Google continues to invest in machine learning to better understand and analyze user queries, this article emphasizes the need for marketers to keep improving content quality and user satisfaction.

Back in August, I floated the concept of a two-factor ranking model for SEO. The idea was to greatly simplify SEO for most publishers and remind them that the finer points of SEO don't matter if you don't get the fundamentals right. This concept leads to a basic ranking model that looks like this:

Ranking Score = Content Quality * Link Quality
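
As a minimal sketch of what the two-factor model above implies, the snippet below multiplies two scores. The function name `ranking_score` and the 0.0 to 1.0 scale are assumptions made for illustration; nothing here reflects how a search engine actually computes rankings.

```python
def ranking_score(content_quality: float, link_quality: float) -> float:
    """Two-factor model from the text: the score is the product of both factors.

    The 0.0-1.0 scale is an assumption made purely for illustration.
    """
    for name, value in (("content_quality", content_quality),
                        ("link_quality", link_quality)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return content_quality * link_quality


# Hypothetical sites: strong links cannot make up for weak content.
weak_content = ranking_score(content_quality=0.2, link_quality=0.9)
balanced = ranking_score(content_quality=0.6, link_quality=0.6)
print(weak_content < balanced)
```

Because the factors multiply rather than add, a very low value for either one drags the whole score down, which is the point of the model: the fundamentals have to be right first.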

To look at it a little differently, here's a way to evaluate the importance of content quality:


The reason machine learning is important to this picture is that search engines are investing heavily in improving their understanding of language.

Hummingbird was the first algorithm publicly announced by Google that focused heavily on natural language understanding, and RankBrain was the next.

I think these investments focus on objectives like these:

  1. Better understanding of user intent
  2. Better assessment of content quality

We also know that Google (and other engines) are interested in leveraging user satisfaction and user engagement data as well. Although it is less clear exactly which signals they will key in on, it seems likely that this will be another area for machine learning.

Today, I'm going to explore where the state of the art is when it comes to content quality, and how I think machine learning is likely to drive its evolution.

Content Quality Improvement Case Study

A large number of the sites we see continue to under-invest in adding content to their pages. This is very common on e-commerce sites. Too many of them create their pages, add the products and product descriptions, and then think that's it. This is a mistake.

For example, adding unique user comments specific to the products on the page is very effective. Stone Temple once worked on a site where adding user comments led to a 45 percent traffic increase on the pages included in the test.

Another test took existing text on category pages that had originally been created as “SEO text” and replaced it. The so-called SEO text was not written with users in mind and therefore added little value to the page.

The SEO text was replaced with a true mini-guide specific to the category on which the content resided. Those pages saw a 68 percent gain in traffic. There were also some control pages for which no changes were made, and traffic on them fell by 11 percent, so the net gain was just under 80 percent:


Please note that the text was handcrafted and tuned with an explicit goal of adding value to the pages evaluated. So this wasn't cheap or easy to implement, but it was still quite profitable, given that it was done on the main category pages for the site.
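
As a rough check on the net figure from the category-page test, the sketch below compares the test pages against the control pages in two ways. The source does not say how the net number was calculated, so both function names and both methods are assumptions; the simple percentage-point difference is the one that lands near 80.

```python
def net_gain_points(test_change_pct: float, control_change_pct: float) -> float:
    """Difference between test and control changes, in percentage points."""
    return test_change_pct - control_change_pct


def relative_lift_pct(test_change_pct: float, control_change_pct: float) -> float:
    """Lift of the test pages relative to where the control pages ended up."""
    test_ratio = 1 + test_change_pct / 100
    control_ratio = 1 + control_change_pct / 100
    return (test_ratio / control_ratio - 1) * 100


# Figures from the text: test pages +68 percent, control pages -11 percent.
print(net_gain_points(68, -11))
print(round(relative_lift_pct(68, -11), 1))
```

Either way, the control pages matter: without them, the 68 percent gain would understate how much the new content outperformed the pages that were left alone.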

These two examples show us that investing in improving content quality can offer significant benefits. Now let's explore how machine learning can make this even more important.

Impact of Machine Learning

Let's start by looking at our top ranking factors and see how machine learning could change them.

Content Quality

Showing high-quality content in search results will continue to be critical for search engines. Machine learning algorithms like RankBrain have improved their ability to understand human language. An example of this is the query that Gary Illyes shared: “can you get 100% score in Super Mario without a tutorial?”

Before RankBrain, Google's algorithm ignored the word “without” and returned examples of tutorials, when what the user wanted was a result explaining how to do it without one. RankBrain focused primarily on long-tail search queries and represented a good step forward in understanding user intent for such queries.

But Google has a long way to go. For example, consider the following query:

In this query, Google does not appear to understand how the word “best” is being used. The question is not which down comforters are best, but rather why down comforters are better than other types of comforters.

Let's take a look at another example:

Notice how the result identifies that the coldest day in US history occurred in Alaska, but doesn't actually provide the detailed answer in the featured snippet. The interesting thing here is that the article from which Google took the answer actually states both the date and the temperature of the coldest day in the US.

Looked at one at a time, these things aren't that complicated for Google to fix. The current limitations arise from the complexity of language and the scale of machine learning needed to handle it.

Fixing this requires building increasingly large sets of examples like the two I've shared above, and then using them to help train machine learning algorithms.

RankBrain was a big step for Google, but work is still ongoing. The company is making huge investments to advance its language understanding dramatically. The following excerpt, from USA Today, begins with a quote from Google senior program manager Linne Ha, who leads the Pygmalion team of linguists at the company:

“We're making up rules and exceptions to train the computer,” Ha says. “Why do we say ‘the president of the United States’? And why don't we say ‘the president of the France’? There are all kinds of inconsistencies within our language and in each language. For humans it seems obvious and natural, but for machines it is quite difficult.”

Google's Pygmalion team is the one focused on improving Google's understanding of natural language. Some of the things that will improve along the way are its understanding of:

  1. Which web pages best match the user's intent, as implied by the query.
  2. The breadth of a page in addressing user needs.

As it does so, Google's ability to measure the quality of content and how well it addresses user intent will grow, and content quality will therefore become a bigger and bigger factor over time.

User Engagement/Satisfaction

As mentioned above, we know that search engines use various methods to measure user engagement. They have already publicly revealed that they use click-through rate (CTR) as a quality control factor, and many believe they use it as a direct ranking factor.

Regardless, it's reasonable to expect that search engines will continue to look for more useful ways for user signals to play a larger role in search rankings.

There is a type of machine learning called “reinforcement learning” which may come into play here. What if you could test different sets of search results, see how they perform, and then use that performance as input to refine and improve search results automatically?

In other words, could you simply collect user engagement signals and use them to dynamically test different types of search results for queries and then keep tweaking them until you find the best set of results?
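
The idea of testing result sets and keeping the ones users respond to can be sketched as a toy epsilon-greedy bandit, one of the simplest reinforcement learning setups. Everything here is an assumption made for illustration: the function name `epsilon_greedy`, the satisfaction probabilities, and the single yes/no reward per impression. It is not a description of how any search engine works.

```python
import random


def epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Toy bandit where each arm is a hypothetical candidate result set.

    true_rates holds assumed satisfaction probabilities, hidden from the
    learner. Returns (estimates, counts) after `rounds` simulated impressions.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    counts = [0] * n
    estimates = [0.0] * n
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore: try a random result set
        else:
            arm = max(range(n), key=lambda i: estimates[i])  # exploit the best so far
        # Noisy binary reward: the simulated user is satisfied or not.
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean
    return estimates, counts


# Three hypothetical result sets with assumed satisfaction rates.
estimates, counts = epsilon_greedy([0.30, 0.35, 0.50])
print([round(e, 2) for e in estimates], counts)
```

Even in this toy version, the estimates for close competitors such as 0.30 and 0.35 take many impressions to separate, and a real engine would face far more candidate result sets and far less clear-cut reward signals.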

But it turns out that this is a very difficult problem to solve. Jeff Dean, who many consider one of the leaders of Google's machine learning efforts, had this to say about measuring user engagement in a recent interview he did with Fortune:

“An example of a messier reinforcement learning problem is perhaps trying to use it on what search results I should display. There is a much broader set of search results I can display in response to different queries, and the reward signal is a bit noisy. Like if a user looks at a search result and likes it or dislikes it, that's not so obvious.”

However, I expect this to be a continued area of investment for Google. And, if you think about it, user engagement and satisfaction have an important interaction with content quality. In fact, they help us think about what high-quality content really is: web pages that meet the needs of a significant portion of the people who land on them. This means several things:

  1. The product/service/information they are looking for is present on the page.
  2. They can find it relatively easily on the page.
  3. The supporting products/services/information they want can also be easily found on the page.
  4. The page/website gives them confidence that you are a good, reputable source to interact with.
  5. The overall design offers an engaging experience.

As Google's machine learning capabilities advance, it will get better at measuring page quality directly, or at measuring the various user engagement signals that show what users think about the quality of a page.

This means that you will have to invest in creating pages that fit the criteria set out in the five points above. If you do, you'll give yourself an advantage in your digital marketing strategies – and if you don't, you'll end up suffering as a result.



There are huge shifts in the wind, and they're going to dramatically impact your approach to digital marketing. Your basic priorities will not change, as you will still need to:

  1. Create high quality content.
  2. Continually measure and improve user satisfaction with your site.
  3. Establish authority with links.

The big question is, are you really doing enough of these things today? In my experience, most companies under-invest in continually improving content quality and user satisfaction.

It's time to start paying more attention to these things. As Google and other search engines get better at determining content quality, the winners and losers in search results will begin to change dramatically.

Google focuses on offering more and better results, as this leads to a greater market share and therefore higher revenue. It's best to get on board the quality content train now, before it leaves the station and leaves you behind!