Google's new algorithm creates original articles from your content.
Google has published research on a new algorithm that can take your website and that of your competitor and generate “coherent” articles. By creating original content, Google's new algorithm can answer a user's question without having to send them to another web page.
How does the “Paraphrasing” Algorithm work?
Google's new algorithm works by first summarizing web content with a method that “extracts” the relevant parts of your content and discards the rest. This is similar to the algorithms used to generate featured snippets.
These are called “extractive summaries” because they extract content from web pages. An extractive summary reduces the original text to its most important sentences. The algorithm then applies a second technique called abstractive summarization. Abstractive summaries are a form of paraphrasing.
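To make the extractive step concrete, here is a minimal Python sketch. It is not Google's method: the word-frequency scoring, the tiny stopword list, and the function names are all assumptions chosen for illustration. It only shows the general idea of keeping the most important sentences and discarding the rest.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the research).
STOPWORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "on", "the", "to"}

def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase word tokens with stopwords removed."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return [w for w in words if w not in STOPWORDS]

def extractive_summary(text, max_sentences=2):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences do not win automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)

page = ("Solar panels convert sunlight into electricity. "
        "My neighbour owns a red bicycle. "
        "Solar panels lower electricity bills.")
print(extractive_summary(page))
```

Note that every sentence in the output is copied verbatim from the input. That is the defining trait of an extractive summary, and it is what separates it from the paraphrasing step.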
A disadvantage of artificial paraphrasing (abstractive summaries) is that almost a third of the summaries contain false facts. The new Google research describes a way to combine the best of both approaches. It uses “extractive summarization” to pull the important facts from web documents and then applies the “abstractive” approach to paraphrase that content.
This approach creates a new document based on information found on the web, in effect producing Google's own version of Wikipedia. The algorithm is described in a research paper titled Generating Wikipedia by Summarizing Long Sequences.
According to Google:
“We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents.”
This means that Google can go out and collect information about a topic from multiple web pages.
Then:
“We use extractive summarization to coarsely identify salient information…”
This means they reduce web pages to the most important sentences to extract meaning.
The next step is to use:
“…a neural abstractive model to generate the article.”
This means that Google will take the extracted sentences and use a “neural abstractive model” to rewrite those facts (extracted from many websites) into natural-looking sentences and paragraphs that form an article.
Google says the resulting articles can pass human scrutiny.
“We show that this model can generate fluent, coherent multi-sentence paragraphs… When given reference documents, we show it can extract relevant factual information as reflected in… human evaluations.”
Featured Snippets are the First Step
Featured snippets are an example of extractive summarization. The process takes an entire web page, throws out the irrelevant words and phrases, and keeps only the few sentences that communicate the answer to a question.
There is a related Google algorithm that summarizes web pages for Google's voice search, called Sentence Compression by Deletion with LSTMs.
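Deletion-based sentence compression, the idea behind the algorithm named above, can be pictured as a keep-or-delete decision for every word, so the output is always a subsequence of the input. The Python sketch below only shows that shape. In the real approach an LSTM predicts the labels; here a hand-written word list (DROPPABLE, a hypothetical stand-in) plays that role so the example can run.

```python
# Hypothetical stand-in for the learned model: words we choose to delete.
DROPPABLE = {"very", "really", "famous", "reportedly", "basically", "quite"}

def keep_labels(tokens):
    """Return one label per token: 1 = keep, 0 = delete.

    A trained LSTM would predict these labels; this rule is only a
    placeholder for illustration.
    """
    return [0 if t.lower().strip(".,") in DROPPABLE else 1 for t in tokens]

def compress(sentence):
    """Deletion-based compression: keep a subsequence of the input tokens."""
    tokens = sentence.split()
    labels = keep_labels(tokens)
    return " ".join(t for t, keep in zip(tokens, labels) if keep)

print(compress("The very famous tower is really quite tall."))
```

Because words are only ever deleted, never reworded or reordered, this kind of compression cannot introduce new wording, which makes it a close cousin of extractive summarization.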
Does Google's algorithm summarize your content?
The algorithm summarizes “multiple documents” into a single text. It can be applied to books. It can be applied to open source databases. But it can also be applied to any public web page, including your content.
The research uses Wikipedia topics as search queries and the search engine results as the source for the extracted summaries, which are then paraphrased to create new articles. The researchers also ran a side-by-side comparison, generating a second set of articles using only the references cited by Wikipedia.
The paper describes the process this way:
“Reference documents are obtained from a search engine, with the Wikipedia topic used as a query similar to our search engine references. However, we also show results with documents that are only found in the References section of Wikipedia articles.”
In plain English, they use Wikipedia topics as search queries and the search engine results pages (SERPs), meaning your content, as the source material to generate entirely new web pages that can answer a question without showing a link to your website.
The research paper does not say whether Google will show content it creates from your content. It also does not discuss whether Google would add links to the source materials, either as part of the SERPs or as a footer link.
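The search-then-extract step described above can be sketched roughly as follows. This is a minimal Python illustration, not Google's implementation: the function name rank_paragraphs, the exact tf-idf scoring formula, and the max_tokens budget are all assumptions made for the example. The idea is simply to treat the topic as a query, score paragraphs from several source pages against it, and keep the best ones as raw material for the paraphrasing stage.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_paragraphs(topic, documents, max_tokens=50):
    """Rank paragraphs from several documents by tf-idf against the topic.

    `documents` is a list of documents, each a list of paragraph strings.
    Returns the best-matching paragraphs, in rank order, within the budget.
    """
    paragraphs = [p for doc in documents for p in doc]
    tokenized = [tokenize(p) for p in paragraphs]
    n = len(paragraphs)

    # Document frequency: in how many paragraphs does each word appear?
    df = Counter(w for tokens in tokenized for w in set(tokens))
    query = set(tokenize(topic))

    def score(tokens):
        tf = Counter(tokens)
        return sum(tf[w] * math.log(n / df[w]) for w in query if w in tf)

    order = sorted(range(n), key=lambda i: score(tokenized[i]), reverse=True)

    selected, used = [], 0
    for i in order:
        # Skip paragraphs that do not match the topic or would exceed the budget.
        if score(tokenized[i]) <= 0 or used + len(tokenized[i]) > max_tokens:
            continue
        selected.append(paragraphs[i])
        used += len(tokenized[i])
    return selected

pages = [
    ["The Eiffel Tower is a wrought-iron tower in Paris.",
     "Buy cheap flights and hotels today."],
    ["Gustave Eiffel's company built the tower for the 1889 World's Fair.",
     "Subscribe to our newsletter."],
]
for paragraph in rank_paragraphs("Eiffel Tower", pages):
    print(paragraph)
```

In this toy run the two on-topic paragraphs survive and the advertising and newsletter lines are dropped, even though they come from different pages. The selected text would then be handed to the abstractive model, which is not shown here.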
Google no longer needs to show your content
The research paper concludes that the experiment was successful. Google can generate its own content by summarizing your content, answering a user's question without sending them to your site.
This is what the Google research paper says:
“We have shown that generating Wikipedia can be approached as a multi-document summarization problem…”
The phrase “multi-document” covers any document that is freely available, including your web pages and the web pages of your competitors.
And this is what the research paper says about the success of the algorithm:
“This model significantly outperforms traditional encoder-decoder architectures on long sequences, allowing us to condition on many reference documents and to generate coherent and informative Wikipedia articles.”
That means Google can use many web pages to generate “coherent” and “informative” articles. This is a pretty disturbing turn of events.
Will Google use this algorithm with Voice Assistant?
There is no word yet on if or when Google will start generating its own content from your content. However, an algorithm like this is a perfect fit for voice assistant search. Voice assistant searches are searches performed through a mobile phone or an Internet of Things (IoT) device in your home or car.
For example, a person can ask the Google voice assistant about a movie star, and the assistant can respond in full sentences, as if the question had been put to a real person.
Google has always aspired to be like the voice assistant computer in Star Trek. In 2014, an earlier version of voice search was reported to be codenamed after the actress who played the voice of the Star Trek computer. An algorithm like this would fit perfectly into a voice assistant setup.