How to remove URLs from Google searches


There are several ways to remove URLs from Google, and which one to use depends on the circumstances. Using the wrong method can hurt the SEO of the affected pages and may leave them in the index despite your efforts.

This post will help you quickly decide which removal method is best for you. You will learn:

  • How to check if a URL is indexed
  • 5 ways to remove URLs from Google
  • How to prioritize deletions
  • Common removal mistakes to avoid

 

How to check if a URL is indexed

The usual way to check whether content is indexed is to run a site: search on Google (for example, site:pixelwork.mx). However, site: searches are not normal queries and do not actually tell you whether a page is indexed. They may display pages known to Google, but that does not mean those pages are eligible to appear in normal search results without the site: operator.

site: searches may also show pages that redirect or canonicalize to another page.

The most reliable way to verify indexing is the Index Coverage report in Google Search Console, or the URL Inspection tool for an individual URL. These tools tell you whether a page is indexed and provide additional information about how Google treats it. If you don't have access to Search Console, simply Google the full URL of the page.

 

Ways to remove URLs from Google

1) Delete content

If you delete a page and it returns a 404 (not found) or 410 (gone) status code, it will be removed from the index shortly after it is crawled again. Until then, the page may still appear in search results, and even once the page itself is no longer available, a cached version may remain temporarily accessible.
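As a sketch of what this looks like on the wire (the exact response will vary by server), a request for a permanently deleted page would receive a header response such as:

HTTP/1.1 410 Gone

A 410 tells crawlers the removal is deliberate and permanent, which can get the page dropped slightly faster than a 404.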

 

You may need another option if:

  • You need a more immediate removal. See the URL removal tool section.
  • You need to consolidate signals like links. See the canonicalization section.
  • You need the page to remain available to users. See whether the noindex or access restriction sections fit your situation.

 

2) Noindex

A meta robots noindex tag or an x-robots-tag header response tells search engines to remove a page from the index. The meta robots tag works for HTML pages, while the x-robots-tag response works for pages and additional file types such as PDFs. For these tags to be seen, a search engine must be able to crawl the pages. Note that removing pages from the index can prevent the consolidation of links and other signals.

Example of a meta robots noindex tag:
<meta name="robots" content="noindex">
Example of an x-robots-tag noindex in the header response:
HTTP/1.1 200 OK
X-Robots-Tag: noindex

 

You may need another option if:

  • You don't want users to access these pages. See the access restriction section.
  • You need to consolidate signals like links. See the canonicalization section.

 

3) Access restriction

If you want a page to be accessible to some users but not to search engines, then what you probably want is one of these three options:

  • Some login system
  • HTTP Authentication (where a password is required to access)
  • IP whitelist (which only allows specific IP addresses to access pages)

This type of setup allows a group of users to access the pages, but search engines will not be able to reach them and will not index them.
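As an illustration of how HTTP authentication keeps crawlers out (the realm name is just an example), a protected page answers requests without credentials with a 401 challenge instead of the content:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="Private area"

Search engine crawlers never submit credentials, so they only ever see the challenge and have nothing to index.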

 

You may need another option if:

  • You need a more immediate removal. See the URL removal tool section.

 

4) URL removal tool

The name of this Google tool is slightly misleading: what it actually does is temporarily hide content. Google will still see and crawl the content, but the "removed" pages will not be shown to users. This temporary effect lasts six months on Google. The tool should be reserved for the most urgent cases, such as security issues, data leaks, and personally identifiable information. You will find it in Google Search Console.

You will still need to apply another method alongside the removal tool to keep pages out for longer (noindex or deletion) or to prevent users who have the links from accessing the content (deletion or access restriction). The tool just gives you a faster way to hide pages while the longer-term removal has time to process. The request may take up to a day to be processed.

 

5) Canonicalization

When you have multiple versions of a page and you want to consolidate signals like links to a single version, what you want to do is some form of canonicalization. This is primarily to avoid duplicate content while consolidating multiple versions of a page into a single indexed URL.

There are several canonicalization options:

  • Canonical tag. This specifies another URL as the canonical version, that is, the version you want displayed (see the examples after this list). If the pages are duplicates or very similar, this should work fine. When the pages are too different, the canonical may be ignored, as it is a hint and not a directive.
  • Redirects. A redirect takes a user and a search bot from one page to another. A 301 is the redirect most commonly used by SEOs; it tells search engines that you want the final URL to be the one shown in search results and the one where signals are consolidated. A 302, or temporary redirect, tells search engines that you want the original URL to remain in the index and to consolidate signals there.
  • Handling URL parameters. Parameters are appended to the end of a URL after a question mark, such as pixelwork.mx?this=parameter. Google's URL Parameters tool lets you tell Google how to treat URLs with specific parameters.
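Example of a canonical tag in the <head> of a duplicate page (the URLs here are illustrative):

<link rel="canonical" href="https://pixelwork.mx/preferred-page/">

Example of a 301 redirect header response pointing at the consolidated URL:

HTTP/1.1 301 Moved Permanently
Location: https://pixelwork.mx/final-page/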

 

How to prioritize deletions

If you have several pages to remove from Google's index, prioritize them in this order:

  • Maximum priority: These pages are usually related to security or sensitive data. This includes content that contains personal data, customer data or proprietary information.
  • Medium priority: This usually involves content intended for a specific group of users: employee portals, member-only content, and staging, testing, or development environments.
  • Low priority: These pages often involve duplicate content of some kind. Examples include pages served from multiple URLs, URLs with parameters, and, again, staging, testing, or development environments.

 

Common mistakes you should avoid

  • Noindex in robots.txt

While Google used to unofficially support a noindex directive in robots.txt, it was never an official standard, and Google has now formally removed support. Many of the sites relying on it were implementing it incorrectly and harming themselves.
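For reference, the now-unsupported pattern looked like this in a robots.txt file (the path is illustrative; shown only so you can recognize it, not use it):

User-agent: *
Noindex: /page-to-remove/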

  • Blocking crawling in robots.txt

Crawling is not the same as indexing. Even if Google can't crawl a page, it can still index it if there are internal or external links pointing to it. Google won't know what's on the page, because it can't crawl it, but it knows the page exists and will even write a title to display in search results based on signals such as the anchor text of links to the page.
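For example, a robots.txt rule like this one (the path is illustrative) stops Googlebot from fetching the pages but does not remove already-discovered URLs from the index:

User-agent: *
Disallow: /private/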

  • Nofollow

This is commonly confused with noindex, and some people apply it at the page level expecting the page not to be indexed. Nofollow originally stopped both all links on a page and individual links carrying the nofollow attribute from being crawled, but that is no longer the case: Google can now crawl these links if it chooses. Nofollow has also been applied to individual links to try to keep Google from crawling specific pages. Again, this no longer works, as nofollow is just a hint.
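Example of nofollow at the page level and on an individual link, neither of which prevents the page from being indexed (the URL is illustrative):

<meta name="robots" content="nofollow">
<a href="https://example.com/page/" rel="nofollow">anchor text</a>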

  • Noindex, wait for Google to crawl, then block crawling

There are a couple of ways this usually happens:

  • The pages are already blocked, but are indexed; people add noindex and unblock so Google can crawl and see the noindex; then they block the pages from being crawled again.
  • People add noindex tags to pages they want to remove and after Google has crawled and processed the noindex tag, they block the pages from being crawled.

Either way, the final state is a page blocked from crawling. Even though these pages are blocked, they can still end up in the index.
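To illustrate the conflicting end state (the path is hypothetical): once robots.txt blocks the page, Google can no longer fetch it to see the noindex tag, so the tag has no effect:

In robots.txt:
User-agent: *
Disallow: /old-page/

On /old-page/, now invisible to Google:
<meta name="robots" content="noindex">

If you use noindex, leave the page crawlable until it has actually dropped out of the index.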

 

Deciding how to remove URLs is quite situational, and the right process differs from case to case. You need to know what outcome you want and which steps will get you there.

 
