Getting every page indexed in Google can be a challenge for some website owners. But if Google is not indexing every page in your sitemap, is that a problem, or is it simply how Google normally handles sitemaps?
The question came up recently in a Google Webmaster Office Hours session, where a site owner asked why Google wasn't indexing all the pages on his site that he had submitted via sitemaps, and noted that Search Console doesn't show which of those pages are missing. The answer from Google:
Yes, that's true. In Search Console we provide information about how many URLs in a sitemap are indexed, but not which ones specifically. For the most part, that's not something you need to worry about. It's completely normal that we don't index every URL we find, and that's not something you need to artificially inflate.
The only thing I'd take into account, of course, is whether the pages that are really important to your website are actually indexed, but you tend to notice that pretty quickly, because these are the pages that should be sending traffic.
That being said, it is worth looking at the percentage of submitted pages that are indexed versus those that are not.
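As a rough illustration of that comparison, here is a minimal Python sketch. The sitemap names and the submitted/indexed counts are made-up figures that you would copy by hand from the Search Console sitemap report; the function name is hypothetical.

```python
def indexed_ratio(submitted: int, indexed: int) -> float:
    """Return the share of submitted URLs that are indexed, as a percentage."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    return 100.0 * indexed / submitted


# Hypothetical (submitted, indexed) counts per sitemap, entered by hand.
reports = {
    "sitemap-posts.xml": (1200, 1080),
    "sitemap-tags.xml": (400, 90),
}

for name, (submitted, indexed) in reports.items():
    print(f"{name}: {indexed_ratio(submitted, indexed):.1f}% indexed")
```

A low percentage on its own is not necessarily a problem; it is mainly a hint about where to look first.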
For example, suppose your WordPress site has a large number of tags and you are trying to get all of the tag pages indexed, but many of them show the same or nearly the same content (say, WordPress created separate tag pages for both “book” and “Books”, with the same posts tagged in both). Google would probably filter out at least one of those pages as a duplicate.
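One way to spot this kind of overlap before Google does is to group tags by the set of posts they contain. The sketch below assumes you have already exported a mapping from tag name to post IDs (the `tag_posts` data and the function name are hypothetical, not part of WordPress). It only catches exact matches, not "almost the same" pages.

```python
from collections import defaultdict


def find_duplicate_tags(tag_posts: dict[str, set[int]]) -> list[list[str]]:
    """Group tags whose tag pages would list exactly the same posts."""
    groups = defaultdict(list)
    for tag, posts in tag_posts.items():
        # frozenset is hashable, so identical post sets share one key.
        groups[frozenset(posts)].append(tag)
    # Keep only the groups where more than one tag shares the same posts.
    return [sorted(tags) for tags in groups.values() if len(tags) > 1]


# Hypothetical export: tag name -> IDs of the posts carrying that tag.
tag_posts = {
    "book": {101, 102, 103},
    "Books": {101, 102, 103},
    "travel": {104},
}

for group in find_duplicate_tags(tag_posts):
    print("Same content:", ", ".join(group))
```

Each reported group is a candidate for merging the tags or keeping only one of the pages in the sitemap.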
Barring technical problems, most pages that get filtered out are filtered simply because they are duplicates of, or equivalent to, pages that are already indexed. In these cases, it's worth checking whether those pages would be better handled with a canonical tag pointing to the preferred version.
Also watch for cases where new pages are not being indexed after they are added to the sitemap, although many site owners will notice this without needing the sitemap indexing numbers.
If you have a very large site and you're trying to figure out why Google isn't indexing large parts of it, you can split your sitemap to isolate the problems. For example, you can split it by product type or page type. Because Search Console reports indexed counts per sitemap, this makes it easier to identify which sections have indexing issues so they can be fixed.
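The splitting step can be sketched in Python as follows. This is only one possible approach, assuming that the first path segment of each URL (for example `/products/` or `/blog/`) is a good stand-in for the page type; the example URLs, function names, and output file names are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def group_by_section(urls):
    """Group URLs by their first path segment; top-level pages go to 'root'."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        section = segments[0] if len(segments) > 1 else "root"
        groups[section].append(url)
    return dict(groups)


def build_sitemap(urls):
    """Return a minimal sitemap XML document (as a string) for the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URL list for illustration.
urls = [
    "https://example.com/",
    "https://example.com/products/widget",
    "https://example.com/products/gadget",
    "https://example.com/blog/hello",
]

for section, section_urls in group_by_section(urls).items():
    xml_text = build_sitemap(section_urls)  # write this to sitemap-<section>.xml
    print(f"sitemap-{section}.xml: {len(section_urls)} URLs")
```

Submitting each resulting file separately lets you compare the indexed counts section by section instead of for the site as a whole.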