Dj Techblog

A blog on web development


Page is not indexed by Google: Possible reasons

Last modified on Apr 15, 2021

For a webpage to appear in Google search results, it needs to be indexed by Google. The indexing process starts with discovery and crawling: once Googlebot finds a URL (through a sitemap or links from other pages), it crawls the page, and after the page has been crawled, Google can index it. However, there are several reasons why a webpage may not be indexed by Google.

We will discuss the possible reasons that prevent a page from being indexed, and how to fix them so the page gets indexed and appears in search results.

Reasons a webpage is not indexed by Google:

  1. Page URL not submitted in the sitemap
  2. Blocked by robots.txt
  3. Blocked by noindex tag
  4. Duplicate content issue
  5. Blocked due to access forbidden (403)

1. Page URL not submitted in the sitemap

A sitemap is the link structure of a website. It lists the URLs of all the webpages the site has. Search engine crawlers visit the sitemap from time to time and index the webpages they find in it. But if a specific URL is not included in the sitemap, crawlers may never reach that webpage, and the page remains unvisited. In that case crawlers will not index the page unless the URL is added to the sitemap.

However, in some cases the page might still get indexed if it is reachable via other pages. That is, the crawler can reach the page even though it is not included in the sitemap. In that case, Search Console will show the message "Indexed, not submitted in sitemap".
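For reference, a minimal sitemap.xml looks like this (the URL and date below are placeholders); every page you want crawled should get its own <url> entry:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <!-- placeholder URL: replace with the page you want indexed -->
        <loc>https://www.example.com/my-page.html</loc>
        <lastmod>2021-04-15</lastmod>
      </url>
    </urlset>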

2. Blocked by robots.txt

robots.txt is a text file containing instructions that well-behaved search engine bots/crawlers follow. We can use this file to control how crawlers crawl the pages of a website. If a certain page has been blocked by robots.txt, that page will not be crawled by the crawlers, and consequently it will not be indexed. So web developers and SEO beginners should use this file carefully: any misconfiguration might block a URL, or in the worst case the entire website, from the crawlers.
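As an illustration, the following robots.txt rules (the /private/ path is a made-up example) block every crawler from one directory while leaving the rest of the site crawlable:

    # Applies to all crawlers
    User-agent: *
    # Nothing under /private/ will be crawled, so those pages will not be indexed
    Disallow: /private/
    # A stray "Disallow: /" here would block the entire site - the worst-case misconfiguration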

You can test whether a page or resource is blocked by a robots.txt rule using the robots.txt Tester in Google Search Console.

3. Blocked by noindex tag

Noindex is an HTML meta tag placed in the head section of a webpage. It is a directive telling search engine crawlers not to index the page. If you have used a noindex tag inside the head section of a webpage, that page will not be indexed by Googlebot. Note that the crawler must be able to fetch the page in order to see the tag, so a page carrying noindex should not also be blocked in robots.txt.
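In practice the tag looks like this; placed in the page's head, it applies to all crawlers (use name="googlebot" instead to target Googlebot alone):

    <head>
      <!-- directive to all search engine crawlers: do not index this page -->
      <meta name="robots" content="noindex">
    </head>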

4. Duplicate content issue

Duplicate content is a common issue that might keep some pages of your website out of the index, so choose your content wisely. Even if you do have duplicate content, you can handle it with the canonical tag: set the original page as the canonical, and the duplicate pages are treated as alternates.
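A minimal sketch (the URL is a placeholder): on each duplicate page, point a canonical link tag at the original version, and search engines will consolidate indexing signals onto that URL:

    <head>
      <!-- tells search engines which version of this content to index -->
      <link rel="canonical" href="https://www.example.com/original-page.html">
    </head>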

Related: Link tag rel=canonical - The ultimate Guide to SEO

5. Blocked due to access forbidden (403)

If certain pages on your website require authorization, the server answers Googlebot with 403 Forbidden and the pages cannot be crawled. If crawlers cannot crawl a page, they cannot index it, so the page remains blocked.
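As one hypothetical example, an Apache .htaccess rule like the following returns 403 to every client, including Googlebot, so anything it covers stays out of the index:

    # Apache 2.4 .htaccess: deny all requests to this directory with 403 Forbidden
    Require all denied

If such pages should appear in search results, serve a crawlable public version instead of putting them behind authorization.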

Published on Apr 15, 2021

