🚨 What Happens if Disallow and Noindex are Used Simultaneously?
If you want to remove a page from Google’s index entirely, combining the noindex and disallow directives in the wrong order can be a major mistake.
✅ Correct Approach:
- Add a noindex tag to the page and keep it crawlable, so search engines can actually fetch the page and see the tag (see the sketch after this list).
- Verify in Google Search Console that the page has been crawled, the noindex has been processed, and the URL has dropped out of the index.
- Only once the page has been removed from the index should you add a disallow rule to block further crawling and conserve crawl budget.
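A minimal sketch of the correct setup (the path /old-landing-page and the domain are placeholders): the page itself carries the noindex directive, and robots.txt leaves it crawlable so Googlebot can fetch the page and process the tag.

```html
<!-- In the <head> of the page you want removed from the index -->
<meta name="robots" content="noindex">
```

```
# robots.txt while the page is being deindexed:
# nothing blocks the URL, so Googlebot can still fetch it and see the noindex.
User-agent: *
Disallow:
```

Only after Search Console confirms the URL has dropped out of the index would you add a line such as `Disallow: /old-landing-page` to save crawl budget.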
❌ Incorrect Implementation (Error Scenario):
- If a page is both tagged with noindex and disallowed via robots.txt, Google will not be able to crawl the page.
- Because it cannot crawl the page, the noindex tag will not be seen and Google will not remove the page from the index.
- As a result, Google may continue to index the page (for example, because other pages link to it), and Search Console may report the status “Indexed, though blocked by robots.txt” (see the sketch below).
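For contrast, here is a sketch of the broken setup (again with a placeholder path): robots.txt prevents Googlebot from ever fetching the page, so the noindex tag in its HTML is never read.

```
# robots.txt: the rule below blocks crawling of the page
User-agent: *
Disallow: /old-landing-page
```

```html
<!-- /old-landing-page still contains this tag, but no crawler ever sees it -->
<meta name="robots" content="noindex">
```

If other sites or internal pages link to the URL, Google can keep it in the index based on those links alone, which is exactly the situation the “Indexed, though blocked by robots.txt” status describes.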
🔎 A Real-Life Example:
While reviewing a website with a site:domain.com search, I noticed that some pages, despite being tagged with noindex, were still indexed. The reason was that those pages were disallowed in the robots.txt file, which completely blocked bots from accessing them.
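A quick way to catch this conflict at scale is to cross-check robots.txt against each page’s markup. The sketch below uses only the Python standard library; the domain and URL list are hypothetical, and the noindex detection is deliberately naive (a thorough audit would also parse the X-Robots-Tag response header).

```python
# Sketch: flag URLs that carry a noindex tag but are also blocked by robots.txt,
# meaning Googlebot can never see the noindex and the page may stay indexed.
from urllib import request, robotparser

SITE = "https://example.com"                     # hypothetical domain
PAGES = ["/old-landing-page", "/tag/archive"]    # hypothetical URLs to audit

rp = robotparser.RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()

for path in PAGES:
    url = SITE + path
    blocked = not rp.can_fetch("Googlebot", url)

    # Naive check: look for a robots meta tag containing "noindex" in the HTML.
    html = request.urlopen(url).read().decode("utf-8", errors="ignore").lower()
    has_noindex = 'name="robots"' in html and "noindex" in html

    if blocked and has_noindex:
        print(f"CONFLICT: {url} is disallowed, so its noindex tag is never crawled")
    elif has_noindex:
        print(f"OK: {url} is crawlable; the noindex can be processed and the page deindexed")
```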
📌 Conclusion:
👉 First, allow Google to access the page so that it processes the noindex tag; only after the page has been deindexed should you add disallow!
👉 Otherwise, your pages might not be removed from the index as expected, and your SEO strategy could be negatively affected.