The Robots.txt Mistake That Cost Me Visitors

Three months after launching my first multilingual Next.js application, I opened Google Search Console expecting to see healthy indexing numbers. Instead, I found a graveyard: 156 pages crawled, only 23 indexed. The rest? “Crawled but not indexed.”

I spent two weeks debugging. The site looked perfect in the browser. The code was clean. Internationalization worked flawlessly. But search engines were treating my carefully crafted multilingual site like broken HTML from 1999—no styling, no images, just plain text.

The culprit? A single line in my robots.txt:

Disallow: /_next/

That one line was blocking all JavaScript, CSS, and static assets. Google couldn’t render the pages properly, so it simply refused to index them. Fixing it took 30 seconds. Getting Google to re-crawl everything? Another month.

Let me show you how to avoid this disaster.

Why This Happens

When you block /_next/ entirely, you’re telling search engines: “Don’t access anything in this directory.” Problem is, Next.js stores ALL your critical assets there:

  • /_next/static/ - Your JavaScript bundles and CSS files
  • /_next/image/ - Optimized images
  • /_next/data/ - Data for dynamic pages

Without these, Google sees your beautiful React app as a blank page with unstyled HTML. And blank pages don’t get indexed.
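
Whether the rule lives in a static public/robots.txt or gets generated from code, it usually looks deceptively harmless. Here's a minimal sketch of the kind of app/robots.ts that produces it (illustrative, not my exact file):

import type { MetadataRoute } from 'next';

// ⚠️ The broken version: one blanket rule hides every /_next/ asset from crawlers.
export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: '*',
        allow: '/',
        // The last entry also blocks /_next/static/ and /_next/image/,
        // so Google can't load your JavaScript, CSS, or optimized images.
        disallow: ['/api/', '/_next/'],
      },
    ],
  };
}

Nothing about it screams "this will stop Google from rendering your pages", which is exactly why it survives code review.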

The Evidence

I’ll never forget searching for my own site and seeing it ranked on page 3—not because of competition, but because Google’s preview showed a wall of unstyled text that looked like a broken website from 2005. Nobody clicks on that.

Here’s what was happening:

Before (my mistake):

  • 23 out of 156 pages indexed (15% index rate)
  • Search engines see unstyled HTML
  • Poor user experience in search results
  • Rankings suffer even when content is good
  • Weeks of wasted crawl budget

After (proper configuration):

  • 156 of 156 pages indexed within 6 weeks (100% index rate)
  • Pages render correctly with full styling
  • JavaScript and CSS load properly
  • Images display correctly
  • Search results look professional and clickable

The Solution

Create app/robots.ts with precise allow/disallow rules:

import type { MetadataRoute } from 'next';

const SITE_URL = process.env.SITE_URL ?? 'https://yoursite.com';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: '*',
        allow: [
          '/',
          '/images/',
          '/assets/',
          '/favicon.ico',
          // ✅ CRITICAL: Allow static assets for proper rendering
          '/_next/static/',
          '/_next/image/',
        ],
        disallow: [
          '/admin',
          '/api/',
          '/private/',
          '/scripts/',
          '/logs/',
          // ✅ Only block _next/data directory (not needed for SEO)
          '/_next/data/',
          // Block build manifest files
          '/*_buildManifest.js',
          '/*_middlewareManifest.js',
          '/*_ssgManifest.js',
        ],
      }
    ],
    sitemap: `${SITE_URL}/sitemap.xml`,
    host: SITE_URL,
  };
}

The key differences:

  1. Allow /_next/static/ - JavaScript and CSS can load
  2. Allow /_next/image/ - Images render properly
  3. Block only /_next/data/ - This is safe to block (internal data fetching)
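
One thing the solution above quietly assumes: SITE_URL is actually set in your deployment environment. If it isn't, the sitemap line falls back to the placeholder domain and points Google at a sitemap that doesn't exist. A minimal sketch of the env entry (the file name and value are illustrative; set it however your host manages environment variables):

# .env.production (or your hosting provider's environment settings)
SITE_URL=https://yoursite.com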

How to Test Your Fix

Don’t trust, verify. Here’s how to confirm your robots.txt is correct:

# Local testing
npm run dev
curl http://localhost:3000/robots.txt

You should see:

User-Agent: *
Allow: /
Allow: /_next/static/
Allow: /_next/image/
Disallow: /_next/data/
...
Sitemap: https://yoursite.com/sitemap.xml

Critical checks (the sketch below automates them):

  • ✅ /_next/static/ is in the Allow list
  • ✅ /_next/image/ is in the Allow list
  • ✅ /_next/data/ is the only /_next/ path that is disallowed
  • ✅ The Sitemap URL is present
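
The sketch below automates those checks: it fetches the generated file and fails loudly when a critical rule is missing (check-robots.ts, the default URL, and the rule lists are my own placeholders; it needs Node 18+ or any runtime with a global fetch):

// check-robots.ts - quick sanity check for the generated robots.txt
// Usage: npx tsx check-robots.ts [url]   (defaults to the local dev server)

const url = process.argv[2] ?? 'http://localhost:3000/robots.txt';

const mustAllow = ['/_next/static/', '/_next/image/'];
const mustDisallow = ['/_next/data/'];

async function main(): Promise<void> {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Failed to fetch ${url}: HTTP ${res.status}`);

  const lines = (await res.text()).split('\n').map((line) => line.trim());
  const problems: string[] = [];

  for (const path of mustAllow) {
    if (!lines.includes(`Allow: ${path}`)) problems.push(`Missing "Allow: ${path}"`);
  }
  for (const path of mustDisallow) {
    if (!lines.includes(`Disallow: ${path}`)) problems.push(`Missing "Disallow: ${path}"`);
  }
  // The single most dangerous line: a blanket block on /_next/
  if (lines.includes('Disallow: /_next/')) problems.push('Found blanket "Disallow: /_next/"');
  if (!lines.some((line) => line.startsWith('Sitemap:'))) problems.push('No Sitemap line found');

  if (problems.length > 0) {
    console.error(`robots.txt problems at ${url}:\n- ${problems.join('\n- ')}`);
    process.exit(1);
  }
  console.log(`robots.txt at ${url} looks good.`);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Run it against localhost before you deploy and against the production URL right after, so a bad rule never sits live for three months the way mine did.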

After You Deploy

Once you’ve deployed the fix:

  1. Request re-indexing in Google Search Console
    • Go to URL Inspection
    • Enter your homepage URL
    • Click “Request Indexing”
    • Repeat for your top 10 most important pages
  2. Monitor the recovery (be patient!)
    • Week 1-2: Google starts re-crawling
    • Week 3-4: New pages getting indexed
    • Week 5-8: Full recovery of indexing

My timeline: 23 pages indexed → 60 pages (2 weeks) → 100+ pages (4 weeks) → 156 pages (6 weeks)

The Real Cost

Why do I say this cost me visitors? Let’s do the math:

  • 133 pages not indexed for 3 months
  • Average 2-3 organic visitors per page per day (conservative estimate)
  • 133 pages × 2.5 visitors × 90 days = ~30,000 lost visitors

Even with a conservative 5% conversion rate to email signups or inquiries, that’s 1,500 lost leads. If even 1% of those would have converted to customers… you get the idea.

The opportunity cost of broken SEO is brutal.

Don’t Make My Mistake

This is one of the easiest SEO wins you can get. It takes 5 minutes to implement and can save you months of confusion and lost traffic.

Check your robots.txt right now. If you see a blanket Disallow: /_next/ instead of targeted rules for its subdirectories, you're blocking the assets Google needs to render and index your pages.

Fix it. Deploy it. Request re-indexing. Move on.

What’s Next?

Fixing robots.txt solves the “can Google see my site?” problem. But that’s just the foundation. Next, you need to make sure Google can discover all your pages efficiently.

In my next post, I’ll show you how to build dynamic multilingual sitemaps that actually get all your content indexed—including the location pages that drove 60% of my SEO growth.

Read next: Building Dynamic Multilingual Sitemaps in Next.js


Full SEO Series for Next.js:

  1. The Robots.txt Mistake That Cost Me Visitors (you are here)
  2. Building Dynamic Multilingual Sitemaps in Next.js


Mina Nabil Sami

Senior software engineer with 16+ years building scalable backend systems. Expert in PHP (Symfony, Laravel), Node.js, AWS, and DevOps. Technical blog, mentorship, and consulting.