# Scaling Next.js Apps to 10K Concurrent Users
Most Next.js apps never need to think about scale. The defaults are remarkably good. But when your SaaS starts landing enterprise accounts or your marketing campaign goes viral, the cracks show fast.
We've scaled three client apps past the 10K concurrent user mark in the last year. Here's the exact playbook we use.
## Where Next.js Apps Break First

It's almost never where people expect, and it's almost never the framework: Next.js handles routing, rendering, and bundling efficiently out of the box. The bottlenecks we see repeatedly:
- **Unoptimized database queries** — N+1 queries on listing pages
- **Missing cache layers** — every request hits the origin
- **Oversized client bundles** — shipping 400KB of JavaScript for a dashboard
- **Unthrottled API routes** — no rate limiting, no connection pooling
## Layer 1: Rendering Strategy
The single biggest performance lever is choosing the right rendering mode per page.
| Page Type | Strategy | Why |
| --------------------- | ----------------- | ------------------------------------- |
| Marketing / Landing | Static (SSG) | Zero compute per request |
| Dashboard | Server Components | Fresh data, no client JS |
| Interactive widgets | Client Components | Minimal, lazy-loaded |
| User-specific content | PPR | Static shell + streamed dynamic parts |
Next.js 16's Partial Prerendering (PPR) is transformative here. Your authenticated dashboard can have a static shell that loads in milliseconds while user-specific data streams in.
```tsx
import { Suspense } from 'react'

// Layout renders instantly as static HTML.
// Only the Suspense boundary triggers server work.
export default function DashboardLayout({ children }: { children: React.ReactNode }) {
  return (
    <div className="grid grid-cols-[240px_1fr]">
      <Sidebar /> {/* Static — no Suspense needed */}
      <Suspense fallback={<DashboardSkeleton />}>
        {children} {/* Dynamic — streamed per-user */}
      </Suspense>
    </div>
  )
}
```
## Layer 2: Database Optimization

PostgreSQL handles 10K concurrent connections poorly with default settings. Here's our standard setup:
**Connection pooling is mandatory.** We use Supabase's built-in PgBouncer. For self-hosted setups, configure PgBouncer in transaction mode.
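For reference, a minimal `pgbouncer.ini` sketch for transaction mode. The host, pool sizes, and auth file path are illustrative values, not tuned recommendations:

```ini
[databases]
; Route the logical "app" database through the pooler
app = host=127.0.0.1 port=5432 dbname=app

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction      ; reuse server connections between transactions
default_pool_size = 20       ; server connections per user/database pair
max_client_conn = 10000      ; client connections can far exceed the server pool
```

One caveat with transaction mode: session state (`SET`, advisory locks, session-level prepared statements) doesn't carry across transactions, so verify your database driver is compatible before switching.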
**Indexed queries only.** Every query that appears in an API route must have a covering index. We enforce this in code review.
```sql
-- Before: full table scan on every dashboard load
SELECT * FROM projects WHERE client_id = $1 ORDER BY updated_at DESC;

-- After: composite index covers the exact query
CREATE INDEX idx_projects_client_updated
  ON projects (client_id, updated_at DESC);
```
**Pagination everywhere.** No endpoint returns more than 50 rows. Cursor-based pagination for anything user-facing.
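A sketch of cursor-based (keyset) pagination over the `projects` table above. The helper names and the base64 cursor encoding are illustrative choices, not a library API:

```ts
// Keyset pagination: the cursor encodes the last row's (updated_at, id),
// so the next page resumes exactly after it even if rows were inserted
// or deleted in between. Helper names here are illustrative.
type Cursor = { updatedAt: string; id: number }

function encodeCursor(c: Cursor): string {
  return Buffer.from(JSON.stringify(c)).toString('base64url')
}

function decodeCursor(s: string): Cursor {
  return JSON.parse(Buffer.from(s, 'base64url').toString('utf8'))
}

// Builds the SQL and parameters for one page of results.
function keysetQuery(clientId: number, cursor?: string) {
  if (!cursor) {
    return {
      sql: `SELECT * FROM projects WHERE client_id = $1
            ORDER BY updated_at DESC, id DESC LIMIT 50`,
      params: [clientId] as (string | number)[],
    }
  }
  const c = decodeCursor(cursor)
  return {
    sql: `SELECT * FROM projects WHERE client_id = $1
            AND (updated_at, id) < ($2, $3)
          ORDER BY updated_at DESC, id DESC LIMIT 50`,
    params: [clientId, c.updatedAt, c.id] as (string | number)[],
  }
}
```

The `(updated_at, id)` tuple comparison keeps ordering stable when two rows share a timestamp; to serve it, you'd extend the composite index with `id` as a trailing column.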
## Layer 3: Caching Architecture

We implement caching at three levels:

- **CDN edge cache** — Vercel's edge network caches static assets and ISR pages automatically
- **Application cache** — `unstable_cache` (or `use cache` in Next.js 16) for expensive server computations
- **Client cache** — React Query or SWR for client-side state
```ts
import { unstable_cache } from 'next/cache'

const getProjectMetrics = unstable_cache(
  async (projectId: string) => {
    // Expensive aggregation query
    return db.query(
      `SELECT count(*), avg(completion_time)
       FROM tasks WHERE project_id = $1`,
      [projectId]
    )
  },
  ['project-metrics'],
  { revalidate: 300 } // 5-minute cache
)
```
## Layer 4: Bundle Optimization

Next.js 16 with Turbopack handles tree-shaking well, but you still need discipline:

- Lazy load below-the-fold sections with `dynamic(() => import(...))`
- Audit your dependencies — run `npx @next/bundle-analyzer` regularly
- Use server-only packages — heavy libraries (date-fns, lodash) should stay server-side
- Image optimization — the Next.js Image component with a proper `sizes` prop

Our target: under 150KB of first-load JS for any page.
## Layer 5: API Route Hardening

Every API route in production needs:
```ts
import { headers } from 'next/headers'
import { NextResponse } from 'next/server'

// Note: an in-memory Map is per-instance. On serverless deployments,
// back this with a shared store such as Redis.
const RATE_LIMIT = new Map<string, { count: number; reset: number }>()
const MAX_REQUESTS = 100
const WINDOW_MS = 60_000 // 1 minute

export async function POST(request: Request) {
  const headersList = await headers()
  const ip = headersList.get('x-forwarded-for') ?? 'unknown'

  // Fixed-window rate limiting
  const now = Date.now()
  const limit = RATE_LIMIT.get(ip)
  if (!limit || now >= limit.reset) {
    RATE_LIMIT.set(ip, { count: 1, reset: now + WINDOW_MS })
  } else if (++limit.count > MAX_REQUESTS) {
    return NextResponse.json({ error: 'Too many requests' }, { status: 429 })
  }

  // ... handle request
}
```
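The fixed-window logic in a route handler like this can be extracted into a pure function so it's unit-testable without a `Request` object. The function name and the 100-requests-per-minute budget are illustrative:

```ts
// A pure fixed-window rate limiter. Returns true if the request is allowed.
type RateWindow = { count: number; reset: number }

function checkRateLimit(
  store: Map<string, RateWindow>,
  ip: string,
  now: number,
  max = 100,        // requests allowed per window
  windowMs = 60_000 // window length: 1 minute
): boolean {
  const entry = store.get(ip)
  if (!entry || now >= entry.reset) {
    // First request, or the previous window expired: start a fresh window
    store.set(ip, { count: 1, reset: now + windowMs })
    return true
  }
  entry.count++
  return entry.count <= max
}
```

Fixed windows allow short bursts at window boundaries; if that matters for your API, a sliding-window or token-bucket variant smooths it out at the cost of a little more bookkeeping.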
## Results We've Seen
After applying this playbook to a B2B SaaS dashboard:
- Time to First Byte: 2.1s → 180ms
- Largest Contentful Paint: 4.2s → 1.1s
- API p95 latency: 800ms → 120ms
- Monthly Vercel bill: $340 → $89 (fewer serverless invocations)
## When to Call for Help
If your app is growing fast and you're hitting walls, book a 15-minute architecture review. We'll identify your top 3 bottlenecks and outline a fix — no strings attached.
Austin Coders
We build SaaS & AI apps that actually scale. React, Next.js, and AI-powered solutions for startups and enterprises.