The Web App Architecture Decisions That Determine Your Scalability Ceiling

The most expensive line in a web app's history is usually not a line of code. It's an architecture decision made in week one that seemed reasonable at the time and revealed itself as a constraint at 50,000 users.

We've seen this play out repeatedly: a product launches on a monolithic Django or Rails app, scales to a point where every deploy feels risky, and then the team spends six months extracting services, refactoring data models, and migrating infrastructure while simultaneously trying to ship features. The rewrite nobody wanted to do.

Most of it is preventable with better decisions at the start. Not by over-engineering — by making the right tradeoffs with the scale trajectory in mind.

The Rendering Decision: SSR, SSG, CSR, or ISR

This is the decision that surprises teams most when they get it wrong, because the consequences are invisible at low traffic and catastrophic at high traffic.

Client-side rendering (CSR), the default for create-react-app and most SPAs, sends a near-empty HTML shell and renders everything in the browser via JavaScript. Fast to build, expensive at scale. Problems: poor Core Web Vitals (no content in the initial HTML), weak SEO for content-heavy pages, large JavaScript bundles, and slow time-to-content when your API is slow, because nothing meaningful paints until the data arrives.

Server-side rendering (SSR) — Next.js renders HTML on the server per request. Good for personalized or dynamic pages that need to be indexed by search engines. The cost is server compute per request. At 100k concurrent users, SSR infrastructure cost is non-trivial.

Static site generation (SSG) — pages are rendered at build time. Zero server compute on the hot path. Excellent for marketing pages, documentation, and content that doesn't change per-user. CDN-cached, globally fast. The cost is build time as your page count grows and the constraint is that you can't easily show user-specific content.

Incremental static regeneration (ISR) — Next.js renders pages statically and revalidates them on a schedule or on-demand. The best of SSR and SSG for most SaaS products: marketing pages are static, product pages are revalidated when data changes. Handles most use cases without per-request server compute.
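To make the revalidation idea concrete, here is a minimal sketch of a stale-while-revalidate page cache. It is not how Next.js implements ISR; `IsrCache`, `PageRenderer`, and the explicit `now` argument are hypothetical names chosen for illustration, and regeneration happens inline rather than in the background.

```typescript
// Hypothetical sketch of the idea behind ISR, not a Next.js API.
type PageRenderer = (path: string) => string;

interface CachedPage {
  html: string;
  renderedAt: number; // seconds
}

class IsrCache {
  private pages = new Map<string, CachedPage>();
  private render: PageRenderer;
  private revalidateSeconds: number;
  renderCount = 0;

  constructor(render: PageRenderer, revalidateSeconds: number) {
    this.render = render;
    this.revalidateSeconds = revalidateSeconds;
  }

  // `now` is passed in so the behaviour is deterministic.
  get(path: string, now: number): string {
    const cached = this.pages.get(path);
    if (cached === undefined) {
      // First request for this path: render once and cache the result.
      return this.regenerate(path, now);
    }
    if (now - cached.renderedAt >= this.revalidateSeconds) {
      // Stale: refresh the cache for the next visitor, but still
      // answer this request with the HTML that was already cached.
      this.regenerate(path, now);
    }
    return cached.html;
  }

  private regenerate(path: string, now: number): string {
    this.renderCount += 1;
    const html = this.render(path);
    this.pages.set(path, { html, renderedAt: now });
    return html;
  }
}
```

The property worth noticing is that the number of renders depends on the revalidation window, not on the number of requests: a page hit a million times inside one window is rendered once.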

The decision matrix for a typical SaaS product:

| Page type                      | Rendering strategy       |
| ------------------------------ | ------------------------ |
| Marketing / landing pages      | SSG                      |
| Blog, documentation            | SSG or ISR               |
| Dashboard, user-specific views | CSR (behind auth) or SSR |
| E-commerce product pages       | ISR                      |
| Search results                 | SSR or CSR               |
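One way to keep this matrix from living only in people's heads is to encode it as a small function. The sketch below is a simplification under stated assumptions: the `PageProfile` fields and `chooseRendering` are hypothetical, and where the matrix allows two options the function picks one.

```typescript
type RenderingStrategy = "SSG" | "ISR" | "SSR" | "CSR";

// Hypothetical description of a page, for illustration only.
interface PageProfile {
  perUser: boolean;       // content differs per signed-in user
  needsIndexing: boolean; // must be crawlable by search engines
  changesOften: boolean;  // data changes between deploys
  perRequest: boolean;    // output depends on request input, e.g. a search query
}

function chooseRendering(page: PageProfile): RenderingStrategy {
  if (page.perUser) {
    // Behind auth there is nothing to index, so CSR is usually enough.
    return page.needsIndexing ? "SSR" : "CSR";
  }
  if (page.perRequest) return "SSR";
  // Shared content: static, with revalidation only if the data changes.
  return page.changesOften ? "ISR" : "SSG";
}
```

For example, a product page that is public, indexable, and backed by changing inventory data maps to `"ISR"`, while a marketing page that changes only on deploy maps to `"SSG"`.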

Getting this right from the start means your Core Web Vitals pass without heroic optimization later, your SEO works without a separate static site, and your server costs track the pages that genuinely need per-request compute rather than every page view.

The Data Layer: When a Database Becomes a Bottleneck

Most early-stage web apps use a single Postgres instance and it works fine. Until it doesn't.

The failure modes are predictable:

Read-heavy workloads without read replicas. Analytics dashboards, activity feeds, reporting queries: all on the primary. Heavy analytical queries compete with transactional traffic for CPU, I/O, and cache, slow down writes, and cause cascading latency across your product during peak usage. The fix is read replicas, with application-level routing that sends reads to a replica and writes to the primary.
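Application-level routing can be as small as one function that decides which pool a statement goes to. The sketch below is an assumption-laden illustration: `routeQuery`, `PoolName`, and the `requireFresh` option are hypothetical, and real routing usually lives in the ORM or driver configuration. It routes conservatively, so anything that is not a plain `SELECT` goes to the primary.

```typescript
type PoolName = "primary" | "replica";

interface QueryOptions {
  // Set when the caller must read its own just-committed write,
  // since replicas can lag behind the primary.
  requireFresh?: boolean;
}

function routeQuery(sql: string, options: QueryOptions = {}): PoolName {
  const isSelect = /^\s*select\b/i.test(sql);
  // SELECT ... FOR UPDATE / FOR SHARE takes row locks and needs the primary.
  const takesLocks = /\bfor\s+(update|share)\b/i.test(sql);
  if (!isSelect || takesLocks || options.requireFresh === true) {
    return "primary";
  }
  return "replica";
}
```

The `requireFresh` escape hatch matters in practice: a user who has just saved a setting and is redirected to a page that reads it back should not see the old value because of replication lag.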

Unindexed columns used in WHERE clauses. A query that takes 2ms at 10,000 rows can take 200ms at 1,000,000 rows when the filtered column has no index, because a sequential scan grows linearly with table size. Index decisions made (or not made) in week one become performance incidents in year two.
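The gap between a scan and an indexed lookup is easy to see by counting comparisons. The sketch below is only an analogy: it uses binary search over a sorted array as a stand-in for a database index (Postgres uses B-trees by default), and both function names are hypothetical.

```typescript
// Full scan: look at rows one by one until the target is found.
function scanComparisons(rows: number[], target: number): number {
  let comparisons = 0;
  for (const value of rows) {
    comparisons += 1;
    if (value === target) break;
  }
  return comparisons;
}

// Indexed lookup, approximated by binary search over sorted values.
function indexComparisons(sorted: number[], target: number): number {
  let lo = 0;
  let hi = sorted.length - 1;
  let comparisons = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    const value = sorted[mid] as number;
    comparisons += 1;
    if (value === target) break;
    if (value < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return comparisons;
}
```

Over one million sorted values, finding the last one costs one million comparisons with the scan and at most twenty with the binary search. Growing the table 100x multiplies the first number by 100 and adds only a handful of steps to the second.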

Synchronous blocking operations for background work. Email sends, webhook deliveries, PDF generation — if these happen synchronously in a request handler, they contribute directly to user-facing latency and consume database connections while waiting. The fix is a job queue (BullMQ, Sidekiq, Celery) that processes these asynchronously.
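The shape of the fix is that the request handler only records the work and a separate worker performs it. The sketch below uses a hypothetical in-memory queue (`InMemoryQueue`, `handleSignup`, `EmailJob`) purely to show that shape; a production system would use a durable queue such as the ones named above, with retries and persistence.

```typescript
interface EmailJob {
  to: string;
  template: string;
}

// Stand-in for a real job queue; jobs are lost if the process exits.
class InMemoryQueue<T> {
  private jobs: T[] = [];

  enqueue(job: T): void {
    this.jobs.push(job);
  }

  // Called by a worker process, never by the request handler.
  drain(handler: (job: T) => void): number {
    let processed = 0;
    let job = this.jobs.shift();
    while (job !== undefined) {
      handler(job);
      processed += 1;
      job = this.jobs.shift();
    }
    return processed;
  }

  get pending(): number {
    return this.jobs.length;
  }
}

const emailQueue = new InMemoryQueue<EmailJob>();

function handleSignup(email: string): { status: number } {
  // The user record would be persisted here. The email is only queued,
  // so the response does not wait on the mail provider.
  emailQueue.enqueue({ to: email, template: "welcome" });
  return { status: 201 };
}
```

The handler's latency is now independent of how slow the email provider is, and a provider outage turns into a growing queue rather than failed signups.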

Missing connection pooling. A Next.js API route running on serverless infrastructure can open a new database connection for every function instance, and a traffic spike creates many instances at once. Without connection pooling (PgBouncer or Prisma Accelerate), that spike can exhaust your Postgres connection limit and cause failures across every API endpoint simultaneously.
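What a pooler buys you is a hard ceiling on open connections plus reuse of the ones already open. The sketch below shows only that core idea with hypothetical names (`BoundedPool`, `DbConnection`); it is not how PgBouncer or Prisma Accelerate are implemented, and a real pool would make callers wait instead of returning `null`.

```typescript
interface DbConnection {
  id: number;
}

class BoundedPool {
  private idle: DbConnection[] = [];
  private opened = 0;
  private maxConnections: number;

  constructor(maxConnections: number) {
    this.maxConnections = maxConnections;
  }

  acquire(): DbConnection | null {
    // Prefer a connection that is already open.
    const reused = this.idle.pop();
    if (reused !== undefined) return reused;
    // At the ceiling: the caller has to wait or retry.
    if (this.opened >= this.maxConnections) return null;
    this.opened += 1;
    return { id: this.opened };
  }

  release(connection: DbConnection): void {
    this.idle.push(connection);
  }

  get openedCount(): number {
    return this.opened;
  }
}
```

However many requests arrive, the database never sees more than `maxConnections` connections from this pool, which is the guarantee that keeps a spike from taking down every endpoint at once.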

These are not scaling problems. They're architecture omissions. Adding read replicas, indexes, a job queue, and a connection pooler before launch is engineering table stakes.

The Session and Auth Architecture

Building your own authentication system is one of the most common and costly mistakes in early web app development.

Auth is not just a login form. Auth includes: password hashing, reset flows, email verification, session management, token rotation, rate limiting on auth endpoints, multi-factor authentication, social login, session revocation, and compliance considerations for GDPR (data portability, right to erasure).

Rolling your own auth system for a web app in 2025 means spending 3–5 weeks on something that Auth.js, Clerk, or AWS Cognito handles correctly out of the box. The time cost is obvious. The risk cost is less visible: auth bugs in production often become security incidents, and security incidents at early-stage products are disproportionately damaging to customer trust.

The counterargument is that managed auth services add a dependency and a cost. Both are true. The auth cost at 10,000 MAU from Clerk or Auth0 is $100–$300/month. The engineering cost of building and maintaining an equivalent system is significantly higher. This is not a close call.

The exception: highly regulated environments (HIPAA, FedRAMP, defense contracts) where data residency requirements preclude managed services. In those cases, you're building auth, and you're budgeting for it properly.

The API Design Decisions That Don't Age Well

Two API decisions cause the most pain two years after launch:

Exposing your database schema as your API contract. If your API returns database column names as JSON keys and your frontend uses those keys directly, any database schema change requires a coordinated frontend change. The tighter this coupling, the harder every database migration becomes. API responses should be shaped by what the client needs, not by what the database stores.
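A thin mapping layer is usually enough to break this coupling. In the sketch below, the column names, `UserRow`, `UserResponse`, and `toUserResponse` are all hypothetical; the point is that the response type is declared separately from the row type, so a column rename touches one function instead of every client.

```typescript
// Shape of the database row (hypothetical column names).
interface UserRow {
  id: number;
  email_addr: string;
  pwd_hash: string;
  created_ts: Date;
}

// Shape of the API contract, designed for the client.
interface UserResponse {
  id: string;
  email: string;
  createdAt: string; // ISO 8601
}

function toUserResponse(row: UserRow): UserResponse {
  // Fields are listed explicitly, so internal columns such as
  // pwd_hash can never leak by accident.
  return {
    id: String(row.id),
    email: row.email_addr,
    createdAt: row.created_ts.toISOString(),
  };
}
```

If `email_addr` is later renamed to `email` in the database, only `UserRow` and one line of `toUserResponse` change; the web app, the mobile app, and any integrations keep working unmodified.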

Returning unpaginated collections. An endpoint that returns all records works at 100 rows. At 100,000 rows, it returns a response so large it kills the client. Cursor-based or offset-based pagination needs to be on every list endpoint from day one. Retrofitting pagination after launch is painful because the client code that consumes the unpaginated endpoint needs to change simultaneously.
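As an illustration, cursor-based pagination over an id-ordered collection can be sketched as below. The names (`listItems`, `CatalogItem`, `PageResult`) and the cap of 100 are assumptions, and the in-memory array stands in for what would normally be a query of the form `WHERE id > cursor ORDER BY id LIMIT n`.

```typescript
interface CatalogItem {
  id: number;
  name: string;
}

interface PageResult<T> {
  items: T[];
  nextCursor: number | null; // null means there are no more pages
}

// `all` must already be sorted by ascending id.
function listItems(
  all: CatalogItem[],
  cursor: number | null,
  limit: number,
): PageResult<CatalogItem> {
  // Clamp the page size so a client cannot request everything at once.
  const pageSize = Math.min(Math.max(limit, 1), 100);
  const remaining =
    cursor === null ? all : all.filter((item) => item.id > cursor);
  const items = remaining.slice(0, pageSize);
  const last = items[items.length - 1];
  const hasMore = remaining.length > pageSize;
  return {
    items,
    nextCursor: hasMore && last !== undefined ? last.id : null,
  };
}
```

A client starts with `cursor = null` and keeps passing back `nextCursor` until it receives `null`. Because the cursor is an id rather than an offset, rows inserted or deleted between requests do not cause items to be skipped or repeated.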

Both are easy to get right at the start and expensive to fix later, when the API is consumed by a mobile app, third-party integrations, and internal tools that each need to be updated simultaneously.

Performance Engineering: What the Big Platforms Get Right

We built infrastructure that handled 45–74M monthly active users at MakeMyTrip. At that scale, performance engineering is survival.

The patterns that matter at any scale:

CDN-first asset delivery. Every static asset — images, fonts, JavaScript bundles — should be served from a CDN. Modern frameworks like Next.js do this automatically for pages and static files. Images need additional treatment: modern formats (WebP, AVIF), responsive sizes, and lazy loading. At MakeMyTrip's scale, we reduced image payloads by 95–98% through format optimization alone.

Edge caching for API responses. Public or semi-public data (product catalogs, pricing, public profiles) can be cached at the CDN edge and revalidated on change. This eliminates backend compute for a significant fraction of requests at high traffic volumes.
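In practice, much of this is controlled by the `Cache-Control` header the API sends. The sketch below builds that header from a small policy object; `EdgeCachePolicy` and `cacheControlHeader` are hypothetical names, while `s-maxage` and `stale-while-revalidate` are standard directives (support for the latter varies by CDN, so check your provider).

```typescript
interface EdgeCachePolicy {
  isPublic: boolean;                   // false for anything user-specific
  edgeTtlSeconds: number;              // how long the CDN may serve without checking
  staleWhileRevalidateSeconds: number; // how long stale content may be served during refresh
}

function cacheControlHeader(policy: EdgeCachePolicy): string {
  if (!policy.isPublic) {
    // Never let a shared cache store per-user responses.
    return "private, no-store";
  }
  return [
    "public",
    `s-maxage=${policy.edgeTtlSeconds}`,
    `stale-while-revalidate=${policy.staleWhileRevalidateSeconds}`,
  ].join(", ");
}
```

A product catalog endpoint with a 60-second edge TTL and a 5-minute stale window would send `public, s-maxage=60, stale-while-revalidate=300`, so the origin sees roughly one request per minute per edge location regardless of traffic.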

Bundle splitting and code splitting. A monolithic JavaScript bundle that includes all application code is a performance antipattern. Route-based code splitting — loading only the code needed for the current page — reduces initial load time and time-to-interactive for every page in the application.

Database query patterns. N+1 queries are the most common database performance problem. An ORM that loads a list of N records and then fires a separate query for each record's related data issues N+1 queries, where N is the size of the result set. Identifying and fixing these with eager loading or DataLoader-style batching matters at any non-trivial scale.
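The difference shows up clearly when queries are counted. The sketch below uses a hypothetical `FakeDb` that increments a counter per call; `loadNaive` mirrors what a lazy-loading ORM does, and `loadBatched` mirrors eager loading with a single `WHERE user_id IN (...)` query. Only the per-user lookups are counted here, not the initial query that fetched the users.

```typescript
interface Order {
  id: number;
  userId: number;
}

// In-memory stand-in for a database that counts round trips.
class FakeDb {
  queryCount = 0;
  private orders: Order[];

  constructor(orders: Order[]) {
    this.orders = orders;
  }

  // One round trip per user.
  ordersForUser(userId: number): Order[] {
    this.queryCount += 1;
    return this.orders.filter((order) => order.userId === userId);
  }

  // One round trip for all users.
  ordersForUsers(userIds: number[]): Map<number, Order[]> {
    this.queryCount += 1;
    const wanted = new Set(userIds);
    const grouped = new Map<number, Order[]>();
    for (const id of userIds) grouped.set(id, []);
    for (const order of this.orders) {
      if (wanted.has(order.userId)) grouped.get(order.userId)?.push(order);
    }
    return grouped;
  }
}

function loadNaive(db: FakeDb, userIds: number[]): Map<number, Order[]> {
  const result = new Map<number, Order[]>();
  for (const id of userIds) result.set(id, db.ordersForUser(id));
  return result;
}

function loadBatched(db: FakeDb, userIds: number[]): Map<number, Order[]> {
  return db.ordersForUsers(userIds);
}
```

For three users the naive loader makes three round trips and the batched loader makes one; for a page of 500 users the gap is 500 to 1, which is usually the difference between a fast endpoint and a slow one.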

The Infrastructure Decision: Serverless vs Containerized

This decision determines your operational overhead more than your performance characteristics.

Serverless (Vercel, Netlify, AWS Lambda) is the right default for web apps that don't have unusual infrastructure requirements. Zero operational overhead, automatic scaling, pay-per-invocation pricing that's cheap at low volume. The constraints: cold starts for infrequently called functions, execution time limits, limited control over the runtime environment.

Containerized (Docker + Kubernetes or AWS ECS) is the right choice when you have long-running processes, specialized dependencies (GPU inference, specific OS libraries), strict latency requirements where cold starts are unacceptable, or you're operating at a scale where compute costs become a significant factor.

Most web apps should start serverless and move to containers when a specific constraint forces the decision. The migration path from Vercel to containers is well understood. The reverse, from custom infrastructure back to a managed platform, is significantly harder.

Building It Right the First Time

These decisions are not premature optimization. They're the difference between a product that scales smoothly and one that requires a rewrite when traction arrives.

Our web app development practice makes these architecture decisions explicit in week one — rendering strategy, database design, auth approach, API contracts, infrastructure model — before engineering starts. Every engagement includes a technical architecture document that the client owns, so the reasoning behind each decision is preserved for the engineers who come after us.

For teams that need the web app to work alongside a mobile product, our mobile app development practice builds on the same API layer with shared business logic where possible. If your web app is also the dashboard layer for an IoT platform, our IoT development team handles the device and data pipeline layer. And if your product needs a design system that keeps the UI consistent as the engineering team grows, our design system development practice delivers it alongside the web app build.

Book a 30-min discovery call — we scope your web app, recommend the right architecture for your scale trajectory, and quote a fixed price before any work starts.
