
Serverless-First Architecture

Pay for what you use, scale to zero when you don't, and stop managing servers entirely — but know when the economics stop working.

May 2, 2026
Category: Infrastructure
Complexity: Advanced
Industries: SaaS, Media

When You Need This

Your application has variable traffic — quiet overnight, spikes during business hours, and unpredictable bursts from marketing campaigns or seasonal events. You're paying for servers that sit idle 70% of the time. Or you're building a new product and don't want to invest in infrastructure provisioning, capacity planning, and on-call rotation before you've validated product-market fit. Serverless gives you per-request pricing, automatic scaling, and zero infrastructure management — but only when the workload characteristics fit.

Pattern Overview

Serverless-first architecture builds applications entirely on managed, scale-to-zero compute services (Lambda, Cloud Functions, Vercel Functions) connected by managed event services (EventBridge, SQS, Step Functions). There are no servers to patch, no clusters to resize, no capacity to plan. Functions execute in response to events (HTTP requests, queue messages, schedule triggers, database changes) and scale automatically from zero to thousands of concurrent instances. The pattern extends to serverless databases (DynamoDB, Neon, PlanetScale), serverless queues (SQS), and serverless orchestration (Step Functions, Temporal Cloud).
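
The stateless, event-in/response-out contract described above can be sketched as a single handler. This is a minimal illustration in the shape of a Lambda-style `handler(event, context)` function; the event fields are illustrative assumptions, not a fixed schema:

```python
import json

def handler(event, context=None):
    """Stateless HTTP handler: every input arrives in the event,
    and any state lives outside the function (database, cache)."""
    body = json.loads(event.get("body") or "{}")
    name = body.get("name", "world")
    # No module-level mutable state is read or written here, so the
    # same event always produces the same response regardless of
    # which function instance handles it.
    return {
        "statusCode": 200,
        "headers": {"content-type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }

resp = handler({"body": json.dumps({"name": "serverless"})})
print(resp["statusCode"])  # 200
```

Because the handler is a pure function of its event, it can run on any instance the platform spins up, which is what makes zero-to-thousands scaling safe.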

Reference Architecture

The architecture is event-driven by nature. An API Gateway (AWS API Gateway, Vercel) routes HTTP requests to individual functions. Event sources (SQS queues, EventBridge rules, S3 notifications, DynamoDB streams) trigger functions asynchronously. Step Functions or Temporal orchestrate multi-step workflows where each step is a function with built-in retry, timeout, and error handling. Serverless databases (DynamoDB for key-value, Neon/PlanetScale for relational) handle storage without capacity management. A strangler fig pattern enables gradual migration from existing monoliths.

Core Components
  • Function Layer: AWS Lambda, Vercel Functions, or Google Cloud Functions. Each function handles one responsibility — one API endpoint, one event processor, one scheduled task. Functions are stateless; any state lives in databases or caches. Cold start optimization through provisioned concurrency (Lambda), Fluid Compute (Vercel), or language choice (Go/Rust for sub-10ms cold starts)
  • Event Router: EventBridge for content-based event routing, SQS for simple queue processing, SNS for fan-out to multiple consumers. Events are the integration layer between functions — no function calls another function directly
  • Workflow Orchestrator: Step Functions (AWS) or Temporal Cloud for multi-step processes — order fulfillment, document processing pipelines, approval workflows. Each step is independently retryable with configurable timeouts and fallback paths. Visual debugging through step-level execution traces
  • API Composition Layer: API Gateway with request validation, throttling, and caching. GraphQL (AppSync) when clients need flexible queries across multiple serverless backends. WebSocket support (API Gateway WebSocket, Vercel) for real-time features
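
The "events as the integration layer" idea above can be sketched with a simplified content-based matcher. The entry fields follow the shape of EventBridge `PutEvents` entries, but the matching logic here is a deliberately reduced illustration (real EventBridge patterns support nesting, prefixes, and numeric ranges), and the sources and detail types are made up:

```python
import json

def make_entry(source: str, detail_type: str, detail: dict) -> dict:
    """Build an EventBridge-style event entry. Producers emit these;
    they never call consumer functions directly."""
    return {"Source": source, "DetailType": detail_type, "Detail": json.dumps(detail)}

def rule_matches(rule: dict, entry: dict) -> bool:
    """Simplified content-based routing: a rule maps each field to the
    list of values it accepts, and every field must match."""
    return all(entry.get(field) in accepted for field, accepted in rule.items())

order_placed = make_entry("shop.orders", "OrderPlaced", {"orderId": "o-1", "total": 42})
fulfillment_rule = {"Source": ["shop.orders"], "DetailType": ["OrderPlaced", "OrderUpdated"]}
print(rule_matches(fulfillment_rule, order_placed))  # True
```

The payoff of this indirection is that adding a new consumer means adding a rule, not changing any producer.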

Design Decisions & Trade-offs

Lambda vs. Containers (Fargate/Cloud Run)
Lambda for event-driven functions with < 15-minute execution, spiky traffic, and scale-to-zero requirements. Containers for long-running processes, workloads that need persistent connections, or applications that don't decompose cleanly into functions. MW starts serverless and moves specific functions to containers when they hit Lambda's limitations — not the other way around.
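
The economics behind that guidance can be made concrete with a back-of-the-envelope model. The unit prices below are placeholder assumptions roughly in the shape of public cloud pricing, not quoted rates; the point is the crossover, not the exact numbers:

```python
def lambda_monthly_cost(requests: int, avg_ms: float, mem_gb: float,
                        per_million_req: float = 0.20,
                        per_gb_second: float = 0.0000167) -> float:
    """Pay-per-use: cost scales with requests and compute-seconds consumed."""
    gb_seconds = requests * (avg_ms / 1000.0) * mem_gb
    return (requests / 1_000_000) * per_million_req + gb_seconds * per_gb_second

def container_monthly_cost(instances: int, hourly_rate: float = 0.05) -> float:
    """Always-on fleet: you pay for hours, regardless of traffic."""
    return instances * hourly_rate * 730  # ~hours per month

# Spiky, low volume: 1M requests/month at 100 ms, 0.5 GB memory
low = lambda_monthly_cost(1_000_000, 100, 0.5)
# Sustained, high volume: 300M requests/month, same per-request profile
high = lambda_monthly_cost(300_000_000, 100, 0.5)
fleet = container_monthly_cost(2)
print(low < fleet)   # True: serverless wins when traffic is low and spiky
print(high > fleet)  # True: a small always-on fleet wins at sustained load
```

Running this model against your real traffic histogram, rather than an average, is what reveals whether scale-to-zero actually pays.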
Cold Start Mitigation
Cold starts (100ms-3s depending on runtime and package size) are the primary objection to serverless for latency-sensitive workloads. MW mitigates through: (a) runtime selection (Node.js/Python have faster cold starts than Java/C#), (b) package size optimization (tree-shaking, no heavy SDKs), (c) Vercel's Fluid Compute which keeps function instances warm across requests, and (d) provisioned concurrency for the critical path (login, checkout, search). We don't use provisioned concurrency for everything — that defeats the economic benefit.
Strangler Fig Migration
MW uses the strangler fig pattern to migrate monoliths to serverless incrementally. We place an API Gateway in front of the monolith and route individual endpoints to new serverless functions one at a time. The monolith shrinks as functions replace its capabilities. This is safer than a big-bang rewrite, delivers value incrementally, and allows rollback per-endpoint.
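
The gateway-level routing decision at the heart of this pattern is small enough to sketch directly. The endpoint names and target labels below are illustrative:

```python
# Endpoints already migrated to serverless functions; everything else
# still routes to the monolith. Rollback per endpoint = remove it here.
MIGRATED = {"/login", "/search"}

def route(path: str) -> str:
    """Strangler-fig routing at the API gateway: the migrated set grows
    one endpoint at a time until the monolith is empty."""
    return "serverless" if path in MIGRATED else "monolith"

print(route("/login"))   # serverless
print(route("/orders"))  # monolith: not migrated yet
```

In practice this table lives in the gateway's route configuration rather than application code, but the shape of the decision is the same.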
Serverless Database Selection
DynamoDB for simple access patterns (key-value, single-table design). Neon or PlanetScale for relational data with complex queries — both offer serverless scaling with connection pooling that handles Lambda's connection-per-invocation pattern. Aurora Serverless v2 for teams already on AWS RDS that want scale-to-zero. MW avoids traditional RDS with Lambda — the connection exhaustion problem is real and painful.
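
The "simple access patterns" criterion for DynamoDB can be illustrated with single-table key design: composite partition and sort keys encode a query like "all orders for a customer" as a key-prefix lookup. The `CUSTOMER#`/`ORDER#` naming is a common convention, not a DynamoDB requirement:

```python
def order_keys(customer_id: str, order_id: str) -> dict:
    """Composite keys for a single-table design: the partition key groups
    all of a customer's items; the sort key identifies the order."""
    return {"PK": f"CUSTOMER#{customer_id}", "SK": f"ORDER#{order_id}"}

items = [order_keys("c-1", "o-1"), order_keys("c-1", "o-2"), order_keys("c-2", "o-9")]
# Equivalent of a DynamoDB Query on PK = CUSTOMER#c-1 (no scan, no join):
c1_orders = [i for i in items if i["PK"] == "CUSTOMER#c-1"]
print(len(c1_orders))  # 2
```

When the questions you need to ask don't reduce to key lookups like this, that is the signal to reach for Neon or PlanetScale instead.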
[Figure: Serverless-First Architecture — system architecture diagram]

Technology Choices

Compute: AWS Lambda, Vercel Functions (Fluid Compute), Google Cloud Functions, Cloudflare Workers
API: API Gateway (REST/WebSocket), Vercel, AppSync (GraphQL)
Orchestration: AWS Step Functions, Temporal Cloud, Vercel Workflow DevKit
Data: DynamoDB, Neon Postgres, PlanetScale, Upstash Redis, S3
Events: EventBridge, SQS, SNS, Vercel Queues
Observability: CloudWatch, Datadog (serverless monitoring), Lumigo, X-Ray

When to Use / When to Avoid

Use when:
  • Traffic is variable with significant idle periods (scale-to-zero saves money)
  • You want zero infrastructure management and operations overhead
  • The application decomposes naturally into event-driven functions
  • You're migrating incrementally from a monolith and want per-endpoint rollout

Avoid when:
  • Traffic is steady and high-volume — reserved instances are 50-70% cheaper at sustained load
  • You need persistent connections (WebSocket servers, database connection pools) — though Vercel handles this
  • The workload requires > 15 minutes of continuous execution per request
  • The team is unfamiliar with distributed systems — serverless introduces distributed debugging complexity

Our Approach

MW treats serverless as an economic decision, not a religious one. We model the cost of serverless vs. containers vs. reserved instances for your actual traffic pattern (not theoretical), and recommend the option that minimizes total cost of ownership including engineering time for operations. Our serverless architectures include per-function cost attribution (tagging every invocation with the feature that triggered it), cold start monitoring with alerting when P99 exceeds thresholds, and gradual migration playbooks that move one endpoint per sprint. We've migrated monoliths to serverless for media companies, SaaS products, and e-commerce platforms — and in two cases, we've migrated parts back to containers when the workload characteristics changed.
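
The per-function cost attribution mentioned above reduces to a simple roll-up once every invocation carries a feature tag. A minimal sketch; the feature names and per-invocation costs are illustrative:

```python
from collections import defaultdict

def attribute_costs(invocations: list[dict]) -> dict:
    """Roll up per-invocation cost by the feature tag attached at invoke
    time, so spend maps to product features rather than function names."""
    totals: dict = defaultdict(float)
    for inv in invocations:
        totals[inv["feature"]] += inv["cost_usd"]
    return dict(totals)

invocations = [
    {"feature": "checkout", "cost_usd": 0.00002},
    {"feature": "search",   "cost_usd": 0.00001},
    {"feature": "checkout", "cost_usd": 0.00003},
]
print(attribute_costs(invocations))
```

In production the same aggregation runs over billing or log data keyed by resource tags, but the principle — tag at invocation, aggregate by feature — is exactly this.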
