Decisions that shape a backend project
Four decisions made in week 1 set the ceiling for the next two years of engineering:
- Runtime - Python (FastAPI) by default. Node or Go when the workload demands it. Pick the language to match the team and the integrations, not the hype cycle.
- Datastore - Postgres for anything relational. The shape of your data picks the store; the store does not reshape your data.
- Real-time transport - WebSocket, Socket.IO, Server-Sent Events, or polling. The wrong pick here blows the latency budget on the hot path.
- Deploy target - AWS, GCP, Fly.io, Render, Vercel, on-prem. Picking the vendor is a decision we weigh in on, not a prerequisite for us to start.
Get these right and every feature is cheaper. Get them wrong and every feature fights the foundation.
What you actually get from a modern Python backend
Typed endpoints. Every request and response is a Pydantic schema. The client gets an OpenAPI spec auto-generated from the code. We do not hand-write interfaces. The “backend returned an int where the app expected a string” bug class is caught at the schema boundary or at client build time, not in production.
Async Python. One FastAPI worker handles hundreds of concurrent I/O-bound requests without the thread-per-request tax. Our small services run on a single modest box and handle real traffic.
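A minimal sketch of why one async worker goes a long way, using only the standard library rather than FastAPI itself. The handler name, the 50 ms sleep standing in for a database or HTTP call, and the count of 300 requests are all assumptions chosen for illustration.

```python
import asyncio
import time


async def handle_request(request_id: int) -> int:
    # Stand-in for an I/O-bound handler: the sleep represents a database
    # query or an outbound HTTP call (hypothetical 50 ms latency).
    await asyncio.sleep(0.05)
    return request_id


async def serve_concurrently(n: int) -> tuple[list[int], float]:
    # All n handlers share one event loop and one thread; while one
    # handler waits on I/O, the loop runs the others.
    start = time.perf_counter()
    results = await asyncio.gather(*(handle_request(i) for i in range(n)))
    return list(results), time.perf_counter() - start


results, elapsed = asyncio.run(serve_concurrently(300))
print(f"{len(results)} requests handled in {elapsed:.2f}s on one thread")
```

Run one after another, the same 300 calls would take about 15 seconds. The benefit only applies while handlers are waiting on I/O; CPU-heavy work still blocks the loop.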
Migrations you trust. Alembic migrations reviewed like code, rolled forward and rolled back. Schema changes that do not break production on a Friday evening - reversible, staged, reviewed.
OpenAPI that clients consume. Flutter (Chopper), React Native + React (TypeScript via openapi-typescript), Python clients all generated from the same spec. A field rename on the server fails a client build, not production.
When FastAPI + Postgres is the right call
- Typed API for typed clients - Flutter, React Native, or React on the other end of the wire.
- Real-time features - WebSocket endpoints are first-class in FastAPI. One stack for REST + sockets.
- AI and ML integration - Python owns the LLM tooling. Your FastAPI backend is already where the OpenAI SDK and vector stores live.
- Regulated data - typed contracts make audit easier. We have shipped KYC, AMFI compliance, and PCI-adjacent flows.
When FastAPI is not
- Hyper-low-latency webhook handlers in the single-digit millisecond range - Node or Go are honest answers.
- CPU-bound services that need a single binary deploy - Go.
- Teams that are entirely TypeScript, where adding Python means hiring and training - stay in Node + tRPC + Prisma or Bun + Hono.
Real-time systems we have shipped
- Server-authoritative live auction timer with atomic bid writes and sticky-session WebSocket upgrades - live auction bidding.
- Socket.IO live price broadcast from a single upstream subscription fanned out to every connected client - gold trading platform.
- Live tick stream + rules engine for price alerts with last-seen-price state and exactly-once delivery on reconnect - trading calculator watchlist.
- Webhook-driven order state machine hitting a ~5-second top-up SLA across 130+ operators and 44 countries - cross-border mobile top-up.
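The fan-out in the second bullet above (one upstream subscription, many connected clients) can be sketched roughly as follows. This is not the production Socket.IO code: an asyncio.Queue stands in for each client connection, and the class name PriceBroadcaster, the "XAU" symbol, and the prices are hypothetical.

```python
import asyncio


class PriceBroadcaster:
    """Fans ticks from one upstream source out to every connected client."""

    def __init__(self, max_backlog: int = 100) -> None:
        self._max_backlog = max_backlog
        self._clients: set[asyncio.Queue] = set()

    def connect(self) -> asyncio.Queue:
        # Each client gets its own bounded queue (its "socket").
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._max_backlog)
        self._clients.add(queue)
        return queue

    def disconnect(self, queue: asyncio.Queue) -> None:
        self._clients.discard(queue)

    def publish(self, tick: dict) -> None:
        for queue in self._clients:
            if queue.full():
                # A slow client loses its oldest tick instead of
                # stalling the broadcast for everyone else.
                queue.get_nowait()
            queue.put_nowait(tick)


async def upstream(broadcaster: PriceBroadcaster, prices: list[float]) -> None:
    # Stand-in for the single upstream market-data subscription.
    for price in prices:
        broadcaster.publish({"symbol": "XAU", "price": price})
        await asyncio.sleep(0)  # yield to the event loop between ticks


async def demo() -> list[list[float]]:
    broadcaster = PriceBroadcaster()
    clients = [broadcaster.connect() for _ in range(3)]
    await upstream(broadcaster, [2301.5, 2302.0, 2301.8])
    return [[q.get_nowait()["price"] for _ in range(q.qsize())] for q in clients]


received = asyncio.run(demo())
```

The point of the design is that upstream cost stays constant as clients grow: the provider sees one subscription regardless of how many sockets are open. This sketch is in-process only; spreading clients across several workers would need a shared pub/sub layer.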
Migrations without pager duty
Every destructive change uses the expand-then-contract pattern:
- Expand. Add the new column, table, or enum value. Both old and new code work.
- Backfill. Populate data into the new shape.
- Switch. Flip the application code to read from the new shape.
- Contract. Drop the old column, table, or enum value - weeks later, once logs confirm nothing still reads it.
This is why our deploys do not page anyone at 11pm.
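The four steps can be sketched against an in-memory SQLite database. This is an illustration of the pattern, not an Alembic migration; the users table and the name to full_name rename are hypothetical, and each function would be its own reviewed migration in practice.

```python
import sqlite3


def expand(conn: sqlite3.Connection) -> None:
    # Expand: the new column is nullable, so old code that only knows
    # about `name` keeps reading and writing without changes.
    conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")


def backfill(conn: sqlite3.Connection) -> None:
    # Backfill: copy existing data into the new shape. Safe to re-run.
    conn.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")


def contract(conn: sqlite3.Connection) -> None:
    # Contract: run later, once nothing reads `name` any more.
    # Note: DROP COLUMN needs SQLite 3.35 or newer.
    conn.execute("ALTER TABLE users DROP COLUMN name")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada",), ("Linus",)])

expand(conn)
# Old application code still works after the expand step.
old_read = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]

backfill(conn)
# Switch: new application code reads the new column instead.
new_read = [row[0] for row in conn.execute("SELECT full_name FROM users ORDER BY id")]
```

Every step leaves the database in a state that both the previous and the next release can run against, which is what makes each deploy individually reversible until the contract step.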
What goes in Postgres, Redis, Celery, or S3
- Postgres - anything you would be upset to lose. Users, orders, content, audit logs.
- Redis - anything you can recompute or afford to lose. Session data, rate-limit counters, cached results of expensive queries, Celery’s job broker.
- Celery - anything that should not block a user request. Email, imports, PDF generation, nightly cron.
- S3 or R2 or Spaces - anything large. User uploads, generated PDFs, exports. Signed URLs for access control.
The boundary matters because getting it wrong (sessions in Postgres, slow jobs run inside the request instead of in Celery) is how projects end up with 5-second page loads at month six.
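As an illustration of the "anything you can recompute" rule, here is a minimal cache-aside sketch. The TTLCache class is an in-memory stand-in for Redis GET and SET with an expiry, not a Redis client, and the key name, the 60-second TTL, and the report contents are hypothetical.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory stand-in for a Redis-style cache with per-key expiry."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave as a cache miss
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (time.monotonic() + ttl_seconds, value)


def cached(cache: TTLCache, key: str, ttl_seconds: float,
           compute: Callable[[], Any]) -> Any:
    # Cache-aside: try the cache first, fall back to the source of truth
    # (Postgres, in the split above), then store the result for next time.
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl_seconds)
    return value


calls: list[str] = []


def expensive_report() -> dict:
    # Stand-in for a slow aggregate query; records each real execution.
    calls.append("query")
    return {"orders_today": 42}


cache = TTLCache()
first = cached(cache, "report:today", 60, expensive_report)
second = cached(cache, "report:today", 60, expensive_report)
```

Losing the cache costs one slow request, not data, which is the test for whether something belongs on the Redis side of the boundary.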
Deploy targets we have shipped to
- AWS (ECS + RDS + ElastiCache) - enterprise environments, strong IAM.
- AWS EC2 with Docker for predictable cost and fewer moving parts - shipped for Sabika Gold.
- Google Cloud (Cloud Run + Cloud SQL) - fast iteration, Firebase-adjacent.
- DigitalOcean App Platform + Managed Postgres - cost-effective for early-stage.
- Fly.io - global presence, cheap for small services.
- Render - simple, opinionated, works for most MVPs.
- Vercel / Netlify - only for services that live in the same repo as a Next.js app (via Serverless Functions or Edge Runtime).
We meet you at your infrastructure. Picking a vendor is a decision we weigh in on, but it is not a prerequisite to start.
Integrations we have written this year
- Payments - Stripe, PayPal, Razorpay, Al Rajhi custody, Apple Pay, Google Pay.
- Messaging - Twilio, Plivo, WhatsApp Business, OneSignal.
- Auth - Auth0, Clerk, Firebase Auth, NextAuth, Okta SSO.
- Maps - Google Places, Mapbox, PostGIS.
- LLMs - OpenAI, Anthropic, Azure OpenAI, Google Gemini. See AI Integration Services.
- Market data - Polygon.io, Finnhub, Alpaca, Binance, CoinGecko, metalpriceapi.com.
- Telco aggregators - DT One (130+ operators across 44 countries).
- Legacy SOAP banking endpoints that should not exist but do.
Case studies
- Cross-border mobile top-up with shared core across consumer and reseller surfaces - FastAPI + Postgres ledger. Webhook-driven order state machine. DT One aggregator integration. Stripe + PayPal settled into one ledger. React Native mobile + React web on top of one backend.
- Al Rajhi-custodied gold trading across iOS, Android, and web - FastAPI on AWS EC2, Socket.IO broadcaster streaming spot prices from metalpriceapi.com, SQL Server for users + KYC + holdings, bucketed historical storage for fast chart rendering across six timeframes.
- Live auction bidding with server-authoritative 30-second timer - FastAPI REST + WebSocket with sticky sessions. Atomic bid writes with seen-current-highest conflict handling. OneSignal for auction-start alerts. AWS S3 for vehicle galleries.
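The "seen-current-highest" conflict handling in the auction case study can be sketched as a compare-and-swap update: a bid is accepted only if the highest bid is still the one the bidder was looking at. This is a sketch under assumptions, using SQLite in place of the production database; the auctions table, column names, and amounts (integers in minor currency units) are hypothetical.

```python
import sqlite3


def place_bid(conn: sqlite3.Connection, auction_id: int, bidder: str,
              amount: int, seen_highest: int) -> bool:
    """Accept the bid only if nobody has outbid `seen_highest` in the meantime."""
    with conn:  # one transaction: commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE auctions SET highest_bid = ?, highest_bidder = ? "
            "WHERE id = ? AND highest_bid = ? AND ? > highest_bid",
            (amount, bidder, auction_id, seen_highest, amount),
        )
    # Zero rows updated means the bidder acted on a stale price.
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auctions ("
    "id INTEGER PRIMARY KEY, highest_bid INTEGER NOT NULL, highest_bidder TEXT)"
)
conn.execute("INSERT INTO auctions VALUES (1, 1000, NULL)")
conn.commit()

alice_ok = place_bid(conn, 1, "alice", 1100, seen_highest=1000)
bob_stale = place_bid(conn, 1, "bob", 1050, seen_highest=1000)  # saw the old price
bob_retry = place_bid(conn, 1, "bob", 1200, seen_highest=1100)
winner = conn.execute(
    "SELECT highest_bid, highest_bidder FROM auctions WHERE id = 1"
).fetchone()
```

Because the check and the write happen in a single statement, two simultaneous bids cannot both win; the loser gets a rejection it can show to the user along with the new highest bid.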
How we work on a backend engagement
Week 1 - architecture. Data model, service boundaries, integrations, deploy target, observability strategy. Written down before a single endpoint is typed.
Weeks 2 to 4 - the spine. Database, auth, first three endpoints, CI to staging, Sentry wired, basic dashboard in Datadog or Grafana. The boring bit most projects skip, which is why most projects hurt at month three.
Weeks 5 to 16 - features. Endpoint by endpoint, with tests. Celery jobs for anything slower than 200ms. Redis for caching on read-heavy paths. We ship to staging every day, to production after a gate.
Weeks 17 to 20 - hardening. Load tests, rate limits, edge cases, runbooks. The last month is where cheap builds skip steps and good builds pay down technical debt.
What we will not do
- Serverless-only for everything. Lambda and Cloud Functions are great for event handlers and scheduled jobs. They are usually the wrong default for a full API because of cold starts, harder tracing, and cost at scale. We pick case-by-case.
- NoSQL where SQL fits. In our experience, most projects that reach for MongoDB by default regret it by month nine. We start with Postgres and only add other stores when there is a concrete reason.
- Microservices on day one. Monolith first. Split when a boundary gets painful enough that splitting costs less than not splitting. Most projects never cross that line.
- ORMs so heavy they write queries you cannot read. SQLAlchemy Core + raw SQL where it earns clarity. Abstractions that hide the query plan hide the bug.
Why teams pick us for backend
We run FastAPI, Pydantic, and OpenAPI as one unit. The mobile or web client gets a typed binding generated straight from the server code. Rename a field on the backend, break a client build, catch it in CI. You do not find out a field changed shape because a user’s screen went blank.
Real-time lives in the same app as REST. WebSocket and Socket.IO handlers run next to your regular endpoints, not on a separate service you forget to deploy. Migrations use expand-then-contract so nothing goes down during a rollout. We deploy where you already host (AWS, GCP, Fly.io, Render, Vercel, on-prem) and hand over everything at launch: source, infrastructure-as-code, runbooks, the lot.
Read next
- Flutter App Development - the most common mobile client for backends we build.
- React Development - typed-client web apps that talk to this backend.
- AI Integration Services - adding LLMs to an existing backend without leaking keys.