API-First Architecture for Platforms
Baalvion Strategic Brief • June 11, 2026
Strategic Intelligence by Baalvion Engineering
The contract is the product
When you operate a platform rather than an application, the API stops being an implementation detail and becomes the product itself. At Baalvion, the Baalvion Operating System (BOS) connects commerce, finance, compliance, logistics, and intelligence into one multi-tenant fabric spanning 198 markets and 180+ jurisdictions. None of those internal domains share a database or a deployment cadence. The only thing they genuinely share is a set of contracts. That observation forces a simple discipline: design the contract before you write the handler, and treat the contract as the artefact you version, review, and check compatibility against.
API-first is frequently reduced to "we publish an OpenAPI document." That is necessary but nowhere near sufficient. An API-first organisation makes the schema the source of truth from which server stubs, client SDKs, mock servers, documentation, and contract tests are all generated. The human-readable description and the machine-enforceable schema are the same object. When a payments team and a ledger team argue about whether an amount is minor units or a decimal string, the argument is settled in the spec file in a pull request, not discovered in production during a cross-border settlement reconciliation.
Designing the contract: explicit over clever
A good platform contract is boring on purpose. We favour explicit field semantics, closed enums with a documented extension policy, and types that cannot silently lose information. Monetary values are carried as integer minor units plus an ISO 4217 currency code, never as floats. Timestamps are RFC 3339 in UTC. Identifiers are opaque strings, never sequential integers that leak volume or invite enumeration. These are unglamorous choices, but in a compliance-first system they prevent whole classes of defect before any code exists.
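The field conventions above can be sketched in a few lines of Python. This is a minimal illustration, not Baalvion's actual types: the names `Money`, `utc_timestamp`, and `new_opaque_id`, and the prefix convention for identifiers, are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import secrets


@dataclass(frozen=True)
class Money:
    """Amount in integer minor units plus an ISO 4217 currency code."""
    minor_units: int
    currency: str

    def __post_init__(self) -> None:
        # bool is a subclass of int, so reject it explicitly along with floats.
        if isinstance(self.minor_units, bool) or not isinstance(self.minor_units, int):
            raise TypeError("minor_units must be an int, never a float")
        # Shape check only; a real validator would consult the ISO 4217 list.
        code = self.currency
        if not (len(code) == 3 and code.isascii() and code.isalpha() and code.isupper()):
            raise ValueError("currency must be a three-letter ISO 4217 code")


def utc_timestamp(moment: datetime) -> str:
    """Render a timezone-aware datetime as an RFC 3339 string in UTC."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a timezone")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def new_opaque_id(prefix: str) -> str:
    """Random, non-sequential identifier (the prefix is a hypothetical convention)."""
    return f"{prefix}_{secrets.token_urlsafe(16)}"
```

Rejecting floats at construction time means a value such as 19.99 fails at the boundary instead of being rounded silently somewhere downstream.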
We standardise on OpenAPI 3.1 for synchronous REST surfaces and AsyncAPI for the event-driven seams between domains, because the BOS is as much a choreography of events as it is a set of request/response endpoints. For high-throughput internal service-to-service calls where shape stability and code generation matter most, gRPC with Protocol Buffers gives us a binary contract and forward/backward-compatible evolution rules out of the box. GraphQL appears at the edge for a small number of aggregation-heavy product surfaces, but we resist it for the core: federated graphs are powerful and easy to make slow, and an enterprise platform needs predictable, cacheable, auditable request shapes more than it needs query flexibility.
- Design-first workflow: the spec is authored and reviewed before implementation, then linted in CI with a tool such as Spectral so style drift is caught mechanically.
- One canonical error envelope across every service — a stable machine-readable code, a human message, and a correlation id — so clients write error handling once.
- Idempotency keys on every state-changing endpoint, which is non-negotiable when an operation might move money or file a customs declaration.
- Pagination, filtering, and field-selection conventions defined once at the platform level rather than reinvented per team.
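Two of the conventions above, the canonical error envelope and idempotency keys, can be sketched together. The example below is illustrative only: `ErrorEnvelope`, `IdempotencyStore`, and the error code `idempotency_key_reused` are hypothetical names, and the in-memory dictionary stands in for what would need to be a durable, shared store with proper locking.

```python
from dataclasses import dataclass, asdict
import hashlib
import json
from typing import Any, Callable


@dataclass(frozen=True)
class ErrorEnvelope:
    code: str            # stable, machine-readable
    message: str         # human-readable
    correlation_id: str  # ties the error to traces and logs


class IdempotencyStore:
    """In-memory stand-in; not thread-safe and not durable."""

    def __init__(self) -> None:
        self._seen: dict[str, tuple[str, dict[str, Any]]] = {}

    def execute(
        self,
        key: str,
        payload: dict[str, Any],
        correlation_id: str,
        handler: Callable[[dict[str, Any]], dict[str, Any]],
    ) -> dict[str, Any]:
        # Fingerprint the payload so a reused key with a different body is detected.
        body = json.dumps(payload, sort_keys=True).encode()
        fingerprint = hashlib.sha256(body).hexdigest()
        if key in self._seen:
            stored_fingerprint, stored_response = self._seen[key]
            if stored_fingerprint != fingerprint:
                return asdict(ErrorEnvelope(
                    code="idempotency_key_reused",
                    message="This key was already used with a different payload.",
                    correlation_id=correlation_id,
                ))
            # Same key, same payload: replay the stored result, do not re-run.
            return stored_response
        response = handler(payload)
        self._seen[key] = (fingerprint, response)
        return response
```

A retried request with the same key and body gets the stored response back without the handler running again, which is the property that matters when the handler moves money.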
Versioning without breaking the world
Versioning is where API-first discipline is actually tested. The cheapest version is the one you never have to ship, so the default rule is additive evolution: you may add optional fields, new endpoints, and new enum values behind a documented tolerance policy, but you may not remove or repurpose what exists. Clients are written to ignore unknown fields (Postel's law applied narrowly and deliberately), which lets a single deployed contract carry months of additive change without a major bump.
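A tolerant reader of this kind is small to implement. The sketch below assumes a hypothetical `Shipment` resource and a helper named `parse_tolerant`; neither comes from the BOS itself.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass(frozen=True)
class Shipment:
    id: str
    status: str
    carrier: Optional[str] = None  # optional field, as if added in a later release


def parse_tolerant(cls, payload: dict[str, Any]):
    """Build cls from payload, silently dropping fields the client does not know."""
    known = {f.name for f in fields(cls)}
    return cls(**{name: value for name, value in payload.items() if name in known})
```

A payload that carries a newer field such as `eta` still parses, while a payload missing a required field such as `status` fails with a `TypeError`. Tolerance applies to additions only, not to omissions.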
When a genuinely breaking change is unavoidable, we use major versions in the URL path (/v1, /v2) for external partner-facing APIs because the boundary is explicit, easy to route at the gateway, and trivial to reason about in support tickets. Internally, gRPC's field-numbering rules let us evolve messages far more freely. Either way the rule is the same: run versions in parallel, instrument usage per version, publish a deprecation timeline measured in quarters not weeks, and only sunset a version once telemetry shows the old surface has drained. With 125+ active partners integrated against the platform, a silent breaking change is not a bug — it is an incident with a financial and regulatory blast radius.
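The "sunset only once traffic has drained" rule can be expressed as a small check. This is a sketch under stated assumptions: `VersionUsage`, `can_sunset`, and the 30-day quiet window are hypothetical, and a real implementation would query the telemetry backend instead of an in-process counter.

```python
from collections import Counter
from datetime import date, timedelta


class VersionUsage:
    """Per-version call counts by day (stand-in for real telemetry)."""

    def __init__(self) -> None:
        self._calls: Counter = Counter()  # keyed by (version, day)

    def record(self, version: str, day: date) -> None:
        self._calls[(version, day)] += 1

    def calls_since(self, version: str, since: date) -> int:
        return sum(
            count for (v, day), count in self._calls.items()
            if v == version and day >= since
        )


def can_sunset(usage: VersionUsage, version: str, today: date,
               sunset_date: date, quiet_days: int = 30) -> bool:
    """True only if the announced date has passed AND recent traffic is zero."""
    if today < sunset_date:
        return False
    window_start = today - timedelta(days=quiet_days)
    return usage.calls_since(version, window_start) == 0
```

Both conditions must hold: reaching the published date is not enough if a partner is still calling the old surface.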
The gateway: one front door, many policies
An API gateway is the seam where cross-cutting concerns stop being copy-pasted into every service. In the BOS, the gateway terminates TLS, authenticates the caller, resolves tenant context, enforces rate limits and quotas, and stamps a correlation id that follows the request through every downstream hop. Authentication is OAuth 2.0 and OpenID Connect with short-lived JWTs; the gateway validates signatures and scopes so individual services trust a verified principal rather than re-implementing auth. This is the practical centre of our multi-tenant identity platform, where tenant isolation has to hold under load and under audit.
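Two of these gateway duties, per-tenant rate limiting and correlation-id stamping, are sketched below. The token-bucket algorithm is one common choice for rate limiting, not necessarily what the BOS gateway uses, and the header name `X-Correlation-Id` and the class names are assumptions for illustration. The clock is passed in explicitly so the behaviour is deterministic.

```python
import uuid
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float
    refill_per_second: float
    tokens: float
    updated_at: float

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class TenantRateLimiter:
    """One bucket per tenant, so one tenant cannot exhaust another's quota."""

    def __init__(self, capacity: float, refill_per_second: float) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._buckets: dict[str, TokenBucket] = {}

    def allow(self, tenant_id: str, now: float) -> bool:
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._capacity, now)
            self._buckets[tenant_id] = bucket
        return bucket.allow(now)


def ensure_correlation_id(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of headers with a correlation id, keeping any existing one."""
    stamped = dict(headers)
    stamped.setdefault("X-Correlation-Id", str(uuid.uuid4()))
    return stamped
```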
We keep a firm line between the gateway and a service mesh. The gateway owns north-south traffic — the public and partner edge — and is where business-facing policy lives. The mesh (we lean on Envoy-based data planes) owns east-west traffic between services: mutual TLS, retries with budgets, circuit breaking, and traffic shifting for progressive delivery. Collapsing these two concerns into one component is a common way to build a system that is impossible to reason about. Keeping them separate means a partner-facing quota change never accidentally alters how two internal services fail over.
- Edge concerns at the gateway: authn/authz, rate limiting, quotas, request validation against the published schema, and tenant resolution.
- Mesh concerns between services: mTLS, retries with deadlines, circuit breakers, and canary/blue-green traffic splitting.
- Observability everywhere: OpenTelemetry traces, RED metrics, and structured logs keyed by the same correlation id.
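Of the mesh concerns listed above, circuit breaking is the easiest to misread, so here is a minimal sketch of the logic. In an Envoy-based mesh this behaviour is configured declaratively, not written by hand; the `CircuitBreaker` class, its thresholds, and the simplified half-open state (which admits every request after the cool-down, not just a single probe) are assumptions for illustration only.

```python
from typing import Optional


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cool-down in seconds
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self, now: float) -> bool:
        # Closed: requests flow normally.
        if self._opened_at is None:
            return True
        # Open: reject until the cool-down elapses, then allow trial requests.
        return now - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        # Any success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self, now: float) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # Trip, or re-trip after a failed trial request.
            self._opened_at = now
```

After three consecutive failures the breaker rejects calls for the cool-down period. A failed trial request re-opens it, and a successful one closes it.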
Governance: turning principles into guardrails
Principles that live in a wiki get ignored under deadline pressure. API governance only works when it is encoded as automation in the path of every change. We run schema linting on every pull request, fail the build on backward-incompatible diffs detected by tooling such as Buf for Protobuf and oasdiff for OpenAPI, and require contract tests (Pact-style consumer-driven contracts) to pass before a producer can deploy. A producer cannot merge a change that an existing consumer's contract proves would break it. That single gate replaces a large number of cross-team coordination meetings with a deterministic check.
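To make the idea of a backward-incompatibility gate concrete, here is a deliberately simplified sketch. It is not how Buf or oasdiff work internally; it compares two JSON-Schema-like dictionaries using a hypothetical `breaking_changes` function and flags only three kinds of change: removed fields, changed types, and newly required fields.

```python
from typing import Any


def breaking_changes(old: dict[str, Any], new: dict[str, Any]) -> list[str]:
    """List backward-incompatible differences between two object schemas."""
    problems: list[str] = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    for name, spec in old_props.items():
        if name not in new_props:
            problems.append(f"removed field: {name}")
        elif new_props[name].get("type") != spec.get("type"):
            problems.append(
                f"changed type of {name}: "
                f"{spec.get('type')} -> {new_props[name].get('type')}"
            )

    # A field that becomes required breaks callers that never sent it.
    newly_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(newly_required):
        problems.append(f"new required field: {name}")

    return problems
```

A CI step would fail the build whenever the returned list is non-empty. Adding an optional field produces an empty list and passes, which matches the additive-evolution rule.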
Governance also means making the right thing the easy thing. A shared style guide, golden-path templates, and a reviewed library of common schema components mean a new service starts compliant rather than being retrofitted. For a platform that carries 500K+ transactions and operates under SOC 2 Type II, ISO 27001, GDPR, and KYC/AML obligations, every endpoint is also an audit surface: who called it, on behalf of which tenant, with what authorisation, and what changed. We design for that from the first line of the spec, which is the same posture that drives our governance and compliance suite across the wider ecosystem. This is what we mean by compliance-first — it is enforced in the contract, not bolted on after a finding.
Developer experience is a platform feature
If an internal team or an external partner cannot reach a working integration quickly, the elegance of the architecture is irrelevant. Because every contract is machine-readable, we generate typed SDKs in the languages our consumers actually use, stand up mock servers directly from the spec so client teams can build before the producer is finished, and publish reference documentation that is never out of sync because it is derived from the same schema the gateway validates against. A self-service developer portal with scoped credentials, sandbox environments, and example payloads turns onboarding from a multi-week ticket queue into an afternoon.
The payoff compounds. Good developer experience is not a nicety layered on top of an API-first platform; it is the natural output of having done the contract work properly. When the schema is the source of truth, documentation, SDKs, mocks, and tests stop being separate deliverables that drift apart and instead become projections of one artefact. That is the difference between a platform that scales to hundreds of partners and one that quietly accumulates integration debt until every change becomes frightening. You can see the same architectural posture applied end to end in how we approached unifying global trade operations, and it underpins the broader enterprise software work we do for organisations building their own platforms.
Where to start
If you are retrofitting API-first onto an existing system, do not try to boil the ocean. Pick the most-integrated surface, write its contract as it should be, put it behind a gateway with the canonical error envelope and idempotency, and wire up linting plus contract testing in CI for that one surface. Prove the loop (design, lint, generate, test, deploy in parallel versions) on something real, then expand outward. The hardest part of API-first is not the technology; it is the organisational habit of deciding the contract first and holding the line on backward compatibility. Build that habit on one endpoint, and the rest of the platform follows.
Frequently Asked Questions
What does "API-first" actually mean beyond publishing an OpenAPI file?
It means the schema is the source of truth from which server stubs, client SDKs, mock servers, documentation, and contract tests are all generated, and the contract is designed and reviewed before any implementation is written. Publishing a spec after the fact is documentation; designing against the spec first is API-first.
How do you version APIs without breaking partner integrations?
The default is additive, backward-compatible evolution: add optional fields and new endpoints, never remove or repurpose existing ones, and write clients to ignore unknown fields. For unavoidable breaking changes we use major versions in the URL path for external APIs, run versions in parallel, instrument usage per version, and only sunset an old version after telemetry shows traffic has drained.
What is the difference between an API gateway and a service mesh?
The gateway owns north-south edge traffic: authentication, rate limiting, quotas, schema validation, and tenant resolution at the public and partner boundary. The service mesh owns east-west traffic between internal services: mutual TLS, retries with deadlines, circuit breaking, and traffic shifting. Keeping the two separate lets you reason about edge policy and internal resilience independently.
How is API governance enforced in practice?
Governance is encoded as automation in the path of every change: schema linting on every pull request, build failures on backward-incompatible diffs detected by tooling like Buf and oasdiff, and consumer-driven contract tests that must pass before a producer can deploy. Guardrails plus golden-path templates make the compliant choice the easy choice.
Why does Baalvion favour REST and gRPC over GraphQL for the core platform?
Enterprise platforms need predictable, cacheable, and auditable request shapes more than they need query flexibility. We use OpenAPI-described REST for synchronous surfaces, AsyncAPI for the event-driven seams between domains, gRPC with Protocol Buffers for high-throughput internal calls, and reserve GraphQL for a small number of aggregation-heavy edge surfaces where its flexibility genuinely pays off.
How does API-first improve developer experience?
Because every contract is machine-readable, typed SDKs, mock servers, and reference documentation are all generated from the same schema, so they never drift apart. A self-service portal with scoped credentials and sandbox environments turns partner onboarding from a multi-week ticket queue into an afternoon.