
GraphQL Federation & Composable Architecture: a Cloud Native love story


At Lab Digital, both 2020 and 2021 have been about building Composable/MACH platforms, and in particular about making extensive use of GraphQL in them.

To do that effectively, we’ve adopted GraphQL Federation with Apollo as a best practice in our projects. We’ve come to believe that using Federation to build a ‘single graph’ is the perfect companion for Composable/MACH architectures at scale (and at smaller scale too, though consider the extra complexity). In this article we explain why.

Background

As you might have read on our blog, we are in the business of building scalable MACH platforms for clients that usually comprise multiple businesses across brands & countries.

Building a system that scales along those axes is complex. However, over the last couple of years it has become a lot simpler thanks to the Composable trend, which (as the name implies) allows you to compose your systems out of various separate components that each do one thing really well.

As explained in a previous blog post, a component can be anything: from a SaaS service that you consume, to a custom serverless microservice, to a service within AWS or Azure. Orchestrating those components is the key to scaling well in a robust manner. This is why we released MACH composer some time ago.

MACH composer is a framework for orchestrating all of those components: it makes sure the right components are in the right place and are aware of each other. But it does not manage (or prescribe how to do) any data transmission between components (for example, client/server interactions). In fact, MACH composer does not have any role in ‘running’ the site. Its job is to configure and deploy your infrastructure, not to tell you how to build or run the platform.

So in the implementation of your MACH platform you need to (among other things) consider how to expose data to client-side applications. It is quite important to arrange this well, because you will have multiple endpoints exposed by components that clients need to consume. These may even be of mixed types, as some services provide REST endpoints and others GraphQL endpoints.

When these individual services are consumed directly in the client application (e.g. a front-end connecting directly to commercetools or a microservice), our experience is that the client becomes too ‘heavy’, as business logic for combining/merging data from multiple services is pushed to the frontend. As you then need to replicate this business logic across clients, you risk losing the decoupling of presentation from business logic that a single API provides, resulting in duplication and inconsistent behaviour.

This is why we’ve adopted the BFF pattern, and use GraphQL Federation to implement that.

The Backend-For-Frontend (BFF)

One common way to solve the challenge of exposing a single API endpoint which abstracts multiple services is “Backend-For-Frontend”. The BFF approach simplifies a lot at the client side and allows you to centralise things like authentication, security, rate limiting/quotas, caching, and the enforcement of certain API design practices.

Using GraphQL to expose your BFF API has become best practice, as it gives the client control over what data to fetch without becoming tightly coupled to the backend.

Still, whether you are using a RESTful approach or a GraphQL endpoint, there is a risk that you end up going against proper composable practices: the BFF soon becomes a monolithic ‘integration service’ where all of the services are brought together in a tightly coupled way (for example, through ‘schema stitching’ or custom integrations) to expose a single API to clients. The single API is something you want; the tight coupling is something you might want to prevent, depending on your use case and future plans.

When you build multi-tenant composable systems (like we do), each individual tenant might have a different set of components and thus interfaces that need to be combined. So the BFF’s functionality differs from tenant to tenant and environment to environment, from use case to use case and from team to team.

Building such complex systems in a monolithic way becomes problematic when working with many moving parts in a Composable architecture. When creating single points of integration it can be natural to put business logic in the BFF (resulting in tight coupling between the BFF and the microservices), whereas good design calls for business logic to be separated into the individual microservices.

Separation of concerns & productivity

In the end, composable implementations are built upon microservice architectures, and vice versa. It’s just a different name for the same thing.

While we are critical of microservice architectures (internally our mantra is to keep the fewest systems for as long as possible), in larger-scale systems they serve their purpose. And in our case, when building ‘multi instance/tenant’ systems, they allow for true composability across many instances of our platforms.

A key principle of building these architectures is separation of concerns (and following from that, building loosely coupled services). We translate this to building custom components that do a particular thing, for example a Payment integration, and expose a GraphQL API that clients can interface with. The service can be deployed as a standalone component, without a dependency on other components.

This means you might end up with many services that change continuously, that have individual GraphQL interfaces, and are intended to be consumed by client applications. And at the same time you want all those individual APIs, including external APIs such as commercetools (who continuously change/update their schema as well), to become a single GraphQL endpoint that client applications can consume.

Federation

This is where Federation comes into play for us. Federation allows for a loosely coupled way of ‘joining’ distinct GraphQL schemas (called subgraphs) into a single graph called the supergraph. Client applications connect to a derivative of the supergraph called the API schema, and perform queries that span multiple subgraph schemas. These queries connect to individual microservices or SaaS components and even integrate data across them using entities.
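To make the idea of entities a bit more concrete, here is a minimal sketch in plain Python. It is a toy model and not Apollo's API: the subgraph names (catalog, payments), the field names and the sample data are all made up for illustration. The point is only that one subgraph owns a type, another contributes fields to it, and the two are joined on a shared key.

```python
from typing import Any, Dict

# Hypothetical data owned by a "catalog" subgraph: it defines Product.
CATALOG = {"p1": {"id": "p1", "name": "Sneaker"}}

# Hypothetical data owned by a "payments" subgraph: it only knows
# which payment methods apply to a product id.
PAYMENT_METHODS = {"p1": ["ideal", "creditcard"]}


def catalog_product(product_id: str) -> Dict[str, Any]:
    """Catalog subgraph: resolve the Product entity itself."""
    return dict(CATALOG[product_id])


def payments_resolve_reference(reference: Dict[str, Any]) -> Dict[str, Any]:
    """Payments subgraph: given only the entity key, return its extra fields."""
    product_id = reference["id"]
    return {"id": product_id, "paymentMethods": PAYMENT_METHODS.get(product_id, [])}


def gateway_product(product_id: str) -> Dict[str, Any]:
    """Gateway: fetch the entity from its owner, then join the extra fields by key."""
    product = catalog_product(product_id)
    extra = payments_resolve_reference({"id": product["id"]})
    return {**product, **extra}


if __name__ == "__main__":
    print(gateway_product("p1"))
```

Neither subgraph imports the other here; the only thing they share is the key (`id`), which is what keeps them loosely coupled.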

Architecture

Apollo itself defines the architecture as follows:

An Apollo Federation architecture consists of:

A collection of subgraphs (usually represented by different back-end services) that each define a distinct GraphQL schema

A gateway that uses a supergraph schema (composed from all subgraph schemas) to execute queries across multiple subgraphs
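As a rough illustration of what "executing queries across multiple subgraphs" involves, the sketch below groups the fields of an incoming query by the subgraph that owns them. The ownership table and the subgraph names are hypothetical, and a real gateway builds a far richer query plan; this only shows the basic routing idea.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical ownership table, as it could be derived from a supergraph schema.
FIELD_OWNERS: Dict[str, str] = {
    "Product.name": "catalog",
    "Product.price": "catalog",
    "Product.paymentMethods": "payments",
}


def plan(type_name: str, fields: List[str]) -> Dict[str, List[str]]:
    """Group the requested fields by the subgraph that can resolve them."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for field in fields:
        owner = FIELD_OWNERS[f"{type_name}.{field}"]
        groups[owner].append(field)
    return dict(groups)


if __name__ == "__main__":
    # One client query, two subgraph fetches.
    print(plan("Product", ["name", "paymentMethods", "price"]))
```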

In day-to-day terms this means that, as a backend developer on the team, you work on one or more components and update their functionality. As a result, a component’s subgraph schema (or at least the service running behind it) changes from time to time.

Individual services register their schemas (and updates to them) with the central GraphQL Gateway. That gateway then validates the change and takes the updated version into production. Optionally you can use Apollo Studio to organise this.
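The register-validate-promote flow can be sketched as follows. This is a simplified model under our own assumptions: schemas are plain dictionaries, the only validation rule is "two subgraphs may not declare the same field with different types", and the class and function names are invented. Apollo's actual composition rules are considerably more detailed.

```python
from typing import Dict

Schema = Dict[str, Dict[str, str]]  # type name -> field name -> field type


class CompositionError(Exception):
    """Raised when subgraph schemas cannot be merged into one supergraph."""


def compose(subgraphs: Dict[str, Schema]) -> Schema:
    """Merge all subgraph schemas, rejecting conflicting field definitions."""
    supergraph: Schema = {}
    for subgraph_name, schema in subgraphs.items():
        for type_name, fields in schema.items():
            merged = supergraph.setdefault(type_name, {})
            for field_name, field_type in fields.items():
                existing = merged.get(field_name)
                if existing is not None and existing != field_type:
                    raise CompositionError(
                        f"{subgraph_name}: {type_name}.{field_name} is {field_type}, "
                        f"but another subgraph declares {existing}"
                    )
                merged[field_name] = field_type
    return supergraph


class SchemaRegistry:
    """Keeps the last supergraph that composed successfully."""

    def __init__(self) -> None:
        self.subgraphs: Dict[str, Schema] = {}
        self.supergraph: Schema = {}

    def publish(self, name: str, schema: Schema) -> bool:
        candidate = {**self.subgraphs, name: schema}
        try:
            new_supergraph = compose(candidate)
        except CompositionError:
            return False  # rejected: the running supergraph is left untouched
        self.subgraphs, self.supergraph = candidate, new_supergraph
        return True
```

The design choice worth noting is that a failed publish never replaces the running supergraph, so one team's broken schema change does not take the single graph down for everyone else.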

The benefits

The result of this is an elegant gateway server that is very thin, and solely responsible for composing the supergraph schema and making that available to clients. And at the same time, making changes to individual services at an individual pace becomes possible without a lot of communication/alignment overhead with other teams or developers that work on other services. So you can work on individual services independently.

So now we have a Backend-For-Frontend that is composable as well. Whatever the Composable back-end landscape looks like, there is a loosely coupled process for composing your BFF automatically, whilst putting responsibilities where they belong: in the services themselves, without tight coupling with other services. That results in respecting service boundaries and keeping the team or developer that is responsible for the component independent from other components and teams.

What about the added complexity of managing multiple services?

With Federation you take a distributed approach to building your services. And when it comes to building custom services, this means that you will need to manage those services’ CI/CD pipelines, deployment and infrastructure as well, which comes at a cost.

As a starting point, you could choose to build a single GraphQL gateway service that implements Federation but relies on local services instead of distributed ones. That effectively results in a monolithic GraphQL server, though with a clear exit scenario once you grow out of it.

This article coins the term Local Federation, which you can use to build multiple services in the same codebase, but still follow an approach that allows you to use the benefits of Federation and at some point move towards using separate services once you are ready or when it’s required.

When performance is a concern, this is also a route to investigate, though we recommend also testing how ‘regular’ (distributed) federation behaves.
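One way to picture the exit scenario is to have the gateway depend on an interface rather than on where a subgraph runs. The sketch below is our own illustration, not code from the Local Federation article: the class names and the `execute` method are hypothetical, and the remote variant simply POSTs the query as JSON over HTTP.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Protocol


class Subgraph(Protocol):
    """Anything that can answer a GraphQL query string."""

    def execute(self, query: str) -> Dict[str, Any]: ...


class LocalSubgraph:
    """Runs the subgraph in-process, inside the gateway's own codebase."""

    def __init__(self, handler: Callable[[str], Dict[str, Any]]) -> None:
        self._handler = handler

    def execute(self, query: str) -> Dict[str, Any]:
        return self._handler(query)


class RemoteSubgraph:
    """Calls a separately deployed subgraph over HTTP."""

    def __init__(self, url: str) -> None:
        self._url = url

    def execute(self, query: str) -> Dict[str, Any]:
        body = json.dumps({"query": query}).encode("utf-8")
        request = urllib.request.Request(
            self._url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)


class Gateway:
    """Routes a query to a named subgraph, unaware of whether it is local or remote."""

    def __init__(self, subgraphs: Dict[str, Subgraph]) -> None:
        self._subgraphs = subgraphs

    def execute(self, subgraph_name: str, query: str) -> Dict[str, Any]:
        return self._subgraphs[subgraph_name].execute(query)
```

Moving a service out of the monolith then means replacing a `LocalSubgraph` with a `RemoteSubgraph` in the gateway's configuration, while the rest of the gateway stays the same.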

But my service doesn't support GraphQL!

This can of course happen. Not every service that you might want to use supports GraphQL natively. You then have several options. You could have clients connect directly to this service, outside of your single graph. In our opinion this is valid only in some cases, as it is hard to keep under control and has several downsides.

Another, more robust option would be to write your own GraphQL schema on top of the service that you are consuming, effectively wrapping that existing API. And then use federation to add this schema to your single graph. This is the approach we usually take.
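A minimal sketch of such a wrapper is shown below, assuming a hypothetical REST endpoint and payload shape (this is not the API of any particular CMS). The resolver receives the HTTP fetch function as a parameter, so the mapping from REST payload to graph fields can be tested without a network.

```python
from typing import Any, Callable, Dict

# A function that GETs a URL and returns the decoded JSON body.
FetchJson = Callable[[str], Dict[str, Any]]


def make_content_resolver(
    fetch_json: FetchJson, base_url: str
) -> Callable[[str], Dict[str, Any]]:
    """Build a resolver that exposes a REST resource as fields of a graph type."""

    def resolve_content(content_id: str) -> Dict[str, Any]:
        # Hypothetical endpoint layout: <base_url>/content/<id>
        raw = fetch_json(f"{base_url}/content/{content_id}")
        # Map the (assumed) REST payload onto the fields our schema exposes.
        return {
            "id": raw["id"],
            "title": raw["body"]["title"],
            "imageUrl": raw["body"].get("image", {}).get("url"),
        }

    return resolve_content


if __name__ == "__main__":
    def fake_fetch(url: str) -> Dict[str, Any]:
        return {"id": "c1", "body": {"title": "Hello", "image": {"url": "https://example.com/a.png"}}}

    resolver = make_content_resolver(fake_fetch, "https://cms.example.com")
    print(resolver("c1"))
```

Clients of the graph only ever see `id`, `title` and `imageUrl`; the shape of the underlying REST payload stays an implementation detail of this one service.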

Stay tuned for a next article, in which we explain how to use the GraphQL Schema Definition Language to generate most code & configuration you need for, in this case, the headless CMS Amplience, which does not (yet) support GraphQL out of the box.

Conclusion: federation allows for composition of your single (data) graph across teams and services

While you can argue that using Federation introduces an additional component and thus complexity, we believe that it is worth it in situations where you are composing systems out of various GraphQL APIs. That is quite common in Composable architectures, where every service has an API, preferably exposed as GraphQL. Federation ensures that you can elegantly combine these services without ending up with a ‘heavy headed’ BFF implementation, whilst keeping responsibilities where they belong.

The alternatives to Federation, which would be to stitch schemas or even build a monolithic BFF, would in our view be close to re-inventing the wheel, rather than using the industrialised (and supported) practices that are available with Federation. The best alternative would be to apply Local Federation, which gives you the benefits at the start, without much of the added complexity of distributed services.

Further reading

At the risk of triggering “you’re not Netflix, so you don’t need this” feedback: how Netflix scales its API with GraphQL federation is a very nice read, from both a technical and an organisational perspective. They also have several other inspirational blog posts and open source tools!

Also, in case you work with commercetools, you might want to have a look at our library that enables federation for commercetools (as that is not supported natively), next to providing plugin points for creating your own resolvers and transforms.

Last but not least: the future of Federation

Recently, Apollo announced Apollo Router, their next-generation GraphQL Federation runtime written in Rust (and very fast).

Early benchmarks show that the Router adds less than 10ms of latency to each operation, and it can process 8x the load of the JavaScript Apollo Gateway. Packaged as a standalone, multi-threaded binary, the Router can use all available CPU cores without needing to run multiple instances on a single machine.

Learn more about Apollo Router through the GitHub repo (caution: still in alpha).
