SDL or Code-first GraphQL Schemas?
Whether to build GraphQL schemas directly using the GraphQL SDL or by using code has been a debate since GraphQL’s inception. We’re in 2022, and things haven’t changed much. According to State of GraphQL 2022, people are pretty much split equally between the two options. I’ve had the chance to try both options in fairly large organizations over the past years, and the results don’t surprise me. Both options work, but they require vastly different investments and trade-offs. In this post, I’ll try to highlight the benefits and downsides of each approach and what tools might be needed to be successful.
Design First Tho!
I can hear many already thinking that “code-first” will inevitably lead to poor API design. After all, shouldn’t we think about the API interface first, and then code the implementation details? Absolutely. Here I am talking about how to implement a GraphQL schema. Certain libraries allow you to define your interface directly using the SDL, while others offer code abstractions, like classes that define types. You should use the SDL to talk about schema design, but once you’re ready to implement the schema in your server, both options are available to you.
Reusable Abstractions & Consistency
The main advantage of using code to build your GraphQL schema is that, well, you have the power of a full programming language to build it instead of the rather simple GraphQL schema definition language. That means it’s possible to build very powerful abstractions that help your team build more consistent and powerful schemas. For example, we can use code that helps us build all the Relay Connection types for a paginated field, or even a mutation function that builds the unique input argument and wraps results in a `Payload` type. We can encode best practices directly into the schema builder. At scale, I have found this is the best approach to ensure a consistent and well-designed GraphQL API.
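To make the Relay Connection example a bit more concrete, here is a minimal sketch of such an abstraction. It does not use any real GraphQL library: `FieldDef`, `ObjectTypeDef`, `connectionTypes`, and `printType` are hypothetical names I made up for illustration, and a real schema builder would produce executable types rather than plain strings.

```typescript
// Hypothetical mini schema builder: every name here is illustrative,
// not the API of any real GraphQL library.

interface FieldDef {
  name: string;
  type: string; // GraphQL type reference written as SDL, e.g. "String!"
}

interface ObjectTypeDef {
  name: string;
  fields: FieldDef[];
}

// Builds the Edge and Connection types that Relay-style pagination expects
// for a given node type, so nobody has to write them by hand.
function connectionTypes(nodeType: string): ObjectTypeDef[] {
  const edge: ObjectTypeDef = {
    name: `${nodeType}Edge`,
    fields: [
      { name: "cursor", type: "String!" },
      { name: "node", type: nodeType },
    ],
  };
  const connection: ObjectTypeDef = {
    name: `${nodeType}Connection`,
    fields: [
      { name: "edges", type: `[${edge.name}]` },
      { name: "pageInfo", type: "PageInfo!" },
    ],
  };
  return [edge, connection];
}

// Renders one type definition as SDL.
function printType(def: ObjectTypeDef): string {
  const lines = def.fields.map((f) => `  ${f.name}: ${f.type}`);
  return `type ${def.name} {\n${lines.join("\n")}\n}`;
}

// One call yields both types, named and shaped consistently.
const sdl = connectionTypes("User").map(printType).join("\n\n");
console.log(sdl);
```

Every paginated field built this way gets the same naming and the same shape, which is exactly the kind of consistency that is hard to enforce by convention alone.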
This doesn’t mean we can’t achieve something close with SDL-first APIs. It just means that we have to mitigate this through other means. In practice, this usually means using or building these tools:
- GraphQL Schema Linter*
- Schema Snippets or Library (for common patterns like pagination & mutations).
- GraphQL specific editor tooling.
*Note that a GraphQL Schema Linter is still very important even when building the schema using code. However, it is crucial when opting for SDL-first approaches.
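As a rough idea of what a schema lint rule can look like, here is a small sketch that checks the mutation convention mentioned earlier: a single `input` argument and a `Payload` return type. The `MutationField` shape and the `lintMutations` function are assumptions for illustration. A real linter would walk the parsed schema rather than a hand-built array.

```typescript
// Hypothetical in-memory shape for a mutation field. A real linter would
// derive this from the parsed schema instead.
interface MutationField {
  name: string;
  args: string[]; // argument names
  returnType: string; // named return type, without list or non-null wrappers
}

// Rule: each mutation takes exactly one argument named "input" and
// returns a type whose name ends in "Payload".
function lintMutations(fields: MutationField[]): string[] {
  const problems: string[] = [];
  for (const field of fields) {
    if (field.args.length !== 1 || field.args[0] !== "input") {
      problems.push(`${field.name}: expected a single "input" argument`);
    }
    if (!/Payload$/.test(field.returnType)) {
      problems.push(`${field.name}: return type should end in "Payload"`);
    }
  }
  return problems;
}

const problems = lintMutations([
  { name: "createUser", args: ["input"], returnType: "CreateUserPayload" },
  { name: "deleteUser", args: ["id", "force"], returnType: "Boolean" },
]);
console.log(problems.join("\n"));
```

With SDL-first, a rule like this is the only thing standing between you and an inconsistent mutation. With code-first, the builder can make the inconsistent version impossible to write in the first place.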
On the flip side, building our schema using code abstractions means that it can be harder to visualize what the GraphQL interface will look like. There is this code -> schema transformation happening somewhere, and the more complex your abstractions, the harder it is to see how your changes impact the final schema. Breaking changes can slip in, or just changes you did not expect. When using the SDL, we are directly (or very close to) writing the resulting interface. It’s much easier to reason about the final schema.
A small note on what I consider to be the worst-of-both-worlds approach: things that use the SDL as the basic building block, but then use transforms to turn it into something else. For example, you could use a magic comment to turn a field into a Relay paginated one, or have a schema transform pipeline that rewrites your schema into a different one. You lose the main advantage of an SDL-first approach, clearly seeing the resulting interface, and don’t gain the advantages of using the full power of code to generate your schema.
Tooling & Interoperability
A lot of GraphQL tools operate on the GraphQL SDL. When using a code-first approach, you are stuck in a language-specific world, and might have to build your own tooling if using your own custom abstractions. Compare that to SDL-first, where tooling can directly operate on our schema definitions.
Because of change visibility and tooling, code-first approaches usually must implement a schema printing workflow. Most GraphQL libraries have some sort of printSchema function at this point. Here’s an example of such a workflow. Happy path first:
- Make changes to schema.
- Run command to print SDL.
- Commit schema changes and generated SDL file to repository.
And the unhappy path:
- Make changes to the schema.
- Push changes.
- CI prints the schema and compares it against the checked-in SDL file. Diffs are detected, CI fails 🛑
- Run command to print SDL.
- Push SDL.
Now we know there is an up-to-date SDL at all points in our repository. This enables tooling to work off that SDL, and viewing the diff is much easier on pull requests. Note that this does require a generation step, so if you have tooling running at development time, it will never be as instant as if we implemented the schema directly as SDL. Part of the trade-off!
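The CI comparison step can be as simple as the sketch below. It assumes the freshly printed SDL and the checked-in SDL have already been read into strings. `compareSdl` and `DriftReport` are hypothetical names, and the line-based report is deliberately naive: it is only meant to give a readable failure message, not a real diff.

```typescript
interface DriftReport {
  inSync: boolean;
  added: string[]; // lines only in the freshly printed SDL
  removed: string[]; // lines only in the checked-in SDL
}

// Compares the SDL printed from code against the copy committed to the
// repository. Line endings and surrounding whitespace are normalized first.
function compareSdl(printed: string, checkedIn: string): DriftReport {
  const toLines = (s: string): string[] =>
    s.replace(/\r\n/g, "\n").trim().split("\n");
  const printedLines = toLines(printed);
  const checkedInLines = toLines(checkedIn);
  return {
    inSync: printedLines.join("\n") === checkedInLines.join("\n"),
    added: printedLines.filter((l) => checkedInLines.indexOf(l) === -1),
    removed: checkedInLines.filter((l) => printedLines.indexOf(l) === -1),
  };
}

// Example: a field was added in code, but the SDL file was not regenerated.
const report = compareSdl(
  "type User {\n  id: ID!\n  email: String\n}",
  "type User {\n  id: ID!\n}\n"
);
if (!report.inSync) {
  console.log("Checked-in SDL is stale. Missing lines:", report.added);
}
```

In a CI job, you would fail the build whenever `inSync` is false and tell the author to run the print command again.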
It’s often useful to attach metadata to schema entities so that they can drive behaviour at execution time or be consumed by other tools. The classic example in an SDL-first world is to annotate schema members with directives, and then use those to drive middleware execution. In my opinion, directives are often poorly suited for this, and code can be much more powerful to express metadata and custom use cases. This often leads to talks about internal vs external directives. Honestly, I kind of see directives used this way (purely internal) as a bit of a hack.
When using code-first, our metadata is whatever we want. Instead of an authorization directive, we can implement an authorization hook on a class, directly integrate with authz abstractions in the same code base, etc. Then when it comes to introspection and SDL printing, we use directives only for things we want to be exposed externally.
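Here is a small sketch of that split between internal and external metadata. The `FieldConfig` shape, the `authorize` hook, and the helper functions are all hypothetical, not the API of a specific library. The only real GraphQL feature used is the built-in `@deprecated` directive.

```typescript
interface Viewer {
  roles: string[];
}

// Hypothetical code-first field definition.
interface FieldConfig {
  name: string;
  type: string;
  // Internal metadata: used at execution time, never printed.
  authorize?: (viewer: Viewer) => boolean;
  // External metadata: printed as directives in the SDL.
  directives?: string[];
}

// Only the external directives make it into the printed SDL.
function printField(field: FieldConfig): string {
  const directives = (field.directives || []).map((d) => ` @${d}`).join("");
  return `${field.name}: ${field.type}${directives}`;
}

// The authorization hook runs at execution time. Fields without one are open.
function canResolve(field: FieldConfig, viewer: Viewer): boolean {
  return field.authorize ? field.authorize(viewer) : true;
}

const salary: FieldConfig = {
  name: "salary",
  type: "Int",
  authorize: (viewer) => viewer.roles.indexOf("admin") !== -1,
  directives: ["deprecated"],
};

console.log(printField(salary));
console.log(canResolve(salary, { roles: ["admin"] }));
```

The authorization rule is plain code, so it can call into whatever authz abstractions already exist in the code base, while clients looking at the SDL only see what we chose to expose.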
Schema metadata is a hard problem in general, but when it comes to internal metadata used to configure things like middlewares, in my experience, code-first is much more powerful.
Code-first also usually leads to co-location of metadata/schema and resolve logic. This can lead to less cognitive overhead, but some folks like the clear separation.
Sometimes, the language ecosystem you’re in will dictate the choice. If there is a very mature and powerful library in your language that prefers SDL-first, it might be wise to go with this instead of rolling your own code-first approach or using a library that doesn’t look as great. This will most likely have a much higher impact than SDL-first vs code-first.
I prefer the code-first approach when it comes to building large schemas at scale. It has very high leverage and enables teams to move fast and keep some consistency and quality.
However, I think this is especially true in a certain context: a monolithic, single-language environment. Micro-service / federated environments can mean multiple languages and libraries, and the SDL-first approach might be easier to implement in those cases.
In the end, both approaches have blind spots. Different tools and processes will need to be implemented to mitigate those blind spots. Pick the approach that matches your most important goals while weighing the cost of building or using tooling to deal with the downsides.
If I missed any trade-offs, let me know on Twitter and I’ll discuss them here.