A Guide to GraphQL Rate Limiting & Security

8 min readAug 3, 2022

The Problem

Rate limiting and securing server resources are central problems when developing any kind of API. We want to prevent clients from being able to affect the experience of other clients, or simply avoid being taken down. That can happen, among other things, due to an excessive volume of requests, requests that are too expensive, or too many requests at the same time.

Things are no different with GraphQL. A GraphQL schema exposes a set of possible API capabilities, from which GraphQL clients can select exactly what they want. However, “exactly what they want” can turn into a complex problem. Not only can clients send too many GraphQL requests, but they could also simply send fewer, but very expensive GraphQL requests.

Note: It’s interesting to think about whether this is really a GraphQL problem per se. The same problem exists in other API styles, not all resources, endpoints or remote procedures “cost” the same to a server. The problem of variable cost exists there as well. In fact, some API providers have rate-limiting “tiers”. GraphQL does make this even more variable though, and it’s very hard to bucket the use cases into tiers like this.

Very closely related to rate limiting, we may want to forbid the execution of malicious queries right off the bat. Rate limiting usually allows queries until some threshold has been met, but sometimes we want to block them ahead of time, before even executing them. We’ll explore this problem as well.

Rate Limiting

Let’s start with the problem of rate limiting. We want to make sure our clients have a certain quota of GraphQL requests to respect. However, we can’t simply count the number of queries. A ton of small queries is ok, while larger queries should be less frequent. How do we do this?

What to Rate Limit?

The most popular approach for GraphQL rate limiting is to assign a GraphQL document a “cost” or “complexity” and to use this number to rate limit instead of just counting the number of requests hitting our server. This is usually done in a static way, by analyzing the query AST before even executing it. However it’s possible to analyze it both during execution, or looking at the final response. The static analysis is important if you want to block queries above a certain complexity, before they execute. You could however, decide to rate limit through things like response or execution analysis.

GitHub implements this approach in their public GraphQL API. There are many implementations of this approach out there, in open source libraries and some private ones like GitHub and Shopify. Luckily, there’s something even more formal. A group of researchers have published a paper called “A principled approach to GraphQL cost analysis” which covers this approach in a very formal way. Highly recommend you take a look: https://dl.acm.org/doi/10.1145/3368089.3409670

For an in-depth explanation and ideas around cost analysis, check out this great resource: https://mmatsa.com/blog/why-cost-analysis/

GraphQL query cost analysis is a great way to give value to requests coming our way, but there are a few gotchas you have to be aware of, especially with static analysis:

Not all fields are created equal: You’ll have to make sure to manually assign higher costs to fields that are known to be more resource intensive or come up with a good heuristic for it. Over time and at scale, this can be difficult.
To compute a cost, these algorithms usually depend heavily on pagination arguments. The same field, depending on if it returns the first 5 or the first 1000 items, can be vastly different in cost. Make sure you have mostly paginated list types, and that they are paginated in a hopefully pretty standardized way.
Pagination arguments are only an upper bound. Maybe the client is asking for 1000 items, but if there was only 1 item in the collection, we charged them 1000 points for nothing. Some folks have played with the idea of “refunds” if the response ends up containing less than what we charged up front, but this gets complex for clients.
There is no good way to determine how expensive a plain list type field will be. Generally have to be careful with those.
You may want to give mutations a bigger cost by default, in most applications, writes are more expensive than reads by default.

Even though there are gotchas, query complexity remains a pretty good value to rate limit in GraphQL APIs. Now that we’ve got the what, we have to move on to how we rate limit this value.

Algorithms

The rate-limiting algorithms don’t change much whether you’re using a GraphQL or REST API. Here are some common ones:

Fixed window rate limiting.
Sliding window rate limiting.
Token Bucket / Leaky bucket rate limiting.
Genetic Cell Rate Algorithm (a potential improvement over leaky bucket)

Window-based rate limiting, especially fixed window rate limiting has some considerable drawbacks. With fixed window rate limiting, if a client exceeds their limit within the first seconds of the window, they have to wait till the next window to start making requests again. A lot of request patterns are more “bursty”, and leaky bucket style rate limiting is much more natural to those patterns.

My advice is to use something close to a leaky bucket / GCRA algorithm. It fits common request patterns a bit better and allows for bursts of requests without completely blocking a client from recovering.

Dimensions

In some cases, simply rate limiting for cost per x amount of time, even if using something like a leaky bucket algorithm, might not be enough. Concurrency is a great example. Maybe we’re fine with 1000 complexity per second, but not fine with 100 times 10 complexity at the exact same time. You’ll want to think about some of those different dimensions. For HTTP APIs, sometimes we want to rate limit by:

Overall requests per x time
Max request concurrency
Max requests a single resources
Any other domain-specific dimension you may have

There are also dimensions of who we are limiting, for example, per IP address, or perhaps just by client ID. Both could be needed.

For GraphQL APIs, it might just be making sure you have concurrency limiting somewhere in your stack, and using query cost rate limiting as well. With that in place, you should have a good system to score, and rate limit GraphQL clients. But we’re probably not done, moving on to malicious queries in general.

Detecting and Blocking Malicious Queries

While we’re now able to give a general cost to incoming requests, and rate limit per client. However, there are approaches to crafting malicious queries we need to be aware of, some can be mitigated by query complexity, others might require specific handling.

Query Depth

There’s a lot of focus on abusing query depth in the GraphQL community. You’ll see that it’s only one way to craft potentially malicious queries, but it’s worth mentioning anyway since it's a popular concern.

query {
  store {
    product {
      store {
        product {
          ... and so on
        }
      }
    }
  }
}

While this may look really bad, query depth is not necessarily worst than query breadth:

query {
  a: product { name }
  b: product { name }   
  c: product { name }
  d: product { name } 
  ... and so on
}

Two things can generally avoid disasters here:

Using the dataloader pattern / caching data access so that repetitive fields or recursive fields have cached results.
Using query complexity, as we covered above. We want to use query complexity to block certain queries before they’re executed here, on top of using it as a rate-limiting mechanism
A lot of GraphQL libraries have a max query depth setting which you can use to limit depth. It’s great, just do not mistake this as a way to stop all expensive queries. It’s only one pattern of access.

If those two first things are in place, generally, query breadth or query depth usually simplifies to the same problem of an “expensive query”.

Attacking the GraphQL Parser

Some smart attackers can slide in before we even had a chance to compute query complexity, during the parsing phase. This can be done in many ways, for example passing a list of billions of fields or arguments. GraphQL libraries often implement things to make sure you’re protected, but here are the two things to think about:

A token limit/timeout in the parser.
A max byte size for query documents. You probably have one already along the request path, but it is worth checking if it is right for your GraphQL API.

Attacking the Response Generation

Another way we’ve seen to attack a GraphQL API is to craft a query that results in a giant response, most commonly through generating validation errors. In general two things could help in those cases:

Capping the amount of returned errors
Having limits to input and field arguments, start with something fairly strict and be more open when needed (easier to relax an API than to make it more strict)

Observability

A quick note of observability. Make sure that if for some reason there was a resource exhaustion problem, you’re able to pinpoint which queries and which clients are the cause. Make sure you can answer some of these questions:

Which queries had the highest response time in the past 30 minutes?
Which queries had the highest query complexity?
Which clients were sending these queries?

Finally, once you have the answers, you may implement a way to cut off some of these clients or queries, blocking them from hitting the API altogether.

Query Allow Listing

Especially for internal APIs, It’s worth mentioning persisted queries and query allow-listing, which helps prevent intentionally malicious queries from getting executed by your GraphQL server.

You should still run query complexity analysis and some of the other safeguards we talked about but in this case done at query registration time rather than runtime. This offers some pretty sweet performance benefits as Erik Wittern noted on Twitter:

The benefit is that cost analysis adds latency, so having to run it only once when “deploying” is a huge improvement over running it on every request.

The Multi-Layered Approach

Just like with certain hum… viruses, a multi-layered approach is usually best. Here are my recommendations:

Implement a query complexity analyzer, block queries above a certain complexity and use it to rate limit clients using a leaky-bucket style algorithm.
Set a max byte-size to incoming GraphQL requests and make sure you or your GraphQL library implements protections for the parser.
Use pagination limits, and sane input limits on all potentially problematic fields.
Implement query timeouts. In case we missed something, we don’t want a malicious query to go on for minutes.
If something does go wrong, make sure you can identify which clients, IPs, and queries are the cause, and disable their access if you can.
Use persisted queries and allow listing, and make sure to run some of these safe guards at query registration time.

Finally, there are new abuse vectors coming out for most technologies over time, so do make sure you keep your libraries, and knowledge updated over time! Have you heard of other ways to exploit the resources of a GraphQL server? Feel free to let me know. Of course, haven’t covered things like data access and authorization here which are really important, but out of scope for this article.

Marc