Be sure to cache Auth0 JWK when using Rails
25 Apr 2020 - Rails, Auth0, Datadog, Optimization
As of today, when using code from Auth0 Rails QuickStart you may encounter performance issues as the JWK is fetched for every request. Let’s fix that.
Recently a friend of mine asked me to help him understand why some of his requests needed roughly one second to be processed by Rails. They have the following flow:
- If there is a JWT token in
- Fetch the JWK from the auth tenant;
- Validate & verify the token;
- Fetch user associated to the user ID stored in the above JWT;
- Execute the GraphQL query.
Looking at the logs, we see a huge number of SQL queries. This is certainly related to an N+1 issue, which is quite common when using GraphQL. However, ActiveRecord accounts for less than 8% of the total time, so the queries alone do not explain the slowness.
To get more visibility, we set up Datadog APM. Initially, we enabled only the Rails instrumentation using:
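The original snippet is not shown here. A minimal sketch, assuming the ddtrace 0.x configuration API (the same one as the c.use :http call mentioned further down) and a conventional initializer path, could look like this:

```ruby
# config/initializers/datadog.rb (conventional location, an assumption here)
require "ddtrace"

Datadog.configure do |c|
  # Rails instrumentation only, which is what the first trace was built from.
  c.use :rails
  # The HTTP integration was enabled later in the investigation:
  # c.use :http
end
```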
Here is the associated trace:
The user is fetched very late in the request even though it is the second step of the flow. After adding a couple of spans and enabling the HTTP integration (using c.use :http), we were able to easily spot the issue:
It seems most of the time is spent fetching the JWK from Auth0.
Note that the first 150 ms of blank space here is due to the Rails server startup (the SQL requests in red correspond to introspection requests). Conversely, the first flame graph was generated after warmup.
The code used to verify the JWT (which also fetches the JWK) was copy-pasted from the Auth0 Rails QuickStart. Unfortunately, it fetches the JWK on every request.
So let’s cache it for a few minutes or hours to avoid hurting performance and being rate-limited (this is the default in JS when using node-jwks-rsa from Auth0).
Here is a solution using Rails.cache (do not forget to configure it in development):
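The idea can be sketched as follows. This is not the exact code from the project, only a minimal illustration under a few assumptions: the JWKS URL is a placeholder, the one-hour TTL is arbitrary, and the MemoryCache class, the jwks_raw method, and the injectable cache and fetcher accessors are hypothetical additions that let the sketch run outside Rails. Inside a Rails process it falls back to Rails.cache.

```ruby
require "net/http"
require "json"
require "openssl"
require "uri"

# Tiny in-memory stand-in with the same `fetch(key, expires_in:) { ... }` shape
# as Rails.cache, so the sketch also runs outside a Rails process (hypothetical).
class MemoryCache
  def initialize
    @entries = {}
  end

  def fetch(key, expires_in:)
    value, expires_at = @entries[key]
    return value if expires_at && Time.now < expires_at

    value = yield
    @entries[key] = [value, Time.now + expires_in]
    value
  end
end

class JsonWebToken
  # Placeholder URL: replace YOUR_TENANT with the real Auth0 domain.
  JWKS_URL = "https://YOUR_TENANT.auth0.com/.well-known/jwks.json"
  JWKS_TTL = 60 * 60 # seconds; an arbitrary value for this sketch

  class << self
    # Both are injectable so the cache and the HTTP call can be swapped in tests.
    attr_writer :cache, :fetcher

    def cache
      @cache ||= if defined?(Rails) && Rails.respond_to?(:cache)
                   Rails.cache
                 else
                   MemoryCache.new
                 end
    end

    def fetcher
      @fetcher ||= -> { Net::HTTP.get(URI(JWKS_URL)) }
    end

    # Only the raw JSON string is cached: a String serializes in any cache
    # store, whereas OpenSSL key objects generally cannot be marshalled.
    def jwks_raw
      cache.fetch("auth0/jwks", expires_in: JWKS_TTL) { fetcher.call }
    end

    # Maps each key ID (kid) to the public key of its first x5c certificate.
    def jwks_hash
      keys = Array(JSON.parse(jwks_raw)["keys"])
      keys.to_h do |k|
        der = k["x5c"].first.unpack1("m") # Base64-decode the DER certificate
        [k["kid"], OpenSSL::X509::Certificate.new(der).public_key]
      end
    end
  end
end
```

With this shape, the HTTP call to Auth0 only happens on a cache miss. In development, Rails typically uses a null cache store unless caching has been switched on (for example with bin/rails dev:cache), in which case every request would still hit Auth0.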
Also, we added a call to JsonWebToken.jwks_hash during application startup to avoid a cold start for the first authenticated request (sadly, users may still experience a cache miss later on, which will lead to a slowdown).
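One possible way to wire the warm-up, assuming a standard Rails initializer (the file name is hypothetical, and the rescue is a design choice of this sketch so that an unreachable Auth0 tenant does not prevent the application from booting):

```ruby
# config/initializers/jwks_warmup.rb (hypothetical file name)
Rails.application.config.after_initialize do
  begin
    # Populate the cache before the first authenticated request arrives.
    JsonWebToken.jwks_hash
  rescue StandardError => e
    Rails.logger.warn("JWKS warm-up failed: #{e.message}")
  end
end
```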
Feedback to Auth0
Don’t forget to monitor your API endpoints. Depending on your resources you may be able to use tools such as Datadog to create dashboards & alerts.
Also, even if we have tackled the main issue, we are not forgetting the N+1 issue 😄 Hopefully it should be part of a new post.