Be sure to cache Auth0 JWK when using Rails

25 Apr 2020 - Rails, Auth0, Datadog, Optimization

As of today, when using the code from the Auth0 Rails QuickStart, you may encounter performance issues because the JWK is fetched on every request. Let’s fix that.

Context

Recently a friend of mine asked me to help him understand why some of his requests needed roughly one second to be processed by Rails. The requests follow this flow:

  • If there is a JWT token in Authorization:
    • Fetch the JWK from the auth tenant;
    • Validate & verify the token;
  • Fetch the user associated with the user ID stored in the above JWT;
  • Execute the GraphQL query.
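
The flow above can be sketched in plain Ruby. This is only an illustration, not the application's actual code: bearer_token and handle_request are hypothetical names, the three callables stand in for the real verification, user lookup, and GraphQL execution, and reading the user ID from the "sub" claim is an assumption.

```ruby
# Hypothetical helper: extract the JWT from an Authorization header value.
# Returns nil when the header is missing or is not a Bearer credential.
def bearer_token(authorization_header)
  return nil if authorization_header.nil?

  scheme, token = authorization_header.strip.split(/\s+/, 2)
  return nil unless scheme&.casecmp?("Bearer")
  return nil if token.nil? || token.empty?

  token
end

# Sketch of the request flow. verify_token, find_user and run_query are
# hypothetical callables standing in for the real application code.
def handle_request(headers, verify_token:, find_user:, run_query:)
  token = bearer_token(headers["Authorization"])
  user = nil
  if token
    # In the real application this step fetches the JWK, then validates the token.
    claims = verify_token.call(token)
    # Assumption: the user ID is stored in the "sub" claim.
    user = find_user.call(claims["sub"])
  end
  run_query.call(user)
end
```

Because the user lookup depends on the verified claims, anything slow inside the verification step delays every later step of the request.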

Requests traces

Looking at the logs, we see a huge number of SQL queries. This is almost certainly an N+1 issue, which is quite common when using GraphQL. However, ActiveRecord accounts for less than 8% of the total time, so the queries cannot explain the slowness.

To get more visibility we set up Datadog APM. Initially, we configured Rails instrumentation using:

# config/initializers/datadog.rb

Datadog.configure do |c|
  c.analytics_enabled = true
  c.use :rails
end

Here is the associated trace:

Initial Flame Graph

The user is fetched very late in the request, even though it is the second step of the flow. After adding a couple of spans and enabling the HTTP integration (using c.use :http), we were able to easily spot the issue:

Pimped Flame Graph

It seems most of the time is spent fetching the JWK from Auth0.

Note that the first 150 ms of blank space in this trace is due to the Rails server startup (the SQL requests in red correspond to introspection requests). The first flame graph, by contrast, was generated after warmup.

Optimize

The code used to verify the JWT token (which also fetches the JWK) was copy-pasted from the Auth0 Rails QuickStart. Unfortunately, it fetches the JWK on every request.

So let’s cache it for a few minutes or hours to avoid hurting performance and being rate-limited (caching is the default in JS when using node-jwks-rsa from Auth0).

Here is a solution using Rails.cache (do not forget to configure it in development):

   def self.jwks_hash
-    jwks_raw = Net::HTTP.get URI("https://YOUR_DOMAIN/.well-known/jwks.json")
+    jwks_raw = Rails.cache.fetch("jwks_hash", expires_in: 2.hours) do
+      Net::HTTP.get URI("https://YOUR_DOMAIN/.well-known/jwks.json")
+    end

Also, we added a call to JsonWebToken.jwks_hash during application startup to avoid a cold start for the first authenticated request (sadly, users may still experience a cache miss once the entry expires, which leads to a slowdown for that request).
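
To make the effect of caching and warm-up concrete without a Rails application, here is a minimal stand-alone sketch. TtlCache is a hypothetical class that loosely mirrors the fetch-with-expiry behavior of Rails.cache.fetch; it is not the Rails implementation. The fetch_jwks lambda is a stand-in for the HTTP call to the JWKS endpoint, so no network is involved.

```ruby
# Minimal in-memory cache with a fetch-or-compute API, loosely mirroring
# Rails.cache.fetch(key, expires_in:). TtlCache is a hypothetical name.
class TtlCache
  # The clock is injectable so expiry can be exercised without sleeping.
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @store = {}
    @mutex = Mutex.new
  end

  # Returns the cached value for key if it has not expired; otherwise runs
  # the block, stores its result for expires_in seconds, and returns it.
  def fetch(key, expires_in:)
    @mutex.synchronize do
      entry = @store[key]
      now = @clock.call
      return entry[:value] if entry && now < entry[:expires_at]

      value = yield
      @store[key] = { value: value, expires_at: now + expires_in }
      value
    end
  end
end

fetch_count = 0
# Stand-in for the HTTP request to the JWKS endpoint (assumption: no network here).
fetch_jwks = lambda do
  fetch_count += 1
  '{"keys":[]}'
end

cache = TtlCache.new
two_hours = 2 * 60 * 60

# Warm-up at boot, followed by three simulated authenticated requests.
cache.fetch("jwks_hash", expires_in: two_hours) { fetch_jwks.call }
3.times { cache.fetch("jwks_hash", expires_in: two_hours) { fetch_jwks.call } }

puts fetch_count # prints 1: only the warm-up call reached the "network"
```

The warm-up call pays the fetch cost once at boot. Later requests are served from the cache until the entry expires, at which point a single request pays the cost again.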

Feedback to Auth0

It seems we were not the first ones to hit this issue.

Fortunately all Auth0 documentation is available on GitHub. I have created an issue to update the QuickStart. Hopefully it will contain updated code or a note about this issue. Stay tuned.

Final thoughts

Don’t forget to monitor your API endpoints. Depending on your resources you may be able to use tools such as Datadog to create dashboards & alerts.

Also, even though we have tackled the main issue, we are not forgetting the N+1 issue 😄 Hopefully it will be the subject of a future post.