Please Stop Reinventing Authentication

You may have guessed from my last post that I’m not especially fond of the current JavaScript ecosystem. You would have guessed correctly. Still, in the interest of employability I try to keep up to date with Facebook’s platform. Most recently that’s meant playing around with GraphQL and reading through some different GraphQL-based projects (and starter projects) on GitHub.

My experience of reviewing all of these GraphQL projects has been, well, alarming. “Batteries-included” frameworks like Rails and Django gave people reasonably secure authentication systems out of the box, but the trend toward single-page applications and microservices has pushed many people to use much more spartan platforms. As a result, they’ve started rolling their own authentication with the tragic results you’d expect.

In this post, I am going to make the case for being lazy and (to an extent) stuck in the past. Everyone has already had it drilled into their heads that rolling your own cryptography is madness, but I think we’ve reached the point where we need to recognize that rolling our own authentication is just as insane.

Please Stop Reinventing Sessions

I think a lot of the compulsion to re-invent sessions boils down to people not fully understanding how “traditional” session-based authentication systems work, so let’s take this one from the top.

HTTP is stateless. The client asks the server for some resource, and the server responds to the request. The client asks for another resource, and the server responds again, with no “memory” of the previous exchange(s). If the server needs extra information to process the request – such as the identity of the user who made it – we must send that data in an HTTP header.

Traditionally, the headers of choice for user sessions were Set-Cookie and Cookie. The server sends a Set-Cookie header that defines some name=value pairs of data that the client should “remember”, an expiration date (after which the data is no longer valid), the domain that the data is valid for, and (optionally) some useful security-related flags that limit the data to secure connections and prevent scripts from accessing it. Once Set-Cookie has been used, the client sends the Cookie header with all subsequent requests to that domain, simply repeating the name=value pairs. We call this a “cookie”.
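To make the two headers concrete, here is a small sketch using Python's standard http.cookies module. The cookie name session_id, its value, and the example.com domain are placeholders I made up for illustration; a real framework builds these headers for you.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header for a response.
# "session_id" and "abc123" are placeholder values for this example.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
response_cookie["session_id"]["domain"] = "example.com"
response_cookie["session_id"]["path"] = "/"
response_cookie["session_id"]["max-age"] = 3600   # expire after one hour
response_cookie["session_id"]["secure"] = True    # only send over HTTPS
response_cookie["session_id"]["httponly"] = True  # hide from page scripts

set_cookie_header = response_cookie.output(header="Set-Cookie:")
print(set_cookie_header)

# Client side: every later request repeats only the name=value pair.
cookie_header = "session_id=abc123"

# Server side again: parse the Cookie header from the incoming request.
request_cookie = SimpleCookie()
request_cookie.load(cookie_header)
session_id = request_cookie["session_id"].value
print(session_id)
```

Note that the attributes (domain, expiry, flags) only travel in the Set-Cookie direction. The Cookie header that comes back carries nothing but the name and the value.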

Traditionally (there’s that word again), you would create a login session on the server – perhaps stored in a cache like Redis, a database like PostgreSQL, or even a file – and would send the session ID in a cryptographically-signed cookie. When the server processes a request, it (1) verifies the cookie’s signature and (2) retrieves whatever information it needs about the currently-authenticated user based on the session ID. If you want to force a user’s session to end, you can simply delete the session from the store. The biggest weak point is the session ID itself, which we need to keep secret.
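Here is a rough sketch of that flow, assuming an HMAC-SHA256 signature and an in-memory dict standing in for the session store. The function names and the "id.signature" cookie layout are my own inventions for illustration, not the format used by Django, express-session, or any other library (which is exactly why you should use theirs and not this).

```python
import hashlib
import hmac
import secrets

SECRET_KEY = secrets.token_bytes(32)  # hypothetical server-side signing key
SESSIONS = {}                         # stand-in for Redis, PostgreSQL, a file, ...

def _mac(session_id: str) -> str:
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()

def sign(session_id: str) -> str:
    """Return 'session_id.signature' for use as the cookie value."""
    return f"{session_id}.{_mac(session_id)}"

def unsign(cookie_value: str):
    """Return the session ID if the signature checks out, else None."""
    session_id, _, mac = cookie_value.rpartition(".")
    if hmac.compare_digest(mac.encode(), _mac(session_id).encode()):
        return session_id
    return None

def log_in(user: str) -> str:
    """Create a server-side session and return the signed cookie value."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {"user": user}
    return sign(session_id)

def current_user(cookie_value: str):
    """(1) verify the signature, (2) look the session up in the store."""
    session_id = unsign(cookie_value)
    session = SESSIONS.get(session_id) if session_id else None
    return session["user"] if session else None

def log_out(cookie_value: str) -> None:
    """Ending a session is just deleting it from the store."""
    session_id = unsign(cookie_value)
    if session_id:
        SESSIONS.pop(session_id, None)

cookie = log_in("alice")
print(current_user(cookie))        # the logged-in user
print(current_user("x" + cookie))  # tampered cookie: signature no longer matches
log_out(cookie)
print(current_user(cookie))        # session deleted server-side
```

The last three lines are the point: a tampered cookie is rejected, and once the server deletes the session, the old cookie is worthless even though its signature is still valid.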

This is the kind of system you get out-of-the-box with a framework like Django or a library like express-session. Some of its features are:

  • Stateless requests with “state-ful” data sent in an HTTP header. Any HTTP client (web browsers, mobile apps, curl, etc.) can use them.
  • Backed by your data store of choice
  • Cryptographically signed
  • You control session expiry/invalidation (even after the session is created)
  • Built-in protection against cookie theft through cross-site scripting (XSS) attacks (via the HttpOnly flag)
  • Built-in protection against insecure Internet connections (via the Secure flag)
  • In use for decades, widely tested, proven to be acceptably secure
  • Many quality, open-source implementations exist

For some reason, though, I keep seeing new projects reimplementing sessions with bizarre schemes of their own, most often based on JSON Web Tokens (JWTs) stored in localStorage. Like a session cookie, a JWT is a set of key-value data pairs sent to the server in an HTTP header. Like a session cookie, JWTs are cryptographically signed. And like a session cookie, JWTs can include a session ID that refers to a data store. The differences, though, are really important:

  • No built-in protection against cross-site scripting (XSS) attacks
  • No built-in protection against insecure Internet connections
  • Can be read (it is signed and encoded, but not encrypted) by anyone with access to the physical machine (think about shared environments, like computer labs)
  • Newer, less battle-tested technology
  • Poor track record with respect to security (in 2015, several popular JWT libraries were revealed to have critical security vulnerabilities that could be traced back to a design flaw in the JWT spec itself)
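If the "anyone can read it" point sounds abstract, here is a minimal sketch. It hand-builds an HS256-signed token from the standard library (the helper names, the claims, and the key are all made up for the example) and then reads the claims back without ever touching the key.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def make_hs256_token(claims: dict, key: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"

def read_claims_without_key(token: str) -> dict:
    """What anyone holding the token can do: decode the middle segment."""
    payload = token.split(".")[1]
    return json.loads(b64url_decode(payload))

# Hypothetical claims and key, purely for illustration.
token = make_hs256_token({"sub": "alice", "role": "admin"}, b"server-secret")
print(read_claims_without_key(token))
```

The signature stops people from changing the claims. It does nothing to stop them from reading the claims, and a token sitting in localStorage is available to any script on the page and to anyone who opens the browser's developer tools.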

These are just the problems for sessions that use JWTs with an actual data store backing them. Some JWT sessions I’ve seen just dump all the session data in the JWT itself. With no server-side record to delete, you can’t invalidate a token once it has been issued, so in practice you need to do one of two things:

  • Give JWTs very short expiration dates so that hijacked or stolen JWTs aren’t valid for very long (“very long” in this case is relative, you will find). Legitimate users will have to continually send “renewal” requests to your server in order to keep their sessions active.
  • Give JWTs long expiration dates (or none at all, making them valid forever). If a JWT is hijacked or stolen, the malicious user can use the JWT to their heart’s content.

Both of these choices are pretty terrible, in my opinion.
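To see why, here is a small sketch of a purely stateless token check, again assuming a hand-rolled HS256 token and a made-up key. The current time is passed in as a plain number so the behavior is easy to follow; none of this is a real library's API.

```python
import base64
import hashlib
import hmac
import json

KEY = b"server-secret"  # hypothetical signing key, for illustration only

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def _signature(signing_input: str) -> str:
    digest = hmac.new(KEY, signing_input.encode(), hashlib.sha256).digest()
    return b64url(digest)

def issue(user: str, now: int, lifetime: int) -> str:
    """Issue a token whose only 'session state' is the exp claim inside it."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({"sub": user, "exp": now + lifetime}).encode())
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_signature(signing_input)}"

def verify(token: str, now: int):
    """Return the claims if the signature is valid and the token is unexpired."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header, payload, signature = parts
    # Always check with HS256; never let the token's header pick the algorithm.
    expected = _signature(f"{header}.{payload}")
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    claims = json.loads(b64url_decode(payload))
    if now >= claims["exp"]:
        return None
    return claims

token = issue("alice", now=0, lifetime=900)  # valid for 15 minutes
print(verify(token, now=60))    # accepted, whoever is presenting it
print(verify(token, now=901))   # rejected only because time ran out
```

Notice what verify never does: consult a store. Whoever holds the token at second 60 is "alice" as far as the server can tell, and the only levers left are waiting for exp or rotating the key, which logs out everybody at once.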

If you are implementing sessions in your app, please, I beg you, use cookies with an established, trusted library to handle the session logic for you. If you’re building a GraphQL API server, you could use express-session and Passport with Express, or Phoenix with Absinthe, or Django with Graphene, or… you get the idea. You have so many great options!

Cookies will work in your SPA React app, I promise. It will be OK.

Please Stop Reinventing Single Sign-On (SSO)

Let’s be real: if your thoughts about single sign-on don’t have an RFC number associated with them, they’re probably really bad. Lots and lots of people have bad thoughts about SSO and have written thousands and thousands of lines of bad SSO code. The Stack Exchange sites are full of questions that amount to “Is my bad idea for SSO secure?” and the respondents saying “NO!” (or, sadly, “Yes, I also have bad thoughts about SSO.”). It isn’t good.

What’s the deal with SSO anyway?

Recall that cookies are associated with a particular domain (www.example.com) or “family” of domains (.example.com => example.com, www.example.com, foo.example.com, etc.). If our session ID is stored in a cookie, how can we share it across multiple domains? And how can those domains translate the session ID into session data without having access to the session backend?

In general, most single sign-on schemes look something like this:

  • There’s a central auth provider (server) that all the consumers (apps) trust
  • If you aren’t logged in, you get prompted to authenticate with the provider
  • You log into the provider and a session cookie is created on its domain
  • The provider securely exchanges your session information with the consumer
  • The next time you log into a consumer, the auth provider uses its cookie to identify you and does not prompt you with a login form

On the surface this doesn’t sound terribly daunting. And yet…

  • How do you prevent replay attacks?
  • How do you protect the confidentiality of the session data?
  • What happens if the account is suspended, deleted, or otherwise no longer allowed to authenticate with the provider at the same time that the user has active sessions with consumers?
  • What happens if the consumer is using AJAX or fetch? What happens if the consumer is a “traditional” server-rendered app? What happens if the consumer is itself an API server?
  • What happens if different consumers are interested in different subsets of the user’s session data?
  • What happens when a user logs out? Are they logged out of everything or just that app? How does that work?
  • etc., etc.
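Just to show how much machinery the very first question drags in, here is a toy sketch of a provider handing out login tickets that are single-use, short-lived, and bound to one registered consumer. Every name in it (the consumer allowlist, the 30-second lifetime, the function names) is an assumption of mine for illustration. It is not how any particular SSO protocol works, and it is definitely not something to deploy.

```python
import secrets

# Hypothetical allowlist of consumers the provider is willing to talk to.
REGISTERED_CONSUMERS = {"https://app.example.com"}
TICKET_LIFETIME = 30  # seconds; an arbitrary value chosen for this sketch

_tickets = {}  # ticket -> (user, consumer, expires_at)

def issue_ticket(user: str, consumer: str, now: int) -> str:
    """Provider side: hand the browser a ticket to carry back to the consumer."""
    if consumer not in REGISTERED_CONSUMERS:
        raise PermissionError("unknown consumer")
    ticket = secrets.token_urlsafe(32)
    _tickets[ticket] = (user, consumer, now + TICKET_LIFETIME)
    return ticket

def redeem_ticket(ticket: str, consumer: str, now: int):
    """Provider side: the consumer exchanges the ticket for the user's identity."""
    # pop() removes the ticket on first use, so a replayed ticket finds nothing.
    record = _tickets.pop(ticket, None)
    if record is None:
        return None
    user, issued_for, expires_at = record
    if consumer != issued_for or now >= expires_at:
        return None
    return user

ticket = issue_ticket("alice", "https://app.example.com", now=0)
print(redeem_ticket(ticket, "https://app.example.com", now=5))  # first use
print(redeem_ticket(ticket, "https://app.example.com", now=6))  # replay
```

That is forty-odd lines to partially answer one bullet point, and it still says nothing about confidentiality, suspended accounts, logout, or how the consumer proves to the provider that it is who it claims to be.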

Like many things in software development, single sign-on is simple as long as you don’t think about it too hard. The harder you think about it, the more complicated it becomes. The RFC for OAuth 2.0 is 75 pages long. Have you given your SSO proposal 75 pages worth of thought?

I’ve seen some really naive implementations of single sign-on in the wild that do things like pass around session IDs and tokens in GET parameters, allow literally any consumer to authenticate with the provider, and let tokens be reused indefinitely. This is bad. And it’s unnecessary, because (just like with sessions) you have so many good options for out-of-the-box solutions.

You might use some combination of:

  • A hosted authentication provider like Auth0, Okta, or Amazon Cognito
  • OpenAM, LDAP, or other “enterprise-y” identity providers
  • CAS, Shibboleth, or SAML
  • OAuth and/or OpenID, either self-hosted or through third-party providers

These providers all have lots of nice libraries that implement their ideas for you. You will have to read some documentation along the way, which I hope will convince you that implementing a secure SSO system is too time-consuming and tedious to attempt on your own.

Please Stop Posting Your Stuff to GitHub Without Disclaimers

If you’ve gotten this far and think I’m a big, dumb jerk, or that your artisanal authentication system is an exception to the rule, then please at least entertain this last request: please don’t post it to GitHub without disclaimers.

A lot of people in web development are self-taught, and often will get started by tinkering with (or straight up copying) existing projects. Whether you think this is good or bad, it’s the reality. So when you push your half-baked, unvetted authentication system to a public repository, some impressionable novice developer may look at your code and think that’s The Way Things Are Done. Now your security vulnerability shows up in whatever bad auth system that developer goes on to write, and on and on. It’s like a disease. It spreads.

Yes, this is true for other kinds of security flaws in other kinds of code published on GitHub. That’s a rant for another day.