Do you really need a database for that Ory stack?

S3, Azure Blob, Google Storage and Minio are all K/V storage at the core. Yes, of course, they provide much, much more functionality beyond that, but underneath it all, object storage systems are K/V stores.

Each one provides an HTTP interface. Data is put into object storage with HTTP PUT or POST requests and fetched with GET. Checking whether an object exists under a key is a HEAD request. Deleting is a single HTTP DELETE away.
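To make the mapping concrete, here is a minimal sketch of the four verbs as K/V operations. The `ObjectStore` class is a hypothetical in-memory stand-in, not any vendor's API, and the status codes it returns are simplified assumptions about what a real service would answer.

```python
from http import HTTPStatus


class ObjectStore:
    """Hypothetical in-memory stand-in for a bucket, for illustration only."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body):
        # HTTP PUT: store the value under the key, overwriting any previous one.
        self._objects[key] = body
        return HTTPStatus.OK

    def get(self, key):
        # HTTP GET: return the status and the stored value, if there is one.
        if key not in self._objects:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.OK, self._objects[key]

    def head(self, key):
        # HTTP HEAD: existence check only, no body is transferred.
        return HTTPStatus.OK if key in self._objects else HTTPStatus.NOT_FOUND

    def delete(self, key):
        # HTTP DELETE: remove the key; a missing key is not treated as an error.
        self._objects.pop(key, None)
        return HTTPStatus.NO_CONTENT


store = ObjectStore()
store.put("tokens/abc", b"payload")
print(int(store.head("tokens/abc")))  # 200
store.delete("tokens/abc")
print(int(store.head("tokens/abc")))  # 404
```

Swap the dictionary for a bucket and the method calls for HTTP requests, and the shape stays the same.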

It is somewhat interesting that we usually don’t think of S3 or GCS as key/value systems. They’re hidden behind an HTTP layer, and our perception tells us they’re some sort of directories and files. They’re certainly not file systems.

If we start treating object storage as K/V, we quickly find a resemblance to other K/V systems: Redis or Cassandra, for example. Or a PostgreSQL table with a primary key.

Like S3 or Azure Blob, these dedicated systems provide functionality beyond just K/V. But if the item to be fetched is identifiable by a well-known ID and the fetch is one hop away, with no need to filter or join over anything else, that’s definitely K/V-like!

§Ory Hydra as an example

The lifecycle of an OAuth token is not very complex. Once a token is generated, it lives for some period of time. Maybe 10 minutes, maybe a month. It is handed over to an external application, which holds on to it until a new token is needed. A token serves a couple of major purposes:

it assures the holder that the data in the token comes from the issuer; this fact can be proven by validating the token’s signature,
it can be sent to another application, which can in turn validate that the middleman has not tampered with the original data.

There are two major types of tokens:

JWT tokens: these contain readable content; anybody can decode them and read the information,
opaque tokens: these are not meant to be read, only the issuer understands what’s inside.
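The "anybody can decode them" part is easy to demonstrate. Below is a small sketch that reads the payload of a JWT using nothing but the standard library. The sample token is hypothetical and carries a fake signature; note that decoding is not the same as validating, and this code does not check the signature at all.

```python
import base64
import json


def b64url(data: bytes) -> str:
    # base64url without padding, the encoding used for JWT segments.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    _header, payload_b64, _signature = token.split(".")
    # Restore the padding that was stripped during encoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical sample token, built only for this demonstration.
sample = ".".join([
    b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode()),
    b64url(json.dumps({"sub": "user-1", "exp": 1700000000}).encode()),
    b64url(b"not-a-real-signature"),
])

claims = decode_jwt_payload(sample)
print(claims["sub"], claims["exp"])  # user-1 1700000000
```

An opaque token offers no such shortcut: without the issuer, the string means nothing.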

Either type is issued in response to some event requiring assurance of a successful action on the issuer side. Most often, that’s authentication or authorization. Tokens are immutable. Once issued, they do not change.

Most often, the following actions are performed on the tokens:

signature validation: token does not need to be sent to the issuer, the signature can be validated on the consumer side by loading public keys from issuer’s JWKS
verification: is the token active?
- usually validated by checking the expiry timestamp and verifying that the server has not invalidated the token yet
invalidation: the token should not be recognized anymore
refreshing a token: in return for a valid refresh token, a new access token is issued

Even if the token is forwarded to a third party and used for some fancy application-specific logic, the third party will essentially do one of the four actions listed above.

§the big question

So the big question this write-up asks: is a database system even needed to store tokens?

Signature validation does not require a lookup. Even the issuer does not need to look anything up. Verification, in the case of a database system, usually means checking whether the row for the token exists and, optionally, whether there is an invalidation row in another table. Invalidation usually means removing the token from the table or creating the invalidation row in another table.

These operations are the same for a refresh token, except that a new token is generated if the refresh token is still valid.

This looks awfully close to K/V.

So what’s the point of having a database for that at all? Why not use globally distributed object storage instead?

§mental exercise

Testing for an expired or invalidated token can be done in two complementary ways:

issue a HEAD request against the key with the token object, 404 Not Found means the token is not valid,
optionally, issue a second HEAD request to test if there is an invalidation object for the respective token.

Maybe these operations can be reversed depending on how probable the invalidation of a token is. Additionally, all of the object storage systems provide object expiration, so the cleanup of old tokens comes for free.

Invalidation fits the same pattern: it is either a DELETE of the token object or a PUT of an invalidation object for that token.
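The exercise can be sketched in a few lines. Everything here is an assumption made for illustration: the `FakeBucket` class stands in for a real bucket and only models key existence, and the `tokens/` and `revoked/` key prefixes are a made-up layout. Expiry through the storage system's own object expiration is not modeled.

```python
class FakeBucket:
    """Hypothetical in-memory bucket that only tracks which keys exist."""

    def __init__(self):
        self._keys = set()

    def put(self, key):
        self._keys.add(key)

    def head(self, key):
        # Mimics the HTTP status of a HEAD request against the key.
        return 200 if key in self._keys else 404


TOKEN_PREFIX = "tokens/"     # assumed key layout, not a convention
REVOKED_PREFIX = "revoked/"  # assumed key layout, not a convention


def issue(bucket, token_id):
    # Issuing a token is a single PUT.
    bucket.put(TOKEN_PREFIX + token_id)


def invalidate(bucket, token_id):
    # Invalidation as a PUT of a marker object for the token.
    bucket.put(REVOKED_PREFIX + token_id)


def is_active(bucket, token_id):
    # First HEAD: a missing token object means expired or never issued.
    if bucket.head(TOKEN_PREFIX + token_id) == 404:
        return False
    # Second HEAD: a missing invalidation marker means still active.
    return bucket.head(REVOKED_PREFIX + token_id) == 404


bucket = FakeBucket()
issue(bucket, "abc")
print(is_active(bucket, "abc"))  # True
invalidate(bucket, "abc")
print(is_active(bucket, "abc"))  # False
```

Verification costs at most two HEAD requests, and only one when the token object is already gone.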

§some numbers

Let’s go through some numbers based on a semi-real example. A client with roughly 1000 accounts, each account logging in once a week or so. That’s about 4000 tokens a month plus, say, 600K token verifications a month. If we wanted to run the cheapest version of an HA database in the public cloud, the options are (among others, of course):

Cloud SQL: $0.0966 per hour per vCPU / GB RAM, that’s $69.552 / month based on 30 days
AWS RDS Multi-zone: $0.036 per hour for 2 vCPU with 1GB RAM burstable db.t3.micro instances: $25.92 / month
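The monthly figures follow from treating the listed rates as hourly prices over a 30-day month. That reading is an assumption, but it is the one that reproduces both totals: $69.552 for Cloud SQL and $25.92 for RDS, the latter corresponding to $0.036 per hour.

```python
from decimal import Decimal

HOURS_PER_MONTH = 24 * 30  # 30-day month, as in the text

# Hourly rates, assumed to be what the monthly totals are based on.
HOURLY_RATES = {
    "Cloud SQL": Decimal("0.0966"),
    "AWS RDS Multi-zone db.t3.micro": Decimal("0.036"),
}

for name, rate in HOURLY_RATES.items():
    print(f"{name}: ${rate * HOURS_PER_MONTH} / month")
# Cloud SQL: $69.5520 / month
# AWS RDS Multi-zone db.t3.micro: $25.920 / month
```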

It’s definitely not easy to run an HA database system with reasonable performance for less than $25 / month, even if we consider the likes of Hetzner.

Neither of these could be recommended for a production, public-facing system. The prices above are compute only. They do not include:

data transfer cost,
maintenance cost,
backup / restore,
storage and snapshots.

My point is, neither of these prices reflects reality and the actual cost will definitely be higher.

Let’s compare this with some back-of-the-envelope calculations for object storage operations.

AWS S3 storage cost, for a few thousand tokens, is going to be negligible. We are talking about data in megabytes, not gigabytes. Azure and GCS would be the same. Operations are by far the most expensive part of object storage. S3 charges $0.005 / 1000 PUT requests and $0.004 / 1000 GET and other (including HEAD) requests.

GCS divides operations into 2 classes. storage.object.put belongs to class A, priced at $0.05 / 10000 operations. storage.object.get is a class B operation, priced at $0.004 / 10000 operations, effectively 10 times cheaper than S3.

In terms of number of operations:

issuing a token is 1 PUT request,
token verification is at most 2 HEAD requests
token refresh is at most 2 HEAD requests and 1 PUT request.

That’s respectively:

S3 Standard:
- $0.000005 for issuing a token
- $0.000008 for a token verification worst case scenario and $0.000004 for best case
- $0.000008+$0.000005 per token refresh worst case scenario and $0.000004+$0.000005 for best case
GCS:
- $0.000005 for issuing a token (price the same as S3)
- $0.0000008 for a token verification worst case scenario and $0.0000004 for best case (10 times cheaper than S3)
- $0.0000008+$0.000005 per token refresh worst case scenario and $0.0000004+$0.000005 for best case (PUT has the same price as S3 but GET is 10 times cheaper)

If there is no need to support invalidation and expiry can be checked at the edge, there is no need to touch storage at all. If the invalidation check is opportunistic, not every token will be checked for invalidation. However, for the most pessimistic usage, this client could run their token storage for:

S3 Standard: $0.000005 * 4000 tokens + $0.000008 * 600K verifications = $4.82 / month
GCS: $0.000005 * 4000 tokens + $0.0000008 * 600K verifications = $0.50 / month
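The worst-case totals can be reproduced with a short calculation. The prices are the ones quoted above, not values fetched from any pricing API, and the sketch assumes every verification costs the full two HEAD requests.

```python
from decimal import Decimal

TOKENS_PER_MONTH = 4_000
VERIFICATIONS_PER_MONTH = 600_000

# Per-request prices derived from the per-1000 and per-10000 figures in the text.
PRICES = {
    "S3 Standard": {
        "put": Decimal("0.005") / 1000,
        "head": Decimal("0.004") / 1000,
    },
    "GCS": {
        "put": Decimal("0.05") / 10000,
        "head": Decimal("0.004") / 10000,
    },
}


def monthly_cost(price, heads_per_verification=2):
    # One PUT per issued token, up to two HEAD requests per verification.
    issuing = price["put"] * TOKENS_PER_MONTH
    verifying = price["head"] * heads_per_verification * VERIFICATIONS_PER_MONTH
    return issuing + verifying


for name, price in PRICES.items():
    print(f"{name}: ${monthly_cost(price):.2f} / month")
# S3 Standard: $4.82 / month
# GCS: $0.50 / month
```

Passing `heads_per_verification=1` gives the best case, where the first HEAD request already settles the answer.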

§what about compute

Aha, I’m glad this question came up. Turns out that running Ory Hydra on AWS Lambda using a Docker container results in an end to end request latency of ~100ms / request and it fits easily within the minimum 128MB execution runtime. That’s a mere $0.000000021 per request. The ~600K requests a month would cost that client a whopping $0.0126.

A cold start takes around 1.5 seconds, but with 600K verifications there is a request roughly every ~4 seconds, so cold starts would not be frequent. The cold start time can be minimized by trimming Hydra down, for example if there were no need to initialize the ORM.
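Both compute numbers are simple arithmetic. The sketch below takes the per-request price quoted above at face value and assumes the requests are spread evenly across a 30-day month, which real traffic will not be.

```python
from decimal import Decimal

REQUESTS_PER_MONTH = 600_000
PRICE_PER_REQUEST = Decimal("0.000000021")  # per-request figure quoted in the text
SECONDS_PER_MONTH = 30 * 24 * 60 * 60       # 30-day month

compute_cost = PRICE_PER_REQUEST * REQUESTS_PER_MONTH
seconds_between_requests = SECONDS_PER_MONTH / REQUESTS_PER_MONTH

print(f"${compute_cost:.4f} / month")                    # $0.0126 / month
print(f"{seconds_between_requests:.2f} s between requests")  # 4.32 s between requests
```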

It’s fair to say that maybe $30 or $70 / month is not a lot of money. But consider that there are places where that might not be the case. Or maybe it’s a side project in an evaluation phase and that $70 / month becomes $840 / year. Finally, simply compare the number of requests per month and imagine that your $70 / month pays for a mostly idle resource. Why pay for a mostly idle resource at all?

There’s definitely a clear cutoff point where a database becomes a reasonable choice worth paying for.

Mmm, how far away is it, though.

§no infrastructure

The side effect here is: it is possible to run all this authentication and authorization stuff without any permanent compute resources. The same principle applies to Keto and Oathkeeper. All the decision-related data can easily be put in a container, if it needs to be.

There is no need for a database in any of these systems. Kratos is the only difficult case.