Post

Sep 3, 2021
Thick databases

It doesn’t matter how much you learn, there’s still a tremendous amount you don’t know: significant blind spots. I’ve been coding on and off since my high school days some 20+ years ago. After working for a while, I took time off to pursue a PhD in Philosophy, and now I’m back to programming full time. All of this is to say that while I’ve been programming for a bit over two decades, it hasn’t been full time for all of that period. I’ve learned a lot, but I still marvel at how much I just don’t know.

In this instance, I’m now exploring the idea that I’ve been giving databases too little credit, at least for the kinds of applications we write at Episub – applications that see daily full-time use by employees, but that won’t face the kinds of scaling challenges that come with thousands or millions of daily active users. Perhaps the ideas here are good even for very high traffic applications, but I am not considering them from that angle. Up until now, I’ve held the view that you should keep business logic out of the database. I didn’t realise I held this view overtly, but it came out in other ways. Notably, I’d seen solutions like the original Prisma (which I’m still not inclined to use, but for other reasons), Hasura, and PostGraphile – automatic APIs built on top of databases. Automatic anything sounds great, but I had concerns – though when I say concerns, what I really thought was that these were deal-breakers that made such automatic APIs useless for anything other than toy projects. Here are some of the “deal-breaking” concerns I had.

Objection 1: Database and API model often differ

For the projects I work on, very often the way we structure the database is not the way we want to present it to the end user. This can happen for a variety of reasons. Sometimes, to have a nicely normalised database, we model the data in a way that doesn’t make sense for the end user.

Let’s use an example for our discussion. Suppose that we have a ‘person’ table with a separate ‘address’ table. We did it this way because we’re going to re-use the concept of an address in other places – attached to a ‘job’ table, or an ‘organisation’ table, etc. We don’t anticipate that a person will have more than one address (e.g., no separate shipping and billing addresses), but maybe one day we’ll need that. So, to simplify the API for our users, we decide we want a ‘person’ endpoint that shows the person details combined with their address:

{
    person_id
    name
    address_1
    address_2
    state
    ...
}
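For concreteness, the underlying tables behind that flattened shape might look something like this (the table and column definitions here are illustrative, not taken from a real Episub schema):

```sql
-- A shared address table, referenced from person (and later job, organisation, etc.)
CREATE TABLE address (
    address_id bigserial PRIMARY KEY,
    address_1  text NOT NULL,
    address_2  text,
    state      text NOT NULL
);

CREATE TABLE person (
    person_id  bigserial PRIMARY KEY,
    name       text NOT NULL,
    -- one address per person for now, but easy to generalise later
    address_id bigint NOT NULL REFERENCES address
);
```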

So here’s the problem as I saw it for an automatic API: it’s going to give me an API structure that requires the consumer to fetch two objects. Maybe the API is smart enough to still make one call to the database, but either way you’re exposing the underlying structure of the database in ways that don’t make sense for this API. E.g., you have a result like:

{
    person_id
    name
    address {
        address_1
        address_2
        state
        ...
    }
}

Or do two separate calls fetching first the person…

{
    person_id
    name
    address_id
}

and then their linked address:

{
    address_id
    address_1
    address_2
    state
    ...
}

We wanted to use the flattened result in our API, but because the API presents the data in the same way as it’s modelled in the database, we are limited. How often does a well-designed API match the structure of the database? In my experience, despite significant overlap between the API model and the database model, they frequently diverge. I don’t want my API to be locked into matching my database model.

Objection 2: Requires business logic in the database

A lesson many of us have been taught is that we should separate the business logic and the data. Using an automatic API requires us to put business logic in the database. Suppose that we want to allow orders to be cancelled, but only if none of the items have yet shipped, and if cancelled, create a notification entry in another table to alert staff to the cancellation.

We have a few options.

Option 1: Since the API doesn’t automatically implement such rules, we tell consumers of the API to check for lines that have shipped, and not to cancel if any have. If you do cancel, make sure you call another endpoint to create the notification entry. Obviously a terrible approach, since we’re requiring consumers to enforce our business rules.

Option 2: Create one or more triggers on the table that prohibit updating the ‘cancelled’ field (or a trigger on delete) when a shipped item exists. The same trigger, or another, creates an entry in the notification table.

That works, but it’s still not a neat interface for the user, since they’re just doing a CRUD style update to change the ‘cancelled’ field. Since it’s just toggling a field (or deleting a record), it’s not as obvious to the API consumer that it has side effects.

Option 3: Create a function called ‘cancel_order’ that performs all the steps required, exposed via the automatically generated API. We prevent our users from being able to update the ‘cancelled’ field directly.

That works, but we’ve now very clearly put our business logic into the database, something (so I thought before) to be avoided.

Objection 3: Some API tasks won’t be possible in a database

There are things we might want an API to do that a database cannot handle. Here are a couple of examples:

  • Endpoints that don’t involve working with our data at all. E.g., returning the latest weather, or stock prices, or whatever else, from independent APIs. We simply want to unpack the request, send our own request off to the appropriate service (whether our own or third party), and then return the reply.
  • Endpoints that trigger tasks (such as sending emails) or slow-to-calculate tasks (such as building a PDF). The database doesn’t know how to send an email or create a PDF, and moreover, tasks like these can sometimes be slow – even if we could do them in the database, we wouldn’t want to slow it down from doing the important work that only it can do.

An automated API is bound to turn otherwise straightforward features into unwieldy messes.

Where to next

If my objections were on point, then automatic API solutions should be unusable or burdensome nightmares for any serious project. Clearly, however, people do use them, and such popularity is evidence that my objections must not be as serious as I supposed. Having now spent some time looking into them, that has turned out to be the case – I’m still undecided about which circumstances make a thick database the best option, but I’m certainly more open to leaning more heavily on the database. The catalyst for my change in attitude towards the maxim “no business logic in the database” came from looking at PostgREST and asking myself, “if it’s such a bad idea to do things this way, what do people find appealing about solutions like PostgREST?” As I often do, I read Hacker News threads on a technology to see what people with (often deep) experience have to say. There, I saw many of the same concerns I held myself repeated by others, as well as responses I hadn’t previously considered.

I’m not a fan of ORMs, and have certainly read the often recommended Object-Relational Mapping is the Vietnam of Computer Science. It’s really great to have some helpers for simple things, but for anything complex I just want to write the SQL that does, and returns, exactly what I want. Tied to this is the fact that I’m happy to give up the ability to easily swap my database (say, PostgreSQL) for another (say, MySQL/MariaDB). I don’t want an ORM to abstract away the database. Rather, I want to enjoy the advanced features it brings, particularly with a database like PostgreSQL. Given that I’ve already given up on treating my database as a commodity, I’m more open to diving in and exploring what it can do, and that includes having the database be ‘thick’, implementing some key parts of the business logic there instead of in code, on the understanding that I won’t be replacing the database. Sure, if I ever found myself with a project that had massive scaling requirements, I might split parts of it off into a database more suited to that role. But for the projects I work on, and any that I can foresee, I’m happy to lock the core into something like PostgreSQL.

And so here I am, exploring PostgREST as a way to generate an API off the back of my database. Here are a few thoughts on thick databases. I’ll start by explaining why each of the objections above no longer seems so serious, followed by some further remarks on other aspects of this approach.

Objections answered

Resolution 1: Database and API model often differ

It is true that database and API models often differ. In my experience it’s very rare for them not to differ in at least some places. Of course, there is often significant overlap, but very rarely is that overlap complete. An automated API should therefore support API models that differ from the shape in which the underlying data is stored.

Speaking with respect to PostgreSQL+PostgREST, the solution is to have a separate schema. Inside that schema, you configure all the views to look exactly like you want them to appear to a consumer of the API. For example, we could create a ‘persons’ view that combines the ‘person’ record with its associated ‘address’ record. Then, anyone who pulls data from our API will get the person with their address details just as we wanted in the original example:

{
    person_id
    name
    address_1
    address_2
    state
    ...
}

We can then create triggers on that view to define what is to be done for insert and update actions. Again, these triggers can enforce business rules that go beyond just “insert and update details in the appropriate ‘person’ and ‘address’ tables”. This could be things such as adding a new entry in some ‘person_entity’ joining table, or creating an alert, or triggering an email (see below for a discussion on handling emails).
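Sketching that in SQL (the ‘app’ and ‘api’ schema names and the column list are assumptions for illustration), the view joins the two tables, and an INSTEAD OF trigger routes updates on the view back to the base tables:

```sql
CREATE VIEW api.persons AS
SELECT p.person_id, p.name,
       a.address_1, a.address_2, a.state
  FROM app.person p
  JOIN app.address a ON a.address_id = p.address_id;

-- Route updates against the view back to the underlying tables
CREATE FUNCTION api.persons_update() RETURNS trigger AS $$
BEGIN
    UPDATE app.person
       SET name = NEW.name
     WHERE person_id = OLD.person_id;

    UPDATE app.address a
       SET address_1 = NEW.address_1,
           address_2 = NEW.address_2,
           state     = NEW.state
      FROM app.person p
     WHERE p.person_id = OLD.person_id
       AND a.address_id = p.address_id;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER persons_update
    INSTEAD OF UPDATE ON api.persons
    FOR EACH ROW EXECUTE FUNCTION api.persons_update();
```

The trigger function is also the natural place for the extra business rules mentioned above, such as inserting an alert row alongside the update.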

We might also restrict access to fields like ‘cancelled’ on an order, or prohibit deleting an order. Through an api schema, we can prevent the ‘cancelled’ field from being updated directly through the view. Instead, we create a ‘cancel_order’ function that performs this action. That function implements the business logic, such as ensuring that no line items have shipped.
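A minimal sketch of such a function, assuming hypothetical app.order, app.order_line, and app.notification tables:

```sql
CREATE FUNCTION api.cancel_order(p_order_id bigint) RETURNS void AS $$
BEGIN
    -- business rule: refuse to cancel once any line item has shipped
    IF EXISTS (SELECT 1
                 FROM app.order_line
                WHERE order_id = p_order_id
                  AND shipped_at IS NOT NULL) THEN
        RAISE EXCEPTION 'order % has shipped items and cannot be cancelled',
            p_order_id;
    END IF;

    UPDATE app."order" SET cancelled = true WHERE order_id = p_order_id;

    -- side effect: alert staff to the cancellation
    INSERT INTO app.notification (kind, payload)
    VALUES ('order_cancelled', jsonb_build_object('order_id', p_order_id));
END;
$$ LANGUAGE plpgsql;
```

PostgREST exposes functions in the api schema as RPC endpoints (POST /rpc/cancel_order), so consumers call the operation by name rather than toggling a field.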

With this approach, the API schema can differ as dramatically as we like from the original data. We are free to combine underlying tables into one view, or split underlying tables into multiple views, or mix and match them as we please.

Of course, this is not so automatic. We probably shouldn’t just point an automated API like PostgREST at our underlying tables and try to restrict things from there. Instead, we control all access through the ‘api’ schema, so that we can define the shape of the API and more carefully control the permissions.

Resolution 2: Requires business logic in the database

Should we really keep business logic out of the database? Let’s consider another claim, one that might at first glance look the same:

Data and business logic should be kept separate

Suppose that this is true (and we’ll discuss this a bit more soon). Does putting business logic in the database entail that we are mixing data and business logic? The answer is ‘no’. Suppose we keep data in one schema, and the API’s structure in another, as proposed above. We have our application schema which stores all the data and, allegedly, no business logic. We have our api schema which does not hold any data, but contains our business logic. All access to the underlying application schema is through the api schema, so that business logic can be enforced at the database level, without existing at the data level.
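At the level of database roles, this separation can be enforced with ordinary grants (the role and schema names here are illustrative):

```sql
-- app holds the tables (the data); api holds views and functions (the logic)
CREATE ROLE web_user NOLOGIN;

-- the API role sees only the api schema
GRANT USAGE ON SCHEMA api TO web_user;
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA api TO web_user;

-- no direct access to the underlying data schema
REVOKE ALL ON SCHEMA app FROM web_user;
```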

In this way, our data and business logic are separate – they’re both in the database, but they are not mixed together. We have followed the rule above, and it looks to me like it’s not so bad to keep business logic in the database, particularly when we can still take care to keep business logic out of the data.

But let’s ask the next question – should we keep business logic out of the data? Here the answer is, I think, a murkier ‘no’, for the simple reason that most of us who work with databases already put business logic in the database and consider it best practice.

For example, consider our ‘person’ table with its name column. How long should a name be? Many of us will enforce a maximum length for fields like this in the database, but this is clearly a business logic decision. Can a name be null? If we know the rule in this case, that a name should always be provided, we can enforce this in the database by requiring it to be not null, or even having a check constraint to ensure that it’s at least one character long. What about usernames – do we allow upper and lower case characters to be different usernames? That is business logic. Does every person have to belong to an organisation? That’s business logic, enforced through a not null foreign key on the person table. And so on.
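All of those rules are ordinary constraints. A sketch of a ‘person’ table with such business rules baked in (names and limits are illustrative):

```sql
CREATE TABLE person (
    person_id       bigserial PRIMARY KEY,
    -- business rule: every person belongs to an organisation
    organisation_id bigint NOT NULL REFERENCES organisation,
    -- business rule: a name is required and at least one character long
    name            varchar(100) NOT NULL CHECK (length(name) >= 1),
    username        text NOT NULL
);

-- business rule: usernames differing only in case are the same username
CREATE UNIQUE INDEX person_username_key ON person (lower(username));
```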

The data in our database arguably always includes business logic, just as a matter of fact about the way we software engineers tend to build systems, but in ways that we consider not just acceptable, but good. Suppose we want to require that a name be not just not null, but at least one character long, so we enforce that in the database. This gives us the benefit that when our application code (say, a front-end, or another service consuming our API) fails to set the name, we get an error back. Having received the error, we know that we need to improve our application to ensure that it always sets that value according to rules enforced in the database. Being able to enforce that rule in the database makes it much more likely we’ll catch errors and keep our data in a valid state.

Having said all this, should we just ignore the maxim and now put business logic in the database with reckless abandon? My intuition is that we should not, but that we’re going to have to be less rigid and evaluate our situation carefully. In short, we’ll have to take to heart a different and also popular maxim when considering what to do:

It depends on your situation

Some business logic can and should go into the database, but some should stay outside. In the case of a project using something like PostgREST, that means storing the bulk of the logic in our api schema, or some similar ‘gateway’. Doing so has another advantage, which I’ll outline below under ‘Rebuilding the API’.

Why have business logic in the database?

Before moving on, I want to state one significant advantage of having the business logic inside the database, rather than inside the application – where by ‘application’ I mean the code/services we write that sit outside the database. If our applications access the database solely through the api schema, then our code (outside of the database) becomes much more replaceable. We might write a service in Go, and then realise we want to write a specific service in Python to take advantage of a particular library. Both applications need the ability to cancel orders. If our business logic were all in our Go libraries, then we’d need to either re-implement it in Python, or find a way for the Python service to work through a Go service (perhaps calling a Go-based API, such as a gRPC service, to invoke the ‘cancel_order’ endpoint). On the other hand, if all the business logic is in the database, the Python service can simply work with its own SQL libraries, pulling data from the api schema and calling the functions we’ve created on it to perform certain actions.

We have moved our business logic from the application to the database, and as a result we’re much freer to change between programming languages, using whichever works best. But now we’ve dived into the database with both feet, and swapping PostgreSQL out for, say, SQL Server, will be so much more difficult.

Which is the better option? To be locked more into a language, but freer to move our database, or locked into a database but freer to move our code? Speaking from my experience, with the projects I tend to be involved with, changing databases for an application virtually never happens, but we do with enough regularity find ourselves using different languages for different tasks, utilising the same underlying database. Just as we like to take advantage of the various features of a language, and the libraries available in its wider ecosystem, so would I like to be able to take advantage of the various features of a particular database. Instead of trying to abstract away the database through an ORM, as though it’s a fungible commodity to be replaced with some regularity, we lock ourselves into our choice and take advantage of its full capabilities.

Apart from this, there may also be performance benefits, since we can perform all the queries we need in the database at once. In many cases, applications make multiple trips to the database for a single task. Sometimes this can be improved significantly by improving the application code, but in other cases it will simply be faster to do it all on the database side, and less work in total for the database as well.

Resolution 3: Some API tasks won’t be possible in a database

Hopefully now you can see how we might have an API whose model differs from that of the database, control access carefully, and still keep our data separated enough from our business logic. How about more complicated features, like sending emails or building PDFs? Or endpoints that don’t even need to touch the database, but instead, say, call another service and return its result?

A database like PostgreSQL cannot send emails or build PDFs out of the box. But we do know how to do these things in other languages. PostgreSQL does allow you to write scripts in languages that do support things like sending emails or creating PDFs, but I have not explored the benefits and limitations of doing so. My suspicion is that doing this directly in the database is not the best choice (more discussion in How to send mail from database? at depesz.com).

A good option for situations like this is to set up your own message or work queue in the database, and have external services – responsible for tasks like sending emails or building PDFs – connect to the database. These services regularly pull tasks out of your message queue table and process them. PostgreSQL introduced SKIP LOCKED in 9.5, which makes this even easier in situations where multiple instances of the same service are trying to pull tasks from the same message queue. You can use SKIP LOCKED to make sure that they don’t pull the same task, and that if a service dies or restarts then the task becomes available again. For an example of this, check out What is SKIP LOCKED for in PostgreSQL 9.5? by 2ndQuadrant, and Do You Really Need Redis? How to Get Away with Just PostgreSQL by Atomic Object. Where possible (although ‘it depends on your situation’), keep your infrastructure simple and boring.
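A minimal version of the pattern, assuming a hypothetical task table: each worker claims one available row at a time, and SKIP LOCKED keeps two workers from claiming the same row.

```sql
CREATE TABLE task (
    task_id    bigserial PRIMARY KEY,
    kind       text NOT NULL,              -- e.g. 'send_email', 'build_pdf'
    payload    jsonb NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Each worker runs this inside a transaction: claim the oldest available
-- task, skipping rows already locked by other workers. If the worker dies
-- before committing, its lock is released and the task becomes available
-- to other workers again.
DELETE FROM task
 WHERE task_id = (
     SELECT task_id
       FROM task
      ORDER BY task_id
      LIMIT 1
        FOR UPDATE SKIP LOCKED
 )
RETURNING task_id, kind, payload;
```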

Now we’ve opened up our capabilities to use any language we want to perform tasks. Moreover, it should not be so difficult to replace one of these worker services with another written in a new language should we want, even running them side by side.

What about cases where we need to bypass the automated API entirely? In such cases, we can use a proxy to redirect requests to a sidecar API, written in a language (or languages) of our choosing. This gives us more direct control in those cases where we need it. Many proxies can fill this role, such as OpenResty, which is used by subzero. When a request comes in, our proxy decides whether to forward it to our PostgREST API or to our sidecar API. There are some downsides to this, as these endpoints won’t appear in PostgREST’s description of available endpoints, but it will work (I discuss an untested way to get them to appear in PostgREST’s description below).

Let us return to the PDF example, where we want the user to submit a payload and get a generated PDF in response. Sometimes you may want to return a response immediately – e.g., an endpoint that submits a request for a PDF and receives the result in the reply. Without having implemented such a solution myself, I suspect that this will be another instance where we would use a proxy to bypass PostgREST. The sidecar API receives the request, creates a task in the message queue, and then monitors for and returns the result (PDF or failure). With PostgREST, we would need a function that can create an entry in the task view, commit that write, and then monitor for and return the result. At the time of this writing, it looks like this would need autonomous transactions, not (yet?) supported by PostgreSQL. However, you can approximate them using a db worker (see Autonomous transaction support in PostgreSQL by dalibo.com). I have not tested it, but this may work as a solution for creating a new task in the task queue and waiting to return the result, all as part of the one PostgREST API request.

Another option is to use something like OpenResty to create the PDF and return the response.

Or, we can design our API so that consumers submit the job in one action, then poll for and download the result via another once it’s ready.

Other remarks

The above answers the main parts of my objections, but beyond those initial objections I’ve had some additional thoughts while experimenting with this approach, which I now present.

A different abstraction

As developers, we try to walk a line between abstraction and raw access. We prefer abstraction when, among other reasons, it allows us to build significantly more with less effort (e.g., generating HTML surveys from a JSON specification). We dislike abstraction when it prevents us from using features of the underlying technology (e.g., utilising features of HTML that the HTML form generator we’re using hasn’t exposed, when we know how to write the feature in HTML but we don’t know how to get our generator to produce it).

Automated APIs fall into the same category – an abstraction that we hope can work for most situations, but that sometimes gets in the way when it doesn’t support something we need. The thing we want is an API with a particular structure, so we build a PostgreSQL schema with the goal of having our automated API produce the API we want. To put it another way: we want a particular API, but we are not building it directly – we are building a schema that will define the API. When we cannot write the schema in just the right way for the automated API to produce the structure we want, we may find ourselves frustrated with the tool.

In such cases, we can bypass an automated API such as PostgREST using a proxy like nginx, as described earlier, implementing whatever we can’t handle through the automated API. If we find ourselves doing that too frequently, we may start to regret using an automated API and wish to return to building such things more directly.

Suppose that we did decide that an automated API was more trouble than benefit, because we have too many special cases. Does this imply that we should go back to keeping the bulk of business logic in the application and out of the database? I think it does not. We can still benefit from the powerful features of the database, while writing our own custom API that utilises our API schema, rather than using an automated API. It gives us that same flexibility in which language we can use in exchange for inflexibility in which database we use. An automated API is nice when it works, but without it there may still be reasons to desire keeping business logic in the database.

Taking care when writing code – the developer experience

In my experimentations with this new (old?) way of building systems, there are ways in which our database business logic could become an unmanageable nightmare if we don’t take care. Here are some of my new-to-the-process thoughts.

Tracking changes and rebuilding schema

Managing migrations can be troublesome. As a simple setup, we might just write scripts that transform the database in the way required, so it’s easy to see what changes were made and apply them. For our projects, this is often enough.

However, when building something like an API, even an automated API, it would be much nicer to be able to track changes in the schema/functions via git, just like we do for application code.

When building an application, if you want to change one line in a function, you simply change the line and commit the code. The exact change is tracked, so it’s easy to see what was done.

Contrast this with a simple migration setup as described above. The process to modify a function involves copying the old function in its entirety, changing the single line, and committing the full new function in a new sequential file. The exact change is not tracked, and if you look at the git history you’ll see the full function with no insight into what’s changed.

When it comes to the api schema, there is an alternative. If we can avoid storing any data in the api schema, then we always have the option of recreating the whole schema from scratch. PostgreSQL supports transactional DDL, which means we can make most changes to the structure of a database inside a transaction. If there’s an error during the dropping and re-creating of the api schema, then the transaction will fail and we can roll back to how it was before the upgrade. If it succeeds, the old schema has been replaced with the new.

Now, the advantage here is that instead of having migrations describing how to transform the database from an earlier state to a later state, we simply have a script, or set of scripts, that describe how the API should be. If we want to change one line in a function, we change just that one line, and commit. Now in our commit history we can see exactly which line changed, just as we could for typical application code outside of the database. When we want to apply this change to production, we re-run all of the scripts inside a transaction, which will effectively replace the old schema with the new. Since we don’t store data in the API schema, we are not concerned with losing any.
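With psql, such a rebuild script might look like this (the file names are illustrative):

```sql
BEGIN;

DROP SCHEMA IF EXISTS api CASCADE;
CREATE SCHEMA api;

-- each script simply declares the views/functions as they should be
\i api/views.sql
\i api/functions.sql
\i api/grants.sql

COMMIT;
```

Because the whole rebuild runs in one transaction, an error in any script rolls everything back and the old schema stays in place.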

Without some way of tracking changes nicely, I could foresee that writing an api schema in a thick database could become unmanageable. I am interested to see if the same principles here could be applied to business logic that needs to go in the data layer. For example, we may have row level security policies and functions inside the main data database. Ideally, these are (re)created as part of the schema updates above, so we can easily track changes. Because these modify the data schema, much more care needs to be taken. One option might be to use DROP IF EXISTS syntax to drop any and all policies that were ever used, followed by their (re)creation, all inside a transaction. As a result, most changes to the schema can be tracked through git easily.

I am also aware that there are other tools, such as sqitch, available for helping manage database changes in a way that makes it easier to track the history of changes. I have not tried such tools myself, but may look into them in the future. It may be preferable to use tools like this if recreating the schema completely is too slow, or there are changes for which you can’t just simply recreate the schema entirely.

Writing tests

Tests are always important, and this is no exception. Ideally we have tests both as part of our SQL code, as well as integration tests that make requests against our API (whether custom built or automatically generated).

For running unit-style tests against the database, pgTAP is a popular choice. Such tests could be built into a CI/CD pipeline, and ideally run locally whenever the schema is rebuilt. When it comes to integration tests, any tool you like for testing APIs could be used.
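A minimal pgTAP session might look like this (it assumes the pgtap extension is installed, and uses object names from the earlier examples):

```sql
BEGIN;
SELECT plan(2);

-- the api schema exposes the view we expect
SELECT has_view('api', 'persons', 'api schema exposes a persons view');
-- the data layer enforces the business rule
SELECT col_not_null('app', 'person', 'name', 'person.name is required');

SELECT * FROM finish();
ROLLBACK;
```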

Extensions in the database

Among its many features, PostgreSQL supports the creation of extensions, which allow you to expand the functionality of the database. (Note: many cloud providers’ managed PostgreSQL instances limit which extensions can be used – using some extensions may require you to manage your own installation.)

Having started my own journey towards learning Rust, I am particularly interested in pgx, which allows you to write extensions and background workers in Rust. Writing extensions opens up a lot of possibilities.

Authentication

While you can roll your own, if you use an open source product like Keycloak, or a commercial one like Auth0, it should be fairly straightforward to set up. Keycloak can produce the JWTs that PostgREST can consume.

Without going into detail, I have successfully followed a guide on setting up Auth0 with Flutter, but using Keycloak instead of Auth0; this generates tokens at login that can be used directly with a PostgREST API.

In short, there should be a few options available for handling authentication.

Authorisation

PostgREST provides information on how to handle authorisation, including a tutorial called The Golden Key.

PostgreSQL provides support for row level security policies (performance improved in PostgreSQL 10). Using such policies, we can restrict what records a user has access to, and an automatic API like PostgREST will only display the records that user has a right to access. Of course, there is going to be a performance hit from using such policies. That cost needs to be paid somewhere, though – we need to determine whether a user is authorised for some action or not, and authorisation will involve fetching rows from the database whether we do the authorisation step itself in the database or in the application.
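As a sketch (PostgREST exposes JWT claims as PostgreSQL settings; the exact setting name varies between PostgREST versions, and the table and column names here are assumptions):

```sql
ALTER TABLE app."order" ENABLE ROW LEVEL SECURITY;

-- each user sees only their own orders; 'sub' is the JWT subject claim
CREATE POLICY orders_by_owner ON app."order"
    USING (owner_id::text = current_setting('request.jwt.claim.sub', true));
```

With such a policy in place, the automatic API simply never sees rows the requesting user isn’t entitled to.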

Re-using code

In my limited experience writing functions inside PostgreSQL so far, I’m hitting situations where two functions are remarkably similar but differ in a small way – e.g., in the columns they return. How to manage these in a maintainable way is still an area of exploration for me, but I’m considering two options:

  1. Dynamically create queries in SQL. Using features such as ‘EXECUTE format’, I may be able to achieve the result I want. See Executing Dynamic Commands.
  2. Compiled templates. As is popular with Go, I could create the generic templates along with the configuration required to generate the final SQL statements needed. Might be more powerful, but could be harder to work with.
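As a sketch of the first option, EXECUTE format can build the query text at run time, with quote_ident keeping identifiers safely quoted (the function and its arguments are invented for illustration):

```sql
-- Return any subset of columns from any table as JSON rows
CREATE FUNCTION pick_columns(p_table text, p_cols text[])
RETURNS SETOF json AS $$
BEGIN
    RETURN QUERY EXECUTE format(
        'SELECT row_to_json(t) FROM (SELECT %s FROM %I) t',
        (SELECT string_agg(quote_ident(c), ', ') FROM unnest(p_cols) AS c),
        p_table
    );
END;
$$ LANGUAGE plpgsql;

-- e.g. SELECT * FROM pick_columns('person', ARRAY['person_id', 'name']);
```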

Pure SQL functions used inside other queries can in some cases be inlined, so they could be used as an efficient way to encapsulate some repeated code.

Proxy and API spec

If we follow the path of using a proxy to implement endpoints that bypass PostgREST, we’ll find that the description of the API provided by PostgREST diverges further and further from the endpoints actually available. To work around this, perhaps we can implement mock functions in the API schema so that PostgREST includes them in its description.

Final remarks

The above is a deeper dive into the thoughts of someone who is new to this way of building systems. When I started looking into it, I was hoping to find a guide of ‘best practices’ for both keeping business logic in the database, as well as using an automated API backed by that database. If I continue to use this approach and find it suitable as a go-to approach, I may write such a document to help others identify good practice and pitfalls, once I work out what those good practices and pitfalls are! For now, hopefully this will serve as some kind of basic introduction for someone else looking into thick databases.