Recently, I’ve spent a lot of time discussing the evolution of SaaS company Tenancy Models with my colleague Benjamin. These conversations have revealed that my thinking on the subject is vague and needs focus and sharpening through writing.
This is the first in a series of posts where I will dive deep into the technical aspects of tenancy models: the tradeoffs, the factors that go into choosing an appropriate model, and how implementations evolve over time.
What are Tenancy Models?
There are two ideal models, single-tenant and multi-tenant, but most actual implementations are hybrids of the two.
In the computer realm, single-tenant systems are ones where the client is the only user of the servers, databases and other system tiers. Software is installed on the system and it runs for one client. Multi-tenant means that there are multiple clients on the servers and client data is mingled in the databases.
Pre-web software tended to be single-tenant because it ran on the client’s hardware. As software migrated online and the SaaS model took off, more complicated models became possible. Moving from Offline to Online to the Cloud was mostly an exercise in who owned the hardware, and how difficult it was to get more.
When the software ran on the client’s hardware, at the client’s site, the hardware was basically unchangeable. As things moved online, software became much easier to update, but hardware decisions were often made years in advance. With cloud services, more hardware is just a click away, allowing continuous evolution.
Main factors driving Technical Tenancy Decisions
The main factors driving tenancy decisions are complexity, security, scalability, and consistent performance.
Keeping client data mingled on the servers without exposing anything to the wrong client tends to make multi-tenant software more complex than single-tenant. The extra complexity translates to longer development cycles and higher developer costs.
Most SaaS software starts off with a single-tenant design by accident. It isn’t a case of tech debt or cutting corners; Version 1 of the software only needs to support a single client. Supporting 10 clients with 10 instances is usually easier than developing 1 instance that supports 10 clients. Being overwhelmed by interested clients is a good problem to have!
Eventually the complexity cost of running massive numbers of single instances outweighs development savings, and the model begins evolving towards a multi-tenant model.
The biggest driver of complexity is the second most pressing factor - security. Ensuring that data doesn’t leak between clients is difficult.
A setup where every table holds rows for all clients, separated only by a client_id column, looks simple but is extremely dangerous. Forgetting to include client_id in any SQL WHERE clause will result in a data leak.
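To make the danger concrete, here is a minimal sketch using Python's built-in sqlite3 module. The contacts table, its columns, and both function names are hypothetical, invented for illustration rather than taken from any real system. The only difference between the leaky query and the safe one is a single condition in the WHERE clause.

```python
import sqlite3

# Hypothetical shared table: every client's contacts live together,
# separated only by the client_id column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts ("
    "id INTEGER PRIMARY KEY, client_id INTEGER NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO contacts (client_id, name) VALUES (?, ?)",
    [(1, "Alice"), (1, "Bob"), (2, "Carol")],
)


def search_contacts_leaky(prefix):
    # BUG: no client_id filter, so rows from every client can match.
    rows = conn.execute(
        "SELECT name FROM contacts WHERE name LIKE ? ORDER BY id",
        (prefix + "%",),
    )
    return [row[0] for row in rows]


def search_contacts_scoped(client_id, prefix):
    # Every query against a shared table must be scoped to one client.
    rows = conn.execute(
        "SELECT name FROM contacts "
        "WHERE client_id = ? AND name LIKE ? ORDER BY id",
        (client_id, prefix + "%"),
    )
    return [row[0] for row in rows]
```

Both functions look reasonable in a code review, which is exactly the problem: nothing fails until one client sees another client's rows.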
On the server side, it is also very easy to have a user log in, but then lose track of which client an active session belongs to and which data it can access. This creates a whole collection of bugs around guessing and iterating contact IDs.
Single-tenant systems don’t have these types of security problems. No matter how badly a system is secured, each instance can only leak data for a single client. Systems in industries with heavy penalties for leaking data, like Healthcare and Education, tend to be more single-tenant. Single-tenant models make audits easier and reduce overall company risk.
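As a rough sketch of the ID-guessing bug, assume a hypothetical Session object that records the client at login and an in-memory CONTACTS store standing in for the database. None of these names come from a real framework. The unsafe lookup trusts the contact ID alone; the safer one also checks that the record belongs to the session's client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    user_id: int
    client_id: int  # fixed at login, never read from the request


# Hypothetical store: contact_id -> (owning client_id, name)
CONTACTS = {
    101: (1, "Alice"),
    102: (1, "Bob"),
    201: (2, "Carol"),
}


def get_contact_unsafe(contact_id):
    # BUG: any logged-in user can iterate IDs across all clients.
    record = CONTACTS.get(contact_id)
    return record[1] if record else None


def get_contact(session, contact_id):
    record = CONTACTS.get(contact_id)
    # Treat another client's record exactly like a missing one, so the
    # response does not reveal which IDs exist.
    if record is None or record[0] != session.client_id:
        return None
    return record[1]
```

With a session for client 1, get_contact_unsafe(201) returns Carol's record from client 2, while get_contact(session, 201) returns None.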
Scalability concerns come in after complexity and security because they fall into the “good problems to have” category. Scaling problems are a sign of product-market fit and paying customers. Being able to go internet scale and process 1 million events a second is nice, but it is meaningless without customers.
Single-tenant systems scale poorly. Each client needs separate servers, databases, caches, and other resources. There are no economies or efficiencies of scale. Even the smallest, least powerful machines are generally far more powerful than any single client needs. Worse, usage patterns mean that these resources will mostly sit idle and eat money.
Finally, all of those machines have to be maintained. That’s not a big deal with 10 clients, or even 100. With 100,000 clients, completely separate stacks would require teams of people to maintain.
Multi-tenant models scale much better because the clients share resources. Cloud services make it easy to add another server to a pool, and large pools make the impact of adding clients negligible. Adding database nodes is more difficult, but the principle holds - serving dozens to hundreds of clients on a single database allows the SaaS to minimize wasted resources and keeps teams smaller.
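A back-of-the-envelope sketch shows where the waste comes from. Every number and name here is made up for illustration: loads and capacity are in arbitrary units, and the 25% headroom is an assumption, not a recommendation.

```python
import math


def single_tenant_servers(client_loads):
    # One dedicated server per client, however small the client is.
    return len(client_loads)


def multi_tenant_servers(client_loads, server_capacity, headroom=0.25):
    # One shared pool sized for the total load plus some headroom.
    total = sum(client_loads)
    return max(1, math.ceil(total * (1 + headroom) / server_capacity))


def utilization(client_loads, servers, server_capacity):
    # Fraction of the provisioned capacity that is actually used.
    return sum(client_loads) / (servers * server_capacity)


# Hypothetical fleet: 100 small clients, each using 2 units of a
# server that can handle 100 units.
loads = [2] * 100
dedicated = single_tenant_servers(loads)   # 100 servers
pooled = multi_tenant_servers(loads, 100)  # 3 servers
```

Under these assumptions the dedicated fleet runs at 2% utilization, while the pooled fleet runs at roughly 67% and still has room for bursts.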
Inconsistent performance, also known as the Noisy Neighbor Problem, comes up as a negative side effect of multi-tenant systems.
Perfectly even load distribution is impossible. At any given moment, some clients will have greater needs than others. Whichever servers and databases those clients are on will run hotter than others. Other clients will experience worse performance than normal because there are fewer resources available on the server.
Bursty and compute-intensive SaaS will feel these problems more than SaaS with a regular cadence. For example, a URL shortening service will have a long tail of links that rarely, if ever, get hit. Some links will suddenly go viral and suck up massive amounts of resources. At the other extreme, a company that does End Of Day processing for retail stores knows when the data processing starts, and the amount of sales in any one store is limited by the number of registers.
Single-tenant systems don’t have these problems because there are no neighbors sucking up resources. But, due to their higher operating costs, they also don’t have as many extra resources available to handle bursts.
Consistent performance is rarely a driver in initial single vs multi-tenant design because the problems appear as a side effect of scale. By the time the issue comes up, the design has been in production for years. Instead, consistent performance becomes a major factor as designs evolve.
Initial forays into multi-tenant design are especially vulnerable to these problems. Multi-tenant worker pools fed from single-tenant client repositories are ripe for problems with bursty and long-running processes.
Fully multi-tenant systems, with large resource pools, have more resilience. Additionally, processing layers have access to all of the data needed to orchestrate and balance between clients.
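One way a processing layer could balance between clients is to interleave their work instead of running jobs strictly in arrival order. The sketch below is a simple round-robin over per-client queues. It is an illustrative assumption on my part, not a description of any particular scheduler, and the fair_order name is hypothetical.

```python
from collections import deque


def fair_order(jobs):
    """Interleave (client, job) pairs round-robin across clients.

    Jobs from the same client keep their relative order, but one
    client's burst cannot push every other client to the back.
    """
    queues = {}
    for client, job in jobs:
        queues.setdefault(client, deque()).append(job)

    ordered = []
    while queues:
        # Take one job from each client that still has work waiting.
        for client in list(queues):
            ordered.append((client, queues[client].popleft()))
            if not queues[client]:
                del queues[client]
    return ordered


# Client "a" submits a burst of three jobs before "b" and "c" arrive.
burst = [("a", "a1"), ("a", "a2"), ("a", "a3"), ("b", "b1"), ("c", "c1")]
schedule = fair_order(burst)
```

In arrival order, clients b and c would wait behind all three of a's jobs. With the round-robin they are served right after a's first job.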
In this post I covered the two tenancy models, touched on why most SaaS companies start off with single-tenant models, and the major factors impacting and influencing tenancy design.
Single-tenant systems tend to be simpler to develop and more secure, but are more expensive to run on a per-client basis and don’t scale well. Multi-tenant systems are harder to develop and secure, but have economic and performance advantages as they scale. As a result, SaaS companies usually start with single-tenant designs and iterate towards multi-tenancy.
Next up, I will cover the gray dividing line between single and multi-tenant data within a SaaS, The Tenancy Line.