
Multi-tenant options
Michael: Hello and welcome to Postgres.FM, a weekly show about
all things PostgreSQL.
I am Michael, founder of pgMustard, and I'm joined as usual
by my co-host Nikolay, founder of Postgres.AI.
Hey Nikolay.
Nikolay: Hi Michael, hi.
Michael: And today we are joined by special guest Gwen Shapira,
co-founder and chief product officer at Nile, to talk all things
multi-tenancy.
Hello Gwen, thank you for joining us.
Gwen: Thank you for having me, it's very exciting for me to be
in your show.
Michael: Oh, we're excited to have you, thank you.
So to start perhaps you could give us a little bit of background
and what got you interested in this topic of multi-tenancy
in general?
Gwen: As with many things, it started with an incident, but this time not actually one of mine. When my co-founder and I started Nile, we actually started with a very different idea.
After about 9 months, we were like, this idea is not working
very well.
We developed some things, we're not finding the market we hoped
to find.
We were sitting in the Hacker Space over in Mountain View, and
we're like, what did we learn?
We talked to 200 companies all doing SaaS.
What have we found out as a result?
And the thing that called to us is that very early on, basically when the first lines of code are getting written, you have to choose a multi-tenancy model.
And then about 2, 3 or 4 years later, depending on how fast you're
growing, you have to change it.
And we heard a lot of stories on what caused people to change
it and whether they regret earlier choices or they're like, we
didn't know better and how things went for them.
And then we started looking in different blogs and we found so
many by very famous companies with either incidents where something
that was done to a single tenant caused a whole chain of events
that took down their entire system, sometimes for days.
And also a lot of slightly better stories, how we sharded our
highly multi-tenant database.
And we found story after story after story on how people had
to re-architect their entire database, which is, as you guys
know, extremely painful to do after you're a successful company
3 years in.
And we're like, this is a good problem.
So many people have it.
It is so common.
My past was in databases, not as much Postgres, more Oracle and
MySQL.
But I've seen this problem again
and again in all kinds of companies.
My co-founder has seen this problem again and again in all kinds of companies.
This is such a good problem.
Everyone has it and nobody's working
on it.
Why is nobody working on it?
And that's how we got into it.
Michael: Yeah, nice.
Should we go back then in terms
of how do you like to describe
the different models or the different
options that people have
in the early stages?
Gwen: Yeah, so everyone basically
starts with 1 out of 2.
And I'm using AWS terminology,
even though there is other terminology
that people apply to it.
AWS calls it the pooled model versus
isolated model.
And in a pooled model, you basically
create your tables as normal,
and then add a tenant ID Column
to each and every 1 of your tables,
pretty much.
Some, maybe not everyone, some
have shared data, but most of
your tables are going to end up
with a tenant ID Column that
tells you which tenant this Row
belongs to.
Very easy when you start out, and all you have to do is sprinkle some WHERE clauses every now and then, and you're pretty much good to go.
How hard can it possibly be?
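For listeners who want to picture it, the pooled model might look something like this (table and column names here are just illustrative):

```sql
-- Pooled model: one shared set of tables, each row tagged with its tenant.
CREATE TABLE invoices (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    tenant_id  uuid        NOT NULL,
    amount     numeric     NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Every query has to carry the extra WHERE clause:
SELECT sum(amount)
FROM invoices
WHERE tenant_id = $1  -- forget this condition and you leak another tenant's data
  AND created_at > now() - interval '30 days';
```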
The place where it gets you is that you have no isolation. Everyone is in 1 big pool.
And if you have a problem tenant: if someone grows really, really large and suddenly Queries start getting slow for them, or if you need to do an upgrade and 1 Customer absolutely refuses to accept changes or needs their own time window.
There is a lot of different ways
you may discover that by putting
all your Customers in 1 big pool,
you save a lot of effort, you
save a lot of money.
This is by far the cheapest option
you're going to have.
It's shared resources in 1 Database.
But you are not allowing yourself
to do anything specific for
any 1 Customer should they need
it.
The other approach is basically
the reverse.
It gives every Customer its own
Database or sometimes its own
schema.
This still counts as isolated,
even though It's not all that
isolated.
You share quite a lot in that scenario.
But the schema is separate.
And it's quite a bit easier to
move them out if needed, if the
schema is separated.
And In this scenario, first of
all, there is a nice benefit that
you get help from a lot of popular
frameworks.
I think Ruby has one; I think it's called the Apartment plugin, for having this kind of multi-tenancy model.
Django has something.
So a lot of very popular frameworks
have something that helps
you in that model.
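The schema-per-tenant flavor of the isolated model might be sketched like this (tenant and table names are illustrative):

```sql
-- Isolated model: each tenant gets its own schema with its own copy of the tables.
CREATE SCHEMA tenant_acme;
CREATE TABLE tenant_acme.invoices (
    id     bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    amount numeric NOT NULL
);

-- The application selects the tenant by switching the search path,
-- so queries need no tenant_id column at all:
SET search_path TO tenant_acme;
SELECT sum(amount) FROM invoices;
```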
But if you accidentally grow to
a large number of databases,
it starts being very painful.
Obviously, a database with a large
number of objects is no longer
as easy to work with.
Suddenly, you start learning how
much space and memory the catalog
can really take when you have connections.
Suddenly you start learning that
pg_dump can take a very long
time.
If you actually have a database
for each tenant, then doing any
kind of maintenance on 100 databases
is already not fun.
If you end up going into a thousand
databases, it's really not
fun.
And if you think about it, a lot
of SaaS have customers in the
hundreds of thousands, not just
thousand.
So it becomes very painful exactly
when you grow.
Nikolay: Yeah.
For example, an upgrade includes dumping and restoring the schema. And we've had cases with 250,000 tables, 2 million indexes. Well, the indexes themselves are not dumped, but the tables are.
Gwen: Exactly.
This is worse than no dump, right?
If only we could dump indexes.
Nikolay: Yeah, and then you need
to update statistics after upgrade
for all of those tables.
It's a nightmare, honestly.
Gwen: Yeah.
On the other hand, at least you
get to do it to a customer at
a time.
Imagine that everyone is in 1 really
big database and now you
have to upgrade all of them together.
Nikolay: Yeah.
Yeah.
It's painful, all this.
Gwen: Yeah.
And then there is the mixed model, which I think pretty much everyone ends up with, where you start with the pooled model, and as you grow, you shard it. But you're actually pretty smart, and you realize that not all customers are the same size, and you can have some dedicated shards for your biggest or most sensitive or most demanding customers.
And this is, I think, if you look
5 years into the life of a
company, I would say that this
is the dominant model.
Some variation of: we have shared databases with a pool, and then some dedicated databases for specific customers.
Nikolay: What is the problem we're
trying to solve?
Is it security or performance or
both?
And if we go back: when some customers share this pool, it affects the security goal, right? So we don't achieve it.
Gwen: Absolutely.
So, first of all, this is 1 driver for people actually starting with the isolated model.
They know that they're going into
a sensitive area.
They're focusing on SaaS for healthcare,
SaaS for finance.
Those companies definitely start
with isolated model and try
to figure out how to manage large
number of databases.
A lot of times those companies
don't become huge.
There are only so many hospitals in the United States, but they still have to build all the tooling to manage a large number of isolated databases.
For other companies, it's more
complicated.
I would say maybe 70% of the time,
the reason for eventually
moving customers out and sharding
and isolating would be performance.
It's amazing how many performance
problems can be solved by just
having less data in each database.
On the other side, there are the stories where 2 years in, suddenly a very sensitive customer shows up, or you want to sell into... like, you thought you were building a normal CRM or some kind of a RAG database, but then a healthcare company shows up, a bank shows up, or even worse, a government shows up, and they show up with a list of demands, and since they usually have good amounts of money to back those demands, there is a lot of incentive to figure out a solution for them.
Michael: Nice.
We have 1 kind of ugly duckling
in the Postgres world that I'm
not sure quite fits either of these
models.
I wonder if it's worth discussing
row level security briefly,
because if I was to bucket it based
on those definitions, it's
kind of the pooled model in a way
because all of the data is
together but there is some isolation
Gwen: between tenants
Michael: absolutely yeah
Gwen: yeah I have a love-hate relationship
with RLS I think a
lot of people do.
Because you're right, on 1 hand,
it's absolutely a lifesaver
in the pooled model.
Developers make mistakes as joins and conditions get more complicated. It's very easy to misplace a WHERE clause and actually leak data that you don't want to.
So RLS will prevent you from doing
it if you do it right.
It turns out that a lot of times
the rules could get complicated
and then it leads to bugs.
It also turns out that a lot of
times the rules get complicated
and it leads to terrible performance.
And 1 thing that developers really
don't realize, I'd say almost
no developer realizes it until
they run into it.
The WHERE conditions that RLS introduces are not optimized like the WHERE conditions that you introduce yourself.
Because Postgres, thankfully, is
very good about security, It
treats the work conditions in RLS
differently.
I think they have, they call it
security conditions or something
like that.
And they are very, very conservative
on how they optimize it
and how they plan for it.
This has benefits.
You get very strong security guarantees,
very few bugs as a result.
This is fantastic.
On the other hand, the plan would
be sometimes significantly
worse than what you would come
up with if you were to look at
it really hard and do it yourself.
And with RLS, there is basically
no way to force the plan you
want.
You cannot set enable/disable planner settings to force a different plan, because, again, the main overriding rule is that it's very conservative on how those RLS conditions are optimized.
So some people call RLS a performance killer. I wouldn't necessarily go that far, but you can definitely run into gotchas, and you need to be aware that it's not normal WHERE conditions that you're looking at.
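A minimal sketch of the RLS setup being discussed (policy and setting names are illustrative, not anyone's production code):

```sql
-- Pooled table protected by row-level security.
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;

-- Only rows belonging to the current tenant are visible; the condition
-- becomes a security qual that the planner treats conservatively.
CREATE POLICY tenant_isolation ON invoices
    USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- The application declares the tenant per session or transaction:
SET app.tenant_id = '...';          -- the tenant's UUID goes here
SELECT sum(amount) FROM invoices;   -- rows are filtered automatically
```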
Michael: Yeah.
Nice.
Nikolay: So what does Nile offer
today and what is the ideal
solution to all this in Postgres
context of course?
Gwen: Yeah, so basically we wanted
to do maybe 3 things.
First of all, give isolation while
not degrading the developer
experience.
So for example, we partition data by tenant out of the box for you, completely transparently, because we know that a bit later on, you're going to want it, and it's going to be a pain in the ass to add.
We shard it transparently. Basically, your database may be spread across multiple different shards.
We will route the queries for you and make sure that they are working as they should.
So, in a way, you get the model
you will have anyway in 4 to
5 years, but you're getting it
from the get-go and without doing
a lot of the work because we are
doing a lot of the management
for you.
The other thing is that we have
done some work to basically bypass
RLS and still give you isolation.
So for the queries, you kind of do the same as with RLS: you set the tenant ID equal to something. We use that to actually direct queries. We rewrite the queries immediately to the partitions that we know have the data for that tenant.
So we have a small extension that
kind of replaces table names
with partition names in the query
itself.
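Conceptually (this is not Nile's actual code, just a sketch of the idea), the rewrite turns a query against the shared table into a query against the one partition that holds the tenant:

```sql
-- Assume a table list-partitioned by tenant (illustrative DDL):
CREATE TABLE invoices (
    tenant_id uuid    NOT NULL,
    amount    numeric NOT NULL
) PARTITION BY LIST (tenant_id);

CREATE TABLE invoices_t42 PARTITION OF invoices
    FOR VALUES IN ('...');  -- one tenant's UUID

-- What the application writes, after declaring its tenant:
SELECT sum(amount) FROM invoices;

-- What effectively runs once the tenant is known: the partition is
-- addressed directly, skipping RLS checks and run-time partition pruning.
SELECT sum(amount) FROM invoices_t42;
```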
And this vastly improves performance
in the majority of cases.
I mean, we've seen it in a bunch
of cases, especially if you
have slightly weird indexes that
RLS may conservatively not use.
The improvement is quite stark,
depending, obviously, on table
sizes.
You can get a benchmark that proves
anything, so I don't want
to throw numbers out there.
But obviously, if you break down a table with a million rows into a thousand tenants with a thousand rows each, then you can show quite a... you can see where I'm going with that.
The other thing we did, and that
was probably the most work,
and this is still work in progress,
is allow moving tenants around.
Because 1 of the biggest problems is that a tenant gets large or noisy, and you want to give it its own machine, but moving it usually means a long downtime if you catch it after the tenant is already large.
By doing the compute-storage separation, we can basically make it transparent. It's a latency spike while we're holding off some queries, while we're moving things like setting up sequence IDs and pointing a different compute at the same part of the storage. But it's essentially a no-downtime operation.
So we think it's a huge deal because
again, it's just a problem
that we keep seeing again and again.
Nikolay: Yeah, I'm curious, is
it all open source what you build
or only parts of it?
Gwen: Right now it's mostly hidden. We have started releasing parts of it under an Apache 2.0 license.
So yeah, the goal is to open source
it and we already publicly
declared that it's going to be
open source.
We have not a date, but a point
of completion where we plan to
open it.
Nikolay: Yeah.
So I've heard about this; several interesting things here. So 1 is the extension for this. I guess in your documentation it's called RLS virtualization, or how is it called?
Gwen: We call it tenant virtualization.
The extension itself, I think we
called it Karnak.
We call everything after stuff
in Egypt and Karnak is a famous
temple.
Nikolay: So data is stored in separate tables in the same database, but the extension rewrites queries to basically route the query to the proper table, right? Is this based on...?
Gwen: It's in separate partitions, and we rewrite queries to go to the correct partition.
We basically bypass RLS, we bypass
the planner trying to make
those calls.
We found out that with a large
number of partitions, this is
significantly more efficient.
Nikolay: I see.
So it's Postgres partitioning.
I see.
And another thing you mentioned here is sharding, right?
Gwen: Yeah.
So we use foreign data wrappers
to allow...
So we have 2 things in the architecture.
First of all, we have a proxy,
it's a routing proxy.
So it keeps track of every connection,
which tenant is the current
tenant and it routes it to the
shard that has the correct tenant
in it.
And then we also have some cases
where some developers want to
write queries that touch multiple
tenants.
Those are not going to be as fast,
but we do allow them by use
of foreign data wrappers.
And mixing partitions, with our own partitioning rules, with foreign data wrappers while still keeping things efficient: we didn't want the planner on any machine to be aware of all the partitions in all the other shards, because it just explodes the planning time in ways that we saw as unacceptable.
So what we did is represent each
shard with a table and then
put hierarchical table inheritance
on top of it.
And the end result is basically
a union all between the table
with the table inheritance that
points to all those other shards,
and the table with the partitions.
Now this gives us basically predicate pushdown, because the planner will push the plan to all those different shards. The other shards know that they have partitions, which the source planner didn't know. They will plan correctly with all the partitions, but they only know about a subset of partitions.
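A rough sketch of the shape being described, with all names and details guessed rather than taken from Nile's actual setup: each remote shard is represented locally by a single foreign table, plain inheritance groups the remote shards under one parent, and a view unions that parent with the local partitioned table:

```sql
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Local shard: a partitioned table holding this shard's tenants.
CREATE TABLE invoices_local (
    tenant_id uuid    NOT NULL,
    amount    numeric NOT NULL
) PARTITION BY LIST (tenant_id);

-- Parent for remote shards; each shard is ONE foreign table,
-- so the local planner never sees the remote partition lists.
CREATE TABLE invoices_remote (
    tenant_id uuid    NOT NULL,
    amount    numeric NOT NULL
);

CREATE SERVER shard2 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard2.internal', dbname 'app');
CREATE FOREIGN TABLE invoices_shard2 (
    tenant_id uuid NOT NULL,
    amount    numeric NOT NULL
) INHERITS (invoices_remote)
  SERVER shard2 OPTIONS (table_name 'invoices');

-- The union all: predicates on this view are pushed down to each
-- shard, which re-plans them against its own local partitions.
CREATE VIEW invoices_all AS
    SELECT * FROM invoices_local
    UNION ALL
    SELECT * FROM invoices_remote;
```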
So we see it's a bit hacky and
it's a bit weird to explain.
And we think we can do better with
some modifications to Postgres,
which we have not done yet, but
this does give us predicate pushdown,
fairly fast planning, and the ability
to do queries that cross
tenants in situations where this
is required, essentially.
Nikolay: Yeah, if you don't involve 2PC, just rely on foreign data wrappers, no two-phase commit.
I'm curious what kind of anomalies
can happen there.
Gwen: Yes, and we prevent a lot
of things that could cause anomalies.
So we do have a transaction coordinator,
but in order to not
overload and also not overcomplicate
our architecture, we limit
some things.
So, DDL has to be done on a single
tenant, and you cannot mix
cross-tenant queries.
Sorry, DML, INSERTs and UPDATEs have to be done on a single tenant,
and you cannot mix in a Transaction cross, so the moment you
start a Transaction, you have to know what tenant you're working
on.
And then we route it to the correct shard, which has the correct
table, and everything has the absolutely correct guarantees.
If you need to do something cross-tenant, you do not involve
it with any kind of UPDATE.
You could still be exposed to some anomalies, I agree, because
it could be ongoing Transactions from other people in other places.
So you get the basic Read Committed guarantees that Postgres
gives you, I believe, but not anything more than that.
But again, we believe that cross tenant Queries are rare and
mostly done in analytical cases, where you do reporting where
it's slightly less critical to have those.
Nikolay: So you forbid the writing to 2 shards in 1 Transaction?
Gwen: If you want to write to a shard, it's fantastic.
You tell us what tenant you're writing data into, and we will
direct you to the correct shard.
Nikolay: I mean, if there is a Transaction which needs to write to 2 different shards, this is a big problem, because without 2PC...
Gwen: Yes, exactly.
And we don't let you do that, essentially, in order to avoid
anomalies.
Nikolay: I see.
Another question here: have you considered the approach used in Vitess? As I understand it, maybe I'm wrong, where, mostly for analytical Queries maybe, to avoid distributed Transactions, data is brought asynchronously from 1 shard to another, and we have it locally, basically a kind of materialized view on top of logical replication, for example, or something. And it has an eventual consistency approach, of course, but you can just join it in 1 Postgres, in 1 shard, right? Have you considered this approach?
Gwen: Yeah.
We have considered it.
I think maybe CitusDB has something similar, if I remember correctly.
I'm not 100% sure.
But yeah, it's something that we were like, yeah, this is a good
idea that we may examine in the future.
It's definitely, we are trying to build something useful gradually,
and we understand that early on, it's almost safer to have a
bunch of limitations that over time will resolve, rather than
allow people to do something unsafe and also build a kitchen
sink.
Nikolay: It sounds to me like Postgres versus MySQL approaches
because MySQL approach, you remember MyISAM?
Maybe you don't remember, but it was like quite bad.
You need to run a repair table
all the time because it's not
ACID and so on.
It allowed too much.
Gwen: Exactly, yeah.
And MyISAM had a lot of issues.
I mean, there is a reason why InnoDB
became extremely popular.
Nikolay: Right, right.
Michael: But also, with a multi-tenancy
use case, I think you're
quite right that, well, we, I mean,
you'll find out soon enough,
right, if lots of people want these
cross-tenant or cross-shard
queries which are by definition
cross-tenant queries and if they
don't, if you don't need to worry
about it, you save a bunch
of effort having to even implement
that.
So yeah, I like that a lot.
Nikolay: Yeah, my last comment here is, I'm excited to see that finally the Postgres ecosystem receives some attention in the area of sharding. I guess the time has just come, and more and more databases became too large to be handled.
Gwen: I'm almost surprised.
I'm honestly surprised it took
that long.
I mean, again, if you look at MySQL.
Nikolay: It's just unfortunate. The first time I touched this topic was 2006, immediately when we started working with Postgres, honestly. And there was PL/Proxy from Skype at that time already, but it required you to write everything in functions.
Gwen: Partitioning didn't exist
back then.
Nikolay: It existed. It was based on inheritance. It required much more manual work. It was fun, actually. You understood it better, you know? But yeah, it was not convenient, 100%. Not super convenient.
It's just, I see that it's just
unfortunate how it turned out
in Postgres ecosystem.
And now definitely there is huge
pressure.
Many companies need partitioning...
Need sharding.
And it seems...
Gwen: Like, how would history work?
Imagine that YouTube picked Postgres
and not MySQL as their first
database.
Nikolay: Yeah, Google or Facebook, they both chose MySQL somehow, yeah.
Michael: Vitess would be for Postgres first, right?
Gwen: Exactly.
It could have turned out so differently.
Michael: Yeah.
Funny, isn't it?
Changing topics slightly: LLMs are quite top of mind at the moment. Are you seeing people's initial choices change as a result of asking for advice earlier from our robot friends?
Gwen: Oh my god, we're seeing so
many weird things, it's just
unbelievable how much things are
changing.
First of all, we're having people
show up on our Discord and
say things like, I'm using Nile
because my LLM thought it's a
good idea.
And I don't really know Postgres,
so I need some help, but my
LLM assured me that this is still
a good idea.
Like, a lot of people are very much beginners. Maybe I can say: when I started developing, the first time I had to use a database, my company sent me to a 3-week database class.
I think it was Oracle about 20
years back.
And I came back a lot more confident
that I know how to use Oracle
and not to leave transactions open
for too long because people
will yell at me.
These days people don't do the
three-week class before they try
using a database.
So you see a lot more, people are
using a database earlier on
and they do expect more hand-holding
from the vendors.
Like the LLMs give them advice
up to a certain point, but eventually,
if things are slow, they will come
to you and say, hey, why is
my query slow?
I'm sure you guys have seen your
share of that.
I'm also seeing people use Postgres
for their LLMs in different
ways.
And this is really exciting to
me.
People use Postgres via MCPs, people
use Postgres with vectors,
people building AI applications
on top of Postgres.
We're seeing a lot of that.
And, I mean, personally, I'm really
excited that people can program
with LLMs, not knowing a lot about
Postgres, not knowing even
a lot about software engineering
at all, and still get reasonable
security guarantees.
You don't need to know to ask your
LLM about RLS or about, Are
you sure this query actually properly
isolates tenants?
And it's also interesting how much
the results differ when people
use different LLMs.
Like, I would say thinking models do fairly well, iterating in order to get good code and checking their own results, again, given access to a database via MCP.
I would say that if you use ChatGPT 4o, you will get a lot of random hallucinatory stuff still in your code.
Nikolay: Yeah, it's funny.
I already told Michael, we had these cases doing consulting, maybe already almost a year ago. I started noticing that people send us something like: we are building this part of the database, can you review it? We review it, we use different LLMs supporting this review, and then we have a call, and I'm curious: the code looks great, I mean, the schema looks great, but something is off, right? And then we have a call, and I see they have tabs open with ChatGPT and Claude there as well, so I realize they used an LLM to create the schema and then sent it to us for review, and we use LLMs to review it. So there's like a 4-party process, you know. It leads to a good place, but someone needs to jump in with proper expertise and say, this is not a good approach.
Gwen: This is hilarious.
Do you think at some point you
and your customer can just step
out and let the LLMs figure it
out between themselves?
Nikolay: Well, there's a problem here, because, like, it's great, but it does 80% of the job with 1%, not even 20%, of the effort, very quickly. But there is the remaining 20% of problems, which, again, like, I don't know, maybe in the next few years it will change. I think it will change. But right now I feel my internal LLM is trained much better than ChatGPT.
Gwen: You trained your own LLM,
right?
Nikolay: Yeah, no, I mean my own.
Gwen: Oh yeah, no, but I think
you actually trained your own.
Nikolay: Yeah, we experiment. We do some stuff, we experiment. We started with fine-tuning, moving to our own LLM, but we're not there yet.
So yeah, there is a lot of stuff that can be done there properly. It's hard to compete with Claude, and they have a very high pace. With the Claude 4 release, I see, wow, it's really great. But it's still missing many things you learn from practice, which were not discussed publicly. That's why they don't bring it.
Many problems were not discussed
yet.
And you explore them if you have
a lot of data and heavy workloads.
Gwen: So you're saying that even
training on the Postgres mailing
list doesn't have all the information
in it, essentially?
Nikolay: Well, yes.
For example, random problem, we
recently touched it.
There is a buffer pool in Postgres, and its mapping is split into basically 128 partitions, so you can have 128 locks. And if you have a huge buffer pool, this becomes a bottleneck.
And some people say, let's maybe
make it tunable, configurable.
But when you start researching this topic, you end up finding a recent, well, recent as of last August and September, conversation on the hackers mailing list, which is open-ended. It's not complete, because somebody needs to run benchmarks and prove that this is worth a new setting.
And that's it.
There's a patch proposed, but that's
it.
No experiments yet.
Gwen: This is, by the way, 1 of the reasons I was so excited about your LLM approach. Because if you think about what the bottleneck is for doing a lot of the database improvement things, and I am feeling it very personally: running benchmarks is hard.
Properly planning a benchmark is
hard.
LLMs can help with that, but again,
they sometimes just go off
the rails.
And even if they help plan that,
I don't know LLMs that actually
run benchmarks to the point where
they provision the machines
in AWS and know that you have to
provision a separate machine
as a database and a separate machine,
maybe a few of them, to
drive the workload and they both
need to have appropriate resources.
Like all those and then don't get
me started on analyzing the
results, which is kind of 99.9%
of the work.
So the fact that you actually kind of started your LLM work from "I have an LLM that can actually do benchmarks" is just... I think this will be the biggest breakthrough in both people tuning their own Postgres, and also the Postgres community being able to advance the state of the art.
Nikolay: Let me share sad news.
Like, I'm not giving up, but it's
a roller coaster.
We spent, like, a team of maybe 5 engineers spent more than 1 year trying to achieve that.
We achieved many things, but first
of all, we chose Gemini because
they gave us credits.
And I think it was a huge mistake.
Gemini has a lot of problems.
Suddenly you have 500 errors; there are so many problems. It's just not a mature product, Gemini, and it has hallucinations all the time.
It's good, for example, for working with JSON, because when we needed to run experiments, we decided to choose JSON as the config format, and it writes it much better than GPT-4, 4o and so on.
But many things, it just hallucinates.
It invents all the time some things.
It just makes up results all the
time.
And we have a system to control
it but it bypasses all the time.
Like it's silly hard.
So then, yeah, we experimented additionally with DeepSeek, Llama, and we've fine-tuned a lot.
All versions of GPT, all modern
fresh versions, we also bring
them all the time.
And Claude, Claude is much better,
we just added it to this system
we have.
But after 1 year, I decided: you know what, benchmarks is an extremely hard topic. And we cannot trust it anymore.
I mean, we cannot trust LLM to
create precise configuration and
process results fully.
So we decided that the LLM is just more of a connecting thing. When you engineer a benchmark, an expert needs to engineer it.
I don't trust any LLM for now,
because any experiment, we plan
to publish maybe 15 to 20 experiments
in our blog last year.
And if you open our blog, you see
just 1 experiment.
And even there we screwed up and
someone on Twitter said this
is not right and we quickly corrected,
which is good.
And we had achievements like interesting
things, bottlenecks
popped up here and there.
It's really fun to iterate with
LLM.
But once you allow it to think, to design the experiment and to interpret the results, 99% of the time we have wrong results, wrong conclusions and so on.
So for now we are thinking: okay, this is just an accelerator for performing experiments, but design and understanding of results should be in the human brain for now. 100%.
Gwen: I love the fact that Benchmarks
is also hard for robots,
to be honest.
It's so hard for humans, right?
Nikolay: Yeah, and we collect so
many artifacts, But somehow
it's super hard still.
So you always think where is the
bottleneck?
And simple question, right?
But for now, it's extremely hard
to let LLM find bottleneck and
draw proper conclusions.
Gwen: And honestly, if you just had an LLM that always said it's the network, it would be correct about 80% of the time.
Nikolay: What if it's local? If there's no network, everything goes through Unix sockets, and we had these cases as well.
So I agree with you in production.
Like in production, yes, but in
experiments when we learn Postgres
behavior on single machine running
pgbench locally, we don't
care sometimes.
So there's no network there sometimes.
So it's hard.
So I had many moments of frustration.
But it's so good, like I still
believe that iterations are great.
So if you say this is great benchmark,
just check it on new version.
Just changing 1 thing.
This is good.
We have automation, we have interface,
and it repeats the process
of analysis again.
This is where LLM helps a lot.
Because without an LLM, you just have some form, and oh, we don't have this parameter programmed, it's not exposed in the interface. That's bad.
With LLM, you have freedom to change
things and iterate based
on existing good benchmarks.
So, yeah, I cannot say we are there
yet.
With this project, right now we are thinking about the next level of it, where I think we will give the LLM less freedom, you know, that's the key, and control more by the human brain.
Gwen: Human in the loop kind of
thing.
Nikolay: Yeah, yeah, exactly.
Exactly.
So design and first analysis, only
human should be there.
But once you have confidence that
you're moving in the right
direction and you just need to
iterate and expand, for example,
to different versions, platforms,
everything, this is where you
can relax.
You already verified results.
You can say just repeat, but on
different something.
This is where LLM already can bring
you.
It can just speed up everything.
You can throw it to this benchmarking
process and have like 10
experiments running in 10 different
versions or something.
Gwen: And I think this is also
kind of almost the general directions
that agents are taking shape.
I mean, 2025 was supposed to be the year of the agent. I think it's almost becoming the year of the human in the loop with the agent.
Yeah.
Like all the successful products
I see are you tell the agent
to do some stuff.
You ask it, please plan something.
You give it feedback.
You then say, okay, now that we
have a good plan, go and execute
on it.
You come back an hour later, you
had your coffee.
Okay, let's see what you've got.
Here's some feedback.
Go fix some stuff.
I think it's always every successful
product is a bit like this.
Nikolay: Right.
But sometimes humans start using
different LLM when reviewing
things, right?
Being lazy, right?
Gwen: Yes.
Nikolay: That's interesting.
So I'm not sure how successful it will be, but my gut tells me that we need to move, move, move in this direction anyway and have some, I don't know, more experiments and so on.
I hope we will have more to publish soon, I mean, and start iterating.
But yeah, last year was a rollercoaster, so now we are rebuilding stuff.
We will see how it works.
And for example, one of the experiments we must do, I think, is to conduct various benchmarks for RLS.
Because it's obviously...
Gwen: That would be a fantastic example.
Yes.
And I mean, this is something that humans with experience are pretty good at: finding the cases where you're like, is RLS actually going to be an issue, and having the stories that the LLM can then go implement and test.
Nikolay: Yeah, we had several cases, and Supabase also has public materials, blog posts about this.
Obviously, there is already a quite well-known case when you have the current_setting function inside an RLS expression.
And, yeah, if you select a count over a million rows, it's terrible.
And it's actually quite easy to fix.
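[Editor's note: the fix referred to here is the one Supabase describes in its RLS performance advice: wrap the per-row function call in a scalar subquery so the planner evaluates it once per statement, as an InitPlan, instead of once per row. A minimal sketch, with a hypothetical table, column, and setting name:]

```sql
-- Slow: current_setting() may be re-evaluated for every row the scan visits.
CREATE POLICY tenant_isolation ON documents
  USING (tenant_id = current_setting('app.current_tenant')::uuid);

-- Faster: the scalar subquery runs once per statement, so a
-- SELECT count(*) over a million rows compares against a cached value.
CREATE POLICY tenant_isolation_fast ON documents
  USING (tenant_id = (SELECT current_setting('app.current_tenant')::uuid));
```

[The two policies filter the same rows; only the plan shape changes.]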
But yeah, the idea is to collect these kinds of experiments and see how it goes.
Actually, I think my goal with this experiment will be to prove that these things are not a problem if you do it right.
Gwen: Interesting.
I would contribute: I think there was a recent post on the bug tracker where basically the optimizer, the planner, refused to use, I think it was a GiST index or a GIN index, because it considered the optimization unsafe under RLS.
I can look it up and send it to you.
Yeah, it's interesting.
But that can also be interesting: if you have a fix for Postgres, it will obviously be nice to showcase it.
Nikolay: Right.
I think also, so you don't use RLS, you said you bypass it, right?
Gwen: We bypass it, yeah.
Nikolay: Yeah, that's interesting.
I'm curious whether, in this mixed scheme where we have partitions or shards, it makes sense to additionally involve RLS locally if some of the shards have a mixed pool of customers.
Gwen: Okay, one of the questions that I have in mind, and this is something that we're trying to help figure out for our users: often on top of the tenant, you still have permissions for specific users in the tenant.
Like you have an admin that can do anything, and then you may have someone who is not allowed to see some rows at all because they're too sensitive, all this kind of stuff.
And our users ask us: can they do it with RLS inside those partitions, or can they do it in their application?
There are a lot of application-level tools, or middleware kind of tools, that will do it for them.
Whether it's better to do it in the app layer or in Postgres with RLS is a good question that I don't have an immediate answer for.
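[Editor's note: for teams that do choose RLS for this inner layer, Postgres lets the two layers stack: permissive policies on a table are ORed together, while policies created AS RESTRICTIVE are ANDed on top of them. A rough sketch, with hypothetical table, column, and setting names, combining tenant isolation with an admin-only sensitivity rule:]

```sql
-- Layer 1 (permissive): rows must belong to the current tenant.
CREATE POLICY tenant_rows ON records
  USING (tenant_id = (SELECT current_setting('app.tenant_id')::uuid));

-- Layer 2 (restrictive, ANDed with the above): non-admin users in the
-- tenant never see rows flagged as sensitive.
CREATE POLICY hide_sensitive ON records
  AS RESTRICTIVE
  USING (NOT is_sensitive
         OR (SELECT current_setting('app.user_role')) = 'admin');
```

[Whether this beats an application-layer check is, as Gwen says, an open question that depends on the workload.]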
Nikolay: Right.
So there are several layers of multi-tenancy, basically.
Yeah, not multi-tenancy, but different layers.
And if your customer is a tenant for you, they might have tenants of their own, but also inside each tenant they might have additional clusters or segments of users.
So it's apartments, and rooms inside apartments.
Gwen: Nice.
I like that.
Yeah.
This is one of the things that makes multi-tenancy confusing, right?
Because it's almost like a Matryoshka kind of scenario.
Nikolay: Yes.
This term, by the way, has been used in the Postgres ecosystem multiple times.
Starting with, you mentioned GiST: the original paper by Hellerstein mentions the RD-tree, the Russian doll tree, so basically a Matryoshka tree.
Gwen: That's true.
And also in embeddings, there are the Matryoshka-type embeddings, where you can make them any size.
Nikolay: Yeah, I read about this
as well.
Yeah, it's funny.
So yeah, great.
And what are your plans for AI?
I saw MCP servers already with
some integration.
Gwen: We basically have three directions.
One is the MCP server: making it public, giving it authentication.
Right now, it's open source, you can run it on your own, but we're not hosting it, so we should start hosting it at some point.
So, things that just make it easier for people who use LLMs to use it.
The other thing we've done, which we think is very useful for LLMs, and this we already have: just making it zero time to create new databases.
Because for LLMs it should be zero time and zero cost, because they love creating a lot of them.
Every time something goes wrong, it's an okay-let's-try-from-scratch-with-a-new-database kind of situation.
So we're making it fast and cheap.
The other thing that we're still working on is really making our documentation more LLM-friendly.
llms.txt is absolutely not enough.
It actually creates a very large file, and the LLMs tend to get lost in it.
We need to figure out how to make it better.
And then also, everyone is kind of thinking, can we have our
own agents?
Can we do something around that?
We're kind of thinking about that.
Something we already have that is useful is just that, with the multi-tenant model, the embedding indexes, the vector indexes, are much smaller.
And this is a huge deal for people building those agents and LLMs.
Nikolay: Yeah.
And actually, regarding this zero-startup goal, you mentioned separation of compute and data.
How is that achieved?
Using what approach?
Can you elaborate?
Gwen: Oh, yeah.
Sorry.
This is more than a two-minute answer, but the short one is that you patch Postgres, and then you find a better way to do your storage and basically wrap every Postgres storage function with an equivalent using your storage.
And then you also need to apply the WAL continuously to the storage layer.
Nikolay: Great job doing this in, like, 20 seconds actually.
I understood it very well.
Yeah.
So, and do you have plans to make this open source as well?
Gwen: Yes, absolutely.
I mean, first of all, as you know, WAL readers have to be registered, ours already is, and with an open-source license.
And then, yeah, we are planning to open source our storage layer, our extension.
I think at this point it's about seven different patches we've made on Postgres.
I don't think we'll want to open source it as a Postgres fork, because the number of patches is quite small and we are maintaining it.
I think everything from Postgres 12 to 18 at this point, or maybe
13 to 18, something along those lines.
Nikolay: All supported versions, I guess.
Yeah.
That's great.
Yeah.
Well, looking forward to checking it out.
And actually, one more question from me, maybe the last one.
Are you open to some benchmarks we will probably do?
We plan to run some benchmarks on various platforms.
It all started after the acquisition of Neon and a discussion on LinkedIn.
So I thought about trying some benchmarks on different platforms.
How do you think, like what are
your thoughts about it?
Gwen: Oh my God, this is very scary.
We are benchmarking ourselves all
the time.
So I'm keenly aware of exactly which benchmarks make us look good and which benchmarks make us look bad.
And I think that's also how we react to benchmarks on social media.
Unless it's a benchmark that shows something in Postgres itself and is clearly attempting to educate and help people.
You can design a benchmark to make anyone in the world look bad.
Nikolay: Benchmarketing, it's called.
Michael: You can design a benchmark to...
Gwen: ...make everyone look good.
Exactly.
So, at any given time I can publish my benchmarks that make Nile look fantastic, and you can run benchmarks that you decided how to build, and they may be very realistic and may even expose a problem that we have that I didn't know about.
But, yeah, in general, I love benchmarks.
I just have opinions on how benchmarks
are used by marketing
people.
Michael: Very good answer.
For anybody interested in a bit more about Nile's architecture, Gwen gave a really good talk at PGConf.dev recently and the video just went up on YouTube, so I will put that in the show notes for anybody that wants a deeper dive there.
There was also a good talk I saw at a PgDay Paris event by Pierre Ducroquet, I'm not sure if I'm pronouncing that anywhere near correctly, all about multi-tenant database design, especially focusing on something we didn't focus much on today, which was the downsides of schema-per-tenant design, including things like observability and monitoring, which I thought was really fascinating.
So anybody considering going down that route, definitely check out that video, and I'll put that in the show notes as well.
So yeah, thank you so much, Gwen, I think we're out of time, it's been a real pleasure.
Gwen: It's been a pleasure.
Thank you for having me on.