Constraints
Michael: Hello and welcome to PostgresFM episode 75.
This is a weekly show about all things Postgres.
I'm Michael, founder of pgMustard.
This is my co-host Nikolay, founder of Postgres.ai.
Hello Nikolay, what are we talking about today?
Nikolay: Hi Michael, constraints.
Michael: Yeah.
And specifically the 6 DDL constraints that Postgres supports.
Nikolay: Yeah, all of them.
Michael: We're back to basics.
I love this kind of thing.
So nice choice.
Thank you.
Nikolay: Ah, it was my choice.
I'm writing this Postgres Marathon series of how-tos and considering
how to create various constraints without downtime.
Already covered checks and foreign keys and something else.
Saying this, those who follow me closely now understand when
exactly we record this podcast, right?
Okay, because I release these how-tos every day.
So, about constraints in the same order as the documentation describes
them. But also let's talk about practical complications when
you have a lot of data and a lot of TPS as usual, right?
Because the documentation actually doesn't cover these topics.
Michael: Yeah.
Another thing I don't think the documentation covers is why do
we have constraints?
I guess it's so obvious.
But I think it's worth mentioning that without these, we'd be
in real trouble database-wise, wouldn't we?
Nikolay: Well, yes.
So, constraints.
So we have a schema, right?
And without a schema... it's not good to live without a schema.
Let's say no to NoSQL in general.
Maybe in some cases, it's fine, but if it's financial data and
so on, you need structure to ensure that data has good quality.
And constraints is the next step.
So you have a schema, so you define column names, data types, and
constraints is an addition to all this to ensure even better quality
of data.
For example, you say no one can create two rows with the
same value in this column.
For example, email, and this is an interesting situation because
usually people forget about case sensitivity,
Michael: Right?
Nikolay: Yeah.
Of text or varchar.
And then you say if there is a row in this table referencing
for example user id.
It means that such user should
exist and so on and so forth.
So it's all about data quality, but also, sometimes, for foreign key
constraints, it provides some automation as well — I would say limited.
You have automation for handling, for example, deletes: should dependent
rows in the dependent table be deleted or not if the main row is deleted?
But this should be used with care: if you have a million dependent rows,
deleting 1 row might take a lot of time, and this is also not good.
Right?
Michael: Yeah, I feel like we've
already dived into a few specifics
around unique constraints and foreign
key constraints there.
But let's go.
Yeah, I think you're right.
I think the Postgres documentation
does cover them in a really
good order.
And it starts with check constraints,
which are super flexible
user-defined constraints, almost,
I'd say.
Is that a fair description?
Where we can choose, I think it
describes them as a boolean condition.
So it's a check that returns true
or false for each row that's
inserted or updated.
Nikolay: Yeah, check constraint
is usually very, how to say,
underappreciated.
It's underused in my opinion.
And unfortunately, there you can
define only some expression
related to this table.
You cannot involve different tables
and subqueries and so on.
So it limits the expressive
power of it.
Michael: Yeah, so we can define it like we can have a check
constraint on a column or we can
have it on multiple columns.
Nikolay: Multiple columns is okay.
You can say, for example, I don't
know, like the sum of these 2 columns
should be positive or something.
Some crazy stuff.
It's possible, definitely.
And this is good.
I mean, for example, if you say,
if you want to say this is integer,
but it should always be odd or
even or something like that, right?
You just define the constraint
that will be checked, if you try to insert
something which violates this constraint,
you will get an error.
And this is how you can achieve
better data quality.
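As a rough sketch of the kinds of checks being described here — the table and constraint names are hypothetical:

```sql
-- hypothetical table; names are for illustration only
CREATE TABLE payments (
    id       bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    debit    numeric NOT NULL,
    credit   numeric NOT NULL,
    quantity integer,
    -- multi-column check: the sum of two columns must be positive
    CONSTRAINT debit_credit_positive CHECK (debit + credit > 0),
    -- single-column check: the value must be even
    CONSTRAINT quantity_is_even CHECK (quantity % 2 = 0)
);

-- violating either expression makes the write fail with an error
INSERT INTO payments (debit, credit, quantity) VALUES (1, 2, 3);  -- fails: 3 is odd
```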
Michael: Yeah.
When you say it's underused or
underappreciated, what are the
kind of typical cases you see for
people?
Nikolay: Typical cases, people
rely on ORM and perform all checks
on...
So, the usual, like, 3 parts of the architecture —
front-end, back-end, and database —
usually people start with the front-end.
And this is fair, because you should
check a lot of stuff: all constraints should also be checked
on the front-end to minimize the feedback
loop.
Users should quickly see when something
is wrong.
Ideally before they make an action.
For example, filling in some
form, I would prefer seeing the constraint
violation and an explanation
of how to fix it before I press submit.
It's not good to press submit,
wait some time, and then have
some retries.
It's very annoying.
So constraint checks on frontend
make sense a lot.
But then people, if they use ORM,
they prefer checking it in
code because it's more flexible,
and those who write the logic
for Python, Ruby, anything, Java,
they prefer writing it right
there because it's their favorite
language.
But the thing is that if a company
grows, the project grows, and
then you start having different
users or applications, for example,
someone directly works with data
in some good UI, or you have
another application written in
different code, or in the same
code, but skipping these checks.
Implementation of constraints in
application code is weak because
it's not guaranteed.
Only the database can guarantee it.
That's why check constraints or
other types of constraints are
good to have in the database because
it's the safest way to safeguard,
right?
So you're on the safe side and
nobody will violate it unless
there is a bug.
Sometimes I saw some bugs and a unique
constraint violation happened.
It's good that, for example, with amcheck
we will soon have a unique
constraint corruption check.
Michael: Nice.
Nikolay: Yeah, but it's a different
story.
So check constraint, you just say
the expression always should
be followed, right?
It should always return true.
If it returns false, such an insert
or such an update should be discarded.
Rollback.
Right.
Michael: And is it worth
us discussing what to do, or how
to add one retroactively?
Let's say you listen to the podcast
and you realize you should
have some of these in place but
you don't and they're on quite
large tables.
How would you go about adding?
Nikolay: Yeah, actually let me
finish about this consideration
about front-end, back-end, and database
and relationships between
them.
I remember in 2004, I implemented
what's called MVC, right?
Model View Controller.
A very old architecture, maybe
not cool anymore, I don't know.
But what I implemented there, I
implemented a simple thing.
So we defined constraints in Postgres,
and then at bootstrap
of our application we analyzed
all constraints and built logic
to inject them into forms, also with
a digital signature, because
forms can be...
It was long ago, but it's interesting.
Forms — the frontend — followed constraints
from the database.
It was cool.
I think it's an interesting idea,
probably it should be also
rediscovered, I don't know.
So you just define constraints
where they should be defined in
the database, but then the frontend follows
exactly the same logic and
you don't need to implement it
twice because if you implement
it twice, you will have bugs, different
logic, right?
Michael: Yeah, well, and I've just
realized there's another reason
which is handling concurrent sessions.
So you might have a constraint.
Let's say you have like an amount
of stock of an item and
it needs to not go below 0 or an
account balance that needs to
not go below 0.
If you have concurrent transactions,
you need them at the database
level to make sure you don't end
up letting a user take out more
money than they have.
Nikolay: Exactly.
You cannot check if such a row exists
and then make a decision
outside of the database.
You need to make a decision inside
the database.
Yes.
But also, what I'm talking about
is that the main constraints should
be in the database, but you can mirror
them in the back-end and front-end, and you
can have automation. It would be good
if somebody who develops ORMs
or GraphQL or something followed
this approach, considering
database-side constraints as the
main ones.
So, check constraints are not only quite flexible —
not super flexible, but quite flexible —
they also have this beautiful option to
be created in an online fashion,
so to speak.
Yeah, NOT VALID.
So you say NOT VALID — quite a confusing
term.
Michael: Yes, very.
Nikolay: Yes, so things to remember.
When you create something not valid,
it means that it's already
being validated for all new writes.
So this is super confusing.
Michael: So it's kind of not validated
on existing data.
Nikolay: So you cannot create a NOT
VALID constraint and then insert
something that violates it.
This write will produce an error.
What NOT VALID changes: if you don't add
this flag when you create a check
constraint, it just triggers a long-lasting
operation, a full table scan, to
check that all existing rows follow
the logic of the constraint.
Michael: Which is the default behavior
when you add a new constraint.
Nikolay: So if you don't think
about it and just try to create
a check constraint for an existing
large table, it will block
DDL and DML, everything basically,
to this table, and it's not
fun.
So what you should do, if you want
to do it without partial downtime,
is to do it in an online fashion.
You create, so 3 steps, not 2,
3 actually.
My how-to yesterday was not full.
So first you create it with the not
valid flag.
Second, you understand that all
new writes are already being verified
automatically by Postgres.
You yourself take care of existing
rows.
You check existing rows are okay
with this constraint.
Just with simple selects and if
you find some rows that violate
it, you probably want to delete
them or update them to adjust
the values depending on your application
logic or on your logic.
And then only then the third step,
alter table validate constraint,
which will scan the whole table but
this step won't acquire locks
that would block your DML.
It will block only DDL, but hopefully,
you don't alter during
this.
You don't issue any DDLs.
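A minimal sketch of those three steps, assuming a hypothetical large table orders with an amount column:

```sql
-- Step 1: add the constraint with NOT VALID (fast; only new writes are checked from now on)
ALTER TABLE orders
    ADD CONSTRAINT amount_positive CHECK (amount > 0) NOT VALID;

-- Step 2: find existing rows that would violate it, and fix or delete them
-- (NULLs pass a CHECK, so only non-positive amounts are violations here)
SELECT count(*) FROM orders WHERE amount <= 0;

-- Step 3: full table scan, but it blocks only DDL, not DML
ALTER TABLE orders VALIDATE CONSTRAINT amount_positive;
```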
Michael: That middle step's nice,
and you can even do it in batches,
I guess, if you want or need to.
But I guess it shouldn't be a big
deal.
Nikolay: Yeah, it depends.
But maybe scanning the whole table can also
be fine, because it's just a select,
right?
Well, if you update, yes, in batches,
if you found many millions
of rows that violate it.
But this is, it depends, but this
three-step approach is very,
like, universal, zero downtime approach
and it's great.
Michael: Well, what's the downside
of jumping straight to step
3?
Because you're kind of doing that
if you think your data is fine.
I guess is it in the real world
you're most likely to have some...
Nikolay: Jump, if you want, you
just need to accept this risk
and that's it, of course.
If you are 100% sure, step 2 is
optional, let's say.
But also, as usual, if you issue
an alter with the not valid, you also
need to set lock_timeout and retry.
Because you still need to change
the metadata of the table.
And if, for example, autovacuum
is running in its transaction ID wraparound
prevention mode, processing your
table, you won't be able to
acquire the lock, and without lock_timeout
and retries logic
you will start waiting, and again,
in this case, you will block
everyone. Even if it's NOT VALID,
it's not good.
So retries are needed, and a graceful
alter is needed.
I wish there was such an option, like
"graceful", where you say how
many retries and how long the lock timeout
for a particular operation should
be.
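There is no built-in "graceful" option, but a rough sketch of the low lock_timeout plus retries idea could look like this, here as an anonymous DO block (table and constraint names are hypothetical; real migration tooling usually retries in separate transactions from application or script code):

```sql
DO $$
DECLARE
  attempt int := 0;
  done    boolean := false;
BEGIN
  WHILE NOT done AND attempt < 10 LOOP
    attempt := attempt + 1;
    BEGIN
      -- fail fast instead of waiting and making everyone queue up behind us
      SET LOCAL lock_timeout = '2s';
      ALTER TABLE orders ADD CONSTRAINT amount_positive CHECK (amount > 0) NOT VALID;
      done := true;
    EXCEPTION WHEN lock_not_available THEN
      RAISE NOTICE 'lock timeout on attempt %, retrying', attempt;
      PERFORM pg_sleep(1);  -- short pause before the next attempt
    END;
  END LOOP;
  IF NOT done THEN
    RAISE EXCEPTION 'could not add constraint after % attempts', attempt;
  END IF;
END $$;
```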
Michael: That would be a nice word,
like instead of concurrently
it could be like gracefully.
Alter table gracefully.
Nikolay: Or concurrently, something
like that.
Because I think 99% don't think
about it until they have many
thousands of TPS, and then they realize
it during some basic operation.
It was always working fine.
Sometimes probably not, but people
are like, you know, okay, we had
an issue lasting 30 seconds, something
was not right, but it's
okay.
And then we don't understand why,
right?
I mean, you blocked everyone dealing
with this table for 30 seconds,
for example. But it's kind of fine,
and we live with it until it starts
annoying you too much, and then
you realize that you need a
low lock_timeout and retries.
I mean, it requires effort,
unfortunately, to have this, right?
And if you're small, you don't
care, but I wish it would be easier,
like something similar to create
indexes concurrently or refresh
materialized view concurrently.
So also, when you validate, if,
for example, some DDL — or something
DDL-like such as CREATE INDEX CONCURRENTLY —
is happening, or autovacuum is
processing the table, you won't be able to
acquire this lock, so you also
need to be careful.
But in general, if you start
waiting there, it's kind of fine.
It just makes your operation longer,
but at least no transactions
performing DML operations are waiting behind
you in line, right?
So I mean, this is also an issue
with this final step, but it's
not so harmful as in the first
step when you need to instantly
inject this constraint
with a not valid flag.
Yeah, so I think we covered it,
right?
So let's move on.
Michael: The next one in the documentation
is
Nikolay: not null.
I think this should be hidden behind
some, I don't know, concurrently
or gracefully.
That would
Michael: be awesome.
The next one in the documentation
is the not null constraint, which
the documentation points out is
probably one of the ones people
are most familiar with seeing in
schema definitions.
But it's just a special case of
a check constraint, which I hadn't
thought of before.
Nikolay: Right.
But unfortunately, you cannot,
well, you can already, but if
you have check constraint NOT NULL,
saying this column is not
null, like logically it's the same
as the standard NOT NULL.
Same.
But primary key needs the latter,
right?
It cannot use...
But it can.
Since Postgres 12, if you don't
have NOT NULL constraint and
you define primary key or redefine
it, it will try to implicitly
create NOT NULL constraint.
And since Postgres 12, when you
create a NOT NULL constraint and
you already have a validated check constraint
saying the column IS NOT NULL,
it will just reuse it, skipping
the full table scan, which is a very
good optimization.
So you just create check constraint
in this three-phase or two-phase
approach, as we just discussed.
And then you can rely on it when
you're creating primary key,
for example.
Michael: Or
Nikolay: you can define the NOT NULL
constraint explicitly if you
need it for a primary key or anything
else — I don't know, maybe your
application wants a regular NOT NULL —
relying on the existing CHECK (... IS NOT NULL).
And then you can drop the check;
NOT NULL is still there, and you
skipped this unfortunate full table
scan.
I'm telling this because for
NOT NULL itself, Postgres doesn't
support the 3-step or 2-step approach for its creation.
If you want to create NOT NULL right
away on an existing table, an existing
column, it will need to scan the whole
table.
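A sketch of the trick just described, on a hypothetical users.email column (Postgres 12+): a validated check constraint lets SET NOT NULL skip the full table scan, and it can be dropped afterwards.

```sql
-- 1. online-style check constraint: NOT VALID first, then VALIDATE (as discussed above)
ALTER TABLE users ADD CONSTRAINT email_not_null CHECK (email IS NOT NULL) NOT VALID;
ALTER TABLE users VALIDATE CONSTRAINT email_not_null;

-- 2. on Postgres 12+ this reuses the validated check and skips the full table scan
ALTER TABLE users ALTER COLUMN email SET NOT NULL;

-- 3. the helper check constraint is no longer needed
ALTER TABLE users DROP CONSTRAINT email_not_null;
```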
Michael: So we don't have NOT NULL,
NOT VALID.
Yeah.
Nikolay: So yeah, these are
some nuances to keep in mind.
Generally, my recommendation is
to think more about check constraints.
This is why I say they are underappreciated.
They are good, and, you see,
here they support NOT NULL constraint
creation.
Since Postgres 12, not before.
But it means all currently supported
Postgres versions.
Michael: Yeah, true.
Nikolay: 12 is already the oldest.
Michael: Let's move on.
Nikolay: Yeah, unique constraint.
Unique constraint, this is interesting.
I don't know how much detail we
should cover here.
Before our recording, we discussed
the case I discovered in 2017
and still saw in Postgres 16.
So, unique constraint, physically
it relies on unique index,
but it's a kind of implementation
detail.
You can say I want a unique constraint
and Postgres will create
unique index implicitly.
And it's good that it can be done
concurrently, of course, right?
Because indexes can be created
concurrently, which is good.
That's it, basically.
You create a constraint, but I think
you can say using, right?
Create unique constraint using
some index if the index already
exists.
Or what?
Yeah, I don't remember in detail.
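The syntax being half-remembered here does exist: build the unique index concurrently first, then attach a constraint to it. A sketch with hypothetical names:

```sql
-- build the index without blocking writes
CREATE UNIQUE INDEX CONCURRENTLY users_email_uniq_idx ON users (email);

-- attach a unique constraint to the existing index
-- (the index is renamed to match the constraint name)
ALTER TABLE users
    ADD CONSTRAINT users_email_uniq UNIQUE USING INDEX users_email_uniq_idx;
```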
But what I do know is that although
unique constraint relies
on an index, unique index, it's not
absolutely the same.
Logically, again, it should be
kind of the same, but you can
have an index without a constraint,
not vice versa.
You cannot have a unique constraint
without a unique index because
Postgres needs a unique index to
support validation checks.
So imagine we created a unique index,
but we haven't created a unique
constraint.
One place where you can see a constraint
is needed is insert on conflict,
right?
If I'm not mistaken, right?
Michael: Yeah, I think so.
Nikolay: Yeah.
So on conflict requires a constraint
to be present.
And if you say on conflict, like,
for example, do nothing: if you
have an index without a constraint,
you cannot say INSERT blah blah
ON CONFLICT ON CONSTRAINT with this name,
because the constraint does not exist —
it will tell you explicitly the constraint
does not exist.
But at the same time, Postgres
has an interesting, I think it's
a logical bug, still not fixed, and
I reported it in 2017, and
today I checked in Postgres 16,
it has it still.
If you try to insert multiple rows
that violate uniqueness, it will explicitly say
that the constraint — and it will use
the index name — is violated.
So in one case it says there is no
such constraint, and in another
case it says this constraint exactly
with the same name is violated.
Okay, so inconsistency.
I think it's just a bug that needs
to be fixed and that's it.
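Roughly, the inconsistency looks like this — hypothetical table, and the error messages are paraphrased from memory:

```sql
CREATE TABLE t (email text);
CREATE UNIQUE INDEX t_email_idx ON t (email);

-- ON CONFLICT ON CONSTRAINT refuses the index name:
INSERT INTO t VALUES ('a@b.c')
ON CONFLICT ON CONSTRAINT t_email_idx DO NOTHING;
-- ERROR: constraint "t_email_idx" for table "t" does not exist

-- ...but a duplicate key error reports the very same name as a "constraint":
INSERT INTO t VALUES ('a@b.c'), ('a@b.c');
-- ERROR: duplicate key value violates unique constraint "t_email_idx"
```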
Michael: That's funny.
I'll link up the bug report as
well.
Nikolay: Yeah, but honestly, from
a user perspective, I think
it would be good to stop thinking
about unique constraints and
unique indexes as something very
different.
I cannot imagine the case when
we, like, they should go together,
I think.
I cannot imagine, like, we have
a unique index but why don't we
have a unique constraint in this
case?
I think they should go together
all the time. That's it.
Michael: It makes sense.
Nikolay: In this case, there would
not be any inconsistencies
if the constraint...
When I create a unique constraint,
a unique index is created implicitly.
Okay, but why not vice versa?
When I create a unique index, why
Postgres doesn't create a unique
constraint?
I have no answer for this.
Michael: You can't create a unique
constraint not valid as well,
can you?
So there's no difference there.
Nikolay: I don't think so.
Michael: The one thing that has,
like, changed in recent versions —
Nikolay: As always, maybe I'm wrong.
I think I'm not wrong here.
Michael: I don't think so.
The one thing that has changed with
these in the last couple of
years is in Postgres 15, we got
this nulls not distinct option.
Which I still don't, I'd love to
hear from people that have good
use cases for these, but it allows
you to specify that you can
only allow a single null value
rather than multiple null values.
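For reference, the Postgres 15+ syntax on a hypothetical table:

```sql
-- by default NULLs are treated as distinct, so any number of NULL emails is allowed;
-- with NULLS NOT DISTINCT, at most one NULL is allowed
ALTER TABLE users
    ADD CONSTRAINT users_email_uniq UNIQUE NULLS NOT DISTINCT (email);
```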
Nikolay: You know, null is the
biggest problem in the SQL model
now, right?
We discussed it.
Michael: We have a whole episode
on it.
Nikolay: Right.
But why I think people might want
null as like, according to
this, how is it called, ternary
logic, so three-value logic, true,
false, unknown, right?
According to this logic, null means
unknown, and comparing one
unknown to another unknown, you
cannot conclude they are the
same.
You always say they are not the
same.
So the comparison should always
yield unknown, so basically
another null.
Mixing nulls and unknowns is another
topic.
This means that a unique index, unlike
a primary key, of course,
a unique constraint, a unique key,
let's also introduce the term
unique key because the SQL standard
doesn't follow this term.
Unique key, unlike primary key,
allows nulls in the column or
in multiple columns if it's a multi-column
index or constraint.
But since we don't know if they
are the same or not, we can allow
multiple nulls.
But historically, it's exactly because
of the problems with manipulation
of large tables and so on.
For example, before Postgres 11,
if we add a new column and we want a default,
it's like a full table rewrite —
we cannot do it.
It was fixed in Postgres 11 — well, not
"fixed": a great feature that you can define,
like, a virtual default, right?
But before that, we cannot say a default —
I don't want a full
table rewrite — so I say, okay,
I will consider null as
false.
Just in my application.
All right?
So, before Postgres 11, null would
be my false, and true would
be my true.
In this case, I'm breaking theoretical
concepts here.
Null should not be considered as
false.
It's not right.
But I just don't want this operation
to be such a nightmare —
I have a billion rows.
So null will be my false.
And this leads me to the idea that I
want null to be like one value in the index.
That's why.
So to avoid the long heavy operations,
I give null a special
meaning, not as it was supposed
to have.
This is practice, it's not theory,
right?
This is usual, like in many industries,
theory was very good,
we developed great concepts, and
then you go to construction,
for example, oh, this is how it's
used.
We didn't expect.
This is how nulls are used.
Some people give nulls a special
meaning, and they consider
it as a normal value.
In this case, they might want an index
or constraint to say there
should be only 1 row if the value is null.
This is my understanding — maybe there are many
other understandings, but this
is what I have from my practice.
Michael: Cool.
In fact, you mentioned in passing
there one other important thing
about unique constraints: they
can be defined across multiple
columns in the table as well.
Single columns are really
common, but you can do it across
multiple as well.
Nikolay: Okay.
Should we move on?
Primary key
Michael: being a special case again.
Nikolay: Primary key, okay.
So not null plus unique constraint
basically.
This pair gives you a primary key
but there can be only one primary
key.
So I think we already covered it.
You need not-null.
For not-null, you probably need
it implicitly or explicitly —
you can do it yourself or just
rely on it being created when the primary key is
defined or redefined.
For an existing large table, right?
For small tables, no problem.
Also, by the way, just this morning
we had a discussion, for
example, creating this concurrently,
should we use it for new
tables?
In my opinion, no.
Because with CREATE INDEX CONCURRENTLY,
or these multi-step operations,
you lose the good,
beautiful property Postgres has —
transactional DDL and the ability to
pack everything into a single
transaction.
If you just define the table, follow
the normal approach, and don't care
about this zero-downtime stuff,
you will have a single transaction,
right?
All or nothing.
Michael: Same for tiny tables.
Anything will work.
Nikolay: Yeah, maybe like less
than 10,000 rows you don't care
about.
It takes like 100 milliseconds.
Let's go.
You have single step, it's atomic,
great.
But if you have large tables, you
need to redefine the primary key
and convert int4 to int8, for
example.
Michael: That's the big 1, yeah.
Nikolay: Yeah, my team and I implemented
all types of this
operation and we helped multiple
companies, a few billion dollar,
multi-billion dollar companies,
public companies, we helped them
to convert int4 to int8.
I know a lot of interesting stuff
around it, but in general,
you just need not null, and
we discussed the tricks — the
Postgres 11 trick, like default -1
plus not null, which you can define right away,
virtually, right?
You don't even need a check constraint
there. But since Postgres 12 we
rely on a check constraint, and we
remember a primary key needs an actual
not null. And we also create a unique
index, and when we create the primary
key, we say: using this index.
This allows us to put primary key
creation as the final step into
a transaction, which will probably
swap something, right?
Yeah.
Rename columns as well.
And there, of course, you also need
to think about lock acquisition,
retries, low lock timeout, like
all this stuff.
And it's, of course, if you have
many, many, many, I don't know,
like gigabytes, dozens, hundreds,
maybe terabytes, and a lot
of TPS.
You need to engineer this carefully.
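A very rough sketch of the final swap step only, assuming NOT NULL is already in place and a unique index on the new column was built concurrently beforehand (names are hypothetical, and the real int4-to-int8 migrations involve much more, including foreign keys):

```sql
-- done earlier, without long locks:
--   * id_new is NOT NULL (via the check constraint trick)
--   * CREATE UNIQUE INDEX CONCURRENTLY orders_id_new_uniq ON orders (id_new);

BEGIN;
SET LOCAL lock_timeout = '2s';                   -- plus retries around the whole transaction
ALTER TABLE orders DROP CONSTRAINT orders_pkey;  -- foreign keys referencing it need separate handling
ALTER TABLE orders
    ADD CONSTRAINT orders_pkey PRIMARY KEY USING INDEX orders_id_new_uniq;
-- column renames / swaps would also go here
COMMIT;
```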
Michael: I mean, this is probably the most
involved of all the things
involved of all the things
we're talking about.
And I think it's probably too big
to cover today.
But there is a really good talk
by Robert Treat that I saw that
covers this in about half an hour
in depth for people that actually
have to do this kind of thing.
Nikolay: I'm sure it's not possible
to cover everything in half
an hour because there are several
methods.
They have pros and cons, and there
are many, many nuances — for example,
foreign keys.
If you redefine primary key, you
need to deal with foreign key
redefinition.
And it's also interesting.
And autovacuum running in transaction
ID wraparound prevention mode can
block you.
And also if you decide to mark
foreign key as not valid and then
you realize it blocks writes.
Or you just forgot to drop old
foreign key and new rows after
switch.
So a lot of mistakes.
Michael: Yeah, please use bigints,
or int8, in your
new tables.
Right away.
Nikolay: Or UUID version 7, 8.
Yeah,
Michael: Or UUIDs, yeah.
Cool.
Foreign keys?
Or actually, one last thing, is it
worth discussing, like, multiple,
you can have multiple column primary
keys?
I guess that's obvious from the
multiple column unique ones as
well.
But yeah, foreign keys.
Nikolay: I don't think it's anything
very different.
Yeah, so just you need to have
not null on each column participating
in primary key definition.
That's it.
Unlike unique keys, of course.
So foreign keys involve two tables.
And creation of a foreign key requires
several locks to be acquired
on both sides, and a full table scan
of both tables to ensure that
values in the referencing table
are present in the
referenced table.
So in this case, if you just don't
care and take the brute-force
approach — just create it the way the
documentation defines, that's
it — well, you have an issue, because
you are going to block...
probably not, I think the lock level
there is not as bad as in the
previous cases we discussed, check
constraints, but you're going
to block DDL for sure.
Probably DML won't be blocked... or
it will be blocked.
Yeah, probably it will be blocked
as well.
I don't remember details here,
although I wrote it a few hours
ago.
Check out my how-to.
I specified all the locks and what
you're going to block.
But in general, you should care
about it as well.
And generally, under load, you
shouldn't want to create foreign
key in one step.
You need, again, three steps.
First, creation with not valid,
with retries and low timeout.
Then you need to check — again,
like with checks, Postgres will
start checking new writes, inserted
and updated rows, or
deleted ones in this case as well, because
if you...
Michael: Very important, yeah.
Nikolay: Yeah, yeah, yeah.
And you need an index, but it's
another story.
So, it will start checking to validate
this constraint for new
writes, but for existing rows,
we're still not sure.
So optional step two is to validate
and fix if you see problems,
potential violation, right?
And then third step, you say alter
validate constraint.
Again, understanding that ongoing
index creation or recreation
or vacuum can block you.
So yeah, also three steps.
I explained in detail in my latest
marathon post.
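The same three-step shape for a foreign key, with hypothetical tables:

```sql
-- Step 1: add the FK as NOT VALID (with low lock_timeout and retries, as before)
ALTER TABLE posts
    ADD CONSTRAINT posts_user_id_fk
    FOREIGN KEY (user_id) REFERENCES users (id) NOT VALID;

-- Step 2 (optional): look for existing rows that would violate it, and fix them
SELECT count(*)
FROM posts p
WHERE p.user_id IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM users u WHERE u.id = p.user_id);

-- Step 3: validate without blocking DML on either table
ALTER TABLE posts VALIDATE CONSTRAINT posts_user_id_fk;
```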
So what else to say here?
Foreign keys can be marked as —
how is it called?
Deferred, right?
I mean, deferrable constraints,
right?
So when you have a complex transaction,
you might want foreign
keys to be checked at commit time
later, not at each statement
time, later.
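For reference, a deferrable foreign key looks roughly like this (hypothetical names):

```sql
ALTER TABLE posts
    ADD CONSTRAINT posts_user_id_fk
    FOREIGN KEY (user_id) REFERENCES users (id)
    DEFERRABLE INITIALLY IMMEDIATE;

BEGIN;
-- postpone the check to COMMIT time for this transaction only
SET CONSTRAINTS posts_user_id_fk DEFERRED;
-- ... statements that temporarily violate the FK ...
COMMIT;  -- the check happens here
```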
And in this case, I don't remember
the details again.
Several years ago we worked with Miro
and had an issue with using pg_repack
to fight bloat, and they used deferred
constraints.
So there is an article about it —
how to use pg_repack if you have
deferred constraints.
It explains a lot of details;
I will provide a link.
Miro Engineering wrote it a
few years ago.
It was interesting.
Also an issue to address.
What else?
I think that's it.
Michael: I think on delete cascade
is worth mentioning.
Like you can define at the point
of constraint creation, what
you want to happen if, like if
you have a, a really common example
is a table of blogs and a table
of blog posts.
If you delete the blog, do you
want the blog posts to be deleted?
Like what do you want to happen
in those cases?
And that's the case where I think
it's really important to mention
that, whilst when we define
a primary key we get an index,
for foreign keys defined on a table,
the referencing column is
not necessarily indexed — Postgres doesn't check
that it's indexed as well.
In fact, I've even read a blog
post saying you shouldn't always
index that column.
But I think the cases that you
should far outweigh the cases
where you shouldn't.
It is worth checking you do have
indexes on those, so that
those deletes on cascade are efficient.
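A sketch of the blog / blog-posts example, including the index on the referencing column that makes cascading deletes efficient (names hypothetical):

```sql
CREATE TABLE blogs (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
);

CREATE TABLE blog_posts (
    id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    blog_id bigint NOT NULL REFERENCES blogs (id) ON DELETE CASCADE
);

-- without this, every DELETE FROM blogs has to scan blog_posts sequentially
CREATE INDEX CONCURRENTLY blog_posts_blog_id_idx ON blog_posts (blog_id);
```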
Nikolay: Right.
I always try to avoid using this
option, but I see people in
quite large databases under a large
load use this deletion propagation
or update propagation logic.
So 2 things here, you're right
about indexes.
And the Postgres DBA toolkit I have
and the postgres-checkup tool have
reports to find this: you
have a foreign key, there is an index or
primary key on one side, but you don't
have an index on the other side.
So when you need to delete —
you probably won't need to delete
every day, but at some point if
you have to — it will be a sequential
scan.
Very bad, very slow.
But this is only one thing.
What if you're deleting a blog
which has a million posts?
I don't like it.
I don't like this propagation.
If we know that only low volumes —
a low number of rows — will be
automatically deleted, it's okay.
But if it's an unpredictable number
of rows, I would prefer having
my own logic with asynchronous
propagation of change, with some
batches and so on.
Michael: Makes sense.
Nikolay: But I see people use it
on a quite large scale.
Michael: Makes a lot of sense.
Cool, I think we're down to the
last one.
We've made it to exclusion constraints.
Nikolay: Right, I always confuse
exclusion constraints with constraint
exclusion, which was related to
partitioning.
But exclusion constraints are for
time ranges, like intervals,
and you want to define, you want
to say my intervals should not
overlap.
So it's kind of advanced uniqueness,
right?
Maybe not uniqueness, but it's,
like, for special data, for GiST
and SP-GiST indexes and so on.
So you define, I say, I'm building
some schedule.
I cannot allow overlapping, or
I'm like describing some, I don't
know, like 3D world and I have
various shapes and I don't want
them to overlap.
In this case, GIST and its variation,
SP-GiST, will support this constraint
and ensure that no, like, no balls,
for example, or no cubes
or something overlap in that space.
So, Postgres supports this kind
of thing.
Which is good, but it's quite a narrow
use case for me.
I don't see it often.
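A small sketch of the scheduling case (hypothetical table; btree_gist is needed so the plain equality on room can live in the same GiST index as the range overlap check):

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_bookings (
    room   int,
    during tstzrange,
    -- no two bookings for the same room may have overlapping time ranges
    EXCLUDE USING gist (room WITH =, during WITH &&)
);
```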
Michael: Yeah, very cool.
I've heard, I think I've seen it
being used in examples where
People are designing room booking
apps or things like that.
But no, not used it myself.
In fact, it's a tiny little documentation
entry, isn't it?
So it's
Nikolay: time and space.
Yes.
But yeah, the documentation is very
brief.
And here, I honestly don't have
anything to say.
It's supported by some index, I
guess, right?
So an index you create with concurrently,
usually, if you want
0 downtime approach.
I'm not sure if exclusion constraint
can be created based on
an existing index.
It should be possible.
I have no idea.
Michael: It says, yeah, good point.
I don't know.
Let us know in the comments.
Nikolay: Yeah, but in general,
I think we covered pretty well
the 6 types of constraints — at
least 5 types of constraints we
covered — and how to create all
of them without downtime.
Just for this final part, let's
think how to drop them.
Well, I think dropping is straightforward
usually.
Yeah.
If it's like unique constraint,
which is based on an index, you
probably just drop an index with
concurrently, right?
Yeah.
Others, you just drop them, but
you need a low lock timeout and
retries to be involved, because
Postgres needs to change table
metadata, and during this, if it's
blocked, it's not good.
So again, low lock timeout and retries
are going to help here.
Michael: What about altering?
Like, I saw a good blog post
where people took one of the
use cases for check constraints —
for example, we often say
use text in Postgres.
It's in a lot of recommendations
to use text and not be kind
of constrained on length.
Nikolay: Not varchar, not char.
Michael: Exactly, yeah.
Nikolay: This, strictly speaking,
is not a constraint, right?
But in a broader sense, it's a constraint.
But it's- Oh, but
Michael: you could implement it
with a constraint.
Nikolay: Oh, exactly.
Yeah, and I like it because you
control exactly how and when
it will be verified.
And you know, again, this is exactly
why, again, check constraints
are underappreciated.
And recently, I indeed prefer using
just text without limits,
not varchar(n), but just text.
And then if I need to limit, I
use check constraints, I know how
to set it up without downtime.
And I know how to change it if
I need to increase or decrease
this limit, I know how to do it,
right?
And it's good.
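A sketch of that pattern on a hypothetical column, using the same online approach as before:

```sql
-- text column without a type-level limit
ALTER TABLE users ADD COLUMN bio text;

-- enforce the length limit with a check constraint instead of varchar(n)
ALTER TABLE users
    ADD CONSTRAINT bio_max_length CHECK (length(bio) <= 200) NOT VALID;
ALTER TABLE users VALIDATE CONSTRAINT bio_max_length;
```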
Michael: But yeah, I think we only
lose one thing.
I think we only lose 1 benefit,
or like 1 optimization —
I'm not sure if there
is this optimization for check
constraints.
Let's say we want to relax the
constraint from 200 characters
to 300 characters.
I think if you're using varchar,
Postgres will handle that
knowing it doesn't have to recheck
existing data.
But that's the only downside.
Nikolay: If you increase n in varchar,
it won't rewrite.
And it won't scan it, because obviously
if you just increase
the limit, existing rows already
are okay with this.
So yeah, I agree.
And in the case of check constraint,
I'm not sure if we have
any...
Michael: I don't think so.
I've not heard of one.
But we can do the three-step process
you mentioned.
In fact, we can probably add another
constraint and then drop
the existing.
Nikolay: Exactly.
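So relaxing the limit could look roughly like this, reusing the hypothetical names from above:

```sql
-- add the relaxed constraint first, validate it, then drop the old one
ALTER TABLE users
    ADD CONSTRAINT bio_max_length_300 CHECK (length(bio) <= 300) NOT VALID;
ALTER TABLE users VALIDATE CONSTRAINT bio_max_length_300;
ALTER TABLE users DROP CONSTRAINT bio_max_length;
```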
Michael: Yeah, great.
Cool.
Thanks so much, Nikolay.
Thanks, everyone, for listening.
Catch you next week.
Nikolay: Yeah.
See you later.