Constraints
Michael: Hello and welcome to PostgresFM episode 75.
This is a weekly show about all things Postgres.
I'm Michael, founder of pgMustard.
This is my co-host Nikolay, founder of Postgres.ai.
Hello Nikolay, what are we talking about today?
Nikolay: Hi Michael, constraints.
Michael: Yeah.
And specifically the 6 DDL constraints that Postgres supports.
Nikolay: Yeah, all of them.
Michael: We're back to basics.
I love this kind of thing.
So nice choice.
Thank you.
Nikolay: Ah, it was my choice.
I'm writing this Postgres Marathon series of how-tos and considering
how to create various constraints without downtime.
Already covered checks and foreign keys and something else.
Saying this, those who follow me closely now understand when
exactly we record this podcast, right?
Okay, because I release these how-tos every day.
So, about constraints in the same order as the documentation describes
them. But also let's talk about practical complications when
you have a lot of data and a lot of TPS as usual, right?
Because the documentation actually doesn't cover these topics.
Michael: Yeah.
Another thing I don't think the documentation covers is why do
we have constraints?
I guess it's so obvious.
But I think it's worth mentioning that without these, we'd be
in real trouble database-wise, wouldn't we?
Nikolay: Well, yes.
So, constraints.
So we have a schema, right?
And without a schema... it's not good to live without a schema.
Let's say no to NoSQL in general.
Maybe in some cases, it's fine, but if it's financial data and
so on, you need structure to ensure that data has good quality.
And constraints is the next step.
So you have a schema, so you define column names, data types, and
constraints is an addition to all this to ensure even better quality
of data.
For example, you say no one can create two rows with the
same value in this column.
For example, email, and this is an interesting situation because
usually people forget about case sensitivity,
Michael: Right?
Nikolay: Yeah.
Of text or varchar.
And then you say if there is a row in this table referencing
for example user id.
It means that such user should
exist and so on and so forth.
So it's all about data quality, but also, sometimes, for foreign key
constraints, it provides some automation as well — I would say limited.
You have automation for handling, for example, deletes: should dependent
rows in the dependent table be deleted or not if the main row is deleted?
But this should be used with care: if you have a million dependent rows,
deleting 1 row might take a lot of time, and this is also not good.
Right?
Michael: Yeah, I feel like we've
already dived into a few specifics
around unique constraints and foreign
key constraints there.
But let's go.
Yeah, I think you're right.
I think the Postgres documentation
does cover them in a really
good order.
And it starts with check constraints,
which are super flexible
user-defined constraints, almost,
I'd say.
Is that a fair description?
Where we can choose, I think it
describes them as a boolean condition.
So it's a check that returns true
or false for each row that's
inserted or updated.
Nikolay: Yeah, check constraint
is usually very, how to say,
underappreciated.
It's underused in my opinion.
And unfortunately, there you can
define only some expression
related to this table.
You cannot involve different tables
and subqueries and so on.
So it limits the expressive
power of it.
Michael: Yeah, so we can define it like we can have a check
constraint on a column or we can
have it on multiple columns.
Nikolay: Multiple columns is okay.
You can say, for example, I don't
know, like the sum of these 2 columns
should be positive or something.
Some crazy stuff.
It's possible, definitely.
And this is good.
I mean, for example, if you say,
if you want to say this is integer,
but it should always be odd or
even or something like that, right?
You just define the constraint
that will be checked, if you try to insert
something which violates this constraint,
you will get an error.
And this is how you can achieve
better data quality.
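As a rough sketch of the kinds of checks being described here — the table and constraint names are hypothetical:

```sql
-- hypothetical table; names are for illustration only
CREATE TABLE payments (
    id       bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    debit    numeric NOT NULL,
    credit   numeric NOT NULL,
    quantity integer,
    -- multi-column check: the sum of two columns must be positive
    CONSTRAINT debit_credit_positive CHECK (debit + credit > 0),
    -- single-column check: the value must be even
    CONSTRAINT quantity_is_even CHECK (quantity % 2 = 0)
);

-- violating either expression makes the write fail with an error
INSERT INTO payments (debit, credit, quantity) VALUES (1, 2, 3);  -- fails: 3 is odd
```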
Michael: Yeah.
When you say it's underused or
underappreciated, what are the
kind of typical cases you see for
people?
Nikolay: Typical cases, people
rely on ORM and perform all checks
on...
So, the usual, like, 3 parts of the architecture —
front-end, back-end, and database —
usually people start with the front-end.
And this is fair, because you should
check a lot of stuff: all constraints should also be checked
on the front-end to minimize the feedback
loop.
Users should quickly see when something
is wrong.
Ideally before they make an action.
For example, filling in some
form, I would prefer seeing the constraint
violation and an explanation
of how to fix it before I press submit.
It's not good to press submit,
wait some time, and then have
some retries.
It's very annoying.
So constraint checks on frontend
make sense a lot.
But then people, if they use ORM,
they prefer checking it in
code because it's more flexible,
and those who write the logic
for Python, Ruby, anything, Java,
they prefer writing it right
there because it's their favorite
language.
But the thing is that if a company
grows, the project grows, and
then you start having different
users or applications, for example,
someone directly works with data
in some good UI, or you have
another application written in
different code, or in the same
code, but skipping these checks.
Implementation of constraints in
application code is weak because
it's not guaranteed.
Only the database can guarantee it.
That's why check constraints or
other types of constraints are
good to have in the database because
it's the safest way to safeguard,
right?
So you're on the safe side and
nobody will violate it unless
there is a bug.
Sometimes I saw some bugs and a unique
constraint violation happened.
It's good that, for example, with amcheck
we will soon have a unique
constraint corruption check.
Michael: Nice.
Nikolay: Yeah, but it's a different
story.
So check constraint, you just say
the expression always should
be followed, right?
It should always return true.
If it returns false, such an insert
or such an update should be discarded.
Rollback.
Right.
Michael: And is it worth
us discussing what to do, or how
to add one retroactively?
Let's say you listen to the podcast
and you realize you should
have some of these in place but
you don't and they're on quite
large tables.
How would you go about adding?
Nikolay: Yeah, actually let me
finish about this consideration
about front-end, back-end, and database
and relationships between
them.
I remember in 2004, I implemented
what's called MVC, right?
Model View Controller.
A very old architecture, maybe
not cool anymore, I don't know.
But what I implemented there, I
implemented a simple thing.
So we defined constraints in Postgres,
and then at bootstrap
of our application we analyzed
all constraints and built logic
to inject them into forms, also with
a digital signature, because
forms can be...
It was long ago, but it's interesting.
Forms — the frontend — followed constraints
from the database.
It was cool.
I think it's an interesting idea,
probably it should be also
rediscovered, I don't know.
So you just define constraints
where they should be defined in
the database, but then the frontend follows
exactly the same logic and
you don't need to implement it
twice because if you implement
it twice, you will have bugs, different
logic, right?
Michael: Yeah, well, and I've just
realized there's another reason
which is handling concurrent sessions.
So you might have a constraint.
Let's say you have like an amount
of stock of an item and
it needs to not go below 0 or an
account balance that needs to
not go below 0.
If you have concurrent transactions,
you need them at the database
level to make sure you don't end
up letting a user take out more
money than they have.
Nikolay: Exactly.
You cannot check if such a row exists
and then make a decision
outside of the database.
You need to make a decision inside
the database.
Yes.
But also, what I'm talking about
is that the main constraints should
be in the database, but you can mirror
them in the back-end and front-end, and you
can have automation. It would be good
if somebody who develops ORMs
or GraphQL or something followed
this approach, considering
database-side constraints as the
main ones.
So, check constraints are not only quite flexible —
not super flexible, but quite flexible —
they also have this beautiful option to
be created in an online fashion,
so to speak.
Yeah, NOT VALID.
So you say NOT VALID — quite a confusing
term.
Michael: Yes, very.
Nikolay: Yes, so things to remember.
When you create something not valid,
it means that it's already
being validated for all new writes.
So this is super confusing.
Michael: So it's kind of not validated
on existing data.
Nikolay: So you cannot create a NOT
VALID constraint and then insert
something that violates it.
This write will produce an error.
What NOT VALID changes: if you don't add
this flag when you create a check
constraint, it just triggers a long-lasting
operation, a full table scan, to
check that all existing rows follow
the logic of the constraint.
Michael: Which is the default behavior
when you add a new constraint.
Nikolay: So if you don't think
about it and just try to create
a check constraint for an existing
large table, it will block
DDL and DML, everything basically,
to this table, and it's not
fun.
So what you should do, if you want
to do it without partial downtime,
is to do it in an online fashion.
You create, so 3 steps, not 2,
3 actually.
My how-to yesterday was not full.
So first you create it with the not
valid flag.
Second, you understand that all
new writes are already being verified
automatically by Postgres.
You yourself take care of existing
rows.
You check existing rows are okay
with this constraint.
Just with simple selects and if
you find some rows that violate
it, you probably want to delete
them or update them to adjust
the values depending on your application
logic or on your logic.
And then only then the third step,
alter table validate constraint,
which will scan the whole table but
this step won't acquire locks
that would block your DML.
It will block only DDL, but hopefully,
you don't alter during
this.
You don't issue any DDLs.
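A minimal sketch of those three steps, assuming a hypothetical large table orders with an amount column:

```sql
-- Step 1: add the constraint with NOT VALID (fast; only new writes are checked from now on)
ALTER TABLE orders
    ADD CONSTRAINT amount_positive CHECK (amount > 0) NOT VALID;

-- Step 2: find existing rows that would violate it, and fix or delete them
-- (NULLs pass a CHECK, so only non-positive amounts are violations here)
SELECT count(*) FROM orders WHERE amount <= 0;

-- Step 3: full table scan, but it blocks only DDL, not DML
ALTER TABLE orders VALIDATE CONSTRAINT amount_positive;
```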
Michael: That middle step's nice,
and you can even do it in batches,
I guess, if you want or need to.
But I guess it shouldn't be a big
deal.
Nikolay: Yeah, it depends.
But maybe scanning the whole table can also
be fine, because it's just a select,
right?
Well, if you update, yes, in batches,
if you found many millions
of rows that violate it.
But this is, it depends, but this
three-step approach is very,
like, universal, zero downtime approach
and it's great.
Michael: Well, what's the downside
of jumping straight to step
3?
Because you're kind of doing that
if you think your data is fine.
I guess is it in the real world
you're most likely to have some...
Nikolay: Jump, if you want, you
just need to accept this risk
and that's it, of course.
If you are 100% sure, step 2 is
optional, let's say.
But also, as usual, if you issue
an alter with the not valid, you also
need to set lock_timeout and retry.
Because you still need to change
the metadata of the table.
And if, for example, autovacuum
is running in its transaction ID wraparound
prevention mode, processing your
table, you won't be able to
acquire the lock, and without lock_timeout
and retries logic
you will start waiting, and again,
in this case, you will block
everyone. Even if it's NOT VALID,
it's not good.
So retries are needed, and a graceful
alter is needed.
I wish there was such an option, like
"graceful", where you say how
many retries and how long the lock timeout
for a particular operation should
be.
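There is no built-in "graceful" option, but a rough sketch of the low lock_timeout plus retries idea could look like this, here as an anonymous DO block (table and constraint names are hypothetical; real migration tooling usually retries in separate transactions from application or script code):

```sql
DO $$
DECLARE
  attempt int := 0;
  done    boolean := false;
BEGIN
  WHILE NOT done AND attempt < 10 LOOP
    attempt := attempt + 1;
    BEGIN
      -- fail fast instead of waiting and making everyone queue up behind us
      SET LOCAL lock_timeout = '2s';
      ALTER TABLE orders ADD CONSTRAINT amount_positive CHECK (amount > 0) NOT VALID;
      done := true;
    EXCEPTION WHEN lock_not_available THEN
      RAISE NOTICE 'lock timeout on attempt %, retrying', attempt;
      PERFORM pg_sleep(1);  -- short pause before the next attempt
    END;
  END LOOP;
  IF NOT done THEN
    RAISE EXCEPTION 'could not add constraint after % attempts', attempt;
  END IF;
END $$;
```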
Michael: That would be a nice word,
like instead of concurrently
it could be like gracefully.
Alter table gracefully.
Nikolay: Or concurrently, something
like that.
Because I think 99% don't think
about it until they have many
thousands of TPS, and then they realize
it during some basic operation.
It was always working fine.
Sometimes probably not, but people
are like, you know, okay, we had
an issue lasting 30 seconds, something
was not right, but it's
okay.
And then we don't understand why,
right?
I mean, you blocked everyone dealing
with this table for 30 seconds,
for example. But it's kind of fine,
and we live with it until it starts
annoying you too much, and then
you realize that you need a
low lock_timeout and retries.
I mean, it requires effort,
unfortunately, to have this, right?
And if you're small, you don't
care, but I wish it would be easier,
like something similar to create
indexes concurrently or refresh
materialized view concurrently.
So also, when you validate, if,
for example, some DDL — or something
DDL-like such as CREATE INDEX CONCURRENTLY —
is happening, or autovacuum is
processing the table, you won't be able to
acquire this lock, so you also
need to be careful.
But in general, if you start
waiting there, it's kind of fine.
It just makes your operation longer,
but at least no transactions
performing DML operations are waiting behind
you in line, right?
So I mean, this is also an issue
with this final step, but it's
not so harmful as in the first
step when you need to instantly
inject this constraint
with a not valid flag.
Yeah, so I think we covered it,
right?
So let's move on.
Michael: The next one in the documentation
is
Nikolay: not null.
I think this should be hidden behind
some, I don't know, concurrently
or gracefully.
That would
Michael: be awesome.
The next one in the documentation
is the not null constraint, which
the documentation points out is
probably one of the ones people
are most familiar with seeing in
schema definitions.
But it's just a special case of
a check constraint, which I hadn't
thought of before.
Nikolay: Right.
But unfortunately, you cannot,
well, you can already, but if
you have check constraint NOT NULL,
saying this column is not
null, like logically it's the same
as the standard NOT NULL.
Same.
But primary key needs the latter,
right?
It cannot use...
But it can.
Since Postgres 12, if you don't
have NOT NULL constraint and
you define primary key or redefine
it, it will try to implicitly
create NOT NULL constraint.
And since Postgres 12, when you
create a NOT NULL constraint and
you already have a validated check constraint
saying the column IS NOT NULL,
it will just reuse it, skipping
the full table scan, which is a very
good optimization.
So you just create check constraint
in this three-phase or two-phase
approach, as we just discussed.
And then you can rely on it when
you're creating primary key,
for example.
Michael: Or
Nikolay: you can define the NOT NULL
constraint explicitly if you
need it for a primary key or anything
else — I don't know, maybe your
application wants a regular NOT NULL —
relying on the existing CHECK (... IS NOT NULL).
And then you can drop the check;
NOT NULL is still there, and you
skipped this unfortunate full table
scan.
I'm telling this because for
NOT NULL itself, Postgres doesn't
support the 3-step or 2-step approach for its creation.
If you want to create NOT NULL right
away on an existing table, an existing
column, it will need to scan the whole
table.
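A sketch of the trick just described, on a hypothetical users.email column (Postgres 12+): a validated check constraint lets SET NOT NULL skip the full table scan, and it can be dropped afterwards.

```sql
-- 1. online-style check constraint: NOT VALID first, then VALIDATE (as discussed above)
ALTER TABLE users ADD CONSTRAINT email_not_null CHECK (email IS NOT NULL) NOT VALID;
ALTER TABLE users VALIDATE CONSTRAINT email_not_null;

-- 2. on Postgres 12+ this reuses the validated check and skips the full table scan
ALTER TABLE users ALTER COLUMN email SET NOT NULL;

-- 3. the helper check constraint is no longer needed
ALTER TABLE users DROP CONSTRAINT email_not_null;
```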
Michael: So we don't have NOT NULL,
NOT VALID.
Yeah.
Nikolay: So yeah, these are
some nuances to keep in mind.
Generally, my recommendation is
to think more about check constraints.
This is why I say they are underappreciated.
They are good, and, you see,
here they support NOT NULL constraint
creation.
Since Postgres 12, not before.
But it means all currently supported
Postgres versions.
Michael: Yeah, true.
Nikolay: 12 is already the oldest.
Michael: Let's move on.
Nikolay: Yeah, unique constraint.
Unique constraint, this is interesting.
I don't know how much detail we
should cover here.
Before our recording, we discussed
the case I discovered in 2017
and still saw in Postgres 16.
So, unique constraint, physically
it relies on unique index,
but it's a kind of implementation
detail.
You can say I want a unique constraint
and Postgres will create
unique index implicitly.
And it's good that it can be done
concurrently, of course, right?
Because indexes can be created
concurrently, which is good.
That's it, basically.
You create a constraint, but I think
you can say using, right?
Create unique constraint using
some index if the index already
exists.
Or what?
Yeah, I don't remember in detail.
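The syntax being half-remembered here does exist: build the unique index concurrently first, then attach a constraint to it. A sketch with hypothetical names:

```sql
-- build the index without blocking writes
CREATE UNIQUE INDEX CONCURRENTLY users_email_uniq_idx ON users (email);

-- attach a unique constraint to the existing index
-- (the index is renamed to match the constraint name)
ALTER TABLE users
    ADD CONSTRAINT users_email_uniq UNIQUE USING INDEX users_email_uniq_idx;
```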
But what I do know is that although
unique constraint relies
on an index, unique index, it's not
absolutely the same.
Logically, again, it should be
kind of the same, but you can
have an index without a constraint,
not vice versa.
You cannot have a unique constraint
without a unique index because
Postgres needs a unique index to
support validation checks.
So imagine we created a unique index,
but we haven't created a unique
constraint.
One place where you can see a constraint
is needed is insert on conflict,
right?
If I'm not mistaken, right?
Michael: Yeah, I think so.
Nikolay: Yeah.
So on conflict requires a constraint
to be present.
And if you say on conflict, like,
for example, do nothing: if you
have an index without a constraint,
you cannot say INSERT blah blah
ON CONFLICT ON CONSTRAINT with this name,
because the constraint does not exist —
it will tell you explicitly the constraint
does not exist.
But at the same time, Postgres
has an interesting, I think it's
a logical bug, still not fixed, and
I reported it in 2017, and
today I checked in Postgres 16,
it has it still.
If you try to insert multiple rows
that violate uniqueness, it will explicitly say
that the constraint — and it will use
the index name — is violated.
So in one case it says there is no
such constraint, and in another
case it says this constraint exactly
with the same name is violated.
Okay, so inconsistency.
I think it's just a bug that needs
to be fixed and that's it.
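Roughly, the inconsistency looks like this — hypothetical table, and the error messages are paraphrased from memory:

```sql
CREATE TABLE t (email text);
CREATE UNIQUE INDEX t_email_idx ON t (email);

-- ON CONFLICT ON CONSTRAINT refuses the index name:
INSERT INTO t VALUES ('a@b.c')
ON CONFLICT ON CONSTRAINT t_email_idx DO NOTHING;
-- ERROR: constraint "t_email_idx" for table "t" does not exist

-- ...but a duplicate key error reports the very same name as a "constraint":
INSERT INTO t VALUES ('a@b.c'), ('a@b.c');
-- ERROR: duplicate key value violates unique constraint "t_email_idx"
```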
Michael: That's funny.
I'll link up the bug report as
well.
Nikolay: Yeah, but honestly, from
a user perspective, I think
it would be good to stop thinking
about unique constraints and
unique indexes as something very
different.
I cannot imagine the case when
we, like, they should go together,
I think.
I cannot imagine, like, we have
a unique index but why don't we
have a unique constraint in this
case?
I think they should go together
all the time. That's it.
Michael: It makes sense.
Nikolay: In this case, there would
not be any inconsistencies
if the constraint...
When I create a unique constraint,
a unique index is created implicitly.
Okay, but why not vice versa?
When I create a unique index, why
Postgres doesn't create a unique
constraint?
I have no answer for this.
Michael: You can't create a unique
constraint not valid as well,
can you?
So there's no difference there.
Nikolay: I don't think so.
Michael: The one thing that has,
like, changed in recent versions —
Nikolay: As always, maybe I'm wrong.
I think I'm not wrong here.
Michael: I don't think so.
The one thing that has changed with
these in the last couple of
years is in Postgres 15, we got
this nulls not distinct option.
Which I still don't, I'd love to
hear from people that have good
use cases for these, but it allows
you to specify that you can
only allow a single null value
rather than multiple null values.
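For reference, the Postgres 15+ syntax on a hypothetical table:

```sql
-- by default NULLs are treated as distinct, so any number of NULL emails is allowed;
-- with NULLS NOT DISTINCT, at most one NULL is allowed
ALTER TABLE users
    ADD CONSTRAINT users_email_uniq UNIQUE NULLS NOT DISTINCT (email);
```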
Nikolay: You know, null is the
biggest problem in the SQL model
now, right?
We discussed it.
Michael: We have a whole episode
on it.
Nikolay: Right.
But why I think people might want
null as like, according to
this, how is it called, ternary
logic, so three-value logic, true,
false, unknown, right?
According to this logic, null means
unknown, and comparing one
unknown to another unknown, you
cannot conclude they are the
same.
You always say they are not the
same.
So the comparison should always
yield unknown, so basically
another null.
Mixing nulls and unknowns is another
topic.
This means that a unique index, unlike
a primary key, of course,
a unique constraint, a unique key,
let's also introduce the term
unique key because the SQL standard
doesn't follow this term.
Unique key, unlike primary key,
allows nulls in the column or
in multiple columns if it's a multi-column
index or constraint.
But since we don't know if they
are the same or not, we can allow
multiple nulls.
But historically, it's exactly because
of the problems with manipulation
of large tables and so on.
For example, before Postgres 11,
if we add a new column and we want a default,
it's like a full table rewrite —
we cannot do it.
It was fixed in Postgres 11 — well, not
"fixed": a great feature that you can define,
like, a virtual default, right?
But before that, we cannot say a default —
I don't want a full
table rewrite — so I say, okay,
I will consider null as
false.
Just in my application.
All right?
So, before Postgres 11, null would
be my false, and true would
be my true.
In this case, I'm breaking theoretical
concepts here.
Null should not be considered as
false.
It's not right.
But I just don't want this operation
to be such a nightmare —
I have a billion rows.
So null will be my false.
And this leads me to the idea that I
want null to be like one value in the index.
That's why.
So to avoid the long heavy operations,
I give null a special
meaning, not as it was supposed
to have.
This is practice, it's not theory,
right?
This is usual, like in many industries,
theory was very good,
we developed great concepts, and
then you go to construction,
for example, oh, this is how it's
used.
We didn't expect.
This is how nulls are used.
Some people give nulls a special
meaning, and they consider
it as a normal value.
In this case, they might want an index
or constraint to say there
should be only 1 row if the value is null.
This is my understanding — maybe there are many
other understandings, but this
is what I have from my practice.
Michael: Cool.
In fact, you mentioned in passing
there one other important thing
about unique constraints: they
can be defined across multiple
columns in the table as well.
Single columns are really
common, but you can do it across
multiple as well.
Nikolay: Okay.
Should we move on?
Primary key
Michael: being a special case again.
Nikolay: Primary key, okay.
So not null plus unique constraint
basically.
This pair gives you a primary key
but there can be only one primary
key.
So I think we already covered it.
You need not-null.
For not-null, you probably need
it implicitly or explicitly —
you can do it yourself or just
rely on it being created when the primary key is
defined or redefined.
For an existing large table, right?
For small tables, no problem.
Also, by the way, just this morning
we had a discussion, for
example, creating this concurrently,
should we use it for new
tables?
In my opinion, no.
Because with CREATE INDEX CONCURRENTLY,
or these multi-step operations,
you lose the good,
beautiful property Postgres has —
transactional DDL and the ability to
pack everything into a single
transaction.
If you just define the table, follow
the normal approach, and don't care
about this zero-downtime stuff,
you will have a single transaction,
right?
All or nothing.
Michael: Same for tiny tables.
Anything will work.
Nikolay: Yeah, maybe like less
than 10,000 rows you don't care
about.
It takes like 100 milliseconds.
Let's go.
You have single step, it's atomic,
great.
But if you have large tables, you
need to redefine the primary key
and convert int4 to int8, for
example.
Michael: That's the big 1, yeah.
Nikolay: Yeah, my team and I implemented
all types of this
operation and we helped multiple
companies, a few billion dollar,
multi-billion dollar companies,
public companies, we helped them
to convert int4 to int8.
I know a lot of interesting stuff
around it, but in general,
you just need not null, and
we discussed the tricks — the
Postgres 11 trick, like default -1
plus not null, which you can define right away,
virtually, right?
You don't even need a check constraint
there. But since Postgres 12 we
rely on a check constraint, and we
remember a primary key needs an actual
not null. And we also create a unique
index, and when we create the primary
key, we say: using this index.
This allows us to put primary key
creation as the final step into
a transaction, which will probably
swap something, right?
Yeah.
Rename columns as well.
And there, of course, you also need
to think about lock acquisition,
retries, low lock timeout, like
all this stuff.
And it's, of course, if you have
many, many, many, I don't know,
like gigabytes, dozens, hundreds,
maybe terabytes, and a lot
of TPS.
You need to engineer this carefully.
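A very rough sketch of the final swap step only, assuming NOT NULL is already in place and a unique index on the new column was built concurrently beforehand (names are hypothetical, and the real int4-to-int8 migrations involve much more, including foreign keys):

```sql
-- done earlier, without long locks:
--   * id_new is NOT NULL (via the check constraint trick)
--   * CREATE UNIQUE INDEX CONCURRENTLY orders_id_new_uniq ON orders (id_new);

BEGIN;
SET LOCAL lock_timeout = '2s';                   -- plus retries around the whole transaction
ALTER TABLE orders DROP CONSTRAINT orders_pkey;  -- foreign keys referencing it need separate handling
ALTER TABLE orders
    ADD CONSTRAINT orders_pkey PRIMARY KEY USING INDEX orders_id_new_uniq;
-- column renames / swaps would also go here
COMMIT;
```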
Michael: I mean, this is probably the most
involved of all the things
involved of all the things
we're talking about.
And I think it's probably too big
to cover today.
But there is a really good talk
by Robert Treat that I saw that
covers this in about half an hour
in depth for people that actually
have to do this kind of thing.
Nikolay: I'm sure it's not possible
to cover everything in half
an hour because there are several
methods.
They have pros and cons, and there
are many, many nuances — for example,
foreign keys.
If you redefine primary key, you
need to deal with foreign key
redefinition.
And it's also interesting.
And autovacuum running in transaction
ID wraparound prevention mode can
block you.
And also if you decide to mark
foreign key as not valid and then
you realize it blocks writes.
Or you just forgot to drop old
foreign key and new rows after
switch.
So a lot of mistakes.
Michael: Yeah, please use bigints,
or int8, in your
new tables.
Right away.
Nikolay: Or UUID version 7, 8.
Yeah,
Michael: Or UUIDs, yeah.
Cool.
Foreign keys?
Or actually, one last thing, is it
worth discussing, like, multiple,
you can have multiple column primary
keys?
I guess that's obvious from the
multiple column unique ones as
well.
But yeah, foreign keys.
Nikolay: I don't think it's anything
very different.
Yeah, so just you need to have
not null on each column participating
in primary key definition.
That's it.
Unlike unique keys, of course.
So foreign keys involve two tables.
And creation of a foreign key requires
several locks to be acquired
on both sides, and a full table scan
of both tables to ensure that
values in the referencing table
are present in the
referenced table.
So in this case, if you just don't
care and take the brute-force
approach — just create it the way the
documentation defines, that's
it — well, you have an issue, because
you are going to block...
probably not, I think the lock level
there is not as bad as in the
previous cases we discussed, check
constraints, but you're going
to block DDL for sure.
Probably DML won't be blocked... or
it will be blocked.
Yeah, probably it will be blocked
as well.
I don't remember details here,
although I wrote it a few hours
ago.
Check out my how-to.
I specified all the locks and what
you're going to block.
But in general, you should care
about it as well.
And generally, under load, you
shouldn't want to create foreign
key in one step.
You need, again, three steps.
First, creation with not valid,
with retries and low timeout.
Then you need to check — again,
like with checks, Postgres will
start checking new writes, inserted
and updated rows, or
deleted ones in this case as well, because
if you...
Michael: Very important, yeah.
Nikolay: Yeah, yeah, yeah.
And you need an index, but it's
another story.
So, it will start checking to validate
this constraint for new
writes, but for existing rows,
we're still not sure.
So optional step two is to validate
and fix if you see problems,
potential violation, right?
And then third step, you say alter
validate constraint.
Again, understanding that ongoing
index creation or recreation
or vacuum can block you.
So yeah, also three steps.
I explained in detail in my latest
marathon post.
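The same three-step shape for a foreign key, with hypothetical tables:

```sql
-- Step 1: add the FK as NOT VALID (with low lock_timeout and retries, as before)
ALTER TABLE posts
    ADD CONSTRAINT posts_user_id_fk
    FOREIGN KEY (user_id) REFERENCES users (id) NOT VALID;

-- Step 2 (optional): look for existing rows that would violate it, and fix them
SELECT count(*)
FROM posts p
WHERE p.user_id IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM users u WHERE u.id = p.user_id);

-- Step 3: validate without blocking DML on either table
ALTER TABLE posts VALIDATE CONSTRAINT posts_user_id_fk;
```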
So what else to say here?
Foreign keys can be marked as —
how is it called?
Deferred, right?
I mean, deferrable constraints,
right?
So when you have a complex transaction,
you might want foreign
keys to be checked at commit time
later, not at each statement
time, later.
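For reference, a deferrable foreign key looks roughly like this (hypothetical names):

```sql
ALTER TABLE posts
    ADD CONSTRAINT posts_user_id_fk
    FOREIGN KEY (user_id) REFERENCES users (id)
    DEFERRABLE INITIALLY IMMEDIATE;

BEGIN;
-- postpone the check to COMMIT time for this transaction only
SET CONSTRAINTS posts_user_id_fk DEFERRED;
-- ... statements that temporarily violate the FK ...
COMMIT;  -- the check happens here
```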
And in this case, I don't remember
the details again.
Several years ago we worked with Miro
and had an issue with using pg_repack
to fight bloat, and they used deferred
constraints.
So there is an article about it —
how to use pg_repack if you have
deferred constraints.
It explains a lot of details;
I will provide a link.
Miro Engineering wrote it a
few years ago.
It was interesting.
Also an issue to address.
What else?
I think that's it.
Michael: I think on delete cascade
is worth mentioning.
Like you can define at the point
of constraint creation, what
you want to happen if, like if
you have a, a really common example
is a table of blogs and a table
of blog posts.
If you delete the blog, do you
want the blog posts to be deleted?
Like what do you want to happen
in those cases?
And that's the case where I think
it's really important to mention
that, whilst when we define
a primary key we get an index,
for foreign keys defined on a table,
the referencing column is
not necessarily indexed — Postgres doesn't check
that it's indexed as well.
In fact, I've even read a blog
post saying you shouldn't always
index that column.
But I think the cases that you
should far outweigh the cases
where you shouldn't.
It is worth checking you do have
indexes on those, so that
those deletes on cascade are efficient.
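A sketch of the blog / blog-posts example, including the index on the referencing column that makes cascading deletes efficient (names hypothetical):

```sql
CREATE TABLE blogs (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
);

CREATE TABLE blog_posts (
    id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    blog_id bigint NOT NULL REFERENCES blogs (id) ON DELETE CASCADE
);

-- without this, every DELETE FROM blogs has to scan blog_posts sequentially
CREATE INDEX CONCURRENTLY blog_posts_blog_id_idx ON blog_posts (blog_id);
```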
Nikolay: Right.
I always try to avoid using this
option, but I see people in
quite large databases under a large
load use this deletion propagation
or update propagation logic.
So 2 things here, you're right
about indexes.
And the Postgres DBA toolkit I have
and the postgres-checkup tool have
reports to find this: you
have a foreign key, there is an index or
primary key on one side, but you don't
have an index on the other side.
So when you need to delete —
you probably won't need to delete
every day, but at some point if
you have to — it will be a sequential
scan.
Very bad, very slow.
But this is only one thing.
What if you're deleting a blog
which has a million posts?
I don't like it.
I don't like this propagation.
If we know that only low volumes —
a low number of rows — will be
automatically deleted, it's okay.
But if it's an unpredictable number
of rows, I would prefer having
my own logic with asynchronous
propagation of change, with some
batches and so on.
Michael: Makes sense.
Nikolay: But I see people use it
on a quite large scale.
Michael: Makes a lot of sense.
Cool, I think we're down to the
last one.
We've made it to exclusion constraints.
Nikolay: Right, I always confuse
exclusion constraints with constraint
exclusion, which was related to
partitioning.
But exclusion constraints are for
time ranges, like intervals,
and you want to define, you want
to say my intervals should not
overlap.
So it's kind of advanced uniqueness,
right?
Maybe not uniqueness, but it's,
like, for special data, for GiST
and SP-GiST indexes and so on.
So you define, I say, I'm building
some schedule.
I cannot allow overlapping, or
I'm like describing some, I don't
know, like 3D world and I have
various shapes and I don't want
them to overlap.
In this case, GIST and its variation,
SP-GiST, will support this constraint
and ensure that no, like, no balls,
for example, or no cubes
or something overlap in that space.
So, Postgres supports this kind
of thing.
Which is good, but it's quite a narrow
use case for me.
I don't see it often.
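A small sketch of the scheduling case (hypothetical table; btree_gist is needed so the plain equality on room can live in the same GiST index as the range overlap check):

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_bookings (
    room   int,
    during tstzrange,
    -- no two bookings for the same room may have overlapping time ranges
    EXCLUDE USING gist (room WITH =, during WITH &&)
);
```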
Michael: Yeah, very cool.
I've heard, I think I've seen it
being used in examples where
People are designing room booking
apps or things like that.
But no, not used it myself.
In fact, it's a tiny little documentation
entry, isn't it?
So it's
Nikolay: time and space.
Yes.
But yeah, the documentation is very
brief.
And here, I honestly don't have
anything to say.
It's supported by some index, I
guess, right?
So an index you create with concurrently,
usually, if you want
0 downtime approach.
I'm not sure if exclusion constraint
can be created based on
an existing index.
It should be possible.
I have no idea.
Michael: It says, yeah, good point.
I don't know.
Let us know in the comments.
Nikolay: Yeah, but in general,
I think we covered pretty well
the 6 types of constraints — at
least 5 types of constraints we
covered — and how to create all
of them without downtime.
Just for this final part, let's
think how to drop them.
Well, I think dropping is straightforward
usually.
Yeah.
If it's like unique constraint,
which is based on an index, you
probably just drop an index with
concurrently, right?
Yeah.
Others, you just drop them, but
you need a low lock timeout and
retries to be involved, because
Postgres needs to change table
metadata, and during this, if it's
blocked, it's not good.
So again, low lock timeout and retries
are going to help here.
Michael: What about altering?
Like, I saw a good blog post
where people took one of the
use cases for check constraints —
for example, we often say
use text in Postgres.
It's in a lot of recommendations
to use text and not be kind
of constrained on length.
Nikolay: Not varchar, not char.
Michael: Exactly, yeah.
Nikolay: This, strictly speaking,
is not a constraint, right?
But in a broader sense, it's a constraint.
But it's- Oh, but
Michael: you could implement it
with a constraint.
Nikolay: Oh, exactly.
Yeah, and I like it because you
control exactly how and when
it will be verified.
And you know, again, this is exactly
why, again, check constraints
are underappreciated.
And recently, I indeed prefer using
just text without limits,
not varchar(n), but just text.
And then if I need to limit, I
use check constraints, I know how
to set it up without downtime.
And I know how to change it if
I need to increase or decrease
this limit, I know how to do it,
right?
And it's good.
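A sketch of that pattern on a hypothetical column, using the same online approach as before:

```sql
-- text column without a type-level limit
ALTER TABLE users ADD COLUMN bio text;

-- enforce the length limit with a check constraint instead of varchar(n)
ALTER TABLE users
    ADD CONSTRAINT bio_max_length CHECK (length(bio) <= 200) NOT VALID;
ALTER TABLE users VALIDATE CONSTRAINT bio_max_length;
```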
Michael: But yeah, I think we only
lose one thing.
I think we only lose 1 benefit,
or like 1 optimization —
I'm not sure if there
is this optimization for check
constraints.
Let's say we want to relax the
constraint from 200 characters
to 300 characters.
I think if you're using varchar,
Postgres will handle that
knowing it doesn't have to recheck
existing data.
But that's the only downside.
Nikolay: If you increase n in varchar,
it won't rewrite.
And it won't scan it, because obviously
if you just increase
the limit, existing rows already
are okay with this.
So yeah, I agree.
And in the case of check constraint,
I'm not sure if we have
any...
Michael: I don't think so.
I've not heard of one.
But we can do the three-step process
you mentioned.
In fact, we can probably add another
constraint and then drop
the existing.
Nikolay: Exactly.
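So relaxing the limit could look roughly like this, reusing the hypothetical names from above:

```sql
-- add the relaxed constraint first, validate it, then drop the old one
ALTER TABLE users
    ADD CONSTRAINT bio_max_length_300 CHECK (length(bio) <= 300) NOT VALID;
ALTER TABLE users VALIDATE CONSTRAINT bio_max_length_300;
ALTER TABLE users DROP CONSTRAINT bio_max_length;
```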
Michael: Yeah, great.
Cool.
Thanks so much, Nikolay.
Thanks, everyone, for listening.
Catch you next week.
Nikolay: Yeah.
See you later.