Postgres FM | Transcript: Vacuum

August 12, 2022 • 32 Minutes

Vacuum

Nikolay and Michael discuss vacuum — what it is, why we need it, some improvements in recent years, and even a creative use for vacuum full.

Michael: Hello, and welcome to Postgres FM, a weekly show about all things PostgreSQL.

I am Michael, founder of pgMustard, and this is my co-host Nikolay, founder of Postgres AI.

Hey Nikolay, what are we discussing today?

Nikolay: Hi, Michael, we are reacting to
requests and we are discussing vacuum.

Maybe we are only starting to discuss vacuum.

Michael: Yeah, absolutely, a deep, deep topic.

But yes, as you say, I think this was one of the first requests.

Once we said we were open to suggestions about what people wanted to hear, somebody tweeted us, and vacuum is always a topic that people want to talk about.

It seems like it's something that people don't necessarily know about when they first encounter Postgres, but they quickly learn about it one way or the other.

And as you get more and more advanced, there still seem to be things that people can learn about vacuum, even people with many years of experience.

Nikolay: Right, right.

We had a couple of requests, and I wanted to thank everyone who provides feedback to us.

We see it almost every day already, especially on Twitter.

And it's so good to read it and to see what people think about our show.

And as usual, I ask everyone to like the show in the place where they listen or watch. We have a video version, which is uncut, by the way, and a little bit longer, on YouTube; postgres.tv is a short web address. But of course it's also available on iTunes and so on.

I encourage everyone to like this show or this episode right now.

And also please share it in the social networks you use or in working groups where you discuss Postgres or engineering in general.

Thank you so much, everyone.

So, speaking of the Postgres TV channel, we also have great guests coming almost every week.

It's called Open Talks.

The idea is that we invite people who presented some interesting talk at some conference, but the recording was not done.

And one of these talks recently was by Hannu Krosing, ex-Skype.

He created a lot of Postgres-related stuff at Skype.

And he's now at Google.

The talk was titled "Do you vacuum every day?"

It was great.

It's very deep and, at the same time, very simple material. Everyone can watch it and understand a lot of stuff.

So, again, I encourage everyone, a hundred percent, to watch it, because it's directly related to our topic today.

Michael: I watched that.

I thought it was fantastic.

He managed to keep it simple, as you say, but I definitely still learnt a few things during it.

I think it was about an hour, maybe a little bit longer.

So it is definitely a deep dive, but definitely worth a watch.

Nikolay: Yeah, we don't limit our speakers on timing, unlike at conferences, where we have very strict constraints.

So we can spend a couple of hours if you want, if we have the material and the desire.

Definitely.

But our show is roughly 30 minutes because, as usual, I want to say a big hello to those who are running or riding a bicycle, or especially walking their dogs.

I know some people do; I've learned it from feedback. And I think dogs must love our show.

Michael: Yeah, it's their excuse to get out.

I think dogs and elephants would be friends too.

Keep it in the spirit.

Nikolay: Right.

So, vacuum.

Where to start?

What do you think?

Michael: I think probably right at the beginning.

Right.

So why do we have vacuum?

Like, what is vacuum?

Why do we have it?

Some people kind of see it as a negative part of Postgres.

I see it as something we need because of some of the design decisions made right at the beginning of Postgres. I personally really like it.

I think we've gained so much from some of those design decisions.

But this one is to do with multiversion concurrency control.

Right?

So, MVCC: the fact that we keep old versions of... I was about to say rows then, but I guess tuples.

Nikolay: Row versions. Or do you say "tuples" or "tupples"?

Michael: Which is correct?

I'll say the correct one.

Nikolay: I don't know.

I've heard both.

So we need to choose.

Michael: So.

I'm gonna say tuple.

I don't know why.

Nikolay: I will choose, too. Okay.

Tuples. Let's use tuples. So, right.

A tuple is a physical version of a row. Each row can have multiple tuples at the same time, and one transaction sees only one of them.

It may happen that some tuple, which is a physical version of a row, is not visible to any transaction.

So basically it's dead.

Right.

And you can check when a tuple was created, because you can select xmin, xmax, and ctid. We discussed ctid some time ago.

It's very convenient sometimes to know that there are hidden system columns in each table. They are created automatically.

For example, you cannot use ctid or xmin as your own column names.

So they're like reserved words, but you can select them and see the values of xmin, xmax, and ctid.

ctid is a physical address: page and offset. xmin is the ID of the transaction which created the tuple, but xmax is slightly more complex.

It's usually zero, which means the row is live, but it can be non-zero.

It can be something different: some transaction which was rolled back.

So xmax is present.

You can see it in your transaction.

Obviously this tuple is live. So when you select xmin and xmax, you get a feeling for the tuple.

Right. And the next time you select, you can see different values of xmin and xmax.

Well, not xmin, but xmax and ctid: they can change in these hidden columns.

So, right.

If a tuple becomes dead, it's a problem, because it still occupies some space.

This is the key problem that leads us to the need for vacuuming, right?

And there's also this very simple exercise.

I recommend it to everyone who starts working with Postgres: create a table with just a single column, say id int8, for example, insert a row where id equals one, and then check ctid.

Then do an update that doesn't change the value.

Say id = id, and you will see that ctid changes.

So this is how you can feel that a new tuple is created.

Every time you do an update, even one that you then roll back, a tuple will be created, but in that case the new tuple will be marked dead instead of the old one.
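The exercise just described can be imitated with a toy model. This is a minimal sketch in Python, not how Postgres actually stores data: the class names, the single-page `(0, n)` addresses, and the starting transaction ID are all assumptions made for illustration, and in-progress transactions are ignored entirely. It only shows the bookkeeping idea: every update appends a new version with a new ctid, and a rolled-back update leaves the new version dead while the old one stays live.

```python
from dataclasses import dataclass


@dataclass
class TupleVersion:
    ctid: tuple      # (page, offset), the physical address
    xmin: int        # ID of the transaction that created this version
    xmax: int = 0    # 0 while nothing has tried to delete or replace it
    value: int = 0


class ToyHeap:
    """Single-row toy table: versions are only ever appended."""

    def __init__(self):
        self.versions = []
        self.aborted = set()   # IDs of rolled-back transactions
        self.next_xid = 100    # arbitrary starting point (assumption)

    def _begin(self):
        self.next_xid += 1
        return self.next_xid

    def insert(self, value):
        xid = self._begin()
        self.versions.append(TupleVersion((0, len(self.versions) + 1), xid, 0, value))

    def update(self, value, rollback=False):
        xid = self._begin()
        old = self.live()[0]   # the currently visible version
        old.xmax = xid         # the old version is stamped with the updater's ID
        self.versions.append(TupleVersion((0, len(self.versions) + 1), xid, 0, value))
        if rollback:
            self.aborted.add(xid)

    def is_live(self, v):
        created = v.xmin not in self.aborted
        not_deleted = v.xmax == 0 or v.xmax in self.aborted
        return created and not_deleted

    def live(self):
        return [v for v in self.versions if self.is_live(v)]

    def dead(self):
        return [v for v in self.versions if not self.is_live(v)]


heap = ToyHeap()
heap.insert(1)
print("after insert:", heap.live()[0].ctid)
heap.update(1)                   # "SET id = id": same value, new version
print("after update:", heap.live()[0].ctid)
heap.update(1, rollback=True)    # rolled back: the new version is the dead one
print("after rolled-back update:", heap.live()[0].ctid, "dead:", len(heap.dead()))
```

In this model the dead versions simply pile up in `versions`, which is the space that vacuum exists to reclaim.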

Michael: I didn't realize that that's

Nikolay: Yeah.

Yeah.

Or you can insert inside a transaction and then roll back your transaction.

So you cancel your insert.

And this insert will produce some dead tuples.

Again, this shows us that testing on production is not good.

Some people say, oh, we will insert some data and then roll back our transaction, so as not to disturb production. But you disturb it anyway, because you will produce fresh dead tuples and make autovacuum do more work than it would without your actions.

Testing on production is not a good idea.

Michael: And the reason we are discussing this part of it is that one of the core tasks that vacuum has, and presumably where it got its name from, is going and looking for these dead tuples and vacuuming them up, or removing them, so reclaiming the space, I guess. That's one task.

Nikolay: By the way, to finish this idea about testing: the one type of workload which you can run without disturbing the physical layout and without making autovacuum work more is a canceled delete.

So you delete a lot.

Then roll back.

A lot of records will go to the WAL, so it will put some stress on replication, but the physical layout won't change, because your transaction will just put a new xmax value into the tuples.

Then, when it gets canceled, that transaction ID is marked as aborted.

So logically xmax will be equal to zero, meaning that these tuples are still alive.

So this is a very interesting kind of workload.

I sometimes use it on regular clones when we need to have many iterations of a workload, but we don't want our physical layout to change, so that we start from the same point every time.

So this is the only kind of workload that won't disturb the physical layout and vacuum.

Right.

Sorry, maybe it's a little bit off topic.

But I want everyone to start thinking about physical row versions, which we call tuples, when they think about performance, because it's very, very related.

You cannot optimize performance without understanding MVCC and tuples, right?

Michael: Yeah, absolutely.

So.

Where do you want to go next?

Do you want to talk about performance a little bit?

Or do you want to talk about the different tasks? So, one of vacuum's tasks is to free up space.

Nikolay: Yeah, which tasks does vacuum have? First of all, cleaning up dead tuples. Second is preventing transaction ID wraparound, which is called freezing.

And the third, an additional one, is recalculation of table statistics.

These are the statistics about the data that help the planner to understand which plan to choose.

Right.

And this is the analyze part.

Michael: Yeah.

So that happens...

You can additionally do VACUUM ANALYZE, can't you? But autovacuum does this too.

I wasn't sure if you were gonna bring this up, so yeah.

Nikolay: Yeah.

Yeah.

Well, I switched us to autovacuum, actually, implicitly, and of course this is what autovacuum can do, by the way.

It's interesting that sometimes it does just analyze for a table when the time comes.

Right.

But sometimes it chooses to do vacuum analyze.

So both actions at once.

It's interesting that both are possible for autovacuum.

And autovacuum is great.

It allows you to forget about vacuuming.

I remember times when autovacuum wasn't present in Postgres.

So, by default, you were in trouble.

If you don't vacuum, what happens?

Michael: Yeah, that's a really good point, actually.

So there might be people listening who are relatively new to Postgres, or they've started a project and they've not had to worry about vacuum.

And that's probably because, so far at least, autovacuum has been doing the job. It's got certain default settings that are probably a bit conservative in general.

But when you're first getting started, when the tables are small, they're okay.

They're fine.

They'll keep this from being a problem for longer than, as you say, if you didn't have it.

And as far as I can tell, unless you're extremely advanced and know exactly what you're doing, you really shouldn't be turning autovacuum off.

So if anybody suggests doing that, that should set off big sirens and warning signals at your place of work, I think.

Nikolay: Right.

Well, in my opinion, based on my 17-plus years of experience working with Postgres in quite large setups, with hundreds of thousands of transactions per second, multiple dozens of terabytes of data, and so on, in an OLTP context, where a lot of users are present and they need very good performance...

So my experience, my opinion, is that default autovacuum is not enough for almost everyone.

So it should be tuned in the very beginning.

For example, only three autovacuum workers is not enough for modern servers.

Okay.

If it's a laptop or some very, very small system, three workers is probably enough. But, as we just discussed, many transactions produce dead tuples, and every time you produce dead tuples you need cleanup.

So it means that cleanup is almost constant work that Postgres needs to do.

So my recommendation is: check how many cores you have and allocate at least 30% of those cores to autovacuum.

So if you have 12 cores, four workers. If you have, I don't know, 96 cores, it means that you need to have 30 workers.
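The 30% rule of thumb is easy to turn into a small calculation. The following is only a sketch of that arithmetic in Python: the function name is made up, rounding up is my choice, and the floor of 3 mirrors the default of three workers mentioned above.

```python
def suggested_autovacuum_workers(cpu_cores, share_percent=30, minimum=3):
    """Rule of thumb from the discussion: roughly 30% of cores for autovacuum.

    Rounds up using integer arithmetic and never goes below `minimum`.
    Both parameters are knobs to adjust, not hard rules.
    """
    workers = -(-cpu_cores * share_percent // 100)  # ceiling division
    return max(minimum, workers)


for cores in (4, 12, 96):
    print(cores, "cores ->", suggested_autovacuum_workers(cores), "workers")
```

For 12 cores this gives 4 workers, and for 96 cores it gives 29, which is in line with the "about 30" quoted in the conversation. The result would be a candidate value for `autovacuum_max_workers`.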

At least, if you have a very big server. And I'm excited to see that some default settings were recently changed, the cost-related ones for autovacuum.

It's very complex.

One setting depends on another and so on, but roughly, the cost limit and cost delay associated with autovacuum had very, very old defaults.

And, roughly speaking, you had only about eight megabytes per second of reads for all the workers that you have, not more. This is like a quota. It was changed in Postgres 12, maybe, plus or minus one.

And it became 10 times more.

So now it's roughly 80 megabytes per second, but it's not enough for big setups with modern disks.

You probably want something like half a gigabyte per second or an even bigger quota.

So you need to tune further.
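The "eight megabytes per second, then ten times more" figures can be reproduced with back-of-the-envelope arithmetic. The sketch below is a simplified upper bound, not a model of the real throttling code. It assumes every page read is a cache miss and ignores the time spent doing the work itself. The default values plugged in (cost limit 200, page-miss cost 10, delay 20 ms before Postgres 12 and 2 ms from 12 on, 8 kB pages) are what I recall from the documentation and should be checked against the version in use.

```python
PAGE_SIZE_BYTES = 8192  # default Postgres block size


def max_read_mib_per_sec(cost_limit, cost_delay_ms, page_miss_cost):
    """Upper bound on throttled reads when every page is a cache miss.

    Vacuum accumulates a cost per page and sleeps for cost_delay_ms each
    time the accumulated cost reaches cost_limit. Time spent working is
    ignored here, so real throughput is lower.
    """
    pages_per_round = cost_limit / page_miss_cost
    rounds_per_sec = 1000 / cost_delay_ms
    return pages_per_round * rounds_per_sec * PAGE_SIZE_BYTES / (1024 * 1024)


old = max_read_mib_per_sec(cost_limit=200, cost_delay_ms=20, page_miss_cost=10)
new = max_read_mib_per_sec(cost_limit=200, cost_delay_ms=2, page_miss_cost=10)
print(f"before: {old:.1f} MiB/s, after: {new:.1f} MiB/s")
```

Because this budget is shared by all autovacuum workers, adding workers without raising the cost limit does not raise the total quota.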

Michael: Yeah, I think there are quite a few good tuning guides, aren't there?

I can link to them.

I know at least one really good one in terms of giving people advice on...

Go on.

Nikolay: I like a couple of articles. I don't remember who wrote them, but they were on the 2ndQuadrant blog.

One is about when autovacuum does not vacuum, and the other is about understanding the basics of autovacuum.

This is great basic material.

I open it very often, and when I need to explain something to others, I always mention this couple of posts.

So let's link them and encourage everyone to read them.

But I also wanted to say that if you don't tune it in the very beginning, trouble will come, right?

Eventually.

Michael: Yes.

Nikolay: There are two risks, right?

Transaction ID wraparound and bloat. Two risks.

Michael: Yep.

And that manifests itself in gradually slower performance, or, you know, various things will start to degrade at that point. Disk usage will start to increase.

There are a few different ways that it will manifest.

And I actually wanted to say, before we move on from autovacuum: the best advice I ever heard was that it's a bit like exercise.

If it hurts, you're not doing it enough.

So if autovacuum looks like it's getting in the way of things... Sometimes I think people see it running, and it's blocking other things, or it's taking ages, things like that.

If you're in that situation, don't turn it off.

I think some people get tempted to turn it off so that the problem goes away. The solution is more to make it more aggressive, make it happen more frequently.

Exactly.

So that seems to be what trips people up around this.

Nikolay: Yeah.

Funny word.

When I describe what to do with autovacuum, I also use "aggressive", and some companies have an interesting bot in Slack that says: no, you shouldn't use the word "aggressive", choose another one, and offers options. But I agree. Autovacuum tuning is two things. Make it aggressive, move faster, right?

So, aggressive.

What does it mean?

Move faster.

So give it more quota, don't limit disk reads so much, and so on.

That's cost limit and cost delay, this set of settings.

There are several settings.

Also, autovacuum settings depend on vacuum settings.

If it's minus one, it means take the value from there.

That's why I say it's quite complex.

You need to spend some time understanding this hierarchy of settings. That is one thing. Another thing is frequency.

So: speed, or quota, or aggressiveness, and frequency.

Frequency is very important, because if you allow it to move fast but keep the default, like 10% of dead tuples, that's a lot.

I'd say 1% or below. We need to go down.

We need to remove the dead tuples more often.
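The "minus one means take it from there" hierarchy can be written down as a small lookup. This is a sketch in Python over a plain dictionary of settings, with a hypothetical function name. It covers only the two fallbacks I am sure of: `autovacuum_vacuum_cost_limit` falling back to `vacuum_cost_limit`, and `autovacuum_vacuum_cost_delay` falling back to `vacuum_cost_delay`. Per-table storage parameters, which can override both, are left out.

```python
def effective_autovacuum_cost_settings(settings):
    """Resolve the -1 fallbacks for autovacuum's cost-based throttling."""
    limit = settings["autovacuum_vacuum_cost_limit"]
    if limit == -1:
        limit = settings["vacuum_cost_limit"]      # inherit the manual-vacuum value

    delay = settings["autovacuum_vacuum_cost_delay"]
    if delay == -1:
        delay = settings["vacuum_cost_delay"]      # inherit the manual-vacuum value

    return {"cost_limit": limit, "cost_delay_ms": delay}


# Values as I recall the defaults from Postgres 12 onward (an assumption to verify).
defaults = {
    "vacuum_cost_limit": 200,
    "vacuum_cost_delay": 0,
    "autovacuum_vacuum_cost_limit": -1,
    "autovacuum_vacuum_cost_delay": 2,
}
print(effective_autovacuum_cost_settings(defaults))
```

With these inputs autovacuum ends up with a limit of 200 taken from the vacuum setting and its own 2 ms delay, which is why changing `vacuum_cost_limit` alone also changes autovacuum's behavior.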

Michael: Yeah.

So, just for anybody wondering, that's the default, I think, for all tables.

So basically any table has to grow by 10% in order for it to trigger another autovacuum, which, when the tables are a hundred or...

Nikolay: Sorry, it needs to accumulate at least 10% of dead tuples.

Michael: Great point.

Yes.

Nikolay: Right?

It can grow.

For example, in an append-only situation.

That was fixed, by the way, in Postgres 13, maybe. Autovacuum didn't trigger because we don't have dead tuples, and a special setting with additional logic was added for append-only situations.

But by default it's at least 10% of dead tuples, and in my opinion it should be 1% in OLTP.

Michael: Or maybe not even... should it definitely be a percentage?

I've seen people switch completely to a raw number of tuples instead.

So they don't have this degradation over time as those relations...

Nikolay: Right.

So yeah, it's also not simple.

There are scale factors, a set of settings both for the analyze part and for the vacuum part.

And also there is a threshold, which is some absolute number, like 50 by default, as I remember: 50 dead tuples.

And there is a formula based on these two settings that gives you the real threshold, defining when autovacuum triggers. Well, this static number approach... you know, there are many topics people have different opinions about.

For example, I see several groups of people, and you still see them, that say default autovacuum is a very silly algorithm.

For example, it doesn't understand that on weekends or at night we have much more room for autovacuum to work. It doesn't respect the time of day or the day of the week, and we should change it.

So they switch off autovacuum and implement their own algorithm and run vacuum themselves.

Or do both.
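The formula combining the absolute threshold and the scale factor is short enough to write out. This Python sketch follows the rule as I understand it from the Postgres documentation: a table is vacuumed once its dead tuples exceed `threshold + scale_factor * reltuples`. The function names are hypothetical. As far as I recall, the documented defaults are a threshold of 50 with a scale factor of 0.2 for vacuum and 0.1 for analyze, so the 10% and 1% figures in the conversation correspond to `scale_factor=0.1` and `scale_factor=0.01`.

```python
def autovacuum_vacuum_trigger(reltuples, threshold=50, scale_factor=0.2):
    """Number of dead tuples that must be exceeded before autovacuum vacuums a table."""
    return threshold + scale_factor * reltuples


def needs_vacuum(dead_tuples, reltuples, threshold=50, scale_factor=0.2):
    return dead_tuples > autovacuum_vacuum_trigger(reltuples, threshold, scale_factor)


# How the trigger point moves with table size for three scale factors.
for rows in (10_000, 1_000_000, 1_000_000_000):
    for sf in (0.2, 0.1, 0.01):
        trigger = autovacuum_vacuum_trigger(rows, scale_factor=sf)
        print(f"{rows:>13,} rows, scale factor {sf}: vacuum after {trigger:,.0f} dead tuples")

# The "raw number" approach: scale factor 0 and a fixed threshold.
print(needs_vacuum(dead_tuples=150_000, reltuples=1_000_000_000,
                   threshold=100_000, scale_factor=0.0))
```

The table shows why a percentage degrades over time: on a billion-row table even 1% means ten million dead tuples before cleanup starts, whereas a fixed threshold with a zero scale factor keeps the trigger point constant as the table grows.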

Michael: Both seems really smart to me.

Right?

So if you know when your business is quiet and you can afford, most of the time, to do a manual vacuum analyze then, it's not that you're doubling the work by keeping autovacuum on, because when autovacuum then does trigger, it has less to do.

There's less work.

So I do see the logic to that myself.

Nikolay: Yeah.

Also, if you do it with your script, it means that you run so-called manual vacuum.

Of course it's fully automated in this case, but it's so-called manual vacuum.

And in recent versions of Postgres it has a benefit that autovacuum lacks.

It can process the indexes of a table in parallel.

So, multiple processes. Parallelization is a great way to improve vacuuming on larger tables. It's inevitable, I would say.

And it leads us to the topic of partitioning, but maybe let's postpone that a little bit.

Right, right.

But I agree.

Well, I don't...

I don't like the idea of turning autovacuum off completely and relying on your custom in-house tool.

It may be dangerous, but the combination sounds good.

But if you connect to Postgres using psql and run vacuum, it's unthrottled. Throttling is a good way to name quotas differently.

Right.

So autovacuum is throttled too much by default, but manual vacuum is not throttled at all, because vacuum_cost_delay is zero by default.

It means throttling is not active.

And my question is: can you bring your system down, in terms of saturating disk I/O, just by doing some vacuum, maybe in multiple processes, on multiple tables at once?

I tried it on modern hardware, on NVMe disks.

I could not do it.

Maybe on older hardware it's possible.

I saw problems related to performance when vacuum was too aggressive, right?

Moving too fast, using too much of our disk I/O capacity. But on modern disks, we recently tried it.

We tried to prove that unleashing autovacuum was not a good idea, but we failed. We always had room, even when almost all CPUs were doing some work.

It's interesting.

Maybe I'm wrong here.

It's an interesting exercise to see where the limit is.

Michael: That doesn't sound like a failure at all.

That sounds like a successful experiment where you learned something really important.

Nikolay: Well, there is always a bottleneck somewhere, right?

If we cannot saturate our disks, it means that probably we spend some time in code, loading the CPU more than we could, I don't know.

It's interesting.

I'm just trying to say that on modern disks you probably should almost unleash it to move faster, especially if you have a lot of cores.

Michael: There were a couple more things.

I think partitioning actually might be worth discussing briefly, probably not in depth, but I think a lot of people think partitioning is gonna help them with performance.

And my experience is different.

My experience is that the main argument for it is all around maintenance.

It's all around being able to delete...

I'm getting big thumbs up here.

So yeah, go on.

Nikolay: Yes.

And no.

So I agree that B-tree is great.

The height of a B-tree grows very slowly.

So whether we have a hundred gigabytes or 10 terabytes, the difference is not that big.

We just need a few additional buffers to be checked to perform an index scan, right?

Or an index-only scan, if we talk about that.

But the problem is definitely related to maintenance, because if we have a 10-terabyte table, vacuum takes a day, maybe, depending on the speed and power of your disks, CPU, and so on.

And that's a problem in itself.
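The point about B-tree height growing slowly is a logarithm in disguise, and a few lines of Python make it concrete. This is a rough sketch, not a description of the Postgres B-tree implementation. The fanout of 300 entries per page is an illustrative assumption (the real number depends on key size and fill), and the row counts stand in for "a table a hundred times larger".

```python
def btree_levels(index_entries, fanout=300):
    """Levels needed so that fanout ** levels covers all index entries."""
    levels, capacity = 1, fanout
    while capacity < index_entries:
        capacity *= fanout
        levels += 1
    return levels


small = btree_levels(1_000_000_000)        # say, a table of a billion rows
large = btree_levels(100_000_000_000)      # a table a hundred times larger
print(small, "levels vs", large, "levels")
```

Under these assumptions a hundredfold increase in rows adds a single level, which matches the remark that only a few additional buffers need to be checked per index scan. Lookup cost barely moves, while the time to vacuum or rebuild grows with the full size.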

Michael: A partitioned table, right?

Nikolay: No, not partitioned, of course.

I'm trying to explain how partitioning can be connected to vacuuming, and how partitioning can help vacuuming.

Of course, if a table is not partitioned, the problem is that autovacuum cannot even process the indexes of a table in parallel yet. Maybe it will be implemented in the future, because regular vacuum can do it. But even if it processed the indexes in parallel, for example, you have 10 indexes and you process them using 10 CPUs, 10 processes...

Right.

...the heap itself is hard to parallelize. I think it's possible, but it's a complex task. Right now it's not possible.

So if you have a huge table, the process of vacuuming it with all its indexes will be single-threaded under autovacuum.

But if you partition it, you can benefit from having many autovacuum workers.

Of course, only if you tuned it from the default three to a bigger number, and you should on larger systems.

So this greatly improves the speed of vacuum.

But not only this, of course. If your table is partitioned, you can reindex and create or recreate indexes much faster. The state of the caches also improves, because data is localized much better.

New data is present in pages with mostly new data, and some pages hold only old data.

And if the tuples in some page are not changed, the page is all-frozen, all-visible.

Autovacuum skips it, right?

It's so good.

And maybe it's not even present in cache.

So you have more space in your buffer pool and page cache for newer data.

So cache efficiency increases as well.

Right?

And this directly improves performance. But also, one topic I only recently realized is that indexing and reindexing affect vacuum.

If creation or recreation of some index takes hours, during this period vacuum cannot delete freshly dead tuples, tuples which became dead only recently.

Right?

And this is a problem, because indexing and reindexing hold the xmin horizon.

Michael: Even if it's...

Nikolay: Yes, concurrently as well.

There was an attempt to optimize this in Postgres 14 for CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY.

And it was so great: CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY didn't hold the xmin horizon.

So vacuum could clean all the tuples which became dead recently.

Great.

But recently, in June, Postgres 14.4 was released once it was understood that an immediate bug fix was needed, and this functionality was reverted in Postgres 14.

Unfortunately.

So the rule of thumb: you don't want index creation to take very long.

So you need partitioning for faster vacuuming, to run it in parallel, and to avoid bloat. We didn't discuss what bloat is, right?

Bloat happens when you accumulate too many dead tuples, right?

And then vacuum deletes a lot of them at once.

And you have a lot of free space inside pages.

Michael: Yes.

And I think we probably should talk about it, because we're on the topic of indexes.

We've talked mostly about table bloat so far, so dead row versions, but there's also index bloat, which is subtly different, in my opinion, because you've got the same entries...

Freeing them up doesn't do as much good as in the table, in the heap. In the heap, they can be reused very easily.

Another row version can be inserted there.

It doesn't matter which. But in an index, in a B-tree index for example, adding more rows can split pages, and vacuum won't unsplit the pages.

So even if you vacuum very aggressively, you...

Nikolay: Rebalance.

Michael: It does.

Yeah, exactly.

You won't get it back to the same size as it was at first.

And therefore you might need to reindex from time to time if you want to remove index bloat as well, or you can stay as on top of it as you can.

And there are some good optimizations for this in Postgres 13 and, I believe, maybe 14.

So yeah, it's just something that people sometimes don't realize.

Nikolay: Yeah, we talked about it, about both B-tree optimizations.

I think they started in Postgres 12, or even in 13.

And we discussed it on the Postgres TV channel, including with Peter Geoghegan.

It's so great that this optimization happened. It's very fundamental, affecting every Postgres setup. Deduplication and so on, so good.

I agree. But I think bloat is sometimes useful, both in the heap and in indexes. In the heap it can help to have heap-only tuple (HOT) updates more often.

Those are more optimized, without the need to touch indexes. But in indexes, I think it's also useful.

Otherwise, why do we have a default fillfactor of 90 in indexes, not a hundred, right?

I'm not an expert in indexes.

We have other experts, like Peter Geoghegan and others, for example. And by the way, indexes are a great set of topics, right?

We should talk about them at some point. But related to vacuum, I think you're right: over time, index health degrades anyway. Even in the latest versions of Postgres, where the situation is much better and bloat growth has decreased, we still need to perform so-called index maintenance and rebuild indexes from time to time.

Even if our autovacuum is well tuned and we have the latest version, we still need to rebuild indexes.

And when we rebuild indexes, we want it to be very fast because of the xmin horizon, because we affect all tables, by the way.

So if we build one index and it takes many hours, all tables and indexes in our database are affected.

So vacuum cannot clean freshly dead tuples, and the index entries pointing to them, in all tables and all indexes.

So it's a huge global problem.

So you want indexing to be fast.
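The effect of a held xmin horizon can be shown with a very small model. This Python sketch is a deliberate simplification with a hypothetical function name: it treats transaction IDs as plain increasing integers, ignores wraparound and per-database details, and says a dead tuple is removable only if the transaction that deleted it is older than the oldest snapshot still in use.

```python
def removable_dead_tuples(dead_tuple_xmaxes, running_snapshot_xmins, next_xid):
    """Return the xmax values of dead tuples that vacuum could clean right now.

    The horizon is the oldest xmin among running snapshots; with nothing
    running it is simply the next transaction ID to be assigned.
    """
    horizon = min(running_snapshot_xmins, default=next_xid)
    return [xmax for xmax in dead_tuple_xmaxes if xmax < horizon]


dead = [100, 500, 900]   # tuples deleted by transactions 100, 500 and 900

# A long index build started around transaction 200 and is still running.
print(removable_dead_tuples(dead, running_snapshot_xmins=[200], next_xid=1000))

# Once it finishes, everything becomes removable.
print(removable_dead_tuples(dead, running_snapshot_xmins=[], next_xid=1000))
```

In the first call only the oldest dead tuple qualifies. The same horizon applies to every table in the model, which is the "global problem" described above: one slow index build delays cleanup everywhere, not just on the table being indexed.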

Michael: Yeah, super interesting.

I had one last thing that I thought we couldn't finish this episode without talking about briefly, even if it's just a public service announcement.

There is a parameter for vacuum, and I think it should really be called something different, but it's called VACUUM FULL, and you probably never want it.

I could imagine people very happily using Postgres for decades and never using it.

But I just wanted to mention it, because I think it can trip people up. The name suggests that it's gonna do a very comprehensive version of vacuum, but because of the locks it takes, it's...

Nikolay: I use it all the time.

I use it all the time.

Michael: Go on.

Well, for...

Nikolay: Really, I...

Michael: Are you joking?

Nikolay: Well.

So I use it all the time because it's related to how we estimate how much bloat we have.

You know that bloat estimation is not a simple task.

Everyone has various versions of estimation scripts.

They are lightweight, but they can be very wrong.

For example, I can show you a case where the script says we have 40% bloat, but bloat is zero because the table was just created.

It's easy to do.

It's related to alignment padding.

So if you want to understand real bloat numbers... There is the pgstattuple extension, but I had issues with using it some time ago.

So since we are big fans of real experimenting and using clones, thin clones or thick, it doesn't matter in this case, what do we do?

We just create a clone and promote it to primary.

So it's a real, writable clone, and we VACUUM FULL the whole database or specific tables.

And we compare numbers before and after.

Right.

And we can say with a hundred percent certainty that the current bloat is this.

Michael: And also, presumably, we're talking about test environments here.

We are not talking about...

Nikolay: Of course. A detached instance, not production. It's not really test, because it's production data, but it's as close to production as possible, to get a real measurement of bloat.

This is a very simple, straightforward, brute-force approach.

Useful.
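The before-and-after comparison reduces to one line of arithmetic. The sketch below assumes the two sizes were measured on the clone, for example with `pg_total_relation_size` before and after VACUUM FULL. The function name and the sample numbers are made up for illustration.

```python
def bloat_percent(size_before_bytes, size_after_bytes):
    """Share of the original size that VACUUM FULL reclaimed, in percent."""
    if size_before_bytes <= 0:
        raise ValueError("size_before_bytes must be positive")
    reclaimed = size_before_bytes - size_after_bytes
    return 100.0 * reclaimed / size_before_bytes


GIB = 1024 ** 3
# Hypothetical measurement: 100 GiB before the rewrite, 60 GiB after.
print(f"{bloat_percent(100 * GIB, 60 * GIB):.1f}% bloat")
```

Because the rewritten table still contains alignment padding, this number reflects only space that a rewrite can actually give back, which is the reason it can differ sharply from what lightweight estimation scripts report.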

Michael: Yes, okay.

I agree.

I was thinking purely of the production use case, but you are a hundred percent right, as usual.

Is there anything else you wanted to make sure we talked about today?

Nikolay: No, let's make a summary, maybe some recommendations. Understand MVCC, of course.

Right.

Just read about it.

There are many good materials around. Then tune autovacuum.

And then there is a separate topic of monitoring, maybe next time, right?

Also recreate indexes from time to time, probably in an automated fashion.

It's not a simple task and also worth a separate discussion. And maybe use some tools like pg_repack or pg_squeeze to deal with bloat in the tables themselves.

Right.

What else?

Ah, partitioning.

There's a rule of thumb. It's an empirical rule, not based on some logic, but on the experience of many people: if a table grows over, or has a chance to grow over, a hundred gigabytes, it should be partitioned.

Michael: Yeah, I like this.

And I think you had one more rule, didn't you?

So, if you're thinking about a hundred gigabytes, think about partitioning.

If you're thinking about a terabyte, think about sharding.

Was that your rule?

Nikolay: Sharding.

Yeah.

It was, yeah. It was in a discussion too.

It's not my rule.

It started in different places and then...

Michael: Sounds good.

It's easy to remember.

Wonderful.

Well, thank you.

Thanks again to everybody who's been sharing this on Twitter and wherever else.

We really appreciate it.

And yeah, looking forward to talking to you again next week.

Nikolay: Thank you.

Bye.

Creators and Guests

Host

Michael Christofides

Founder of pgMustard

Host

Nikolay Samokhvalov

Founder of Postgres AI
