Postgres FM | Transcript: PostgreSQL 15

October 14, 2022 • 30 Minutes

PostgreSQL 15

Nikolay and Michael discuss PostgreSQL 15 which was released yesterday! We go through our favourite features and some other ones that caught our eye.

Michael: Hello and welcome to Postgres FM,
a weekly show about all things PostgreSQL.

I'm Michael, founder of pgMustard, and this
is my cohost Nikolay, founder of Postgres AI.

Hey, Nikolay, what are we talking about today?

Nikolay: Hi Michael.

This is episode number 15, right?

Michael: It is, and because we are geniuses are, we plan this
really far ahead and we are we gonna be talking about Postgres 15?

Nikolay: Because this week it's released . I
think it's already released yesterday.

Right.

Michael: Yeah, exactly the way we, we record beforehand.

So fingers cross the release went well and , if so, it came out yesterday.

So thanks to everybody involved in that.

And sorry in advance if, if anything went wrong.

Nikolay: Right.

and also I should congratulate you and myself
because we didn't skip any weeks 15 weeks in a row.

It's a big.

Big achievement.

Right.

And Thank you all, for great feedback.

Again.

We received a very good feedback.

Thank you.

And requests also, we listen to requests.

Requests exceed our capabilities.

Definitely.

But we will try to catch up.

Please continue.

Michael: Yeah, absolutely.

I think it would've been easy to skip a week
if we weren't getting lots of nice comments.

So thank you everybody.

Nikolay: Okay.

August 15.

What's your favorite feature?

Michael: straight to it.

you know, I'm a performance fan in general, but I have a few reasons
for picking the performance improvements to sorts specifically.

Nikolay: Uh, there are multiple ones.

Michael: yeah, there are lots, and I, I know they are separate
features, but when you consider them together as a group, I think

they're so powerful, mostly because, Anybody who upgrades will
benefit from them without having to change anything on their side.

Nikolay: Right?

And everyone does it.

Any project has order by, right?

Michael: exactly,

Nikolay: I think so.

99 point 99% have it

Michael: Even.

Yeah, exactly.

And I think it's used in other cases as well, right?

Like I see query plans with sorts in them that don't, you
know, they might be other related to other operations as well.

Nikolay: Right.

The question is how much improvement can be
not noticeable, but like, I, I don't know.

I haven't seen details, haven't tried myself, but I, I do see
many mentioning of whether by small improvements here and there,

like with this indexes and other things I, I noticed it as well.

Yeah, so Order buy was,

Michael: Yeah, and there's a really good blog post by David Rowley,
or rarely, I'm not sure, Sorry that I've definitely got that wrong.

At least once.

On the Microsoft blog that I can include obviously benchmarks are tricky, but
has some benchmarks on each of them, and there's some decent wins in there.

Nikolay: Oh.

So a whole blog post only about uh, sorting improvements.

Michael: in Postgres 15 yep.

Nikolay: By the way, I don't like the word
sorting, and this is official in SQL and p scale.

Release notes use it as well, but sorting sometimes like.

Normal people, not engineers.

They, think about it like, okay, this goes here, this goes there, Right?

Not, not the changing order, you know, this problem, right?

But Ordering.

is much better in, my head than sorting.

But sorting, this is what we have in source code and everywhere and so on.

So yeah, one of the features there, it's
a bunch, bunch of improvements, right?

And one of improvements I would like to notice, is Improvement
of performance, of sorting or ordering when work MA has exceeded.

This is interesting, right?

If I'm not mistaken from Peter, it was his name was
mentioned there, but I would like to test this one.

I'm not sure how much it was improved, but it's definitely
sounds interesting because workman sometimes is not enough.

Like we have files and so on, and this is.

Michael: Yeah, there were multiple, there was, yeah, there were
improvements to on disc sorts, so that's what we're talking about here.

There were improve improvements to in memory sorts,
and there were improvements to the amount of memory

needed for sorting, especially certain data types.

So, Really common data types have had specific optimizations put in for them.

And that's important because it means some sorts that previous, even
if you don't change your work, me setting some sorts that previously

would've spilled to disk will now be able to happen in memory.

So that be an extra performance boost around that threshold.

So yeah, so many improvements that hopefully in
combination will will help people without them noticing.

Not necessarily without noticing, but hopefully upgrading, like always,
hopefully upgrading will give you a performance boost right out of the gate.

Nikolay: Yeah.

Worth testing definitely.

And checking.

Interesting,

Michael: how about you then?

What's your, your favorite feature?

Nikolay: Very small one.

A small feature prob by the way, also,
Peter , as I remember, was involved there.

It was collation control.

Like Postgres will I don't know details obviously, but I
know the problem very well when we upgrade operational.

glibc gse silent, we talked about it in previous episodes might happen, right?

And usually it's, it's usually happens if you upgrade, for
example, from Pogo from Ubuntu 1804 to 2204, for example.

And the question is, is a dangerous sub upgrade of glibc or it's not?

And more often.

Then I would like to, to have it, It's, it's quite dangerous.

So we, you can have some indexes corrupted silently and nobody will tell you.

So this is like a field of minds.

you can step into it and uh, after great.

Also you don't see any problems.

But after a couple of days, your users started to complain.

Some queries don't work as expected, and this is obvious sign of corruption.

So you should test it.

We can check.

so now in Postgres 15, the tool.

That actual version it's not what database expect,
and it's controlled on database level, as I know.

So , it's good at least to have immediate error or message.

Michael: yeah.

So is that a log message?

How does it report it?

Nikolay: I don't know actually.

So , I just, I was, I had just saw that this problem was
addressed, at least somehow, in my opinion, it should be solved.

Like posi should care and posi should know which glibc version was used
when table was created at the base was created, and now it's there.

So it knows and it complains about difference.

How it complains.

I have no idea, unfortunately.

Sorry.

We, we will see unfortunately, In systems I deal with, we will see only
in like three or so years, because like, it'll take time to upgrade big

systems, but smaller systems, it's good that it'll be there very fast.

And it's quite common when you, for example, copy operation.

System upgrade is one of cases.

So you can also, for example, take your PPG data and bring to different.

and without noticing that that GIPSY version has changed.

Or for example, you run pogin containers, p data was
created using one Gipsy version, but in container, you,

you have different gypsy version and also have problems.

So finally, since POGS 15, we will have visibility.

To this issue.

And this is very important thing, I think.

Like it's, it's like, it feels quite small, but it's so painful
to not to have it great at posts 15, finally has it also merch.

Of course.

This is big, right?

I, I have checked the history of merch.

Can you guess when the talks about it started in pos, in the POS project.

Michael: I can cheat because I saw a talk by Simon Riggs at
Progress London, and I think he was involved from quite early on.

But yeah, so it is a long, even though, even though I was told a few months
ago, I'm, I'm still probably gonna under guess let's say six years ago,

Nikolay: 2005.

Michael: so yeah, 16 years ago.

Nikolay: Yeah.

16 or 17.

Oh, roughly.

Right.

And it was reverted in 2018 in Pog, August 11 yeah, it it had issues.

So Simon Rick committed it, and then he needed to revert it, unfortunately.

And this was a big, like, disappointment moment for Postgres 11.

I remember it quite well.

so we lived with upsert but merch is much more power.

it has like conditions.

It has, it has ability to delete instead of just a update or insert.

So it's like, and it's also Standard com compliant which is very important.

And it also Oracle SQL Server Digital, like big digital,
like, okay, two big databases, Oracle and SQL Server.

They support it.

So if you migrate from there, it's like one of very common points of pain
, when you need to rewrite your queries now it'll be much more convenient

Michael: Yeah, exactly.

Less work to do.

A big, like, everything we can do to make those migrations
from things like Oracle, less work overall mean the balance of.

Nikolay: benefit from kind from the project, right?

Michael: Exactly the cost, benefit ratio keeps going in our favor.

So yeah, thanks to everyone who's worked on that for so long.

Nikolay: Yeah.

I, I was surprised at it, like suddenly Okay.

It, it's committed again.

This time of obviously quality is, great.

So is going to stay and it was a big surprise so many years, right?

It's probably, the longest feature in development.

Merge

Michael: so yes.

And then the, the other thing you mentioned there was SQL Standard complaints,
and we've, we talked even quite recently, I think a couple of episodes

ago on why we like how important that is for people choosing Postgres.

So everything you

Nikolay: mention the other standard compliant
feature that probably is repeating the path of merge

because it was reverted after first be or second be.

I don't remember exactly.

Right.

SQL J part of SQL Standard.

It was reverted both from 15 already after beta, right.

and from 16 development branch, it was also
reverted for some time until it polished.

And it was probably the biggest disappointment of the 15 release.

Michael: Yeah, absolutely.

But when I read the, a lot of the things we've praised progress
for recently are things like how how high the quality bar is

and how strict the release process is and things like that.

And it seemed like.

There were really good reasons for not committing it, and that overall we
are probably better off with it coming back at a later date in a better state

Nikolay: Right.

Some, some other feature was reverted, but I don't remember which
one with smaller feature, but also was reverted in this release.

So already after first better couple of revert actions happened.

So interesting observation.

Michael: it's good to see things happening in the beta phase though.

It means people are trying it, people are, you know,
looking at each other's patches and just making sure this

is being held to the, the same standard across the board.

I, I really appreciate it as somebody who mostly relies on Postgres
is reliability and performance of course, but mostly reliability.

I think a lot of the community.

Is here because it just works and features
that, you know, features that go in.

And then you can end up with weird things.

Like if you look at the data types j I know it's not, related, but
data type Jason and then data type type Jason b we, we are forever

having to tell people about Jason B because of Jason being done first.

And I, I'm wondering like if, if.

If that's the kind of thing that maybe wouldn't have happened in
the current, current way of releasing things, but I'm not sure.

Nikolay: Well, between them, there are a couple of years of development, so.

Know, Right?

Yeah.

So what's next?

What's the topic?

to discuss?

Michael: Well, we have a couple of like, there are a couple of other things
listed in the top line features I'd be interested in your opinion on.

There's some improvements to logical, some
logical replication improvements and some.

Nikolay: bunch of improvements.

Right.

Michael: That seems to be something that's
getting better and better each major release.

It's not something I use myself, so I haven't read them in detail,

Nikolay: funny effect.

I'm, I'm using it, but not intensively on big production systems

yet because of issue with it.

And I see observing improvements during last couple of years, like
much more active improvements compared to several previous years.

I'm excited about it because it feels like soon
we will have much better logical replication.

Much, much better.

So, big systems, for example, those who generate more than one
terabyte of wall data per day, or having like dozens of thousands of

TPS size like terabytes or dozens of terabytes, the maintenance of
logical replication will be not such painful as it it is right now.

And I'm not going to describe all of the features.

It's a bunch of good improvements and features.

I will mention only couple of them, for example, you now, you,
you'll be able to skip some actions from the stream of change.

And because if somehow on recipient side, on, subscriber side
for example, conflict occurs, unique key violation or lack

of something because of foreign key violation or something,
usually, it means that's it for your logical duplication.

You need to start from scratch or, to fix somehow
to, to get rid of unique key or something.

It's, it's not good, but right now there, there will be ability to skip.

Some record in this stream and you can continue and understand why
this happened and fix it later but, the big goal number one, is to

continue because if you have a lot of changes, you need to continue
applying changes and losing just one changes is less problematic than.

Being stuck and not apply changes at all.

So this is a quite interesting feature and some
other features are also related to performance and.

and I would like to mention Thatit Cap,
who participated in many of these improvements.

He gave a talk in our Positive TV Open Talk series a few months ago.

So go to Positive tv.

It's a YouTube channel or you're already here, right?

If you watch us with our faces, not only on
podcast version so, and just listen to that talk.

It was like from firsthand a lot of insight, ideas, thoughts, and
observations, How both about Positive 15 and future versions as well.

So this, this is better to listen from
there instead of just listening to us here.

Okay.

Let's sit with logical.

What's next?

What do you think?

Michael: We have a couple of other ones.

There's more compression options.

For example,

Nikolay: Yeah.

Michael: backup.

Yeah.

Nikolay: Yeah, PG Facebook is a way to create
a, I call it thick clone, like regular clone.

Like you copy, if you have terabyte, you will copy this terabyte to
different place on the same disc or different disc, maybe different server.

And of course, compression is good to have because
we, we have a lot of CPU power in many cases.

But this can network maybe.

Worse bottleneck than cpu.

So compressing everything and sending less
is can be beneficial in terms of time, right?

So I easily can see how we can win in many places in our daily operations.

So, DBA operations.

So this, I'm glad this appeared in post.

Michael: Yeah, it's really cool.

I think we've seen a few compression related features in the
last couple of versions, so it feels like there's probably

a few people pushing those, so thank you to them as well.

Nikolay: Right.

And also wall compression can be controlled.

I mean, in wall there is no such thing as wall compression.

There is you, if you enable wall compression, you basically
enable, enable not everything, but only for full page, right.

The wall is recorded in two types.

if you change some row in a table, it's recorded as a change, but
if full page rights are enabled and they're enabled by default

and should be enabled to avoid corruption in many cases.

First change after checkpoint is recorded
as like full page eight kilobytes KB bytes.

And if compression is not, You spend the
eight KPIs, if it's enabled, you spend.

And I recommend everyone to enable this whole compression.

However, I saw in Twitter, somebody complained CPU usage,
some, some queries degraded after enabling compression in my

own experience dealing with large systems, heavily loaded,
all compression was always beneficial and only benefits.

We are.

Significant benefit.

So we, we write much less in wall.

So now, as I remember, positive 15 will
allow you to control the compression type.

because usually it's like, it was quite lightweight compression.

Maybe you want to compress more heavily so now you can tune it.

Additionally, it's interesting.

Yeah.

So this, this is what means.

Wall compression only full page race
first, change in the page after checkpoint.

Subsequent changes until next checkpoint will be
recorded individually only the data, what was changed.

Michael: Okay, I'm with you now.

That makes sense.

Nikolay: Right, Right.

So like more fine tuning for for in hands of DBAs.

This is good,

Michael: Yeah.

And the last one made the.

Kind of a major features list, at least in the release.

The draft release notes was Jason format for Lux,

Nikolay: Oh, before, before we go there yeah I also,
have you noticed this is what is interesting item.

I'm looking at it right now.

I had support for writing wall using direct IO on macOS from Thomas Munro.

This is interesting, usually POS has no anything with direct io.

Some systems have my SQL Oracle, they allowed.

So now only on my course, it's.

with additional conditions like Max VCE zero and, and world level is minimal.

So this is, this looks like of experimental
thing and only on Makos, which is funny, right?

Who?

Who runs production?

Pogs on Makos.

Michael: No, but then again, I guess sometimes people do be, Yeah, I guess
sometimes people do look at performance things On their local machine.

And that's an interesting case, maybe a problem.

But the, the other thing that I've heard of is people
wanting to test the, the processes, the MAC processes as, you

know, potentially interesting to run Postgres workloads on.

Yeah.

Not in production yet, but if they, you know, the M one, M two processes, if
they're really good, then I wonder like what kind of performance we could.

Database

Nikolay: Yeah, for, Well, it's interesting, like I, I
would definitely spend some time benchmarking it and just

to understand what kind of benefits we can have here.

But my opinion, like this is experimental.

Some small move.

I didn't see discussions unfortunately in hackers about it.

But I think.

Something will happen in future, my gut tells me, right?

Because not only on my course, on Linux as well
in this area, and this is, this is interesting.

So yeah, this is it about wall.

Some, some improvements based back up also
happened, some more control and so on.

Like this is also good.

And exclusive Bcca mold is killed.

So no more exclusive mold.

nobody was using it for several years already.

it was default and I remember some confusion, but
now just we forget some cleanup work happened here.

I think that's it about backups.

Let's talk about some develop developer stuff.

What else?

, we discussed, merge, We discussed revert at sql.

Jason, we discussed some sorting or ordering optimizations with es.

There is some work continued related to duplicates.

Remember dation and the improvements in B3 in
post 13 and I think in 12 and 14, definitely.

from Peter Gagan and others, and now more cases are supported, right?

So tables with toast, their indexes also have improvements.

This will affect as as you, Yeah, as you said, this
will affect everyone if you have a quite big table.

So if table is toasted, meaning that you have
records, rough, very roughly more than two kilobytes.

Michael: Yeah.

Nikolay: example, Jason text, or okay.

Right.

Well, there are, sometimes we have small J
values, but, but often we have quite large ones.

And in this case, index, if this table is indexed with
B three Now this benefits, previous releases, introduc.

In terms of index size and how degradation happens when you update it.

So.

Below growth will slow down and so on.

and this is good.

So like, probably it's finalizing the work in this area.

Maybe, I'm not sure.

Maybe there is something else.

But it's feels like what was done before.

Now we have full coverage of case of cases also.

Now remember this, like in unique case we can.

all nus are the.

Which is not what books teach us now should be dis like now equals
now no nows are dis distinguished usually because now is unknown.

So we, if we compare nows, usually the result was, no, they are not the same.

But now the is ability to say they are all the same.

And we have only one single knowledge in our universe in terms of unique keys.

And Yeah.

In some cases it's useful.

Michael: There's a blog post by Ryan Lambert on RustProof Labs about that.

I'd forgotten that was in 15.

Just on the development front, I've just been looking at a few of Peter's
other commits and I'd forgotten that hash_mem_multiplier has been increased.

So this was, I think, introduced in 14.

So anybody running 14 might be interested in this as well.

It's a multiplier to work_mem that can let you raise the amount
of memory available for hashes but without raising it for sorts.

So you could say I want there to be 16 megabytes available for
workmen, but I want there to be 32 or 64 available for uh, hashes.

So you could set multiplier to two or four.

Nikolay: Interesting.

How to make this decision like how can,
we can make the decision with numbers.

We need some proper analysis before we do it.

Right.

This is interesting.

Michael: It's super interesting that the, the default's being changed.

I think that's a really, that's something that Postgres
generally shys away from doing, and I'm really impressed that

Nikolay: You know, my opinion about defaults, right?

Michael: no, I don't think I do.

Nikolay: my opinion, a lot of defaults are absolutely outdated.

It should be changed.

We should care about modern service or SSDs and so on.

And in this context, I will pull us back.

to operational side and log Check Point is on now.

Log check points.

It was off by default and it.

Terrible state because checkpoint data you always wanted.

Right?

And this discussion was like, discussion was like, we don't want to
generate a lot of logs because, you know, like sometimes we have small

machine with small disk and we don't want to fill it with logging.

But chip point data is so useful to understand what's happening for DBAs.

it should be.

And this is small win.

I think this is good.

I'm in the camp of, let's make defaults much more modern,
up to date and on for checkpoints, definitely a win.

The other one log mean auto vacuum duration was also changed to 10
minutes, and I think it's only partial when I would change it to.

One second, or I, I personally use sometimes zero.

Of course, s a lot of logs, but also useful for analysis.

And even if uh, do Icom took half of second,
you have interesting data to analyze you.

You see a lot of interesting stuff.

But of course in some, have a lot of the systems
that will produce a lot of logging volumes.

Michael: Yeah, the, the line in the release notes for this
is fascinating and I think Shows that maybe you have a slight

difference of ideology with whoever's making these decisions.

Cause it, it says this will cause even an idle server to
generate some log output, which might cause problems on

resource constrainted servers without log file rotation.

So the question is, why, why are we optimizing for idle servers
without log file rotation versus a lot of people running those things.

Nikolay: Yeah, this is, I think this note is be, is about logic
points because we have Check point time out and they happen.

By default, five minutes or also not good default as well.

But anyway.

And I think it's, they just want to avoid the situation that when you install
pores, go the, the usual I've been in about pogo from like wide audience.

from long, long ago, it's hard to start.

But once you install it, it's just working and you simply forget about it.

Right.

And Of course, it's, it makes sense to think about
it's not good if you installed it and it stopped,

working after a year or two suddenly because of logs.

It may happen, of course, yes.

But log rotation is, it should be enabled, but find a fact.

A couple of days ago we had a system where
log rotation, quite important system in our.

Infrastructure where ation was disabled and
we had the zero disc space, free disc space.

So it happens, but still log check point on.

Michael: One thing on that note is that I guess the world is changing
quite a lot and, and a lot of people, a lot of the time these defaults

don't matter as much because more and more, like, I think I saw a
survey not that long ago that suggested it might even be close to 50%

Now of instances are running on managed services on care provider.

Nikolay: their own set of defaults.

Michael: And they can change that.

They can set the defaults for you and they can do
these things like logging like log rotation for you.

Exactly.

So I think some of these things,

Nikolay: but there this problem is solved already.

I like, I think, I'm not sure, but I, I'm almost sure
that on RRGs luxury points are wrong by default for log.

Maybe I'm wrong, maybe I'm wrong, but it should be solve, This is

very important formation.

Michael: Yeah.

while we're on this topic, I think the other one that makes sense.

You said for modern systems, the first one that I
thought of was random page cost, and that's still being.

Nikolay: random patch cost should be very close to sec patch cost.

If you use ssd, definitely.

Or if your database is below your ram, also different like then.

Yeah.

So because it means that if, if, if you are in fully cashed
state, it means that sequential random, doesn't matter.

Like similar.

So they should, the cost should, should be close or equal.

But speak back to logging.

Many things can be, should be changed,
reconsidered for example uh, log, um, log logging.

You also should have it log bits off by default or, and many things.

Some, some people for example enable logging of connection
disconnections as well, but it can spam your logs, definitely.

so.

Michael: But yeah, based on, based on this philosophy, an idle
server wouldn't generate loads of logs for either of those.

So I could see, I could see people being more
open to that than maybe some of the other ones.

Anyway, it seems like pro

Nikolay: Depends.

Michael: even if

Nikolay: yes.

Michael: we wanted.

It's so much progress.

Nikolay: I hope this the consideration of
defaults will continue in the right direction,

Michael: Yeah.

Nikolay: but 10 minutes for auto Icom is not enough.

What?

10 minutes.

Okay.

Michael: The other thing I'm looking forward to this year is
seeing how fast each of the cloud providers release new versions.

I think we've seen some some of the newer ones.

Some of them were very fast last year.

Yeah, exactly.

So I think Crunchy Bridge were pretty much the
same day, if not a day or two after the release.

Last year they had a version of 14 available.

Nikolay: Mm-hmm.

Michael: Microsoft, one of their services was,
I think rds still take a little bit of time but

Nikolay: it's a,

Michael: relatively

Nikolay: I, I saw some science of improvement in terms of speed, but yeah,
also Google Interesting Google Cloud which started to improve in terms of.

Pogs suddenly because a few years ago, I, I was, I was saying
don't use Google Cloud there are managed pogs right now.

Things are changing.

They have interesting things.

Again, we had, we had the guest on pogs tv.

Ha.

Crossing.

former Skype pogo architect, and it was great.

Talk about

Michael: Yeah, it really was.

Yeah.

And we're,

I'm guilty of being a Google Cloud Postgres care user as
well, and they, they are doing some really cool things.

They did release 14 quite quickly as well,

Nikolay: Mm-hmm.

.so things are improving.

I think it's competi.

Michael: Yeah,

absolutely.

And people wanting it, which is good news,

people, users asking for it.

So yeah, encourage everybody to, to check out 15 as and when they can
and to keep upgrading, especially I guess anybody on version 10 or

before should definitely be thinking about upgrading as soon as possible.

Cause that will stop

Nikolay: Oh, 10 is 10 is 10 is, That's it.

For 11.

One year left.

Michael: Yeah.

And even, regardless of security patches and things,
just the number of performance improvements you

could get by upgrading is is worth checking out them.

Yeah, exactly.

As well as all of these features we've been talking about.

Nikolay: Great.

Some partitioning improvements we haven't
mentioned some other things like for, from data.

Okay.

A lot of, a lot of more improvements are there, so it's good.

It.

Michael: So many hundreds.

In fact, actually, another website that I will link up is a good one

Nikolay: upgrade d

Michael: Yes.

Nikolay: com?

Why upgrade?

Yeah.

I, I always use it to show people what they are missing.

Michael: Yeah, it's really good at showing security patches.

You're missing in

isn't it?

And

Nikolay: red.

Red.

Michael: yeah, and additionally, it's, it's got a nice search feature.

So if you're wondering about any changes in the last few major
versions to a feature you really care about, you can search for that

feature by name and see all of the commit messages related to that.

Nikolay: Mm-hmm.

. Mm-hmm.

Michael: It's cool.

Nikolay: Great.

Michael: Wonderful.

Any last things?

Nikolay: Yeah, usually reminder to subscribe, like, and provide more feedback.

We like it.

Thank you.

Thank you

Michael: Yeah, thanks.

It keeps us going and we really appreciate it.

So yeah.

Have a good one, people, and see you next week.

Nikolay: Bye

Creators and Guests

Host

Michael Christofides

Founder of pgMustard

Host

Nikolay Samokhvalov

Founder of Postgres AI

PostgreSQL 15

Creators and Guests

Some kind things our listeners have said