Slow queries and slow transactions
Hello, and welcome to Postgres FM, episode number one.
Pilot episode.
Pilot episode, yeah.
I'm Michael, I'm from pgMustard, and I'm joined today by Nikolay.
Samokhvalov. Don't try to pronounce my last name unless you have Russian, Ukrainian, or Belarusian roots.
So yes, I'm Nikolay from Postgres AI.
Wonderful, looking forward to these.
Let's discuss why we're doing this. I've felt for a long time that we should have a pure podcast, without any screencasting or slides and so on, because it's also a good format for discussion.
And I feel like meetups have died. Of course, they started to die before COVID, but COVID finished them off completely.
It's very hard to meet at some place and have pizza and beer. People still do it, but only about ten people come, so it's not working well. We need more formats.
I know many people hate purely online formats, but I actually love them.
And I think we should experiment with more and more styles and formats.
The Postgres community needs some podcasts.
That's why Michael and I discussed it and decided we should probably start Postgres FM.
Right.
Maybe it will die too, but who knows?
Right.
But I hope not.
Why do you do it?
Yeah.
Well, I love podcasts in general.
I love them for learning; I find them really helpful.
I can listen to them at times when I can't watch a video or attend an event.
I find them really helpful in terms of learning, but also for getting to know people in the community: seeing different guests, hearing interesting conversations between people that I probably wouldn't otherwise be a part of.
So I love them as an insight into something you're not necessarily familiar with already, or a way of keeping up with people you don't talk to all the time.
That's quite a nice format as well.
But yeah, I basically thought it would be interesting.
I thought I would like to listen to this.
Right.
I would like to listen to a couple of people talking about Postgres each week, discussing something that is super interesting, controversial, not yet decided, or a common issue that people see all the time. Something that I've just hit, or might be able to avoid hitting in the future, that kind of thing.
Right, I'm the same.
Sometimes I'm just trying to fill gaps. When I walk my dog or I'm on an airplane, I have some time and I want not just entertainment but something useful, and podcasts are great for filling those kinds of gaps.
Right.
So, yeah.
Okay.
Let's start with the topics we have. In general, we will try to talk more about performance, but maybe not only that. Definitely about Postgres, though.
Yeah, I think always about Postgres.
I think, based on what we focus on, we'll probably end up bringing up performance topics more than most would, but equally we're open to suggestions from other people about what you'd be interested in hearing.
So yeah, we're very open on the topics front.
My opinion about performance: it was a surprise to me, but not everyone is interested in performance.
Not every engineer who writes SQL is interested in performance.
This is a discovery I've made.
To me it feels like the most interesting part: scale, scalability, performance, these kinds of topics. But I found many people are just not interested; they just need some query to work and return proper results, and that's it.
I wish everyone were interested in performance, but anyway, let's start with the first topic.
Well, I'm interested in that one, actually.
Do you find there's a pattern as to who cares and who doesn't care so much?
Or is it that they only care if it's below a certain threshold, or that they care about one thing way more?
Yeah, I think we're moving towards one of the topics we wanted to discuss, and the word "threshold" is a good one here.
What I've observed in organizations, from small startups that grow very fast, where all numbers increase several times per year, to very large organizations with thousands of people and hundreds of engineers, is a quite obvious pattern: the business dictates that feature delivery must be very fast. There are a lot of competitors around, requirements are strict, so we need to move faster.
So developers are mostly interested in having features delivered daily, for example.
Right.
So they move very fast and under pressure.
They don't have time to achieve the best performance ever.
Right.
They just don't have time.
So they work like: okay, it works, let's go.
But the problem is, who defines this threshold?
Where is the minimum, who checks it, and how do you check it?
That's usually a problem.
Sometimes you observe very good performance when you develop the feature, but when you deploy it, it's not good. Or you deploy it and it's also good, but one year later nobody touches this code, it's left unoptimized, you have more data, and the query degrades.
So I have my threshold; I wrote an article about it.
Every webpage should be faster than about 200-300 milliseconds, with one second as an absolute maximum.
And we consider a page or API request, of course, and every page or API request may consist of multiple SQL queries.
It may have zero, but sometimes we have dozens of queries, which is not good. Sometimes we have a loop with queries, which is of course not good.
But in general it means that if we have a requirement for a webpage to take no longer than one second, that translates into a general requirement for each SQL query to take no longer than dozens of milliseconds. For example, a one-second page budget spread across ten sequential queries already leaves at most around 100 milliseconds per query, before counting application and rendering time.
My article also describes where this number comes from: human perception.
It's about 200 milliseconds, the reaction time of any human, plus or minus 15, maybe.
So if you press some button, you expect the reaction below 200 milliseconds, better below 100 milliseconds.
Yeah, I think I've heard that below a hundred milliseconds it feels instantaneous.
Anything above that, we perceive some delay, even if it's small.
Yeah.
So I know exactly where you're coming from; that makes a lot of sense.
So I guess the question is: do you see a lot of companies that generally have performance way worse than that and still don't care?
Of course.
Well, yes, yes, yes.
A lot.
If users don't complain, oftentimes we just don't care.
We have some functionality, it's working, but we have so many things to create, to compete with others, to expand, to grow.
So if users don't complain, sometimes you see that some pages or API requests are above one second in duration and nobody cares.
You can actually see it from the logs; you can check it if you have, for example, log_min_duration_statement set to log everything longer than one second.
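As a minimal sketch of what that might look like (the threshold value here is just an example; pick one that matches your own target):

```sql
-- Log every statement that takes longer than 1 second (illustrative threshold).
ALTER SYSTEM SET log_min_duration_statement = '1s';
SELECT pg_reload_conf();

-- Verify the effective value.
SHOW log_min_duration_statement;
```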
And here we can discuss percentiles, right?
Rather than saying every query should be below one second or below a hundred milliseconds, we say 99% of them should be below that threshold.
Right?
Any big project should talk in these terms. But sometimes we have a lot of slow queries and nobody cares until the big feedback loop kicks in.
The big feedback loop is when a lot of users start complaining and management understands it's a big problem.
Or churn, right?
You hear some users complain, but the ones that churn might not be the ones complaining. They're the ones that are...
Well, yeah.
You know, it's real dollars being affected, or in e-commerce maybe your conversion rate drops on the pages that are slower.
E-commerce knows how to measure every second of downtime, or at least every minute of downtime.
So yes.
And degradation is trickier to measure in dollars, but it's probably possible too.
I think it's possible.
Again, if you check the slow query log in a large company or a large project, you'll usually find a lot of bad things.
And if you have the luxury of getting a few weeks for optimization, in a large project there is always enough to fill them with optimization work.
Right.
But yeah, it's a decision from management: let's optimize and be prepared for further growth.
Yeah.
So I've seen some exceptions. I've read your post and I think it's great, and I think you've probably accounted for them because they fall a little bit into analytical workloads, but some applications almost do analytics as OLTP.
They might be an analytics product, and if they let you set a few filters and it takes a few seconds to load, sometimes that's acceptable to a user.
But those seem like the exceptions, where the user understands that it's doing something complicated or data heavy behind the scenes.
If they're waiting for that on a one-off basis and it's only a few seconds, they seem okay with it.
But yeah, I think those are probably the only exceptions.
Some complex search might be okay.
But usually the question is what will happen in a few years, when we have a lot more data than today.
If today we have 10 terabytes, in a few years we might have 50 terabytes. So what will happen?
If it takes three seconds today, it may already be above 10 seconds in the future.
And of course, people are usually okay to wait a second or two during a difficult search, in my opinion, but it's still not good.
We should optimize.
And with the examples I've seen, you mentioned people scaling and having problems there.
Sometimes these startups bring on a customer who has three or four times more data than all of their other customers combined.
When they're in an early phase, that's not that unusual.
And if they haven't tested, sometimes performance drops off a cliff or something goes badly wrong.
We've seen that a few times.
So yeah, it's super interesting to think about planning ahead, but most companies don't seem to because, as you say, there has to be a focus on feature delivery, and investors need updates and customers want certain things delivered in certain timeframes, that kind of thing.
Yeah, well
So what can we do?
What can we do to help people?
Well, the first thing is to define some thresholds, as you said. Maybe they can be defined in the form of an SLO.
If SREs are involved, they have methodologies to control uptime and so on, and a similar approach can be applied here.
We can say we have good quality if, for example, 99% of all queries are below a hundred milliseconds.
We define it in the form of an SLO, a service level objective, and we start monitoring it.
We start having alerts if it goes down.
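As a rough sketch of such a check, assuming the pg_stat_statements extension is installed and using Postgres 13+ column names; it's only a proxy for a real p99, since pg_stat_statements keeps no per-call histogram:

```sql
-- Share of calls coming from queries whose average execution time is under 100 ms.
-- Not a true percentile, but a cheap first approximation of the SLO.
SELECT round(
         100.0 * sum(calls) FILTER (WHERE mean_exec_time < 100)
               / sum(calls),
         2) AS pct_calls_from_fast_queries
FROM pg_stat_statements;
```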
The second thing: from time to time, at least once per quarter, we perform an analysis of the current situation, and we also try to predict the future, so some capacity planning.
Are we okay with the numbers growing the way we observe, or do we predict our thresholds will be broken already in the next quarter?
This is the usual approach for a growing project.
And then, of course, we should go deeper and perform query analysis and optimization, analyzing the whole workload, but that's a different topic. We should probably discuss it separately.
Yeah, and the slow query log is a great way of getting started, right, with log_min_duration_statement.
If people don't have that turned on, turning it on feels really sensible for pretty much everybody.
I've seen startups that don't do any monitoring yet, no performance monitoring at all, so the idea of even getting the 99th percentile might be a stretch.
But you can just start by logging the worst queries.
Yeah, I've seen this.
What I described is for a larger organization; it already sounds like some bureaucracy is involved, right? Some processes. In a larger organization, a slightly more complex process needs to be established.
But if it's a small startup with only a few engineers, I saw this: a CTO with a couple of engineers, a very small startup, which by the way was very successful and was recently sold to a very large company.
In the beginning, three or four years ago, I saw the CTO having every query that went to the Postgres log, because of log_min_duration_statement set to 500 milliseconds or so, sent to Slack, with an immediate reaction.
So they tried to achieve zero events of this kind in the beginning.
That's fine.
This process doesn't scale well, of course; at some point you will be overwhelmed. But in the beginning it's good: you react to every occurrence of a slow query.
That's great.
Anything else you wanted to cover on this one?
Well, not at this point.
I think that's enough about slow queries.
Maybe we can talk about slow transactions, because it's quite a different topic: related, but different.
Yeah, go for it.
Maybe, like you discussed, there are analytical workloads, but it looks like we mostly discuss OLTP workloads here, meaning web and mobile apps.
So if we talk about transactions, we should understand that there are three levels: query, transaction, and session. To execute a query, you need to have a transaction.
Even if you don't define it, it will be defined automatically for a single query; it's an implicit transaction.
Anyway, I hate it when Ruby developers say "I will run this database migration without a transaction."
It's not possible, right?
And this "disable DDL transaction" wording is wrong: you cannot disable transactions in Postgres. But still, this weird term gets used.
So you need to open a transaction, but you cannot open a transaction if you don't open a session.
Right.
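A tiny illustration of that, with made-up table and column names:

```sql
-- A single statement runs in its own implicit transaction:
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- The same idea spelled out explicitly, now spanning two statements:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
```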
So it's one inside another. And with long transactions there are two problems.
First, if you acquired some exclusive lock and keep it, it's always kept until the very end of the transaction, either commit or rollback.
So you need to finish, otherwise you're still keeping it, and that means you can block others.
The second problem is autovacuum.
If you keep a transaction open, even if it's a read-only transaction, and sometimes even if it's on a replica, on a standby server, if hot_standby_feedback is on, it means that autovacuum cannot delete freshly dead tuples, right?
So we block some autovacuum work, at least partially. But the first problem is the biggest: it can be seen immediately if you keep a long transaction, and you can have a storm of locking issues.
This means that ideally transactions should also be below one second, right?
Oh, interesting.
Right, because otherwise, imagine you acquired a lock at the very beginning of a transaction and someone else is also trying to acquire that lock.
You updated a row and someone else tries to update the same row, and this happens at the beginning of your transaction.
If you keep it open longer than one second, you may block that session for longer than one second, so the threshold from our previous topic will be broken as well.
Right.
So this is a similar thing.
That's why I also say, when you split work into batches, try to find a batch size that will allow you not to exceed one second, roughly.
Exactly because of this, right?
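One possible shape for such batching, as a sketch with hypothetical table and column names; each batch runs as its own short transaction, and the LIMIT is tuned so a batch stays well under a second:

```sql
-- Run repeatedly (from a script or the application), committing between batches,
-- until it reports "UPDATE 0".
WITH batch AS (
  SELECT id
  FROM orders
  WHERE processed_at IS NULL
  ORDER BY id
  LIMIT 1000                  -- tune so each batch finishes well under ~1 second
  FOR UPDATE SKIP LOCKED      -- avoid waiting on rows locked by concurrent work
)
UPDATE orders o
SET processed_at = now()
FROM batch
WHERE o.id = batch.id;
```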
Awesome.
So, going back to the folks that don't have much of this set up at all, what would you recommend they log or monitor?
Yeah, well, every monitoring setup should have monitoring for long transactions.
And I see most monitoring tools fail to do this. They just don't have it.
We open the monitoring and we cannot answer the simple question: what is the longest transaction, and what's its duration right now?
Is it five minutes? Is it one hour?
We are interested in this.
We're not even discussing autovacuum and the xmin horizon here, just this simple question.
And I think this is the first step: you should have it in any monitoring.
If it's missing, you should add it.
Then you should have alerts, again like soft or hard alerts.
If some transaction has been running longer than 10 minutes, that's already not good; it's almost always a bad situation.
Of course, sometimes you should exclude some transactions; for example, vacuum can run longer.
But regular transactions should be under control. In OLTP we cannot allow somebody to just open a transaction and keep it forever.
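For reference, a sketch of the kind of query such a monitoring check could run, excluding autovacuum and other background workers:

```sql
SELECT pid,
       now() - xact_start AS xact_duration,
       state,
       left(query, 60)     AS current_or_last_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND backend_type = 'client backend'   -- skip autovacuum workers etc.
ORDER BY xact_start
LIMIT 10;
```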
Yeah, so I know you're focusing on the first part of this, but the second part does feel worth mentioning in terms of avoiding transaction ID wraparound. I know some small companies shouldn't get anywhere close to it, at least for a long, long time, but we've seen big companies get tripped up by that over the last few years.
And it feels like maybe they might not have been monitoring for this kind of thing, even at these large companies with massive scale.
Right, right.
One of the consequences of keeping a transaction open for a very long time is that you block autovacuum as well, and it cannot delete freshly dead tuples. Sometimes people open a transaction and keep it for several days.
If modifying queries or transactions are coming at a very high rate, it can be a problem as well.
But I observe that usually every monitoring tool already has this, so I'm less concerned here, because I see that, for example, Datadog has it.
Others have also implemented it, exactly because of the very painful experiences at Mailchimp and Sentry several years ago.
It's good that those guys blogged about it, by the way.
That gives visibility.
Yeah.
Blogging is very, very important.
I guess even in my mind I don't have a very good model for the ideal monitoring, because we should distinguish various types of transactions: modifying ones, which acquire locks; whether they have a real transaction ID or not; read-only transactions on a standby; and so on. Different things, right?
So maybe we should have a couple of monitoring views aimed at different purposes: one for locking issues and another for autovacuum issues.
For the autovacuum one, by the way, we could track a different thing: not transaction duration but the xmin horizon, and only then care where it is held, by a transaction on our primary, by a replication slot, or somewhere else.
So maybe we should have two different monitoring charts, metrics, right?
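As a rough sketch, three common places the xmin horizon can be held, checked separately:

```sql
-- 1. Long-running or idle-in-transaction backends on the primary.
SELECT pid, xact_start, state, backend_xmin
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC
LIMIT 5;

-- 2. Replication slots (an abandoned slot can hold the horizon indefinitely).
SELECT slot_name, active, xmin, catalog_xmin
FROM pg_replication_slots;

-- 3. Forgotten prepared transactions.
SELECT gid, prepared, transaction
FROM pg_prepared_xacts;
```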
The other thing that interplays with this, which I think I've heard you speak about before, is timeouts, to protect against some of these issues.
Yeah.
I recently worked with one company, quite a big startup, and they experienced a lot of issues because of a lack of timeouts.
And once again we saw the problem that Postgres doesn't have a very important handle.
Somebody should develop it; maybe I should develop it, but I'm not a hacker anymore.
First, what do we have? If we talk about session, transaction, and query (statement), so three levels: we can limit statements using statement_timeout, but we cannot limit transaction duration.
There is no way to do that in Postgres at all.
That's it.
Can we do session timeouts? That's...
I think, usually, if connections come through pgbouncer, we can limit idle sessions.
We can say: drop the connection if it has been idle too long. I don't remember the concrete setting names off the top of my head, but I think this is possible.
But sessions are less of a problem; this is not a big problem.
I would prefer to have a way to limit transaction duration, but there is none.
But if we kill the session, that kills the transaction.
Right.
But I guess that only works if it's idle.
Yeah.
Interesting.
But if transactions are small, the session can be very long: you established the connection yesterday and continue working politely with very small, very brief transactions.
Why should we kill you?
We can limit statements, and we can limit the pauses between statements inside a transaction.
That one is called idle_in_transaction_session_timeout, a very long name.
So we can limit statements and limit the pauses between statements, and everyone should actually do it. I think any OLTP project should start with this.
We had a discussion on Twitter some time ago, and I'm a big fan of very limiting global defaults for any OLTP project; those who need more can extend them per session or per user.
In OLTP I prefer to see statement_timeout at 30 seconds and idle_in_transaction_session_timeout also around 30 seconds, sometimes even 15.
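A sketch of what those global defaults might look like; the numbers are the ones mentioned here, and the role name is hypothetical:

```sql
ALTER SYSTEM SET statement_timeout = '30s';
ALTER SYSTEM SET idle_in_transaction_session_timeout = '30s';
SELECT pg_reload_conf();

-- Sessions or roles that genuinely need more (reports, batch jobs) can override:
ALTER ROLE reporting_user SET statement_timeout = '5min';  -- hypothetical role
```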
But imagine a transaction which consists of a chain of very small, very brief statements with small pauses between them: you don't break any of these timeout settings, and your transaction can still last hours. I saw it.
Yeah.
It's bad. It's bad.
Good point.
Yeah.
And it actually seems potentially very tricky.
Where do you see people using multi-query transactions? Like, let's say, the Rails app we were talking about.
Well, with long transactions we have two risks again: locking issues, so you can block others, which can be very painful, and autovacuum.
What I saw is that people don't understand the simple idea that locks are released only at the very end. They split work into batches, but these batches are inside one transaction block, and it's awful.
We perform small updates, everything is fine, but the transaction lasts one hour, and the locks from the updates we did at the very beginning are still held.
So you have a lot of blocked concurrent sessions, and those victim queries can of course reach statement_timeout and fail.
So we see degradation; in the best case statement_timeout will save us from a chain reaction, but it's still not good.
This is why.
Yeah.
That's why I think transaction duration should be limited by default in OLTP, and I've seen people implementing this on the application side.
That is of course a weak limitation, because you might have multiple parts of application code, sometimes in different languages, and somebody can still connect using some other tool and so on.
So I would rather see it on the Postgres side, but it doesn't exist yet.
Yeah.
And just to go back to something you said earlier: if you set quite an aggressive one, let's say 30 seconds as a timeout, then if we're doing a big data migration, or if we need to do a big schema change or add a large index, we might need to override that. Okay, cool.
I'm guessing if we create an index concurrently, you should set statement_timeout to zero, right?
Yup.
Right.
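Something like this, as a sketch with a hypothetical table and index name; the override applies only to the current session, so the global default stays in place for everyone else:

```sql
SET statement_timeout = 0;   -- this session only
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_created_at
  ON orders (created_at);
RESET statement_timeout;
```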
But there's another topic there; we should probably discuss the lock_timeout setting next time.
Let's keep it outside of today's discussion. So yeah, exactly, this is a good point.
Wonderful.
I think we've done alright there.
What I like about it: sometimes people say "we don't want to have a timeout because it's painful; you reach it, you fail."
Well, if you have 15 seconds and you fail, you see it immediately.
Whereas if you have 30 minutes, for example, and you fail after 30 minutes, the damage is already there, and the feedback loop is huge.
That is painful, but small statement timeouts are not that painful: people see them, people adjust settings and move on.
That's a really good point.
I hadn't considered that the smaller it is, the less painful it is.
Of course the user doesn't see what they're expecting to see, and there's a problem, but in the grand scheme of things it's a much smaller problem than you would have had if it had been minutes or longer.
Yeah.
Great point.
Right.
Small timeouts are good in terms of user experience, because it's fail fast and adjust, that's it.
Right.
Awesome.
Is there anything else you wanted to cover on that?
Maybe that's it for the first, pilot episode, right?
Yeah.
Let's see.
Let's see what people tell us on Twitter, where we can get the feedback.
I'll put some links in the show notes so you can find us. Yeah, please let us know what you think and what you want discussed.
Any questions you have are really welcome.
Thank you, Nikolay, see you next week.
Good.
See you.
Bye bye.