[00:00:00] Michael: Hello and welcome to Postgres.FM, a weekly show about all things PostgreSQL. I am Michael, founder of pgMustard. This is my co-host Nikolay, founder of Postgres.AI. Hi Nikolay, what are we talking about today?

[00:00:11] Nikolay: Hi Michael. The topic is quite obvious for me; over the last week or two, it popped up several times in various places. Let's talk about auto_explain.

[00:00:22] Michael: Yeah, I think this might be our first purely extension-focused episode, which is cool. And I count it as an extension. Do you, in terms of it being like a contrib module?

[00:00:32] Nikolay: It's an extension, right, but, like, what else? It's an official extension. There are two kinds of extensions: main extensions, the so-called contrib modules, an old name, right? They come with Postgres and you can install them from the official package, postgresql-contrib.

And there are all other extensions. But probably we should talk about extensions separately; it's a good episode idea, maybe, right? But this auto_explain, it's a contrib module, a very, super official extension. It probably has more potential risk in terms of bugs compared to Postgres itself, but much, much less compared to other extensions, because it's maintained by the same team as Postgres.

[00:01:20] Michael: Yeah, that's a really good point. It comes out on the same schedule. It's shipped with Postgres, so if you install community Postgres, you get it, off by default. It makes it that bit easier for cloud providers to offer it. They don't all offer it, but it does make it easier.

[00:01:34] Nikolay: Who doesn't offer it?

[00:01:37] Michael: I was looking at Heroku's docs. I'm pretty sure they still don't offer it. There's a chance that the docs are out of date. And even some of the newer ones, like Supabase, I'm not sure they offer it, based on their docs.

Maybe that's out of date. And there are some new ones that don't offer extensions at all,

[00:01:54] Nikolay: But RDS, GCP Cloud SQL, and Azure, they all offer it, right?

[00:02:00] Michael: Yeah, exactly. Quite a few of the other popular ones do.

[00:02:04] Nikolay: All right.

[00:02:04] Michael: Maybe we should cover what auto_explain is.

[00:02:08] Nikolay: Automatic explain,

[00:02:10] Michael: Yeah, so it logs them, right? So it logs execution plans, like you would get if you ran EXPLAIN,

[00:02:16] Nikolay: EXPLAIN ANALYZE.

[00:02:18] Michael: Yep. So you've got access to all of the parameters; there are some minor differences. But yeah, you've got access to all the parameters you do when you run EXPLAIN, and a couple of additional ones, right?

So you can set a threshold so that it will only log plans above a certain duration or execution time.
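For illustration, a minimal postgresql.conf setup along these lines might look as follows (the values are examples, not recommendations):

    # load auto_explain for all sessions (changing shared_preload_libraries needs a restart)
    shared_preload_libraries = 'auto_explain'
    # only log plans of queries running longer than 500 ms
    auto_explain.log_min_duration = '500ms'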

[00:02:36] Nikolay: It's a competitor to log_min_duration_statement, which logs queries after successful execution, right? Not at the beginning, as log_statement would do, but when it's successful and the duration exceeds the log_min_duration_statement value. For example,

one second, right? Or half a second. I remember, by the way, one of Postgres' advantages like 15 years ago compared to MySQL was that Postgres allowed fractions of a second for this parameter, while in MySQL one second was the minimal value you could use. So auto_explain is a competitor here, right?

Because sometimes people have both: log_min_duration_statement set to one second, or 500 milliseconds, and auto_explain set to the same value. But it doesn't make sense at all to use this approach, because auto_explain output will include the query as well,

[00:03:38] Michael: I thought about this quite a lot in the past, and I think there are some subtle differences that might make it worthwhile, but you are duplicating effort, for sure. One example is another bonus parameter we get with auto_explain, which is the sample rate.

So, because there is a small amount of

[00:03:56] Nikolay: It's not a bonus anymore, because log_min_duration_statement also has this capability since Postgres 14, if I'm not mistaken. We can also sample.
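As a sketch, sampling can be configured on both sides roughly like this on recent Postgres versions (the values are illustrative):

    # auto_explain: log plans for ~10% of qualifying queries
    auto_explain.sample_rate = 0.1
    # query logging: log ~10% of statements that exceed this lower threshold
    log_min_duration_sample = '500ms'
    log_statement_sample_rate = 0.1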

[00:04:05] Michael: So what I meant was that if we're sampling we might want to sample one but not the other, for example. So we might want

[00:04:12] Nikolay: Ah,

[00:04:13] Michael: to get all. Yeah, so you understood. So I feel like we've actually covered two of the big points. One is that we've got this log_min_duration for auto_explain, and we can set that in milliseconds. It's off by default.

It's minus one by default, which means don't log anything. So you have to set it to something. I would heavily, heavily suggest you do not set it to zero, which will log every, every...

[00:04:40] Nikolay: In many cases, it'll just kill you if you do it.

[00:04:44] Michael: Yeah. The logging overhead is severe, not just in terms of performance, but also in terms of sheer volume of logs, if

[00:04:52] Nikolay: Yeah. I mean, the logging subsystem will probably quickly become a bottleneck, and no queries will be possible to execute for Postgres at all. We can discuss the details here; it's an interesting topic, separately.

[00:05:05] Michael: Yes. So a typical value for that on a transactional system might be a few hundred milliseconds. It's in milliseconds by default, so you can set it to a hundred, and that would be a hundred milliseconds, or a thousand, and that would be a second.

[00:05:18] Nikolay: Yeah. And remember, we discussed what a slow query is, and here it's very connected. So why 200 milliseconds? Because this is exactly when human perception starts working, right? If something exceeds 200 milliseconds, a human notices the latency.

So we don't want queries to exceed it in OLTP systems, and that's why 200 milliseconds. But unfortunately, in some systems that's quite a low value, and the volume you will see in the logs, and the observer effect from logging, will already start hitting you. So sometimes people choose 500 milliseconds or even a second. It depends.

[00:05:59] Michael: This is something where, if you're worried about the overhead, or you feel like you've got a lot of slow queries at the moment and you just want to work your way down, you could start with a higher value and reduce it over time as you...

[00:06:13] Nikolay: Right. This can be changed without restart. What you cannot change without restart is switching to the logging collector, which I would highly recommend doing, but switching to the logging collector requires a restart.

[00:06:26] Michael: Yeah, actually that's a really good point. I have seen guides that suggest installing or loading auto_explain in shared_preload_libraries, but I think the general advice is to do it in session_preload_libraries, because then you don't have to restart.

[00:06:42] Nikolay: depends, of course, but I, I would like, usually when you need it, if you didn't think in advance, you don't have it. Right. And it's hard to quickly even, in session, it's hard to implement it. So I would still recommend to put it to libraries just in, and probably even maybe turned off or with very high, threshold of, in terms of duration and just be prepared to use it, uh, for some incidents.

[00:07:09] Michael: Yeah, that's an interesting point. A high duration would mean you don't log too much, at least, but you would still be paying the overhead of running EXPLAIN all the time. So maybe also a low sample rate.

[00:07:20] Nikolay: There are multiple overheads here, indeed. Yeah. So let's finish discussing why it's useful, and then we'll talk in detail about the various overhead things. So why is it useful? We have pg_stat_statements, but it tracks only queries, in normalized form. So the first thing we usually need when we try to optimize some particularly bad query group, some normalized query, is examples. Examples can be taken from various places, but taking slow examples from the logs, via log_min_duration_statement, is quite a good idea.

But then, okay, we experiment in some environment. For example, it's an exact copy, or, for example, Database Lab Engine: we have a thin clone and we check the plan there. But then we think: maybe in reality, a few hours ago when the incident happened in production, maybe this query had a different plan, and we don't know.

Right. Maybe, like, we

[00:08:27] Michael: can't reproduce it, right? If you run EXPLAIN...

[00:08:30] Nikolay: You can reproduce it with Database Lab Engine. You can do point-in-time recovery to go to a particular point in time, like exactly in the middle of the incident. And then you can use the exact same planner settings and work_mem, and then Postgres should behave similarly, like 99%.

It'll be so. But sometimes you don't have Database Lab Engine; sometimes you still have doubts. In this case, auto_explain is a good way, a very reliable way, to tell that this is what we had in terms of the plan, because we have it logged, right? Maybe the most reliable way to understand which plan Postgres used when executing this query.

[00:09:13] Michael: Yeah, exactly. And the metrics involved, right? The buffer numbers, the timings. So it can, it's...

[00:09:20] Nikolay: And you can choose: both, none of them, or just timing, or just buffers. It's possible to tune it and choose, and you have very good articles saying that sometimes the timing overhead is not that big. But let's finish the thought. To me, this approach of working with actual plans feels like, mm, generation one, right?

Same as, remember, we didn't have pg_stat_statements 15 years ago, and we all used pgFouine and then pgBadger, normalizing queries by processing logs. The same approach as pg_stat_statements, but only for slower queries, those which exceeded the threshold; the tool normalizes them, and we have nice reports.

We understand that we probably don't see the full picture. Okay, maybe sometimes we allow ourselves to turn on full query logging for some short period. This was considered a valid approach, understanding the risks, overhead and so on. We turn it on for some minutes, just to collect the whole iceberg, right? And then we have a cool analysis, but without plans, right?

We never saw plans there. But then pg_stat_statements almost replaced this approach, because, okay, logs are good, but they're the tip of the iceberg; pg_stat_statements is a very good way to control almost everything. It has a limit, like 5,000, in terms of normalized queries. Sometimes it's not enough, but still, you see

almost the whole iceberg, but without examples and without plans. So logs are still good for examples; I would choose pg_stat_activity, with track_activity_query_size increased. It's 1024 by default, and usually what you want is to put 10,000 there or so, and see real examples by sampling from pg_stat_activity. But where do we get plans?

This is the big question, because if you want to gather samples of plans, or maybe even normalized plans, and understand what's happening,

[00:11:31] Michael: Yeah,

[00:11:31] Nikolay: auto_explain is the way to go, still.
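As a sketch of the pg_stat_activity sampling just mentioned (the size value is illustrative):

    # postgresql.conf: don't truncate query text at the default 1024 bytes (needs a restart)
    track_activity_query_size = 10240

    -- then sample real examples of currently running queries:
    SELECT now() - query_start AS duration, query
    FROM pg_stat_activity
    WHERE state = 'active'
    ORDER BY 1 DESC;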

I think in the future we will have some generation-two tools, maybe based on eBPF, by the way.

Maybe some extensions. But right now, auto_explain is the best way to collect actual plans. That's why it's so important, and many people want it. That's why.

[00:11:53] Michael: Yes, and I think it's the only way of running your queries, getting the output of your queries, and also logging the plan. If you run EXPLAIN ANALYZE manually, the query runs, but you don't get back the results. So it's an interesting, like, best of both worlds.

[00:12:11] Nikolay: I don't get it. auto_explain doesn't log results. It logs query and plan.

[00:12:18] Michael: It doesn't log them, but whichever client requested the query still gets them. So let's say you're running queries in client A over here: it's getting results back from those queries, and separately we are logging the plans. Whereas if we start up a session as client two and run EXPLAIN ANALYZE on one of those queries,

[00:12:38] Nikolay: Ah, I see.

[00:12:39] Michael: we get the plan back, but we don't get the query results.

[00:12:43] Nikolay: I see, I see. Yeah. So if you request EXPLAIN ANALYZE, you say: I want the plan, but not the results. But sometimes you want both. Yeah, it makes sense. Makes sense. Mm-hmm.

[00:12:55] Michael: so yeah, so

[00:12:56] Nikolay: And you can play with auto_explain parameters in your session and say: okay, now in this session, I want all queries

to be logged by auto_explain, with plans, with timing, with buffers. Let's go full, like tracing all plans. And then you work normally, obtaining results. That can be some application you use, right? Or just a human who is working with the database using SQL. And then for this session we log all plans, because we want to understand what's happening, right?
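A minimal sketch of such a session-level setup (assuming auto_explain is already loaded and you have the privileges to change these superuser-only settings):

    -- in the current session only: log every statement's actual plan
    SET auto_explain.log_min_duration = 0;
    SET auto_explain.log_analyze = on;
    SET auto_explain.log_buffers = on;
    SET auto_explain.log_timing = on;
    -- now run the workload normally; plans go to the server log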

[00:13:28] Michael: Yeah, like, let's say, for example, you have something that's intermittently slow and you can't work out why. You run EXPLAIN ANALYZE 10 times and it's not slow in any of those. But then when it's run by your application, one in a hundred is slow. So how do you know when to run EXPLAIN? How do you reproduce that? auto_explain is a really good way.

In terms of the v2 of pg_stat_statements, that's a really interesting topic, and not one that I prepared for, but there is that extension from Percona that they're positioning as a replacement, and I think that

[00:14:01] Nikolay: I would consider it v2.5. Not v3, but v2.5.

[00:14:06] Michael: Nice.

[00:14:07] Nikolay: Yeah. Because it has more things than pg_stat_statements, right? pg_stat_monitor. pg_stat_monitor, it's called.

[00:14:13] Michael: we'll get,

[00:14:14] Nikolay: a replacement.

[00:14:15] Michael: and I'll put a link in the show notes,

[00:14:16] Nikolay: It's a replacement for pg_stat_statements. Very promising, interesting. But I think, to become popular, we need to check its overhead. It's very interesting; I would like to learn more about it. And also, probably, it should be in contrib modules.

[00:14:32] Michael: Well, I suspect that's their goal long-term.

[00:14:35] Nikolay: Yeah, I talked to them when they started. I talked to Peter, actually, and asked about this particular decision to work separately instead of improving pg_stat_statements, and one of the reasons was, obviously, the independent release cycle. Because if you want to improve something in contrib modules, you understand that the release cycle is one year, and you need to convince a lot of people, and experimental things and so on.

It's hard to get into it, right? So it'll take years to build some breakthrough, because it's hard. But separately, you can move faster and do many things.

[00:15:13] Michael: I think it's the smart strategy. So, you've mentioned overhead a couple of times, and I think this is a really interesting area, not just for auto_explain but for a lot of the things we've mentioned. I have looked in the past and tried to find even basic benchmarks of the overhead of things like pg_stat_statements, auto_explain, pg_stat_monitor, you know, anything.

[00:15:36] Nikolay: Benchmarks are hard.

[00:15:38] Michael: I understand that. but the flip side of that is that in quite a few places people mention horror stories, or there were really severe warnings in the docks saying for some of them, not all of them, that the overhead's really, that can be really massive. So be careful, but no one ever kind of says how much it is.

And if you go digging in the mailing lists, every now and again you see some people mention a few percent here or there; one person mentioned like 2-3x overhead. So there are some really scary stories out there. So yeah, that was the...

[00:16:14] Nikolay: It can be a killing overhead if you don't use the logging collector, if you use syslog with journald. Well, maybe it's improved, but I benchmarked it many years ago. And then say the threshold is zero, meaning that you log everything. I think you don't even need to track timing and buffers,

it'll be enough to kill your system, just like that. So, speaking of benchmarks, let's be a little bit in the context of this week. Yesterday, actually,

OtterTune published an article saying the worst thing in Postgres is MVCC, like they hate it most. Well, I understand. But I went to the underlying research, and there is a paper, a scientific paper, comparing MVCC in various systems, and it doesn't provide any discussion, any numbers of how exactly they tuned autovacuum, for example, and so on.

So details are very important here. And we all know, it should be written on the front page of postgresql.org that the defaults are very old. Don't use the defaults, right? You do need to tune it. Unfortunately, this is Postgres: you do need to tune it.

You need to understand some things, you probably need help from an expert, and you need to tune it, right? So auto_explain is a great thing, but of course, saying auto_explain is bad is the same as saying log_min_duration_statement is bad. They both are bad, but they both are good.

Let's talk about overhead. There are maybe three places where I see overhead. The first we already touched on, actually: logging, and its disk I/O overhead. If you log too much, since you log to disk, it's not good. By the way, I did it in some setups with tmpfs, like an in-memory small disk with very, very quick rotation, understanding the risk that it can get saturated, run out of space. But if you rotate it very quickly and offload the details to somewhere else, you don't touch disk at all, and you can log everything. In this case it's possible to use it.

Here I would blame Postgres for not having some easy way not to log to disk. I would rather, for example, send it over UDP to some different place, and that's it, just send the logs.

And in this case, we could log much more, and just the network would be our bottleneck, right? But if you log to disk, as we do in most cases with Postgres, this is the number one thing to consider. And sometimes people store logs on a separate disk, and I saw several times that this disk was magnetic.

So yeah, you can shoot yourself in the foot very quickly here. So this is number one. Number two: CPU, right? Maybe you can tell more here, like how to...

[00:19:13] Michael: So there have been good blog posts in the past about this, like one from the OnGres team. Yeah, yes. So he said EXPLAIN ANALYZE can tell you lies. Yeah, exactly, it's a very good title. But his main point is that if you're running EXPLAIN ANALYZE with timings, which is the default for EXPLAIN ANALYZE, then the clock calls to measure the timings, the very fact of measuring them, affect the result. So there's an observer effect,

[00:19:43] Nikolay: Classic observer, observer

effect. Once you measure it, it's changed. You measure something else already.

[00:19:49] Michael: Yeah, exactly. So the timings reported, and the query, will be longer because you are measuring it. Now,

[00:19:57] Nikolay: But it depends on the CPU, right? On Xeon, for example, something modern, you know, like Cascade Lake, for example. Well, relatively modern. It's not that bad if you check it.

[00:20:08] Michael: No, there is a tool within Postgres, I'll link it up in the notes. I think it's called pg_test_timing, something like that.

[00:20:15] Nikolay: in, in bin directory. Mm-hmm.

[00:20:18] Michael: Yeah, and it's very quick and easy to check if yours is relatively fast. The only place in recent times that I've seen a really poorly performing one is when somebody was running Postgres in Docker.

Most production systems I've seen run on sufficiently decent hardware that it hasn't been a big issue. But I do trust that there are some people, very well-respected people, who do recommend turning timing off when you use auto_explain, for this reason. And I think it's still mentioned in the docs as well.

So I suspect there are cases where people are still running production systems, or there are workloads where, I think Álvaro mentioned, nested loops have a particularly high overhead, for example. So if you have a workload that has lots of loops that do things in the tens of thousands, hundreds of thousands,

[00:21:09] Nikolay: So for some queries it's not that...

[00:21:11] Michael: but it's

[00:21:12] Nikolay: It depends on the query. On the query. Mm-hmm.

[00:21:14] Michael: Yeah.

[00:21:15] Nikolay: I wonder what's water had on arm, like Graviton two and so on, like, I, I have you seen

anything about it?

[00:21:22] Michael: I think I've published the only blog post on this that's ever been published. Like, I would love

to see more, but, but on this, but I haven't seen any.

[00:21:30] Nikolay: It's good. Uh, it's a good, good area to research. Definitely interesting.

[00:21:35] Michael: I think it's interesting, but, based on the reaction I got, which is like, it's just a comment from you and nothing else, uh, I think there's limited interest in this for some reason, and I've seen quite a few production systems with. Or to explain on sample rate one, mean duration a hundred milliseconds and timing on so, and not just timing Right.

Buffers as well. So I think this is like, there are, you mentioned it's not just you've, I think you saw in the past that buffers might have a small overhead as well.

[00:22:04] Nikolay: Buffers definitely have overhead. If you run, like if you say, explain, analyze timing of buffer soft, you see? I mean manually. Right? And you can do it with, the same with auto to explain. You, you get rid of timing and buffers over overhead, and you just, it's also already useful, right? Because you, you see structure of the plan and you have total timing because it's, it's already calculated.

It's similar to log_min_duration_statement log entries. So it's already useful to see the structure and just the total time. Good. But then, between buffers and timing, I would choose buffers. They have some overhead; it's also interesting to understand, for your system and for your queries, what it is.

But buffer numbers, on this podcast, are honored, and we have timing already, right? For the overall timing.
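A quick sketch of the manual equivalent being described (the table here is hypothetical):

    -- actual plan structure plus total time, without per-node timing or buffers:
    EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF)
    SELECT * FROM orders WHERE id = 1;

    -- the corresponding auto_explain knobs:
    SET auto_explain.log_timing = off;
    SET auto_explain.log_buffers = off;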

[00:22:56] Michael: we don't have to choose either or though, right? Like we can

[00:22:58] Nikolay: well, we can get, we can get both, right. But, let's agree on this. even if someone performs very good benchmark of overhead of auto explain, I would consider it as valid only for particular system, particular workload. That's it. In each particular case, if we can, if we are serious guys and, and we want, uh, our system, to be reliable performance and, and like scalable, grow, handle growth, and so on.

we need our specific benchmark for our workload, which is a very interesting topic, and how to do it we probably also need to discuss at some point: for our hardware, for our Postgres version, and so on. And in this case, we should measure two overheads, from buffers and from timing, and consider four points: without anything, with both, with only timing, with only buffers. Here I would exclude the timing-only one, because in my opinion, in terms of internal metrics inside the query, buffers are more important than timing numbers. What do you think?

[00:24:07] Michael: Uh, more important. I, I think it's like, An argument for the sake of an argument. I don't, I don't see it as an either or. Like if timing overhead is not worth it for your system, if it's too much. Don't track it and buffers are useful, but would I like, if I had the choice of just buffers or buffers and timing, I would choose buffers and timing.

I'm never in a situation where it's a choice of one or the other. If they had similar overhead and we could afford one but not the other, sure. But I've never seen that, and I can't imagine it's true, because, for example, timing is tracked per loop.

Right. But buffers are reported not per loop; they're reported per node. So I wouldn't be surprised if it doesn't have the same looping overheads, for...

[00:24:53] Nikolay: Hmm. Interesting.

[00:24:55] Michael: and there are no warnings about it in the docs, so it's, I've anyway. Yes, sure. In principle, I think you're probably right, but I just don't, haven't seen that be a practical question ever.

[00:25:06] Nikolay: Interesting. Okay. Well, okay. Maybe you should enable both and have sampling. It

[00:25:13] Michael: Maybe, but if you do check this on your system, I'd love to hear, even if you can't share, details, I'd love to hear like what order of magnitude were the overheads? Like,

[00:25:22] Nikolay: yeah. I, I, I have it enabled in a couple of places, actually quite low at once recently, but we didn't do, somehow I missed this. Uh, we didn't have, it's, it doesn't follow my principles. My principles is always have, uh, some benchmarks and make changes, uh, based on numbers and so on. But since it was enabled with, some quiet, low, uh, simpler rate rate.

I mean, like below 10%, for example, it was considered low risk, with the threshold and sampling, right? So: let's already have something, something already useful. We didn't discuss it, but one of the use cases, probably a very important one, is when we enable it and, if we have big teams, we

use this to support their work on optimization, like continuous optimization. We say: okay, these are the bad, slow queries, and we have examples with plans; look at them. And in this case, you enable it above this, like, 500-millisecond threshold, and also with 10% sampling; enable both, right? But it's very bad if we do it blindly. We need to run benchmarks and understand: what's the price here? How would we do it on some system? For example, we have some OLTP system and we want to measure it. How?

[00:26:46] Michael: We, we did a whole episode on this, right? Like benchmarking,

[00:26:50] Nikolay: Well, yeah, but this is specific benchmarking. First thing I will do here is to ensure that we don't have overhead from, uh, observer effects from logging system,

right.

[00:27:00] Michael: So the test, our timing, um,

[00:27:03] Nikolay: these, uh, things and benchmarks should not hit, not bu bump into this ceiling, right? We, we don't want to ate our disc and then make conclusions like, oh, you know, to explain as 10 x, this is how you can do it very easily.

So you need to make sure that you have room in terms of disk I/O, and then you can start thinking about overhead. Well, probably I would run two systems, exactly the same, one with auto_explain, one without. And, you know, I would probably make sure I don't saturate anything at all.

It means that probably I would do it in a single session. I don't need multiple sessions here at all, which means it's also possible to run it on some non-production environment, like Database Lab Engine, for example. You can just compare and see, like, infrequent queries: a single session running this query, running that query, and compare the details.

[00:28:07] Michael: Probably latency mostly.

[00:28:08] Nikolay: yeah, yeah. Let see. Right, but, but do it many times with the same query

[00:28:12] Michael: Yeah,

[00:28:13] Nikolay: to eliminate

[00:28:14] Michael: the, or a workload that you can replay on both, right?

[00:28:17] Nikolay: Right, So you don't want to reach the edge in terms of, uh, CPU and so on. Like very, like it's still performance experiment. It's performance research, and you repeat same queries many times to have reliable data and then you just compare without it and with it. What's the difference? Uh, and yeah, let see.

Probably. What else do we have? We can have pg_stat_statements enabled and check not only latency but also, I dunno, buffer numbers. Well, buffer numbers should not change, right? Because we...

[00:28:51] Michael: They shouldn't.

[00:28:52] Nikolay: They shouldn't.

Right? What else?

Probably wait event analysis could be applied here as well. Maybe, I'm trying to invent something on the fly, but definitely, the main point here,

[00:29:04] Michael: Yeah.

[00:29:05] Nikolay: the, main danger in this benchmark is to bump into some different limit like this Cayo or anything else, and measure it instead of, overhead from auto explain.

[00:29:15] Michael: Yeah. even I fell into that trap in my blog post of the, I was probably doing far too higher, workload for the machine. I was running it on and I also tested the logging overhead and in on my system, it was, there was about 25% overhead just from logging everything. And then I turned that off and it got it down to like, it

[00:29:36] Nikolay: Yeah. By the way, E P P F analysis would be great here and like, uh, pep Yeah, you, you, you should see that overhead comes from what to explain. This is, would be like the best if you, okay. Like you measure for this type of query, we have like 5% overhead because we, we know like this, this query executes with nested loops like, For different parameters, by the way, different parameters and producer statements.

we have one query, but with different parameters we have different plans, right? So we can have the same query, but depending on parameters the overhead may be different, because the plans are different.

[00:30:13] Michael: Yep.

[00:30:13] Nikolay: So parameters, parameters also define the plan. And of course, so we need to take care of planner settings and workman and so on.

I mean, this experiment can be done on different hardware. I mean, the CPU should be the same, but I don't care about disks, unless they're saturated. If we don't saturate disks, disks can be different; the filesystem can be different. That's why thin clones and Database Lab can work.

But of course, the CPU cores should be the same, the Postgres version should be the same, the planner settings should be the same. And then we play with queries in a single session and compare. An interesting experiment, by the way. And if we have the ability to check with eBPF, to perform on-CPU analysis, and see that indeed, for these particular queries with these particular plans, like 5% of overhead comes from auto_explain, this is success.

[00:31:12] Michael: So you mentioned three oh, types of overhead, I think you said disco for logging for example. Uh, CPU for timing. Was there a third

[00:31:21] Nikolay: For buffers.

[00:31:22] Michael: CPU for buffers? Yeah. Makes sense.

[00:31:24] Nikolay: Yeah. Well, what else? What about, additional memory? Probably no, right. What about network? Probably no. So I don't see any. Maybe if there are some other types of it, I just don't see them.

[00:31:39] Michael: No, there are, there are a couple of limitations I wanted to make sure we covered, that I've seen catch people out in the past. one is, that it doesn't track planning time. So these timings are all about execution times and there's no reporting of planning time. So if, for example, your client is reporting that a, that a query is taking 1.1 seconds and it's only a couple of hundred milliseconds in order to explain in the logs, that the difference, and you can verify this by running, explain, analyze, is chances are it's a very high planning time overhead. And

that's, I guess,

[00:32:14] Nikolay: we, we have it in since post August 13. We have it in, um, producer statements, plenty of time and.

[00:32:21] Michael: Nice. Yeah.

[00:32:22] Nikolay: And, and of course like it would be interesting to dive into, like auto explain power is the three execution, three shows, right? It would be interesting to dive into planning time for particular notes.

But it's not possible; planning time comes only as a whole. We have it in pg_stat_statements: if we divide it by calls, we have the average planning time, and so on. Actually, we have it there, like min, max, mean. It's tracked there.
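A sketch of pulling planning time out of pg_stat_statements on Postgres 13+ (this requires pg_stat_statements.track_planning = on; column names are from the stock view):

    SELECT queryid,
           calls,
           mean_plan_time,  -- average planning time, in ms
           mean_exec_time   -- average execution time, in ms
    FROM pg_stat_statements
    ORDER BY mean_plan_time DESC
    LIMIT 10;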

[00:32:51] Michael: I think this might be another argument for the log. Uh, what's the other set? The logging one we were talking about having both on. I think that one would, in does include planning

time. So That's

[00:33:02] Nikolay: Mm-hmm.

[00:33:03] Michael: so any difference there. Anyway, so that's a, that's kind of a bit of a Gotcha. Um, the, another small difference is that, and I think this is coming in post grade 16, I saw a commit for it.

Is it doesn't currently log the query identifier, which can be really useful. even if you've Yeah, query id. It also explained currently doesn't log it even if you use verbose, which is the parameter that adds the query ID in explain. So that's another like minor difference between them.

[00:33:31] Nikolay: And, duration statement does it right,

[00:33:34] Michael: Yes. Good point. Another reason

[00:33:36] Nikolay: since some, or, yes. I remember in some version 13, 14. Yeah. The start. Uh, we have since that version, we have Kuk matching PK statement, KU in both. Logs. Slow log. We call it slow log, but it's not separate, unfortunately. and inpe activity, right?

So we can basically load it into some tables and then join them and have quite good analysis. And some people do this, right? Yeah, that makes sense. Interesting.
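A rough sketch of wiring the query ID through on Postgres 14+ (settings and the join are illustrative):

    # postgresql.conf: compute query IDs and include them in log lines
    compute_query_id = on
    log_line_prefix = '%m [%p] query_id=%Q '

    -- the same ID then appears in pg_stat_activity and pg_stat_statements:
    SELECT a.query_id, s.calls, s.mean_exec_time
    FROM pg_stat_activity a
    JOIN pg_stat_statements s ON s.queryid = a.query_id
    WHERE a.state = 'active';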

[00:34:05] Michael: and one other thing I forgot to mention that is different between them is that in auto explain, we have a. An extra parameter, which is log nested statements. And I, that's another use case for this. We don't have access to that when we wanna explain, for example, doing function. Like, I know a couple of people that use a lot of, uh,

[00:34:24] Nikolay: You go already too far, Fred. So we like, I saw discussions on Twitter last couple of weeks and, uh, there are different approaches. Some people say, no, no, no, not on, not on production. Not on production. But some people say yes, yes. And I feel say yes. Just, just carefully,

[00:34:43] Michael: I like it

[00:34:45] Nikolay: but details slowly to, to like, oh, by the way, sampling is very interesting. I, I didn't realize it until like maybe a couple of years ago when you wrote this article that, uh, sampling exists for many virtuals already unlike, query logging for which sampling, appeared in 13 or 14, only like a couple of years ago in order to explain.

It has existed for many years and many Postgres versions, so we can benefit from it, for example, in Postgres 12 or 11.

[00:35:16] Michael: But, uh, all supportive versions approach Chris, for sure.

[00:35:19] Nikolay: Yeah, so it's great. It's great. Like, I like sampling, uh, in heavily load systems. Uh, even if you, even if concluded that we have overhead, like 50%. Or a hundred percent. We, we still may decide for, to have simp at least like 1% of, uh, of, uh, queries because we need this information.

Of course, it's bad if we have overhead, we enable it, and nobody's using this information. But if we do need it, it's okay to pay some price, right, in some cases, and have this very important information to optimize our system.

[00:35:54] Michael: I completely

[00:35:55] Nikolay: It, it'll, it'll pay off very quickly, right?

[00:35:59] Michael: all monitoring of any description has some overhead, right? Like we, we can't monitor at all. And most people think monitoring is a good idea. So then the question becomes how much overhead is worth it for the benefit.

[00:36:10] Nikolay: Yeah. Over, um, observers and observer effects are everywhere.

[00:36:15] Michael: Yeah. Wonderful. Should we call it a day?

[00:36:21] Nikolay: Yeah, I think it's like a summary. Is that measured on your system?

[00:36:25] Michael: Mm-hmm. Mm-hmm.

[00:36:26] Nikolay: be quite optimistic, I think. Like, fears are, discussed around sometimes go from. Not understanding where it can hit you. Right? And, uh, if you understand where it can hit you'll be fine. Like

[00:36:40] Michael: Yep.

[00:36:41] Nikolay: and this like we discussed it and, uh, and sampling and so, so you, you can be good.

And even if the overhead for your system is quite significant, with sampling and a high threshold you can still find it very useful, because plans are a very good thing.

[00:36:56] Michael: Yeah. And don't set log mean duration to zero. That's my final

[00:37:00] Nikolay: Uh, Right, right. And don't send queries as is to charge PT and, and plan. Also, don't send plans as is. Yeah.

[00:37:11] Michael: Nice. Alright, well thank you Nikolai. Thanks everybody and see you next week.

[00:37:16] Nikolay: See you. Bye.
