Our favourite v16 feature
[00:00:00] Nikolay: Hi everyone. This is episode number 62, this is Nikolai and Michael. Hi Michael.
[00:00:06] Michael: Hey, Nikolai.
[00:00:07] Nikolay: Your time to choose the topic. Tell us what is it?
[00:00:11] Michael: Yeah, so I chose to go along with the monthly blogging event, so Ryan Booze invited people to blog on the topic of what their favorite Postgres 16 feature is. So this is the upcoming, release.
[00:00:25] Nikolay: Next
[00:00:25] Michael: yeah, we had a release candidate one last week or so. Yeah.
[00:00:29] Nikolay: week we, that next week release.
[00:00:32] Michael: yeah, normally it means release is coming unless anybody finds a bug or two.
so thank you to everyone involved in that and figured it would be a good time to talk. Favorite features.
[00:00:42] Nikolay: right. By the way, I, I just realized I'm interrupting you and making the process of, uh, subtitle, transcript, creation Wars, but this is good, moment to remind our audience that, first of all, we have. our podcast published in Everywhere, basically. one, uh, listener very good post gba I know many years wrote me that it's a good idea to publish us on Spotify.
And not only on YouTube, I told like posts, affirm. That's it. You can see, subscribe or something, the button, which will show you many options. Spotify or Apple Podcasts, or Google Podcast, everything, Google Play, anything, it's convenient. And yesterday I saw, even on Amazon, it's also published. Interesting, And of of course it's worth not only subscribing, but leaving a review. You know what to do there, So, and yes, we have, good transcripts, on Postgres film. We have outdated version. I hope we'll Update them soon. But on YouTube, we started publishing, improved version of sub subscript. So it's combined effort from Open AI APIs, whisper and G P T four and manual corrections. We have glossary, we have improved process, and I also plan to apply this process for some talks.
So it's interesting content pipeline, which probably will produce content for people who want to read it maybe later or something. Okay. Enough. Uh, Postico 16. Where to start.
[00:02:12] Michael: Whoa. what's your favorite feature and how did you go
[00:02:16] Nikolay: My favorite feature? Yes. I thought about it and my, I have two answers here. My most favorite feature is that all features are open
[00:02:26] Michael: Nice. Okay.
[00:02:27] Nikolay: in age. We live in. When companies, chose not open are choosing not open path. Elastic and Monga first. Then now we have HashiCorp. It's not related to databases, but it's adjacent technology.
Many people provision infrastructure using Terraform. And, uh, this morning I've learned that Click House decided not to open source, new implementation of some, uh, I forget already, let me check. I have it open. Share merge three table Engine. new implementation will be not open.
In click house. Click house is open source and we appreciated that. It, it gained popularity being very good Engine for analytical, workloads. It was open from very beginning. It was amazing. But they also started to choose the dark side.
[00:03:17] Michael: So you mean like almost like a meta feature that you are
[00:03:20] Nikolay: Exactly. This is what I Exactly, I wanted to say meta feature. Yes, exactly. you caught it. So, all Postico features are open still. Let's appreciate it.
[00:03:32] Michael: Well, yeah. I want to add to that. I love, not just that the code is open, but also the discussion around the feature is open on the mailing lists and in on the Commit Fest, there's
[00:03:44] Nikolay: Here we come to. To some dangers area, not discussing versions and tech technology, because I have my, uh, personal opinion and I see that many things are not open, unlike many companies which are more open than Postgres project, but it's a different discussion. Let's discuss it in a, in another episode.
[00:04:01] Michael: Cool. so other than the meta feature that, the coders all open and the features are all open source, what's your favorite actual feature?
[00:04:10] Nikolay: okay. I have two more answers. One, one of them, purely technical, one of them is that, the feature we discussed, it was my idea. you know, I like, I like poor man's, uh, solutions. You know, like when you don't trustfully or you are missing something in monitoring, you start using some, system performance, tools, uh, like starting from top, pss, uh, I iota, I top anything. And if you can connect, of course, I'm not talking about RDS here. So similarly, if you need to Update billion rows. And, uh, you understand that, it'll take ages to, to explain how to do it properly to people, but it's very like urgent.
For example, next week your integer for Primary key will be out of capacity, so you just use P SQL scripting and the backslash watch. So I always select ability to specify number of cycles I need and, I said this to Andre and we coded it at Postico tv. It was live, Andre send it to hackers, and then Postico 16 will have this feature.
Obviously, I like it. it's a tiny feature, but not only it was committed to Postico 16. In PostgREST 17, I already see, continuation of this idea and people add more. Tweaks more options to slash watch command and pq. It means that it resonated. I, I, I'm happy,
[00:05:40] Michael: So just to to go into the specifics of what, so in the past you could watch and it would basically run the same query to check on the progress, for example, of something. Uh, and you could do it. Yeah, go on.
[00:05:52] Nikolay: Let me, as usual, I, I explain some things I learned over years, uh, which are like tricky and not obvious for, first of all, backslash Watch is a replacement for semicon. You don't need semicolon. By the way, even Andrea learned this from me when we discussed this Extension. He like, I, bet 95% don't realize you don't need semicon.
You say, select blah, blah, or Update, blah, blah, blah. And then backslash watch. And it already works as semicolon and, you don't, uh, need extra execution. So immediately you start executing in loop. By default, it will, it'll be two seconds, of wait time between loops. not loop time, but delay time. Two seconds because also query takes some time, usually, right? It's not zero. and the only option backslash watch had in the past. It's, uh, this wait time. You can adjust it. And in extreme cases, I remember putting something like.one, like.one means.
[00:06:56] Michael: 10th of a second, or,
[00:06:58] Nikolay: a hundred milliseconds. So it's very like, very quickly doing something. You can see, like if connection is good, you can, for example, monitor. Which is an activity almost you, you almost, you see data changing there almost life. It's kind of cool if you understand the, the overhead of it, of what you are doing is not high.
But now you can also specify a number of loops you want. If you want to run something 10 times, do it. Or for example, you, with simple math, you can understand I want this to run approximately one hour, the query to purchase activity. It's fast. Okay, we have 3,600 E every second, for example, and this will continue one hour.
Poor men inside me is happy. I mean, poor men approaches, uh, I like them a lot because they are very trustworthy. S k is the most trustworthy client we have,
[00:07:55] Michael: And, and it's a simple solution, right? I, I. In the past, so just to clarify, in the past it would just run infinitely until you stopped
[00:08:03] Nikolay: Right? Yes. And I used, uh, select, c t e and then if nothing to Update, for example, I just used division by by number of rows, which is zero, and it's stopped by error. This is how I stopped and in SS 17, like highlighting this new feature, which is like additional option other people developed. Sorry, I don't remember names, but it's, it's not Andrea Original. I'm, I'm not involved at all. Additional feature will say how many rows to require to continue if like something like minimal rows and, and so on. And you can say if zero rows stop now officially without division by zero, which, which is very good.
[00:08:45] Michael: Just a probably a stupid question. do you know why, if what you really wanted was, let's say, please do this for an hour and instead of having to do the maths, why not have a, a timeframe
[00:08:58] Nikolay: Uh, well now, the, like, Pandora box is open, right? We can, uh, we can propose new options and I, I, my my honest belief like, uh, p SQL itself needs some extensibility improved. you know, like idea of plugins in V V vi, in Tmox, in Sublime everywhere. Yesterday I tried to install something for sublime for my daughter.
It was insane. I, I, I gave up. So I usually use many plugins until I don't because, uh, if you switch computers, switch hosts, and they were like, you are in new environment and it's default via. No control. A is not working anymore. You need to go back to control B as as control. For control, Sequence, right?
So you use Control B. So I stopped using plugins for tmox and VI mostly, but uh, still plugins for p SQL is a good idea. in my opinion, but it's not, uh, it doesn't exist yet. Right? But the extensibility like this for backslash watch, it's good. I think small, very small, but, uh, useful. So let me, um, actually technical feature I like the most is different.
One, this, this feature is, uh, I'm related to it, so that's why I, I, I want to talk about it. But in my call features I like the most. know, PTAT io, the new system view? No, no, no, not it. Op bytes, Column inside it. This is
what I like the most. Op bytes, separation bytes, and, if you check it, it'll be always, uh, 8,192, so eight k, eight eight
[00:10:40] Michael: 8 1, 9 2.
[00:10:41] Nikolay: it'll be filled with, with the same number. Always. It's, it looks kind of silly, a little bit. Right, but why? I, I like it because it changes mindset from buffers to bites. Finally,
[00:10:58] Michael: Yeah. Okay.
[00:10:59] Nikolay: and I checked multiple articles about talking about pets. And they mentioned, oh, you know, there's this strange Column. And they immediately started to say, if you have this number of buffer operations rights or reads, this translates to this number of gigabytes, or something, bi Bibi bytes and so on.
And this is, I like a lot because I know how. Uh, important it is to translate this for Backend engineers who usually, um, it's like additional, uh, step in mind needed to understand what buffers mean and it's
[00:11:35] Michael: And it's not, it's not, simple arithmetic and the numbers get big quickly, so I'm guessing what you just multiply them inquiries where you're looking into this
[00:11:44] Nikolay: yeah. You
[00:11:45] Michael: You multiply them
[00:11:46] Nikolay: How many times my BC in, in my terminal bc you know, like calculator, how many times it received comments, like some number multiplied by eight and divide by 1024 to understand number of MIBI bytes. It's like we already like type it very quickly. Time saved slash 1024 just to translate to.
Maybe bios and it's explain to other engineers, this is, this is a lot, you know, or it's tiny, it's nothing. So I think this strange Column, let me highlight it. It's very good, thing to have, although it's not, I like, ideally I would prefer seeing, derivatives like derived columns from original block numbers.
Derived additional computer. It's called computer. Like it exists in Postgres
also, right? Although
[00:12:42] Michael: this is a view as well,
[00:12:44] Nikolay: it's a view just, yeah, it could be on fly. It'll be good to see, like byte numbers.
[00:12:50] Michael: There's
[00:12:50] Nikolay: block numbers.
[00:12:52] Michael: there's also some functions, old ones not new in this release, but, ones for showing bytes, like for making them prettier, right? So if you've got a number in
megabytes, it would, you don't,
[00:13:02] Nikolay: It's, it's old fashioned. It does respect gigabytes versus gigabytes. It, uh, has issues. Yeah. I, I don't like it. I, us,
[00:13:12] Michael: and do you show everything in megabytes then, or maybe,
[00:13:14] Nikolay: I usually write my own for this.
If, if it's, yeah, it's easy. It,
[00:13:19] Michael: Cool. And do you, um, I know Melanie Plamen, I dunno if I'm pronouncing that correctly. as with most Postgres names, I think they were involved, but I'm not sure who else was.
[00:13:31] Nikolay: Yeah. I, I attended the talk. Uh, she gave at, it was one of very few talks I attended. It was good. I liked it Playing with Pitch Bench a lot to see how IO is working. By the way, this, uh, view of course itself is a big achievement because we always, needed something to answer question how much IO is related to Autovacuum activities, for example.
But I would like to, point that of course this, view shows, uh, high level numbers. It's not actual. It's operations with, buffer pole. So some of those numbers are purely memory io, not this io. So we still need
[00:14:13] Michael: system. Yeah.
[00:14:14] Nikolay: old tricks to find the process id, uh, of auto vacuums or workers and, uh, checking broke.
Process ID io numbers are still valid, but it's still very good. You can perform, um, uh, analysis, in convenient way you can understand your io. And io and Database is the most important thing, uh, is a main source of, uh, latency, right. More io, more latency, all
[00:14:44] Michael: a lot
[00:14:45] Nikolay: buffers, buffers, like pg io is global. Very detailed buffers,
[00:14:52] Michael: Yeah. And system level rather than query level, right?
[00:14:56] Nikolay: right. But from there, you can go to pg_stat_statements and, and so on. If you see backends have a lot of higher rights or reads and so on.
I, I expect many pie charts and maybe some additional graphs, , in, well, may be like, , colorful graphs in monitoring, which will show spikes of, for example, Checkpoint or like backends produced rights or reads and so on.
It's good. It also has additional things like, evictions. Like some extra things. So usually, for example, in AL segments, we don't have it Here we have it, it's great. I mean, it's, development of, , analysis ideas, right? Yeah. Good.
[00:15:39] Michael: Nice. So yeah, on that note is these system views are really helpful and it's great to see new ones being added. I, not related to this conversation, but I saw there was one for weight events coming. Well, at least, looks like it's been committed to. Version
17, which is really cool. But yeah, back to 16.
we probably, like, it feels like this is a really good episode to talk about our favorite features, even if they're small. We'll probably do another one on the larger features. But, so in on that note, um,
[00:16:09] Nikolay: I, I can tell you about larger features. Very in short, like very short about larger features. Almost nothing.
[00:16:17] Michael: or
[00:16:17] Nikolay: gap between what Postgres delivers every year and what people expect continues to grow. This is my opinion.
[00:16:24] Michael: I thought there were quite big changes around logical
[00:16:27] Nikolay: Perfect. Unfortunately, d d L is still under development, uh, logical replication of d d, but the ability to replicate from standbys, it's long awaited feature and also Parallel application, of long Transactions longrun Transactions. Very, very important feature. I like them a lot. More aware. I need them a lot.
We need them a lot. can't wait d g l to happen. It's a very good thing to have. But, , I'm talking about things like, zero downtime things. A lot of them even alter, table zero downtime. Fashion doesn't exist, but upgrades, people need to stop thinking about upgrades and many things, right?
[00:17:07] Michael: Let's talk about it another time.
[00:17:08] Nikolay: yeah, let's, let's, talk about that. yeah.
[00:17:10] Michael: yeah. I have a small feature that's like, it's probably, uh, just based on the bias of what I tend to come across. but my favorite is also a tiny little one, which might be even considered a bug that has been fixed, which, uh, in version 14 explain verb verbose.
So explain with the parameter, verbose would. Give you the query identifier, which was added to PGS stat statements as well. So this is a really cool feature to, uniquely identify, a query, uh, parameterized query. so two queries that are looking for different ID would still give you the same query identifier.
[00:17:50] Nikolay: To join data from various sources, from logs, from activity, from pg_stat_statements, uh, and, and find examples to.
[00:17:58] Michael: Exactly, but, and you brought up the important point logs. There was something that was missing that, that nobody noticed for a couple of versions, uh, or at least nobody reported and was, has been
[00:18:10] Nikolay: Nobody created. Maybe the, the main part is nobody created, nobody proposed a
p probably, or, or proposed it, but it was, uh, not perfect in the beginning. This is, yeah, idea. That idea is lives many, many years and the need is, is there many years? Right.
[00:18:29] Michael: Yep. So it wasn't in, if you used auto_explain with verbose, so it has like auto_explain dot Log verbose setting. even in versions 14 and 15, you wouldn't get the query identified in the logs. And now as of 16 you will. so. I think this is obviously tiny, but there were a couple of really cool things about it.
One is, I think this empowers monitoring tools like we were talking about just now, to if you, if you, for example, even without a monitoring tool, if you are looking at PG stat statements, you find some. Top queries, maybe by, maybe by io. Uh, computer set statements has buffer information in there as well.
And you want an example query plan. so you can look into it,
[00:19:12] Nikolay: Yeah,
[00:19:14] Michael: either an example, query plan or even just an example query, example parameters, or to explain is now a really good way to get that as a version 16. it's as simple as searching the logs for the same identifier.
So that, that's really cool. Um, obviously it makes it easier for tools as well.
[00:19:30] Nikolay: Can you remind me, at the same time, now you, you are, it's possible to explain genetic plan to get, uh, like if you, if you don't have examples, at least you can get like default plan from the planner. And it's also was added in post 16, right? Explain genetic plan.
[00:19:49] Michael: Yeah. Yeah, exactly. now this
[00:19:52] Nikolay: I don't like it. Right?
[00:19:54] Michael: Yeah. I mean, it's just not. I can see how it is useful if you, if you
[00:19:59] Nikolay: find examples.
[00:20:00] Michael: are using prepared statements or if you need to know about genetic plan behavior in general. It's a really cool feature, but I don't see, uh, the people I work with don't tend to need it.
so I, I suspect it's 'cause I'm not seeing the right use cases.
[00:20:15] Nikolay: the right use case, you develop something new, new feature, or completely new product and you don't have data yet. Right. So, but a generic plan will be not good in this case, right?
[00:20:27] Michael: I, I think it's, I think it's more like if something, I think it's more around prepared. I, I don't understand it well enough, uh,
clearly because, uh, but I saw, I saw people were excited about it. Like Lawrence from cyber tech. I can't, i, he definitely blog about it. He might even,
[00:20:44] Nikolay: Locus Epi analyzed. Locus fetal also discussed it. I remember. They, they, they did it at the same time around like in spring, maybe this spring. Yeah.
[00:20:54] Michael: We can link those up for people that are interested in that. , but yeah, in the, performance work I've been doing and seen customers doing, I've not
[00:21:01] Nikolay: I,
[00:21:01] Michael: a case where adding
[00:21:02] Nikolay: see. It's, it's can be beneficial in automation of analysis when you have a lot of queries to Analyze at least somehow. And you need to get at least something and understand, like highlight, like I, I don't like low, low hanging fruits, like for example, sequential scandal on large table. And this can, can be helpful, but it's rough approach. Not fine, fine grained approach.
[00:21:33] Michael: Well, and, and you don't know that it will be using the, like it's not real performance data, right? It's the generic plan, whereas in. You what you really, what's tends to be most helpful is looking at examples that have actually been slow and looking at performance data there. And that's why I love auto_explain.
Like it's helpful not even just so some, sometimes it's intermittently slow, right? If a query is intermittently slow, it's actually quite difficult to look into. auto_explain helps because it captures the plan when it was slow. So I, I prefer to looking at like, And I understand the, I understand the argument is you can't always get the plan for some, like you might have constraints or you might be in an environment where that's not possible.
[00:22:13] Nikolay: I, I'm, I'm trying to invent ideas where it can be solved. Another idea is, uh, to understand how many tables and indexes , can be involved in execution. So, , how many, objects and Database we need to Lock. , in shared mode, , and, uh, there is like some threshold 16 when you reach it, fast path is not possible anymore, and if you have very frequent query, query to be executed, for example, many thousands per second, in this case, you might have issues with Log manager and it hits you and there is no good solution.
So I, I can imagine a generic plan could happen to, could help understand. The number of logs this query needs. And probably this could be like if, if we find, , the second part of information, uh, expectations of Q P s for this query. If we understand it's very frequent query and we see a lot of objects are involved, probably it's time to optimize, proactively not waiting for his Log manager.
Weight event, popping up in activity. But it's kind of quite like, I, I'm not sure I'm talking about something a lot of people understand. We should discuss it another time as well.
[00:23:29] Michael: and it, it's not my favorite feature, so there's a reason.
[00:23:32] Nikolay: Okay. So sounds good. I, I, I, yes, I almost the agenda here, let me not to forget to mention, very good work, which is published , with every release every year. it's, uh, from, NoDa Hett Packard Enterprise Japan , it's big p d f with, , examples of new features for and for every release, uh, they publish it and this time is no exclusion. We can see, uh, PostgREST 16 new features, published already. Let's link it as
[00:24:07] Michael: Oh wow. Yeah, I haven't seen that. I'd love to see that.
[00:24:10] Nikolay: it's usually like better one better two around the time they publish it, and it's great to, like, it's, it's can be used like a, a reference, you know, like, uh, if you're curious about how, how some feature is working, you have practical example and it's great.
[00:24:25] Michael: Yeah, I'll look at that. in fact, you mentioning Japan reminded me of a couple of other things I really liked about this little auto_explain feature was that it was a cross continent, cross company collaboration, I know reviewers often have to be, at least it's considered best practice, that they're not from the same company as the, as the person proposing the patch.
But it's really cool to see people from Japan working with people from Europe, from Amazon, from uh, N t T data, like all, just lots of different people from different. Walks of life and different companies collaborating on the same thing's. Really cool. And the other thing I liked about it was the discussion.
I mean, it, it's probably a reasonable question. People might have wondered when I mentioned that this won't work in versions 14 and 15, why not backpack it to 14 and 15? And I think that's a very reasonable question. And I, I'm not entirely sure it's a good idea not to. But I loved the reason why, which was they didn't want to break any Log PAing tools that people have built. So not even anything within Postgres, but very much something that, so a part of the ecosystem now, I think the most popular ones pgBadger on. I don't think that would break by adding them in. And they might, even if they talk to the maintainers of pgBadger, they might even want this feature, back patch to versions that people would probably more like to be using in the near term.
But, As an external tool provider myself, I love that they consi that people, the developers are considering our needs and what, what might help or hinder us. So really cool that to see that conversation, to see the reasons, to see somebody question it and people answer it.
[00:26:06] Nikolay: Good.
[00:26:07] Michael: Anything else to add from your side?
[00:26:09] Nikolay: No, I think, um, already, uh, out of quarter already. Right. We, we wanted one feature. I expl, I, I discussed, I talked about too many,
[00:26:19] Michael: We will be back with a PostgREST 16 episode, I'm sure.
[00:26:23] Nikolay: okay. Okay. Maybe we'll to overview a wider picture and, uh, let, let's plan to discuss features it doesn't have,
[00:26:33] Michael: Maybe that or that? Oh yeah, maybe both. Um, I'd like to discuss like what the important features are. Like thi this is favorite. I, whilst I claim this is my favorite feature, I d I would never claim. This is the one I think is the most important. so apologies in advance to all the great features that I didn't, that didn't get a mention yet.
[00:26:51] Nikolay: Well, yeah, yeah, yeah. There are many things. For example, uh, improvement of performance of copy command, like three X faster. It's great. And so on. Yeah, of good
[00:27:00] Michael: It's like a couple of hundred or something, isn't it? Like it's so many.
[00:27:05] Nikolay: Yeah.
[00:27:05] Michael: Anyway, thank you, Nikolai. Thank you everybody. Catch you next week.
[00:27:11] Nikolay: It's time not to add features, but to remove some Autovacuum should be removed. Okay. I'm joking. Uh, thank you
[00:27:18] Michael: It's not April the first. Nikolai. Yeah.
[00:27:20] Nikolay: Okay. Okay. Okay. Uh, thank you everyone. Indeed. I see you next time.
[00:27:26] Michael: Take care. Bye.