pg_flight_recorder
Michael: Hello and welcome to Postgres.FM,
a weekly show about
all things PostgreSQL.
I'm Michael, founder of pgMustard
and I'm joined as always by Nik,
founder of PostgresAI.
Hey Nik.
Nikolay: Hi Michael.
Michael: And we have a special
guest, David Ventimiglia, Solutions
Architect at Supabase and the
creator of pg_flight_recorder,
which we're talking about today.
Welcome, David.
David: Thank you so much.
Good morning, and I guess good
afternoon and good evening.
It's nice to meet you.
Michael: Yeah, I think we've got
all three bases covered.
Where would you like to start,
perhaps, with why another tool
in this area?
What's the origin story?
What's the motivation?
David: The motivation — so, as a
solutions architect at Supabase,
my job evidently is to help our
customers, and often they come
to us and say: I've got this
problem.
My database is slow, or this query
is behaving weirdly.
Please help us.
And then we try to bring whatever
tools we have to bear on the
subject.
And to be honest, often we're starting
out with no idea what
the problem is.
And our beautiful customers cannot
be relied upon to, you know,
relay all the information perfectly.
And I just needed more.
I was telling Nik this at one point:
some people say less is more.
I say more is more.
I needed more data and I just wasn't
getting it.
And I did the usual thing that
people would do these days.
Version 0 of this, maybe 4 or 5
months back, was a rush job
with our good friend Claude Code.
And I just cobbled something together
to get the data that I
needed for a particular customer.
And we got the data that we needed
and we were able to get through
that particular instance rather
swimmingly.
And I was pleased by the outcome,
and I thought, you know, let's
try to turn this into something
a little better, something a
little more real.
As a sidebar, I think most of us
are familiar with pg_wait_sampling,
which is an excellent extension,
but sadly is not available on all
managed Postgres services.
Even Supabase...
Nikolay: Cloud SQL
David: is the only one.
Cloud SQL — but even Supabase,
and Alexander Korotkov, who wrote
it, is on our staff, we don't have
that extension on the platform.
And there's, you know, there's
some resistance within managed
platforms to get new extensions
added.
I mean, there's a rich and vibrant
ecosystem of extensions, and
that's one of the strengths of PostgreSQL,
but a weakness is getting
those into all of the places where
you need them.
So I needed something
that was a poor person's substitute
for pg_wait_sampling.
That's really how it started out:
just a worse version of
pg_wait_sampling, written in SQL
and PL/pgSQL.
And then you know how these things
go.
It just took off from there.
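A minimal sketch of that starting point — hypothetical names, not David's actual code — is a poor person's wait-event sampler over pg_stat_activity:

```sql
-- A poor person's pg_wait_sampling: hypothetical names, not the
-- actual pg_flight_recorder schema.
CREATE TABLE IF NOT EXISTS wait_samples (
    sampled_at      timestamptz NOT NULL DEFAULT now(),
    pid             int,
    wait_event_type text,
    wait_event      text,
    query           text
);

CREATE OR REPLACE FUNCTION take_wait_sample() RETURNS void
LANGUAGE sql AS $$
    INSERT INTO wait_samples (pid, wait_event_type, wait_event, query)
    SELECT pid, wait_event_type, wait_event, left(query, 120)
    FROM pg_stat_activity
    WHERE state = 'active'
      AND pid <> pg_backend_pid();  -- don't sample the sampler itself
$$;
```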
Nikolay: Yeah.
There's a lot to unpack here, actually.
And I agree with you, starting
from the end of your intro.
I keep saying extensions are not
extending us anymore, because
in the reality of managed Postgres,
we are limited to only the set
of extensions that are present.
There is pg_tle, which we should
discuss separately — I think it's
a great idea; actually, you told
me about it.
There's a lot happening there too.
Let me tell my story.
I think every experienced Postgres
DBA at least once wrote this kind
of snapshot tool for pg_stat_statements
and other pg_stat views, because
they hold only cumulative statistics.
Some numbers are growing, that's
it.
We need persistent storage.
Usually for bigger clusters we
have monitoring.
But I remember I was working with
a really big company called
Chewy.
And I remember they had great monitoring
already.
Some clusters were on RDS already,
so they also had Performance
Insights.
Still, I wrote my snapshot tooling,
because I didn't fully trust
the monitoring tools.
I also wanted to verify, and some
details were missing because
they didn't capture all metrics.
I needed specific metrics and so
on.
Meanwhile, there are some projects
which implement this idea,
but they are extensions, so they
are not always available.
pg_profile, I think — there is such
a thing, right?
So that's why many, many DBAs,
maybe not all, but many DBAs at
least once wrote some snapshotting
tool.
And I remember our checkup tool
was also snapshotting.
The first versions of the checkup
tool were a shell script.
So we snapshotted it.
You have two snapshots; now you
need a diff.
And since we were in Bash, we didn't
want to do the diff there, so we
sent these snapshots back to the
observed Postgres to do the diff.
So I saw a lot of solutions which
try to make this data persistent,
take snapshots, and then show
everything without the need to
set up full-fledged monitoring.
And even if monitoring exists,
sometimes we still need an additional
lightweight solution.
That's one thing to warm us up on
why it's needed.
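The snapshot-and-diff pattern Nikolay describes, sketched with a made-up table name and assuming pg_stat_statements is installed:

```sql
-- Hypothetical snapshot table over the cumulative pg_stat_statements
-- counters; the diff of two snapshots gives per-interval deltas.
CREATE TABLE IF NOT EXISTS pgss_snapshots AS
    SELECT now() AS taken_at, queryid, calls, total_exec_time
    FROM pg_stat_statements
    WITH NO DATA;

-- Take a snapshot (run periodically):
INSERT INTO pgss_snapshots
SELECT now(), queryid, calls, total_exec_time FROM pg_stat_statements;

-- Diff the first and last snapshots:
SELECT b.queryid,
       b.calls - a.calls                     AS calls_delta,
       b.total_exec_time - a.total_exec_time AS exec_ms_delta
FROM pgss_snapshots a
JOIN pgss_snapshots b USING (queryid)
WHERE a.taken_at = (SELECT min(taken_at) FROM pgss_snapshots)
  AND b.taken_at = (SELECT max(taken_at) FROM pgss_snapshots)
ORDER BY exec_ms_delta DESC
LIMIT 10;
```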
Another thing is that I guess Supabase
has a lot of clusters.
Some of them are,
David: we have a few.
Nikolay: Yeah.
Yeah.
Like millions, right?
It's quite unique, interesting
story.
And many of them are small and
you cannot justify full-fledged
monitoring, right?
David: And getting smaller, yeah.
Nikolay: Getting smaller, yeah,
because people just experiment
so much, right?
And they need so many databases.
David: Yeah, we are now, I mean,
we have a skewed distribution,
of course.
We have a few giant customers,
but we have lots of medium-sized
customers and many more small ones
and millions of tiny ones.
And now with the AI builders, we
are shoveling millions of nano
instances into the AI builder furnaces.
Like where are these databases
going?
What's the long-term prognosis
for these tiny databases?
Who knows, who cares?
That's a completely different animal.
But yes, it's a vibrant ecological
niche that we've developed here.
Michael: Yeah.
On that note, who's this tool for?
Of that distribution, are there
some for whom it's not appropriate
and some for whom it's ideal?
Where does it fit?
David: Yeah, that's a great question
Michael.
Again, as Nik indicated — you know,
time is a flat circle — all
these things have happened before
and will happen again.
You know, versions of this have
been written before and versions
of this will be written in the
future.
There's nothing really that profound
about this, but this is
the tool that I needed right now
at this time for the reasons
that Nik just described.
Who this tool would be for, I think,
would be startups, SMBs, builders —
sort of the canonical Supabase
customer.
Those who are starting out, you
know, building a business, building
a backend, building a project,
they need PostgreSQL and off they
go.
You know, we at Supabase — I don't
think it's any big surprise —
we don't really have that many
migrations onto Supabase.
I mean, we would love to have more;
anybody who's willing to bring
giant workloads over to Supabase,
come on over.
But we know that databases are
infamously sticky tools anyway.
People don't really migrate that
often.
And they're probably less likely
to migrate giant workloads over
from Oracle or Microsoft SQL Server
to Supabase, although we
are entertaining that option.
But if we did do those things,
those folks would probably come
over with DBAs, database experience,
database expertise.
So our sort of customer portfolio
doesn't really reflect that.
Even our largest customers, I would
say, may have 3 or 4 or 5 years
of experience with PostgreSQL now,
by dint of hard effort, but they
all started out small.
Every one of our large customers
was a little acorn that grew into
a giant oak.
And we try to make Supabase easy,
and we do.
It's certainly easy to get into.
I've used this analogy too many
times, but it's like the car
dealership.
You can drive it off the lot in
5 minutes, but actually operating
it, especially at scale, is something
different.
And — Nik knows this, Michael,
you know this — we would all
benefit from more and better
automation, and it's coming.
It's coming from within the community
and it's coming from Supabase.
We will be able to help these customers
more seamlessly and operate
their databases in the future.
But right now what we need is tooling
to help customers as they
grow and as they scale.
So in a nutshell, like who's this
for?
People who are not DBAs, not database
experts.
They just want to run a business.
They want to grow that business
and they want some tools to help
them.
That's it.
Nikolay: Some of them start with
some very small database instance,
paying $25 or some very low number
of bucks, right?
And it's hard to justify paying
$150 or $400 or $500 right away
for a full-fledged monitoring solution.
And then you need to spend time
there and so on.
It cannot be justified easily.
And also I wanted to mention, it's
quite elastic.
So if you just inject this tool
inside your Postgres database,
it starts collecting inside —
self-observed.
And you pay a little bit for those
megabytes per day, I don't know.
Since I helped rewrite its storage,
it's quite efficient.
I used this approach again for
PgQ — rotation of partitions and
truncates — so it's very efficient
and so on.
I'm just saying: you pay a little
bit, and it's self-observed, right?
And when I was thinking about the
pros and cons of self-observed
versus externally observed, ideally
we need both, actually, because
you cannot install an agent right
on an RDS or Supabase machine.
So if you observe it from outside
with an external monitoring tool
and something bad happens, maybe
you don't have connectivity, right?
While this thing, sitting inside,
still keeps observing, right?
David: That's right.
Nikolay: At the same time, if everything
is down, you don't see,
you cannot reach the data, right?
So like external tools also have
benefits.
They have both pros and cons if
you think about it.
It's interesting.
So my realization is that even bigger
clusters should maybe have
a small black box, a flight recorder,
right?
While we have a full-fledged solution
outside — both remote telemetry
and something internal, right?
David: Yeah, that's right.
And I think I landed on the name
pg_flight_recorder.
And then at some point I had some
reservations, because I thought,
sure, but in the event of a crash,
maybe the data aren't available
and it's not really that useful.
But then, I mean, if an actual
airplane crashes, that airplane
also is not really useful either.
That airplane is dead.
No one...
Nikolay: ...will be using it.
Just a side note, I just learned
David has a PhD in astrophysics,
so this name is not a random thing,
I guess, right?
David: And a master's in aerospace
engineering.
But at every turn, I was trying
to do something else.
And I was trying to get away from
computers.
And it just, I just kept getting
sucked back in.
But I grew up in the 70s, when it
seemed like airplanes were crashing
all the time, when they weren't
being hijacked.
Mercifully, that doesn't really
seem to happen all that often.
I'm not a pilot, but it's my
understanding that actual flight
recorders are useful far beyond
crash investigation.
They're useful for optimization,
for troubleshooting in-flight
incidents.
And so I think the nature of this
is hopefully a little bit more
like that.
Nik, you would know better than
I would, but I have the feeling
that in reality, databases don't
really actually crash all that
often.
What they do is they exhibit behavior,
and we want to be able
to investigate that behavior.
And that's what this helps us do.
Briefly, about the tool itself:
again, all it really does is
take snapshots of wait events.
That's how it started.
It's like pg_ash.
Nikolay: We developed it in parallel,
actually.
Yeah.
So when I told David that there
should be something small which
self-observes, David just sent
me the link — it was already done.
It was interesting that we had
parallel courses of development
of pg_ash and pg_flight_recorder.
They're very similar in this case.
David: Yeah, very similar, in that
it captures active session
history, wait events — and then,
of course, nature abhors a vacuum,
and idle hands are the devil's
workshop.
You know, with the tools available,
I couldn't resist the urge
to just keep pouring more into
it.
So it records lock activity and
checkpointer activity and background
activity and IO stats and statement
stats and...
Nikolay: Config changes, right?
David: Config changes as well.
And the conjecture is that there's
some value, if not in capturing
everything, then in having an
opinionated, curated set of many
things that are captured simultaneously,
in a correlated fashion — so that
maybe you experience a checkpoint
storm, and then you notice that
there has been a config change
recently, and you're able to bring
these things together.
So that's the idea.
As Nik indicated, when we talked
about this, version 0 was done.
Again, it's a pretty simple tool.
I had a few guiding principles,
one of which was sort of the
Hippocratic oath: try to do no harm.
I put a lot of effort into making
this safe to run.
Nikolay: Statement timeouts and
so on, right?
David: Yeah — statement timeouts,
circuit breakers, graceful degradation
of some of the components, dozens
(too many) configuration settings,
but then configuration profiles
that capture those to make it
easy to use.
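One such guardrail, as a hedged sketch (the sampler name is the hypothetical one from earlier): set tight timeouts in the same session right before each sampling statement, so the recorder is cancelled rather than left queuing behind a lock:

```sql
-- Bound every sampling statement; if it can't run quickly, cancel it
-- instead of letting the recorder itself become the problem.
SET statement_timeout = '250ms';
SET lock_timeout = '50ms';
SELECT take_wait_sample();  -- hypothetical sampler from earlier
```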
I think I got three quarters of
the way there, or maybe 50% of
the way there, but there were still
some improvements to be made,
among which the storage engine,
which we can thank Nik for rewriting
using PgQ or essentially the engine
that is part of PgQ.
Correct, Nik?
Nikolay: Yeah — partitions, rotation,
I think daily partitions.
There is also a roll-up of all
data, to keep it less precise:
not raw, but aggregated.
Everything is already implemented.
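A rough sketch of the truncate-based rotation idea — hypothetical schema, not PgQ's actual one: partition by a repeating slot such as day of week, and truncate a slot before reusing it, so old data disappears without DELETEs or bloat:

```sql
-- Seven reusable slots keyed by day of week; truncating a slot before
-- reuse drops old data instantly, with no DELETEs and no bloat.
CREATE TABLE samples (
    sampled_at timestamptz NOT NULL,
    dow        int NOT NULL DEFAULT extract(dow FROM now())::int,
    payload    text
) PARTITION BY LIST (dow);

CREATE TABLE samples_p0 PARTITION OF samples FOR VALUES IN (0);
-- ... samples_p1 through samples_p6 likewise ...

-- On rollover, before writing the first sample of the reused day:
TRUNCATE samples_p0;
```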
And I remember I was brainstorming
with Claude Code about what kind
of storage we should choose, because
I think originally you used a lot
of JSON, right?
It's quite bloated, in my opinion,
sometimes.
David: Well, it wasn't JSON, but
originally I was using SKIP LOCKED
and unlogged tables, in a vain
attempt to mitigate dead tuples
and bloat.
But if you're not diligent, they're
still there.
And so that's why you had to rewrite
the engine.
Nikolay: Also, the data format
is interesting.
I had multiple ideas, and I have
a brainstorming document where
I was thinking about what to choose.
Some ideas compressed the data
quite a lot, but they were hard
to deal with, because the data was
encoded so much that it became
inconvenient.
So I made a trade-off choice: it
should be human-readable even in
raw form, although I did apply
some tricks from pg_ash as well —
for example, timestamps are relative
to, I think, 2020 or something.
Like a Unix timestamp, but shifted,
so we have capacity until the
end of the century.
And you waste as few bytes as
possible — a very compact format.
Plus this PgQ-style rotation.
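The epoch-shifting trick, sketched: store seconds since a custom 2020 epoch in a 4-byte integer instead of an 8-byte timestamptz; a signed int4 of seconds from 2020 only overflows around 2088:

```sql
-- Encode: seconds since a custom 2020 epoch fit in a 4-byte int.
SELECT (extract(epoch FROM now())
        - extract(epoch FROM timestamptz '2020-01-01 00:00:00+00'))::int
       AS secs_since_2020;

-- Decode: reverse the shift.
SELECT timestamptz '2020-01-01 00:00:00+00'
       + make_interval(secs => 123456789) AS decoded_ts;
```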
And also worth mentioning: there
is a soft requirement, pg_cron.
It's not a hard requirement, but
it's highly recommended, because
this is how it ticks, right?
David: That's correct.
So again, it's a simple tool.
It's two sort-of packages — two
simple install scripts, two schemas —
one of which is required and one
of which is optional.
The required part is the data model:
the tables, the views, and the
functions to record those data.
And then the other optional piece
is a set of functions for analyzing
those data.
But again, they could be analyzed
in raw form in whatever way
you like.
But as Nik indicated, somebody
has got to generate the ticks.
Somebody has got to force the samples,
and that could be pg_cron, it
could be an outside scheduler —
somebody's got to do it.
The sort of default way is with
pg_cron.
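The default wiring, as a hedged example (the sampler function is the hypothetical one above); pg_cron 1.5+ accepts second-level intervals like '1 second' alongside standard cron syntax:

```sql
-- Schedule the tick; job name and sampler are hypothetical.
SELECT cron.schedule(
    'flight-recorder-tick',
    '1 second',
    $$SELECT take_wait_sample()$$
);
```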
Nikolay: And pg_cron is available
everywhere, right?
David: Yeah, pg_cron sort of snuck
in before the sort of iron
curtain started to drop on extensions,
maybe.
So it's in a lot of places at least,
you know.
Michael: Yeah, maybe that's true.
I got the impression it solved
such a useful problem and was
from such a reputable author that
people trusted it, and also thought
it was simple enough that they
could maintain it.
I understand why managed service
providers don't just offer any
extension, but if you think about
how much work it would be to
maintain pg_cron if the author
ditched it — actually, it's not
huge, and it's...
Nikolay: ...so useful.
And it should be in core.
David: There's all of that.
That is true.
Nikolay: I wish pg_cron was in core.
And we would have, for example,
automated new partition creation
out of the box without any extensions.
Magic.
Such a simple thing, actually, right?
Michael: One thing about pg_cron,
from a database perspective: I
think once per second is the finest
you can schedule.
So what do you use, David, when
you're using this with people?
Do you use pg_cron with a one-second
tick, or do you suggest
something else?
David: So far, I have used it just
with pg_cron.
That resolution has been sufficient
so far, because with the customers
I've worked with, the resolution
before was, I guess, infinite, in
that they didn't have this at all.
So it's just worth having the data.
And a finer resolution, I haven't
encountered a demand for that
or a need for that yet, although
certainly plausible.
But again, yeah, that has worked
well so far.
And again, this is just a simple
tool.
The idea, the objective anyway,
is sort of set-it-and-forget-it.
Install the tool, then forget that
you've installed the tool.
Because, by virtue of Nik's hard
effort, it is safe to run.
So just forget that it's there.
And then you have an incident,
and then you think, oh wait, I
have pg_flight_recorder.
Let's find out what happened.
Nikolay: And just point your AI
at the data — maybe a dump of
that data or something — and that's
it.
And the second package you mentioned
has interesting functions,
like what_happened_at or something,
right?
It's based on function names.
I already see your thought: oh,
AI should guess, right?
You designed it for that, so it's
self-explanatory, right?
So it's great.
But I also wanted to ask about
pg_cron a little bit.
Since version 1.5, I remember, the
lowest interval is once per second,
and I use that for pg_ash; but I
guess you use it by default at a
much lower frequency, especially
for ASH data, right?
Maybe once per 30 or 60 seconds —
but it's tunable, right?
David: Everything is tunable.
You know, default sample collection
is, I think I have it set
to once per second, but There are
aggregates that are taken at
a coarser resolution.
There are roll-ups that are taken
at a coarser resolution.
Data are archived at a coarser
resolution.
Then there's retention for the
core tables, as I think that my
default is 7 days for the aggregates
of the 7 days.
And then for, for snapshots, I
think it's by default 30 days,
but all of these are configurable.
So there are a few different cadences
happening.
And again, as you indicated, on
the analyze side there is a wall
of functions, appropriately named,
meant to be understandable by
humans and AI alike, so that they
can use these functions to analyze
the data.
But then, of course, it's always
available to be analyzed in
raw form as well.
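Illustrative call shapes only — guessed from the function names mentioned in the conversation, not a confirmed API:

```sql
-- Hypothetical call shapes, inferred from the names above:
SELECT * FROM flight_recorder.what_happened_at('2025-06-01 14:03+00');
SELECT * FROM flight_recorder.incident_timeline(
    '2025-06-01 13:30+00', '2025-06-01 14:30+00');
```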
Nikolay: Yeah, I can share an
interesting story from PgQ about
function names.
When I was recently developing
client libraries for PgQ, Claude
Code made the same mistake multiple
times, because there is a function
force_tick — but it doesn't tick;
it just shifts the pointer.
Then you need to run the ticker
in a separate transaction.
And Claude couldn't get it, because
it's actually a confusing name.
And I had a huge flashback to
15-plus years ago, when I made
the same mistake manually, without
AI, because it was just as confusing
to me back then.
So I renamed that function to
force_next_tick.
Then you immediately understand
this is about the next tick:
you're not ticking, you're just
preparing that job.
And looking at your functions —
what_happened_at, incident_timeline —
I'm just thinking, this is self-explanatory.
Maybe long, but everyone will
understand what it's for.
So it's worth it.
David: Yeah.
Yeah.
That's right.
And if there are too many functions,
then, again, we can use
AI to paw our way through and figure
out which ones to use.
Just a final thought.
It's meant for a few things — not
just incident response, but
also capacity planning, blast-radius
evaluation.
You know, where I intend to go
with this — just getting back to
Supabase briefly: I have lots of
customers that I have to help, I
should say I'm blessed with helping,
and so many of them I want to
get to early.
It's, you know, again we talked
about this, but all of these
databases are small.
They start out small anyway.
I would say that from a certain
point of view — the point of view
of scale — many of them are sort
of doing things wrong, but that's
okay, because they're small.
With a small database, you can
do everything wrong and it's fine,
no problems.
But it's when you start to scale
that you need to think about
this.
So it's like exercise.
It's something you have to get
into the habit of doing early
and often, even though you don't
want to, and maybe you would
benefit from a personal trainer
and some encouragement to get
you on the path early, so that it
pays dividends when you're old,
like I am — and when you're a big
database, like some of these
eventually will become.
Nikolay: I also wanted to mention:
by default, it's consuming
up to a couple of gigabytes for
those 7 days, right?
But again, it's tunable if you
want less.
David: Yeah, it's tunable.
It's a few gigabytes.
Nik, I think you and I benchmarked
this.
With the new storage engine, I
think we estimate, on the happy
path, maybe around 20 gigabytes
for the month.
Nikolay: It depends.
David: One of my goals is to draw
more attention to this so that
I can get feedback and improve it.
And there is some low-hanging fruit
to be plucked insofar as data
retention goes.
Nikolay: I think it depends on
how many queries you have in
pg_stat_statements — by default,
up to 5,000.
And also, you collect data about
indexes and tables, so it depends
on how many tables and indexes
you have.
I think you have limits there,
but still, it depends a lot on
the cardinality of these things.
About use cases: I used it recently
for benchmarking PgQ.
For me it's so natural — I've
already had multiple projects
like this.
I injected both pg_ash and pg_flight_recorder,
because pg_ash has higher frequency
and more detail in the ASH data,
while pg_flight_recorder brings
a lot of other stuff, right?
So I just injected it into a
synthetic database, provisioned
with pg_cron configured.
And I just asked AI, of course,
to do it: inject it, then don't
forget to dump after each run,
and then visualize it.
That's it — only three sentences.
David: Yeah, exactly.
Nikolay: And this is how I created
beautiful-looking visualizations.
I actually asked it to animate
the benchmarks, because it's great
to watch how the lines move.
And this is what brought PgQ good
attention, because this data is
easy to understand, right?
So, for example, how much WAL was
generated.
Lots of stuff: what was the behavior
of the checkpointer, or autovacuum,
and so on?
Michael: That was going to be my
next question, actually, on the
WAL front.
You mentioned a few minutes ago
how you originally went with
unlogged tables.
Does that mean these are now logged
and there is...
Nikolay: ...WAL generated?
Yeah, it was my decision.
First of all, an important limitation
of all those tools which tick on
pg_cron, write something, and are
PL/pgSQL only: they're primary-only.
But we live in this situation —
strange for me, an old DBA — where
a lot of clusters are single-node.
I even have cases where clients
come with 10 or 15 terabytes
on a single node, and they are fine.
Cloud resources became quite
reliable.
For some, I actually think it's
okay, because backups matter more
than HA: they are fine with some
downtime, but not with paying for
an additional couple of nodes and
so on.
So anyway, this is primary-only,
because we cannot write to...
Michael: Can I add something to that?
Because you say single node, but
I think that's slightly simplistic
because a lot of ones I see, they're
HA, but the replicas are
not read replicas, they're like failover
replicas.
Nikolay: Shadow standby node.
Michael: I wouldn't call that a
one-node cluster, but still,
you only need to...
Nikolay: ...monitor the primary.
Yeah, you're right actually, you're
right.
But in this case you are not
interested in it, because you don't
have any workload on that standby —
a hidden standby, right?
Michael: Exactly, exactly.
Nikolay: So you are interested
in the primary, and we see so many
projects reaching dozens of terabytes
already which are single-node.
You inject it, and we need to
understand: okay, it's self-recording,
so it's going to produce some writes.
If it's an unlogged table and it
crashes, the data is gone.
That's the key idea.
We cannot afford...
David: At least where the data
lands initially, those data would
be gone, right?
Back when they were unlogged tables.
Nikolay: You'd need to snapshot it;
you'd need external means.
So to understand an incident after
a crash, we should use regular
tables.
And now that we've redesigned the
storage, it's not super expensive.
Of course, there is some WAL to
be written and some data storage
to be paid for.
And of course, a little bit of
shared buffers is occupied by our
data.
If you have replicas, it goes to
the replicas too.
Maybe that's not a bad thing: if
it's for the health of the primary,
we can pay this price.
Michael: What are we roughly talking
about?
You mentioned a few gigabytes, up
to maybe 20 gigabytes — that
kind of amount for storage.
What are we talking about in terms
of WAL generation by default,
just to give people a rough idea?
Nikolay: I don't remember.
I thought about baby clusters once —
like 1 gigabyte; the free tier
is up to 1 gigabyte, right?
So I thought they should be able
to afford this, maybe tuned a
little to a lower frequency or
something retention-wise.
But it's...
Michael: Storage now, or WAL?
Nikolay: Both, both.
Michael: Okay.
Nikolay: They are connected, actually.
If you need to write...
Michael: You mean on Supabase?
Nikolay: Anywhere — any Postgres.
If you need to write 100 megabytes
to storage, you will — very roughly —
produce close to 100 megabytes
of WAL, because it's the same data.
Yes, in a different form, but it's
the same data, right?
If you need to write 100 times
more, expect roughly 100 times
more WAL.
David: Yeah, order of magnitude,
it would be about the same.
Nikolay: Yeah, very roughly.
Of course, full-page writes,
compression — it can vary quite
a bit.
Michael: I also think of them very
differently, because WAL I think
of as megabytes per second, always —
always with a time component,
if that makes sense.
So it's a constant amount that
we're generating; over a month or
whatever it is, that's a few
gigabytes.
But I guess that doesn't actually
add up to very much per second.
Nikolay: I think we should expect
something like 100 to a few
hundred megabytes per day, with
a lot of queries and indexes
and so on.
Yeah.
David: I mean, if you like ballpark
math: if it were 30 gigabytes
of data per month, that would be
roughly, by the power of arithmetic,
a gigabyte per day.
Nikolay: It's very stable.
David: And dividing by the 86,400
seconds in a day, you could figure
out the per-second rate of WAL
generation.
Nikolay: Kilobytes already, maybe,
right?
And it's very stable, because it
depends only on this cardinality.
If you have spikes of workload,
it doesn't affect the amount of
data these snapshots write to WAL
and the data directory, right?
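(Worked through: 30 GB a month is roughly 1 GB a day, and 1 GB spread over the 86,400 seconds in a day comes to roughly 12 KB per second of WAL on average.)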
David: It's a baseline.
So just, you know, you're paying
maybe $25 to Supabase, maybe
pay $26 and just pay for a little
bit more storage.
Nikolay: Not hundreds more, as
you would pay if you installed
full-fledged monitoring.
David: And it's, to first order,
only for the primary node.
Again, all of these small projects
are starting out with only
one node anyway.
I mean, life is complicated, and
these people are just trying
to get a business off the ground
and a job done.
They're not thinking about multiple
nodes, especially early on, but
the three of us know that they
will need data to guide them on
their journey.
So this is just part of that.
I mean, Michael, you work in the
observability space.
If we dilate this out to a wider
view, this is just another entry
in the observability space — maybe
a new generation 0.5 of observability
tools for PostgreSQL.
And there's pg_ash, and there's...
Nikolay: So it's not only about
observability.
Actually, "new kind of breed" is
the phrase you used, I think,
in our discussions.
So I have pg_ash, you have pg_flight_recorder.
I'm also trying to revive PgQ in
this very format: pg_cron and PL/pgSQL
only.
That's it.
So it can be installed anywhere
and just tick.
I also have Lindex, not yet released,
which rebuilds indexes on pg_cron.
That's it: you inject PL/pgSQL
and tick on pg_cron.
It's super simple.
I'm already thinking about a tool
for automated partition creation
without heavy tools.
I don't know — it should be easy
to use.
But then you mentioned pg_tle.
David: Yes.
Nikolay: So can you maybe elaborate
a little bit why pg_tle?
Why not just a single SQL or
PL/pgSQL file?
David: It's both.
So, for those who don't know, TLE
is trusted language extensions —
I have my own view on that.
I regard it as just a little bit
of extra housekeeping associated
with a simple SQL install file.
TLEs sort of dress up SQL and
PL/pgSQL code as if it were a kind
of managed extension, but they can
be installed without superuser
privileges.
pg_flight_recorder offers both:
simple install scripts — you can
use psql to install it — but it's
also available as a trusted language
extension that can be installed
through, I think, dbdev, because
that's how some people want to be
able to install things.
Nikolay: Just to track like metadata.
David: Yeah — it makes housekeeping
a little bit easier.
You can slide pg_flight_recorder
in with an install and if you
don't like it, you can uninstall
it in a very managed fashion.
So that's all that's meant there.
But yeah, TLE is a very lightweight
way to have managed extensions,
and Flight Recorder offers that
as well.
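The TLE route, sketched: register the install SQL with pg_tle, then create it like any extension. pgtle.install_extension is pg_tle's real API; the version string and body below are placeholders:

```sql
-- Register the extension body with pg_tle, then create it normally.
SELECT pgtle.install_extension(
    'pg_flight_recorder',
    '0.1.0',
    'in-database flight recorder',
$_pgtle_$
    -- the extension's install SQL goes here
$_pgtle_$
);
CREATE EXTENSION pg_flight_recorder;
```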
Nikolay: Yeah, I actually wish
pg_cron and TLE were both
inside Postgres itself.
And we would say something like
CREATE PACKAGE or something,
I don't know.
And it's just a bunch of SQL and
PL/pgSQL — just glue code, or maybe
PL/Python if you want, anything —
and it can just be installed
anywhere, with versioning and so
on, with tracking of CVEs, if
there are any, and so on.
Who knows.
Yeah.
But extensions just don't feel
like part of the extensibility
of Postgres to me anymore.
This is my honest feeling lately.
David: Well, what also worries
me is: with major version upgrades,
let alone minor versions, who's
testing all of these extensions?
Nikolay: But this question is also
applicable to any regular
backend code.
You use some libraries; you just
import them into your code, include
them somehow, and that's it.
And versions also matter there,
and it's on your shoulders.
It should be on your shoulders.
This idea of "we're not providing
some extensions because we would
need to maintain them" — give it
to the shoulders of the users.
That's a different part of the
thing.
David: That's in tension with
products which telegraph, or
advertise, that they're easy to
use: you don't need to worry, we
will handle it for you.
Nikolay: Yeah.
But I think it's great to have
flexibility, so people can choose...
But they need to be responsible
for upgrades and part of the
maintenance, as they already are
for libraries in Go — any language,
right?
TypeScript and so on.
So there's something here.
And I think it was the AWS folks
who created pg_tle, right?
So definitely this project was
created with the realization that
something is limiting people here —
let's bring something.
That's a great idea.
And I wish... I mean...
David: Michael, do you experience
that at all with your customers —
their challenges?
I know you work in a slightly
different space, but you must
certainly encounter this as well:
tensions with extensions, with
managed database providers.
Michael: Yeah — I don't speak to
people all the time about this
kind of thing, but I get the
impression that people are looking
for a little bit of advice, almost,
from their managed service provider
on which extensions they should
trust, which are the best at what
they do.
Often there's a choice of two or
three, and they kind of want their
managed service provider to pick
one and say: this is the one we
suggest, or this is the one we
support.
I feel like there's a little bit
of that going on as well.
I don't know if it's king-making
or something, but people saying:
this is the one everyone is using.
When people come to Postgres for
the first time, they ask: which
backup tool should I use, which
monitoring tool, what's everyone
else using?
And there's no official answer;
there's barely any kind of extension
management system.
There have been about three or
four created, and I think there
is still — is it PGXN? — that's
probably the most used.
But it doesn't have reviews; it
doesn't have a lot of the things
people are looking for: which ones
are actually used, which ones
people actually like, which ones
have a good track record when it
comes to major versions, or maybe
no CVEs or very few — that kind
of thing.
So I think trust is a big part of
it, and also people want a shortcut
to which of these they should
be using.
Nikolay: I actually agree.
Supabase's database.dev is another
attempt to have this registry
of extensions, right?
David: It is yet another attempt —
one that seems to be somewhat
honored in the breach — but it's
yet another attempt.
Michael, I definitely take your
point that there is a need, and
I would say a growing need, for,
if not king-making, at least someone
to offer guidance and to bless
these things.
You would know as well as I would:
the sort of persona of database
operators definitely does seem
to be changing.
I mean, there once was a time when
databases were an arena for
people to sort of develop and then
project expertise, which is
certainly true.
But more and more, I encounter
people — customers, Supabase
users — who are very candid.
They will say to me or to us, I
don't know what I'm doing.
I'm not a DBA.
Some of them will say, I'm not
even a tech.
I'm not even technical.
I'm a founder.
And I just vibe coded my way into
this.
And, you know, there's less of
an urge now than there was in
the past to sort of burnish your
credentials as a database expert.
People are, quite happily, very
candid, and they will say:
I am not a database expert at all,
so please, can you help us?
Can you offer guidance?
If you tell me what extension to
install, I'll install it.
So there's growing need for those
kinds of tools and for greater
and better automation.
That can be a topic for another
session.
That's where my mind is.
Nikolay: And I actually agree about
this kind of authority — what's
good, what's reliable.
Sometimes I have cases where we
have huge self-managed Postgres
clusters, and when I'm saying we
should add some extension, I say:
it's available on this managed
platform, so it's reliable — you
know, let's add it.
It helps me speed things up.
So, yeah.
Michael: I had a couple of last
things it'd be great to get
your thoughts on.
One is: is there anything we haven't
talked about that's one of your
favorite features of the tool?
I've seen there are quite a few
in there.
And then also: is there anything
missing that you really want
to add?
David: In reverse order.
Things that I want to add: I would
really like to fortify this against
observer effects, against deleterious
consequences.
I already know that there are some
important fixes to be made
and I'm committed to doing that.
My tender hope is that people will
use this to some degree, so
that I can get feedback.
If it needs to be strengthened,
I will strengthen it.
I will pour effort into it to make
sure that happens.
So I wouldn't say there are features
that I want to add.
It's probably already bloated in
terms of features anyway.
So maybe I'll just stop in terms
of adding features.
In terms of favorite features:
capacity planning, Nik, you know,
is another worthwhile endeavor
besides incident management.
And it's something that is sorely
needed for Supabase customers.
There are functions within Flight
Recorder to help you project
your capacity needs.
Like everything else, those functions
will be strengthened and
improved, but I really would like
to exercise those.
I'd rather this tool be used to
forestall problems than to
investigate them.
Let's just not have problems at
all.
Nikolay: You touched on several
things.
I wish we had a separate episode
on each — actually, how we would
approach incident response and
RCA with this tool, step by step
in particular; it's possible.
And also capacity planning, I agree.
And one thing related to observer
effect: everyone who is using
this new breed of tools ticking
on pg_cron should remember
that pg_cron records logs, right?
And you need to clean them up.
And I think we need to team up
and bring some pull requests to
make this configurable and make
it the Unix way, the Linux way:
when everything is cool, don't say
anything and don't log anything,
because it works, right?
Like some levels, right — a warning
or error level for each job
in pg_cron.
And another thing is verbosity:
pg_cron's logging also depends on
the tick frequency, and it can
produce a lot of bloated logs
right in Postgres as well.
Especially if we go to sub-second
frequency, which was just implemented
for PgQ — running a single stored
procedure, ticking 10 times within
one second.
Yeah, I don't know if that's needed
for pg_flight_recorder; maybe
not at this point.
It's too much precision, right?
Yeah, anyway.
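The cleanup Nikolay refers to: pg_cron keeps one row per run in cron.job_run_details, so a frequent tick needs pruning — for example, with one more (daily) cron job:

```sql
-- Prune pg_cron's run history daily, keeping two days.
SELECT cron.schedule(
    'prune-cron-history',
    '0 3 * * *',
    $$DELETE FROM cron.job_run_details
      WHERE end_time < now() - interval '2 days'$$
);
```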
David: Anyway, yes.
So there is a soft dependency on
pg_cron.
pg_cron is also maybe a little overly
chatty.
My kitchen wall clock just silently
ticks away.
It doesn't generate a daily journal
of the fact that it ticked.
That's what I would like.
This is what we have to live with.
So yeah.
Nikolay: Let's create pull requests.
I can create them, or you can,
and we'll just support each other.
Maybe the pg_cron maintainers
will agree that it's...
Actually, I opened an issue.
I didn't see feedback from them,
but it's definitely an issue.
It was an issue before we started
creating these tools.
I have other places where pg_cron
was chatty in the logs.
Yeah, anyway.
And a final thing for me: a good
place to start is benchmarking.
If you do benchmarks, just inject
these tools — pg_flight_recorder —
and ask your AI to visualize the
resulting data.
Before destroying the instance or
the database, just dump that
data and visualize it.
It's so easy these days.
David: It is.
Yeah.
The nature of these tools is changing
and this makes it super
easy.
That's what I learned as well.
But yeah, that's it.
It's not very profound, it's not
very complicated, but we need
more and better automation in
the community.
This is just one small contribution.
Many more to come, I think, from
you two — and hopefully me and
others in the community.
So looking forward to that.
Michael: Nice one, David.
And just to check, what's the license
here?
David: It's as generous as I could
make it.
Michael: Nice.
So, open and permissive.
David: Open, yeah, exactly.
Michael: Supabase style.
David: Supabase style, it is.
I work at Supabase, but this belongs
to, if anybody, to the
community.
Michael: Wonderful.
Lovely to meet you.
David: You as well.
Very kind.
I appreciate it.
Nikolay: Thank you.