Michael: Hello and welcome to Postgres.FM, a weekly show about

all things PostgreSQL.

I am Michael, founder of pgMustard, and I'm joined as usual by

Nikolay from Postgres.AI.

Hey Nikolay.

Nikolay: Hi Michael.

Michael: And today we are delighted to be joined by 2 excellent

guests who have each contributed a lot to Postgres over many

years now and who both recently published blog posts on the topic

we're going to be discussing.

Let me introduce you both quickly.

First we have Gülçin Yıldırım Jelínek, who co-founded the Prague

PostgreSQL Meetup and is a staff engineer at Xata.

Welcome Gülçin.

Gülçin: Hello, thank you for having me.

Michael: We're delighted to.

And we're also honoured to be joined by Robert Haas, long-serving

PostgreSQL major contributor and committer and VP Chief Architect

Database Service at EDB.

Welcome, Robert.

Robert: Hello, thank you for having me.

Michael: It's our pleasure as well.

So to kick us off, I've prepared a couple of questions to ask

each of you in turn, but I'd also like to encourage you to ask

each other questions as we go along.

Perhaps we can start with you, Gülçin.

What are your high-level thoughts on the topic of "is pg_dump a

backup tool?", and why is it something you wanted to write about

recently?

Gülçin: It is funny because I didn't actually want to write about

pg_dump.

I just joined my current employer Xata and it was my first week.

And then I noticed something in the Discord channel that we have.

Somebody was having an issue with pg_dump.

I was like, oh, what's happening?

And I saw some parameter that I didn't recognize in

the error message: restrict_nonsystem_relation_kind.

I was like, I don't know this configuration option or anything.

And then I noticed it was actually introduced recently at that

time.
And I was like, oh, okay, why?

And then I checked it, and it is related to a CVE.

I remember the number, CVE-2024-7348.

It doesn't matter, but there's a blog post about it, so you can

find it with this number.

And in there, it explains what this vulnerability is and

how people can use it to potentially compromise

your database, because it affects

pg_dump.

So an attacker can create a non-temporary object in the database.

And then just before pg_dump begins, they replace this object with

a different thing, like a view or a foreign table, so they can

inject SQL there.

And then when pg_dump attempts to do the backup, it can run

the injected SQL code.

So why are we there?

Because it affects it.

And then I said, hey, this affects from Postgres 12 to 16, upgrade

the Postgres versions and test if pg_dump scripts are working,

review the user permissions, the standard recommendations when

this kind of thing happens.

And then when we were sharing this blog post on Twitter, I think

our marketing team wrote something like, "pg_dump, a tool to

back up Postgres databases."

It was the definition of that tool, basically.

And then I read through it and everything, and I noticed, oh,

people are saying, you know, the usual, when you say something

about pg_dump, it is not a backup tool.

And I was like, okay.

And then basically it kept going.

So I had to write another blog post to say, is it really, or

is it not?

Nikolay: Who first said this?

Gülçin: I don't know.

I didn't know this because there were so many.

And I know that because in Postgres community, whenever pg_dump

topic opens up, somebody will say, you know, pg_dump is not a

backup tool.

But then actually a few days before this discussion happened,

Peter Eisentraut committed a change, which will be in effect

in PG18, that tries to remove the backup terminology and kind

of converts to export so that people are not considering it as

a backup tool in a way.

So this, I think, made people to be more vocal, saying that,

look, this was how it was before, but not anymore, and you should

not say it.

And then I had to write another, I mean, I felt like I should

write something more about it to explain why we can or cannot

consider pg_dump a backup tool.

And in my opinion, it is a tool that can be used to backup a

database.

And it is a logical way of doing a backup.

You can call it a dump.

Maybe you can define it. You know, Nikolay was asking, is it a backup

tool, yes or no, or define backup.

So it can be a backup.

In my case, when I was working as a DBA for a long time,

I was using it to back up databases.

It depends basically on the context of how you use it and the nuances

that you can actually utilize this tool.

So yeah, that's where I stand today.

I don't agree that it is not a backup tool.

It can be a backup tool, but there are maybe later on in the

discussion we can discuss what are the drawbacks with it and

how actually regular backup tools that are out of Postgres can

help like pgBackRest or something.

Nikolay: Can I jump straight away with a question?

Gülçin: Yeah, please.

Nikolay: Yeah, I saw also comments that it's maybe for very large

databases like many terabytes and more it's not a good tool,

backup tool, but at least for small databases it's good and also

partial, you can export only 1 table.

Imagine we have a tiny database, just like, I don't know, like

100 rows, 1 table.

And I SELECT * from this table in psql and just make a picture

on my iPhone.

This is a backup, this picture. We can restore from it, right?

Gülçin: Well, maybe it's a snapshot, right? Yeah, why not?

Nikolay: Well, a dump is also a snapshot, right?

Gülçin: Yeah, and so I don't really see why it can't be

called a backup.

Nikolay: Okay.

Gülçin: It is a moment and you can use that moment to do something

with it.

Michael: I thought it was a really good blog post Gülçin, I'll

share it in the show notes as well for anybody that hasn't seen

it.

And speaking of good blog posts on the topic, I think Robert

added a lot of good points as well.

Both of your first blog posts included a lot of technical details

like the technical aspects of why it technically could be considered

a backup tool, but also the many drawbacks of it

and why you might recommend using something else as a general-purpose

backup tool.

So Robert, how about yourself?

How would you summarize your high-level thoughts on the topic

and why it was something you wanted, or maybe didn't want, to

write about?

Robert: Well, I think it just kind of got under my skin, because,

you know, Gülçin's blog post was not the first time that I've

heard people using this "pg_dump is not a backup tool" line,

and to me that came across as shouting at people without

necessarily giving a reasoning, right? You know, the

documentation said for literally 20 years that pg_dump could be

used to make backups, or I don't remember exactly what the wording

was.

When I looked into the history in Git, I actually found that

the language that it's been changed to now with exporting the

database is very similar to the original language that was used

to describe pg_dump when that code was first added to PostgreSQL,

but there was a 2 decade period
in the middle when the documentation

said, hey, you can use this to
take backups.

And from my point of view, it doesn't
even matter whether that's

true or whether you think that's
true.

If the documentation said for 2
decades that X piece of software

could be used to do Y, then nobody
should get in trouble for

saying that.

Like, nobody should get called
out for saying that.

That just doesn't make any sense
to me.

Like, I mean, honestly, I think,
you know, some of us, myself included,

can be a little too eager to jump
on people's case from time

to time.

And I don't think that's like good
for our community.

I think we wanna be the kind of
community where when people show

up we give them help, we give them
good advice, and we don't

come down on them like a ton of
bricks.

And Gülçin is not the only person
I've seen who seemed to me

to be kind of getting beaten up
a little bit.

And I was just like, why are we
doing this?

Like, clearly, pg_dump isn't right
for every purpose.

And there are lots of situations
where it's probably not what

you want.

But the tone is just baffling
to me, because it seemed very

hostile to me and I couldn't make
any sense of really why we

should be that hostile about anything,
but especially why we

should be so hostile about that
in particular.

Gülçin: And to that I actually
have something to add, because

after this discussion started to
come up again, and I was looking

at the groups where people
are actually using this

"it's not a backup tool" rhetoric.

And I've seen a few users who
are trying to get help from the

Postgres communities that we have
online, a lot of them.

And I noted 2 of
them for today.

1 of them is asking whether pg_dump can
limit a backup by schema.

I mean, they're using this sentence,
and there's somebody answering

directly,

not related to the question: "pg_dump
is not a backup."

And then there's another user:
"Can someone send me the command

to take a backup of a partial database?"

Which actually pg_dump can, right?

We can do the schema only, we can
do just the data, whatever,

or we can do a table, any type
of object.

And then the answer is like, "There
is no such command.

The standard backup tools take
backups of the entire database

cluster."

So basically, it doesn't consider
it as a backup tool, even though

there is a pg_dump command that
can actually do what people are

asking.

So that's what I find very not
helpful, right?

We could just say to people, look,
this is this pg_dump command

that you can actually take this table that you want to take,

or selective restore, whatever you want to do with it, and help

people to the direction that they're actually trying to get there.

Instead, they just say there is no such command.

It's not a backup tool anyway, because the standard backup tools

take the entire database cluster.

So that I don't find helpful, which is what Robert is saying.

That is not helpful at all.

You might not agree that it's a good tool to use as a backup

solution, and we can talk about where it could be improved or

why people should prefer backup solutions.

But this is still not helpful.

There's a tool that we were all using for a long time and it

can do all the things that these people were asking.

So nuance of the question matters, the context of it.

And that's where I am, basically.
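
The partial dumps Gülçin mentions (a single schema, a single table, schema only, data only) map onto standard pg_dump options. The following is a minimal sketch in Python that only builds the argument list; the helper name `pg_dump_args` and the example database, table and file names are made up for illustration, and nothing here is taken from either guest's blog post.

```python
from typing import List, Optional


def pg_dump_args(dbname: str, outfile: str, *,
                 table: Optional[str] = None,
                 schema: Optional[str] = None,
                 schema_only: bool = False,
                 data_only: bool = False) -> List[str]:
    """Build a pg_dump argument list for a full or partial dump.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if schema_only and data_only:
        # pg_dump itself refuses this combination as well.
        raise ValueError("schema_only and data_only cannot be combined")
    args = ["pg_dump", "--dbname", dbname, "--file", outfile]
    if table:
        args += ["--table", table]      # dump only the matching table(s)
    if schema:
        args += ["--schema", schema]    # dump only the matching schema(s)
    if schema_only:
        args.append("--schema-only")    # object definitions, no rows
    if data_only:
        args.append("--data-only")      # rows, no object definitions
    return args


if __name__ == "__main__":
    # Example names are assumptions, not from the episode.
    print(" ".join(pg_dump_args("appdb", "orders.sql", table="public.orders")))
```

The resulting list can be passed to `subprocess.run` once connection settings are in place.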

Robert: And you know, if somebody asks about how to use pg_dump,

and you want to tell them, hey, here's how to do that thing with

pg_dump, but maybe you want to consider some other alternatives

instead.

Cool, like I got no problem with that.

That can be helpful advice.

But like pretending that the thing that they're asking about

doesn't exist when it does, that I just don't understand that

at all.

Michael: So yeah, I've definitely got some theories as to why

people are behaving and speaking like that.

But I do think...

Nikolay: 1 of those people is right here.

I can speak to him if you want.

Michael: I wanted to, yeah. I think you've got some

really good language around this, Nikolay, around logical backups

and physical backups, that really helps clarify. And I think if

people used that language in those sentences, it would immediately

help with clarity and also with the limitations. But I'd love to hear your

high-level thoughts on the topic as well, and why it's

something you say.

Nikolay: So this statement, "pg_dump is not a backup tool",

is a reaction to the statement the documentation had for 20 years. And we

saw so many disastrous situations in many companies who tried

to rely on this as a backup tool while growing. And actually

it was not my statement, right?

I just picked it up, right?

I think Franck Pachot also mentioned it.

I'm not sure he was the first who reacted to Gülçin's article.

But I joined as well, and I'm sure in many places, not only Discord

or Slack or IRC, many people are picking up this motto

because it's painful to observe how many companies relied on

pg_dump, on dumps as backups, right?

If we call dumps backups, okay, we can do that,

but there are limitations.

There is big power in this, not
only partial.

You can take specific tables.

These days we have many managed
Postgres offerings and they don't

share backups with us.

If you want multi-cloud backup,
you must use pg_dump.

You cannot get data or copy or
something.

You cannot get physical backups
out of RDS, for example, right?

But this pain, observed for a couple
of decades, caused me to

join this movement saying that
pg_dump is not a backup tool.

At the same time, as
I told Michael, there is

a kind of professional
shift in my mind here.

Because when somebody says backups,
I envision only physical

backups.

Although there are logical backups,
of course.

And again, this is not my idea
to introduce this language.

I checked it in Oracle and MySQL
documentation.

I think maybe it's a good idea
to borrow this concept and mention

specifically SQL.

There are 2 kinds of backups, physical
and logical.

They all have pros and cons.

For example, logical backup: if
you rely on pg_dump as a backup

tool, for example for partial dumps
and escaping from RDS, those are good

pros, right?

Speaking of cons, it's always a
kind of snapshot.

It puts pressure on your database
in terms of the xmin horizon, affecting

autovacuum behavior, which is
unacceptable if you have 10 plus

terabytes and heavy load.

Also, 1 day, some bug or corruption
might happen, and you

simply cannot read your data at
logical level.

While physical backups are not
affected.

They just copy files, right?

And like, there are many pros and
cons to compare, right?

And I like the idea to split language
between logical and physical.

And for me personally, when somebody
says backups without specification,

I still see by default physical
only.

Right.

Gülçin: If we are considering the
corruption, the logical backups,

the corruption can be also in the
physical level.

Nikolay: Right.

I'm okay with that, but I have
backup, I can restore and deal

with it, right?

Gülçin: Well, then actually you
can maybe keep this corruption

between your physical backups if
you didn't notice, if it's gone

unnoticed.

And then if you had the logical
backup on top of it, maybe, you

know, it could be another tool
to fight this physical corruption

that you have.

Nikolay: Yeah, what I'm trying
to say, if I have physical backups

with corruption, I will deal with
it and so on.

But if I have corruption which
prevents pg_dump from reading data,

it will just fail and I don't have
anything.

Robert: Yeah, so I'd just like
to make a couple of comments here.

I think 1 of the things that I
find really interesting is that

people who work for different companies
that all support and

use Postgres can have very different
experiences of some of this

stuff.

And I've seen that before with
other issues and I'm seeing it

here too.

Because my typical experience with
pg_dump is not the 1 that you

were describing at all.

In fact, since I've worked at EDB,
which is the whole of my professional

Postgres career, I've never had
that situation happen.

Like not once have I run into a
situation where a customer should

have been using something other
than pg_dump and they were just

using pg_dump and then they got
into trouble.

What happens to me rather frequently
is that someone has used

some other kind of backup and things
have gone really badly wrong

for some reason and pg_dump becomes
the way that we can help that

customer to get out from under
that problem.

So just as your experience with
the customers that you've worked

with is informing the way that
you view the issue.

I have a different set of experiences,
a very different set of

experiences from what it sounds
like.

And so this thing that to you feels
like, ah, this is the catastrophe.

We've got to steer clear of this.

In my experience, that's never
the problem.

It's always the thing we reach
for to get out from under the

problem.

And I really just want to highlight
that because I'm not saying

your experience isn't valid and
I hope you'll return the same

courtesy.

Gülçin: I actually understand partially
what Nikolay is trying

to say here, because before
EDB, before working with Robert,

I was working for 2ndQuadrant
and we were building our own

backup solution, Barman.

Nikolay: And
now EDB owns it.

Gülçin: And I know, because I was
actually doing remote DBA work,

and there were a lot of customers
with backup issues.

They had their own home-cooked scripts.

In the wrong hands, this can go
wrong, because there are some

things about pg_dump and restore
that you have to know.

How do you do the dump process?

How do you do restore?

Do you actually test these things?

Do you copy the whole directory
or do you consider it as just

some logs that we can actually
delete at some point and so on?

So if people don't know how to
maybe put these things together

in a way, it is not really helpful
for some people, then things

can go wrong.

And I seen that things actually
went wrong.

That's why we were steering people,
you know, if you just do

regular backups and restores, use
this tool that we have or any

other tool that can be used for
backups and you can keep the

retention period, you can keep
your backups for X days, you can

restore them and test and you can
have continuous backups that

edit.

So it's not like partial, you know,
it can be just like a continuous

thing that you don't need to worry
and you can do point in time

recovery and so on and so on.

So I understand this rhetoric and
I was an advocate of it, but

then I also feel like it went too
far, saying, you know, this

is not usable, and
that I oppose, basically.

Nikolay: Yeah.
It's like pendulum.

I agree.

Yeah.

The start of this pendulum is these
20 years of documentation.

So you raised a very good point
about restore.

When I hear backup, full-fledged
backup, it's not only physical

to me, it's also verified.

And if we have physical backup
which we test, that's great.

While with dumps, I'm very curious,
Robert, whether you didn't

see an inability of pg_dump to read
some, I don't know, some database

which is corrupted, where we cannot
get a dump out of it.

But second question like here.

Okay.

Robert: That actually happens all
the time.

And one of the things that I often
end up helping people do is

fixing the database enough that
we can use pg_dump to get the

data out of it.

Because if the database has incurred
a lot of damage at a physical

level for some reason, we're never
going to be able to repair

that well enough to give confidence
that everything is the way

that it should be.

So a dump and restore in my professional
opinion is absolutely

essential in that situation to
get back to a clean state.

Now you are 100% correct that the
dump may also fail or the restore

may also fail, but those are problems
that we can understand

and fix.

We can look and say, ah, well you
have a pg_class entry, but

you're missing a pg_index entry,
so we need to create the one or

delete the other.

That's a problem where we can say,
ah-ha, that's something that

we as Postgres experts can look
into and understand what needs

to be done to bring this back to
a state where pg_dump is going

to run.

But the blocks being messed up
at a physical level or out of

sync with each other because we've
had some time travel of some

kind or something like that, Those
are problems we won't be able

to get out from under that ever.

Does that make sense?

Nikolay: Yeah, it makes total sense.

And moreover, it's a very popular
approach to use pg_dump to test

physical backups to see that we
can read all except indexes.

For indexes, we use amcheck,
but to test physical backups,

we use pg_dump to /dev/null, for example,
just to see that there

is no corruption, that we can read
it for sure.

And the second, like you mentioned
restore.

I remember a couple of times I
saw a dump could not be restored

because of a unique key violation,
right?

Because of corruption of uniqueness
constraint.

Because some duplicates happened
and unique key didn't save us

due to some bugs or something.

Maybe somebody disabled something,
I don't know.

Or foreign keys, foreign keys as
well.

If you disable triggers, you can
corrupt your data easily, right?

You disable triggers, you load
something and you enable triggers

and Postgres won't check it.

And with pg_dump, the dump you can
have, but you cannot restore

from it.

Right?

So yeah, we see some mutual points
definitely here.

And the question is just about
language I guess.

That's it.
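
The readability check Nikolay describes, running pg_dump against a restored copy of a physical backup and discarding the output, could be sketched as below. This is an illustration under assumptions: the function names and the `restored_copy` database name are hypothetical, a Unix-like system with `/dev/null` is assumed, and the check says nothing about indexes, which is why he mentions amcheck separately.

```python
import subprocess
from typing import List


def readability_check_args(dbname: str) -> List[str]:
    """pg_dump command that reads every table and throws the output away."""
    return ["pg_dump", "--dbname", dbname, "--file", "/dev/null"]


def backup_is_readable(dbname: str) -> bool:
    """Return True if pg_dump managed to read the whole database.

    A non-zero exit code means pg_dump hit something it could not read.
    Indexes are not exercised by this check.
    """
    result = subprocess.run(readability_check_args(dbname),
                            capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stderr)  # surface pg_dump's own error message
    return result.returncode == 0


if __name__ == "__main__":
    # "restored_copy" is an assumed name for a database restored from backup.
    print(" ".join(readability_check_args("restored_copy")))
```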

Michael: Well, I think it's also
about experience. Nikolay, you

mentioned some disasters. Am I
right in understanding this

is folks who have come to you with
some issue and they've only...

It's not just that they're using
pg_dump as a backup tool, it's

their only form of backup.

And what kind of issues is that
causing?

Nikolay: Remember the first managed
service, managed Postgres

service created, popular at least.

It was called Heroku.

I think it still exists, but not
being actively developed these

days.

And they offer backups as dumps.

You can download them.

That's great actually.

If a managed service, Postgres service
provider allows you to download

backups, that's great.

But it was just backups.

And nobody does this.

I mean, nobody among very popular
managed Postgres providers

do this.

They rely on physical backups these
days, right?

And also on snapshots and so on.

I mean, cloud snapshots, full disk
snapshots.

And this also shows evolution of
backup concept in many people's

minds, Not only us.

So I think it would be great just
to agree on the language and

discuss.

I'm okay to be alone thinking that
backup is just physical backup.

Backup could include both logical
and physical, and we could

clarify documentation and language
articles and so on.

And I see it's a pendulum, right?

Again, this is my point.

For too long, the documentation was claiming
this is a backup tool.

This language was super harsh.

And I remember I was trying to
explain at least a couple of times

in my life, I was trying to explain
to some customers with growing

Postgres databases, exceeding terabytes
and approaching 10 terabytes.

I'm saying, don't rely on pg_dump
as a backup solution And they

just showed me documentation saying,
this is like, this is what

they say.

Vendor is saying this, right?

Robert: Yeah.

I mean, I think that there is a,
maybe a difference between something

that creates a backup and a backup
tool.

I mean, this does get down a little
bit to what you think words

mean, so it almost seems like a
silly thing to argue about, right?

But I think, you know, you asked
Gülçin at the beginning, like,

if I take a snapshot of all of
my data on a cell phone, is that

a backup?

And I think the answer is obviously
yes, but equally obviously,

that's a silly way to do a backup
because your restore procedure

is going to be very unpleasant,
which is not what you want.

I think sometimes when people talk
about a backup or a tool that

can take a backup or a backup tool,
sometimes they mean like,

can I get a copy of my data from
which I could recover?

Right, and that's 1 question.

And pg_dump will give you that,
right?

The other question, sometimes what
people mean is,

and they may have
some particular commercial product

in mind that offers a certain feature
set, and their question

is: am I going to get this feature
set where, for example, my retention

times will be managed and my
actual process of orchestrating

the backup and orchestrating the
recovery will be managed?

And then the answer is no, pg_dump
is not going to do that for

you.

And you probably do want those
things in most cases.

So I don't know, like, I think
there's a lot of nuance that's

possible in the language here.

But for me, the important thing
is to make sure that we're clearly

able to explain what the benefits
and drawbacks of the different

approaches are rather than, you
know, spending too much time

fighting about the specific language,
which for me, it gets a

little bit silly.
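
To make the retention point concrete: pg_dump produces a dump file and stops there, so deciding which old dumps to delete is left to whatever wraps it. A minimal sketch of that bookkeeping follows, with a hypothetical helper name and made-up file names. It is not how any particular backup tool implements retention.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def dumps_to_delete(dumps: Dict[str, datetime], now: datetime,
                    keep_days: int) -> List[str]:
    """Return the dump files older than the retention window, oldest first.

    `dumps` maps a file name to the time the dump was taken.
    """
    cutoff = now - timedelta(days=keep_days)
    expired = [(taken, name) for name, taken in dumps.items() if taken < cutoff]
    return [name for _, name in sorted(expired)]


if __name__ == "__main__":
    # File names and dates are illustrative assumptions.
    inventory = {
        "appdb-2025-01-01.dump": datetime(2025, 1, 1),
        "appdb-2025-01-09.dump": datetime(2025, 1, 9),
        "appdb-2025-01-10.dump": datetime(2025, 1, 10),
    }
    print(dumps_to_delete(inventory, now=datetime(2025, 1, 10), keep_days=7))
```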

Nikolay: I agree.

Yeah.

Michael: I agree as well, Robert.

In your blog post, you make a really
good case for the tone of

the statement being difficult,
and I think you actually use some

language that waters
it down a little bit, or explains

a little bit more. It doesn't take
many more words to do

so. But I also wanted to ask: do
you see this problem in other

statements in the Postgres community?
Are there other things

people are saying that remind you
of the tone of this kind of

statement as well?

Robert: I don't have specific examples
in mind off the top of

my head, but definitely yes.

I mean, it's a chronic problem
on Hackers.

You know, I think I wrote a blog
post about the sort of tone

of dialogue in the Postgres community
towards the end of last

year.

And it's always a problem because
when you post your patch on

Postgres Hackers, you're essentially
soliciting review.

And people are rarely going to
write you a review where they're

like, you know what, this patch
is amazing and I love it.

I mean, it happens.

People actually do get those kinds
of reviews, and it's a great

day when you do.

But generally, when you're reviewing
a patch, you're picking

something that you actually like
and would like to see go forward.

And then you're saying the worst
things about it that you can

think of to say.

You're like, so here's all the
problems.

Here's all of the stuff that I
think needs to be better in order

for this to become part of the
product, which I hope it will,

but these things are the things
that I think need to be fixed

first.

And so what I see is that actually
for a lot of committers, in

particular, people's mental health
is not in a great place.

You know, I kinda thought my mental
health was not in a great

place around some of this stuff,
and then I talked to some other

people and found that they were
feeling worse about it than I

was feeling by like significant
margins.

And it's, in my opinion, it's rarely
because of bad intent.

I mean, obviously people get frustrated.

People say things that they shouldn't
have said or they don't

say it in the right way or they're
pissed off.

I mean, those things happen and
I don't wanna pretend like they

don't.

But I think very, very often it's
a case of the nature of the

workflow and the nature of the
process and the kind of engineering

that we're doing.

It's difficult and it's error prone
And even the absolute smartest

people in the community make all
kinds of mistakes, you know,

over and over again, right?

Like we were doing a rewrap of
a scheduled minor release that

happened last week.

We're doing that this week because
somebody committed a fix for

a bug and the fix contained another
bug.

And it doesn't matter who made
the mistake or who didn't catch

the mistake, that's not relevant.

It happens all the time.

And I think it's really challenging
to people because we work

in a very open environment where
everybody sees every email we

write, every patch we commit, every
patch we thought about committing.

You know, it's out there constantly
and you just realize that

there are so many ways for you
to screw up and every time you

make a mistake, everybody sees
it.

So I think it's a struggle for
everybody.

As far as I can tell, every single
person who works on Hackers

encounters this problem of getting
the tone right all the time.

And I am certainly not going to
sit here and pretend like I get

it right more often than average.

I think a lot of people would say
I am below average in that

way, but I can tell you I'm very
aware of the problem and I am

trying to figure out how to do
it better because at the end of

the day, it's not enough for us
to deliver great software.

We need to deliver great software
while also creating a community

that people want to participate
in.

And that applies for me, first
of all, to the developer community

because that's where I spent most
of my time, but it also I think

applies more much more broadly
to the user community.

And I think that is part of the
reason this issue set me off

a little bit, because, you know,
it's the sort of thing that

I'm struggling, often in vain,
to do right on a daily basis.

But instead of being targeted at
other developers, who at least

kind of know that the negative
feedback is coming,

some of this felt to me like it
was targeted toward users who

don't realize that they're
about to get jumped on for,

you know, wading into a flame war
about whether something is or

isn't something, you know? And I
just don't want

anybody
to have that experience. I certainly

don't want users to have that experience.

Michael: I personally think that,
only from having you articulate

that, I've thought of 1 that
annoys me a little bit, and

that's the correction of people
pronouncing or spelling Postgres

wrongly, or missing the S off. It sometimes
happens when people are

new to the community, and immediately
they get jumped on.

I think, oh, come on, they're clearly
new.

So yeah, I can definitely see that.

Robert: It also happens a lot with
people based on their language

of origin.

Like the fact that we pronounce
it PostgreSQL, I believe that's

at least 1 of the canonical pronunciations,
that is much more

natural for somebody who learned
to speak English in the United

States than it is for somebody
who learned to speak English in,

for example, India, right?

Like, it is English, but the way
that English is spoken in India,

it's a distinct dialect.

It has its own ways that people
say things, ways that people

communicate characteristic patterns
of speech.

And that's not the only place,
certainly.

I think actually there are probably
other countries where

the problem is even more acute,
because English isn't even used

as a common language of communication
in many parts of the world.

But even when it is, it's not necessarily
the same as your English,

and people aren't necessarily going
to be, you know, starting from

the same point, right?

If I read a word that is unfamiliar
and my wife reads the same

word, we're likely to pronounce
it the same way in most cases.

But if a colleague from halfway
around the world reads the same

word, their instinct may not be
the same as mine.

And that's not necessarily a question
of me being right and them

being wrong.

That's the question of we went
to different schools.

We were taught different things.

Gülçin: Yeah, I think it also points
out to the wider problem

in many communities, like the longevity
of the projects will

depend on people.

And if you are hostile to people
or like, because we all come

from different parts of the world,
I didn't learn English until

I was like, you know, an older
kid.

And that is always a problem when
I give a talk or when I write

an email.

It is still in the back of my mind
that I try to correct myself,

I use multiple tools, I try to
present myself as good as I can,

but there are limits.

I still confuse the prepositions.
I use "in" and "at" all around,

randomly.

I could never fix this.

And that doesn't mean that I can't
contribute to the project,

and I could and I do.

And that's what I believe about
these little statements. Maybe

we took it to a philosophical level,
but it's not about

pg_dump, backup or not, or
how you say Postgres.

We should do better in how
we handle communication, because

this is the way that people interact
with us today and report issues.

And if you don't accept the problems,
well, people will not report

it or they will not actually use
this and report back what they

use so that you don't actually
get the feedback from people.

And because you cut these channels
that people actually try to

communicate to you, instead of
opening all these channels that

we should actually amplify, we
should have more channels for

people to bring stuff that they
interact with Postgres or ecosystem

in general.

So that's where I was really impressed
by Robert's blog about

how open he was about this.

And I appreciate the efforts that are
going on towards this, because

when I started, I also felt scared,
almost reading some of the

emails.

I was like, I wouldn't want this
reaction to come to me, for

example.

So it shouldn't be like that.

Robert: And I think it's not just
an issue of dialect either.

You know, like that is definitely
part of it.

But 1 thing that I've noticed on
Hackers is that clarity and

extreme precision of expression
is very, very highly valued,

right?

Like someone can come along with
a worse idea and because they

explain it extremely clearly and
precisely either it gets accepted

or they get feedback on how it
should be changed or positive

comments.

Welcome to the community.

Hey, great to have you, right?

Somebody else writes a worse email
about a better idea, and it

actually gets a worse response.

And I do understand some of why
that happens, right?

We value people whose style of
expression is similar to our own,

where we feel like we can freely
and easily communicate with

those people, and everybody's busy,
so you don't wanna spend

a huge amount of time trying to
understand email A if you could

very quickly and easily understand
email B. But it's obviously

super off-putting to people when
you may have proposed something

that was actually great, and if
somebody had given you 5 minutes

of their time, they could have
understood exactly what you were

trying to say, but they just flip
through the email really fast.

And then they moved on because
they're busy.

And that's obviously going to be
demoralizing to people.

Michael: To play devil's advocate
a little bit, I personally

err on the side of being polite
and trying to be kind and trying

to be welcoming, but I also think
sometimes that approach doesn't

always land, people don't always
take the lesson from it or learn

from the statement or realise that
maybe what I'm really trying

to say or I'm not being clear enough,
that kind of thing.

And I do think, for example, with
the comment that we started

with, I feel like there's a certain
amount of trying to save

people from themselves or trying
to shock people, deliberately

trying to be provocative in order
to make people think, "Oh, we

shouldn't only be relying on this
tool for this purpose," or, you

know, "Maybe I should be rethinking
my thoughts." You know, that

doesn't apply to all of these
cases, like mispronouncing the

project name, but I've seen this
specific comment come mostly

from consultants, some experienced
consultants, some who are

very kind and also involved in,
like, diversity initiatives.

I've definitely seen this from
people that you wouldn't necessarily

expect to be direct and unkind,
so, the exact phrase "pg_dump is

not a backup tool."

So I think that's coming from a
place of having seen people shoot

themselves in the foot and wanting
to save people from that and

wanting to be quite direct to avoid
it.

So I don't know for sure, but I
believe their intent is good,

but maybe they're deliberately
choosing to be provocative or

direct or I'm not sure.

I'm not sure.

Maybe I'm putting words in their
mouths basically.

Gülçin: I think it's like we are
not calling out people for just

saying, you know, "this is not a
backup tool," because we understand

where they come from, because we
are in the same industry, working

for ages.

We know these people, we all had
the customer stories and so

on.

But I think the general idea from
here that when somebody shares

a blog post, let's say we all wrote
about it.

I wrote 2 blog posts and Robert
wrote 2 more.

And we just got together and are talking
about it.

Let's say he's pointing out why
pg_dump is good at dependency

management, let's say.

We take it for granted, which I
wanted to bring up in today's

call to just actually showcase
that there are things we should

appreciate in this tool, why he
says it is an amazing tool.

Then towards this, somebody writes
like, but it is not a backup

tool.

Then I don't get it, because that's
not what the discussion is

about.

We are trying to discuss that there
are ways you can make this

tool in your tool set.

It's not the only tool.

There are professional solutions
for backing up your database

against disaster recovery, as we
mentioned, the retention and

the whole orchestration of the
database backups and recovery.

But when we are discussing this
tool specifically, which I feel

that is important here because
there's nuance to be discussed,

and just shutting down the discussion
saying, but it's not a

backup tool, this is where I feel
that this needs to be improved

better because then you don't really
contribute to this because

you need to say then why it is
not in this case, why don't you

agree with this?

Let's say, is it that the dependency management
thing is not for you, or

how could it be improved?

You could say that pg_dump could
be improved because let's say

we could run vacuum after it, or
we can do, I don't know, like

do statistics better or something.

I mean, to contribute where pg_dump
might be improved,

because I've seen people like in
the discussions that they struggle

with mapping, let's say, pg_dump
options to pg_restore options

because they assume the order will
be the same and they don't

get it and so on.

So there are things maybe we could
get input from why people

complain about these things and
to improve.

That's where I go for issues.

I see these comments in the forums
and like, oh, OK, this is

a good idea.

Maybe I can actually talk about
this.

But then when we are discussing
this and coming with like, okay,

this is not a backup tool, it kind
of brings us back to 0 and

doesn't really improve anything.

Nikolay: But your second article
was basically agreeing that

it's not a backup tool.

Gülçin: No, in the sense that people
say, as I'm saying, as a

solution, if you want to orchestrate
your backups, use a, I don't

know, a tool that is like, you
know, Barman, pgBackRest or something.

But then another discussion we
have, why it can't be?

Why pg_dump?

We are discussing the, because
in the second blog post of Robert,

for example, he gives, like, this,
you know, why it could be

a nice tool for the use
cases that he lists.

And they're getting the question
of again, that I don't agree

basically, like, okay, use a better
maybe solution if you are

managing production databases in
multiple environments that are

giant databases and you really
don't need to deal with, you know,

something homegrown.

But still, historically, it
is still a tool that we use,

you know, it could be used for
different cases.

Nikolay: What, what I hear is you're
saying when people come

to you to comment to your first
blog post saying, I think it

was Franck Pachot and I'm joining
him, still joining.

And he said, pg_dump is not
a backup tool.

You think it's like shuts down
some discussion and so on.

But I just explained that this
is a pain from a lot of experience

and we are just reacting and what
I hear you still try to judge

him, right?

Let's just...

Gülçin: No, no, that's definitely
not for it.

I'm just saying we discussed that,
but the second blog post was

about, you know, there's backup
tools, you should use it.

But then when Robert was describing
a part of why pg_dump is

good, in my opinion, it was like
very valuable points.

And there it was not even relevant.

We were not even discussing, should
you use this tool or not?

And I'm not targeting anybody.

I'm not targeting anybody.

So, to be clear about it.

Nikolay: Yeah.

So the change happened only now.

It's in Postgres 18.

And recently I had a discussion
like this, claiming, "Oh, it's not

a backup tool."

Somebody said, oh, what is this
about then?

And sending me a link to pg_dump
documentation.

So I think I would not judge people
who are saying pg_dump is

not a backup tool until we have
this change in documentation

and start recovering from this
stress we had for 20 years.

This is my point.

I stay on this point very strong.

And common ground is let's start
distinguishing physical and

logical backups.

We can clarify this on documentation
as Oracle and MySQL did.

And there is already part of documentation
speaking of backups,

it describes dumps, I mean, pg_dump
and then file system snapshots

and then point-in-time recovery,
full-fledged backups.

And if we just clarify the documentation,
I will stop seeing customers

sending me this link saying, "You're
wrong, this is documentation

saying you are wrong."

Robert: But like, I think, you
know, I don't know, like, if you

can't win an argument against a
documentation link, I don't know,

it feels like something's not right
there, you know, like, I'm

not trying to be harsh.

And I just feel like, you know,
if somebody hires you to give

them good advice, and you give
them advice that is actually good,

and their response is...

Nikolay: Robert, let me interrupt
you.

Sorry.

I'm just like, I feel judgment
in you and Gülçin's words.

Like, you tell me now how you want
to be welcoming, and now you

judge me like I cannot win.

I cannot win 2 things.

pg_dump is a backup tool.

Sometimes I cannot.

They say they trust documentation
more because many more minds

behind it.

And also pg_stat_statements, documentation
says you cannot say

set it to positive value, keep
it 0 globally because globally

it's a bad idea.

I already like some customers I
win, some customers I don't.

I'm not genius, right?

But I feel in both of you, I feel
judgment.

Why don't we stop judging people
and sentiment and so on?

I bring you like improvement.

Let's say there are 2 types of
backups, logical and physical.

And then we, we develop language
from there.

And this joins us.

Right?

When you judge people saying they
came to me with this statement,

or you say you cannot win against your
customer's documentation link, this

splits us.

And I start fighting with you.

I don't want to fight with you.

Robert: But I mean, that's also my complaint about the language

that you were using.

So I don't know how to have this discussion without having opinions

about whether certain language is good or bad.

And I don't think, I mean, you can't, right? Like, we have to be

able to talk about what the language does and to what extent

it helps or hurts.

And yeah, of course, there's some judgment there. I don't know,

like, I definitely have been in the situation of having a customer

who wouldn't listen, and the frustration that you feel with that

situation feels very genuine to me. Like, I can totally imagine

that happening and being a bad experience, but I don't know.

I'm not even saying it's a bad thing that we changed the language

in the documentation.

I was only reacting against the sort of conclusory statement,

"pg_dump is not a backup tool, and now I don't want to talk about it

anymore." I think we should always be talking about it more. I think

we should be trying to, as you say, bring clarity to it and bring

precision to it.

Nikolay: I agree with you totally. Like, I hear you well now, and

I think we will stop saying this, actually, if the documentation will

be... it's already fixed, but I think it can be fixed even better if

we say it's a logical backup tool, for example.

Everyone will be happy, I think, right?

And we will stop saying it's not a backup tool.

We will start saying it's not a physical backup tool, which is

obvious, right?

And this will join everything and so on, right?

I agree with your reaction, actually, which says this statement,

"it's not a backup tool," is, like, too far from balance, right?

It's off balance. I agree with this. So it's not a good statement,

actually, I admit. But again, it's a reaction to another not-good

statement which we had in documentation, which didn't say logical

backup. It said just backup. Okay.

Michael: We're pretty much out of time, okay. I wanted to thank

you all for your thoughts on this. I think it is a difficult subject,

and I think actually it's really nice to have 3 people that all

care about educating folks and teaching people how to do things

well, with different opinions on how to do so, or, you know, slightly

different approaches on how to do so. But as Nikolay says, and as

Gülçin pointed out in her blog post, the language around this

has been changed in the documentation.

Robert, keep fighting the good fight on the tone

of things on Hackers.

Is there any last words anybody else wants to add?

Let's start Gülçin, did you want to say anything else at the

end?

Gülçin: No, I'm happy that we are discussing it and I don't take

things personally.

I mean, we are here just to discuss technically why this could

be useful in some cases and why not.

And yeah, that was anyway the summary of what I said in the blog

post as well.

So if people would like to read it and comment on it, I'm happy

to discuss more.

Thanks.

Michael: Wonderful.

Well, we're looking forward to your future blog posts, whether

you want to write them or not.

Robert, any last words from you?

Robert: I just think, you know, on Nikolay's comment about making

the documentation better, what I would encourage, and of course

this is much longer than we can actually do in this forum, is,

you know, let's get down beyond the headline, right?

Like saying in the headline that it is a backup tool or that

it's not a backup tool, it's an export, it's a dump, it's a lot.

We got to get beyond that subject line and think about what we

say down deeper.

I think one of the areas where the Postgres documentation is sometimes

weak is it doesn't always do a good job listing pros and cons.

Pros and cons very often don't get listed for things.

So, you know, that's probably an area where we could

grow as a community.

Nikolay: Big time.

Michael: Brilliant.

Well thank you so much everybody and thanks Nikolay, catch you

next week.

Nikolay: Thank you.

Thank you for coming.

Gülçin: Thanks.

Bye-bye.

Robert: Ciao.

Nikolay: Bye.

Gülçin: Ciao.

Creators and Guests

Gülçin Yıldırım Jelínek
Guest
Staff Database Engineer at Xata, Co-founder of Prague PostgreSQL Meetup
Robert Haas
Guest
VP, Chief Architect, Database Servers at EDB, PostgreSQL Major Contributor and Committer
