Tens of TB per hour
Michael: Hello and welcome to Postgres.FM, a weekly show about
all things PostgreSQL.
I am Michael, founder of pgMustard, and this is Nik, founder
of Postgres.AI.
Hey Nik, how's it going?
Nikolay: Good, good.
How are you?
Michael: Yeah, keeping well, thank you.
Nikolay: Yeah, so a long time, obviously.
We skipped last week because of me doing some stuff at work and
with family, so couldn't make it.
Thank you for understanding and I'm glad we are back.
Michael: Yeah, and no complaints from listeners, so thank you
everybody for your patience.
Nikolay: Nobody noticed.
They are still mostly catching up, we know.
Yeah.
Yeah.
Somebody shared on LinkedIn that, like, oh, I discovered Postgres.FM and, while I was walking the dog, listened to all the episodes.
It took me like almost 2 weeks.
Like, okay, 164 episodes during 10 days, it's challenging, right?
It's like 16 episodes per day.
Michael: Yeah, that's binge watching.
Nikolay: Yeah, you need maybe 2x or 3x speed, and yeah, it's insane.
So, this topic we have today, it's not originally mine.
It's from Maxim Boguk.
I probably mentioned him a couple of times.
He always has very interesting ideas and approaches to solving
hard problems.
And I remember he told me, it was maybe half a year ago, maybe more.
He said, I squeezed more than 10 terabytes per
hour when copying data directly from 1 machine to another.
And for me, the standard was 5 years ago, it was 1 terabyte per
hour.
I think I mentioned it a few times.
We had an episode about how to copy Postgres from 1 machine to
another.
And 1 terabyte per hour, it's like this.
Okay.
Modern hardware: we know there's throughput for disks, throughput for the network; throughput to disks on the source and on the destination both matter here; parallelization, and SSDs are good with parallelization.
So basically, the gold standard for me was 1 terabyte per hour.
And if you needed to create a WAL-G or pgBackRest backup, it was 1 terabyte per hour, at least.
And then we saw like 2, 3 terabytes if you raise parallelization
with modern hardware.
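To anchor the units being thrown around here, terabytes per hour versus sustained megabytes per second, a quick back-of-the-envelope conversion (decimal units, nothing vendor-specific):

```bash
# Rough conversion: 1 TB/hour is about 278 MB/s sustained; 36 TB/hour is about 10 GB/s.
for tb in 1 3 5 10 36 50; do
  awk -v t="$tb" 'BEGIN { printf "%2d TB/hour ~ %5.0f MB/s\n", t, t * 1e6 / 3600 }'
done
```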
But it was still not about local NVMe, but about EBS volumes or PD-SSD, persistent disks, on Google Cloud.
So, like, traditional disks.
By the way, those are also improved.
EBS volumes are pretty fast these
days and the Google Cloud also
has hyperdisks which are impressive,
but not as impressive as
local NVMe disks, right?
Maxim told me that, of course,
he was using local NVMes.
It was some self-managed Postgres
setup with special machines,
so local disks and no snapshots
like in cloud and so on, and
backups to where?
To a different machine, or to S3?
It doesn't matter here.
The idea was: how fast can we provision a replica?
Because sometimes we need it; for example, 1 replica is down, or we reconfigure something, and we need to build a replica.
And normally if we have cloud disk
snapshots, these days we use
cloud disk snapshots.
By the way, there is great news.
Andrey Borodin, a maintainer of WAL-G, implemented native support for cloud disk snapshots in WAL-G, as an alternative to backup-push / backup-fetch.
So instead of full backups or delta backups, you can rely on the snapshots AWS, GCP, or others provide.
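For reference, a sketch of the existing backup-push / backup-fetch flow that snapshot support would be an alternative to; the bucket name and data directory path are placeholders:

```bash
# Classic WAL-G physical backup flow (placeholders for bucket and PGDATA).
export WALG_S3_PREFIX=s3://my-backup-bucket/pg
wal-g backup-push /var/lib/postgresql/18/main          # push a full/delta base backup to S3
wal-g backup-fetch /var/lib/postgresql/18/main LATEST  # fetch the latest backup on a new node
# Concurrency is tunable via WALG_UPLOAD_CONCURRENCY / WALG_DOWNLOAD_CONCURRENCY.
```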
Michael: Yeah, I saw this.
Is it a draft PR at the moment?
Or what's the status of this?
Nikolay: It was just created a few days ago, so it's early.
It was mostly vibe-coded.
Andrey is a great hacker, but I
think our hacking Postgres sessions
contributed to this idea.
Let's do more code writing with
LLMs.
But of course, the review will
be thorough.
I think it's already happening.
I saw many comments and the result
will be improved and so on.
So no hallucinations will be allowed.
Right.
So all covered with tests and so
on.
Anyway, there is this idea that instead of full backups, we can have snapshots.
But at the same time, I see 2 big trends.
1 trend is: okay, if we have snapshots for disks, let's use them more and more and rely on cloud capabilities, depending on the cloud.
Of course, such backups need testing
and so on.
In parallel, there is an old idea,
we should not be afraid of
using local NVMes.
We know they lose data if the machine
is restarted.
Ephemeral disks, right?
But it's okay.
These days, if a machine is having issues, we usually just re-provision
it anyway.
And switch over or failover doesn't require restart, so it's
very seamless.
It means that we can live with local NVMes, right?
And additional momentum for this idea was recently created by PlanetScale, which came to the Postgres ecosystem and started showing very good-looking pictures in terms of benchmarks and real latencies from production systems.
Of course, you cannot beat local NVMe disks, with their 1 to 3 million IOPS and amazing throughput, like more than 10 gigabytes per second for writes and reads.
It's insane.
Because for regular disks we have 1, 2, 3 gigabytes per second only, and that's it.
And IOPS, up to 100,000 IOPS maximum.
Michael: And I think, for me, the critical part is: if we have an HA setup with failover in place, then we don't need the durability of those disks, we don't need there to be, you know, 6 cloud backups, because we're going to fail over anyway.
So yeah, it's that insight, and I think they did a good job of making that very clear: if you've got an HA setup, we can make use of the NVMe drives.
Nikolay: Yeah, by the way, I recognize I'm mixing throughput and latency.
Anyway, local NVMes are good in both aspects of performance, almost, or sometimes a full, order of magnitude better than network-attached disks and so on.
But you don't have snapshots.
So 2 big downsides, you don't have snapshots, and there is a
hard stop in terms of disk size.
If you plan to grow to 200 terabytes, you definitely need something like sharding, or anyway some way to split.
Only if you have that, if you master it, are local disks great for long-term planning.
But for long-term planning, everyone should also remember that EBS volumes and PD-SSD, or hyperdisks, are limited to 64 terabytes, and for both Cloud SQL and RDS this is a hard limit as well, 64 terabytes.
This is a cliff or a wall, a very hard one.
And Aurora has 128 terabytes, double the capacity.
But anyway, if you self-manage, you can use LVM, of course, to combine multiple disks, and I think some companies already do it.
But in the case of local disks, it's a really hard stop, nothing more to combine, unfortunately.
Anyway, you usually combine multiple disks when you have terabytes
of data.
In my benchmarks, in our benchmarks, we combine multiple disks
on the i4i instances.
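A minimal sketch of the LVM option mentioned above, striping several local NVMe devices into one volume; device names, stripe size, and filesystem here are assumptions, and mdadm RAID is a common alternative:

```bash
# Combine several NVMe devices into one striped logical volume (sketch only).
pvcreate /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
vgcreate pgdata_vg /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
lvcreate -n pgdata_lv -i 4 -I 256k -l 100%FREE pgdata_vg   # stripe across 4 devices
mkfs.ext4 /dev/pgdata_vg/pgdata_lv
mount /dev/pgdata_vg/pgdata_lv /var/lib/postgresql
```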
So there is a hard stop in terms of capacity.
I think on i4i it's 40 terabytes or something like this.
It's impressive, right?
So if you think you won't grow to that size very soon, it's a great alternative to consider if you go self-managed, or some self-managed Kubernetes setup, compared to RDS.
Because it gives you an order of magnitude better performance,
right?
Michael: Yeah.
By the way, when you say our benchmarks, do you mean in the blog
post you did with Maxim?
Nikolay: Yeah, yeah.
So the original idea was by Maxim.
He did the majority of the initial work, created the recipe.
I will tell the recipe, and why it exists.
The idea was, we need to copy from 1 machine to another as fast as possible.
We don't have snapshots, because it's local disks.
Either it's fully your own data center, or it's these instances with local disks, like i4i, i7i.
I remember I explored this idea very long ago, 10 years or so ago maybe, with i3 instances at that time.
I really liked the cost of the final solution, because the disks are included, instance costs are only slightly higher, and performance is great.
So we did a lot of benchmarks using i3 instances, even spot instances; I remember it was great.
So you get a very cheap, very powerful machine and can do many things with Postgres.
So anyway, the goal was how to clone: to provision a replica, for example, or to provision a clone for experiments.
Michael: Yeah, I wanted to ask about this, because in the benchmark, for example, you turned off checksums.
And it was interesting to me, because if you're provisioning a replica, presumably you wouldn't do that.
But if you were doing it for some other reason, maybe you would.
Yeah, so what are the use cases here where this would make sense?
Nikolay: Well, if it's a clone for experiments, we don't need to be super strict.
And this option, I think, exists mostly for backups, to make them super reliable and so on.
And also, you know what, this is an extra feature that pgBackRest has, which is actually a luxury to have.
So yeah, of course, it would be good to see an experiment with checksums enabled; of course, results should go down, but for some cases it's appropriate to disable that check.
Michael: That's a good point, actually.
So if we don't have it, is there a risk of introducing corruption, or is it more that we persist... like, if our primary is corrupted...
Nikolay: Is...
Michael: ...it both?
Nikolay: Let me ask you this: if you copy data using pg_basebackup, who will check this?
Who will do that?
Michael: Yeah, good point.
Nikolay: Or you use rsync.
The traditional ways are rsync and pg_basebackup.
Both are single-threaded.
That's the point.
They are single-threaded, and there is no checksum verification.
This is an extra feature.
It's great that pgBackRest implemented it.
And it would be good to compare
how it affects this amazing performance.
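If the copy itself skips page verification, one way to double-check a clone afterwards is the offline checker that ships with Postgres; the cluster must be cleanly shut down, data checksums must be enabled on it, and the path is a placeholder:

```bash
# Offline verification of data checksums on the (stopped) clone.
pg_checksums --check -D /var/lib/postgresql/18/main
```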
But anyway, the original goal was to be able to copy data in a traditional way, not involving backups this time.
Because normally, if not snapshots, we just fetch from backups using WAL-G or pgBackRest.
This is also a very fast way, and there you can also control parallelization, so you can provision quite fast.
And S3, and GCS on Google Cloud, are good with parallelization, like 16 threads, 32 threads.
You can speed up significantly; you can increase the throughput of backing up or restoring from a backup.
But in this case, OK, we don't
have those backups.
We just want to clone from 1 machine
to another, that's it.
And the problem is, pg_basebackup is still single-threaded.
There are discussions, and even a patch proposed, to make it multi-threaded, but apparently it's not trivial.
And I don't see current work in progress, so I think it will stay single-threaded.
So Maxim came up with the idea that we could use pgBackRest, not to create backups in S3 and then restore, because that's basically copying twice.
If you have them already, good, just restore.
But if you don't have them, or cannot involve them somehow, you just have 2 machines you need to clone.
In this case, he just came up with the idea: let's use pgBackRest to copy from 1 machine to another, that's it.
It's not its primary job, right?
But why not?
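One possible shape of such a setup, not necessarily the exact recipe from the blog post: run pgBackRest on the destination with its repository on local NVMe, back up the source over SSH with many processes and no compression, then restore locally. Host names, the stanza name, and paths below are placeholders:

```bash
# Sketch only: destination host holds the repo on local NVMe and pulls from the source.
cat > /etc/pgbackrest/pgbackrest.conf <<'EOF'
[global]
repo1-path=/mnt/nvme/pgbackrest
process-max=32
compress-type=none
start-fast=y

[clone]
pg1-host=source-host
pg1-path=/var/lib/postgresql/18/main
EOF

pgbackrest --stanza=clone stanza-create
pgbackrest --stanza=clone backup --type=full
# ...then restore on the destination from this local repo (restore config not shown here).
```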
And he told me, like I said, months
ago that he achieved more
than 10 terabytes per hour.
Yeah.
Which is great, which is like absolutely
great and impressive.
Michael: And multiple times what
you thought was, you know, considered
good.
Nikolay: Yeah, so from this fact and others, I already recognize that 1 terabyte per hour is not a gold standard anymore.
It's outdated.
We need to raise the bar.
Definitely.
So maybe 2, at least, I don't know, maybe more, 3, 4.
How much should be okay for modern hardware, for large databases, these days?
Maybe 5 terabytes per hour should be considered good.
Because, you see, you build some infrastructure, you see some numbers, and we have clients, not 1 client, many clients, who come to us and complain: our backups take this amount of time, like 10 hours.
What's the size of your database?
200 gigabytes.
10 hours for 200 gigabytes, something is absolutely wrong, you know.
Let's see where the bottleneck is.
And it can be disks, in many cases, but maybe network, maybe something else; mostly disks.
But anyway, this is not normal.
Sometimes it's software and a lack of parallelization and so on, but it's not okay to live with these bad numbers these days, right?
It's already...
We have very powerful hardware
usually.
Michael: Just to ask the stupid
question, what's the negative
impact of it taking 10 hours?
Nikolay: Well, it affects many things, for example RPO, RTO, right?
You create backups, you recover a long time, it affects RTO.
Recovery time objective: how much time do you spend to recover from a disaster?
You lost all the nodes; how much time to get at least something working, at least 1 node working?
If it's 10 hours for 200 gigabytes,
it's a very bad number.
You need to change things to have
better infrastructure and software
and so on.
And this is 1 thing.
Another thing is when you upgrade, for example; sometimes you need to clone, our recipe involves cloning, a recipe for zero-downtime upgrades.
And if it takes so many hours, it's hard to think what will happen when you have 10 terabytes.
How many days will you need, right?
So it's not okay.
This requires optimization earlier.
Michael: And also for provisioning replicas, a certain amount of time taken is always going to be acceptable, I guess, because it's not the primary, right?
Like we've maybe...
Nikolay: Lost... It depends.
If it's only the primary and you are out of replicas, it's dangerous.
Even if I were running with 2 nodes, I would spend as little time as possible in that state.
Michael: So when you say 2 nodes, do you mean with 2 replicas?
Oh no, okay, so 1.
Nikolay: Primary and 1 standby; it's already a degraded state.
We need a third node, right?
With 3 nodes, you have how many nines, 12 nines or so?
If you check the availability of EC2 instances and just think about 3 nodes, and what's the chance of being left with nothing, it will be, I think, 12 nines or something, as I remember.
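The back-of-the-envelope math behind that kind of number: with some hypothetical per-node unavailability p, the chance of all 3 independent nodes being down at once is roughly p cubed:

```bash
# Assume each node is unavailable ~0.01% of the time (4 nines, illustrative only, not an SLA quote).
awk 'BEGIN { p = 1e-4; printf "all 3 down at once: %.0e (about 12 nines of availability)\n", p * p * p }'
```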
Michael: That's a lot of nines.
Nikolay: Yeah, yeah, and that's great.
So it means almost zero chance that you will be down, unless some stupid bug propagates to all the nodes and brings Postgres to its knees on all nodes simultaneously.
Anyway, if it's degraded, and especially if it's just 1 primary, it's a very dangerous state.
We have cases, some clients who run only 1 node, and we discuss it with them, it's not okay, but somehow these days they just survive, because the cloud became better, you know; it doesn't die so often.
Nodes don't die as often as before.
Michael: I also think there's this phenomenon where, if you're in us-east-1 and you go down, there are so many other services down at the same time that you kind of get, not a free pass, but they get away with it a little bit more.
Nikolay: You're in the club, right?
Michael: No, no, no.
Nikolay: You are down, others are down; like, we are in the same club.
Michael: Yeah, I'm not in that club, but yes, I think a little bit.
Nikolay: No, I think, yeah, we also have customers who got into that trouble as well, and they seriously think it's not okay, and they need a multi-region setup.
And I think this is driving improvements right now in many companies, in infrastructure and so on.
Michael: I'm glad to hear that.
Nikolay: Yeah.
Anyway, running with 1 node is dangerous; I would provision more nodes sooner.
And in some cases you cannot live with 1 node.
If you already relied on distributing read-only traffic among multiple nodes, 1 node won't be capable of serving this traffic.
Michael: Yeah, so we've established
speed matters.
Of course.
Sorry, time matters, therefore
speed matters.
How do we go faster?
Nikolay: Yeah, so pg_basebackup, I expected, like, okay, it's single-threaded, so I expect something like 2, 3, 4 hundred megabytes per second maximum.
And my expectations fully matched what Maxim shared with me.
But when I started testing, I took 2 i4i nodes; it's not the latest, it's third-generation Intel Xeon Scalable, which is quite outdated, they already have fifth generation in i7i.
So 2 nodes, 32xlarge: 128 vCPUs, more than a terabyte of memory each, I think, and 75 gigabit per second network.
And the disks, I think it's 3 million IOPS, I remember; I don't remember how many gigabytes per second, but it's definitely somewhere like 10 or more gigabytes per second disk throughput maximum.
I think 8 disks, already in RAID.
I took Ubuntu.
I installed Postgres 18, and this was my mistake.
Because this gave me very good speed for pg_basebackup, unexpectedly.
I saw a gigabyte per second.
And I was like, oh, what am I doing wrong here?
Because it's too fast.
And then I realized it's the Postgres 18 improvements, right?
io_uring.
Michael: Well, yeah, I saw you
said that in the blog post.
But did you run it deliberately
with io_uring on?
Nikolay: It's the default.
Michael: Well, no, it's not io_uring by default.
But it is using something similar, it's like...
Nikolay: That's some prefetching or something, but it's definitely not...
Michael: Yeah, well, it does have 2 or 3 worker processes by default, and it is an asynchronous I/O thing.
Nikolay: No, I didn't change that setting.
So this is probably an inaccuracy in my blog post.
Michael: But maybe not bad, because it might still be AIO.
It might still be asynchronous I/O, just not io_uring.
Nikolay: So, right, there is a setting, io_method, right?
And the default is worker, right?
Yeah.
And it's asynchronous I/O using worker processes.
Yeah, it's not io_uring; I need to make a correction.
Michael: But this is actually interesting, because I think the default is only 2 or 3 workers, which means if you increase that number, you might see triple, like...
Nikolay: 3 workers, you...
Michael: ...got triple, yeah.
So you got roughly triple the...
Nikolay: Exactly, matching my expectation of 300, yeah.
Michael: But if you increase that number, if you increase the number of workers, you might be able to get even more.
Nikolay: Does that mean I need to spend a few hundred dollars again on these machines?
I guess so.
Michael: Or an exercise for the reader.
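For anyone taking up that exercise, the knobs in question on Postgres 18 look roughly like this; whether raising io_workers actually moves pg_basebackup throughput is exactly the open question here, and the host and paths are placeholders:

```bash
psql -c "SHOW io_method"    # 'worker' by default on Postgres 18
psql -c "SHOW io_workers"   # 3 by default
psql -c "ALTER SYSTEM SET io_workers = 16"   # a higher value to try
sudo systemctl restart postgresql            # restart so the settings take effect
pg_basebackup -h source-host -D /var/lib/postgresql/18/standby -P --wal-method=stream
```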
Nikolay: Here's how it worked.
I started the machines at 6pm or something, and worked with them.
Cursor did a lot of work; I just explained and controlled.
I connected tmux, iostat, iotop, everything; I could see how many threads, everything, like htop, many things.
So I could see that it was doing the work the way I would.
Many iterations to polish the approach.
And then I realized, okay, it's already 9pm, 10pm, but I cannot drop it, because it's already provisioned and it took time to create a 1 terabyte database.
I had comments from Maxim as well that 1 terabyte is not enough.
Like, okay, I agree.
I should do 10 terabytes.
So I guess I need to redo this.
I have homework to do.
Michael: You don't need to redo it.
It's just an interesting...
Nikolay: It's interesting to me as well.
Because I also want to see Postgres 17, you know?
Yes.
And different settings here.
I think it's interesting how pg_basebackup can behave with various settings.
More workers, io_uring as well, sync as well, right?
Synchronously.
So not asynchronously like here, but synchronously.
It should go down.
The throughput should go down.
Should go...
Michael: ...back down.
Nikolay: Yeah, yeah, yeah.
So now, obviously, you're just making me do another round of this experiment, right?
Michael: Well, sorry, but I also think, like, someone else could do this, right?
Someone else could do this.
Nikolay: Yeah, I will have it on my to-do list, but if someone who is listening to us is ready to repeat it, maybe they already have some good machines.
Unfortunately, I don't have credits anymore.
My company had a lot of credits, but not now.
Maybe someone has credits and can provision big machines and just avoid the extra spending, because I think a few more runs will already cost $1,000 just to check this setting.
And I'm not interested in checking on smaller machines and then extrapolating.
No, no, no.
It's boring and not serious.
So it should be big machines.
I would also check i7i, because they have 100 gigabit per second network, versus 75.
So also extra; I think we can exceed 10 gigabytes per second.
So, what I got: with pg_basebackup, 1 gigabyte per second; I think you are right in terms of the 3 workers, this is why, right?
Interesting to check more workers.
With pgBackRest, I increased parallelization, and my blog post has the graph showing how throughput was growing with more and more workers, and then saturation happened on the network.
So I achieved 36 terabytes per hour, and that was also exceeding my expectations.
I was hoping to be close to 20, maybe, right?
But these machines...
Michael: Yeah.
It sounds fake to me.
It just sounds not believable.
But it obviously is impressive.
Nikolay: Yeah, so i7i, with 100 gigabit per second, should give maybe more than 40, right?
Maybe approaching 50 TB per hour.
This raises the bar and the expectations of what we should have in our systems these days.
So 5 terabytes per hour should be normal now, I think.
Answering my own question from some minutes ago, right?
10 terabytes should not be surprising already, if you have local disks.
With EBS volumes, it's different.
Michael: Yeah, for me, I guess
the surprising thing is that we
can saturate the network.
Like, it then becomes about your
network, right?
I think.
And the number of cores.
Nikolay: Yeah.
Yeah, if you saturate the network, the idea is, of course, let's try compression.
So I did.
And compression improves things, so it shifts the saturation point, but it gets quickly saturated again, and it was not helping to achieve more.
Maybe I saturated the disks there, actually.
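A rough sanity check on that ceiling: at 75 Gbit/s the wire moves at most about 9.4 GB/s of transferred bytes, roughly 34 TB/hour, and compression only helps by letting each transferred byte carry more logical data, at the cost of CPU. The pgBackRest knobs being traded off here are process-max, compress-type, and compress-level:

```bash
# Wire-speed ceiling for a 75 Gbit/s network link (pure arithmetic).
awk 'BEGIN { gbit = 75; gbs = gbit / 8; printf "%.1f GB/s ~ %.1f TB/hour on the wire\n", gbs, gbs * 3600 / 1000 }'
```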
So if you look at the picture...
Michael: I'll pull up your blog
post as well.
Yeah, because you hit a peak around
32 parallel processes.
Nikolay: Ah, also SSH overhead.
I got good comments on Twitter that it would be good to check TLS.
It would require effort to configure, but it should also reduce some overhead, and throughput should be improved.
Wow.
So I guess 36 terabytes per hour is not the current limit for modern hardware on AWS.
We can squeeze more, right?
So we can probably squeeze up to 50 terabytes per hour.
There are several good ideas, and we can also squeeze more from pg_basebackup.
So this is a competition, right?
So my intention was to show that pg_basebackup is bad, single-threaded, forget about it, here's the recipe.
Now, with Postgres 18, it's not that bad.
It can be tuned additionally, as you say, probably, right?
And we can have very good speed with it as well.
So it's interesting.
And 1 gigabyte per second is more than 3 terabytes per hour.
Michael: Yeah, wow.
Nikolay: Yeah, 3,600 gigabytes per hour, right?
It's more than 3 terabytes.
So it's already, like, good enough.
And these are default settings.
So, the answer: should you use the pgBackRest recipe?
Well, it depends.
With Postgres 18... with older Postgres, I definitely think you will be limited to 300 to 400 megabytes per second.
That's it.
Because of the single-threaded nature of pg_basebackup.
And I like this a lot compared
to the tricks we did in the past
with rsync.
rsync is also single threaded and
I really don't like that.
Why is it so?
It should be implemented.
People use GNU parallel and so on, but it feels cumbersome, right?
It's not a lightweight approach: you write something and then control the threads and so on.
With rsync, it's definitely single-threaded.
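For the curious, the cumbersome pattern being alluded to looks something like this: hand-rolled parallelism with GNU parallel over per-database directories. A sketch only, for a cleanly stopped cluster; a proper hot copy still needs backup orchestration around it, and the destination host is a placeholder:

```bash
# Parallel rsync of a stopped data directory (sketch, not a hot-backup procedure).
cd /var/lib/postgresql/18/main
ls -d base/* | parallel -j 16 rsync -a {} dest-host:/var/lib/postgresql/18/main/{//}/
rsync -a --exclude base ./ dest-host:/var/lib/postgresql/18/main/   # everything else, single stream
```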
pg_basebackup in Postgres 18 basically shines compared to rsync.
And it also orchestrates all the stuff additionally, like connecting to the primary and telling it, I'm copying you, right? pg_start_backup, pg_stop_backup, basically.
So you don't need to remember all those things, and it's also standard, official tooling.
Michael: Although, I would say, because of researching for this, I was looking at the pgBackRest docs and GitHub repo just earlier today, and it's incredibly well maintained; I was surprised to see it.
Nikolay: Yeah, that's thanks to David Steele; yeah, it's great.
Michael: Yeah, kudos to David, but also just in terms of viewing a project for the first time in a while; I hadn't looked into it in detail for a while.
And I think I was just a bit surprised when I looked at the open issues.
I was like, there are only 54 open issues?
For the size of the project, for the age of the project, that for me seems like...
And, by the way, I'm not taking that number on its own; looking at what they were, some of them are feature ideas from a few years ago that are just left open because they're still good ideas.
Some of them are new things.
Nikolay: That's good, because there's an approach where people just close an issue if, like, okay, there have been no comments for a month, let's close it.
And I absolutely hate it, because if you reported a bug which is not fixed, CloudNativePG, hello, they just close it, right?
Because there was a little bit of discussion, but the bug is still there, and everyone around agrees it's still there.
Why do you close my issue?
It's a bug report.
So next time I won't write anything, right?
And I've said it everywhere.
And to CloudNativePG, I'm not going to contribute in any other way.
So this is, like, let's just keep the issue set lean, right?
It's not okay.
If it's a serious problem or idea, it should be there.
So the fact that all the issues are there is great.
Michael: Yeah, yes.
So all I meant is, you kind of get these... sometimes you get red flags when you look at a new project.
Like, if you see a project and it's got 20,000 open issues, for me that is a red flag; not that there are lots of issues, but that no one's maintaining it, no one's closing the ones that are duplicates or things.
So it was more just like, I wasn't getting red flags, and then there were loads of green flags, looking at things like recent PRs, and it just seemed robust.
So I know it's not in core, but it does still feel like a very tried and tested product that's really well maintained.
And I know a lot of people are using it.
So it feels pretty much as close to core as, like, a...
Nikolay: What if you see hundreds or thousands of issues open, but many of them are recent and there is activity?
Is that not well maintained, or a mess?
Michael: Spam issues?
Like, how can you create hundreds or thousands of issues?
I just haven't seen a repo like that, so maybe I have to reassess, but yeah.
Have you seen a repo like that?
Nikolay: I have such projects.
Oh, okay.
So it's just, yeah, there are some old issues that should be
closed, you know, yeah.
Yes.
Well, if you check... so I agree, this is a great example of a well-maintained open source project.
Absolutely great.
I agree.
I think if you check some internal repositories in companies,
sometimes it's a mess as well.
Michael: And sometimes it's beautiful.
I think it depends a lot on the maintainer
Nikolay: What can you tell about Postgres itself? Yeah.
Michael: Well, we don't have an issue tracker.
Nikolay: Yeah, no issues, no issues, right? Well, like...
Michael: We should get... that would be a good t-shirt.
No issues.
No issues. All right, let's get back to the topic.
Nikolay: Back to the topic.
So, David Steele reached out to me, commenting, and it's great.
There are some inaccuracies in my blog post to fix.
And also, obviously, it inspired him; as I understand, this idea had been around for quite some time already: to have the feature to basically issue renice, to reprioritize some processes, right?
So now there is a pull request to be able to change the priority of pgBackRest processes, which is great.
Michael: Yeah, so this is super cool.
So this is the idea that, because you're running this on production, you might not want to... you know, let's say you're quite high on CPU on your primary.
Nikolay: Compression.
Yeah.
You use compression, so you need CPU for pgBackRest, for example.
Michael: Yeah.
But you don't want to affect your current traffic, right?
Nikolay: It reminds me of the old days, like 2003 or 2004, when we also...
It's so strange, like, it was...
We couldn't think about terabytes in those days, 20-plus years ago.
It was too much.
A terabyte was a mind-blowing number.
But we used, I remember, renice; it is renice, ionice, something, I barely remember from those days.
Michael: Just NICE, yeah?
Nikolay: Yeah, NICE and so on.
It's strange that this word is used to change priority, right?
I don't know.
Do you know the details here?
Michael: Yeah, no, I don't.
Nikolay: Yeah.
Yeah.
So I remember from those days we used it, but I also remember I was struggling to see good effects.
So now, if this pops up again, I would check it, test it properly again, how exactly it helps, you know.
But this is in the to-do.
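The old-school OS-level version of this, independent of the new pgBackRest option, is just nice and ionice; the values and stanza name below are illustrative, and cgroups are the more modern equivalent:

```bash
# Start a backup with reduced CPU and I/O priority:
nice -n 19 ionice -c 3 pgbackrest --stanza=main backup --type=full
# Or re-prioritize processes that are already running:
renice -n 19 -p <pgbackrest_pid>
ionice -c 2 -n 7 -p <pgbackrest_pid>
```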
Michael: And really cool that these benchmarks helped prioritize that work.
So cool.
Nikolay: Inspires to do more benchmarks as well.
Michael: Always.
I get stuck in that loop sometimes, not just with benchmarks; you know, you think of a way to implement a feature, then you think of a better way to implement it, then a better way, and a better way, and you get stuck in that loop of constant... yeah.
Or blog posts.
Nikolay: Yeah.
This post is definitely not the final one.
So somebody should achieve 50 terabytes per hour.
This is a challenge.
Yeah, I like it.
Yeah, who can do it?
Who will be the first in the Postgres community reporting 50 TB per hour?
Michael: Yeah, let us know.
Although, I did see there was an issue that David linked us to, where somebody maxed out a 100-core machine, and it convinced him to increase the limit; there was a process-max cap at 99 before then.
So I thought that was quite funny, that, you know, the limiting factor was this hard-coded limit.
Nikolay: The maximum number of vCPUs on AWS, I think it has already exceeded 1,000, or no, I remember around 700 vCPUs, with fifth-generation Intel Xeon Scalable as well; so we already have a lot.
Michael: Bear in mind, your test maxed out at 32 processes, right?
Right, but this one...
Nikolay: Bumped into the network.
Michael: Yes, okay.
Nikolay: And then when I increased compression, it shifted, right?
So it shifted to 64 cores.
Michael: So I'm wondering, what do you think the person in the open issue back in 2019, who maxed out their hundred cores, what do you reckon their throughput was?
Nikolay: Lower.
It's interesting, but maybe, you know, we need compression not only to battle the network throughput limit; if it's an actual backup, we sometimes also want it to take as little space as possible in S3.
Right.
Just to pay less, maybe.
Right.
In this case, we can have aggressive compression and high CPU consumption.
In this case, it can be 128 cores, for example; third-generation Intel Xeon Scalable has 128 cores maximum, I think, at least in clouds.
It's N2 on GCP and i4i, as I used, on AWS.
But I wonder a lot: how come AWS created 700 or 800 vCPU machines?
It's like, wow.
So, like, 10-plus terabytes of memory.
Wow.
Yeah.
Michael: Yeah.
By the way, when you kept saying the Google disks earlier, every time you said it, all I could hear was PTSD.
PTSD.
Nikolay: PD-SSD.
Persistent disk, SSD.
Yeah, yeah.
Michael: I heard PTSD.
And then when you're saying i4i, all I can think of is the saying, an eye for an eye makes the whole world blind.
So now...
Nikolay: Eye for eye.
Michael: Yeah.
Nikolay: Oh, by the way, 1 interesting thing: EBS volumes these days are also built on top of NVMes.
I don't know what it's called, the Nitro architecture.
There's always a new name; I don't follow all the terms, but I remember 2 gigabytes per second and even more.
Right.
So I think we can squeeze 5, 6, 7 terabytes per hour on modern EBS volumes as well.
Michael: Even on EBS? Wow.
Nikolay: Yes, yes, yes. So forget about 1 terabyte per hour.
Michael: It's old; you can do better.
Nikolay: Yeah, definitely better. If you have a large database, do better.
Michael: Nice.
All right.
Good.
Thanks so much, Nikolay.
I enjoyed that.
Take care.
Nikolay: Bye bye.