Compression
Nikolay: Hello, hello.
This is Postgres FM, episode number
107.
My name is Nikolay, founder of
Postgres.AI, and as usual, my
co-host is Michael, pgMustard.
Hi, Michael.
Michael: Hello, Nikolay.
Nikolay: So, you chose the second
most boring topic, in my opinion,
after security.
Tell us what it is.
Michael: Yeah, I can blame our
listeners for this one.
We had a… In my opinion…
Nikolay: Blame someone else, right?
Michael: Yeah, exactly.
Always.
That's the first rule.
Nikolay: This is my favorite methodology
in troubleshooting,
of incidents.
Michael: No, blame this culture,
right?
So we had a great listener suggestion
to talk about compression.
And I guess it's kind of surprising
we haven't covered it yet.
We've covered various topics around
this, but they are specifically
in the context of a couple of extensions
that offer compression
options.
I thought it was a good topic to
cover generally.
We can talk about all the different
things where compression
is available in Postgres.
Kind of a broad but shallow topic
maybe this time.
Nikolay: Yeah, we will talk about physical, lower-level compression, not like modern ways to compress data where you tell some LLM: I have a lot of data in this table, in this column, it's a huge chunk of text, let's just summarize it and drop the detailed texts, and so on, right?
It's actually my favorite way to
compress.
Michael: Well, that's a good point,
actually.
I guess I assumed automatically
we were talking about what would
be called lossless compression.
Yeah, so like with images.
Nikolay: This is definitely lossy.
This will lose some details for
sure.
And it will add some details that don't exist. Not only is it lossy, it's also glossy, I guess.
So,
Michael: Does it stand for lossy lying machines? Is that what LLM stands for?
Nikolay: Yeah, yeah.
So we talk about lossless, lower-level, transparent compression: we just enable something and data suddenly takes less disk space, or less space on the network when we send something, in transit so to speak.
And we are able to uncompress it
without any losses.
But it's all fully automatic and
users just use it, enabling
some features probably.
This is our focus today, right?
Michael: Yeah, exactly.
And I think that's a good transition
into kind of the 2 main
benefits.
And I think they're obviously related,
but I think they are somewhat
different and probably in some
ways trade off against each other
a little bit.
One is compression for the sake of storage. So if we have very repetitive data, or data that compresses really well, we could maybe spend a lot less money and fewer resources by compressing it. So storage is obviously a big one.
But there's also the performance side of it. If the data takes up less space then, depending on where our bottlenecks are, the overall cost or speed hit of compressing on one side and decompressing on the other can still come out faster, because we only have to transport or do calculations on a lot less data, that kind of thing.
Nikolay: Let's maybe start with things we have had in Postgres for many years.
And then discuss some new stuff.
And then discuss and compare what's
beneficial, what's less beneficial.
We have, first of all, compression at the WAL level for full-page writes, full-page images, right? FPIs.
And full-page writes are needed to fix the problem of the difference between the size of buffers, the pages in memory, 8 kilobytes in most cases, and the block size of the file system, very often 4 kilobytes, ext4 for example, right?
And to avoid partial writes in case of failures, when a buffer is changed for the first time after a checkpoint, Postgres doesn't write only the change itself to WAL, it writes the whole buffer. And since WAL compression is not enabled by default, it means the whole 8-kilobyte buffer is written to WAL as is and consumes 8 kilobytes, right?
But if we enable WAL compression,
this page is compressed.
And in my opinion, in most cases,
if we talk about significant
load, we should consider enabling
WAL compression.
It can be beneficial, especially
if you have short distance between
checkpoints, because in this case
you have more frequent full-page
writes happening.
Of course, if you increase the distance, then maybe the same page becomes dirty many times, meaning writes happen inside it multiple times, and only the first change produces a full-page write. Subsequent changes are written only as deltas, right? Only the tuple which was changed.
But if you have quite frequent checkpoints, by enabling WAL compression you can significantly reduce the amount of WAL written.
And this has very positive consequences, such as less data to write to backups: WAL archiving, archive_command, will archive fewer bytes, fewer gigabytes per hour, for example.
And second, replication as well.
WAL is smaller, so replication
has less to transmit over the network.
Of course, compression needs CPU
cycles and decompression needs
CPU cycles.
And I saw on Twitter, some people
mentioned they had problems
when they enabled WAL compression.
But in my cases, I observed we
always decided to switch it on
and CPU overhead was worth it.
So there is a trade-off here, CPU versus I/O, and we always chose in favor of less I/O.
And in synthetic tests, which I showed you just before we started this recording, we had a simple pgbench experiment with max_wal_size 4GB and checkpoint_timeout 15 minutes.
We saw WAL reduction, I think,
on regular PgBench workloads,
which include writes, of course,
inserts, updates, deletes.
We saw 3 times less WAL created,
generated, or written, it's
the same, right, when we enable
WAL compression, which is a
huge benefit, 3 times less WAL.
But if you did checkpoint tuning, as we discussed in previous episodes, and the distance between checkpoints is quite large, especially if you have Patroni and modern Postgres, which in case of failover or switchover don't require a restart of all nodes, then you can afford a bigger distance between checkpoints and have max_wal_size set to, I don't know, like 100 gigabytes.
In this case, it's unlikely the
benefit will be so huge, not
3x.
Of course, it also depends a lot
on the data nature, right?
If it's very repetitive, it's easier
to compress.
Michael: Well, yeah, I mean, that's
pretty true across the board
for compression isn't it?
On the WAL front, one thing I think we didn't talk about when we covered WAL and checkpoint tuning is the compression side as of Postgres 15, which is pretty new, so you might not have in-the-wild experience with it yet, but we do have more options now.
So in the past, we could only turn
WAL compression on or off.
And now we have, instead of on,
we have 3 different options.
We have the previous option, which
is the PGLZ algorithm, but
we also have LZ4 and Zstandard options now.
And for the people that kind of complained about the CPU side of things, one option might be LZ4, which I believe would be less CPU-intensive. It might not be faster.
Huh?
Nikolay: Faster and more lightweight
in terms of CPU consumption.
But the compression ratio is worse.
Michael: Well, it's not that different from PGLZ, actually. Compared to some other modern compression algorithms it tends to lose, but compared to PGLZ you don't lose much.
I think it is slightly worse on
average, obviously depends a
lot on the data.
But I think it's a pretty good
option if you have that issue.
And if you're testing turning this on, on a modern version of Postgres, it's worth at least trying those different new options, depending on your constraints.
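For reference, a minimal sketch of what trying this looks like, assuming Postgres 15 or newer and a build with LZ4/Zstandard support (the choice of lz4 here is just an example):

```bash
# wal_compression only needs a reload (SIGHUP), not a restart
psql -c "ALTER SYSTEM SET wal_compression = 'lz4';"   # or 'zstd', 'pglz', 'on', 'off'
psql -c "SELECT pg_reload_conf();"
psql -c "SHOW wal_compression;"
```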
Nikolay: Yeah, that's a great point.
And I have 2 ideas here in mind. First of all, I remember these options, Zstandard and LZ4, require compile flags to be turned on.
As you rightfully said, these are fresh things, so I don't have rich production experience with them.
But these flags, I guess, are turned on in the official apt packages, right? If you just install it on fresh Ubuntu. I hope so, but it's worth checking: Postgres should be compiled with support for these 2 algorithms.
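As a sketch, one way to check: the list of accepted values for the setting should reflect what the build was compiled with, and pg_config shows the configure flags used for the packages.

```bash
# lz4/zstd appear among the accepted values only if the build supports them
psql -Atc "SELECT enumvals FROM pg_settings WHERE name = 'wal_compression';"
# For packaged builds, the configure flags should include --with-lz4 / --with-zstd
pg_config --configure | tr ' ' '\n' | grep -E 'lz4|zstd'
```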
And the second thing: I think it's worth new research. Maybe with our bot it should be quite easy, and we could test different workloads and different max_wal_size values.
As I said, the distance between checkpoints is very important here. And just with regular pgbench we could check all available algorithms on fresh Postgres versions, and maybe draw some charts, right? Plot some charts.
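A rough sketch of that kind of experiment (hypothetical database name, arbitrary durations), comparing WAL volume per algorithm with pgbench:

```bash
for alg in off pglz lz4 zstd; do
  psql -c "ALTER SYSTEM SET wal_compression = '$alg';" -c "SELECT pg_reload_conf();" > /dev/null
  start=$(psql -Atc "SELECT pg_current_wal_lsn();")
  pgbench -T 300 -c 16 -j 8 bench_db > /dev/null      # default write-heavy workload
  end=$(psql -Atc "SELECT pg_current_wal_lsn();")
  psql -Atc "SELECT '$alg', pg_size_pretty('$end'::pg_lsn - '$start'::pg_lsn);"
done
```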
So yeah, we just lack hands, but the bot is definitely ready for that.
So I think it's a great direction.
I wish I had more hands, by the
way.
Like I wanted to say we are looking
for maybe part-time database
engineers, people who want some
fun with this kind of work, like
research.
We usually do it public, so it's
kind of interesting for community
as well.
So if some people listening to
this podcast want to participate
and work part-time with us with
Postgres.AI, definitely would
be interested in discussing.
So maybe we will have more hands
and do this benchmark, for example.
Michael: What's the best way for
them to get in touch with you?
Nikolay: Email or Twitter
or LinkedIn, but email
nik@postgres.ai is always a good way
to contact me.
Michael: On the packaging front,
I don't know, I haven't checked
whether they have compiled Postgres
with those flags, but I was
pleasantly surprised.
I'm on Google Cloud SQL for my
own, for the pgMustard database,
and recently upgraded to Postgres
16, finally.
And I was pleasantly surprised that I was able to turn on WAL compression with LZ4, and also to switch our default TOAST compression algorithm to LZ4, which is really cool.
Nikolay: On 16 you said?
Michael: I'm on 16, but I think
those are available as of version
15 for WAL and version 14 for
TOAST.
Nikolay: That's great.
So what benefits, like, do you
remember some numbers?
Michael: I didn't do extensive benchmarking, but the thing we use it for most is saved plans. EXPLAIN plans compress really well; there's a lot of repetition.
Nikolay: Yeah, many.
Yeah, it makes sense.
So it's in JSON, right?
Michael: Well, text plans compress well too, but JSON plans compress extremely well.
But obviously JSON plans are bigger
in the first place.
So there's not...
Nikolay: Yeah, that's good.
Michael: So yeah, it compresses
really well.
But the main thinking was, we don't
mind spending a bit more
on storage for those if the speed
of retrieving them will be
quicker.
So people save a plan, they might
share it with somebody on their
team and that person needs to load
it.
Obviously, we're talking about small amounts of time, but if the storage wasn't that different and the speed was faster, I was happy to make the switch.
And yeah, it turned out the storage
was slightly worse on average
for the plans I tested with LZ4,
but the retrieval speed was
faster.
So I bit the bullet and did it.
The cool thing is, the thing I
didn't realize, I thought it would
be a complex migration.
But like, what do you do?
Like, I thought you might have
to change existing plans or existing
data.
But you don't.
If you change the setting, it applies
to new data.
Nikolay: Right.
Yeah, actually, this is a good point we didn't mention. So we're shifting to the discussion of storage compression and TOAST, right? But it's a good point we forgot to mention about WAL compression: it doesn't require a restart. So you can switch it on, switch it off, and all new writes will happen according to the new setting. Of course, you need to send the SIGHUP signal or run SELECT pg_reload_conf(), so configuration changes are applied without any restart, which is great.
And it also means it's easier to
try.
If you have monitoring, if you're
prepared to roll back, it's
easier to try.
And if things go wrong, you can
return.
So TOAST compression is interesting.
And so, again, sorry, I'm not paying attention to details today somehow. You chose Zstandard, right?
Michael: I actually chose LZ4.
I was more interested in the speed
of compression and speed of
retrieval than I was for total
size on disk.
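As a rough sketch, assuming PostgreSQL 14 or newer and hypothetical table/column names (plans, plan_json), the switch and a way to check which method stored values actually use could look like this:

```bash
# Cluster-wide default for newly written values; a reload is enough:
psql -c "ALTER SYSTEM SET default_toast_compression = 'lz4';"
psql -c "SELECT pg_reload_conf();"
# Or per column:
psql -c "ALTER TABLE plans ALTER COLUMN plan_json SET COMPRESSION lz4;"
# Existing values keep their old compression; new writes use the new method:
psql -c "SELECT pg_column_compression(plan_json) AS method, count(*) FROM plans GROUP BY 1;"
```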
Nikolay: Meanwhile, I asked our
bot to double-check how apt packages,
official pgdg packages are created.
Of course, these 2 options are
there, so if you install Postgres
on Ubuntu, you can try various
compression algorithms.
Well, just to double-check.
Okay, so TOAST, what else? We don't have a good ability to compress table data, right? By default in Postgres we don't have it, and it's actually not a simple topic for row store. In row store, data is stored tuple by tuple, and tuples can go to different pages.
If we want to compress it, we can do it transparently by switching to ZFS, for example. We saw benefits in terms of disk space, like 10 to 30 percent, depending on the nature of the data. But ZFS brings new challenges for administration, for sure. Still, it's good to shave off 30%.
And this is what you get by default
if you install DBLab because
it's on ZFS.
So if you have, for example, a terabyte
size database in DBLab,
it will look like 700 to 800 gigabytes
only.
It's much better.
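For illustration, with hypothetical pool and dataset names, transparent filesystem-level compression on ZFS looks roughly like this:

```bash
zfs set compression=lz4 tank/pgdata    # or zstd on OpenZFS 2.0+
zfs get compressratio tank/pgdata      # reports the ratio actually achieved
```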
But in Postgres itself, there are no good compression options for heap.
Michael: Except for single large values; if you have one column that's large, I think TOAST is very good, but not for smaller values.
Nikolay: Oh, actually, yeah, big values are compressed in heap, right? Now I remember: before going to TOAST, Postgres tries to squeeze them, to fit them into roughly 2 kilobytes and have roughly 4 tuples per page. That's roughly what I remember.
Michael: Once it's compressed, if it's under 1 kilobyte or something like that, they put it inline on the page.
Nikolay: Oh, inline.
Michael: Yeah.
It just means that if you have
values that are multiple kilobytes
and compress well, even the default
PGLZ will give you quite
a lot of compression out of the
box, transparently in Postgres.
It's just, if you have lots of
repeating small data, like time
series data, that could compress
as a whole very well, the row
store...
Nikolay: But this compression is applied to a single tuple, a single row version. A single row version means we only have one timestamp, for example, one temperature, and so on. Several different columns, maybe a couple of timestamps, but timestamps of a different nature, for example created_at and, I don't know, registered_at, some different timestamp.
And Postgres tries to compress the tuple, and TOAST... we didn't cover it here and there's no goal to cover it deeply, we had another episode on it, right. So Postgres tries to compress, and if the value doesn't fit, it splits it into chunks, and the chunks are stored in a separate so-called TOAST table, which is actually also a regular table, kind of invisible to the user, but you can inspect it if you want. Then decompression and reconstruction of the tuple happens when it's needed.
But what I'm trying to say is there is
no...
For the heap itself, there are no rich
capabilities to control compression.
And even if we had them, the expected benefits would not be high compared to analytical column-store databases, where, for example, we have all temperatures or all timestamps stored in a separate file.
Only this column for all rows is
stored here.
They are all temperatures, and if they're coming from a single source, for example, they don't jump around. I don't remember the word in English, sorry, but the changes are not abrupt. If the temperature is increasing, it will probably keep increasing for some time.
It means that compression could
be done using various algorithms,
for example, applying deltas and
storing only deltas.
We will probably discuss TimescaleDB
soon because this is pointing
in the direction of TimescaleDB
and what it provides.
But so for ColumnStore you can
compress like 10x which is not
possible
Michael: for rows.
Yeah, or even more. But I think the multiples are the interesting part, right? You've missed the fact that, let's say our tuples are like a megabyte, but 99.9% of that is a single JSON column, and we have 10 other tiny little integer columns. We can get way more than a 10x compression ratio just with TOAST, if that JSON compresses well. Yeah. So...
Nikolay: I don't... As you know, probably, when I was very young, I participated a little bit in XML functions and data type development in Postgres, and I remember at that time XML compression was a big thing.
And I even remember hardware accelerators
companies were selling
to compress XML on the fly transparently.
It's crazy.
It's because a lot of the structural pieces of the values can be compressed well in XML for sure, less so in JSON, but still, all those parentheses, braces, quotes, and so on.
Or maybe also, if it's JSON, not JSONB, there's a lot of whitespace, right? Yeah.
So, yeah. If you have JSON, in some cases tuning compression and TOAST is interesting. And maybe we also need benchmarks, maybe for JSON as well. It's good.
Michael: Yeah, it
Nikolay: would be
Michael: interesting, but what I mean is more that the compression ratio varies a lot, it's obviously extremely dependent on the data. But I think
Nikolay: I'm very curious what
you have, for example, if you
know that some old values are not
compressed and some new values
are compressed, you check how much
storage is occupied by, I
don't know, by like 1000 rows or
so.
It's worth checking.
Michael: Well, it's basically impossible to store large values in Postgres without compression... actually no, you can turn compression off for TOAST, but the default is on. So I have never tried it without, and I don't have that to compare against, but just to give you an idea.
We didn't use to compress plans, so we only store plans in browser local storage by default, and that's an extremely limited resource. In Firefox, I think, it's like 5 or 10 megabytes, depending on the version. That's not very much when you're talking about large query plans. Obviously, it's a lot for most people's query plans, but some people will come up with query plans that are dozens of megabytes.
Nikolay: Varlena data types accept
up to 1 gigabyte, right.
But as I remember, It's better
not to go beyond a couple of hundred
megabytes because the performance
will become terrible.
Michael: Well, just to give you
an idea though, quickly, these
plans were compressing from megabytes
down to kilobytes.
Like it was easily, yeah, It was
more than 95%.
I think in some cases, like, yeah.
Nikolay: That's cool.
But still, it's all compression of only one tuple, and actually one value, not even a tuple.
Michael: One value, yeah.
Nikolay: But it feels like it would make sense to have some compression at the page level as well, and so on. This is what ZFS provides transparently if you put Postgres on top of it, right? So compressing pages, I think, even if it's still a row store, maybe having these abilities...
I remember Peter Geoghegan talked a
lot about new features in MySQL
many years ago.
I think MySQL has more settings
in this area for compression
of storage.
Have you heard or remember something
from it?
We have mostly questions today,
right?
Michael: I didn't know that.
Well, they seem to be a bit further
ahead on the storage engine
side of things, don't they?
Like they've had multiple storage
engines for quite a lot longer
than...
Well, like we don't have...
But yeah, so it wouldn't surprise
me if they were ahead on that
front.
But yeah, you mentioned Timescale,
it feels like we've had a
history of a few extensions that
have offered the option for
columnar storage.
Nikolay: Before going there, let's
just touch a little bit on
options that Postgres could have,
and there are some discussions.
Like, what else could be compressed
inside?
For example, temporary files.
Yes, good idea.
Michael: But we should also probably
talk about things that we
actually haven't mentioned, like
backup files, pgdump files.
Nikolay: I know, this is already...
Yeah, I wanted to do it, but look,
before going to backups or
dumps, backups and dumps, it's
already going outside of Postgres.
But imagine we have Postgres, we
run some queries, we discuss
the WAL, which is the first thing
we should discuss because
data is written there first before
it's written in data files
or pages and so on.
So we discussed WAL, we discussed
the storage itself, TOAST,
this is what we have and that's
it.
Not at page level, but it can be
achieved by a file system, but
that's it.
Indexes, I have no idea. Maybe deduplication, which was done in Postgres 13 and 14 by Peter Geoghegan and Anastasia Lubennikova, maybe this can be considered as compression, actually, right?
Michael: That's a good shout, yeah.
Nikolay: It's a kind of native compression, you know, because the index occupies less space, right? Why not? It's an optimization, but still.
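For context, B-tree deduplication is controlled by a storage parameter that is on by default since Postgres 13; a hypothetical example (table and index names made up):

```bash
psql -c "CREATE INDEX orders_status_idx ON orders (status) WITH (deduplicate_items = on);"
psql -c "SELECT pg_size_pretty(pg_relation_size('orders_status_idx'));"
```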
And last time we discussed, what
was the topic last week?
Michael: Out of disk.
Nikolay: Out of disk, right.
And remember, we mentioned... I called it exotic, and some listener commented on YouTube. By the way, thank you to all those who write something on YouTube or Twitter or LinkedIn, it's very good to receive feedback.
So they mentioned that temporary files are not that exotic: if you run huge, heavy queries, as they did, you can run out of disk space because temporary files can be huge and consume a lot of disk. And compressing them makes total sense to me, right? What do you think? We don't have it, right?
Michael: I was watching a really
good video that you sent me
a link to by Andrei
Nikolay: Borodin.
Right, right.
Michael: Cybertech 2023.
Yes.
And I watched it last night and
he mentioned that in order to
be...
I think there was a problem with
durability.
If you want to be...
Nikolay: Forget about problems. This is like... I'm just... It doesn't make sense. My answer is...
Michael: Yes, it does. He started doing the work and it's more complex than it might sound. But yeah, it makes sense, but it's complicated.
Nikolay: Full disclaimer, I just
chatted with Andrei this morning
and this is what he told me and
I'm just shamelessly using his
ideas.
So it's worth compressing pages,
as I mentioned, and it's worth
compressing temporary files.
Andrei says it's just a lack of time and hands, but it's worth implementing and proposing a patch. So yeah, it's a great idea.
I hope he will find time to work
on this and other people probably
will find time to move Postgres
forward in this area.
And in this case, imagine if temporary
files are compressed.
In this case, I would be more right
saying it's exotic to be
out of disk space when you have
temporary files occupying a lot
of disk space, right?
Michael: I think it makes extra
sense because the type of things
that are generating temporary files
like sorts and hashes, it's
probably a lot of similar values.
I'm guessing there would be a fair
amount of repetition which
would naturally suit compression
as well.
So I'm wondering if there's like
a...
Not only would it just take up
less space, and obviously there's
benefits there, but I'm wondering
if there might be performance
benefits too.
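As a side note, a quick way to see how much temporary-file I/O a cluster actually generates (cumulative counters since the last stats reset):

```bash
psql -c "SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_written
         FROM pg_stat_database ORDER BY temp_bytes DESC;"
```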
Nikolay: Yeah, that's a good point
actually.
When we talk about saving disk
space, sometimes it's also about
performance.
If disk is slow, It's better to
consume cycles of CPU for compressing,
decompressing, and things become
much faster.
I agree.
This is a good point.
But it's not always so.
It depends.
In general, I think I saw somewhere the general point that databases are I/O-hungry. And as you very well know, we talk buffers, buffers all the time; it just confirms that it's usually all about I/O.
But I also saw CPU at 100% and a lack of CPU, especially if, for example, we are on Graviton or ARM and we only have up to 64 vCPUs, and that's it. These days 64 is not a lot, right? It's a moderate number of...
Michael: It's a lot for some people,
yeah.
Nikolay: Right, right.
But it's not 360, it's not 800
like you showed me on AWS for
Intel scalable Xeon.
So if it's 64, a lot of clients,
in this case CPU becomes a very
valuable resource.
Right?
Michael: Of course.
Nikolay: Right.
So in this case, we might prefer to spend more disk I/O instead of CPU cycles.
Okay, what else can be compressed
in Postgres?
We talked about ideas which are
not currently implemented.
Temporary files, page level compression,
maybe protocol level,
which has pros and cons.
Maybe let's not go there, right?
What else?
Michael: Anything else in memory
that could be?
Nikolay: Oh, interesting.
Maybe.
I don't know, But I think if we,
for example, shifting maybe
to some additional projects, I
think we will talk about Hydra
and TimescaleDB.
If, for example, we have a storage
engine and we have some tables
defined as column store, this is
what Hydra provides.
In this case, it makes total sense
to compress those tables.
Or, for example, as we mentioned last time talking about disk space, we have row store, but we have partitioning, and some old partitions we consider as holding archived data. This data we probably want to compress, and maybe we want to store it not on regular disk but on object storage, and we discussed pg_tier, which Tembo developed. So it's kind of bottomless Postgres, but we want that rarely used data to be compressed. Right?
Michael: Yeah, I saw 3 different approaches from different extensions. The Hydra one you mentioned, I think, came from Citus Data originally; cstore_fdw I think was the original.
Nikolay: It's AGPL from Citus inherited
by Hydra.
Michael: Yeah, so I think that
is very much like from the start
of a table's life, you choose whether
it should be row store
oriented or column store oriented.
That's one approach that makes a lot of sense.
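For illustration, with a made-up table definition: in Hydra, and in the Citus columnar access method it descends from, that choice is made at table creation time.

```bash
psql -c "CREATE TABLE events_columnar (ts timestamptz, device_id int, value float8) USING columnar;"
```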
The timescale approach seems to
be...
Nikolay: And column store, of course,
we want to probably compress
quite a lot because the ratio usually
is good.
Michael: Well, yes. But I got the impression that with the Citus column store approach, and therefore Hydra, yes, you get some compression for the storage benefits, but the main aim seemed to be making analytical queries faster. So again, it's that performance angle that seemed to be the driving force for why you would want column store in the first place: for the performance of analytical queries that tend to be aggregates over a single column. And if we've got column-oriented data that's compressed...
Nikolay: It's massively less I/O. Of course, we want to save on that I/O and then deal with the data in CPU, but also the nature of the data is a lot of similar-looking values, which suits compression.
Michael: And I think once it's organized by columns, you can also much more easily store metadata, like min and max values, and take some shortcuts on that data as well. So I think there are some cool tricks as well. But then I think there are 2 other approaches that I've seen.
One is the Timescale approach, which seems to apply to older partitions, to older data. Everything's row store at the beginning, and then after a certain point you set a policy so that it gets converted to column store later, once it's unlikely to change?
Nikolay: I remember it differently. I remember it's always row store, but compression works in 2 dimensions. I'm not sure they convert it to column store, maybe I'm wrong, but what I remember is that it's still row store, but with an understanding of the vertical dimension, so to speak. For example, storing deltas instead of raw values and applying compression.
Michael: Well, I mean, it's still
in Postgres.
It's still in Postgres, so it's
still row store under the hood,
but it's column-oriented, like
it's organized by column.
Like, if you look, the Postgres
docs are really good on this.
I'll share a link.
Sorry, not the Postgres, the Timescale
docs.
Nikolay: I lack understanding here, apparently.
Michael: But there's a third approach as well. Although first, the Timescale approach, I think, is more optimized for space. I think the compression is more about: let's make older partitions take up less space, because you have so much data. Like 20 times less. Yeah, some of the numbers they mention are like 95% compression.
Nikolay: We have observed impressive numbers too, yeah. It's really good.
Michael: But yeah, the idea is that with time-series data you could end up with hundreds of terabytes, and they have themselves. So I think it's the kind of case where you could actually save a lot. Now, obviously they also benefit from the performance on analytical queries, and they have some cool features there, but it feels like the way they've implemented it was primarily for those storage benefits.
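Roughly, from memory and with hypothetical table and column names (check the Timescale docs for the exact, current syntax), the policy approach looks like this:

```bash
psql -c "ALTER TABLE metrics SET (timescaledb.compress,
          timescaledb.compress_segmentby = 'device_id');"
psql -c "SELECT add_compression_policy('metrics', INTERVAL '7 days');"
```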
And then the third one, which I think has popped up relatively recently in the grand scheme of things, is this idea of, as you mentioned, tiering, or moving to a file format like Parquet. So exporting data out of Postgres into a compressed format on object storage that's normally column-oriented, so that you can get these fast analytical queries and they take up a lot less...
Nikolay: storage space.
There are some big limitations for data types, I suspect: only a limited set of data types is supported for those kinds of things.
Michael: And how do updates and deletes work? I don't actually know all of the details.
Nikolay: Not possible, I don't know.
Michael: Yeah, so there's definitely
limitations and differences
between these.
A lot of, we're not gonna be able
to describe them all here,
obviously, time-wise.
But I found it really fascinating
that there's these 3 different,
quite different approaches.
What do you think?
Nikolay: Right.
Yeah, well, it's super interesting
to observe progress here.
Probably, again, for me this episode is raising more questions than answers.
I think after this episode, I will
be planning more experiments
to study benefits.
And I guess we always need to take
into account several metrics,
compression ratio, speed of compression,
speed of decompression,
right?
Maybe the CPU overhead itself,
if it matters for us, how much
CPU we consumed.
So yeah, TimescaleDB and Hydra
are interesting in this area.
And I still remember the big impression I got from reading the TimescaleDB details in their blog post on how they implement compression.
I think we forgot, because of me,
we forgot to talk about compression
of dumps.
Oh yeah.
PgDump has compression options.
And also compression of WAL files as a whole, which I think Postgres doesn't provide, but both pgBackRest and WAL-G, I think the most popular backup tools, do it, right? Because if you have a 16-megabyte file and you compress it, you get something like 3 times less, as I remember, maybe 5 megabytes or so, maybe even less in some cases.
Again, it depends on data.
In this case, it's much better
in terms of storage costs and
transfer speed and so on.
But this will consume some CPU cycles, and usually archive_command is working on the primary. This is the key here. It's usually on the primary to avoid the risk of longer delays, lag of archiving, because we need WALs, it's part of our DR strategy, the disaster recovery strategy, so we want to archive WAL as fast as possible.
And this means we do it on the primary. And if we compress whole WALs, we need some CPU, and if a lot of WAL is generated, we will probably need multiple workers. Then you already see more than 100% of a single CPU, meaning multiple cores are busy: say 200 percent out of our 360 vCPUs, we allow 2 cores to be used to compress and archive WALs. These are just some random numbers in my head, but definitely, if we talk about compression of whole WALs by WAL-G or pgBackRest, we need to keep in mind the most valuable CPU resource, which is on the primary. We need to think about capacity here.
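As a hedged sketch (exact setup varies; see the WAL-G docs), the archiving path and its compression method look roughly like this, keeping in mind that the CPU is spent on the primary:

```bash
# WAL-G compresses each archived segment; lz4 is its default method.
# WALG_COMPRESSION_METHOD (e.g. zstd) must be visible to the postgres server
# process, for example via the service unit or an env file, not just your shell.
psql -c "ALTER SYSTEM SET archive_command = 'wal-g wal-push %p';"
psql -c "SELECT pg_reload_conf();"
```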
And decompressing is usually not a big deal, especially if we can fetch WALs from object storage like S3 using multiple workers and decompress them. Usually it's not a big deal, but still worth remembering. So usually we have compression there.
Unfortunately, Postgres right now
doesn't do it officially, so
this is only...
Here I talk about, again, third-party
tools which are very common,
popular, WAL-G, PgBackRest.
Others as well, I think they also
compress.
And pg_dump. pg_dump is official, very official. pg_dump, pg_restore. They support compression for the custom or directory format, or both, right? I always mix
Michael: them up. Yeah, and this is one of the oldest features, this has been in forever, right?
Nikolay: Right, compression has been there forever, but what about options in terms of algorithms? Oh, yeah. Zstandard and LZ4 are relatively new there, I think. And it's worth again experimenting, benchmarking, and studying. I think there are articles from CYBERTEC, right? Let's check. Yeah, there are articles comparing different types of compression and decompression.
As for the ratio, I think you can even control the compression level there: if you want to spend more time and CPU capacity to achieve a somewhat better compression ratio, it's possible.
Yeah, I remember some surprising
results from those articles.
I don't remember details, so let's
attach them.
But definitely we want to study
this as well.
Again, mostly questions today,
not answers, not exact recipes.
But the only thing I don't like in pg_dump and pg_restore is that you cannot use parallelization and compression on the fly. What if this is a one-time operation: I just want to logically migrate from one database to another, and I don't want to send a lot of bytes over the network.
I want to compress and decompress
and use multiple workers because,
for example, these servers are
not used right now.
So I just need to migrate out of
RDS to self-managed Postgres
because I feel enough level of
confidence and I've already worked
with great guys who know how to
do it and RDS is not needed for
me anymore and so on.
So in this case, unfortunately, you first need to save the dump to disk and then restore from there. I think this is a big missing feature of Postgres.
And Dimitri Fontaine created pgcopydb,
I think, but I quickly
checked before our recording, I
didn't see anything related to
compression.
It talks only about parallelization.
Let's have like 16 workers or,
I don't know, 4 workers to speed
up the process, which is in my
head quite relatively simple idea.
We just open a repeatable read transaction, keep it, export its snapshot, open other repeatable read transactions and use the same snapshot, so all of them are synchronized, and you can even read one huge unpartitioned table in chunks. It will be a fully consistent read using multiple workers.
But no compression options.
I couldn't find it.
Maybe it's not an easy idea to implement. Maybe it's a good idea. What I would like to have is pg_dump and pg_restore supporting both parallelization and compression without the need to store an intermediate file or directory. That would be great.
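For reference, a sketch with made-up database names: as of PostgreSQL 16, pg_dump supports lz4 and zstd for the custom and directory formats, and the directory format allows parallel workers, but parallelism still cannot be combined with a direct pipe:

```bash
# Parallel, compressed dump to a directory, then parallel restore:
pg_dump -Fd -j 8 --compress=zstd:5 -f /backups/app_db.dir app_db
pg_restore -j 8 -d app_db_new /backups/app_db.dir
# Streaming without an intermediate file works, but only single-threaded:
pg_dump -Fc app_db | pg_restore -d app_db_new
```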
Makes sense.
Yeah, I needed it yesterday, copying a huge... well, not huge, not huge at all, like 10 or so gigabytes, a tiny table with big vectors. I mean, a lot of vectors.
By the way, let's come full circle and mention at the end that although we joked about lossy compression using LLMs, when you have a lot of dimensions, huge vector values, it means TOAST is definitely involved. It's interesting how it's compressed there.
In this case, just a simple idea: sometimes reducing the number of dimensions is kind of lossy, but it's also kind of compression, right? And OpenAI speaks about it. It's also an interesting area: how to do it without losing a lot of quality.
Michael: On vectors, though, I think Jonathan Katz made a really good point on the episode we did for Postgres FM: because the vector data, at least the embeddings that come back from models like OpenAI, is mostly random numbers, it doesn't compress well. So there's actually, and I'm not sure whether he's done it yet or is planning to do it, work to look into whether it would be beneficial to turn compression off for TOAST of vector data, because you're not getting much compression and there's no point paying the overhead of compressing and decompressing each value. So I thought that was super interesting as a use case for turning it off.
Nikolay: Yeah, interesting.
I would like to explore it myself.
It's a very interesting area.
So I think we discussed maybe 5
different directions for benchmarking.
I would like to conduct these benchmarks.
Michael: Cool.
Let us know how you get on with
all 5 by next week, right?
Nikolay: Yeah.
Well, yeah.
So thank you.
It was interesting.
Many questions.
Michael: There you go.
Not boring after all.
Thanks so much, Nikolay.
Catch you next week.