Nikolay: Hello, hello, this is Postgres FM.

I'm Nik, PostgresAI, and as usual with me here is Michael,

pgMustard.

Hi, Michael.

Michael: Hello, Nik.

How's it going?

Nikolay: Very good.

How are you?

Michael: I am good, thank you.

Nikolay: We didn't record last week because I was on a trip

in the Oregon forest, having some fun, mostly disconnected from the internet.

Yeah, so now I've returned, and you said that we should discuss sequences

somehow, right?

Michael: Yeah, so I was looking back through our listener suggestions.

So we get, we've got a Google doc where we encourage people to

comment and add ideas for us to discuss topics.

And whenever I'm short of ideas, I love checking back through

that.

And 1 of them from quite a long time ago actually caught my eye,

and it was the concept of gapless sequences. And I guess this

might be a couple of different things, but I found

it interesting both from a theoretical point of view and

in terms of practical solutions, as well as being 1 of those

things that... a sequence with gaps is 1 of those

things that catches most engineers' eyes.

Like if you start to run a production Postgres, you will see

occasionally an incrementing ID and then a gap in it.

And you think, what happened there?

So it's 1 of those things I think most of us have come across

at some point and been intrigued by.

So yeah, there's a few interesting causes of that.

Nikolay: The name sequence should mean it's sequential.

Why the gap?

It's unexpected.

And by the way, this episode, is it number 163, or because I missed

last week will it be number 164?

Michael: Do you know what, it would be quite funny to, should

we increment the episode count?

Nikolay: Yeah, what's the number because I was telling, yeah

Michael: Either we should do, yeah, 164, missing 1, and then do 163

next week as like a joke, because it's like coming in too late,

or we just carry on increasing the number.

Nikolay: This is another anomaly you can observe sometimes, because

we can have many users taking next numbers from a sequence all the time,

and then at commit time the order can be different, of course, right?

Michael: Yeah, I forget the name for it, but whatever

that phenomenon is, we should discuss that next

week and have the...

Nikolay: It's not serializable, right?

So if you have 2 transactions, and, for example, you have a support

system, a ticket tracking system, and you generate ticket numbers,

you think sequentially.

1 user came, opened a ticket, but hasn't committed yet.

Another user came, opened a ticket, and that ticket has a bigger

number, the next 1, right?

And it committed already, and then you committed the first 1, so you

see 1 ticket created before the other, right?

But at the same time, if you generate a timestamp automatically,

with a created_at column with default clock_timestamp or something,

and the INSERT happened at the same time the sequence

nextval call happened,

in that case, created_at values will have the same order as sequence

values, like ID column values.

Right.

So there will not be a problem when you order by

those tickets, but normally it can be a surprise: oh, there is a ticket

number like 10, and then number 9 becomes visible later, because we don't

see uncommitted writes, right? So it should be committed first

before it's visible to other transactions, other sessions.
Yeah.

Yeah.
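A minimal sketch of that setup (table and column names are hypothetical):

    -- id comes from a sequence at INSERT time; created_at is assigned
    -- at the same moment via clock_timestamp(), not at commit time
    CREATE TABLE tickets (
        id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        title      text NOT NULL,
        created_at timestamptz NOT NULL DEFAULT clock_timestamp()
    );

    -- Session 1: BEGIN; INSERT ...  -- gets id 1, but doesn't commit yet
    -- Session 2: BEGIN; INSERT ...  -- gets id 2, commits immediately
    -- Readers see ticket 2 before ticket 1, but ORDER BY id and
    -- ORDER BY created_at still agree, because both values were
    -- assigned at INSERT time, not at COMMIT time.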

But gaps are a different anomaly.

This anomaly is very well known because, for the sake of

performance, the sequence mechanism in Postgres, which has existed for

ages, just gives you the next number all the time, and of course,

if you, for example, decide to roll back your transaction, you

lose that value.

So this is the number 1 thing.
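A quick demonstration of that (the table is hypothetical):

    CREATE TABLE t (id bigserial PRIMARY KEY, payload text);

    BEGIN;
    INSERT INTO t (payload) VALUES ('a');  -- consumes id 1
    ROLLBACK;    -- the row is gone, but id 1 is not returned to the sequence

    INSERT INTO t (payload) VALUES ('b');  -- gets id 2: a permanent gap at 1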

Michael: Yes, exactly.

So it's to allow for concurrent writes, isn't it?

So imagine, within a microsecond, 2 of us

trying to INSERT into the same table: if I am just before

you and I get assigned the next value in a sequence, and then

my transaction fails and is rolled back, you've already been assigned

the next value after me. So yeah, I think that's super interesting.

So I think that's probably the most common cause, in fact possibly

not, but that's the 1 I always see as the example

given as to why.

Nikolay: Yeah, so as not to think about going back to previous

values. This is your value, and it's like fire and forget:

this value is wasted and the sequence has shifted to a new

value. Although you can reset it: there's nextval, there's

setval, there's currval, and currval requires a nextval call in the

session first before you can use it, and then setval.

So you can shift a sequence back if you want, but it's global for

everyone.
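For reference, the 3 functions just mentioned look like this (the sequence name is hypothetical):

    SELECT nextval('t_id_seq');            -- advance and return the next value
    SELECT currval('t_id_seq');            -- last value nextval returned in THIS
                                           -- session; errors before any nextval
    SELECT setval('t_id_seq', 100);        -- shift the sequence, for everyone
    SELECT setval('t_id_seq', 100, false); -- make the next nextval return 100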

And Also interesting, Sequence
is considered a Relation as well,

right?

Michael: Yeah, we discussed this
recently, didn't we?

Yeah, yeah,

Nikolay: in pg_class you see relkind equals, you said, capital

S, right?

Michael: Capital S. And by the way, I was wrong, it's not the

only 1 with a capital. There is 1 other. Do you know the other

1?

Nikolay: No.

Michael: Indexes on partitions.

Okay.

Capital I.

Nikolay: Okay.
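That's easy to check in the catalog:

    -- 'S' marks sequences, 'I' marks indexes on partitioned tables
    SELECT relname, relkind
    FROM pg_class
    WHERE relkind IN ('S', 'I');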

Michael: So we've got 1 cause already you've mentioned, which

is transactions rolling back. I want to go through a bunch of

other causes, but before that, should we talk about, like, why

would you even want a gapless sequence?

Like, we've got sequences, and sequences with the odd gap

in them are fine for almost all use cases.

Should we talk a little bit about
why even bother?

Like why even discuss this?

Why is it a problem?

Nikolay: Well, expectations, I
guess, right?

You might have expectations.

Michael: So I think I've only got a couple here.

I'm interested if other people have seen others, but 1 I've

got is user-visible IDs that you want to mean something.

There was a really good blog post on this topic by some folks

at incident.io, and it was actually old friends of mine from GoCardless

days, and they wanted incident IDs to increment by 1 for their

customers so they could refer to an incident ID. And if they've

had 3 that year, it's up to 3, and then the fourth 1 gets assigned

incident 4. And it's not ideal, if they want them to

mean something, for them to miss the odd 1, and much worse to miss

like 10 or 20 in a row.

Nikolay: So they obviously have many customers.

So it's a multi-tenant system, right?

Yeah. And do they create a sequence for each customer?

Michael: Well they did initially.

Nikolay: Okay, yeah, I'm asking because I saw this in other

systems, and I remember the approach when we have a sequence just

to support primary keys, unless we use beautiful UUID version 7.

Yeah.

Well, with some drawbacks, but
overall it's winning in my opinion

these days.

But for each customer, like the namespace of each client ID or

organization ID, doesn't matter, project ID, we might want to

have an internal ID.

An internal ID which is local, right?

And then we shouldn't use sequences for that.

It would be overuse of them, because each customer has like thousands

or millions of rows, we can handle that, and collisions

would happen only locally for this organization, project, or

customer, right?

Which is great.

Yeah, right.

So, so yeah.

And for sequences, the only thing we care about is uniqueness,

in my opinion.

Michael: Yeah, yeah, you're right.

Uniqueness is, but that's the job
of the primary key, right?

It's also the fact they only go
up, I think.

Nikolay: Yeah, yeah, well, unless somebody rewinds it, right?

Michael: Setval.

Nikolay: Setval, exactly.

So, and the capacity, like, just forget about it, because it's

int8 always for any sequence.

I noticed in some blog posts you shared with me, not this 1, different

ones, that they used int4 primary keys.

I very much welcome this move because these are our future clients.

Yeah.

So very good move.

Everyone, please use int4 primary keys, and later, if you're

successful and have more money, you will pay us to fix that.

Yeah.

Michael: I like you flipping the
advice.

So wait, but you said something
interesting then.

So you said sequences are always
int8.

So even if I have an int4
primary key, the sequence behind

it is int8?

Nikolay: A sequence is an independent object. Well, relatively

independent, because there can be a dependency, which is also a weird

thing, like OWNED BY, right?

It might be standalone, but it also might belong to a column of

a table, with ALTER SEQUENCE ... OWNED BY some column, right?

But overall it's just a special mechanism, int8 always,

and it just gives you the next number, next number. That's it.

Simple.
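A sketch of that ownership link (the names are hypothetical):

    CREATE SEQUENCE order_number_seq;

    CREATE TABLE orders (
        order_number bigint NOT NULL DEFAULT nextval('order_number_seq')
    );

    -- Tie the sequence to the column: now it's dropped when the column is
    ALTER SEQUENCE order_number_seq OWNED BY orders.order_number;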

Michael: Yeah.

So yeah, by the way, I wasn't talking about...

So incident.io did use sequences initially, and it turned out to

be a bad idea, but all I meant was that that's a use case for

not just monotonically increasing IDs, but IDs that increase

by exactly 1 each time.

So that's 1 use case for, like, the concept of gapless sequences.

And another 1 came up in the blog post by Sequin that I shared

beforehand, and I'll link it up in the show notes again, and that

was the concept of cursor-based pagination. So, I think it's

very similar to keyset pagination, but based on an integer only.

I guess for those it's most important that it monotonically,

that it only increases, but also that concept of committing

out of order becomes important.

So if we read rows that are being inserted right now, there might

be 1 that commits having started earlier than a second

1 that hasn't yet committed.

The example they give is: we could see IDs 1,

2, and 4, and later 3 commits, but we only saw 1, 2, and 4 at

the time of our read. So if we were paginating and got the first

set and it went up to 4, and then we only looked for ones above

4, we've missed 3.

So that's an interesting definition of a sequence where you don't

want there to be gaps maybe at any point.
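A sketch of the pagination pattern being described (table and values are hypothetical):

    -- Page through rows by remembering the last id we saw
    SELECT *
    FROM events
    WHERE id > 4        -- the highest id from the previous page
    ORDER BY id
    LIMIT 100;
    -- If id 3 was uncommitted during the previous read, it becomes visible
    -- only after we've already moved past it, so it is never returned.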

Nikolay: You know, I'm looking at the documentation right

now, and I think it would be great if this thing were called not sequence

but something like generator, a number generator or something.

Because sequence, it feels like it should be sequential and gapless,

like, it's just some feeling, you know.

This gives false expectations to some people, not to everyone.

Of course, the documentation says CREATE SEQUENCE defines a new sequence

generator.

So generator is a better word for this.

And I think the documentation could be more explicit in terms of

the gaps to expect.

So yeah, in my practice, it happened more than once

that people expected them to be gapless somehow, I don't know.

A lot of new people are coming
to Postgres.

Michael: All of us were new once,
right?

I definitely experienced this.

I think for us, moving on to a second cause of this, the reason

we were getting them was using INSERT ON CONFLICT.

So it was something around having new users that had been added

by somebody else in the team, for example.

So the user had already been created behind the scenes because

somebody invited them, and then when they signed up, we were

doing an INSERT ON CONFLICT DO UPDATE or something like that. And

then, as part of that, nextval was called just in case

we needed to insert a new row, but we ended up not needing to

because it was an update instead.

So I think you can get these again through INSERT ON CONFLICT.
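A minimal sketch of how that happens (the table is hypothetical):

    CREATE TABLE users (
        id    bigserial PRIMARY KEY,
        email text UNIQUE NOT NULL,
        name  text
    );

    -- nextval runs while the candidate row is built, before the conflict
    -- is detected, so the id is consumed even on the UPDATE path
    INSERT INTO users (email, name)
    VALUES ('invited@example.com', 'Ann')
    ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name;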

Nikolay: Yeah, and actually the documentation mentions it.

Oh, cool.

It mentions it, though I think it could still be mentioned more explicitly,

maybe in the beginning and so on.

And the thing is, someone might consider sequences as not ACID,

right?

Because if a rollback happens, they don't roll back.

For the sake of performance, obviously.

Michael: So it violates atomicity, does it?

Nikolay: Yes or no?

Yeah, so if other things, other writes, are reverted, this change,

that we advanced the sequence by 1, we shifted its position, is

not rolled back.

So our operation is only partially reverted, if we look at it strictly.

For the sake of performance, it's pretty clear, but yeah, so it's

kind of not fully ACID, and that's okay. It's just, you need to

understand it, and that's it. Yeah, and I can

understand the feelings of people who come to Postgres now, and

just from the naming they expected it, but then, boom.

It's a simple thing to learn.

Michael: Another case where naming
things is hard.

Nikolay: So, yeah, for me it's a number generator, huge capacity,

8 bytes, and it gives me a tool to guarantee uniqueness

when we generate numbers.

That's it.

Very performant.

Very, very.

I never think about its performance, because rollback is not supported.

That's it.

Let's go. And yeah, but let's talk about it again. Like, if we really

need it... I would think, do we really need it, or can we be okay

with gaps?

If we really need it, I think we should go with specific

allocation of numbers, maybe additional ones, not primary keys,

right?

Michael: Yeah, well, personally, I think this is a rare enough

need that it's not needed by every project, I don't

Nikolay: think, right.

Michael: I've run plenty of projects that have not needed this

feature, so I personally think there's no necessity to build

it into Postgres core as a feature, as, you know, a sequence

type or something. But I do think it's interesting enough, like,

it seems to come up from time to time. And I think there are

neat enough solutions, at least at lower scales. I'm sure there

is a solution at high scale as well, but there are simple enough

solutions at lower volumes that I think there's no necessity,

I don't think, for a pre-built solution that everyone can use.

Nikolay: A high-performance solution?

It's impossible, because if there is a transaction which wants

to write number 10, for example, but it hasn't committed yet,

and we want to write the next number, or also number 10, it

depends on the status of that first transaction.

We need to wait for it, right?

It creates a natural bottleneck.

Yeah.

And I cannot see how it can be done differently.

We need to wait until that transaction finishes.

We need to serialize these writes.

And again, for me, the only trick in terms of performance here

is to use the fact that, if we have a multi-tenant system, we

can make these collisions very local to each project or organization

or tenant, right?

So they compete only within this organization, and other organizations

are separate in terms of these collisions.

Michael: And ultimately, then, it's about parallelizing writes,

which I think is then sharding.

Yeah.

So if you've got the multi-tenant system across multiple shards,

you can then scale your write throughput. So yeah, it feels to

me like another case of that probably being the ultimate solution.

Nikolay: well, if you have sharding
and distributed systems,

it's like

Michael: Across shards, I don't
mean

Nikolay: yeah locally locally

Michael: Yeah, exactly.

If you've got a tenant that's local
and you can...

Nikolay: Because if you want a purely sequential, gapless number generator

for distributed systems, it's a whole new problem to solve.

You basically need to build a service for it and so on.

But again, if you make...

So you should think about it.

Okay, we will have thousands of writes of new rows inserted per

second, for example, soon.

What will happen?

If the collision happens only within the boundaries of 1 tenant,

or project, or organization, doesn't matter,

it's not that bad, right?

They can afford inserting those rows sequentially, 1 by 1, and

maybe within 1 transaction, or some transactions will wait, but

maybe just 1.

So maybe this will affect our parallelization logic.

So saying, let's not deal with multiple tenants and multiple

backends and transactions.

Let's do it in 1 transaction always.

But if we write thousands of rows per second and they belong

to different organizations, collisions won't happen, right?

Because they don't compete.

So this dictates how we could build this high-performance, gapless

sequence solution.

We just should avoid collisions between tenants, for example.

That's it.
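1 way to sketch that per-tenant approach is a counter table, where the row lock serializes writers within a single tenant only (all names here are hypothetical):

    CREATE TABLE tenant_counters (
        tenant_id  bigint PRIMARY KEY,
        last_value bigint NOT NULL DEFAULT 0
    );

    -- Run inside the same transaction as the INSERT that uses the number;
    -- concurrent writers for tenant 42 queue up on the row lock,
    -- while other tenants proceed in parallel
    UPDATE tenant_counters
    SET last_value = last_value + 1
    WHERE tenant_id = 42
    RETURNING last_value;

    -- If the transaction rolls back, the counter rolls back too: no gap.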

Michael: Yeah.

But we've jumped straight to the
hardest part.

Should we talk about a couple more
of the kind of times that

you might...

Nikolay: Oh, surprises!

Yeah, so rollback is 1 thing which can waste your precious numbers.

Another thing is, and I learned about it and forgot, then relearned

when you sent me these blog posts, there is a hardcoded constant:

32 values are pre-allocated.

Actually, I think there is a constant, and I think there is some

setting.

Maybe I'm wrong, but there should be some setting.

Yeah, with which you can say, I want to pre-allocate more.

Michael: Oh, I didn't come across that.

So we've got SEQ_LOG_VALS, that's the hard-coded 1, right?

Nikolay: Yeah, maybe I'm wrong actually.

So there are pre-allocated values.

And can we control it?

No, we cannot control it, right?

32.

Ah, there is cache, right?

What is cache?

When you create a sequence, you can specify the cache parameter

as well.

Michael: Okay, so what does that control?

Nikolay: Yeah, so this controls exactly this.

If you don't set it, it will be 32.

Michael: Oh, okay.

So it's defined on a per-sequence basis.

Nikolay: Per sequence.

You can say I want 1000.

Pre-allocate.

Michael: What if we set it to 1?

Nikolay: Well, only 1 will be pre-allocated, right?

1 is minimum, actually.

Michael: 1 is minimum.

Nikolay: Yeah.
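For reference, the syntax looks like this (the sequence name is hypothetical):

    CREATE SEQUENCE s CACHE 1000;  -- each session grabs 1000 values at a time
    ALTER SEQUENCE s CACHE 1;      -- back to the documented default of 1
    -- Values cached by a session and never used become gaps, so a large
    -- CACHE trades bigger potential gaps for fewer sequence fetches.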

Actually, it's also interesting. Maybe I'm wrong, because there

is also...

Yeah, so I'm confused.

The documentation about this parameter says 1 is the default, but

we know there is also the 32 hardcoded constant.

Anyway, this hardcoded constant can be associated with a gap of 32.

So when, for example, a failure happens, or you just fail over or

switch over to a new primary, which should be a normal thing,

right? You change something on your replica, switch over to it.

This is when you might have a gap of 32, which is described

in 1 of those articles.

So I'm not sure about this cache parameter, right?

So maybe if you change it, it's only a cache of pre-allocated values,

and that's it.

Maybe specifying it won't lead to bigger or smaller gaps.

I'm not sure about that.

So maybe there are 2 layers of implementation here.

But based on the articles, we know there are gaps of 32.

And this is just common, right?

And interestingly, this is connected to recent discussions we

had with 1 of the big customers, who have a lot of databases.

And we discussed major upgrades.

And we have a 0 downtime, 0 data loss, reversible upgrade solution,

which multiple companies use.

And 1 of the most fragile parts of it is when we switch over.

We do the switchover to the logical replica basically

without downtime, thanks to pause and resume in PgBouncer.

Also, Patroni supports it.

So we pause and resume.

And between pause and resume, where a small latency spike in transaction

processing happens, we redirect PgBouncer to a new server.

And that server by default has sequence values corresponding

to initialization, because logical replication in Postgres

still doesn't support this; there is work in progress,

I think it's close.

It doesn't replicate values of sequences.

So the question is how to deal with it.

There are 2 options.

First, you can synchronize sequence values during this switchover,

but it will increase this spike.

We don't want that, because we achieved a spike of just a few seconds.

That's it.

It feels like really pure 0 downtime.

And if we start synchronizing sequences, the spike will increase.

Especially since some customers had like 200,000 tables; it's insane.

But okay, if it's only 1,000 tables, I thought, well, I still don't

want it.

Actually, 1 of the engineers on the customer side said, you know

what, this setval is not too long.

If we quickly read it, quickly adjust it, maybe, okay, another

second.

And testing shows, yeah, exactly: changing the position of

sequences is super fast, actually.

Yes, if you have hundreds of thousands of tables and sequences,

it will be quite slow.

But if it's only a few, you can do it quite quickly, and maybe you can

parallelize it, but that will make things more complicated.

But another solution, which I like much more: we just advance

it beforehand, before the switchover, with some significant gap.

Like, I say, check how many values you spend during a day or 2.

Millions, 10 million, advance.

We have enough capacity for our life.

8 bytes, it's definitely enough.

So, yeah, just bump it by like 10 million.

But then, that works when it's, you know, your system, like 1,000 or 2,000

tables, just 1 system, and you know these big gaps are fine.

But when you think about very, very different projects, thousands

of clusters, you think, oh, maybe some of them won't be happy with

big gaps.

You know?

This is a hard problem to solve.
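A rough sketch of that bump-ahead approach (the 10 million margin is arbitrary; run the generated statements on the new primary before the switchover):

    -- On the old primary: generate setval calls with a big safety margin
    SELECT format('SELECT setval(%L, %s);',
                  quote_ident(schemaname) || '.' || quote_ident(sequencename),
                  coalesce(last_value, 0) + 10000000)
    FROM pg_sequences;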

Michael: And if you go back in
the other direction, let's say

you want to be able to fail back
quickly, that's another gap.

So each time you bounce back and
forth.

Nikolay: Yeah.

Yeah.

Yeah.

Since our process is fully reversible,

it's really blue-green deployments.

Every time you switch, you need to jump, and we recommend jumping

big, like, we have big gaps, and I would say you should be fine

with it. But I can imagine...

Michael: Why not smaller gaps?

Why not?

Like why not?

Let's say it's a two-second pause.

Nikolay: Yeah, if you know there won't be spikes of writes right

before your switchover, well, we can do that.

But it's just, the risks of overlapping increase.

If you did it wrong, after the switchover some inserts won't work,

because those sequence values were already used, right?

Michael: With duplicate key, or, yeah.

What would the actual errors be?

Duplicate key violations?

Nikolay: Yeah, so your sequence...

But yeah, it will heal itself, right?

Thanks to the nature of sequences, which waste numbers: you insert,

get a duplicate key error, retry, and, oh, it works.

It's funny.

Yeah, anyway, I always preferred to be on the safe side and do

big jumps. But when you think about many, many clusters and the things

of many people, it's a different kind of problem to have.

And so I'm just highlighting: the gaps are fine.

But what about big gaps?

Yeah, you know, sometimes it can look not good.

In this case, we are still thinking maybe we should just implement

2 paths, you know: by default we do a big jump, but if somebody

is not okay with that, maybe they would prefer a bigger spike,

or a bigger, like, maintenance window, like, okay, well, up to 30

seconds or so, while we are synchronizing those sequences

and don't allow any gaps. For me, naturally, knowing how

sequences have worked for years, gaps should be normal, right?

Michael: Yeah, it's so interesting, isn't it, the trade-offs

that different people want to make.

Nikolay: You know the solution to this?

Michael: Yeah.

Pardon me?

Nikolay: You know the good solution to this?

Finally start supporting sequences in logical replication, that's

it.

Michael: Yeah, that would be...

Well, yeah, and that might not be too far away, so yeah.

I think

Nikolay: so, I think so.

This work in progress has lasted quite some years.

It's called logical replication of sequences, or synchronization

of sequences to the subscriber.

And there have already been multiple iterations, since 2014, I think.

And it has chances to be in Postgres 19, but it requires reviews.

It's a great point for you to take your Claude Code or Cursor

and ask it to compile and test it and so on, and then think about

edge cases, corner cases.

Even if you don't know C, this is a great point to provide some review.

You just need to be an engineer, like, writing some code; you will

understand the discussion, the comments, it's not that difficult.

So I encourage our listeners to participate in reviews, maybe

with AI, but there will still be value if you consider yourself

an engineer.

You will figure out over time which value you can bring.

The biggest value in testing is to think about various edge

cases and corner cases as a user, as a Postgres user, right?

And try to test them, and AI will help you.

Michael: Yeah.

Well, I also think we do have several experienced Postgres, like,

C developers listening, and I think it's always a bit of a challenge

to know exactly which changes are going to be the most beneficial to users,

because you don't always get a representative sample on the mailing

lists.

I think sometimes a lot of the people asking questions are

at the very beginning of their journey.

They haven't yet worked out how to look at the source code to

solve problems, so you don't always get some of the slightly

more advanced problems reported, because people can work

around them, and I think this is 1 of those ones that people have

just been working around for many years.

A lot of consultancies deal with this in different ways, but

it is affecting every major version.

There is friction there. So if any hackers, any experienced hackers,

are also wondering, like, which changes could I review that would

have the biggest user impact, this feels like 1.

Nikolay: This feels like 1 so many people wanted.

Logical replication is used more and more, like for blue-green deployments

and so on.

And, for me in the past, if I looked at this... let's

include, by the way, the commitfest entry, so people can look at

it and think about whether they can review and help with testing.

So in the past, I would think, okay, to test it, first

of all, what do I need? This is about logical replication and behavior.

I need logical replication, setting up 2 clusters with logical

replication.

Oh, yeah, okay.

I have better things to do, actually, right?

Now you can just launch your Claude Code or Cursor and say, I have

Docker installed locally on my laptop, or something.

Please launch 2 containers, different versions maybe, create

logical replication, and let's start testing.

And then, like, not containers:

if containers work, now you can say, okay, now I want some of

them built locally from source code, and then the same thing.

And you don't need to set up logical replication yourself,

that's it.

So, yeah, these roadblocks can be eliminated by AI, and

then you focus only on the use cases where this thing can be broken,

and this is where you can start contributing.

You just need to be a good Postgres user, that's it.

Michael: Yeah, nice.

Nikolay: Good.

Just being able to distinguish a logical replica from a physical

replica manually, that's the only thing you need to know to start.

Yeah, good.

Okay.

So, are there any other cases where
we can experience gaps?

Michael: Well, I only wanted to talk about 2 more things for sure.

1 is, why 32?

Why do we pre-allocate these?

I think that's interesting.

And 2, what can you actually do about it? Like, I thought

the incident.io 1, especially at lower volumes, there are

some neat solutions.

Those were the last 2 things I had on my list.

Nikolay: Well, for performance, we pre-log it, right?

Because technically it's a page; it's also like a relation which

stores a value and so on, right?

Michael: Well, I got the impression from a comment in the source

code that it was...

So let me read it exactly:

we don't want to log each fetching of a value from a sequence, so

we pre-log a few fetches in advance.

In the event of a crash, we can lose, in brackets, skip over, as

many values as we pre-logged.

So I got the impression it was to avoid spamming the WAL.

Nikolay: Yeah, it's optimization
technique, that's it.

Michael: So I could imagine a case where you'd want to pay that

trade-off the other way around. And it's good to know,

as you mentioned, that you can reduce it on a per-sequence basis.

Nikolay: I think it's different.

I think what you can reduce is cache, but that's not the thing that

goes to the WAL.

I'm not 100% sure here.

I just think you still lose 32.

Because these are 2 different things:

1 is a hard-coded constant value, the other is dynamic, controlled

by the user.

But maybe I'm wrong again here.

It's a good question to check, but it's a nuance.

For me, a sequence always has gaps, that's it.

Michael: And it's okay.

So okay, the last thing was solutions.

And I thought the incident.io 1 was really neat, but also, oh,

it's very, very simple. I like simple solutions that work for now,

and we can solve later problems later.

And it was just to do a subquery and read the current max value

and increment it by 1, so not using sequences, of course.

Nikolay: Yeah, no sequences, it's just reading.

It reminds me of the episode we had with Haki Benita, right?

And the problems.

Yes, yeah.

Get or create, or something like this, right?

Yes.

So basically, we need to read the maximum value

and add 1, but maybe others are doing the same thing in parallel.

And how to deal with performance in this highly concurrent environment?

Again, the key for me is to narrow down the scope of collisions.

That's it.

So contention would be local to...

Michael: So there are multiple options, right?

You could just implement, I say just, as if it's simple...

Retries are 1 option.

If you expect collisions to be super, super uncommon, retries would

be a solution. But the Sequin blog post actually

goes into a bit of depth on how you could scale this if you

are doing tons, like a lot per second.

So that's an interesting solution.

There's way too much code to go into now, but I'll link that

up in the show notes.
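A minimal sketch of that max-plus-1 approach with retries, scoped per tenant (all names and values are hypothetical, loosely modeled on what the incident.io post describes):

    CREATE TABLE incidents (
        organization_id bigint NOT NULL,
        incident_number bigint NOT NULL,
        title           text,
        PRIMARY KEY (organization_id, incident_number)
    );

    -- Compute the next gapless number inside the INSERT itself
    INSERT INTO incidents (organization_id, incident_number, title)
    SELECT 42, coalesce(max(incident_number), 0) + 1, 'API latency spike'
    FROM incidents
    WHERE organization_id = 42;
    -- 2 concurrent inserts for the same org can compute the same number;
    -- the primary key rejects 1 of them, and the application retries.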

But yeah, I did think there's a range of solutions. Like,

if we have a multi-tenancy system, like incident.io for example,

hopefully most organizations are not going to be creating thousands

of incidents per day, never mind per second.

So the chance of collisions or issues there is so low that

it's almost a non-issue. Whereas, for a different use case, I actually

can't think of a use case needing a gapless sequence

that can insert thousands per second.

So I just don't see that being a... well, I'd love to hear from

people that have seen that, or have had to deal with that, and

what they did.

Nikolay: Thousands per second?

Michael: Yeah.

For a gapless sequence, like.

Yeah.

Where it's important not to have
gaps.

Nikolay: Yeah, yeah, yeah, yeah.

Yeah, because if you have a lot of inserts, you have big numbers.

So the idea is: the desire to have gapless sequences matters when we have

only small numbers.

Michael: I think it's more important,
right?

Nikolay: Yeah, maybe.

Also, the little 32

Michael: would disappear quickly.

Imagine a gap of 32, it would disappear quite quickly.

If you're, if you're...

Nikolay: If it's a big number, you stop paying attention to it.

Yeah, maybe, maybe.

Michael: And also, I don't think
computers care about gaps.

I think it's humans that care.

Yeah.

Personally, I don't know.

Nikolay: Yeah.

Well, with sequences, I remember
it was 2005-06 when we wanted

to hide actual numbers of users
and things created in our social

network.

So we used 2 prime numbers and
set default next val from sequence

multiplied by 1 big number and
then a module of the different

number.

So it was like fake random, you
know, to hide it.
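Roughly, the trick looks like this (a sketch; the constants here are arbitrary, not the ones we used, and the multiplier should be coprime to the modulus so the mapping stays 1-to-1):

    CREATE SEQUENCE posts_id_seq;

    CREATE TABLE posts (
        -- scrambled public id: hides how many posts exist in total
        id bigint PRIMARY KEY
            DEFAULT ((nextval('posts_id_seq') * 1588635697) % 4294967291)
    );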

I figure you can still, if you create some of the things yourself

and see the numbers, quickly understand the logic, so

you can still hack it and understand the actual growth rates.

But it's hard to understand the absolute value this way.

You don't know how many things there are, compared to people who don't

care and just use 1 global sequence for all users, and okay,

the number of posts, it's like 1 million something.

Okay, this platform has 1 million posts.

It gives some signals to your competitors, right?

Michael: So, have you... I learned today what that is generally

called: it's called the German tank problem. Have you heard of this?

Nikolay: No.

Michael: Maybe not the first, but the first famous

case of this was, I think, in World War II: the Allies were using

the numbers, like an incrementing number found on German tanks,

to find out how many they were going through, like,

what their production capacity was.

It was a useful thing in the war.

So yeah, this is older than computers.

Nikolay: Yeah, it reminds me how the guys from my former

country went to your country to poison some guy, and their passports

were sequential.

That's how they were tracked.

Yes.

So stupid, right?

I mean, sometimes gaps are good,

if you want to hide some things.

So if you build some system, maybe you actually want gaps.

Michael: Yeah, that's the next
episode, another different episode.

Nikolay: How to build gaps.

Michael: Gapful sequences, yeah.

Nikolay: Some random gaps so everyone
doesn't understand how

many.

Michael: Yeah, just UUIDV4, right?

Nikolay: Random jumps.

Yeah, so that's it.

I also wanted to mention that a sequence has a few more

parameters you can specify, like MINVALUE, MAXVALUE, and

you can say it should be in a loop.

I don't know why, I never used it.

Cycle, it's called CYCLE.

So you can specify from 1 to 1000 and cycle.
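Which looks like this:

    -- Wraps back to 1 after 1000; without CYCLE, nextval would error
    -- once the sequence reaches MAXVALUE
    CREATE SEQUENCE loop_seq MINVALUE 1 MAXVALUE 1000 CYCLE;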

Michael: So it doesn't, for example, need to be on a primary key.

Well, it couldn't be on a primary key, that 1.

Nikolay: Yeah.

I would just use the percent operator, modulo: just divide by

something and get the same effect.

But...

Michael: Yeah, I guess it's similar
to transaction IDs.

If you think about how transaction
IDs look.

Nikolay: Wrap around, yeah.

If you want to wrap around, go
for it.

Yeah, I'm very curious about use cases for this, I never

used it. But also, you can specify the increment, a jump, like only odd

numbers, for example, right?

Michael: Yeah, or any positive increment might be more common.
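For example:

    CREATE SEQUENCE odd_seq START 1 INCREMENT 2;      -- 1, 3, 5, ...
    -- The classic trick for 2 writers sharing 1 id space:
    -- CREATE SEQUENCE even_seq START 2 INCREMENT 2;  -- 2, 4, 6, ...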

Nikolay: We want to increment by random.

This will be our random gaps to
fool everyone.

Yeah.

Okay, good.

Enough about sequences.

Thank you for the topic.

Michael: Likewise.

Good to see you and catch you soon.
