Postgres FM | Transcript: Rails + Postgres

March 1, 2024 • 45 Minutes

Rails + Postgres

Michael: Hello and welcome to PostgresFM, a weekly show about

all things PostgreSQL.

I am Michael, founder of pgMustard.

As usual, I'm joined by my cohost,
Nikolay, founder of

Postgres.ai.

Hello Nikolay.

Nikolay: Hi Mike.

Michael: And today we are also
joined by a special guest, Andy

Atkinson, Andrew Atkinson, as we
were saying earlier, who has

been a software engineer for 15
years at the likes of Groupon,

Microsoft, and Fountain, a user
of PostgreSQL for the past 10,

speaker at several conferences,
and now a published author with

the brand new book, High Performance
PostgreSQL for Rails.

It's a pleasure to have you on,
Andy, welcome.

Andrew: It's a pleasure to be here,
thanks for having me, guys.

Really fun.

Michael: Nice, well, I'm gonna
let you in on a little secret

here that I didn't tell you beforehand.

You are actually our first guest
who has been requested by a

listener.

So somebody specifically asked
for you to come on, which is awesome.

Andrew: Wow.

Michael: And with the topic Postgres
plus Rails.

So that's what we're going to be
talking about.

Andrew: I guess asking my mom to
do that really worked out.

No, that's great to hear.

Michael: Awesome.

Well, yeah.

Who better at the moment, you know,
you've got this stuff fresh

in your mind.

So yeah, in terms of where to start,
I thought it'd be really

interesting to hear from your side,
how popular a choice is Postgres

for Rails?

Like what's it competing with and
yeah, how's that been changing

over time?

Andrew: Yeah, good questions.

Well, the ORM in Ruby on Rails
is called Active Record and it

supports Postgres and MySQL/MariaDB, and SQLite.

So whenever you're generating a
new Rails app, that's one of the

first decisions you make, I believe
by default it's SQLite.

But most folks that have apps in
production, I still think it's

usually, they're using Ruby on
Rails, they're usually working

with MySQL or Postgres.

In kind of prepping for this, there
are a couple of surveys where

folks have been responding to surveys
about how they deploy their

apps.

And there was a 2022 survey from
Planet Argon, which I had

pulled up here.

And there were about 2,600 responses.

And from 2014 onward, Postgres
has been the most popular relational

database with Rails apps.

And I guess what I've seen is kind
of this shift, starting in

maybe the early 2010s, where I
think in large part, thanks to

Heroku having Postgres support
and Heroku being a really popular

choice, easy place, as you guys
have talked about it before,

made deploying your Rails app onto
the internet much easier,

pretty much a git push command,
and took away a lot of operational

toil that a lot of folks might
otherwise take on themselves.

And I think a lot of folks may
have switched to Postgres just

because it was part of the package
there, and maybe they weren't

deeply using relational database
features.

And so I think there was a lot
that had to do with Heroku.

And then of course, as also, as
you guys have talked about, Postgres

gaining a lot of features, a lot
of performance over the last

10 years.

So I kind of noticed this general
setup where a lot of the big

Ruby on Rails, the companies that
famously use Ruby on Rails,

like GitHub, Shopify, Basecamp,
that kind of started in the 2000s,

they tended to choose MySQL at
the time and they've stuck with

it for the most part.

They maybe are using, you know,
clustering solutions and things,

but companies that started more
in the 2010s or mid-2010s, I

feel like a lot of times they're,
they're running Postgres.

And so I'm certainly, you know,
now I'm doing independent consulting

with Rails teams that use Rails
and Postgres.

And I'm certainly, you know, I'm,
clients are coming to me that

are using Postgres.

So for whatever that's worth, there's
definitely companies out

there using it.

Michael: Yeah, I was going to ask
Nikolay as well and then I

realized that all of us have a
super biased sample but that does

that definitely does make sense.

That data you mentioned that survey
is really cool as well.

I saw the question below was
about which one would you like

to be using in production?

And it changes just 2 years earlier.

So you can see that there's that
desire and people wanted to

switch or wanted to use it in production.

And then it did eventually take
over a couple of years later

and hasn't lost first spot since,
which is nice.

You mentioned the ORM already.

Should we dive into that?

How are folks generally sending
queries across to Postgres from

Rails?

Nikolay: And where is Postgres?

If Ruby on Rails, then Postgres
is under Rails or where?

Andrew: Yeah, so with Active Record.

Oh yeah, where is it?

Let's see.

Where is

Nikolay: the place for it?

Michael: On again, isn't it?

Ruby on Rails, on Postgres.

Andrew: Yep.

Nikolay: The basis for Rails, okay.

Modweb.

Andrew: Yeah.

As a Postgres enthusiast, I tend
to think a lot about the database

operations and the relational data
model and almost like Rails

is kind of wrapping, developers
are writing Ruby code that's

doing database interaction and
other, other things.

But I almost tend to think about
things now more from a database

first perspective.

So it's almost like Postgres with
Ruby on Rails or something.

But I think, you know, depending
on how you tend to, most developers,

I think though, it's the other
way around.

They're mostly, most Rails teams
I've worked on, they're writing

Active Record code.

In the early days, I've been around
in Ruby on Rails for a long

time, almost 15 years with some
divergences.

But in the early days, there actually
was a little more competition

around the ORM.

Some of these projects are still
around, but Sequel, SEQUEL, and

Datamapper, some other ORMs that
were written in Ruby and allowed

developers to create queries and
evolve their schema with an

alternative to Active Record.

Different pros and cons, performance
benefits, maybe, that kind

of thing.

But Active Record sort of steamrolled
everything or consolidated

everything over time, as can happen
with open source projects.

So most developers then, in my
experience, they're thinking,

they're working more with objects
and interactions and algorithms

and they're building background
jobs and working with message

queues and things like that.

So they're not necessarily writing
a lot of SQL.

And as I was saying, most Rails
teams, they're going to write

Active Record.

So Active Record then generates
SQL and it can do things for

us as queries are generated, like
annotate them and say where

they're coming from in the app,
which is nice.

If we look at queries within Postgres
and we wanna kind of go

backwards and say, well, where
did this come from in the app?

And then another big way Active
Record is used of course, is

it is the de facto schema evolution
tool as well, which is interesting

because I think Rails developers
just kind of take that for granted.

Like, of course I control the schema
And of course I could ruin

everything with a bad migration
or whatever, you know, like an

incremental schema change.

But that's not always the case
when I've worked on other teams.

Like I've worked at, when I was
at Groupon, actually, the main

applications I was working on were
Java and there were DBAs and

we used Postgres.

But sometimes due to the scale
there, I mean, usually developers,

unless you're working on a small
microservice maybe that you

might have ownership of, if you're
working on any of the core

services there because of the scale
of the operation or how the

company was set up, usually there
were DBAs that would do riskier

database changes and that sort
of thing.

So you might actually just kind
of provide them an example of

the change that you want, and then
it might just be done and

they let you know, maybe via a
ticketing system or something.

So I think, you know, it's interesting,
like if you come to,

you know, your background might
be where as a Rails developer,

you may at a large company, you
may not do as much of the direct

schema control, but certainly for
a lot of small to medium companies

that is what developers do as well.

So they do need to be informed
about good schema design, of course

creating indexes, constraints,
all those sorts of things.

And that's where it kind of, that's
where then the ORM can start

to be limited in its scope, you
know, and you got to kind of

go beyond and learn like, okay,
well, what are the capabilities

I have at my disposal within Postgres?

Nikolay: And at some point you
need to switch from schema rb

to structure SQL, right?

Andrew: Yeah, Michael asked about
that.

Nikolay: I saw several companies
switching.

Why do we have schema rb by default
at all?

Because everyone is switching at
some point while growing, right?

Andrew: Yeah, well, and I kind
of agree with you, but I didn't

always feel that way.

It's actually an interesting little
like microcosm of the whole

spectrum, I think, of being more
of an application developer,

working within your programming
language, which is Ruby in this

case, and kind of thinking of the
relational database as just

a thing that you don't really directly
work with, but it's just

there to like store your data and
access your data.

And then on the other extreme,
kind of like I was saying, like

you're thinking of completely in
the database operations, you're

thinking about the schema, the
queries, the indexes, you're running

query plans in your head.

You're thinking about, you know,
how do we, we have this high

growth table.

Should we use partitioning?

Should we split it out?

Like that's the opposite end of
the spectrum.

And yeah, I think so the, to briefly,
for anyone that's not familiar

with the schema rb and the structure
file by default in Ruby

on Rails.

As you make changes, let's say
you add a table or you add an

index, et cetera.

You generate what Rails calls a
migration, and that would be

that incremental change.

So here's the new table definition.

Here's the index definition.

And it's expressed in Ruby, but
of course, it generates SQL statements

to run and they run against your
local database.

And then what happens is your local
database is then dumped,

its schema definition or its
database definition is then dumped

into a file.

And by default, that's a Ruby file.

So it kind of like translates it
back to Ruby and represents

it as Ruby.

And it kind of insulates you from
the SQL.

But really in Postgres, what's
happening is it's just running,

well, I should say, if you move
to the SQL form, what it's doing

is it's running pg_dump.

And it's basically just taking
the raw pg_dump output and putting

that into a SQL form of the file
with a little bit of extra stuff

at the end, which are those migrations
that you're creating.

Each of them have a version.

So it dumps those versions as insert
statements into the end.

But otherwise, it's basically just
the pg_dump output.

So yeah, what happens a lot of
times for teams is they start

out with the Ruby schema file,
they start to use things that

are beyond what it covers.

And so they might use like a materialized
view or like triggers,

triggers,

Nikolay: maybe, right?

Andrew: Yeah.
So stuff like that.

And, and then what can happen too,
is the open source community

can, can spring in and can say,
"Oh, we can fix this, we can,

we can actually extend the Ruby
form of the file.

So you can keep using it, but you
gotta also run this Ruby gem

with your app.

And now we can express triggers
in the Ruby file.

But I kind of tend to just encourage,
okay, at that point just

switch away or maybe even just
start that way.

But just switch to the SQL file
because it's going to give you

the highest fidelity information.

It's essentially what pg_dump is.

Nikolay: And if some database guy
is performing code review,

it's much easier to see the changes
in Structured SQL than in

this language you don't fully understand.

Basically, if you do it in the project
with SchemaRB, you end

up asking, provide me a full log
or a full dump of everything.

Andrew: Right.

Not to get too philosophical, but
as we're on a team building

an application, the code is kind
of our method of communicating

with the computer and with the
team members as well, you know,

as we express the domain concept.

Or in this case, as we're expressing
the database design.

So if your team has people that
are more, you know, database

only people then giving them a
Ruby file is, I mean, they, they

would have to learn it and it doesn't
actually have all of the

information in it often.

So yeah, it's a little, it's a
little weird.

However, as I was writing the book,
I was thinking about a lot

of these kinds of things too because
I have been on a lot of

different teams.

And I was kind of trying to not
be too preachy about one thing

or the other, just kind of saying
like, well, if your team is

mostly Rails developers, you know,
just know that there are limitations

to the Ruby schema dump, that there
may be some information that

you're not getting in here.

However, you know, and then I sort
of made the pitch for the

structure file maybe.

But yeah, that's a good point.

Like if you're working with a DBA,
you know, you can hand them

a dump file and they'd have a clear
understanding of what's in

the database.

Nikolay: Yeah, I'm curious if things
would be changed with LLM

and so on, which can easily translate
or something.

But it's a different story.

Andrew: Yeah, that's an interesting
idea.

Nikolay: Do you have an understanding
why do we need to keep

two tracks of changes?

So first is each migration is kind
of different, and if we keep

all of them, we always can build
from zero, from ground, we can

build our final schema, right?

Just in steps.

But also we keep this structure
SQL, or schema RB, doesn't matter,

which is a snapshot of the latest
version.

Why do we need both?

I was very curious, I was trying
to understand this all the time.

Why do we need both?

Andrew: Yeah, well some people
even say, so as your incremental

changes contribute to the single
file representation, schema

RB, and like if we're talking about
the Ruby one, some people say

that like the schema RB, it is
basically the same as the incremental

changes.

It's just everything at once.

It's kind of this intermediary
Ruby representation of your database's

structure.

And you don't actually need both,
really.

The incremental changes do serve
as the log, which you'd otherwise

have to...

You could get through Git.

Like if you looked at, you know,
if you pulled changes as a developer

on a team and you notice that the
schema would be changed, but

there wasn't the incremental file
that represented the change,

you could look into the Git history
and try to discern what happened,

but I guess the file kind of makes
it easy.

And also when Ruby on Rails was
started, we didn't really have

Git, so we were stuck with like
subversion or something else.

So, you know, I think Git made
doing that kind of version to

version investigation easier in
my experience anyways, like being

a, I think a better version control
tool.

But yeah, I mean, technically you
don't need it.

And so what some teams do after
a while is they just throw away

the incremental ones.

And so Ruby on Rails also allows
you to just load the schema.

If you're, if you're setting up
a brand new machine, or a new

developer on the team, you can
just load directly from that single

file, the schema.rb.

So that would be like a structure
load command.

And that would work the same regardless
of which type.

So Michael, you were asking about
like a beef.

I think there's, it's like not
a real beef.

It's like a, it's like a faux beef
that programmers invented,

I think.

But it's kind of like, do you want
to preserve the beauty or

elegance of the Ruby code, or do
you want to just like, you know,

throw it away for the ugliness
of an SQL file, maybe what some

Ruby programmers might say, Or
what I might say is like, I actually

think the SQL file is pretty.

It's elegant.

You know, it's got, like I said,
it's like high fidelity.

It's all of the information, you
know, and you can even customize

it.

Active Record allows you to pass
flags through to pg_dump.

So if you need even more information
or you want to change the

output, actually I did think of,
there is one reason maybe the

beef emerged is if you have two developers
running slightly different

versions of Postgres, so maybe
the same minor major version,

but slightly different minor versions.

Postgres changes the pg_dump format
over time, different, like

it could be the ordering or I think
it's usually the ordering.

And what can happen is you get
these annoying diffs as a developer

where two developers are essentially
it's noise.

It's not a real meaningful change,
but it's like, Oh, the triggers

we added now they're on top of
the constraints.

And before they were after the
constraints or something like

that.

I'm just making that up, but I
have a solution for that though

too.

Lucas Fiddle created, this is several
years old, but it's a Ruby

gem that basically does some post
processing on that dump process

and does like an explicit ordering
of all the content.

So that if you have a team that
uses that tool, then you should

have consistent ordering amongst
each other.

Michael: And this was for the .sql?

Andrew: Yep, that's for the SQL
version.

Michael: Makes sense.

Yeah.

Cool.

I could see Nikolay exploding as
you were talking about how beautiful

the Ruby was.

And so for our podcast listeners,
I felt like that couldn't go

unsaid.

Nikolay: There are 2 of the ugliest
languages in the world, JavaScript

and SQL.

They also happen to be the most
popular ones.

Andrew: That is interesting, isn't
it?

Or even C, right?

Like C is still, you know.

Nikolay: Another thing is that
Ruby was created or else in 1995,

same as Java, JavaScript, and PostgreSQL
95 and what else, right?

So many things same year, just
a random fact.

Yeah.

True.

Michael: Yeah.

So you mentioned way back about
some limitations of the ORM.

I think it's worth talking about
like ORMs have a bad reputation

if you talk to database folk.

Is it worth talking about some
of the like more common issues

there or like ways around those
issues or limitations when you

need to break out that kind of
thing?

Andrew: Sure yeah.

Well that was actually a pretty
big premise I wanted to cover

in the book is to show people Show
readers like there are these

other things in PostgreSQL that you
may not be aware of if you've

limited your kind of research
area to just what's supported

in ActiveRecord.

Michael: Mm-hmm.

Andrew: mean one that comes to mind
right away is table partitioning.

I mean, there's not really any
support in ActiveRecord for table

partitioning.

It doesn't mean you can't do it
with a Rails app, but you might

run into a couple of small issues,
especially prior to recently,

composite primary keys became supported
in ActiveRecord.

But I was performing a table partitioning
project on a Rails, older

Rails project about a year ago,
and there were some issues with

assumptions code would make about
primary keys, for example,

just like there's only one column
that is the primary key definition.

But yeah, so it's worth noting
that, you know, if you have a

high growth table that you wanna
look at table partitioning for,

you would be likely doing that
a bit on your own with writing

SQL commands and or kind of maybe
researching like a Ruby gem

that you would add into your Rails
project that has done some

of that work for you around creating
the table structure or making

sure your queries have the partition
key column or things like

that.

Yeah, I mean, definitely ActiveRecord.

I think ORMs generally try to bring
some of what the database

does into the application.

I think that's fair to say, like
for example, triggers in Postgres

that have, you know, that can trigger
different trigger types that fire

at different times and that have
different scopes.

There's a whole set of things called
Active Record lifecycle

callbacks that are similar in their
purpose where, you know,

you might want to persist an object,
which would be taking an

in-memory Ruby object, you know,
turning it into an insert or

update statement, basically.

But you could intercept that kind
of before that event happens

and do something within the application
code with an Active Record

callback.

So you could have like a before
save or a before commit.

And some developers might not then,
they might try to design

things within that scope where
maybe a trigger could be a solution

where like if two applications were
sharing the same database,

which is not a great idea sometimes,
but maybe you'd want to

put the trigger in instead of having
it at the application level

so you don't have to duplicate
code or that kind of thing.

Nikolay: What about transaction
control in this case?

Is it inside one transaction and
Rails controls this or can you

do like trigger outside of transaction
so it's not guaranteed

that it will be consistent?

Andrew: Yeah, yeah.

Active Record supports a transaction
concept that maps pretty

much straight up to Postgres, you
know, begin, commit or rollback

transaction.

And actually recently learned you
can pass in, you can do transaction

control.

Like if you want to change the
transaction.

Yeah, the isolation level.

Yes, thanks.

If you want to change the isolation
level, I actually just recently

realized ActiveRecord lets you,
supports that, you can pass it

in as an option.

And then I kind of verified it
myself and made sure that the

SQL statements were generating
those things.

But yeah, I guess if you do want
to have some kind of nested

transactions or if you want a little
more control, then I think

part of what, you know, you certainly
might want to take that

on yourself.

And, But I wanted to actually then
tie that into, I think, part

of why ActiveRecord has been successful
is it doesn't, at a certain

point, you might just say, well,
I just want to write SQL, but

I still want maybe some ActiveRecord
objects to work with.

And ActiveRecord doesn't prevent
you from just writing SQL within

your ActiveRecord code.

And so that can be beneficial if
you want to just say, well,

I just want to write my own SQL
statement here for a query, or

I want to even just use it as kind
of an interface to run some

commands, like maybe, you know,
opening a transaction, although

that is supported directly.

But you can write SQL commands
within ActiveRecord as a string

that then get invoked or get sent
through the ActiveRecord connection

pool, etc.

And then what you can do is you
can leverage then taking a result

set and taking advantage of mapping
all of those database types

into your Ruby object types and
have an object to work with.

Or you can even use primitive types
like having a simple lists

and strings and that sort of thing.

So it's kind of like, it allows
you to, has these nice helpers

to do things like, you know, perform
joins and limit fields and

things like that.

But it also doesn't prevent you
from just saying like, I wanna,

I'm gonna take over here.

I'm just gonna kind of write my

Nikolay: own SQL within this.

Huge recursive CTE I want to write.

Andrew: Yeah, you could do that.

Yeah, there's.

Michael: What proportion of the
time do you tend to see, for

yourself and for others, do you
find that you're, Yeah, you're

using one versus the other.

Andrew: Yeah, it's interesting
because like 10 years ago, I did

see more of, I'd say, writing plain
SQL within ActiveRecord.

And I think it was because a lot
of folks that were using Ruby

on Rails then, they had more that
were more senior engineers,

they had experienced, or they had
experience with working with

SQL and other databases, and it's
just how they worked, you know?

And ActiveRecord was also more
limited in its capabilities.

And then I think there's been this
blend maybe over time, or

this interesting parallel tracks
of things happening.

One could be ActiveRecord has gained
more, it's continually adding

more and more helpers.

The documentation refers to it
as helpers, but like, you know,

as I was mentioning before, recently
common table expressions

or CTEs gained support, which I would
have thought maybe they would

have been around for a lot earlier,
but ActiveRecord has a first-class

method, a helper method for CTEs
now, and then composite primary

keys I mentioned.

So ActiveRecord has gained more
support even when you move beyond

one database.

If you want to work with multiple
databases, you can configure

that in your application.

There's even the ability to take
advantage of automatically sending

read-only queries identified by
the HTTP verb for your web app

to a read replica.

If you have that configured, Active
Record lets you set up a,

what they call a writer and a reader
role, and then they even

added that to the sharding capabilities
too, where if you have,

if you take advantage of horizontal
sharding in Active Record,

you can do some automatic shard
distribution.

And so I think because there are
more capabilities that ActiveRecord

supports, there also, I tend to
see less writing of SQL, but

again, I think it kind of comes
back to the team's composition

too.

If the team's very familiar with
SQL and also myself as I've

gotten more experienced with SQL.

I think my patience for like, okay,
is this supported in ActiveRecord?

record?

Like, okay, fine, I'll do it in
Active Record because that will

work well for the rest of the team
or whatever.

But if I can't find it pretty quickly,
I'll just write it as

an SQL statement.

It's no big deal to me.

Michael: Makes sense.

Nikolay: I worked with Rails projects
a lot, and one of them is

GitLab.

Shameless plug-in of ads for their
migration helpers.

.rb is great, and documentation
is open and great.

But someone in the Rails community
should finally rename disable_ddl_transaction!, disabled DDL transaction, because

people, just because of this weird,
I don't want to say word

or literal, a lot of people tend
to say we are going to deploy

this without transactions, in non-transactional
mode.

Postgres cannot work without transactions.

So it's just becoming like a single
transaction, each statement

becomes a single transaction.

Michael: It's not transactional, right?

And I don't like this part of Rails
at all, for many, many years,

and it should be renamed.

But at the same point, you know,
levels of understanding.

I also understand at some point,
like, I understood, creating this

concurrently, and you need disable_ddl_transaction to run create index

concurrently.

It's not transactional, right?

Because it can leave leftovers.

So things are difficult.

But I see people like Ruby developers,
they say we will execute

this without transactions.

It just breaks my ear, Postgres
ear.

It's not right.

Is it possible to change at all,
to create pull requests or something?

Andrew: Oh, you definitely could.

I've actually seen...

I mean, yeah, we could after this
call, we could create a pool

request to Rails and try to get
that approved, but it'd be an

uphill battle, I think.

Nikolay: It's probably a huge battle.

But I already have 1 battle won,
and I remember the 37signals

team, or Basecamp, or who is that?

Yeah.

They added support of the logic
when we have replicas, asynchronous

replicas, and a write happened
in a session.

So you need to stick to the primary
for some period of time,

because otherwise if a replica
is lagging, you won't see your

own write immediately after.

So they implemented this.

Before that, many companies, and
I also implemented this logic

in several languages, in PHP and
Java, in my past.

But I see this path finally Rails
is going through, and it's

obvious to me that they implemented
this lag as a constant written

in code.

It should be configurable.

So there was a big battle, and
DHH supported my proposal.

I was happy to see, like, let's
make this configurable.

So maybe we can win 1 more battle,
right?

Maybe.

It's not a question, sorry, it's
just my story with Rails.

Andrew: Yeah, well, it is actually
a good, I like that the design

of, so this replication lag, there's
like a resolver class concept

and the documentation talks about,
actually I wanna get back

to your question too, Michael,
about are there any other benefits

we didn't really cover?

I like how the documentation says,
hey, we purposefully have

this simple resolver class.

And if you have greater needs,
then they've built in an extension

point.

They've said like, okay, just create
your own resolver class,

implement this method, and then
you can do whatever you want.

And if I remember correctly, that's
how the automatic replica

switching and the shard switching
works, is they both have kind

of an official extension point so
that you do kind of get some

default behavior.

They took a guess at what a reasonable
replica lag would be,

I guess.

And then, but they said like, if
you want to do something else

or omit some certain, you don't
want to make this type automatic

or you do want to make this type
automatic, etc., you could

do that in your own implementation
class.

Just so we could cover this quickly
too, Michael, you were asking

before about other benefits, like
if you do stay kind of within

the Rails world.

I think like, because I wanted
to mention that a lot of these,

what I would call enhancements
for like higher scale operations

are coming from developers at GitHub
and Shopify and companies

that do have, you know, internet
scale operations, and they're

still using Ruby on Rails with
their relational database.

And they're building a lot of things
into ActiveRecord.

And then I think those are, you
know, great gifts that we receive

as users then of the framework,
even if we don't have that scale,

we have some pathways that we could
grow into that if we need.

And we also get them the benefit
of the ORM there where it's

like, well, multiple database adapters
are supported by the framework.

So you need to make sure that this
works in both MySQL, even

though they don’t use Postgres
at GitHub, from my understanding.

The Active Record capabilities
support multiple relational databases.

So we kind of get that, even though
they're not using Postgres

within the framework.

Michael: Yeah, that's really cool.

And it used to be like, I'm old
enough to remember when people

used to ask does Rails scale and
think, you know, like that,

those old questions.

And it just feels like we have
so many good examples now that

doesn't seem to come up as often
anymore, which is nice and refreshing.

I did actually almost want to ask
from the other perspective

quickly, though, what, like, as
a Postgres community, what can

we keep doing or what can we do
better to make sure Postgres

stays as the number 1 choice
for Rails or how can we make

things easier for developers at
both large and small organizations?

Andrew: Yeah, that's a good question.

I do think that there's a little
bit of a recent popularity with

SQLite within Rails.

It's gained some features that
I think make it more scalable.

And then also, depending on your
deployment setup, it may offer

you a simpler deployment configuration.

And then on the MySQL side, I think
that there's, you know, there's

certainly I hear a lot about PlanetScale.

They're, they've been doing great
with, they're the only, I actually

tweeted this a while ago, but like
they somehow made foreign

key constraints look cool.

I don't know if you saw that video
or not.

They have this well-produced video
where they're like, boom,

we added foreign key constraints.

And I was like, it was actually
kind of cool.

And I mean, I think that databases
are, it's hard to make, to

generate a lot of enthusiasm maybe
around it.

But I would say that like, you
know, some of the companies offering

Postgres as a service these days,
there’s a number of new startup

companies.

They're kind of either advertising
more like we can help you

with multi-tenancy or we can help
you with full-text search or

we can help you with these other
sorts of like capabilities or,

You know, usages that Postgres
supports.

It just might be a lot that you
need to kind of build yourself.

And so I do think Postgres has
like the raw ingredients for a

lot of stuff, but I do think that
if you're not going to use

a managed service and you want
to take this on, you got to do

a lot of work to build skills and,
you know, like if you want

to build in your own full-text
search at a good scale with good

performance, Postgres has a lot
of built-in capabilities and

is extensible.

You can add extensions.

But I think like, you know, just
continuing to maybe think about

the developer experience, I guess,
like at least for Postgres

users that are web applications,
anywhere the Postgres community

can contribute, you know, guides
and tutorials or if companies

that use it successfully with some
of these use cases, if they

can publish on their engineering
blogs, like I always love to

read those.

Like DoorDash is a company I know
that is a Rails and Postgres

company and has a lot of great
engineering blog posts.

And I think those are ways that
show pathways to leverage some

of these capabilities.

Yeah, and then I guess I've been
thinking about, I think the

open source solution still around
end to end query observability

is still like, it can sometimes
be limited in my experience where

I'll like, you know, we, we know
of tools like pg_stat_statements,

but then we want to collect samples
and it can be a little bit

difficult.

You kind of go back and forth between
relying on logs using system

catalogs, but Postgres keeps investing
in that there's new catalogs

and there's new things.

And so I think like anytime, any
of those sorts of things that

make it easier to see what my query
workload is like, where are

the costs coming from, especially
if they're connected to things

I can take action on.

Like, Nikolay, you mentioned recently
how pg_stat_statements shows

the number of rows returned.

You could go through and look at
queries where you could say,

well, we should really probably
add limits to these queries.

Things like that can make really
meaningful...

You can draw a real clear line
between, like, here's information

that's being provided to you within
Postgres that helps you out

on as you're building and scaling
and operating your system.

Nikolay: But this approach is reactive.

If we see pg_stat_statements on production,
it's already happening.

It would be great to try to guess
based on Rails code.

It's not about migrations already,
just regular some code, Rb

file, which serves some page or
API endpoint.

And we can just look at it, we
can say, oh, the limit is missing

here, right?

Yeah.

People at GitHub, for example,
they build a lot of additional

things in CI which help them.

Andrew: Yeah, that's right.

Nikolay: I think something can
be found in public.

So basically, for example, a select
plus 1 thing, which is very

common for ORM, it can be automatically
detected before you deploy.

And even if you test it on a very
small database, not on a full-size

database, or probably workload
generated is not yet in production,

but we already can see something
is dangerous here.

I'm trying to say this is probably
not Postgres's job.

Andrew: Yeah, it's probably not
Postgres's job.

I agree with you.

I was kind of thinking there's
the Ruby open source community,

there are some great command-line
tools.

I'll shout out rails-pg-extras
is one.

And a common interception point
where you might want to take

a more of a proactive approach
would be when you're creating

a migration.

Like, oh, do you realize now you've
created an inconsistency

between your schema and your application?

Like maybe you were checking over
here it was a string type and

over here you've created an integer
type, that kind of thing.

Or that's not the most exciting
example, but that command-line

time, like when the developer is
working, is kind of one touch

point.

Or like you said on CI, you shipped
off some code to a system

running tests and it starts to
detect things and gives the developer

feedback before it goes to production,
that's always a good thing.

Nikolay: Yeah.

Well, I'm trying to reflect on my
experience with Rails.

I can show you another one from 2017
I found.

So Shopify is on MySQL, right?

Yep.

You mentioned this.

So let me describe it.

There's a gem or library, I don't
know, called Delayed Job.

Delayed Job, just a single one.

And in 2017, I saw the issue.

I had a client in 2017 which was
on Rails and on RDS already.

It was maybe around the first time
I had a good production experience

with RDS, which I actually found
liking because it provided some

good automation for cloning, right?

For experiments.

Experiments matter a lot.

So I found this problem.

It was not performing well, delayed
jobs, under scale.

So you have a lot of jobs, a lot
of entries, also like high insert

rate. You need to have a high processing
rate.

It was super easy for me to find
what was the problem and I also

found this issue. It was discussed
on GitHub since 2013, saying

with half a million jobs it gets
really slow.

Half a million is not a huge number
at all for Postgres, it's

a small number.

So I popped up saying it's super
easy, like you just create an additional

index, this 1, and you have already
selected for an update, just

add 2 words, keep locked.

And index plus this, you have a massive
boost for your library.

You don't need to migrate to Sidekiq
or Kafka or something.

That's it.

You don't need to learn PGQ from
Skype.

And they didn't do it.

Now I understand, because it's
from Shopify, and they're from

MySQL world.

They started discussing some Postgres
versions, and I see that

I got a bunch of likes, and I keep
having email notifications,

people thanking me because this
saved their life, because this

improved.

But it looks like not only we talk
about the connection of Ruby and

Postgres worlds and the problem
of how to connect better, but

also this idea of ORM, let's be
abstract, let's keep agnostic

from the database.

This makes life worse actually.

We cannot use what Postgres can
offer.

What do you think about this?

So again, not a question, a reflection
of my experience, but maybe

it's a good discussion I think.

Michael: Well, I don't think you'd
be able to use Postgres if

there was only, if Rails only supported
one relational database

it would not be Postgres I don't
think, based on who runs the

Rails project.

Andrew: Yeah it certainly wasn't
the most popular in 2005 when

Ruby on Rails was kind of really
getting out there for the first

time, which now is a long time ago.

And yeah, I mean, so it does provide
you that indirection point

where you can say, well, we have
a generic adapter and then we

have the MySQL adapter.

And yeah, and maybe it doesn't
support skip locked.

So maybe we can't do skip locked
at all because it's not supported

across all three of our databases,
which is a trade-off.

Nikolay: You can write an if.

If it's Postgres, add two words.

Andrew: Yeah, and actually that's,
there's been, I should have

had a couple of examples ready
to go, but there's actually a

couple of things that are actually
only supported in Postgres

that ActiveRecord supports, which
is pretty exciting because

it's like a little bit of a philosophical
switch to me where

you say like, okay, well we have
these really useful capabilities,

but we're just going to support
them in Postgres only.

You know, sorry, MySQL users.

And that way then you get the best
of both of those categories.

But of course, like, then we aren't
really getting into this,

but there's a whole discussion
about SQL standards and stuff

like that.

So I know like, for example, the
returning clause is not supported

in, my understanding is not supported
in MySQL.

Nikolay: I just checked the docs
for MySQL, like latest version,

and they already started to support
skip locked, so.

Okay, yeah.

Andrew: Well, and it could be that
a framework that chooses to

offer conditional support, as long
as it's, of course, like,

you know, well tested and supported,
then that could actually

apply pressure to another open
source database to add support

for a SQL standard command, right?

It's like if this gains support,
probably not one framework like

Rails, but if everything Rails
and Laravel and Prisma and all

the language communities are all
clamoring for additional support.

Nikolay: You mentioned SQL standard,
but the problem here is

that SQL standard doesn't care
about performance at all.

Nothing about indexes and SKIP LOCKED is not there.

They don't care about performance.

And here we talk about performance,
right?

So it's not easy sometimes.

There should be some addition to
standard talking about performance.

But I don't see how it can happen.

Andrew: Yeah.

I think it's great that you've
provided that feedback, Nikolay,

into the community.

I think there is, to the earlier
question too, I think like Postgres

is going to, you know, has its core
set of objectives with each

release, you know, where the development
effort investment is

going and that sort of thing, which
is going to be a completely

different track than Ruby on Rails.

But like if we have some overlap,
you know, the folks that tend

to maintain the Postgres adapter
code are Postgres fans in the

Ruby on Rails community.

They happen to also write Ruby.

And it's great to have those kind
of overlapping folks.

There's kind of this discussion
around, I know we don't have,

we're out of time to get into this,
but there's an example that

comes to mind too of an in-clause
SQL query that has a large

list of values.

And then some folks notice that
from Ruby on Rails and on Postgres.

And that was one where I felt like,
I think Nikolay, we might have

both tweeted about this at some
point, but like there was kind

of this crossing the streams for
me where I saw people, Postgres

people I follow and Ruby on Rails
people, were like, you know,

we need to do something about this
and improve query performance

for these types of queries.

Nikolay: Right.

Also, .plug, or how is it called?

Plug, right?

This is like when you retrieve
everything, then you do something

like Postgres can do it much more
efficiently.

Don't do work instead of database
on application side, right?

But I guess it's a small thing,
any application language can

be used for this not efficient
approach.

What do you think about the future
of Ruby in general?

Andrew: I think it's achieved that
kind of cockroach status where

it just never goes away now.

I don't know.

There just seems to be enough,
not cockroach DB, but the actual

insect.

There's such a large amount of
companies that there's just a

lot of work to maintain the applications
if they're successful

that use Ruby on Rails.

Of course, the growth is not what
it was in the early days.

But there's a huge community of
Rails developers and there are

new people entering it.

So if someone enters it because
maybe it's like their bootcamp

program or they joined a company
that's using Rails, despite

being 20 years old as a framework
for Rails anyways.

First of all, I think Ruby is a
great language outside of Rails,

but despite being a 20 year old
framework, like there's a very

steady development clip.

There's new features being added
that are exciting.

I just think they tend to be more
front end oriented.

Like Ruby on Rails these days is
trying to capture more attention

from what might have been JavaScript
only apps before and offer

a lot of capabilities on as you're
building your web application

screens, like giving you a lot
of interactivity and low latency,

but doing it within Rails and kind
of getting back to some more

of the full stack kind of application
building capabilities.

But there are things happening
too on every release related to

Postgres and I certainly track
those and it still feels active,

just not where everyone's flocking
to.

Michael: I thought the active might
be a Rails joke.

Andrew: Oh yeah, it could be, yeah.

Michael: Yeah.

On a serious note though, and thanks
so much for all your time,

where can folks go to learn more?

Andrew: Yeah, If anyone's interested
in this topic area and they'd

like to explore the book that I
wrote, it's at pragprog.com.

That's the publishers Pragmatic
Programmers.

And I do a fair amount of blogging
on Postgres and Rails topics.

My blog is andyatkinson.com.

And I'm also on most of the social
media apps.

This year I'm going to be at Sin
City Ruby and PG Day Chicago,

as well, so probably more for Postgres
people.

If anyone's at PG Day Chicago,
I hope to make it to some more

Postgres events, but I love the
in-person events too and being

Able to meet more community members
and learn from other people.

Nikolay: And the book is called
High Performance PostgreSQL for

Rails, right?

Andrew: Yep, that's right.

Michael: Awesome work, awesome
publisher as well. And thanks

again.

Andrew: Thanks for having me.

A lot of fun, guys.

Thank you.

Creators and Guests

Host

Michael Christofides

Founder of pgMustard

Host

Nikolay Samokhvalov

Founder of Postgres AI

Guest

Andrew Atkinson

Software engineer, speaker, consultant, and author of High Performance PostgreSQL for Rails

Rails + Postgres

Creators and Guests

Some kind things our listeners have said