Postgres FM | Transcript: Corruption

May 19, 2023 • 42 Minutes

Corruption

[00:00:00] Nikolay: Hello, hello, this is Postgres FM, I'm Nikolay, and my co-host is Michael. Hi Michael.

[00:00:05] Michael: Hello, Nikolay.

[00:00:07] Nikolay: And today we are talking about, corruption, right?

[00:00:10] Michael: Yeah, absolutely.

Database corruption.

[00:00:14] Nikolay: technical corruption.

[00:00:15] Michael: Yeah.

[00:00:16] Nikolay: involved unless you need to pay some consultants, a lot of money, to fix it or at least to understand what's happening and how to escape from it. Because sometimes you cannot fix, you need just to escape and lose some data and. Say what you want and so on. Right.

[00:00:32] Michael: Yeah, these can be some pretty horrible stories, can't they? every now and again, I see somebody mailing one of the mailing lists or popping up on one of the, discord on Slack or something like that, and they, they say something along the lines of, I'm getting this error message help. And you can just tell you, your heart sinks and you think, oh, they're in trouble here.

this could be bad.

[00:00:52] Nikolay: All right. let's start maybe from what corruption is. what definition do you have in your mind, because I've heard various kinds of definition.

[00:01:03] Michael: Yeah. Good, point. I guess it's a, a few categories of issue. there's some things on the, the subtler side, which are like incorrect information coming back let's say, for a query, all the way through to the database not starting. you've probably got, I mean, written in front of you, but I guess it's data not being on disk in the way it was supposed to be. Something along those lines.

[00:01:29] Nikolay: Yeah, well we, we can distinguish particular type of corruption called data corruption. And in this case, that is quite broad, but I think it's quite good definition. Uh, you wrote something, you ask data boys to write something, you try to read it and you cannot read it or you read something else and so on. for example, you wrote zero, but, uh, when you read it, it's, it's one, this is corruption. But if you cannot read it and you have some Error instead on just no response at all? Well, sometimes it can be called corruption, but sometimes if server is down, it's not corruption, it's just, it's down.

Right? Or it's not working well at all and have some bug or something. Some people can name this corruption as well, but it probably doesn't. Involve any, wrong data rights or data reads, maybe just, uh, like server itself, itself is not working properly and data is fine on disk, for example.

It's not, there's no real corruption there. But if we apply this quite broad definition, You wrote something, but you cannot read it or read something else. A lot of cases can be called corruption, but which are not really corruption, right? So, so I, I'm struggling to find very good definition for corruption.

[00:02:45] Michael: Well, I think probably, if we got given a hundred cases and we had to categorize them into corruption, not corruption, we'd probably agree on 98 or 99 of them. It feels like there's quite generally accepted, cases. Should, should we go through some of like the more common ones?

How do you want to do this?

[00:03:02] Nikolay: Yeah, definitely. first of all, data corruption, uh, can be in hip in. Tables for example, and It's not uncommon because still a lot of databases have data check sums, disabled. Right, because you need specifically enable them. They have, this has overhead.

Some managed, pocus providers, uh, enable them by default. some don't. But if you install pocus, if you install it many years ago, it's hard to switch to check sums on because, Basically you need the NTB and Dump Restore, or you need to do some, play some tricks and we can discuss it with some extension, but it's not easy if you have big database under load and the maintenance window is not welcome, it's hard.

You postpone. I see many cases when people. Keep postponing this, living with without check data, check sounds enable. So they probably have data corruption in table files easily. It can be also in toast files, which also part of data files. Then, it can be in various other places, for example, in indexes and let's discuss this in.

Depth, like in detail a little bit later, but now just let's walk around and see how what else can be corrupted. Visibility map can be corrupted,

[00:04:24] Michael: Yeah, a bit more niche. That one I guess. But the, the other major category I've seen a lot, like I say, seem a lot, I've heard a lot of people that have had tables, data corrupted indexes, corrupted is, it seems to be the most common one to me, especially recently with the drama around version 14. But the, other one that I see quite often is backup file corruption

[00:04:46] Nikolay: Oh

[00:04:46] Michael: that catches people out.

[00:04:48] Nikolay: Yes, and let's talk about this in detail as well. Now let's just collect ideas, for our episode. And, uh, I also can be what can be corrupted, uh, woo fast, potentially can be corrupted, but fortunately for ages we have data checks, uh, wool record check sums in enabled everyone has it enabled. It's well already.

That's why we don't hear a lot about, wall. Corruption. If you think also about levels, for example, data, you write something to table, corruption might happen in at software, it might also happen at lower software file system. For example, it might happen in hardware. So that we have also layers.

We, we have verticals and we also have horizontal classification. It's like Metrics. we can, when we say, uh, data files corrupted well, but what if. at protocol level, something happened. You, you are writing zero, but protocol substitute 2, 2, 1. You have data check something enabled. I, I like, it's crazy.

It's, it's probably not going to happen, but I'm just trying to explain about LE levels. So protocol replaced your value. Right. And then, uh, POS wrote with data check, Sams enabled, everything is fine. Check Sam is fine. File system is fine, disks are fine. No corruption. Right. But it already happened at upper level.

[00:06:19] Michael: Yeah. And you haven't even mentioned operating systems. I think, and I That's like one of the major, yeah, so it's, there are so many places,

[00:06:28] Nikolay: your application, If, if application took some value from user input and when replaced it, POCUS cannot notice it because it's already very upper level. But we won't be talking about, uh, application level, uh,

[00:06:43] Michael: Yeah, let, let's, as far as I'm concerned, I don't think if Postgres has been told to store a certain value and then it returns that same value, uh, we are not talking about corruption. So if the application, uh, is to blame, I think that's out of the scope for me of the definition here.

Cool.

[00:07:01] Nikolay: Right. also, uh, replication, both logical and physical might have issues and, it might spread some corruption, for example, it, well, well, of course if some data is corrupted on the primary, it, it will be easily spread if you use physical. But interesting point here, if you use logical, probably you'll be fine.

It's, it depends of course,

[00:07:22] Michael: Well, I was gonna say, you could have different, corruption on each. Like they could be exactly the same. They could be different. all combinations seem to be possible. But yeah, really good point. That logical can help avoid some of the types of corruption spreading.

makes sense. But equally, I've heard problems, like I was reading up upon a thread by a trip advisor, uh, thread a while back, where I think it was actually replication that was part of the problem. It was replicating between two systems as part of an upgrade. But the replica was on a different operating system and that ended up causing, corruption.

So, It almost certainly was gey related. I can't remember the details though.

[00:08:02] Nikolay: Yeah, well maybe to finish with upper level. when application corruption happens. I remember I had a very weird case when, uh, my scripts, based on P sql, uh, returned one value while application had different, so different value from database and uh, application was Java. You know, this meme, uh, ground is Java. And all junior developers looking as goats, they are all on three. so it was Java, it was a quite good customer. And, uh, they asked me like, I trust P SQL much more, of course, what happens in Java with all those like springing everything, hybrid, like a lot of stuff there what's happening here?

And, but then I noticed that in the Beaver value is the same as za. And I think maybe we, we have issue with pc sql, maybe some back, but then I realized, uh, the, also Java,

[00:09:01] Michael: good point.

[00:09:01] Nikolay: so if you use the d

bbc, it has, it has some like extra flow digits or something. So numbers, uh, look differently sometimes, and you just need to change this.

And you'll be fine. So this also kind of corruption to me because like, I expect one value, but it's, it's, it's really at upper level. the stored value was fine. PC SQL show that well, it's called Extra Flow Dig. Default in Postgres is not the same as default in G D B C. And this is how you can easily get corruption if you don't notice it, okay? It's good that I noticed it, but if you write some application and your application does notice switching from, uh, G DBC and to other things like, lip odbc, for example, switching back and forth can cause corruption If you read something and a a right to different place, to different

table. You will have different results, corruption, just, uh, causing by lack of knowledge about this setting.

[00:10:02] Michael: Mm-hmm. Just as we forgot to say that this was a, this was a listener request and, um,

they did, they, so thank you for that. And they, they mentioned they were interested in like, What kind of types of corruption there are, how to detect it, how to fix it, how to safeguard from it, , and even are there any performance penalties to some of those safeguards?

So that's an interesting question. I feel like it's hard to carry on with out talking about, well, they asked types of it. I think we've covered that a little bit, but it would be easier to keep going by talking about how can we detect it or like how do we generally

[00:10:38] Nikolay: Yeah, yeah. Let's, let's dive into specific types of corruption, but, but, uh, to, to finalize this, uh, like, overview of types of corruption. I, I must admit that, it's one of the topics when, you know, like this saying, the more I learn, the more I understand that I don't know anything. it's the, topic where you. Can easily realize that, oh my, this also can happen. This can happen. And then you already have feeling that, uh, there should be many more types of, and. Kinds of corruption, which we probably keep being unnoticed by everyone, right? But at some point we start talking about them.

So we had a few sessions, on positive CV also including with Andre Borodin. Where we discussed in detail various things. I will try to mention few approaches and also I have a couple of documents. Where we try to summarize a lot of links, tools, and uh, also mitigation issues, monitoring approaches and so on.

So where do you want to start? heap index. Anything else?

[00:11:42] Michael: Well, or even as you said, some, some of those links could be useful. I, I was looking at the Postgres Wiki. I thought it had a really good page on corruption.

[00:11:51] Nikolay: It's good, but it's very, very limited.

[00:11:53] Michael: I agree, but they make roughly the same point as you, which is by definition, a lot of cases of corruption are quite unique. if they happened all the time, hopefully there'd be things in place to mitigate them.

So, the people designing software systems really, don't want this to happen. This is like, we spend quite a lot of episodes talking about things like performance, but performance doesn't matter at all if you're, you know, it's like quite, this is quite a low level need. This is something people come to databases and really want, if you, if you say, oh, our databases is a thousand times faster, has loads of other benefits, but there's like a 5% higher chance of corruption, or, I don't even know how those numbers would work, but this is a really like basic need.

This is something that Postgres, I think has quite a good reputation on generally, but we still have some quite famous cases of, of issues. So, yeah, I thought it was a good starting point that, that Wiki article, I'll link it up as well, but you're right, it's, it's not

[00:12:48] Nikolay: Definitely. And, when we talk about corruption, I always start also from that article. It's like it has some very core, knowledge and in the first lines, you will see first rule. If you try to fix something, make a backup of your data directory. Or backups or what you have like, because, with trying to investigate and fix it, you are going to use, uh, some, interesting, risky tools and approaches.

Some of them probably are not working well and you may make, harm additionally or just lose everything. So you need to start with back additional backup of everything.

[00:13:25] Michael: yeah, if you can copy everything, and make notes of what you're changing. If you can't, there's two things though, right? It firstly, you can make things worse. Like you can end up in an unrecoverable state when you were in a recoverable state previously. And the second thing is, even if you do recover, If you don't have a copy of what it was like in a corrupt state, you might struggle to understand what the root cause was, or there might be some useful information in there for working out what caused this.

And it also, I think it says quite early on, you might want to contact professionals for this. So most of us in our day-to-day work who aren't consultants, I count myself in this, don't see this often, if ever in our careers. Hopefully you'll never see this in your career, but consultants get called about this.

I dunno, you could tell me the exact number, but probably at least monthly, like seeing different cases of, this kind of thing. So these, are people that have seen this dozens of times, if not hundreds of times in their career. So you might want to lean on their services and it's probably worth it financially to pay somebody to help you with this

[00:14:25] Nikolay: It's, it's one of the topics which for consultants, uh, like, my team and I, uh, like to do because this clients have fears. If you explain properly what kind of fears they should have, they, they easily have them. And of course, uh, they, they usually, they, uh, ready to, to invest into improving things.

For example, improving monitoring and mitigation and so on. Let's start from backups probably. I find that as probably. One of the easiest place, backups without proper testing are like shanger backups, right? Because nobody knows if they are working. And of course, all backups should be tested.

Of course, it, requires some resources. You need to provision virtual machines, for example, and test them and so on. You need to test recovery from a backup. And, uh, usually it means like all usually physical backups. They have. Full backup plus maybe delta backups and also wall stream stream of 60 16 or 64, uh, megabyte, walls.

So when we test, we need to test, uh, full backups, delta backups, and also we need to test, all walls and, point in time recovery until the very last point before next back backup started. So it's like, it's a lot actually to, to test because otherwise probably full backups is working. But later in the day, for example, you have daily backups and you test, full backup is working, but then some wall is, uh, missing, for example, or corrupted or something.

Well, wall corruption might be something which won't happen as we discussed, but, uh, the backup compressing walls, With some in inno innovative, , compression might, , cause corruption of like you cannot restore it, for example, and, uh, full test, , passed, but some wall later, if you don't test whole stream for until next full backup, you probably also won't be able to restore until some point, until the latest point.

Usually we need to, yeah,

[00:16:25] Michael: In my experience, larger companies, tend to have this down. They tend to be, have a process for doing this on a semi-regular basis, if not very regular basis. It's somebody's job to do the full run of this every couple of weeks or every couple of months, depending on how, like, depending on how

[00:16:42] Nikolay: every day it should be the same frequency as, uh, backup creation,

[00:16:47] Michael: Oh, you think,

[00:16:48] Nikolay: I'm sure.

[00:16:49] Michael: if you automate it, if you, so you're talking

[00:16:51] Nikolay: You, everyone must automate it. Otherwise, backup, backup system is super weak.

that's what, what, when people say untested backups are not backups, this is it. You, you need to test all backups, and of course you need to pay for it, unfortunately.

And all walls, including world walls

[00:17:08] Michael: but a lot of people, I, I think there's a lot of, especially smaller companies, startups that either have never tested a backup or did it once as like a, a one-off.

[00:17:19] Nikolay: and they just trust RDS and that's it. or,

[00:17:22] Michael: Yeah, exactly. That's another good point about only having backups in one place. Right?

[00:17:27] Nikolay: but how, how can we be sure that RDS tests all backups properly?

[00:17:33] Michael: or that they're not corrupt. Right. how do you know they're not,

being sorted in a way that's accidentally

[00:17:37] Nikolay: need to test, restore. Oh, by the way, when you restore, like there are also some levels. First you restore and, can reach a recovery point. And POCUS is working like, uh, accepting connections, queries, right? But then you usually, we think, okay, what about data in this restore pocus?

Maybe something wrong, right? Maybe corruption from production, uh, propagated to backup. And here we usually like very simple trick. You need to dump it fully to dev now, causing risk of all data, data files, but not indexes by the way. Right. And dumping. It's very, it's like, it sounds silly probably, but this is a very good approach just to, cause reading all, all files it might take a long time, while this, you need to pay cloud for these resources. resource consumptions is significant here, but this is the only option to test up and see. It definitely works, right? Soup recovery works pointed up. Recovery to any point in the day works and we can read the data, but reading data, it's already a verification of, , data files, not recovery itself, not restoration process, right?

It's already beyond backups, but this is, uh, good place if you have backup, verification automation, at least some like, not, maybe not every time, but sometimes this is good point where you can start testing various things including data, uh, corruption risks and index corruption risks as well. And, also, or mentioning some like larger companies you mentioned they have. some additional replica. For example, it might be delayed, replica used sometimes to quickly recover, , accidentally, mistakenly deleted files, deleted records in database. So they have, for example, a replica, which lags some hours, like six hours or four hours, always behind.

And this replica usually, of course it's based on wall shipping. And this replica can be considered as, , a test tool for wall stream. You still need to test, recovery of full backups, but, walls are tested by this replica. Or you can install database, lap engine and, , it'll be reading for walls and also can, if, if it's down there is a problem with false So it's a complex system to test backups.

[00:20:01] Michael: yeah. that sounds good. And it's a really good point that these can be corrupted. And the best way to check is see if you can restore it.

[00:20:07] Nikolay: S3 can have problems as well,

to store backups in object storage is very good because, S3 or or GCs, they have very reliable storage, so they don't lose data. But uptime might be worse than, e b s volumes, for example. So when you test it, you need to understand that sometimes it's slightly down, but they won't lose your data.

This is a very good thing. so somebody tries, logic needs to be implemented. It's, it's like, it's a whole project, to test backups properly. But okay, let's shift to data check sum, probably, and then then talk about indexes, because I think these topics are narrower. So data check sum should be enabled.

Of course there is some penalty overhead you need to pay for it, but in my opinion, base should be enabled everywhere.

there is also a PPG check sums, uh, which, uh, is a, remember. All new versions. It, they have it in it, it's country module, right? and perhaps you enable check sums. And there is a recipe, so you can do it on replica first and perform switch over, and then, cover all existing replicas.

So like with very small downtime, cost to zero, it's possible. But for older versions, this, , tool p check sums didn't exist. But there is a third party, I this tool is available for all the post versions, And it's also possible to use it, uh, with replication fail switchover achieving downtime close to zero.

[00:21:34] Michael: Do you know

what that one's called?

[00:21:36] Nikolay: uh, also p check sums,

[00:21:38] Michael: Okay, cool. Makes sense.

I'll find it and link it up. I, I wasn't aware of that.

[00:21:43] Nikolay: Yeah, so they both called similarly. Maybe some different, I don't remember in detail, but, uh, this is it. Like you need to enable data check sounds and this it, like, it still doesn't mean, for example, some constraints can be violated, silently.

Data check sum want to save you from this. It's like upper level, corruption data. Check sums will save you from, uh, having wrong rights. So you, you wrote something dis disk, for example, corrupted valve system has bug and then you read something else or cannot read it at all. This is where data check sounds are needed.

Also, you need to choose either to like, There is a setting saying that, uh, you need to stop if, if corruption, uh, was detected or you can continue, but just have some errors in locks. It's interesting, I had very interesting discussions with very experienced people who have, good experience with Oracle and, uh, SQL Server, and we discussed what, what is better for current business.

For example, you have eCommerce, which, generates thousand dollars per minute, for example, or. Per second, and you have corruption and you have very good, strong tool and you enabled it. And, uh, it, it puts your server down. If corruption is detected saying, oh, corruption fire, right? Of course you, you'd prefer to continue.

And if corruption is, uh, like limited,

you, you don't want to be down at all. Right? So, so it's better to know about the mitigate somehow, but you, you, you want to continue operating. This is interesting. Right. And, and uh, they also mentioned that some like SQL server, they have interesting approach. When corruption is found on one node, it can auto heal.

getting not corrupted, files from replicas, for example, from standby nodes, bringing them

back. Let's, let's, auto, auto Hill Post does have it, but it's an interesting idea. but like rule of farm here, enable data check sums if you don't have them enabled to use this extension to enable. So indexes, let's talk about indexes.

[00:23:57] Michael: Yeah.

[00:23:59] Nikolay: ZE is, yeah, it's a good tool, good tool

[00:24:02] Michael: Is it worth quickly covering? Like what So bere index specifically, why? Why do we tend to see more index corruption than

heap corruption?

[00:24:12] Nikolay: because of, collation and gypsy version changes. For example, if you upgrade from one, Linux, for example, Buntu to another, there was na, which said like, gypsy version two point 27 maybe. has, uh, Dangerous changes. So you definitely need to be careful. And it said other versions are fine, but in my practice I have zero trust. If we plan os upgrade, if we plan any upgrade of underlying software, gipsy, and you can check version l d d dash version, you can check version. if we expect, uh, this version will change or we'll, if we also play with containers, you, you might have some binaries in container, which, and, if you have container with pauses and pitch data is outside of container, probably pitch data was created using different gypsy version.

And in container you have different, like it's if, if it is not, if it is mismatch here, what can happen? B3 is built using one order of characters collation, right? And uh, now we have different order. And, uh, you can just, for example, not being able to find some, like you select something and previously you had some roles returned, but now you don't have them at all. And, uh, we can distinguish two types of problems here. One is read problems and another one is, propagation of real problems to rights which hurts more, which cause it's permanent.

so customers cannot find, they search something and cannot find something

[00:25:47] Michael: or

they get back incorrect data, right? Like they could get back data they're not expecting. So that's, that's like often how these present themselves, or as you said, like the right problem. If we, if we can end up with duplicates, like we can end up looking for an identifier, it doesn't exist. So we in like, we insert another one and it, we don't think it exists because we're looking up like via the

[00:26:08] Nikolay: Duplicates a slightly different problem. Not like, not related to the order of or gipsy version, but it also can happen. Duplicates is different problem

because,

[00:26:20] Michael: think it could be related to, cuz it's like, isn't it using like the primary key look like, cuz it's, isn't it using the B tree for the unique constraint checks?

[00:26:29] Nikolay: Right. But if order change, the uniqueness is not violated. yeah. We just changed order of course. In order by This is a key problem.

[00:26:37] Michael: Cool. Uh, well, my, my understanding is you can also end up with like data being in the wrong partition, for example. Like, so

ordering affects quite a few things.

[00:26:45] Nikolay: corruption.

[00:26:46] Michael: So

like I was thinking the index is used for those things, but no.

[00:26:50] Nikolay: so, so unique installation might happen and uh, use erections might cause it sometimes, but, I don't see how the order. Like character order, uh, how order, uh, characters can lead to it. Maybe I'm wrong, maybe I just don't see it, but what can happen in my practice, I saw it like what can happen?

Wrong order. So user expect one order, having different order, and, uh, missing, uh, results completely because of some ordering and limits. Results are out of scope and, uh, this like unexpected and so on. And problem, big, bigger problem is indeed rights. And if it's some, bi billing subsystem, you can propagate wrong rights.

Uh, some people can, uh, not get money, for example, right?

[00:27:36] Michael: Yeah, I was just thinking when you gave your e-commerce example, I was wondering what the trade off would be for like, let's say a bank if you've Yeah, but it's, so if you're worried about people's like balances being wrong, for example, is that a different matter? But I still think they'd make the trade off of keeping it up and dealing with the consequences in a lot of cases.

So it's an interesting question, like which applications would choose which?

[00:28:01] Nikolay: Yeah, yeah. But, anyway, uh, o upgrades are dangerous, right? And containers are slightly dangerous if you bring p data. We had it also, we, you bring picture data from different operational system and you run it, you run your posts container, well, even without containers. If you just copy p data between two node and one node is running kubu to 1604, another is 2204.

You are going to get a trouble because you copy at physical level. In this case you need to copy it. Logical Dump, restore. This is where dump restore, uh, plays good because, uh, it eliminates the problem completely.

[00:28:38] Michael: and I'm not sure we've said it explicitly since we switched to talking about indexes, but the, the tool for checking E three index corruption is still

[00:28:47] Nikolay: Ze ze. Right. And It has, uh, several modes. And the, unfortunately, the proper way to check it is heavy, and you shouldn't do it on production because it requires exclusive lock. This is where, you probably need to put it to backups, uh, verification system. And after you restore, you run it on all linuxes. It'll take a lot of time, but you need it to run into using special snippet, which like parent blah, blah, I don't remember. But it also checks, uh, a relationship between lymph nodes. And this is where, uh, order can be easily broken. So you need this and it'll take time. And, uh, funny thing, from practice that if you do it at sequel level, it's single threaded.

But, but it's, it's not fun. If you have a lot of indexes and you have a lot, of course, you usually this trade off, it's with backup verification, same trade off. If it's in cloud, you usually think, okay, I will hire, much more powerful machine, but for shorter time. And I will do drop much faster and destroy it.

So it's, it's, I mean, budgeting here, it's interesting for such kind of verification projects, but if you'll do arm check, don't do it in single thread. Do it utiliz all core and disks like, situating them because it's a verification machine. We, we should go full speed. Nobody's there. Right? If it's so, of course.

But then, uh, I, I, twice I wrote, uh, pluralization scripts for it and only then realized that since PO 12 or 13, maybe 13, there is, uh, ze, which has dash j, dash J is for pluralization. So, so, and you can use it for all s as well,

[00:30:28] Michael: yes, that's a really good point. I think you brought this up. We would, I think we had an episode on index maintenance and you made you, so we've actually covered Amek and this stuff in quite a lot of detail for people that want to go back to that one. I'll, I'll link that one up in the

[00:30:41] Nikolay: But it's more with our sessions with Andre Boin and in, in documents, uh, I mentioned we will def definitely link to this episode of these documents. And we have table for monitoring and prevention, mitigation, measures. So, yeah, so, uh, ze, we see patches to check nucleus violation.

There is such patch and, but it's not committed yet. It's ready for commuter and uh, fortunately it won't be, in post August 16 because it's already almost better next week. We have better, right, better one

[00:31:17] Michael: so so we've gone past the feature freeze for 16, right?

[00:31:21] Nikolay: Right, right. But I hope it'll make it into post 17. And, uh, I think it's, uh, one of the good cases when you can, can start, uh, using it before official release and also help testing it because, uh, on these, uh, verification machines, temporary machines where you verify your backups, probably it's, you can run, like not polished software and so on,

[00:31:46] Michael: It's a good point. And what's the worst that like, I guess if it's got bugs, like that's the normal reason to not use it, right? Is it

[00:31:53] Nikolay: there are false positives and false false negatives. False negatives. like you miss some problem. Well, without using it, you miss it for for sure if it happens like so still un polished software, it's still beneficial. Force positives are annoying. it says, oh, unique. Okay. It's easier to check and verify. All right?

For

[00:32:17] Michael: a nice failure

[00:32:18] Nikolay: for corruption, slightly more complex, of course. For example, it says, oh, the corruption index. Well, in this case, you can just rebuild it on this machine, this very machine. You rebuild it and check again. If it still says corruption in the same place, probably it's false positive because you

fix it already, rebuild it, fixes it,

[00:32:39] Michael: Yes. So that's, so that's, I guess where we should be going is how do you fix the, some of these issues and with indexing it's relat or with Bree indexes, it's because it's an ordering issue. Yeah. Normally right. Re, re-indexing should fix the ordering cause you are re-indexing with the nucleation.

So That's the fix. And as you say, if, if it shows again, it should, it'll be a false positive. Although I, I guess that would then mean it's a bug in amj. Right? Yeah, of course.

[00:33:05] Nikolay: Yeah. And, uh, worth mentioning again, uh, post versions 14.0123 are super bad because if you use, uh, uh, read, create, index, index concurrently, you might have a corruption. And in this case, if you use the same version on this certification machine and EK says corruption, you rebuild it with concurrently.

Again, probably it says corruption again, not good. concurrently option in the, uh, create index. And Rein index is super cool. But, uh, it has long trail of bug, unfortunately. And if, uh, if we run and check on this clone non-production, I would verify all findings with regular, uh, index,

[00:33:48] Michael: That's a good shot

[00:33:49] Nikolay: which is more REI reliable,

[00:33:51] Michael: because you

don't have read people reading. Right. Like you're not worried

about

the looks.

[00:33:55] Nikolay: user machine, right?

in, this case, if, if even if, uh, I was running, uh, 14.0. And corruption, uh, came from production because of the use of concurrently option here, I would, uh, I'm check find that I, I run index and, um, check. Shouldn't find it anymore. It means it's not false positive, right? I we go. But, but, uh, um, check for battery.

It's quite reliable. It shouldn't produce false positives, like it's quite broadly used and, uh, well tested and like, production tested. I'm checked with this patch for uniqueness violation, of course, different story. And also I'm checked with patch, which also the same, like, it's, it's ready for commuter but not, uh, want happen in post 16.

Unfortunately. I'm . Indexes, long awaited thing. It's very also needed, and bar also participated in this and this patch. I tested myself, I applied the time check and started to test. I saw false positives and uh, I was told that there, there were many false positives in the beginning, but a lot of them were fixed already.

So it was, it has version, it has manipulations already. So I would consider this quite good and I encourage, People to try it in non-production. You, again, you still need non-production because of this exclusive log

there. Andre mentioned some improvements for future posts that will probably make possible to run check in production in less aggressive, way.

But, uh, currently, like I prefer, I'm checking on the clone where I can do more things, and and so on. Utilize the whole course, dis disks and so on. It's, it's, it is better. So that's probably it with iCheck.

Uh, again, I, I'm looking forward to having, uh, g and gene support and also uniqueness installation support, uh, in future versions of, um, check and, uh, but we can already start using them, taking just patches, right?

what else? The visibility map there is, uh, also some, uh, tool to check visibility, map corruption, because it's not good to have it as well, right? If it happens.

PG visibility

[00:36:07] Michael: I think this episode could easily be like three hours long. I suspect we could go on for a long

time, but there is a, there's a few good posts that I wanted to mention.

[00:36:15] Nikolay: I mention, mention my document. it has lots of information. If you find something, it's missing, let me know. I will put it, I'm trying to collect everything about corruption and it's very broad topic. It has like many directions

[00:36:29] Michael: nice. I look forward to reading that. there's one by cyber tech that's quite a fun post of, um, Lawrence, I believe going through about giving quite a few examples of ways to corrupt this database, which is quite a nice way of learning the kinds

[00:36:44] Nikolay: Uh,

[00:36:44] Michael: you can do to end up

[00:36:46] Nikolay: How to corrupt. How to corrupt your how, how to corrupt. It's, it's an interesting, uh, area of gba, practice, uh, let's corrupt, uh, to learn.

And, uh, I mentioned, I, I, I, I remember something like corruption related to Accid Brown. Is it, is it this topic? Is this post or no?

[00:37:04] Michael: I can't remember that one. Sorry.

[00:37:06] Nikolay: It's okay. how to corrupt. Yeah, yeah, yeah. I, I, I remember it. I have it in my list. It's, it's a good, good. There are two post on cyber tech about corruption, one about C pr, and another is, is just about corruption. There's also interesting, uh, tool, uh, noise from OVS as like how to damage your poss a tool just, just to learn and practice, uh, how to escape from the situations, right?

We, we didn't cover cases when you have corruption, how to do it. Like of course a wall is

tool. It'll take time and, uh, you definitely need to back up before I use it several times in my practice. One case was interesting when POGS was running on, on top of, NFS in aws, and, got corrupted, but fortunately, only one table was corrupted.

It was so, it

was super

easy to fix. Just analyze whole database. It's, it's rebuilt. Yeah. But to find the proper, like what's corrupted it take, it took time, like half an hour at least. And it was nervous because, uh, it was under pressure from clients. So, yeah. And I remember like some cases takes hours or day to, to fix and to escape with some data, data losses.

So it's, it's huge topic.

[00:38:27] Michael: There's one more final thing I wanted to mention. There's a good. presentation. I think it was quite an old one from Christoph Pettus that the slides are available online. I'll check and see if I can find the talk as well. But I thought the slides alone were very good. And that was detailing a previous case.

So you talked about the 14.0, and reindex concurrently corruption issue. There was also apparently one that I wasn't aware of back in 9.3 0.1. So e even in a

minor release, there was

a case.

[00:38:57] Nikolay: yeah.

let me mention a few names as well. I mentioned Andrea Bread in a lot of work related to arm check indexes and so on. Of course, Peter Gagan and, tools work on Peter indexes and so on. And also tool called pg. I used it a few times. You need Linux for it, but it's still like very good thing.

And on top of, uh, page spec, visualize, batteries and uh, and, and, uh, other types of indexes as well. also, Thomas one, ze, pg. Check from EnterpriseDB, by the way. Interesting thing.

[00:39:26] Michael: What was it called?

[00:39:28] Nikolay: Pg q Check, Catalog check. I think it's about, yeah, I, I'm not sure it could be applied to regular post, but the, like a catalog checker to find corruption in system catalogs.

It's also interesting topic,

, wolf exploring as well. by Drew Vo, sorry, like pronunciation wrong, but also there is a pigeon toolkit or also works by Jeremy Schneider. Good posts and, uh, talks as well and so on. interesting materials by Jeremy Schneider from AWS team. Well, yeah, a lot of stuff.

[00:40:05] Michael: Wonderful. And if people can only take one thing away from this, what should they, maybe they should restore, like check their backups. What would you, what would

your number one advice be?

[00:40:14] Nikolay: check, check backups with, uh, point with wall stream. dump it to da death node causing res of data add, um, check for battery three. and, or of course enable check sums, right? Enable data, check sums. If you want to go further, use, visibility, map checks on right on production. Also use patched and check to the five just engine if you have any. In non-production on after cup restoration and also uniqueness violation and many more tools, like I mentioned, this PG cut check and so on

and so on.

[00:40:50] Michael: Oh.

[00:40:50] Nikolay: like it's a rabbit hole.

[00:40:52] Michael: One more thing. Read release notes, even for minor version releases, like cuz if, if there are, if there are any Postgres box causing issues like this, they will be mentioned in there with mitigation steps. Sorry, I completely forgot that one.

[00:41:07] Nikolay: didn't mention many good names as well. Sorry. Like it's, uh, it's a big list, uh, p checks from credit and it, it went

to POS core in PO 12.

[00:41:18] Michael: Very cool.

[00:41:19] Nikolay: Yeah.

[00:41:20] Michael: Well, uh, if Nikola forgot to mention anybody, it's a definite case of corruption. Uh, he's just mentioned all his friends. That's it.

[00:41:28] Nikolay: No, if I, if I, if I got to mention something, I mentioned that it just was not recorded.

[00:41:34] Michael: Oh yeah, good one. Blame the internet connection.

[00:41:36] Nikolay: recording corruption. Right, right.

[00:41:38] Michael: Yeah.

[00:41:38] Nikolay: So if you, if you have more ideas, uh, please tell us in on Twitter, LinkedIn, and, uh, comments on YouTube anywhere. And, uh, help me collect, this is community effort. like 99% of what is there. It's not from me. I am like a hub. And these documents are quite good. Like if you take any of this document, you can, find work for any GBA for a year at least in serious projects.

Definitely for a year. Like a lot of stuff.

[00:42:05] Michael: last, bad joke from me for today. Uh, would you consider your work a collation?

[00:42:12] Nikolay: What, okay. Okay.

[00:42:15] Michael: It's time.

Thanks

everybody.

[00:42:17] Nikolay: good. ending point, right?

[00:42:19] Michael: Catch you next week.

[00:42:20] Nikolay: Have a good, yeah. Bye.