Transforming Your Database

Episode 3

With

Craig Kerstiens

Chief Product Officer, Crunchy Data

About the episode

Databases are rarely the stars of digital transformation. That’s not necessarily a bad thing. But that doesn’t mean they should be an afterthought either.

Craig Kerstiens of Crunchy Data covers the state of the database—and how the database tools you choose have implications beyond the databases themselves.

About the guests

Craig Kerstiens

Chief Product Officer, Crunchy Data

Transcript

00:02 — Jamie Parker
It wasn't long ago that data was proclaimed king, but when it comes to digital transformation, data doesn't often get the royal treatment. The focus tends to go to modernizing other elements of the infrastructure. You shouldn't let your king become an afterthought, though. Modernizing your infrastructure will likely mean performing some data migration at some point.

00:23 — Jamie Parker
Data migration has a reputation for difficulty and tedium with dire consequences for getting it wrong, and that can be the result of a culture clash instead of a technical decision. In this episode of Code Comments, Chief Product Officer Craig Kerstiens of Crunchy Data shares why your database deserves careful consideration, but that doesn't mean automatically opting for the newest, shiniest database. Craig's goal is to make the database boring. Craig works at Crunchy Data. Postgres, an open source relational database, is their bread and butter. They know their data.

01:05 — Craig Kerstiens
Data is sticky. It has gravity. Data is not easy to move around from point A to point B.

01:11 — Jamie Parker
Organizations handle data. There's no way around that in the modern economy, but there's something about data that makes it hard to handle and move around. It's sticky. In some ways, that's a good thing. You want your data to be accurate and reliable, but in some cases, you want it to be a bit more flexible too. Craig splits data into two systems.

01:32 — Craig Kerstiens
When we talked about systems of record versus systems of engagement, right, systems of record, what is the source of truth? There's my bank, which has my balance. Like, okay, I swipe my credit card, and then I transfer money from this account to that account. And that's the system of record, right? There is a source of truth that exists there.

01:50 — Jamie Parker
Systems of record hold the really sticky data, and you want that to stick. Nothing too fancy. You want to know the money going in and the money going out, and that's it. You also want that system to be available and responsive. People want to be able to check their balance quickly and have any transaction show up. Systems of engagement, on the other hand, are a different story.

02:16 — Craig Kerstiens
Now we think differently of a system of engagement of, "All right, how about my credit card points?" I don't need those to reflect in kind of real time.

02:26 — Jamie Parker
Systems of engagement still need to be accurate, but they don't have quite the same requirements as systems of record. In this banking example, Craig suggests that systems of engagement use more features like AI, machine learning, and other tools to provide recommendations. That requires reading and manipulating data differently than for a system of record. Now, the inclination might be to build separate systems with databases that specialize in handling each case, but they probably still need to talk to each other.

03:00 — Craig Kerstiens
How do we merge these systems of record and systems of engagement? And as you go through kind of a digital transformation journey, the way we used to do that is, say, "We've got these separate systems that exist. We've got one over here and another over there, and Friday at midnight, they'll sync data between them, and you've got a delay here." And the reality is when we start to look at our database and be a little less afraid of it, we say, "Well, why can't certain databases, mainly Postgres, do both of these things?"
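As a rough illustration of the point above, here is a minimal sketch of one database answering both kinds of question from the same rows, with no scheduled sync between separate systems. It uses Python's built-in sqlite3 module purely as a runnable stand-in for Postgres. The table name, the view name, and the rule of one point per dollar spent are all assumptions made up for the example, not anything described in the episode.

```python
import sqlite3

# sqlite3 is a stand-in here so the sketch runs anywhere; the episode is about Postgres.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- System of record: every movement of money (hypothetical schema).
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        account TEXT NOT NULL,
        amount_cents INTEGER NOT NULL  -- negative values are purchases
    );
    -- System of engagement: points derived from the same rows.
    -- One point per whole dollar spent is an invented rule.
    CREATE VIEW card_points AS
        SELECT account, SUM(-amount_cents) / 100 AS points
        FROM transactions
        WHERE amount_cents < 0
        GROUP BY account;
""")

with conn:  # commit the inserts as one transaction
    conn.executemany(
        "INSERT INTO transactions (account, amount_cents) VALUES (?, ?)",
        [("checking", 10000), ("checking", -2500), ("checking", -1250)],
    )

# The source-of-truth balance and the derived points come from one store.
balance = conn.execute(
    "SELECT SUM(amount_cents) FROM transactions WHERE account = ?",
    ("checking",),
).fetchone()[0]
points = conn.execute(
    "SELECT points FROM card_points WHERE account = ?",
    ("checking",),
).fetchone()[0]
```

Because the points view reads the live transaction rows, there is no Friday-at-midnight delay to manage. Whether that trade-off holds at real scale depends on the workload.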

03:30 — Jamie Parker
So there it is, your first database choice. Do you build separate specialized systems that then need to talk to each other, or do you build a single database to handle all aspects of your data management that may need more customization? But before we get to that choice, let's look at another scenario.

03:51 — Craig Kerstiens
One of the most common questions I get all the time is, "How do I replace this legacy database with X or Y?" Whether it's a crazy old mainframe or Oracle or Db2, take your pick. A database that's been around, and it's been there for a long time, and it's working, and it powers your bank.

04:10 — Jamie Parker
You have a database that's been chugging along for a while. It's working, but it's getting old, and the rest of your stack is moving on. Moving that sticky data from an older database to a new one isn't going to be an easy project, but it has to be done, right?

04:28 — Craig Kerstiens
It's funny because I think that it's not an inherently bad question. It's the first question that usually comes up, and a lot of the time, it's not the right thing to start with. If a system's working and a database is working, I often say, "Well, let it keep working. Why are you changing this?"

04:48 — Jamie Parker
Let's take a breath here. Yes, that database may be old. Yes, you might need to change it at some point. Yes, this episode is about making database choices in the midst of modernizing your IT infrastructure.

05:02 — Jamie Parker
Craig is saying you don't need to update your legacy database right away, but there are other places where you'll need to make choices about data management. Craig advises focusing on greenfield projects first so you can figure out what you're going to need long-term. As it is, you might have a lot of untangling to do, and if you want to get moving with some of those new features, you don't want to spend all your time on that yarn.

05:28 — Craig Kerstiens
We went through this phase for a little while where, "Oh, migrations are hard," like database migrations, or changing your schema is hard, and I don't want to manage that, so I'm going to use a document database, and I'm going to have no schema.

05:43 — Jamie Parker
Relational databases have a set structure, and changing that schema takes work. Many turn to nonrelational databases to increase their database's flexibility. Craig points out that you never truly get rid of schemas, though. They just take a different form.

05:59 — Craig Kerstiens
The reality is if you're not managing your schema in your database, then you're managing it in code, and you haven't gotten rid of the schema. Your application has to know about these 5 different versions of schema from game version 1, 2, 3, 4, or 5.

06:15 — Jamie Parker
Your data has to be organized somehow, somewhere. There are levels of flexibility in how you set up your schema, but some sort of schema is still necessary. Craig argues that moving it to your codebase makes it more complicated than it could otherwise be.

06:31 — Craig Kerstiens
On day 1, that seems fine. There's only one version of it. But on day 100, when you've gone through 20 changes, now you've got other code complexity here. And so we went through this process where developers jumped on that bandwagon, and I think we've learned over time that you're always managing some data schema somewhere, let the database do what it's good at, and we've kind of come back full circle to the database isn't a dumb store. It's a useful thing in the toolbox.
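To make the "schema in code" point concrete, here is a minimal sketch of what an application can end up doing when documents are stored without a database-enforced schema. The document shapes, field names, and the schema_version key are hypothetical, chosen only to illustrate the idea; they do not come from any real system mentioned in the episode.

```python
from typing import Any


def read_player(doc: dict[str, Any]) -> dict[str, Any]:
    """Normalize a stored document into the shape current code expects."""
    # Documents written before versioning existed are treated as version 1.
    version = doc.get("schema_version", 1)
    if version == 1:
        # v1 (hypothetical): a single "name" string and no score at all.
        return {"name": doc["name"], "score": 0}
    if version == 2:
        # v2 (hypothetical): name split in two, score stored as "points".
        return {"name": f"{doc['first']} {doc['last']}", "score": doc["points"]}
    if version == 3:
        # v3 (hypothetical): name nested under "profile", "points" renamed.
        return {"name": doc["profile"]["display_name"], "score": doc["score"]}
    raise ValueError(f"unknown schema_version: {version}")


# Three generations of the same record living side by side in one collection.
docs = [
    {"name": "Ada"},
    {"schema_version": 2, "first": "Ada", "last": "L", "points": 10},
    {"schema_version": 3, "profile": {"display_name": "Ada L"}, "score": 12},
]
players = [read_player(d) for d in docs]
```

Every new shape adds another branch that every reader of the data has to carry. With a relational schema, a migration would typically rewrite the old rows once, so application code only sees the current shape.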

07:02 — Jamie Parker
There's complexity in managing a database, whether it's in the database itself or handled in your code base. With digital transformation, your application's database requirements are going to change. And while change is difficult, a full migration can wait until you figure out what kind of database you need, but someone has to handle that sticky data.

07:25 — Jamie Parker
And in today's economy, the once clear lines of responsibility and authority over data have been blurred. I mentioned that data was crowned king not too long ago. There's been a huge increase in demand for data analysts to extract all sorts of insights from the rows and columns of information, and many have flocked to the field, but there's one kind of data technologist that hasn't seen an equivalent surge of supply.

07:54 — Craig Kerstiens
Data is everywhere. Now, there's a fascinating kind of aspect to this or counter aspect where there's not necessarily more DBAs in the world. There's not more people necessarily operating databases for you. So, with this, you got—you go through this digital transformation journey.

08:13 — Craig Kerstiens
You're able to ship apps faster, meet customers' needs faster. Now you've got 10 more apps than you had yesterday, and how do you maintain these without more DBAs, right? What you find is it used to be in a large enterprise that the DBA held the keys to the castle. You wouldn't go and make a change to your database without your DBA approving it.

08:35 — Jamie Parker
Database administrators had authority over the database. You needed them to make changes, which often happens when adding new features. With digital transformation, that's supposed to be a frequent occurrence, so you have more work for the same number of administrators. How much more work? Craig shared his estimate of the new balance in the workforce.

08:58 — Craig Kerstiens
The DBAs and the developers, to bridge this shift in ratio, if you had a ratio before of 10 developers to 1 DBA, it's probably 100 developers to 1 now.

09:08 — Jamie Parker
That's a 10 times difference in the ratio of developers to DBAs, and it's not just 10 times more work. There are also shorter development cycles to contend with. These are all estimates. How close they are to the mark will depend on the organization, but they do paint a picture of the strain DBAs are under.

09:27 — Craig Kerstiens
Today, you go talk to a group of DBAs, and they're so beaten down and worn out because, on Wednesday, a developer showed up at their desks and said, "Hey, we've got this application launching on Friday, and here's 3 new data sources you've got to support." And they're just dying.

09:46 — Jamie Parker
That's a short turnaround time. A lot of database administrators are getting overworked. You can't stick to the old way of doing things.

09:56 — Craig Kerstiens
You can't just drop a new technology and say, "We're going to go on this transformation journey and replace systems and tools and frameworks, and we're going to introduce CI/CD," and then still keep the old way of doing things where, "Hey, this all flows through this DBA approval process on how we deploy and roll out software and timelines."

10:19 — Jamie Parker
We've learned why funneling all these new features through an old approval process won't work. There aren't enough database administrators to go around anymore, but someone with knowledge of database management has to make sure that part of your tech stack is ready for the change. So, you share the load beyond the traditional data team.

10:38 — Craig Kerstiens
You've got to give some control and ownership over to the team that's moving and pushing things in that. If you try to continue to do that the old way, especially on the database side of things, it'll break. For everything you gain with modernizing some of your processes, you're going to still be held up in so many ways by not giving up some of that ownership of the database and how people work with it.

11:04 — Jamie Parker
With fewer DBAs and more work to be done, developers have had to learn database tools to help manage databases. The spirit of digital transformation and DevOps is to break down barriers, have teams communicate, and work together on projects so everyone can move faster. That's the spirit anyways.

11:28 — Jamie Parker
The ideal doesn't always come to fruition, especially with communication. So developers are also working with databases. They're often going to be familiar with different tools than the data teams. When Craig talks to DBAs about the tools their developers prefer, he gets the question turned back on him.

11:48 — Craig Kerstiens
App developers don't talk to us. We don't have the keys to the castle like we once did. And so this is where it's fascinating that your app developers now need to learn how to work with the database a little bit more than they used to.

12:04 — Jamie Parker
On one hand, having everyone understand the tools they're working with should be a good thing. On the other hand, it creates a new dynamic where DBAs aren't consulted on how to solve database issues, and they're no longer the only ones with access to the database. Collaboration and communication issues have to be sorted out. Craig shared a call he got from a customer who hadn't sorted it out yet. They were experiencing an extended database-related outage.

12:34 — Craig Kerstiens
I got on the call talking with them, and the entire team was absolutely fried because they'd been dealing with an incident for the last 12 hours overnight, and the data team knew what to do, but they weren't allowed to touch it.

12:48 — Jamie Parker
You'd think this was a case from a customer who hadn't started their digital transformation, but they had. Somehow, they were still separated into camps that couldn't coordinate on a solution.

12:59 — Craig Kerstiens
The app developers were not equipped or trained and weren't able to use the tools that the data team was. So they had kept the silo of, "Well, no developers. Here's how you work. You don't really know the database so you don't touch it. Hey, database people, you're over here, but you don't really get to touch the app code and control the deployments." It was really the worst of both worlds.

13:22 — Jamie Parker
Talk about going from bad to worse. The application developers had changed their processes so they could move fast and make changes, but they didn't have the right tools to properly manage the database themselves. Nevertheless, they had modernized their processes to have continuous deployment, and they were making changes to the database to implement their new features. They were supposed to have figured out a way to work that brought them together. Instead, they found themselves entrenched in their old divisions and facing new problems apart.

13:55 — Craig Kerstiens
It was a multi-hour outage that, because of the processes they had defined, they just couldn't work around those processes to fix the issue. The fix was a less than 5-minute fix. It was that fast. So you've got to evolve who can work with the database and how, otherwise, you're not necessarily better off.

14:15 — Jamie Parker
Ah, if only people would just talk to each other. We've heard how there aren't enough database administrators to keep up with the workload. You'd think things would be difficult enough with the shortage of database expertise, but not even making use of the data team you do have, that seems like a tragic waste, especially when they have the expertise to quickly fix an extended outage. This was an issue with the DBAs and the rest of the organization not communicating and working together. But this isn't just about data. It's easy to imagine how similar situations can happen with other teams who end up isolated. It could be something with a network or security or any number of other specialized roles.

15:02 — Jamie Parker
If you're going to undertake new projects, your teams need to be in sync and able to work together. Easier said than done, right? Culture and the processes that form it, that can be the hardest part of digital transformation. After the break, Craig walks us through some database-related factors to consider when planning your digital transformation. Earlier in the episode, Craig shared that one of the most common questions he gets is about what to do with legacy databases. His answer wasn't to move everything to Postgres. It was to keep using what's working while you can, but you are going to need a newer database. Focus on figuring out your new data needs by setting up a database for your new greenfield projects.

15:57 — Craig Kerstiens
Modernizing legacy systems is a different focus versus, "Hey, how do we get our organization to move faster and think differently?" And there's new greenfield projects that happen every month, every quarter, every year, focusing on those. How do you start with those on a fresh data stack, a fresh lens, and then pull forward those legacy systems bit by bit is the thing that I see. Most organizations, they start on the one side, and they end up, "All right, let's leave the legacy as the legacy."

16:28 — Jamie Parker
Instead of tackling a complex data migration project, you should focus on figuring out your data needs for the new project your organization is tackling. At this point, the tech world is your oyster. You can architect your application based on what it needs. With microservices, a lot of organizations choose to use specialized databases for each component. This can yield performant systems. Craig argues doing so also increases the complexity of your data management strategy and of your overall application.

17:02 — Craig Kerstiens
You start as an organization and build a basic app, and you say, "Now we need to add some geospatial." So you add another data system, and then you need to add search. So you add Elastic. And then, you need to add AI because that's— clearly, every company is running after AI right now, so you need an AI database, and the business needs a data warehouse to get insights.

17:23 — Craig Kerstiens
Now, how are you getting that data in and out of all these systems? Well, now you've got a change data capture pipeline and maybe some modifications on the data as it's flowing around. You add in some extra ETL tools. I wasn't keeping count. Are we at almost 10 data systems now?

17:41 — Jamie Parker
Microservice-based systems have a lot of parts. You'll likely need a large team to maintain all of those data systems. Craig advocates for using a single relational database to support your applications, even if it's not completely optimized for each component, and for him, there's another advantage to consider.

18:00 — Craig Kerstiens
It's really fascinating how the database fits into the digital transformation story, because you can go and pick a shiny new database. The reality is you don't know what's going to break. You don't know what's going to go wrong. You don't know how it's going to scale, and you're going to find as many problems with a lot of the shiny new as anything else.

18:21 — Jamie Parker
With newness comes uncertainty. Things are going to go right. They might go wrong. It's hard to know with a new database since their exposure to production environments is still being tested and issues are still being found. That's a trade-off many have been willing to make. Craig argues there's a lot of value in a well-vetted database.

18:41 — Craig Kerstiens
You don't have to look far to say, "Okay, if it takes 10 years for a database to mature, all those shiny new databases aren't mature, right?" And so while you think you maybe have modernized to a new shiny database that solves all these problems, it's in its infancy.

18:56 — Craig Kerstiens
And so, now, instead of focusing on building features and functionality for your users, you're back to maintaining infrastructure, which is not where any of us want to be. And I want my database boring. It's kind of like security. When my bank—

19:12 — Craig Kerstiens
I get a special urgent letter from my bank. It's not good news.

19:20 — Jamie Parker
So, sometimes, the new database provides features you can't find anywhere else, but sometimes, if something goes wrong, you're left holding the bag. Instead of just taking advantage of the efficiencies of a specialized database, you end up counteracting them with all the maintenance and bug fixing you're now responsible for.

19:41 — Jamie Parker
Maybe you'll be able to handle any problems that come your way. Maybe the better choice is to use a more established database that may not have all the features but gets pretty close. Only you and your team can make that decision, but it's one you should make together.

19:57 — Craig Kerstiens
You don't need to force, I think, these things on your developers. For digital transformation journeys, a lot of your developers are driving this. If a developer doesn't want to work with Postgres, I don't know that you should force them to or any other database. Give your developers the tools they want to work with.

20:15 — Craig Kerstiens
Include them in that process. If you mandate something, you're not going to get the buy-in from developers. The developers are opinionated, fickle things. And I say that having been a developer, with many friends that are developers, and so give them the tools that they want to work with. Don't force something in there.

20:36 — Jamie Parker
Listen, it may take a while to get to a decision. Let the arguments run their course so everyone is heard. Once you come to a consensus, support that decision to its fullest. If there's anything more tragic than further siloing teams, it's making a deliberate decision and then holding back on using the features you signed up for in the first place. Craig has seen this happen when organizations fear investing too much into developing their database, just in case they move away from it later.

21:07 — Craig Kerstiens
You don't say, "We're going to use the least boring features of this language just in case we want to migrate this whole stack." And we had this attitude about that with databases for a while, and to me, it's just crazy because then we talked about it with legacy stuff. You're not going to up and migrate this tomorrow, and if it's working, let it keep working, and let's go faster on this other stuff.

21:30 — Jamie Parker
There's no point in making an informed decision based on a database's features only to then not actually take advantage of those features. Craig says to embrace the tools you're picking and use them to their full extent. Data has earned a reputation for being difficult to migrate. If you're undergoing digital transformation, it's likely going to need to happen at some point, but that doesn't mean you need to move off of your legacy database right away.

22:00 — Jamie Parker
With the shortage of talent and the sharing of responsibilities across teams, it's important to focus on inclusion, communication, and figuring out what your data needs are as a unified team. Craig presented his case for choosing a relational database like Postgres. While it's not the newest tool in the shed, he argues it's a trusted and reliable option that can avoid complexity. What then remains is for your team to figure out if they agree on whether or not that's what you need. You can learn more at redhat.com/codecommentspodcast or visit redhat.com to find our guides to digital transformation.

22:41 — Jamie Parker
Many thanks to Craig Kerstiens for being our guest. Thank you for joining us. This episode was produced by Johan Philippine, Kim Huang, Caroline Creaghead, and Brent Simoneaux. Our audio engineer is Christian Prohom. The audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Nick Burns, Aaron Williamson, Karen King, Jared Oates, Rachel Ertel, Carrie da Silva, Mira Cyril, Ocean Matthews, Paige Stroud, Alex Traboulsi, Boo Boo Howse, and Victoria Lawton. I'm Jamie Parker, and this has been Code Comments, an original podcast from Red Hat.

If you try to continue to do that the old way, especially on the database side of things, it'll break. For everything you gain with modernizing some of your processes, you're going to still be held up in so many ways by not giving up some of that ownership of the database and how people work with it.

Craig Kerstiens
