When Should Data Die?
We have a finite time on earth. But the data we generate? It can last much, much longer. We have limited control over what happens to our data during our lives. And while you might not care about anything when you’re gone, you and your loved ones might have an interest in how your information is used after you pass. So we wondered: When should data die?
It’s a tricky question. In the digital age, individuals generate mountains of data over their lifetimes. But who has the right to decide whether that data remains, or when it is deleted? How should IT organizations handle their datasets given the complexities of privacy, legacy, and ownership that they need to consider?
00:02 - Caroline Creaghead
Hello, Brent. Hello, Angela.
00:04 - Brent Simoneaux
00:07 - Angela Andrews
00:08 - Caroline Creaghead
You know me, but our listeners don't know me because I'm usually not on the microphone. So, hello, listeners.
00:14 - Brent Simoneaux
Caroline, you are usually directing us, producing us, making us sound way smarter than we actually are.
00:23 - Angela Andrews
She's the brains of this outfit.
00:25 - Caroline Creaghead
Hey, it takes a team.
00:27 - Angela Andrews
No, give yourself much more.
00:29 - Caroline Creaghead
Oh my gosh.
00:30 - Angela Andrews
We're happy to have you on.
00:32 - Caroline Creaghead
I'm happy to be here. I have a question that came up because, and I think maybe both of you know this, that I had someone close to me pass away late last year. And so in the wake of that, something came up that I thought might be an interesting conversation for us to have. I want to respect this person's privacy and I won't say everything that happened that sort of led me to this question, but basically it happened very quickly that they passed away, and not a lot of plans were made about what happens to their digital life. Do you guys know what happens after you die, to your data?
01:09 - Angela Andrews
No, not really. I don't know when our data actually expires. No pun intended.
01:17 - Caroline Creaghead
Yeah, totally. And how should it? When should it expire?
01:26 - Brent Simoneaux
This is Compiler, an original podcast from Red Hat.
01:30 - Angela Andrews
We're your hosts.
01:31 - Brent Simoneaux
I'm Brent Simoneaux.
01:33 - Angela Andrews
And I'm Angela Andrews.
01:35 - Brent Simoneaux
We're here to break down questions from the tech industry; big, small, and sometimes strange.
01:42 - Angela Andrews
Each episode, we go out in search of answers from Red Hatters and people they're connected to.
01:48 - Brent Simoneaux
Today's question: when should data die?
01:51 - Angela Andrews
Producer Caroline Creaghead is here with our story.
01:58 - Caroline Creaghead
So when we say data, that actually can mean all kinds of different things.
02:03 - Brent Simoneaux
I mean, the first thing that I think about is social media accounts, like my Instagram account. But it's not just social media, right?
02:13 - Caroline Creaghead
No. I mean, there's like also your email account.
02:17 - Angela Andrews
And your bank records, all of your financial transactions out there.
02:21 - Brent Simoneaux
02:22 - Caroline Creaghead
Totally. Yeah, there's all kinds of GPS records. Anything that we're leaving behind that we create in our digital lives. Some things, super-sensitive that we think about a lot, and some things that don't matter to us and we never think about. So there's the idea of what should happen to our data when we die and that we can all think about on a personal level, but also as people who work in IT it's interesting to think about it from that perspective, as people who are setting this kind of policy or who should be dealing with what does my data management practice look like with a view to people not being around anymore? Or even short of that, just once I'm finished with it? What's the appropriate time and way to get rid of data?
03:10 - Caroline Creaghead
So to get into this, I wanted to sort of go outside of tech first and I talked to Patrick Stokes.
03:18 - Patrick Stokes
I'm Associate Professor of Philosophy at Deakin University in Melbourne, Australia. And I'm a philosopher of death and personal identity among other things.
03:27 - Caroline Creaghead
Are you guys as excited as I was to hear from a philosopher of death?
03:31 - Angela Andrews
03:33 - Brent Simoneaux
You know I'm in.
03:34 - Caroline Creaghead
So I asked him, what does he think should happen to a person's data when they die.
03:39 - Patrick Stokes
We're defaulting to treating the digital assets of the dead or treating the digital traces of the dead as being essentially just a slightly exotic form of property. And therefore who gets to say what you do with that property or what can be done with that property would be the same as... Ultimately there’s an answer that can be found through just looking at property law and inheritance law and copyright law. I don't think that's going to be adequate though. I think that we need to start taking the idea of digital remains seriously.
04:10 - Brent Simoneaux
Help me out here Caroline, because he's saying that traditionally, we think of people's data as property. It's something that you own, or that someone else can own. But he's saying we need to think of them as remains. I think the difference there is like remains are like a part of you. So is he saying that your data is a part of you?
04:38 - Caroline Creaghead
04:42 - Caroline Creaghead
So we think about people as just who we see, like a person, like whatever is contained in your body is you, that's your identity. But he explores this idea of soft selves, meaning outside of the body, we are also our relationships. We are our creations. We are how we interact with the world as much as we are the physical body that exists in the world.
05:05 - Brent Simoneaux
So we are not just sacks of meat, we're not just skin bags.
05:15 - Caroline Creaghead
I love this conversation. Yes.
05:17 - Brent Simoneaux
We are our connections to other things. We're sort of our totality of this network that we've built. And that it also includes things like things that we've created and things like our data.
05:31 - Caroline Creaghead
You're absolutely right Brent, like this is something that we have to think about in terms of it being a part of you as a person. Here, he explains it a little better.
05:38 - Patrick Stokes
When your great aunt dies, you might inherit her house and you might inherit her car and you might inherit her pet tortoise, but you don't inherit her corpse.
05:49 - Brent Simoneaux
So many people are online now. Like so many people have digital lives in a way that hundreds of years ago, if you passed away, like nobody may know that you existed ever.
06:05 - Caroline Creaghead
Do we prefer that? Like should people just pass away completely? Or should we preserve as much as we can in order to analyze or just be able to pass on to generations into the future, this is what life was like back in 2022? If that's your prerogative, the more the merrier.
06:25 - Angela Andrews
So what you're saying is our footprint should remain so future generations can marvel at the amazing food pictures that I took on Instagram when I'm six feet under and I've been gone for 20 years. But why?
06:44 - Caroline Creaghead
Yeah. Great question. I think there's a lot of gray area and saying either like, yep, when someone dies, all of their data goes with them is probably not the model that we want. But then also on the other side, if we keep everything, there's not any reasonable expectation that we'd be able to deal with that much data in the future. There becomes a diminishing returns equation with the usefulness of literally everything sticking around forever. I want to bring in Patrick, as he sort of parses this a little bit.
07:19 - Patrick Stokes
I think there is actually a moral imperative to preserve the dead. So there needs to be some sort of serious discussion about how we do that, how we do it in a sustainable way and how we ensure access to it. But also who gets that kind of access, how we preserve the privacy rights of the dead and who foots the bill.
07:37 - Caroline Creaghead
That's a big question, right?
07:39 - Brent Simoneaux
07:40 - Angela Andrews
Who foots the bill? Like who's going to keep paying? Maybe I do want to keep my website up. Who wants to keep paying my hosting services or something?
07:49 - Caroline Creaghead
That's right. That's something that certainly gets flipped from when we're alive and can make those decisions. And when we're gone and can't. And it gets flipped when the data is owned by not us. So with social media companies, they're the ones who are paying for the server space, et cetera, to keep data around or they're paying for the cost to delete it. Here's Patrick again.
08:11 - Patrick Stokes
So in 2009, in his really fantastic book on forgetting and deleting, Viktor Mayer-Schönberger noted that the point had already been passed where the cost of digital storage media was less than the average wage, which meant that basically it was cheaper to buy more storage space than it is to hire someone to sit there, working out what to keep and what to delete. So the default then shifts from forgetting to remembering.
08:38 - Caroline Creaghead
A person's publicly available data persisting after it's relevant or after the person has any ability to control it... It gets dicey. And it can't just be up to the companies who store and own that data what the best time is for it to go based on their bottom line. Patrick says that we really need to be engaging with this notion of digital personhood and to adopt the stewardship approach rather than ownership when it comes to making these decisions about personal data.
09:08 - Patrick Stokes
And it's only once we've really engaged with that notion that I think we're going to be able to see what kinds of safeguards and duties and protections the dead are going to need if we're going to steward them through this environment in which they persist electronically, potentially forever. But probably more like for five years. Because as I say, data is really... I mean, it's something you can talk to Carl about too. I mean, Carl's done fantastic work on what happens if these tech giants go under?
09:38 - Brent Simoneaux
Wait, who's Carl?
09:40 - Caroline Creaghead
Great question. I'll let Carl introduce himself.
09:44 - Dr. Carl Öhman
My name is Carl Öhman. I'm a researcher at the Department of Government at Uppsala University, Sweden, where I'm mainly researching AI and political communication. But I also have a long-standing interest in digital preservation and digital human remains.
10:00 - Caroline Creaghead
Digital remains again. Can you see why he was brought into this conversation?
10:05 - Angela Andrews
Yes. It's a theme.
10:05 - Caroline Creaghead
Patrick brought him up in referring to this question of what happens when tech giants, we're talking about social media companies and anyone else who's a big repository of data. What happens when they go under? So I asked him what he had to say about that.
10:20 - Dr. Carl Öhman
Generally it's not the case in every country, but generally dead people have zero data protection rights. You can do whatever you want. You can sell them, you can disclose them and so on. So the big disaster that I'm sort of waiting for, the ticking bomb is that one of these tech giants will go bust. They will go bankrupt and there will be an insolvency administrator that comes in, starts selling off all the valuable assets. And what are the valuable assets in this case? Well, it's people's data. And I could think of several nefarious actors who would be interested in purchasing such data.
11:01 - Angela Andrews
11:02 - Brent Simoneaux
Well it's an interesting thought experiment. Like what does happen if one of these tech giants goes under?
11:09 - Caroline Creaghead
Part of why that's such a scary idea is because the way we understand it right now, like they are the keepers of that data. They are the owners of that data. And if they own it, rather than are stewards of it, then they can just sell it or that it's just an asset that gets liquidated in the event of a company going under.
11:29 - Brent Simoneaux
11:31 - Angela Andrews
Unfortunately data is very lucrative. The data of living people, dead people, all one and the same, but it does have some moral implications that someone is making money off of someone who has passed away. That to me, feels like it should be addressed. So maybe this is something that we should start talking about because you want to start thinking about, "what am I going to do with all of these sites and all of this information? Who am I going to put in charge of it? And what rights or responsibilities do the companies have? How easy are they going to make it for the people that I've left in charge of my digital remains?" I can't stop laughing when I say that, but how easy or difficult are they going to make it? So maybe we need to start… again, this conversation has to be had. Thank you, Caroline.
12:26 - Caroline Creaghead
Yeah. Not just by the companies. Exactly. Dr. Öhman agrees with you.
12:32 - Dr. Carl Öhman
When we decide "who do we delete and who do we preserve for posterity," that shouldn't solely be a question of whose data can we make money on. Because if we view it solely as an economic matter, we're going to end up with historical records that are extremely skewed in terms of whose data actually makes it to posterity and to future generations.
13:01 - Angela Andrews
Ooh. More ethical implications.
13:03 - Brent Simoneaux
Yeah. This is what I was just thinking about. I don't know how to say this delicately, but stewardship isn’t lucrative. Like if you were just thinking about money and if you were just thinking about profit as a lot of corporations tend to do, that's going to lead you to certain decisions about how data is remembered or forgotten.
13:33 - Caroline Creaghead
Absolutely. And that's going to lead to a skewed historical record.
13:38 - Angela Andrews
13:40 - Caroline Creaghead
Yeah. And that's something that actually Patrick, who we heard from first, he put it well.
13:45 - Patrick Stokes
The rich and the famous and the powerful were the ones who left behind the portraits and the writing and the physical memorials. If you go back and try and find your ancestors' graves, it helps if they had money, if you're going to actually find it.
13:57 - Caroline Creaghead
So if we're going to be thinking ethically about this, about what should remain and who should make those decisions, we want to try to think about influences outside of money. But just as you said, Brent, that's pushing a boulder uphill. It's not lucrative. So I wanted to get back to Carl and thinking about this problem of corporate interest, being the deciding factor in preserving or deleting data after the person that the data is tied to has died.
14:30 - Dr. Carl Öhman
So if you think about what the monetization of dead people's data really are about, it's a kind of monetization of their digital corpses, if you will. And if you run by that analogy, we really already have an industry that does that, which is archeological and medical museums. And so I've suggested in my research that perhaps an interesting ethical model to look at would be how do those museums manage their collections? How do they decide, as they include new artifacts and objects into their collections, what stays and what goes?
15:13 - Brent Simoneaux
I think sometimes when we are making decisions about what to do or how to proceed in the tech industry, these are technical discussions. It's about what is technically possible. And they're also business discussions as well. And so the disciplines that we draw on are sort of economic, business and technical. But what I hear him saying is that we should have more disciplines at the table when having these discussions. So people like religious leaders, people like philosophers.
15:52 - Angela Andrews
15:52 - Caroline Creaghead
15:53 - Brent Simoneaux
Ethicists too. Archivists.
15:55 - Caroline Creaghead
Yeah. Archivists, people who are thinking about history. Yeah, exactly. And I think we started by thinking about it as like, oh, what am I doing with my data? And what happens to my data when I die? But as Dr. Öhman says, we need to be thinking about it, not just in a personal way, but through this sort of multifaceted lens.
16:16 - Dr. Carl Öhman
Trying to think about it also as a collective matter, what will happen to our data when we die? What kind of society do you want to leave behind? What kind of society do you want to make possible? Because whomever inherits your data or whomever is going to be the custodian of your data will also wield the power of that archive. We should try to think about these matters, not only as personal matters, but as collective and political questions that we will need to solve together.
16:48 - Brent Simoneaux
It is interesting that we're calling it an archive.
16:51 - Caroline Creaghead
Yeah. We've sort of defaulted to that now. What would you suggest instead?
16:53 - Angela Andrews
16:54 - Brent Simoneaux
Well, I think generally, like we tend to think of them as data sets. Or we tend to think of them as like databases.
17:01 - Caroline Creaghead
Yeah. I guess for me, I'm thinking about it as an archive because of sort of where Dr. Öhman is suggesting that we look to for similar models.
17:10 - Brent Simoneaux
17:11 - Angela Andrews
17:12 - Caroline Creaghead
17:12 - Dr. Carl Öhman
It would be really interesting to also see collaborations, for instance, with national archives. And NGOs, such as the Internet Archive. In 2010, Twitter actually decided to donate their entire archive of tweets to the Library of Congress, which I think is a really interesting initiative. And I would love to see more such initiatives of tech companies realizing what massive resources they actually have, and trying to actively share that with experts who may have different kinds of knowhow, but also different perspectives on what kind of data would be valuable to preserve for posterity.
18:00 - Angela Andrews
But again, that doesn't take into account the needs or the wants of that digital person, should they pass, and the people who are responsible for them. It doesn't take that into account. It takes the more historical approach where Twitter has actually donated my dog's tweets. Like if you think about it that way, to the Library of Congress, like who's to say that someone wants their dog's tweets to be sent to the Library of Congress. I'm making jokes, but you see what I'm saying? Like the way he's speaking about it, it takes it away from the individual.
18:38 - Caroline Creaghead
Yeah. But that doesn't mean that that individual perspective doesn't exist. And in fact, yeah, it is just like an illustration of how this is all so very complicated in terms of weighing the interests, right?
18:50 - Brent Simoneaux
18:52 - Caroline Creaghead
So when we're talking about how companies think about data and bringing that perspective into the conversation, I went ahead and talked to somebody who has a lot of experience with data management.
19:05 - Jamie Steele
My name is Jamie Steele. I'm a data consultant in the advisory services team here at Version 1. And I've got a long history of data management. I've worked in the healthcare industry, logistics, local and central government exhibitions and financial services. I've hoarded data, I've purged and deleted data, I've protected data, and I've monetized data.
19:30 - Caroline Creaghead
Kind of covers all our bases here.
19:31 - Angela Andrews
Right. He's done all the things with the data.
19:35 - Brent Simoneaux
To be clear, like, Version 1 is the name of the company he works for.
19:39 - Caroline Creaghead
Yeah. That's right. Version 1 works with a lot of public sector, private sector clients. But yeah, his particular job is to consult on the data policies that their customers have. And it's, as you might imagine from our discussion so far, not so straightforward. I wanted to get his take on this concept of stewardship that we've been talking about.
20:03 - Jamie Steele
We, at Version 1, we talk of data maturity. That is the level of sophistication and understanding of the data management responsibilities that an organization has. And in our business, you'll hear about the Data Management Body of Knowledge framework from the DAMA, and this covers 11 subject matter domains. And the ones I've called out in particular relating to our conversation are data governance, master data management, data security, and data modeling.
20:32 - Brent Simoneaux
Okay. Break that down for us, Caroline.
20:34 - Angela Andrews
Yeah, that was a lot right there.
20:38 - Caroline Creaghead
He's saying that it's not a one size fits all kind of approach to data management. And in fact, you can have very complicated trees of decision making in terms of any particular data point. How does it get handled? Maybe some need to expire sooner than others. Maybe some are a security risk where others aren't. And so how a company is set up in terms of handling that data ranges in sophistication. So what Version 1 does is Jamie and his colleagues will work with an organization to try to get them from where they might be failing there or where they're not integrating all of these different inputs in how they're managing data, and help them to get to a more sophisticated place.
21:19 - Caroline Creaghead
The Data Management Body of Knowledge framework is a bunch of experts from these different disciplines have come together and determined here are the ways that... These are like best practices in terms of data management. And it's broken down by, as Jamie said, these 11 different knowledge areas. So when this comes into play or why this comes into play is because there's this phenomenon of where companies will push forward without thinking about the data life cycle, in terms of all of these different factors they need to consider, up until it comes up. So it kind of goes like this:
21:54 - Jamie Steele
The minimum viable product is released. The application is operational and it's earning revenue. And the regulations give us a two year timeframe, say. So we'll revisit that problem later. And of course those discussions get kicked further and further down the road.
22:12 - Brent Simoneaux
We are really good at avoiding these problems.
22:15 - Jamie Steele
At some point down the line, you are going to have to come back and look at the data that you have in your organization, the asset that you are maintaining, and you're going to have to assess whether you should still have that data and that's going to be regulatory related, it's going to be enforcement related, maybe moral. But you're going to have to come to a decision about what you do to purge, remove, delete, or depending on the circumstance, retain and archive that information. And what companies don't want to get themselves in the position of is having to do that in an ad hoc way.
23:00 - Caroline Creaghead
Having someone like Version 1 come in and do an audit based on this Data Management Body of Knowledge, that helps to see where the blind spots may be. Because as we know, most organizations will just keep moving forward at the pace that best serves the organization. And data management is part of those sort of unwieldy systems that need to be baked into the operations of a company. And it's maybe not the most cost effective thing to have to slow down and integrate fully. But it's very important to do that.
23:34 - Brent Simoneaux
So Caroline, let me throw this to you. What is the most important thing that you learned from Jamie?
23:42 - Caroline Creaghead
I mean, I was delighted to hear that there is a model that exists for assessing whether you're thinking about handling data in the appropriate way as an organization and that there's resources and best practices that you can look to. But it's really complicated. There could be contradictions, exactly, and interdependencies and so a particular data point or data set needs to sort of be filtered through a lot of these different perspectives in order to assess the way to handle it. And so if we think about that from an operation standpoint, that's a lot of work. And so what Jamie said is that this is definitely something where automation helps. However, it's not so straightforward.
24:32 - Jamie Steele
The process to remove, retain, and archive that information in an automated way, without human assessment and confirmation is not really appropriate for a third party tool. It's hard to define 100% consistent and appropriate rules for every scenario to drive such automation.
24:55 - Caroline Creaghead
Automation makes it possible, makes it efficient, but you still need to have experts and real people looking over these automated systems and handling it. It's hard to automate stewardship, I guess I should say.
25:09 - Angela Andrews
It is. But with any type of automation tool, you have to tune it so it actually recognizes what you're trying to catch. So it might take a lot of fine tuning, but someone (there's a human being involved) that's going to have to make those judgment calls.
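To make the point about automation and human judgment concrete, here is a minimal sketch of a retention check that only proposes an action and routes uncertain cases to a person. Everything in it is an assumption for illustration: the categories, the retention periods, and the names `Record`, `RETENTION_DAYS`, `propose_action`, and `build_review_queue` are hypothetical and were not described by the guests.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per data category, in days.
# Real values would come from regulation and organizational policy.
RETENTION_DAYS = {
    "financial": 7 * 365,
    "gps": 90,
    "social_post": 5 * 365,
}


@dataclass
class Record:
    record_id: str
    category: str
    created: date
    sensitive: bool = False


def propose_action(record: Record, today: date) -> str:
    """Propose what to do with a record; nothing is deleted automatically."""
    limit = RETENTION_DAYS.get(record.category)
    if limit is None:
        # No rule covers this category, so a person has to decide.
        return "needs_human_review"
    expired = today - record.created > timedelta(days=limit)
    if not expired:
        return "retain"
    # Expired but sensitive data is escalated instead of queued for deletion.
    return "needs_human_review" if record.sensitive else "propose_delete"


def build_review_queue(records, today):
    """Collect every record whose proposed action is something other than retain."""
    return [r for r in records if propose_action(r, today) != "retain"]
```

The design choice worth noting is that the automated step never removes anything itself. It narrows the set of records a human steward has to look at, and that list still needs tuning and review by someone making the judgment calls.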
25:26 - Caroline Creaghead
But again, as I said, it is kind of a relief to hear that we're not as rudderless as we may have thought from hearing from our previous two guests about industry barreling forward and leaving these considerations in the dust. There is this framework that we can reference.
25:44 - Jamie Steele
And what it allows people like ourselves to do is focus in on the areas that require improvement and leaping from a score of one to a score of five, well, that's not a trivial thing to do. That's probably six months minimum, but more likely, that's years of effort required to change the situation within that organization. Because it's not just improving the data itself, it's improving the processes and the handling and the governance. It's a big transformation in some cases.
26:21 - Brent Simoneaux
So let's return to our original question. When should data die? I think the answer is it depends.
26:34 - Angela Andrews
That's the answer.
26:36 - Brent Simoneaux
But is that correct?
26:39 - Caroline Creaghead
I mean, that's what my big takeaway is here. There are a lot of different perspectives to take into consideration. And that there's a lot of different factors that are considered, but the practice of integrating and assessing all of these different perspectives or even enough different perspectives is something that is difficult to do. And as you said Brent, that doesn't really help the bottom line necessarily in a short term. If we're looking at it from a short term perspective, it's tough to do, it's tough to make that a priority. But I think if we look at it in a more humanistic way of that data means something about us, that it is part of us and that we have a custodial responsibility and that we should think of what we do with data as stewards of that data, that really takes it out of this framework of thinking about it as just stuff that we need until we don't. And it is something that we have to really take into consideration and make sure that we are treating with a degree of maturity when we are running an organization that deals with people's data.
27:49 - Brent Simoneaux
Angela, I'm curious what you are thinking from an IT practitioner's perspective. How are you reacting to all this? And does this affect, sort of, your day to day life in any way?
28:03 - Angela Andrews
My previous life, yes. I was responsible for data, file servers, mail servers. That was a huge part of my job, how long we kept things. And after listening to our three guests, I've come to realize that the nature of the data really matters. How sensitive is it? And of course, when we're dealing with personally identifiable information, what laws govern that data? I think all of those things really need to be considered when we're talking about how this digital person's remains are handled. I think all of that needs to be taken into consideration. I mean, these companies are responsible for protecting and holding onto this data. They're even footing the bill for maintaining it. But honestly, they're also making money off of that data.
29:00 - Caroline Creaghead
Yeah, that's right, Angela. You're talking about the business side of it. And obviously we talked a lot in this episode about the ethical side of it. And I think here is where the technology itself actually enables us through automation to take this multifaceted look at each point of data and to decide what should happen with it. So the technology, though it feels more distant from the human part of this, actually is what allows us to take a more humane approach.
29:32 - Brent Simoneaux
Yeah. It's like automation can help us be better stewards of data at scale.
29:40 - Caroline Creaghead
That's right. Stewardship at scale, you heard it here first.
29:43 - Brent Simoneaux
There you go.
29:49 - Angela Andrews
And that does it for this episode of Compiler.
29:53 - Brent Simoneaux
Today's episode was produced by the one and only Caroline Creaghead. Victoria Lawton is our trusty steward.
30:01 - Angela Andrews
Our audio engineer is Elisabeth Hart. Special thanks to Shawn Cole. Our theme song was composed by Mary Ancheta.
30:10 - Brent Simoneaux
Big thank you to our guests: Patrick Stokes, Dr. Carl Öhman and Jamie Steele. And we also want to extend a big thanks to friend of the show, Varvara Tishkova.
30:21 - Angela Andrews
Our audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Laura Barnes, Claire Allison, Nick Burns, Aaron Williamson, Karen King, Boo Boo Howse, Rachel Ertel, Mike Compton, Ocean Matthews, and Laura Walters.
30:38 - Brent Simoneaux
If you liked today's episode, please follow us. Rate the show and even leave us a review. It really does help the show.
30:47 - Angela Andrews
Thanks for listening.
30:48 - Brent Simoneaux
All right, we'll see you next time.