
Re: [linux-lvm] Backup costs (was: LVM reimplementationre)

Petro wrote:

On Thu, Feb 07, 2002 at 11:34:08AM +0100, Jesus Manuel NAVARRO LOPEZ wrote:

Hi, Petro:
Petro wrote:

On Thu, Feb 07, 2002 at 10:00:45AM +0100, Jesus Manuel NAVARRO LOPEZ wrote:


Well... let's consider all aspects. I'm a sysadmin of the BOFH kind, so late in the evening I usually find myself a bit overloaded on beer. Especially on Fridays, if I have to stay at work past 5 PM, I have the irresistible temptation to go to the closet and piss*1 on the disk cabinet.

Good way to guarantee you won't have kids.

Fair enough. I *don't* have children... but I tend to consider my PFY a bastard of my bastardness; does that count?

That was intended to be humorous. Urinating on live electrical
components tends to be a shocking experience.

Yep. Me too. I have neither children nor a PFY, so it really doesn't matter.

For a backup policy you *must* keep the media apart from the on-line data to be protected. Having all your backup media in a single place is a *BAAAAAAAD* idea (TM).

Depends on your needs.

Yes, it depends on my needs: if I need to recover any data, having *all*
your backup media in a single place is a *BAAAAAAAD* idea (TM). If this

No, when you need to recover stuff, it's a *great* idea to have it in one place.

Well, but that is not what I said. I purposely bolded *all*, because that's the key. It's a great idea to have *some* backups at hand and grouped together (if only for the chance of your boss asking you to recover that porn... err... that paper he accidentally deleted). But not *all* your media. Obviously it's not very practical to have all my backup media in Dallas (I'm in Spain, so it's not cost-effective).

When you suffer an environmental calamity, it's a bad idea.

place is the same as (or near) the place where the production data is stored, it is simply unadjectivable.

Ever tried to push tens of gigabytes over a WAN?

Ever heard of "Never underestimate the bandwidth of a wagon full of tapes"?
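
To put rough numbers on the WAN-versus-tapes exchange above, here is a minimal Python sketch. The 50 GB data set, the 1.5 Mbit/s link, and the 2-hour drive are illustrative assumptions, not figures from this thread, and the function names are hypothetical.

```python
def wan_transfer_hours(gigabytes, link_mbit_per_s):
    """Hours needed to push `gigabytes` (decimal GB) over a link of the given speed."""
    megabits = gigabytes * 8 * 1000  # 1 GB = 8000 megabits in decimal units
    return megabits / link_mbit_per_s / 3600


def courier_mbit_per_s(gigabytes, trip_hours):
    """Effective bandwidth of physically carrying `gigabytes` in `trip_hours`."""
    megabits = gigabytes * 8 * 1000
    return megabits / (trip_hours * 3600)


# Assumed example values (not from the thread).
print(round(wan_transfer_hours(50, 1.5), 1))  # hours over the assumed link
print(round(courier_mbit_per_s(50, 2), 1))    # Mbit/s equivalent of the drive
```

Under these assumptions the drive wins on raw throughput, but it says nothing about latency: the first byte only arrives when the tapes do.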

Depends on your needs.
I have one database that changes fast enough that if it's 36 hours
old, we're basically just recovering it for the table structures.

In case of disaster, if your backup media is in the proximity of your production database (define "proximity" as needed), you won't be able to recover even the table structures. What I'm saying is one thing; deciding *what* to back up, and how, rather than where, is a *completely different* issue.

That wasn't my point.

My point was that for some stuff, 36 hour old data is useless, and

Then its *value* past 36 hours is... nil. You said (implicitly) that your database schema was of some value even if it was more than 36 hours old.

even a normal tape rotation schedule can put data out of reach for
10 or 12 hours minimum.

Yes, that's *potentially* true.

  I've got another one that changes fast enough that it's not worth
  backing up. If it's more than 2 hours old, starting from scratch is

So, the amount of change per unit of time is your key to deciding what you *need* to back up and what you don't?
That sounds *extremely* odd to me. I would say it should be *the value* of the material (this includes the cost of recreating it from scratch too, obviously), not its change rate.

Not the necessity of backing up, but the cost. If the data is
changing that fast, it could easily be that by the time it's on
tape, it's out of date and effectively useless.

Again: it's not its change rate but its *value*. The more it is worth, the more you can spend to "insure" it (part of your insurance policy is about *within what time frame*).

In the first case, off-site backups don't make sense, so we have 2
backup hosts (separated by about 10 feet currently, less in a day or
two) that get backups on an alternating (daily) basis.

Whether they make sense depends on their *cost*, not on the change rate.

No, it would be a lot cheaper to dump the dbs to tape, and carry the
tapes offsite, but (1) recovery time is almost tripled, and (2)

...and its *value* once recovered will be lower than having no data at all. Again, *if* you manage to find a method such that the value of the recovered data is higher than the cost of having that method in place, your job (if that's your job, of course) is to point it out and implement it.

And yes, we know the problems with this. It's a calculated risk. We
can't afford geographically separated facilities right now.

*Value* again. On that subject, I recently learned of a multinational company (so not just a one-site company) whose main office was in the Twin Towers. It would have been able to recover from the loss of its people (though *many* in upper management died), but it did not recover from the loss of its data and facilities.

  Again, your backup strategy depends on your needs, your budget, and
  your risk tolerance.

It depends only on your needs. Your needs can include not exceeding a certain budget, but it definitely has nothing to do with "risk tolerance". "Risk tolerance" is either a winning bet or a matter of misinformation.

It doesn't make sense to spend $10k for a backup solution for $20k of
data. It does make sense to spend $10k to back up $100k.

Plainly true... except for the last value: it makes sense to spend at most $X = $Y*p to back up data worth $Y, where p is the probability of losing that data (i.e. if the probability is 1, so you're certain to lose the data, you can spend up to $100k to insure it, within the time frame in which that data produces $100k in revenue).
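
The rule above is an expected-loss ceiling, and it can be sketched in a few lines of Python. The function names and the example probabilities are hypothetical, chosen only to illustrate the arithmetic; the $10k and $100k figures are the ones quoted in the thread.

```python
def max_backup_spend(data_value, loss_probability):
    """Upper bound on sensible backup spending: value of the data times
    the probability of losing it within the time frame considered."""
    if not 0.0 <= loss_probability <= 1.0:
        raise ValueError("loss_probability must be between 0 and 1")
    return data_value * loss_probability


def backup_is_worth_it(backup_cost, data_value, loss_probability):
    """True when the backup costs no more than the expected loss it covers."""
    return backup_cost <= max_backup_spend(data_value, loss_probability)


# $10k solution for $100k of data, certain loss (p = 1): worth it.
print(backup_is_worth_it(10_000, 100_000, 1.0))
# Same figures with an assumed 5% chance of loss: the ceiling drops below $10k.
print(backup_is_worth_it(10_000, 100_000, 0.05))
```

The hard part in practice is estimating p and the time frame, which this sketch simply takes as inputs.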

Of course. Your backup strategy surely depends... *on the data value* and only on this. From the very beginning I stated that for "home data"

No, it doesn't. It depends on several things:

(1) Value of data.

Plain data value

(2) Cost of downtime.

Data value too (in terms of lost revenue for the time the data is not accessible).

(3) Rate of change. (If your data set is completely worthless after
24 hours, but worth several million for the first hour, offsite
backups don't necessarily make sense etc.)

Data value too (in terms of how the value of the data evolves with time).
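
One way to picture "rate of change" as just another form of data value is to model the value of a copy as a function of its age. The sketch below assumes a linear decay to zero, which is purely an illustrative choice (nothing in the thread says the decay is linear), and all names and figures are hypothetical.

```python
def data_value_at(age_hours, initial_value, worthless_after_hours):
    """Value of a copy that is `age_hours` old, ASSUMING linear decay
    from `initial_value` down to zero at `worthless_after_hours`."""
    age_hours = max(age_hours, 0.0)
    if age_hours >= worthless_after_hours:
        return 0.0
    return initial_value * (1 - age_hours / worthless_after_hours)


def restore_is_worth_it(restore_cost, backup_age_hours, restore_hours,
                        initial_value, worthless_after_hours):
    """Compare the cost of restoring with what the data is still worth
    once the restore has finished."""
    age_when_usable = backup_age_hours + restore_hours
    return data_value_at(age_when_usable, initial_value,
                         worthless_after_hours) > restore_cost


# Assumed figures: data worth 1,000,000 when fresh, worthless after 24 hours.
print(restore_is_worth_it(10_000, 1, 2, 1_000_000, 24))    # fresh on-site copy
print(restore_is_worth_it(10_000, 12, 12, 1_000_000, 24))  # stale off-site copy
```

With a different decay curve the numbers change, but the comparison stays the same: recovered value against the cost of recovering it.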

And probably some more I haven't thought of.

Probably: and they will all be expressible in terms of data value, or will have no significance at all.
jesus_navarro promofinarsa es
