Andy MacLellan, Head of Cloud, Onyx Group
How long does your data need to be backed up for? It’s a question most businesses take for granted, working on the assumption their IT department have taken care of things. In the real world, backups are the last job to sort out in a system deployment
project and will typically involve using default settings in the backup software itself. Technical staff are rarely told what’s required for compliance and regulatory purposes and, to be fair, it’s not their job to know. So when was the last time you checked
with them to make sure that your understanding of what important data is matches theirs?
YOU have a problem!
If you’re not even thinking about this, you’re setting yourself up for a fall. A ‘head in the sand’ approach won’t help when you actually need to restore data from backup but can’t. And here’s the news – backups have moved on since tape and, if you’re still
relying on old fashioned rotation schemes, you’re already well behind your competitors who are using online backups with automated off-siting of data and advanced retention policies. “That’s up to them - I’m backing up to tape, I’m covered, it’s not a problem.”
WRONG! To understand exactly why you do have a problem let’s get back to basics.
Back in the day, before everything was electronic, you would follow best practice without even thinking about it. You’d write a letter, you’d take a copy before sending it and there’s your backup. You would then change the letter, take another copy of it,
there’s your backup – pretty simple. Then everything became electronic, so you would backup to tape because that’s all you had. Backups were run once every night and a well managed IT department would ensure some tapes were taken off site, month end tapes
were retained somewhere secure and so on.
So why are we backing up stuff in the first place? Well there are many reasons and the reasons vary from business to business. Here are some examples:
1. Accidental deletion of data
You’re bidding on an important contract and some buffoon deleted your final version of the bid document thinking it was something else.
2. Virus outbreak
It is not unheard of for a virus outbreak to be so bad that the only option is to restore your systems to a time before the virus existed.
3. Disaster recovery
What if your main file server has a disk failure and all data is lost? Or what if you have a flood or fire that destroys your data?
4. Theft or malicious deletion
In most companies it’s relatively easy for a disgruntled employee to cause catastrophic damage with their ‘delete’ key. This is a big problem if it occurred yesterday. It’s an even bigger problem if it happened a month ago and nobody noticed.
5. Litigation
This is probably the most important point and one rarely considered. Like it or not, we’re in a litigious society and if you don’t prepare for having appropriate evidence available should the situation arise, you’re probably going to lose.
Compliance and regulation are a bit of a red herring, since they generally advise you to protect against all of the above but put the ball firmly in your court to decide exactly how you’re going to do it.
So, coming back to the fundamental point of why we backup data: ultimately it’s to ensure the commercial or functional success of your organisation, whether that means helping you make money by restoring that contract after someone deleted it, or
stopping you from losing money through that lawsuit you had no evidence for.
I backup to tape – where’s the problem?
Well, it’s a start I suppose…but tape is very old technology. It’s cumbersome, expensive, a management nightmare, prone to failure and, above all, very limiting in terms of data retention. Let’s take an example:
You have a fairly well managed IT environment. Data on the systems is backed up every night to tape. At the end of each week, a tape is held for the period of one month, so there are five end of week tapes to cover every Friday in the month (assuming
the month has five Fridays). To give some added longevity there’s also a month end backup – so at the end of each month a different tape is used – this month’s would be labelled ‘October month end’. This tape is put to one side for one year (and hopefully
taken off site). Finally, at the end of each year, a year end backup is taken and kept for seven years.
So here’s a scenario. Let’s say on 2nd August a person called John created an important document and e-mailed it to a client. 2nd August this year was a Tuesday so it would have been backed up on the daily ‘Tuesday night’ tape. Towards the end of the month
John decides to have a bit of an e-mail clear-out and accidentally deletes this particular e-mail from his sent items. Come the end of the month the ‘August month end’ backup faithfully runs – but as this e-mail was already deleted it won’t be on the backup.
The only tapes it was on were the daily tapes and the week end tapes. The daily tapes have already been recycled but there are still four possible tapes you can get the data back from – the 5th, 12th, 19th or 26th. HOWEVER, John didn’t notice he’d deleted this
e-mail until 3rd October when the client claimed something had happened that John had clearly warned them about in this e-mail. By then all of the daily and weekly backups have been recycled – i.e. the week-end tapes now contain backups from September week-ends.
John searches and searches but it’s no use – this e-mail is gone. A lawsuit ensues and John can’t provide the evidence needed to win their case.
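To make the gap concrete, here is a rough Python sketch of the rotation scheme above. It is a simplified model, not how any particular backup product behaves. It assumes the year is 2011 (when 2nd August falls on a Tuesday), that John deleted the e-mail on 29th August, that Monday to Thursday tapes are overwritten a week later and that five Friday tapes are used in rotation. Year end tapes are left out as they make no difference to this scenario, and the function names are made up for illustration.

```python
from datetime import date, timedelta


def last_day_of_month(d):
    """Return True if d is the final day of its month."""
    return (d + timedelta(days=1)).month != d.month


def surviving_backups(today, history_days=400):
    """Backup nights whose tapes have not been overwritten as of `today`.

    Simplified model (assumptions, see text):
      - Monday to Thursday tapes are reused one week later
      - five Friday tapes are used in rotation
      - a month end tape is kept for one year
    """
    # Most recent night first; tonight's backup has not run yet.
    nights = [today - timedelta(days=n) for n in range(1, history_days + 1)]
    daily = [d for d in nights if d.weekday() < 4 and (today - d).days < 7]
    weekly = [d for d in nights if d.weekday() == 4][:5]
    monthly = [d for d in nights
               if last_day_of_month(d) and (today - d).days <= 365]
    return sorted(set(daily + weekly + monthly))


def tapes_holding(created, deleted, today):
    """Surviving backups taken while the item still existed."""
    return [d for d in surviving_backups(today) if created <= d < deleted]


created = date(2011, 8, 2)    # John sends the e-mail
deleted = date(2011, 8, 29)   # assumed date of the clear-out

early_sept = tapes_holding(created, deleted, date(2011, 9, 1))
early_oct = tapes_holding(created, deleted, date(2011, 10, 3))

print("Tapes holding the e-mail on 1st September:", early_sept)
print("Tapes holding the e-mail on 3rd October:", early_oct)
```

Run for 1st September, the sketch finds the four Friday tapes mentioned above (5th, 12th, 19th and 26th August). Run for 3rd October, it finds nothing: every tape that ever held the e-mail has been overwritten.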
The above is quite a comprehensive tape rotation scheme. Many companies don’t use anything as sophisticated as this. I’ve seen one company, who shall remain nameless, who just left a backup tape in their server for well over a year – backing up every night
to the same tape without it ever being taken out of the server or checked for consistency. Their server had a catastrophic disk failure and when they tried to restore data from backup it was no great surprise that the data on the tape was corrupt simply due to
wear and tear. The only option was to send their disks off to a specialist data recovery company – they got the data back, it cost upwards of £20,000 but the alternative was to shut up shop.
Here’s another scenario – you delete a file but this time it was on the end of month tape - the tape was faithfully taken off site and stored in a secure location. Unfortunately, you need that file back NOW. You have a customer who needs that quote
and if they don’t get it today they’re going elsewhere. By the time IT have got the tape back, re-indexed it and restored the file you needed, several days have passed – too late.
So is it the end of the line for tape? Yes. Or at least I seriously hope so. Tape has one thing going for it and that is that it can hold a relatively large volume of data. But even that’s not a good thing! Take, for example, these fundamental principles
of the Data Protection Act 1998:
- Personal data shall be adequate, relevant and not excessive in relation to the purpose or purposes for which they are processed;
- Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose or those purposes.
There’s a famous case relating to a large airline who stored all of their backup tapes in a warehouse – tens of thousands of tapes. The company was presented with a class action suit alleging securities fraud. When the plaintiff’s attorney learned of the
e-mail backup tapes they naturally demanded the tapes. The company were unable to tell whose e-mails were on which tapes without restoring the data first. They had no option but to restore the data from every single tape. This was further complicated by the
fact that they used several e-mail and tape backup systems throughout the world. It was a mammoth and costly task. They retained far more information than was needed and retention was disorganised. They settled for $92.5m in the end.
Tape really is old technology and whatever online or disk-to-disk platform you use, it has to be an improvement. The important thing to remember is that, generally speaking, tape just gives you a snapshot of what your data looked like at the point that backup
ran. So even with a pretty comprehensive 21 tape rotation scheme, best case you’re only going to be able to roll back to 21 out of 365 days in a year. I’ll not get into the benefits of running differentials and incrementals to tape as the management overhead
of this is prohibitive for most companies. So over a year that’s about a six per cent chance of being able to roll back to any one particular day and over seven years (assuming year end tapes are retained) just one per cent.
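The arithmetic behind those figures is easy to check. The sketch below assumes the 21 tapes break down as four daily, five weekly and twelve month end tapes, and that over seven years you also hold one year end tape for each of the six earlier years. Both are one reading of the scheme rather than fixed rules, so treat the numbers as indicative.

```python
# Assumed breakdown of the 21 tape rotation (not stated explicitly in the text).
DAILY, WEEKLY, MONTHLY = 4, 5, 12
tapes_in_rotation = DAILY + WEEKLY + MONTHLY

# Each tape is a snapshot of one night, so it covers one day out of 365.
one_year_pct = 100 * tapes_in_rotation / 365

# Over seven years, add one year end tape per earlier year (assumption).
earlier_year_end_tapes = 6
seven_year_pct = 100 * (tapes_in_rotation + earlier_year_end_tapes) / (7 * 365)

print(f"Days you can roll back to in one year: {one_year_pct:.1f}%")
print(f"Days you can roll back to over seven years: {seven_year_pct:.1f}%")
```

Rounded to whole numbers, that gives about six per cent for one year and about one per cent over seven.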
The Pros and Cons of Online Backup
The big advantage of disk-based or online backups is that you can backup at any time. You’re no longer reliant on someone putting the right tape in for any given day. You can backup as often or infrequently as you like. Backups are still normally run at
night, to minimise network load during production hours, but you could backup every minute if you wanted. Or even better, the second a file changes, ensure it’s backed up.
I’m really trying to avoid the whole sales pitch for online backup here, but I’ve worked with the Asigra product for a long time through various employers and it’s genuinely a fantastic product. I first started using it in a pilot study for a large insurance
company back in 2001. We put the product through its paces over a six month period throwing all types of data at it. We tested for everything from speed to data integrity and robustness - it passed with flying colours. For the purposes of this blog I’m going
to try to remain relatively unbiased.
Another big advantage of online backup is the ability to introduce retention policies to suit the type of data being backed up. The big problem is that the market doesn’t understand what they need to backup and for how long, never mind how retention policies
work. So this key point is often overlooked, and businesses make do with the default settings of the program.
But here’s the point. You need to forget everything you know about backups. Forget about tape rotations, forget about incrementals and differentials. Forget about finding the right tape to carry out a particular restore. Online backup works completely differently
and it’s important you understand the implications of this for your business.
For a start, after your first full backup everything is incremental, generally at disk block level. So if your 10MB Word document changes it’s only the additional 5KB or so of changes that gets sent up the line. Secondly, data retention is on a time AND/OR
generational basis. So you can say “I want every generation of this data kept for five years, and this less critical data I just want the last three generations kept for one year – after that delete it from backup”. Finally, as touched on before, you can configure
your backups to run whenever you want. If a critical set of data has been produced you can call your IT department and say “See the folder on the S: drive called ‘Critical’ – can you back it up now please?” Couple of clicks and it’s done. Oh, and of course
it’s all encrypted to certified standards way beyond those of most tape systems.
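For the curious, the block level idea can be sketched in a few lines of Python. This is a generic illustration and not how Asigra or any other product actually does it: the 4KB block size, the SHA-256 hashing and the function names are all assumptions, and fixed-size blocks are a simplification of what real products do.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration


def block_hashes(data):
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]


def changed_blocks(previous_hashes, data):
    """Return {block_index: block_bytes} for blocks that are new or differ."""
    changed = {}
    for index, digest in enumerate(block_hashes(data)):
        if index >= len(previous_hashes) or previous_hashes[index] != digest:
            start = index * BLOCK_SIZE
            changed[index] = data[start:start + BLOCK_SIZE]
    return changed


# A 10MB "document" gets its first full backup...
original = bytes(10 * 1024 * 1024)
stored_hashes = block_hashes(original)

# ...then 5KB of it changes, starting at block 100.
edited = bytearray(original)
offset = 100 * BLOCK_SIZE
edited[offset:offset + 5 * 1024] = b"x" * (5 * 1024)

delta = changed_blocks(stored_hashes, bytes(edited))
bytes_sent = sum(len(block) for block in delta.values())
print(f"Blocks to send: {sorted(delta)} ({bytes_sent} bytes of {len(edited)})")
```

In this example only two blocks (8KB) go up the line instead of the whole 10MB file.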
So from a compliance standpoint all of the limitations of tape have been removed. You can be as compliant or non-compliant as you like. Backup every second, hour, day, week. Create different backup sets that backup at different times. You don’t even need
to worry about backing up the same data twice as the system looks after deduplication automatically (although that’s a topic in itself for another time!).
Sounds complicated – why bother?
Well that’s the million dollar question. Just remind yourself of point five in the reasons to backup mentioned earlier. Whoever is most prepared wins and the good news is that you can be as prepared as you like. The bad news is that you need to take data backups
more seriously than ever and that means attributing a much more sensible cost to protection of your data. Of course I’m going to say that, we sell online backup. But this isn’t a sales pitch. If you can’t restore critical data and your competitor can you’ve
potentially got a big problem on your hands. Up until quite recently companies spent the bare minimum on putting backup systems in place. As an extreme example, we’ve seen organisations spending £10,000 on a new IT system and then backing it up to a USB memory
stick - crazy. But businesses are catching on quickly. In the age of Big Data and reliance on electronic systems for every aspect of your business, having sensible, well thought out data backup and retention policies is essential.
What can I do?
First of all set aside a reasonable budget for data backup – at least 25 per cent, and up to 50 per cent, of the overall production system value. This is a reasonable assumption – remember you’re probably going to need to store much more information on backup than on live systems.
Secondly, devise a sensible data retention policy. Only you can decide on an appropriate retention policy for your business. If you’re not sure, escalate the question to someone who can make this decision. Remember to think outside the box – don’t just mimic
what you did on tape. Here’s an example data backup retention scheme:
- Operating system / application files and non-critical user data: Keep the last three generations;
- Critical user data: Keep the last five generations and also one generation every three months for the last year;
- Move all files older than three months to archive storage;
- Keep all deleted data for one year;
- Backup critical data every three hours; and
- Backup non-critical data every week.
Remember this is just an example to show the flexibility now available – you need to come up with an appropriate plan that meets any regulatory or compliance needs for your organisation.
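If it helps to see how a time and/or generation rule plays out, here is a small Python sketch. It is illustrative only: the RetentionPolicy class, its field names and the sample dates are invented for this example, and real backup software will express the same idea through its own settings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RetentionPolicy:
    keep_generations: Optional[int] = None   # keep only the newest N generations
    max_age: Optional[timedelta] = None      # drop anything older than this


def generations_to_keep(timestamps, policy, now):
    """Apply the age limit first, then the generation limit, newest first."""
    newest_first = sorted(timestamps, reverse=True)
    if policy.max_age is not None:
        newest_first = [t for t in newest_first if now - t <= policy.max_age]
    if policy.keep_generations is not None:
        newest_first = newest_first[:policy.keep_generations]
    return newest_first


now = datetime(2011, 10, 1)
generations = [            # made-up backup history for one file
    datetime(2011, 9, 30), datetime(2011, 9, 1), datetime(2011, 6, 1),
    datetime(2010, 12, 1), datetime(2010, 6, 1), datetime(2006, 1, 1),
]

# "Every generation of this data kept for five years"
critical = RetentionPolicy(max_age=timedelta(days=5 * 365))
# "Just the last three generations kept for one year"
less_critical = RetentionPolicy(keep_generations=3, max_age=timedelta(days=365))

kept_critical = generations_to_keep(generations, critical, now)
kept_less_critical = generations_to_keep(generations, less_critical, now)

print("Critical data keeps:", len(kept_critical), "generations")
print("Less critical data keeps:", [t.date().isoformat() for t in kept_less_critical])
```

With this sample history the critical policy keeps five of the six generations, dropping only the 2006 one, while the less critical policy keeps just the three most recent.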
Finally, let appropriate people know the retention plan you’ve elected to use. If possible ensure this information is passed to board level or at least senior management.
With some initial thought and investigation you can put together a self-managing data backup policy that keeps you covered for the vast majority of situations. Here are some final points to bear in mind:
- Don’t treat all data the same;
- Come up with folder structures that segregate critical data from non-critical, making it easy to apply an appropriate backup policy to each;
- Assess the value of your data and what would happen if you lost it;
- Plan for the worst case scenario; and
- Publish your plans internally.