Anatomy of a systems failure: Danske Bank turns spotlight on DB2
03 April 2003
Danske Bank has issued a statement outlining how a series of never-before-identified errors in IBM's DB2 software caused a week-long disruption to payments systems and the trading and settlement of currencies and securities for customers.
The problems stemmed from the routine replacement, by IBM engineers at a Danske Bank operations centre, of a defective electrical unit in an RVA disk system used for storage of data in the DB2 database. During the repair process, there was an electrical outage in the disk system which sparked a critical operational situation with wide-ranging effects, says the Danish bank.
The RVA disk and the business data stored on it were not accessible to the bank's core processing systems after the breakdown. This meant that the part of the systems running at the bank's operating centre in Ejby (currency and securities trading, as well as foreign systems and payments) was not able to function. Operations continued for the systems running at the operating centre in Brabrand, which runs the bank's cash dispensers and self-service systems.
When the problematic RVA disk unit was given the all-clear later that day, the bank's systems in Ejby were restarted as usual. However, by early Tuesday morning it had become apparent that the first batch runs out of Ejby contained data inconsistencies.
"This first software error in DB2 database software had existed in all similar installations since 1997, without IBM's knowledge," says the bank.
In the following process of data recovery, which lasted from the morning of Tuesday, 11 March, until Friday, 14 March, three more errors were discovered in the software, affecting DB2 tables and recovery jobs and resulting in new episodes of inconsistent data.
In order to avoid longer delays, Danske decided not to wait for an IBM patch, but instead rebuilt the database using back-up data stored at the operating centre in Brabrand, which itself was beginning to suffer from the data deficiencies at Ejby.
A week after the initial fault was discovered, all data from the problematic disk had been transferred to other systems, the disk had been taken out of operation, and the bank had settled all accumulated transactions with external counterparties.
"None of the errors in DB2 were previously known to IBM, and therefore no patches could immediately be provided," says the bank.
Patches are now available for all users of DB2 database software and have since been installed by Danske, which is additionally seeking compensation from the supplier for the problems caused.
The bank says it is also discussing with IBM measures to improve DB2 as well as more extensive and efficient emergency procedures in case of future critical events.
Internally, the bank is considering future investment in its two-centre operations to protect some of the group's vital IT systems, but it rejects press reports that the implementation of mirrored disks at the centres would have prevented the failures.
Responding to customer feedback calling for more open lines of communication in future emergencies, Danske says it has rewritten its contingency plans.