July 17 2024 marks six months until January 17 2025, when the EU’s Digital Operational Resilience Act (DORA) begins to apply. DORA aims to improve digital resilience across 21 types of financial entity and includes more stringent rules around ICT security and risk. With the scale of security risks and dependency on software systems only increasing, DORA’s introduction is timely.
Achieving compliance with DORA will require addressing multiple aspects across financial services organisations. According to the Uptime Institute’s 2023 annual outage analysis, which examines the causes of IT and data centre outages at organisations around the globe, “IT systems (hardware, software)” was the third-leading classification of outages. Within this classification, the report shows that the two most common causes were “configuration/change management issue” (64%) and “firmware/software fault” (40%). Furthermore, 65% of respondents said that “software or configuration error” was a top-three cause of third-party outages.
The combination of configuration and change management issues with firmware and software faults highlights the need for better software development processes, tools and skills. Software development is not just about releasing software faster and more efficiently; when done well, it helps an organisation prevent, respond to and resolve system outages. More modern enterprises use techniques such as DevOps, Continuous Integration/Continuous Delivery (CI/CD), and Site Reliability Engineering (SRE) to not only deliver great software features and applications, but also get out of trouble faster.
Best practice techniques
Some of the areas in which best practices in software development lead to greater resilience are:
Expanding test coverage and reducing testing times - including the use of higher-quality and higher-volume test data sets and the use of automated parallel testing. This leads to fewer defects making their way into production environments.
Executing static code security analysis as an integral part of development pipelines - this reduces the exposure to security vulnerabilities of working software.
Accelerating releases by automating and integrating laborious or time-consuming development tasks - this reduces the amount of time that defects and vulnerabilities remain in place and helps accelerate the resolution of outages and the response to zero-day vulnerabilities.
Controlling and automating the provisioning, maintenance and monitoring (“drift detection”) of infrastructure for development, testing and production environments - ensuring that the configurations used during development and testing remain the same as those in production, and that all are maintained in compliance with policies, reduces the likelihood of configuration-related outages.
Automating the delivery of production data to developers, testers, and data scientists - both during development pipelines and incident resolution processes. As long as enterprises address data privacy and compliance risks, basing test data on production data means that developers and support staff can more easily replicate and resolve difficult issues.
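To make the “drift detection” item above more concrete, here is a minimal sketch in Python of what such a check could look like. It assumes each environment's configuration can be exported as a nested dictionary; the function names and the sample settings are hypothetical and invented purely for illustration, and real infrastructure tooling does considerably more than this.

```python
from typing import Any

# Placeholder reported when a setting exists in only one environment.
ABSENT = "<absent>"


def flatten(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested settings into dotted keys, e.g. {"db": {"port": 1}} -> {"db.port": 1}."""
    flat: dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def detect_drift(baseline: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return every setting whose value differs, as {key: (baseline_value, actual_value)}."""
    expected, observed = flatten(baseline), flatten(actual)
    return {
        key: (expected.get(key, ABSENT), observed.get(key, ABSENT))
        for key in sorted(expected.keys() | observed.keys())
        if expected.get(key, ABSENT) != observed.get(key, ABSENT)
    }


# Hypothetical settings for two environments (illustrative values only).
testing = {"tls": {"min_version": "1.2"}, "db": {"pool_size": 20}}
production = {"tls": {"min_version": "1.0"}, "db": {"pool_size": 20}, "debug": True}

for key, (want, got) in detect_drift(testing, production).items():
    print(f"DRIFT {key}: expected {want!r}, found {got!r}")
```

Run on a schedule or as a pipeline step, a check along these lines could fail a build or raise an alert whenever production no longer matches the configuration that was tested.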
There are also cultural elements and principles that underpin these modern practices. First, there is the culture of “continuous”: not just with CI/CD but also continuous compliance, continuous security, continuous quality, and other techniques. This is the idea that these outcomes are not the result of discrete or point-in-time processes but that they work as integral parts of the pipelines for coding, building and delivering software. By taking a continuous approach, the steps for producing resilient systems become routine processes with more staying power.
Shift left
There is also the principle of “shifting left.” For quality, this means software development teams strive to discover and remediate defects earlier in the pipeline. The same goes for security and compliance, whereby risk mitigation is built into software from the planning stage onward and vulnerabilities are detected and resolved early. Shifting left helps produce more resilient software systems, while reducing the amount of time developers spend on fixing problems that have become baked into the way those systems function. It is a lot easier to resolve problems before they are deployed into production.
Finally, teams should be granted as much autonomy as possible. This is achieved via developer platforms, self-service and integrated tooling, guardrails and enablement. More and more teams are supported by a platform engineering team that provides a well-designed set of systems for delivering software. Platform engineers are increasingly focused on security and compliance, embedding risk mitigation into their platforms. This way, development teams find it easier to implement secure software and have a better experience overall.
The bottom line is that resilience depends both on how a financial services organisation operates and on how it develops and maintains its IT systems. Software development is rarely given enough priority when tackling resilience, but the investments being made to comply with DORA afford an opportunity to improve in this critical aspect of IT.
Third party reference:
https://uptimeinstitute.com/uptime_assets/5f40588be8d57272f91e4526dc8f821521950b7bec7148f815b6612651d5a9b3-annual-outages-analysis-2023.pdf
This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.