Long reads

Agile Series: Including risk, compliance and internal audit

Andrew Smith

Andrew Smith

Founding CTO, RTGS & ClearBank

This maybe the most 'controversial' article within this series, but it's potentially the one that could solve the most headaches, and for a COO/CIO/CTO maybe the one that removes the most stress, if risk, compliance, and internal audit are onboard.

The truth across many functions within a financial institution is that they have not been exposed to agile concepts, and as such, they will place dependency on typical known structured approaches, probably along the lines of project plans and Gannt charts. Risk and compliance functions do not historically require agile types of approaches, since much of what they focus on can be highly structured, highly process driven. The same is true of internal audit. Immediately you can see that from a 'culture' perspective, these are functions that are not connected into the agile engineering culture that you are cultivating, because of this, it can lead to a fair amount of friction.

So, what do I mean by friction? Well, typically risk and compliance want to ensure that everything is done to mitigate risk and show that those accountable are the ones that make the decisions. Unfortunately, this can mean they want to see (and even introduce) a highly structured approach to how software is delivered, typically in the form of what they are most familiar with, project plans, Gannt charts, committee sign off, etc. These things very much break a continuous deployment pipeline.

Breaking the pipeline

Continuous delivery/deployment can potentially see software making it's way constantly into production, hundreds if not thousands of components on a highly frequent basis, maybe even daily. Immediately, this causes a challenge for departments such as risk and compliance. Most risk functions will want to see a consolidated point where testing was signed off by 'the business.' By this it means someone who has the authority and accountability to say “yes, this software works as expected, it may move to the next stage”. They want to see that conscious decision point, ensure it happened and be able to capture it as an auditable event, typically in a fashion that they are familiar with. But this breaks continuous deployment, it adds friction into the process and depending on how you wish to execute this approach, may add a great deal of infrastructural and resource overheads.

The pipeline needs to remain continuous, but it cannot simply just happen, rather these functions require conscious decision points. Don’t fall into the trap of reverting back to 'manual process' here, or a favoured delivery/approval meeting where people sign off on the process. No, this type of accountability can be easily captured and included into the delivery pipeline, you just have to start capturing more into a continuous deployment pipeline.

Team construct

Firstly, if your team construct is not correct then you will struggle to implement these types of controls and auditable points that I personally favour. Teams must be autonomous, and therefore, if someone such as a subject matter expert or domain owner must provide approval for software to progress into production, then that person must be a part of your team. You cannot afford to have individuals/groups of people outside of your team part of a separate deployment process.

With modern DevOps tools, you can introduce 'gates' and 'sign off' points in your delivery pipelines, this means that your team member (the empowered team member) is able to be part of that pipeline and provide sign off that the software may continue onto its next stage. You can even include domain owners into the final sign off process. These steps can be automated somewhat, with the necessary people receiving an alert when their signature (so to speak) is required. Only once they provide that consent will the automated pipeline continue.  

As a subject matter expert being part of the team, means they will already have the information about the suitability of the software on making it to the next stage. They will be able to review the automated tests, regression tests, even UAT results if you carry this out. They are the only people in many ways who can say if the software meets the requirement. As a team member they are also able to be there and review the deployment process in full, providing that sign off at the identified stages. Each one of these stages fully auditable and within the DevOps environment itself.

Segregation of duties

This is something that risk functions often raise as an issue when they understand that the engineering team itself is making the deployment into production. They question 'access' to the production system and see that as a risk. However, being able to deploy your software is not the same as having access to production, rather the delivery pipeline has that access not the individual. If individuals require access, it is granted temporarily and is always supervised, monitored, and maybe even recorded. Noting that those that grant access are not related to an engineering team.

From a risk perspective, it is far riskier to have individuals have access to production who do not deploy software components daily. In such a model there is a great deal of human risk and a great deal of lack of knowledge of the software and its behaviours. This model introduces a great deal of risk in terms of not only the software deployment, but in terms of ongoing access management to production.

Active involvement

In some cases, risk, compliance and even internal audit, want to be part of the delivery pipeline, part of the sign off process. This causes a great deal of friction in the process and often, frustration. The process does not become any less risky, rather highly inefficient and depending on the number of deployments, could cause backlogs in software deployments which in themselves could cause risk.

A possible solution

Oversight of a highly transparent process with strong access management controls is the solution. We do not want risk, compliance, and IA to be part of the process, however, they need to have that visibility and re-assurance that only quality approved software makes it's way into production. This is all achievable while maintaining continuous deployment. Here is a bit of a framework that can be followed.

Delivery trains

A very simple concept. Software components must be on a particular delivery train to make it into pre-production and production environments. Delivery trains can run as frequently as you wish, maybe one every few minutes. The point here is that the components get associated with a specific delivery train, and therefore the software journey becomes visible to more parties. As teams execute their delivery pipelines, these pipelines can only be executed at specific times in line with the delivery train they are on.

By making delivery trains highly transparent, risk, compliance, internal audit any other interested parties can see exactly what software is coming into the pre-production and then production environments. This is actually a great feature for relationship managers, marketing and other departments that will have an interest in any new software features.

The delivery train links seamlessly back to the software components release/delivery/deployment pipelines. Because of this, interested parties can quickly view any 'gates' where sign off has been requested. Those that sign-off onto the next step are clearly shown as an auditable event within the pipeline, providing comfort to risk, compliance, and audit that the right people are approving the release of the software.

Automated delivery and access management

Delivery into production maybe started by an individual confirming the process may being, but the process of deployment itself should always be automated. The delivery pipeline has the correct and required access to the production environment to deploy the software packages, the individual user(s) therefore do not require any access. Team members can review the progress of the deployment, but they do not have access into the underlying production environment.

Access management and control around the credentials used by delivery pipelines will all form part of that wider delivery experience, which should be documented within a good software development lifecycle paper.

Mitigating risk with rolling updates and or blue green environments

In the previous article we looked briefly at a 'blue green' deployment, where both environments are production environments with one active and the other not. To mitigate risk further, deployments can be made to the non-active production environment, some light testing take place and once all is happy, the production environment switched.

Rolling upgrades also allow software to be upgraded without causing any downtime. Effectively rolling upgrades see services upgraded and then assessed for successful deployment. At that stage, new software is running in parallel with old software, however if the new software is working as expected, the deployment continues gradually replacing all the older software components with the latest version.

Both these approaches provide a level of comfort to risk and compliance that software is being checked as it is deployed into production, and that services will not become unavailable to customers.

Transparent oversight

Risk, compliance, and internal audit effectively need oversight of delivery. Delivery trains provide that first step, along with the pipelines that move software through the various stages into production. Obviously, there will be training required to ensure these functions understand how to use the tools they are being presented with, some of which can look quite intimidating for someone who is not that technically confident. But its not just about transparency of the delivery pipeline, nor the technology.

Transparent oversight also needs to include oversight of the engineering processes, how software is built, how it is tested, the testing outcomes and the controls that are in place, automated or manual that show that only quality software makes it's way into production. Again, these functions will require some educating, but that is a good thing. By educating these functions you are empowering them to be able to execute their role in a far more effective and efficient fashion, removing false areas of concern and removing the risk of these functions insisting on very non agile processes being introduced. As the executive responsible for your agile efforts, this educational part is key.

Summary

In many ways, I see risk, compliance, and internal audit as part of the overall delivery experience. No, they are not part of the pipeline, no they do not have the ability to say what goes into production nor when. No, their role is not to dictate what a 'pipeline' should contain, they simply do not have the technical know-how to do that, rather their role is that of a watching brief, in many ways like that of a regulator.

A regulator wants to understand the real mechanics of what you are doing, they want you to explain the risks, they want to be educated and empowered enough so that they can try to identify risks and areas of concern. They want and need to be educated enough so that they may effectively provide some form of 'challenge,' if they believe there is something to challenge. They observe, and when a regulator believes you are 'off-course,' they provide some guidance on the where/how to get back on-course. This is exactly the way a modern risk and compliance function should work, playing its part within an agile engineering culture.

The next article in this series will look at how to effectively construct a product roadmap, how to ensure you continue to value delivery over predictability.

Comments: (1)

Steven Rackham
Steven Rackham - NetApp - London 07 May, 2021, 12:281 like 1 like

Great article Andrew, thank you! 

I have seen organisations that want to move forward but been thwarted by internal risk, compliance and IA more than external regulators. 

Change isn't just about the technology but also the teams, policies and procedures that surround it.  Looking at how to involve them in this is key to advancing that change – as you have pointed out so well!