Why it is important and cannot be ignored
Cloud adoption in BFSI (banking, financial services, and insurance) has increased rapidly in recent years, and the move to cloud is forecast to continue and even accelerate as firms seek agility, scale, and elasticity with minimal capital expenditure. The cost benefit of pay-as-you-use pricing is another key driver of cloud adoption. However, as companies move more workloads (business applications and data) to the cloud, these cost benefits quickly evaporate in the absence of a well-defined cloud cost management strategy.
To migrate quickly, companies often adopt a lift-and-shift strategy that carries on-premises inefficiencies into the cloud. These inefficiencies can take the form of excess memory, compute, or storage capacity. If applications are not architected and optimized for cloud, either during migration or after it, the underlying cost elements remain the same, yielding no cost benefit and, in many cases, an increase in cost. In addition, old budgeting and forecasting practices are often carried over to manage cloud cost. Cloud financial management must be treated differently: it warrants real-time tracking, accurate forecasting, and immediate action when an event occurs. It cannot be left to monthly or quarterly reviews as before.
Therefore, effective cloud cost management is an essential element in any cloud transformation journey.
Taking control of your cloud cost – Framework and Strategies
Cloud is commonly thought of as a pay-as-you-use model. In practice, it is pay-as-you-order: all provisioned services are chargeable regardless of whether they are fully utilized. Undisciplined, ungoverned, and unmonitored usage of cloud services leads to ballooning cloud budgets, and organizations struggle to control their growing cloud spend. There are numerous examples of companies whose cloud costs grossly exceeded their previous on-premises infrastructure costs.
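The pay-as-you-order point can be made concrete with a little arithmetic. The sketch below is illustrative only: the hourly rate, hours, and utilization are made-up values rather than any provider's pricing, and `billed_and_wasted` is a hypothetical helper.

```python
def billed_and_wasted(hourly_rate, hours_provisioned, avg_utilization):
    """Return (billed, wasted) cost for one provisioned resource.

    The bill depends on what was provisioned, not on what was used, so the
    'wasted' part is the share of the bill that paid for idle capacity.
    """
    if not 0.0 <= avg_utilization <= 1.0:
        raise ValueError("avg_utilization must be between 0 and 1")
    billed = hourly_rate * hours_provisioned
    wasted = billed * (1.0 - avg_utilization)
    return billed, wasted


# Assumed values: a VM at 0.40 per hour, left on for a 720-hour month,
# averaging 25% utilization.
billed, wasted = billed_and_wasted(0.40, 720, 0.25)
print(f"billed={billed:.2f} wasted={wasted:.2f}")
```

With these assumed numbers, three quarters of the bill pays for capacity that was ordered but never used.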
Cloud cost management requires stringent monitoring and control of cloud cost from a holistic perspective. This can be done by defining and implementing a clear cloud cost management framework to manage cloud economics. The framework should allow organizations to understand and baseline cloud needs, gain visibility into cloud-service spend, optimize usage with the right tools, implement recommendations, share costs transparently with stakeholders, and charge costs back to the lines of business (LoBs).
Cloud Cost Management Framework
The following core components of a cloud cost management framework should be considered.
1. Budgeting and Control
- Define, allocate, and manage budgets assigned to LoBs, departments, or projects. This capability covers planning and controlling resource utilization, tracking budgeted versus actual run cost, and acting on any variance. It gives the enterprise predictable cost consumption against budget.
2. Baselining and Optimization
- Right-size for cloud at the outset to baseline cost, and repeat the exercise regularly to optimize further.
3. Monitoring and Analytics
- Get visibility into the cloud resources in use so that they can be managed efficiently. Review and act on details such as current and past consumption and non-standard, unused, or sub-optimally used cloud services. Analytics on cloud usage patterns and cost trends should be in place to support granular budgeting and forecasting for LoBs and portfolios. Defining and automating event-based interventions also helps cloud cost management.
4. Governance and Standardization
- Enforce enterprise-wide, policy-based access with the right permissions for each role. Standardize cloud infrastructure provisioning, for example by creating pre-defined templates of approved virtual machine configurations with baked-in security, network settings, etc., that developers can provision for increased productivity and automation.
- Set up automated alerts, using metadata and tags, that notify administrators when cloud service usage exceeds a predefined level or when resources are unused (e.g. volumes not attached to any virtual machine) or underutilized, and automate the response.
- Establish governance for the operational hours of different environments (e.g. dev/test environments that are candidates for shutdown when not in use), and periodically review the billing agreement with the cloud service provider (CSP), renegotiating it as workloads change.
5. Cost Transparency
- Bring visibility and transparency to cloud usage cost across LoBs and departments. Use metadata and resource tags for tracking and for metering-based showback and chargeback to LoBs and departments for their cloud usage.
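As an illustration of how tag-based chargeback and budget tracking fit together, here is a minimal Python sketch. It assumes billing line items are already available as plain dictionaries; the `lob` tag key, the resource names, the costs, and the budgets are all hypothetical, and real billing exports differ by provider.

```python
from collections import defaultdict


def chargeback_by_tag(line_items, tag_key="lob"):
    """Sum cost per value of `tag_key`; untagged spend goes to 'UNALLOCATED'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "UNALLOCATED")
        totals[owner] += item["cost"]
    return dict(totals)


def budget_variance(actuals, budgets):
    """Return {owner: actual - budget}; a positive value means over budget."""
    return {owner: actuals.get(owner, 0.0) - budget
            for owner, budget in budgets.items()}


# Hypothetical billing data and budgets, for illustration only.
line_items = [
    {"resource": "vm-1", "cost": 120.0, "tags": {"lob": "retail-banking"}},
    {"resource": "db-1", "cost": 300.0, "tags": {"lob": "insurance"}},
    {"resource": "vm-2", "cost": 80.0, "tags": {"lob": "retail-banking"}},
    {"resource": "disk-9", "cost": 15.0, "tags": {}},  # untagged resource
]
budgets = {"retail-banking": 150.0, "insurance": 350.0}

actuals = chargeback_by_tag(line_items)
print(actuals)
print(budget_variance(actuals, budgets))
```

Keeping an explicit "UNALLOCATED" bucket is a design choice in this sketch: it makes untagged spend visible instead of silently dropping it from the showback report.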
Strategies for Cloud Cost Optimization:
The following strategies and best practices can be adopted to optimize cloud expenses:
1. Right sizing memory, compute, storage, and other resources
Infrastructure resources are often over-provisioned, especially if an organization has adopted a lift-and-shift approach to move to cloud. Because pay-as-you-use is actually pay-as-you-order, it is essential to right-size initially and then to review and re-right-size on a periodic basis. This reduces the chance of over-provisioning or sub-optimal usage.
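One simple way to turn utilization data into a right-sizing recommendation is sketched below. The size catalogue, the 60% target utilization, and the `recommend_size` helper are assumptions for illustration; a real review would also look at memory, I/O, and how long the peak lasts.

```python
# Hypothetical size catalogue: (name, vCPUs), ordered from smallest to largest.
SIZES = [("small", 2), ("medium", 4), ("large", 8), ("xlarge", 16)]


def recommend_size(current_vcpus, peak_cpu_utilization, target_utilization=0.6):
    """Pick the smallest size that keeps the observed peak at or below target.

    `peak_cpu_utilization` is the observed peak as a fraction (0..1) of the
    current vCPU count. Falls back to the largest size if none is big enough.
    """
    needed_vcpus = current_vcpus * peak_cpu_utilization / target_utilization
    for name, vcpus in SIZES:
        if vcpus >= needed_vcpus:
            return name
    return SIZES[-1][0]


# A 16-vCPU machine that peaks at 20% CPU needs about 5.3 vCPUs at the
# 60% target, so the next size up in the catalogue is chosen.
print(recommend_size(16, 0.20))
```

Running the same check on a schedule, rather than once at migration, is what the periodic re-right-sizing above amounts to.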
2. Eliminating unused resources and services
Identifying unused resources in the cloud setup and deleting them is a key strategy. Unused resources typically arise when a server is created for a temporary purpose and then forgotten. Similarly, services attached to a virtual machine instance, such as block-level storage volumes (e.g. AWS EBS or Azure Managed Disks), static public IP addresses (e.g. AWS Elastic IP), and obsolete snapshots, continue to incur cost even after the instance has been stopped or terminated unless they are removed. Identifying and eliminating unused services will reduce cost.
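The checks described above can be sketched as a small filter over a resource inventory. The inventory format, field names, and the 90-day snapshot age limit are assumptions made for this example; in practice the inventory would come from the provider's own APIs or a cost management tool.

```python
def find_unused(resources, max_snapshot_age_days=90):
    """Return ids of resources that are still billed but serve no workload."""
    unused = []
    for r in resources:
        if r["type"] == "volume" and r.get("attached_to") is None:
            unused.append(r["id"])  # volume not attached to any VM
        elif r["type"] == "public_ip" and r.get("associated_with") is None:
            unused.append(r["id"])  # static IP not associated with anything
        elif r["type"] == "snapshot" and r.get("age_days", 0) > max_snapshot_age_days:
            unused.append(r["id"])  # snapshot older than the assumed limit
    return unused


# Hypothetical inventory, for illustration only.
inventory = [
    {"id": "vol-1", "type": "volume", "attached_to": "vm-1"},
    {"id": "vol-2", "type": "volume", "attached_to": None},
    {"id": "ip-1", "type": "public_ip", "associated_with": None},
    {"id": "snap-1", "type": "snapshot", "age_days": 400},
    {"id": "snap-2", "type": "snapshot", "age_days": 7},
]
print(find_unused(inventory))
```

A report like this is best reviewed before anything is deleted, since an old snapshot may still be required for retention reasons.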
3. Using right service and life cycle policy for storage
Pricing of cloud storage services varies widely with the usage pattern. Select the right cloud service based on the business need. Use storage life cycle policies to move content to the right storage tier based on expected usage pattern and latency requirements.
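A life cycle policy is essentially a set of age-based rules. The sketch below shows the idea in Python; the tier names ("hot", "cool", "archive") and the 30- and 90-day thresholds are generic assumptions, not any provider's actual storage classes or defaults.

```python
# Assumed rules: (maximum days since last access, tier), checked in order.
TIER_RULES = [(30, "hot"), (90, "cool")]


def pick_tier(days_since_last_access):
    """Return the storage tier for an object based on how recently it was read."""
    for max_days, tier in TIER_RULES:
        if days_since_last_access <= max_days:
            return tier
    return "archive"  # anything older than the last rule


for days in (5, 45, 365):
    print(days, pick_tier(days))
```

In a real deployment these rules would be declared in the storage service's own life cycle configuration, and the thresholds would reflect retrieval latency and retrieval charges as well as storage price.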
4. Schedule availability timings
Set operational hours for different environments, especially non-production instances. For example, identify VMs that do not need to run 24x7x365 and set a dynamic stop and start schedule that is most cost-efficient.
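A stop and start schedule can be expressed as a simple time check. This is a minimal sketch that assumes a dev/test environment is needed from 08:00 to 20:00 on weekdays; the hours, the function names, and the idea of calling the check from a scheduler are illustrative assumptions.

```python
from datetime import datetime


def should_run(now, start_hour=8, stop_hour=20, workdays=(0, 1, 2, 3, 4)):
    """True if a non-production VM should be running at `now` (local time).

    Weekdays are numbered as in datetime.weekday(): Monday is 0, Sunday is 6.
    """
    return now.weekday() in workdays and start_hour <= now.hour < stop_hour


def weekly_hours_saved(start_hour=8, stop_hour=20, workdays=(0, 1, 2, 3, 4)):
    """Hours per week the VM is switched off, compared with running 24x7."""
    return 24 * 7 - (stop_hour - start_hour) * len(workdays)


print(should_run(datetime(2024, 1, 8, 9, 0)))   # a Monday morning
print(should_run(datetime(2024, 1, 6, 9, 0)))   # a Saturday morning
print(weekly_hours_saved())
```

Under these assumed hours the environment runs 60 of the 168 hours in a week, so it is switched off for the remaining 108.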
5. Use Reserved Instances and Spot Pricing
Estimate planned usage of cloud services and purchase reserved instances. Reserved instances have discounted pricing and can reduce cost significantly. This approach is best suited to enterprises that have a long-term commitment and applications with relatively low variability. However, if usage is not estimated accurately, it can increase overall cost (due to under-usage). Analytics on historical usage patterns, and extrapolation from them, is therefore crucial. Similarly, use spot pricing for virtual machines to take advantage of significant cost benefits. Spot pricing is best suited to fault-tolerant and stateless applications such as big data and analytics, high-performance and high-throughput computing, and machine learning and AI applications.
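The under-usage risk of reserved instances comes down to a break-even utilization. The sketch below uses assumed hourly prices and a simplified model in which the reservation is paid for every hour of the term while on-demand is paid only for hours actually run; the function names are hypothetical.

```python
def reserved_break_even_utilization(on_demand_hourly, reserved_effective_hourly):
    """Fraction of the term a workload must run for a reservation to pay off."""
    return reserved_effective_hourly / on_demand_hourly


def cheaper_option(on_demand_hourly, reserved_effective_hourly, expected_utilization):
    """Compare average cost per hour of the term for the two pricing models."""
    on_demand_cost = on_demand_hourly * expected_utilization  # pay only when running
    reserved_cost = reserved_effective_hourly                 # pay for every hour
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


# Assumed prices: 0.10 per hour on demand, 0.06 per hour effective when reserved.
print(reserved_break_even_utilization(0.10, 0.06))
print(cheaper_option(0.10, 0.06, expected_utilization=0.9))
print(cheaper_option(0.10, 0.06, expected_utilization=0.4))
```

With these assumed prices the reservation only pays off if the workload runs for more than about 60% of the term, which is why the historical usage analysis matters.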
6. Architecture Optimization
Regardless of the cloud service provider, architecting a solution for cloud and selecting the right services is essential for getting the maximum value from cloud. Using the right combination of IaaS and cloud-native PaaS services can reduce cost because of their usage-based charging model, e.g. moving from databases on virtual machines to a fully managed, elastic database as a service.
Additionally, set up enterprise standards and make sure that all applications onboarded to cloud follow the optimized architecture. Also, periodically evaluate solutions and remove cost-inefficient architecture elements, for example by minimizing data egress from cloud.
7. Using Containers
Containers provide a lightweight and portable way of running multiple applications in isolation on the same virtual machine. They enable applications to run at a higher density per unit of hardware than traditional virtual machine hosting. Done correctly, this can lower overall compute cost. Containers also provide agility, simplified deployment across environments (because issues caused by missing dependencies are eliminated), and portability.
8. Training environment for AI/ML
AI/ML training on a large dataset requires a lot of compute and is performed repeatedly to fine-tune models and increase their accuracy. Doing this repeatedly on public cloud can become costly, especially with large datasets. One approach is to set up AI/ML training infrastructure on-premises and run the trained models on public cloud, so that training cost can be better controlled. There are vendors that provide bundled solutions of specialized hardware (GPUs) and software for setting up AI/ML training infrastructure on-premises. The decision on where to locate AI/ML training infrastructure should be made holistically, with business needs and cost in mind.
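The cost side of that location decision can be approximated with a break-even calculation. This is a rough sketch with made-up figures; it assumes a single up-front on-premises cost plus an hourly running cost, and ignores factors such as hardware refresh, staffing, and data transfer.

```python
import math


def break_even_gpu_hours(onprem_fixed_cost, onprem_hourly_running_cost,
                         cloud_gpu_hourly_rate):
    """GPU-hours of training at which on-premises cost catches up with cloud.

    Beyond this point on-premises is the cheaper option. Returns None if cloud
    is never more expensive per hour, so on-premises never breaks even.
    """
    saving_per_hour = cloud_gpu_hourly_rate - onprem_hourly_running_cost
    if saving_per_hour <= 0:
        return None
    return math.ceil(onprem_fixed_cost / saving_per_hour)


# Assumed figures: 100,000 up front, 0.50 per hour to run on-premises,
# 3.00 per GPU-hour on public cloud.
print(break_even_gpu_hours(100_000, 0.50, 3.00))
```

If expected training volume over the hardware's useful life is well below the break-even figure, training on public cloud is likely to remain the cheaper option under these assumptions.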
9. Use Tools to monitor usage, re-baseline and adjust frequently
Use tools to get a view of cloud expenses. All cloud vendors provide cloud-native services, and many third-party vendors provide solutions, to organize, budget, track, monitor, report on, and optimize cloud costs. Use resource tagging and metadata for visibility and tracking of resources, to get a real-time picture of usage and to support cost tracking and mitigation. This needs proactive rather than reactive management, as every delay costs money.
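Proactive management usually means projecting where spend is heading rather than waiting for the invoice. The sketch below uses a simple linear run-rate projection, which assumes the daily average so far continues for the rest of the month; the 90% warning threshold and the function names are assumptions for illustration.

```python
def projected_month_end_spend(month_to_date, day_of_month, days_in_month):
    """Linear run-rate projection of the full month's spend."""
    return month_to_date / day_of_month * days_in_month


def budget_alert(month_to_date, day_of_month, days_in_month, budget, threshold=0.9):
    """Classify projected spend against the monthly budget."""
    projected = projected_month_end_spend(month_to_date, day_of_month, days_in_month)
    if projected > budget:
        return "OVER_BUDGET"
    if projected > threshold * budget:
        return "WARNING"
    return "OK"


# Assumed figures: 6,000 spent by day 10 of a 30-day month.
print(projected_month_end_spend(6000, 10, 30))
print(budget_alert(6000, 10, 30, budget=15000))
print(budget_alert(6000, 10, 30, budget=19000))
```

A check like this can run daily against tagged cost data, so that an LoB hears about a likely overrun early in the month instead of at the monthly review.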
Cloud cost management is an emerging area that embeds cloud cost intelligence in the working of the enterprise. Cloud cost management thus becomes an integral part of cloud transformation, and the framework for achieving optimized cost should be embedded across the cloud adoption lifecycle. With proper guardrails in place, it will drive a cost-conscious culture and become an essential part of the target operating model.