Cloud success from a new way of working

Posted by Matt Lovell

Moving away from Technical Silos to get the best from cloud deployments

I visited the UKISUG 2019 event recently and had the privilege of chatting to several IT leaders within large SAP customers who are migrating to Microsoft Azure. When I asked for their thoughts on the challenges they are experiencing, their answers were very similar to what I see with many of my current customers.

It’s the People and Process, not the Technology, that present the biggest challenge.
Why is it so hard to get it right? Where are we going wrong?

Here are my observations from working in IT with large organisations for close to 20 years. IT is still a young profession when you compare it to others like accountancy or law that have been with us for hundreds of years. IT has only had a few decades to evolve, and it has the fastest-evolving problem set of them all to solve. During the early days of IT, departmental needs drove the purchase of software and hardware, and organisations ended up with a series of purchases that were not standardised and that created pockets of technical infrastructure with varying architectures.

As functional IT needs grew across the organisation, it was recognised that this department-led IT purchasing was not getting the best value and presented an increasing challenge in managing people, process and technology. So gradually IT became more and more strategic and centralised. Standardisation of platforms and support became key to success. I was a big fan of the strategic view of IT in its time, and would be frustrated when companies could only have budgets that were tied to distinct projects, preventing them from buying infrastructure like networks or storage against a bigger, more strategic plan. Has anyone else had to foot the bill for a new switch on a single project because the existing switch infrastructure was out of ports? So as a result we evolved specialist teams for Virtualisation, OS support, Database support, Storage, Networks, Backup and Security. This definitely reduced the headache of 50 OS variants, 40 independent storage area networks across 4 different vendor technologies and 4 different backup tools! But as we move to the Hyperscalers the challenges have changed again, and this model is proving to be a barrier to change.

This problem is amplified by the new expectations that come with the move to a cloud. We can “spin up” a VM in minutes with seemingly limitless storage, networking and compute options. So we should be able to build and destroy on demand, right? I should be able to request a DEV system in the morning, use it in the afternoon and destroy it in the evening, right? Run that new requirement through the storage team, network team, backup team and database team and, clunk, we grind to a halt with a series of 10-way conference calls that slowly nurse us to an outcome of “spinning up” a system. Agile platform? Not anymore.

So what needs to stay and what needs to go?

The siloed approach of creating specialist departments full of subject matter experts (SMEs) needs to evolve. Let’s take a look at one example that is commonly centralised in the old model – the network team. There are still some foundations that need to be governed by central teams, networking between the corporate networks and Azure for example. But what should their involvement be in the creation of VNets, subnets, NSGs and other very specific aspects of the service to run on Microsoft Azure? If we continue to hand off responsibilities in technical layers, we continue to rely on multiple departments to make any change. We need some subtle changes to the way we think about the role of these teams, and we need to slice the responsibilities differently. Getting the agility we desire from cloud demands building through Infrastructure as Code (IaC). We need to adapt our people and process to this radically new way of working. We need to centre our build teams around the service they are building, rather than around a technology silo.

As our methods to build and maintain systems through IaC improve, so too can we improve the governance of setup. Do we need SMEs to dictate how a system should be built? Yes, the first time. If that design decision is codified into a test, we only have to bother the SME when the design changes. Every build will then be built to that specification through code.
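To make that idea concrete, here is a minimal sketch of a design decision captured as a test. It is an illustration only: the rule itself (every subnet must have an NSG attached, and no NSG may open SSH or RDP to any source) and all of the names (`Subnet`, `NsgRule`, `check_design`) are my assumptions, not the API of Azure Policy, Ansible or any other tool. In practice the same check would be expressed in whatever policy or configuration management tool the team already uses.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical design decision agreed once with the network SME:
# management ports must never be open to any source.
BLOCKED_PUBLIC_PORTS = {22, 3389}


@dataclass
class NsgRule:
    name: str
    direction: str         # "Inbound" or "Outbound"
    source: str            # CIDR range, or "*" for any source
    port: int
    access: str = "Allow"  # "Allow" or "Deny"


@dataclass
class Subnet:
    name: str
    cidr: str
    nsg_rules: Optional[List[NsgRule]] = None  # None means no NSG attached


def check_subnet(subnet: Subnet) -> List[str]:
    """Return the design violations found in one subnet definition."""
    if subnet.nsg_rules is None:
        return [f"{subnet.name}: no NSG attached"]

    violations = []
    for rule in subnet.nsg_rules:
        open_to_any = (
            rule.direction == "Inbound"
            and rule.access == "Allow"
            and rule.source == "*"
        )
        if open_to_any and rule.port in BLOCKED_PUBLIC_PORTS:
            violations.append(
                f"{subnet.name}: rule '{rule.name}' opens port "
                f"{rule.port} to any source"
            )
    return violations


def check_design(subnets: List[Subnet]) -> List[str]:
    """Run the SME's rules over every subnet in a proposed build."""
    return [v for subnet in subnets for v in check_subnet(subnet)]
```

A service team would run `check_design` against its own network definition on every build and fail the pipeline if the returned list is not empty. The SME only needs to be involved again when the rules themselves need to change.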

So central governance can be achieved with tools like Azure Policy, and with tests built into configuration management tools like Ansible. And each time we build or make a change, the team closest to the service, the one that understands it, is responsible for making the change, rather than a remote team that struggles to understand the context.

This means that the role of the individual teams needs to change. Cloud platforms like Azure and AWS do not simplify all of the problems, so there will always be a need for SMEs, but how we engage them in teams and the method of central governance will need to change.

So in summary, in my opinion, we need to:

  1. Build via IaC, where governance is built into tests, not human process
  2. Focus your internal / external teams around the service, not the technology
  3. Do not separate build teams from run teams. Build is now a continuous requirement and part of the support model