Open secrets of enterprise software delivery failures

Kalpa Senanayake
14 min read · Mar 6, 2022
Photo by Hal Gatewood on Unsplash

Motivation for the article

Many software deliveries end in one or more forms of failure. Some projects or products are not delivered on time. Others do not meet client or consumer expectations. Some meet expectations but fail to retain the key people who contributed, because of the burnout the project caused.

Over the past ten years, I have been involved in various software deliveries. Some were project-oriented, others product-oriented. I have been learning from these engagements, observing what worked well and what failed.

These reasons come from my own first-hand experience of where things went wrong. They are neither the only reasons for failure nor suggestions for addressing the underlying problems.

I am keen to hear examples you may have experienced and how you solved them. Hence, this is an effort to share knowledge and learn from each other.

1. The never-ending hunt to estimate accurate time of delivery

1.1 The ambiguity and the business problem

Software represents human needs. Software is the fastest medium to fulfil ever-increasing human needs in modern society. Hence, all software products directly or indirectly involve businesses, which create the pathway to fulfil those needs.

Human needs are vague and unique, and they are hard to put into one box and convert into logic, which creates the first problem of building software. Some details get lost in translating these real-world human needs into profit-oriented business concepts.

When these business concepts are then translated into technical details, another set of facts gets lost. In the end, software engineers get a problem that needs to be solved but is missing some key details.

The business problem is open to more than one interpretation at this point.

In other words, the ambiguity of the business problem is high. This makes the delivery time of the software hard to estimate. It is almost impossible to estimate the delivery of software accurately for this reason.

The industry found a solution for this in the early 2000s. We call it agile these days. It offers 12 principles for dealing with this ambiguity and a remarkable way of thinking that embraces the fact that nothing is permanent. We must adapt to this uncertainty and move in small but decisive steps toward the end goal.

Image source https://dilbert.com/strip/2007-11-26

The first reason for delivery failures that I have experienced is that many organisations do not practise agile, even though they have a whole team dedicated to introducing and coaching agile methodologies.

They do implement agile, but in a waterfall manner. Instead of taking small steps, they try to make significant changes in one go. They do not accept that the road to the final goal may take detours. They are uncomfortable with complexity-based analysis of the problem and try to estimate in time instead, which is an open invitation for failure.

In many ways, most organisations internally operate in waterfall ways of software delivery or a broken form of agile, which everyone calls fragile.

And teams tend to think agile consists of sticky notes, scrum masters, retrospectives, and agile ceremonies.

In reality, agile is all about embracing the fact that we cannot predict the future and taking small steps towards a common goal through empathy, common sense, team spirit, and compassion.

Agile is not the only tool to deal with ambiguity, and it does not fit everyone. Finding the right tool for your organisation is a journey full of experiments, and collecting data points along the way will help to make the final choice.

Failing to utilise whatever tool the organisation chooses to deal with ambiguity is the first reason for software delivery failures.

1.2 Unknown complexity / blind spots

Using software to solve problems is not a new thing. It has been there for a long time. However, the technical landscape in the software industry changes rapidly. Most of the time, even the engineers in the field cannot keep up with the breakneck speed of change.

This speed of change is also a key factor contributing to software delivery failure. Even the most senior engineers cannot claim that they know everything.

There are always hidden complexities waiting to be found. These are blind spots: knowledge that the team does not know it is missing. These blind spots create patches of accidental complexity in delivery plans. Hence they affect the accuracy of the estimated delivery time.

Image Source : http://carlcheo.com/wp-content/uploads/2015/03/what-is-programming-what-do-programmers-do-pdf.pdf

Many factors, including those mentioned above, make it hard to predict the estimated delivery time for a software product accurately. But few organisations get it.

Instead, they put timelines on a business problem with high ambiguity, then put a team together to solve it while hanging a ticking time bomb around its neck.

So people either work until they burn out or leave the place and never look back. No one accepts the fault and learns from it, so they repeat the same process and expect different results.

Not accepting that no one can figure out all the hidden complexities, not leaving room for them, and not embracing them leads to team burnout and delivery failures.

2. Funding and agile: Friends but misunderstood

2.1 Funding for waterfalls while working on agile ponds

When organisations embrace the agile operating model, it does not happen overnight. It is a gradual cultural shift that occurs over long periods. (As I explained earlier, there are many ways to deal with the uncertain complexity of software projects. I am focusing on agile because it is ubiquitous.)

However, even after it spreads to most parts of the organisation, there is one department it does not reach easily: finance. Few people make the effort to bring the finance department on board and explain why agile was selected as the weapon to deal with this ambiguity monster.

The finance department is used to working in the waterfall model, where the expected business outcomes and delivery date are set in stone. That is how other parts of the business work, so they get it naturally.

When projects run according to agile methodologies, accounting needs to adapt to an agile-friendly budgeting and funding model. Otherwise, there will be a gap between expectations and reality.

The project is the basic unit of work in the waterfall way of software delivery, and a large part of the budget is allocated to upfront planning and workshops. Before actual implementation starts, some artefacts act as a blueprint for the rest of the project, and the business people who fund the projects light up when they see those.

The agile way of working has designs and plans too, but they are smaller in scope. The teams who fund the projects do not welcome these.

One other key difference is the inverted iron triangle of project management. The waterfall project management model is a continuous battle between the scope, cost, and time while maintaining the quality.

Instead of selecting a scope and deriving the cost and time, the agile funding model suggests setting the time and cost and deriving the scope.
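A minimal sketch of what deriving the scope from a fixed time and cost could look like, assuming the backlog is already ordered by priority and sized in story points. The function name, the backlog items, and the capacity figure are all hypothetical, and skipping items that do not fit is just one simple selection rule among many.

```python
from typing import List, Tuple


def derive_scope(backlog: List[Tuple[str, int]], capacity_points: int) -> List[str]:
    """Pick backlog items, in priority order, that fit a fixed capacity.

    Time and cost are fixed up front and expressed here as capacity_points
    (an assumption: for example, number of sprints times team velocity).
    Scope is the output of the calculation, not an input.
    """
    selected = []
    remaining = capacity_points
    for name, points in backlog:
        # Skip anything that no longer fits; it waits for the next funding round.
        if points <= remaining:
            selected.append(name)
            remaining -= points
    return selected


# Hypothetical backlog, highest priority first.
backlog = [("login", 5), ("search", 8), ("reports", 13), ("export", 3)]
print(derive_scope(backlog, capacity_points=18))
```

With 18 points of capacity, the sketch selects login, search, and export, and leaves reports for later. The conversation with the funding team then becomes about what fits, not about when everything will be done.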

The mismatch between the funding team's expectations and reality creates friction and conflict between them and the delivery team. This conflict is another front of software delivery failure, which leads to losses in future projects, and IT teams get the label “always expensive with least results.”

There are well-known agile funding models, but they are beyond the scope of this article.

Not making the effort to formulate a funding model that matches the organisation's choice of tool for dealing with ambiguity is a reason for delivery failures.

2.2 More people faster results

Another misconception that leads to inaccurate delivery timelines is the quest to apply 6th-grade mathematics to project delivery.

How much time will six men take to complete a task if two men complete the same task in 36 hours?

2 men * 36 hours = 72 man-hours
72 man-hours / 6 men = 12 hours

The hidden detail that no one bothers to mention to the 6th-grade student is that this rests on a massive assumption: these men all have the same physical and mental capacity, and everything else in the ecosystem (including the weather) does not change during the time frame.

Software product delivery estimation and planning require a much deeper set of factors to be considered.

  1. Experience and knowledge level of the team
  2. Possible learning curves
  3. The communication structure of the organisation
  4. Definition of done
  5. Consideration of the relative complexity of the tasks
  6. Lessons learned from the previous estimations

Most people who are involved in these deliveries know all of these factors. For an unknown reason, they tend to ignore them, rely on 6th-grade mathematics, and convince themselves that adding more members to the team will yield faster delivery.

This is destined to fail for the following reasons.

  1. Adding more members to the team requires the senior team members to allocate time to help the new team members find their way around the product and the ways of working.
  2. It takes time to establish communication between new and old team members.
     • They must agree on why the team used library A over library B.
     • They must understand the rationale behind how the CI/CD pipeline was designed.
  3. There is always ramp-up time for a new team member to get their head around complexities in the ecosystem.

Now put a hard deadline (time) and large backlog (scope) into this picture. It paints a clear picture of why the “more people faster results” formula will not work as a short-term attempt to speed things up.
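The arithmetic can be sketched as follows. The first function is the 6th-grade formula from above. The second counts the pairs of people who may need to talk to each other, which grows much faster than the head count. The third is a deliberately crude, hypothetical model in which every added member costs some ramp-up and mentoring hours. The 8 and 4 hour figures are assumptions picked for illustration, not measurements.

```python
def naive_hours(total_person_hours: float, team_size: int) -> float:
    """The 6th-grade formula: work divides evenly across people."""
    return total_person_hours / team_size


def communication_paths(team_size: int) -> int:
    """Number of distinct pairs of people who may need to talk to each other."""
    return team_size * (team_size - 1) // 2


def adjusted_hours(total_person_hours: float, existing: int, added: int,
                   ramp_up_hours: float, mentoring_hours: float) -> float:
    """Hypothetical model: each added member costs ramp-up time of their own
    plus mentoring time from an existing member, on top of the original work."""
    overhead = added * (ramp_up_hours + mentoring_hours)
    return (total_person_hours + overhead) / (existing + added)


print(naive_hours(72, 6))              # 12.0
print(communication_paths(2))          # 1
print(communication_paths(6))          # 15
print(adjusted_hours(72, 2, 4, 8, 4))  # 20.0
```

Even this simple model pushes the 12-hour answer to 20 hours, and it still ignores the cost of keeping 15 communication paths working instead of one.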

But do not get this wrong; it can work as a long-term strategic attempt to arm the team with experts or junior members who are hungry to learn and grow into crucial people in the team.

Addressing a shortage of team members by adding new people for the short term, expecting fast results without letting them learn the product, and not re-calibrating delivery predictions are reasons for enterprise software delivery failures.

3. Taking quality as a good-to-have item in building software

Considering quality as a luxury is an invitation for failure. You may think this is too evident.

I have seen many examples of this over the years. To understand why this happens, I made a habit of asking deep-level questions until the person or team making the quality compromise explained to me why quality was the only thing they could compromise.

The following is what I understood from those conversations. To lay the background to understand my findings, we need to go back to the traditional project management triangle.

The traditional project management triangle

Cost: Most of the time, this is fixed. The business that funds the work thinks IT is too expensive, so it is not ready to allocate time to improve quality, only to add more features on the go. Therefore, this is a complicated conversation. If it goes south, the person who drives the conversation can lose their job.

Time: Organisations practise agile (or whatever mechanism they use to deal with ambiguity) poorly, which leads to miscalculation of delivery time. (Refer to reason #1 “The never-ending hunt to estimate accurate time of delivery”.) Perhaps a marketing campaign has announced the features to the general public with a date attached. Or someone in the management hierarchy has a bonus attached to this delivery date.

Scope: Key stakeholders have certain expectations about the upcoming features, and now they want everything. Once again, a complicated conversation to have. If it goes wrong, someone can lose their job.

Quality: Compromising quality for all of the above is the only option left. As long as the features work, it seems to be ok to make the trade-off here.

Quality of the code, readability, maintainability, the ability to detect errors, and graceful error handling come into the picture only when the product goes to production.

It is a much easier conversation to have, and one can sugar-coat it, saying, “We will immediately address these issues once we are in production and when we have some breathing space.”

And it can save money in the short term since the delivery will be much faster without code reviews and unit/integration test cases. After all, the most expensive item in software delivery is the engineering time.

So how does compromising quality contribute to delivery failure? Every piece of software has its lifecycle, and it is not going to be a short one. I have seen systems built 30 years ago still running some core functionalities in organisations.

An inevitable fact about software systems is that functional changes will be required as the business changes. Systems built by compromising quality will be hard to change and will not support the business in responding to market change.

And the business always wonders why it takes so much time to change the system. Over time it will take more and more time to deliver a feature, so the quality compromise leads to more delivery failures.

Eventually, developers are afraid to touch specific components of the system. Assigning a ticket in these areas is like asking someone to take a one-way trip to Mars. I have seen teammates wish the assignee good luck!

Finally, system quality problems drive most engineers crazy, and they get burned out quickly. They do not have time to deliver features or improve the system. Instead, they have to do fire-fighting every day.

And it leads to exhausting on-call rosters where the team gets a call every weekend. So people start to leave the organisation. Losing team members with domain knowledge is a significant loss for an organisation. One may think this knowledge can be retained by knowledge transfer sessions or documentation, but one could not be more wrong.

Making quality compromises is the same as setting up a ticking time bomb. It is going to destroy any future delivery timelines and expectations.

4. Ignoring the fundamentals in the engineering team while hiring

Hiring skilled engineers and building a quality team with experts in all aspects is one of the hardest things in the software industry. Organisations use various techniques to tackle this problem.

FAANG and similar technology-oriented companies evaluate problem-solving skills through data structures and algorithms questions, coupled with system design and behavioural interviews.

The other end of the spectrum uses just a couple of chats with the candidate before offering the job. The bottom line is that the whole industry is divided, debating the correct process.

However, what matters is to find out if the candidate knows the fundamentals with practical knowledge. For example, this is an essential list of items in which a candidate should demonstrate knowledge and practical skills.

Common data structures: Array, linked list, stack, queue, symbol table, tree(s), heap.

Operating Systems: Process management, memory management, IPC, multi-threading, I/O and hardware, OS file system

Networking: IP, TCP, HTTP, DNS, OSI model

Basic software security

Testing methodologies for software

Relational and non-relational database fundamentals

Basics of distributed systems (Data replication, load balancing, CAP theorem)

Best practices of software engineering (SOLID, DRY, KISS)

The list can go on, and I deliberately put the list to highlight a key concept here. There is nothing related to any particular programming language or any framework.

But if we look at job advertisements for software engineers, they look like this.

React Web Developer — React, TypeScript, Angular Front-end

Backend Engineer — Java 8/9, Spring, Test Driven Development, Agile Methodologies

This process results in developers focusing more on learning a particular technology, language, or framework rather than the fundamentals. On the other hand, organisations keep hiring developers who have only surface knowledge.

You may ask what is wrong with this, and how it contributes to the failure.

When the developers do not fully understand the impacts of the code and the solution they provide for the business, it takes more time to reiterate and validate.

  • Choosing a hash map over a linked list to store items with identifiers, for its superior “lookup” capability, can be the difference between a high-performing API and an unusable one.
  • The inability to understand concurrency controls in a multi-threaded application could land the organisation in court by showing person A’s personal data to person B.
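The first point can be sketched in a few lines. This is an illustration only: a Python list of pairs stands in for the linked list, a `dict` stands in for the hash map, and the identifiers and item count are made up. The scan also counts how many entries it had to inspect.

```python
from typing import Dict, List, Optional, Tuple


def find_in_list(items: List[Tuple[str, str]], item_id: str) -> Tuple[Optional[str], int]:
    """Linear scan, as with a linked list: returns the value and the
    number of entries inspected before it was found."""
    inspected = 0
    for key, value in items:
        inspected += 1
        if key == item_id:
            return value, inspected
    return None, inspected


def find_in_dict(index: Dict[str, str], item_id: str) -> Optional[str]:
    """Hash-based lookup: goes to the entry via the key's hash, no scan."""
    return index.get(item_id)


# Hypothetical data set: 100,000 items keyed by identifier.
items = [(f"id-{n}", f"item {n}") for n in range(100_000)]
index = dict(items)

value, inspected = find_in_list(items, "id-99999")
print(value, inspected)                 # item 99999 100000
print(find_in_dict(index, "id-99999"))  # item 99999
```

Both calls return the same item, but the scan inspects every one of the 100,000 entries to reach the last identifier. Repeated on every API request, that difference is what separates the usable API from the unusable one.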

The less developers know about how things really work, the more bugs you will find in the code. Hence, the team will have to spend more time fixing those bugs, which costs time and money.

As we discussed earlier, dealing with ambiguity is the name of the game; hence asking questions and clarifying things before you get to implementation is the best way to reduce delivery time. But the team cannot do that if, for lack of knowledge, it does not know what to ask.

Ignoring the fundamentals in computer science and software engineering while hiring is a reason for software delivery failures.

5. Letting the business decide how to build the engineering solution

Not all organisations have to operate in technology-first mode. Each has its history and way of working.

Most non-technology firms have a business team that runs the show, and they deserve to do so as they are the domain experts in their particular area.

The failure in software delivery comes into the picture when the business stakeholders try to formulate or influence the technical solution.

Their contribution is to present the business problem and articulate it to the technical team. The technical team should take ownership of creating the solution.

I have observed that when business teams want to have at it, they tend to make decisions based on their prior knowledge or limited skills in the technical field. Unfortunately, the results are often catastrophic.

Once, I observed a senior business person decide to leverage an existing solution for a new business problem. On the surface, it looked excellent and feasible.

However, the underlying database for this existing solution was a non-relational document database.

The technical team wanted to evaluate options and go back to the drawing board. But due to pressure, they decided to go with the suggestion. Later, the technical team realised that they could not execute some of the queries for the new solution, since the data column they needed for those queries was not the partition key.

It would have been better if the team had been given time to analyse the query patterns and evaluate whether the existing tables could support those queries. Eventually, they had to replace that half-baked solution with a new one, at the cost of more time and effort.
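The partition key problem can be illustrated with a toy model. This is a sketch under assumptions, not the API of any real database: the class, its methods, and the sample documents are all hypothetical. In many partitioned stores, a query on a field that is not the key is rejected, needs a secondary index, or falls back to a full scan, depending on the product. The sketch models only the scan case and counts the documents read.

```python
from collections import defaultdict
from typing import Any, Dict, List


class PartitionedStore:
    """Toy document store that groups documents by partition key (hypothetical)."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
        self.documents_read = 0  # rough stand-in for query cost

    def put(self, partition_key: str, doc: Dict[str, Any]) -> None:
        self._partitions[partition_key].append(doc)

    def query_by_key(self, partition_key: str) -> List[Dict[str, Any]]:
        """Cheap path: only one partition is touched."""
        docs = self._partitions.get(partition_key, [])
        self.documents_read += len(docs)
        return list(docs)

    def scan_by_attribute(self, name: str, value: Any) -> List[Dict[str, Any]]:
        """Expensive path: every document in every partition is read."""
        matches = []
        for docs in self._partitions.values():
            for doc in docs:
                self.documents_read += 1
                if doc.get(name) == value:
                    matches.append(doc)
        return matches


store = PartitionedStore()
store.put("customer-1", {"order": "A", "status": "open"})
store.put("customer-1", {"order": "B", "status": "closed"})
store.put("customer-2", {"order": "C", "status": "open"})

print(len(store.query_by_key("customer-1")), store.documents_read)
store.documents_read = 0
print(len(store.scan_by_attribute("status", "open")), store.documents_read)
```

The existing solution was designed for the first kind of query. The new business problem needed the second kind, and this is the sort of thing an analysis of query patterns would have surfaced before any commitment was made.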

The business deciding how the technical solution should be built is a reason for software delivery failures.

6. Vendor-driven outsourced development

Using outsourced development units as a cost-cutting option is not a new phenomenon, and organisations are quite used to doing it. It can work for small deliveries if the work is closely monitored and quality controlled by the team’s senior engineer(s).

However, most organisations start with that small amount of work, get a taste for it, and tend to give more and more to these service provider vendors. These vendors could be offshore service companies or onshore consultancies.

Eventually, these vendors gain influence in the organisation’s internal structure. At the same time, they start to lose their initial interest in these old clients’ work as they grow their business with new clients.

This new landscape leads to lower-quality work, the assignment of low-skilled engineers, and missing accountability for the deliveries.

And the worst scenario is that the people who come from vendors do not have the organisation's best interest at heart. The work they do is just another client engagement. There is no passion, no accountability, and no real commitment to the organisation.

It is not going to be a healthy relationship once this happens. After a couple of delivery delays and failures, the blame game starts. The management tries to save its own neck, the vendor tries to keep the next project deal, and no one focuses on the critical delivery items in the current pipeline.

Vendor-driven, outsourced, zero-ownership software delivery leads to blame games and failures.
