co0p

tech, culture & random thoughts

21 Nov 2021

Talking Cost In (Agile) Software Development

[Image: agile vs programming motherfucker] Agile manifesto vs the programming motherfucker manifesto … all devs want is to program.

In the companies I worked for as a software engineer, there was always a disconnect between the economic side of product development and the implementing side. I suspect that we developers are not good at talking in money terms and fail to make a good case for the technical side by highlighting the related financial benefits. In my career I witnessed discussions with management about writing tests (“it’s a waste of time”), constant feature development (“maintenance work does not give me new customers”) and working overtime to meet a theoretical deadline (“our Cxx promised that we deliver on X.X.XX”).

One time, one of the COOs stormed into the IT department and pressured us to deploy software that had been hastily prototyped, without basic tests in place, on top of a codebase that had been left unmaintained for a long time … and we did. We then spent the following two weeks fixing the production database and trying to get the crashed system back online. You can imagine management not being happy and the other departments not trusting our abilities. We delivered late, were always too expensive, and quality was subpar. Basically The Phoenix Project in real life. In comparison, the marketing team was always on good terms with management.

So why do we engineers end up in this corner?

While doing research on Kanban and trying to find a better way to prioritize tasks, I stumbled upon an interview with Don Reinertsen in which he explains Cost of Delay. In this interview Don mentions two things that stuck with me and made me wonder whether his observations might (at least partially) answer the previous question.

But first, let me explain Cost of Delay.

Cost of Delay

Let’s say we are running our software in the cloud, paying 200 € per month. Another provider offers the same service for half the price (100 €). How much money will you lose every month you postpone the migration? That is the Cost of Delay. Or, put slightly differently, “Cost of Delay is the money you lose when you deliver late”.

You could also pose the question in a more positive tone: “What amount of money would I gain if the cycle time were reduced to time t?” This way of asking the question is a perfect segue into the whole Kanban and lean manufacturing universe, which is obsessed with optimizing flow, eliminating waste and improving cycle time. No wonder D. Reinertsen writes about Product Development Flow.
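The cloud example boils down to one subtraction and one multiplication. A minimal sketch, assuming the migration itself is free and both prices stay constant (the function name and parameters are mine, not from the interview):

```python
def cost_of_delay(current_monthly_cost, new_monthly_cost, months_delayed):
    """Money lost by staying on the more expensive option for `months_delayed` months."""
    monthly_saving = current_monthly_cost - new_monthly_cost
    return monthly_saving * months_delayed


# The example from the text: 200 € per month today, 100 € per month elsewhere.
print(cost_of_delay(200, 100, months_delayed=1))  # 100
print(cost_of_delay(200, 100, months_delayed=6))  # 600
```

Every month of postponing the migration adds another 100 € to the bill.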

Talking cost to me

Back to the quotes. In the interview, D. Reinertsen argues that most agile methods concentrate on value. But value alone is not enough; you need to take time into account as well. And arguing about value is subjective; it is much better to ask for support in a language management understands.

[…] it’s about economics what tends to be very important about that is that a lot of methodologies like scrum have this view of I don’t want senior managers bothering me with economic issues, just go away and let me take care of writing the code

And the second quote:

[…] if you cannot engage with management speaking the language they understand, which is ‘how do I make money by doing this’ then you will not get as much support from management.

If you are familiar with DDD and the concept of “ubiquitous language”, this should be a no-brainer. The development team and the customers/users should speak the same language, use the same terms and have a common understanding of those terms in a given context. Let’s replace “customer” with “your management”. Do the development team and the customer speak the same language? What do you think “maintenance work”, “refactoring” or “bug fixing” evoke in the customer? Certainly not a warm fuzzy feeling of increased revenues and new opportunities.

Different types of CoD

As in most agile methodologies, future work is prioritized in some form of backlog. If you can estimate the Cost of Delay of future tasks, you can use this information to prioritize your list. By applying the CD3 method (Cost of Delay Divided by Duration), you get a list ordered to reduce Cost of Delay. But not all CoDs are equal. In Kanban there are four classes of service, which govern how the team handles the work, that is, how it prioritizes the tasks at hand: fixed date, standard, intangible and expedite.

Now imagine being responsible for scheduling work for the development department, and you are discussing what to schedule for next week’s iteration: marketing wants this shiny new frontend done for the next product launch, operations asks for development support to integrate structured logging, and then there is this edge-case bug when a big query is being executed. Thank god there is no critical incident right now.

fixed date service type

[Image: fixed date profile]

Marketing wishing for an update of the website promoting the product launch is clearly a fixed-date request. The development team validated all the mockup screens and estimated one week of work to implement the frontend. Marketing estimates a loss of 10k € per week, starting from the product launch date, which is in 3 weeks.

So what is the cost of delay of postponing the task for one week? Zero €/week! Sorry, marketing: we will get to the frontend next week, before the Cost of Delay becomes 10k €/week.
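The fixed-date profile can be sketched as a function of how long the start is postponed. This is a rough model under my own assumptions: the loss grows linearly at marketing's estimated rate, and finishing exactly on the launch date counts as on time. The function and parameter names are made up for illustration.

```python
def fixed_date_delay_cost(weeks_postponed, weeks_until_launch=3,
                          weeks_of_work=1, loss_per_week=10_000):
    """Total loss in € if work starts `weeks_postponed` weeks from now."""
    finish_week = weeks_postponed + weeks_of_work
    weeks_late = max(0, finish_week - weeks_until_launch)
    return weeks_late * loss_per_week


for postponed in range(5):
    print(postponed, fixed_date_delay_cost(postponed))
# 0 0
# 1 0
# 2 0
# 3 10000
# 4 20000
```

As long as the work finishes before the launch, delaying it costs nothing; after that, every further week costs the full 10k €.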

standard service type

[Image: standard profile]

The operations team wants more insight into the running application. They have already provided an ELK stack and are now waiting for your team to send the logging data. The cost of running the ELK stack is negligible. They tell you that every incident costs them 8 hours of work (~1000 €). They think that the improved logging will reduce that time to 4 hours. They report an average of two incidents per week. They also mention that, based on the last product launch, they expect the incident rate to increase once the new product is launched.

So what is the cost of delay of postponing the task for one week? Two incidents per week times 4 hours saved per incident is 8 hours per week, or ~1000 €/week.
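The 1000 €/week figure follows directly from the numbers operations provided. A small sketch of the arithmetic, where the hourly rate is derived from "8 hours ~1000 €" and the incident rate of 3 in the second call is a hypothetical value for the time after the launch:

```python
HOURS_PER_INCIDENT_NOW = 8
HOURS_PER_INCIDENT_WITH_LOGGING = 4
COST_PER_INCIDENT_NOW_EUR = 1000

# Derived hourly rate: 1000 € / 8 h = 125 €/h
HOURLY_RATE_EUR = COST_PER_INCIDENT_NOW_EUR / HOURS_PER_INCIDENT_NOW


def weekly_cost_of_delay(incidents_per_week):
    """€ lost per week for as long as the improved logging is missing."""
    hours_saved_per_incident = HOURS_PER_INCIDENT_NOW - HOURS_PER_INCIDENT_WITH_LOGGING
    return incidents_per_week * hours_saved_per_incident * HOURLY_RATE_EUR


print(weekly_cost_of_delay(2))  # 1000.0, today's rate of two incidents per week
print(weekly_cost_of_delay(3))  # 1500.0, hypothetical rate after the product launch
```

Because the cost scales with the incident rate, the expected increase after the launch makes waiting more expensive over time.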

intangible service type

[Image: intangible profile]

Occasionally one of your customers executes a long-running query which blocks the processing of other queries. Your development team wants to investigate and prototype different solutions for vertical scaling. They estimated an effort of one week.

So what is the cost of delay of postponing the task for one week? Zero. Sorry, dev team, you did not give me enough information to justify an implementation.

What your dev team could have done is gather some data, align it with the business model and prepare for the upcoming product launch: Over the course of the last year we observed ~10 blocking queries per month with an average duration of 10 minutes. We had 1000 customers executing queries. Having one customer blocking the queue makes other customers wait. Sales tells us that they expect the number of customers to double with the upcoming product launch. We bill our customers for each executed query. We propose to implement vertical scaling to increase throughput and handle the upcoming load of requests.

What is the cost of delay now? Is it even intangible? I would argue that this request is now competing with the operations team’s request.
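With data like this, a rough number becomes possible once two more inputs are known: how many queries normally run per minute and what one query is billed at. The text gives neither, so the values below are purely hypothetical placeholders. The sketch also assumes that every query that cannot run while the queue is blocked is lost revenue, and that both traffic and the number of blocking queries grow linearly with the number of customers.

```python
def blocked_revenue_per_month(blocking_queries_per_month, avg_block_minutes,
                              queries_per_minute, price_per_query_eur,
                              customer_growth=1.0):
    """Estimate the revenue in € stuck behind blocking queries each month."""
    blocked_minutes = blocking_queries_per_month * customer_growth * avg_block_minutes
    lost_queries = blocked_minutes * queries_per_minute * customer_growth
    return lost_queries * price_per_query_eur


# From the text: ~10 blocking queries per month, ~10 minutes each,
# and sales expects twice as many customers after the launch.
# Hypothetical: 5 queries per minute, billed at 0.10 € each.
today = blocked_revenue_per_month(10, 10, queries_per_minute=5,
                                  price_per_query_eur=0.10)
after_launch = blocked_revenue_per_month(10, 10, queries_per_minute=5,
                                         price_per_query_eur=0.10,
                                         customer_growth=2.0)
print(f"today: {today:.2f} €/month, after launch: {after_launch:.2f} €/month")
```

The exact figures matter less than the shape: under these assumptions, doubling the customers quadruples the loss, which is the kind of statement management can weigh against the other requests.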

expedite service type

[Image: expedite profile]

While you are doing the planning, you get a call from operations: one of the query services is down. I guess you delayed the vertical scaling problem for too long :-)

verdict

Using Cost of Delay as a proxy variable to prioritize your backlog seems to be a good idea. It forces you and your team to get some numbers and convert them into cost. And cost is measured in money, which is what your management ultimately cares about. That increases your chance of getting their support.

There is still the problem of guessing almost every number involved. For example, marketing estimates a revenue drop and the development team estimates the implementation duration. You will never nail those estimates with 100% accuracy. Getting better at estimating is a different topic, though.

And then there is the problem that you will most likely end up postponing the intangible requests, building up tons of technical debt, while expediting every request from upper management. To remedy this, David J. Anderson promotes putting a capacity limit on the expedite and intangible classes of service.
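To make the prioritization concrete, here is a minimal sketch of a CD3 ordering (Cost of Delay Divided by Duration) for the three requests from the planning scenario. The Cost of Delay values are this week's figures from the sections above. The one-week durations for the frontend and the scaling prototype come from the text; the two weeks for the logging work are a hypothetical value, since no estimate was given.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    cost_of_delay_per_week: float  # € lost per week of delay
    duration_weeks: float          # estimated effort

    @property
    def cd3(self):
        """Cost of Delay Divided by Duration: higher means schedule sooner."""
        return self.cost_of_delay_per_week / self.duration_weeks


def prioritize(tasks):
    """Return the tasks ordered by descending CD3 score."""
    return sorted(tasks, key=lambda task: task.cd3, reverse=True)


backlog = [
    Task("marketing frontend", 0, 1),          # fixed date: no cost yet this week
    Task("structured logging", 1000, 2),       # duration of 2 weeks is hypothetical
    Task("vertical scaling prototype", 0, 1),  # intangible: no numbers provided
]

for task in prioritize(backlog):
    print(f"{task.name}: {task.cd3:.0f} €/week per week of work")
```

With these inputs the logging work comes first. The order has to be recomputed every week, because the frontend's Cost of Delay jumps once its slack before the launch is used up.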