Saturday, September 10, 2016

Improving SAFe Cost of Delay - A response to Josh Arnold

In recent months, there has been considerable public critique of the SAFe interpretation of Cost of Delay.  This culminated recently in Josh Arnold’s June blog post suggesting improvements.  As the author of the simulation most broadly used to teach the SAFe interpretation (SAFe City), and having introduced it to perhaps more implementations than most in the SAFe world, I thought I’d weigh in.

What’s the debate?


At heart, the debate hinges on whether you drive straight to “fully quantified Cost of Delay” or adopt SAFe’s approach of using a proxy as a stepping stone.  The debate then digs deeper by questioning the validity of the SAFe proxy.

What is full quantification as opposed to proxy COD?


By “fully quantified”, we refer to a situation where the Cost of Delay (COD) has been defined as “$x/period” (e.g. $80K/week).  A proxy approach establishes no formal “value/period”; instead it tends to produce an approximate ratio of Cost of Delay between two options (e.g. Option A has twice the cost of delay of Option B) without actually quantifying it for either option.

Where there’s smoke there’s fire


When the topic of Cost of Delay arises, it’s easy to get lost in intellectual debate.   The reality is that its primary use is prioritization: maximising economic outcomes from a fixed capacity – and a well-implemented proxy will get pretty close to the same results on this front as an attempt at full quantification.  Both approaches seek to expose underlying assumptions and provide a model that assists in applying objective rationale to select between competing priorities.

After a workshop I held on it a few months ago, one audience member came up to me to passionately argue about the theoretical flaws in the SAFe interpretation.  My response was to ask how many organisations he had implemented it in – to which the answer was zero.  At heart, full quantification is hard, and many organisations never begin: the theory sounds good but the practical application seems too daunting.  A proxy is far easier to get moving with, and likely to gradually lead towards full quantification anyway.

“The other thing to realize about our economic decisions is that we are trying to improve them, not to make them perfect. We want to make better economic choices than we make today, and today, this bar is set very low. We simply do not need perfect analysis” - Reinertsen

However, there’s no escaping the fact that there are applications of COD that cannot be achieved by using a proxy.  Further, the SAFe proxy approach is not without its flaws.

What do we lose by not achieving full quantification?


Cost of Delay can be used for a lot more than prioritization.  Key applications we forgo by staying with a proxy include the following:

Economic formulae for optimum utilisation rates


Quantified COD allows us to calculate the economic impact of queues at key bottlenecks and establish an economic basis for the amount of slack to build in.  In short, the hard maths behind why we might staff a particular specialist function to operate at 60-70% utilization in pursuit of flow efficiency.
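As a rough illustration of the kind of maths this makes possible, here is a sketch that trades the cost of specialist capacity against the cost of delay caused by queueing.  It assumes a simple M/M/1 queue and entirely invented figures (arrival rate, COD, capacity cost) – it is not a SAFe formula, just the shape of the argument:

```python
# Sketch only: trading the cost of specialist capacity against the cost of
# delay caused by queueing, using a simple M/M/1 approximation.
# All figures (arrival rate, COD, capacity cost) are invented for illustration.

arrival_rate = 4.0        # jobs arriving per week (assumed)
cod_per_job = 60_000      # cost of delay per waiting job, $/week (assumed)
capacity_cost = 20_000    # cost of one job/week of service capacity, $/week (assumed)

def weekly_cost(utilisation: float) -> float:
    service_rate = arrival_rate / utilisation                 # capacity we must pay for
    queue_wait = utilisation / (service_rate - arrival_rate)  # average weeks a job queues (M/M/1)
    delay_cost = arrival_rate * queue_wait * cod_per_job      # dollars lost to waiting, per week
    return service_rate * capacity_cost + delay_cost

for u in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
    print(f"utilisation {u:.0%}: total cost ~ ${weekly_cost(u):,.0f}/week")
```

With these made-up numbers the total cost bottoms out near 60% utilisation rather than 100% – exactly the kind of hard evidence that makes slack an easier sell.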

SAFe’s approach to this problem is basically a blunt hammer.  Build in overall slack using “IP sprints”, and build in further slack by only committing plans at 80% load.  It then relies on effective application of dependency visualization and inspect-and-adapt cycles to optimize for slack.  The destination is similar in a good implementation, but the people footing the bill would certainly buy in much faster with the hard numbers that quantified cost of delay can provide.

Economic Decision Making for Trade-offs


Reinertsen makes much of the fact that all decisions are economic decisions, and that they are better made in the presence of an economic framework.  Quantified cost of delay allows us to apply economic criteria to decisions such as “do we ship now without Feature X or wait for Feature X to be ready?”, or “if reducing the cost of operations by 10% would delay time to market by 4 weeks, would that be a good trade-off?”.
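To make the second question concrete, here is a hypothetical worked example.  All figures (the $80K/week COD, the $2M operations cost) are assumptions of mine for illustration, not numbers from SAFe or Reinertsen:

```python
# Hypothetical trade-off: accept a 4-week delay in exchange for a 10% cut in
# operations cost. Every figure here is an assumption for illustration only.

cost_of_delay = 80_000              # quantified COD of the release, $/week (assumed)
delay_weeks = 4
annual_operations_cost = 2_000_000  # current run cost of the service, $/year (assumed)
saving_rate = 0.10

delay_cost = cost_of_delay * delay_weeks              # $320,000 forgone by waiting
annual_saving = annual_operations_cost * saving_rate  # $200,000 saved per year

# With quantified COD the decision is simple arithmetic: the saving takes
# delay_cost / annual_saving years (1.6 here) to pay back the delay.
print(f"delay cost ${delay_cost:,}, annual saving ${annual_saving:,.0f}, "
      f"payback {delay_cost / annual_saving:.1f} years")
```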

SAFe currently has no answer to this gap other than to stress that you must develop an economic framework.

Application of global economics to local prioritisation for specialist teams


Quantified Cost of Delay is a global property, whereas delay time is local.  If, for example, a highly capacity-constrained penetration testing team is attempting to balance demands from an entire organisation, it can apply its local delay estimates in conjunction with the supplied cost of delay to easily arrive at globally optimised priorities for its work.   A cost of delay of $60K/week is the same regardless of where in the organisation it is identified.  Relative cost of delay, by contrast, is local to the domain of estimation, and a 13 from one domain will never be the same as a 13 from another domain.
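A sketch of why the quantified figure travels so well (the backlog items and numbers are invented; dividing COD by the local duration estimate is the familiar CD3 / weighted-shortest-job-first idea):

```python
# Hypothetical backlog for a capacity-constrained penetration testing team.
# The quantified COD ($/week) arrives from anywhere in the organisation and is
# directly comparable; the team only adds its own local duration estimates.

backlog = [
    {"item": "Payments platform pen test", "cod_per_week": 60_000, "weeks": 2},
    {"item": "HR portal pen test",         "cod_per_week": 15_000, "weeks": 1},
    {"item": "Mobile app pen test",        "cod_per_week": 90_000, "weeks": 4},
]

# Schedule the highest cost of delay per week of local effort first
# (CD3 / weighted-shortest-job-first with real dollars).
for job in sorted(backlog, key=lambda j: j["cod_per_week"] / j["weeks"], reverse=True):
    score = job["cod_per_week"] / job["weeks"]
    print(f"{job['item']}: priority score {score:,.0f}")
```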

SAFe’s approach to this problem is largely to design it away.   Complex Value Streams and Release Trains sweep up specialist skills and create dedicated local capacity whose priority is set “globally for the system” using the PI planning construct.

What do we lose by achieving full quantification?


Perhaps it’s heresy to suggest a proxy can be better, but if one takes the perspective that COD is “value foregone over time” then a quantified COD inherently only caters to value which can be quantified.  Let me take a couple of examples:

Improved NPS


Every customer I work with (even the Australian Tax Office) considers “customer value” to be a key economic consideration when prioritizing.  Most have moved from measuring customer satisfaction to measuring customer loyalty, using Net Promoter Score (NPS).

The creators (Bain & Company) argue that any mature NPS implementation will eventually be able to quantify the P&L impact of an NPS movement in a particular sector/demographic/service.  However, I have yet to work with a company that has reached this maturity in their NPS implementation.  Losing NPS movement from our value considerations in COD would not be a good thing. 

Learning Value


What of the feature whose core value proposition is to understand how customers respond to an idea?   To validate an assumption the company has about them?   In a quantified world you could argue that the COD would centre on the value that would be lost if the company got its guess wrong, but I suspect the resulting numbers would be highly suspect.

Pirate Metrics


More and more I see customers implementing leading measures such as the Pirate Metrics (Acquisition, Activation, Retention, Referrals and Revenue).  With enough time (and lagging feedback) you can quantify these into hard dollars, but the reality is that for a significant period when introducing new products they don’t quantify well.

With enough work, I’m sure there’s a way to solve these problems in a fully quantified world, but none of the examples I have researched have done so.   The reality is that the vast majority of COD science is based on Reinertsen’s work, and his focus is the “introduction of products”, whereas in the software world we are not simply introducing new products but choosing how to evolve them iteratively and incrementally – it’s a different paradigm.  Achieving an objective balance of qualitative and quantitative inputs is one of the things I have found the proxy model does well.

Is there a killer argument one way or the other?


Personally, I don’t really feel it’s an open-and-shut case.   The reason I like the proxy is simple: it’s easy for customers to get started with.   Full quantification (particularly at feature level) sounds scary and all too easily raises the barrier to entry out of reach.  The longer the proxy is employed, the more hard data is brought to the table – full quantification becomes almost inevitable.  Having said that, Josh and others have successfully kick-started people straight into full quantification – having read their material and attended one of their workshops, I find the starting journey and discussions remarkably similar (flush out the assumptions!).

What (if anything) is wrong with the proxy?


I agree with Josh - the current proxy is flawed when it comes to time sensitivity.  The simplest proof is a legislative requirement.  Imagine that I have a new piece of legislation coming into force on 1 July 2017.  If I am not compliant, I will be charged $100k for every week I am in breach.  It will take me 4 weeks to implement the feature.  Today, in September 2016, there is no cost whatsoever to delaying the feature (COD = 0).  As April/May 2017 approaches, my COD suddenly escalates.  There is no way to reflect this using the current SAFe approach.
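The same example in code form (the dates and penalty come from the scenario above; the shape of the function is my own sketch):

```python
from datetime import date

# Sketch of the legislative example: a $100k/week penalty applies from
# 1 July 2017, and the feature takes 4 weeks to build. The COD only
# "switches on" once delaying the work would push delivery past the deadline.

PENALTY_PER_WEEK = 100_000
DEADLINE = date(2017, 7, 1)
IMPLEMENTATION_WEEKS = 4

def cost_of_delay(assessment_date: date) -> int:
    weeks_remaining = (DEADLINE - assessment_date).days / 7
    slack_weeks = weeks_remaining - IMPLEMENTATION_WEEKS
    # While slack remains, a week's delay costs nothing; once it is gone,
    # every further week of delay incurs the penalty.
    return 0 if slack_weeks > 0 else PENALTY_PER_WEEK

print(cost_of_delay(date(2016, 9, 10)))   # 0       -- no urgency yet
print(cost_of_delay(date(2017, 6, 10)))   # 100000  -- every week now hurts
```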

How did Josh fix it?


Josh suggested using time as a multiplier component, which is most definitely an improvement but not the approach I would take.  For one thing, I tend to find people over-fixate on timing in their early days with COD.  A multiplier is a fairly blunt instrument (think double and triple), and you would have to be careful to talk in percentages.

However, the real challenge is that the question we are trying to ask ourselves is “how does value change over time?”, or “how is value sensitive to time?”.   Let’s take two examples based on a federal government client of mine:

  • They have many processes, occurring yearly, that they are seeking to digitally enable or enhance.  A feature which is delivered just before peak lodgement activity is far more valuable than a feature delivered just after the peak.  In fact, if you miss the peak you might choose to defer the feature for 9-10 months.
  • They have numerous features which must be delivered in time for a legislative change.  There is very little advantage other than risk mitigation to delivering these features 6 months early.  Using time as a multiplier would allow us to arrive at "zero COD" for early delivery of a legislative change, but this is entirely dependent on the date at which I assess it.

How would I fix it?



My belief is that we need to focus on the timing of the COD assessment, rather than the timing component of the COD.  At any given point that we assess it, we are effectively assessing it “based on today’s urgency”. 

At this point, we can leverage Josh’s great work with urgency profiles.   Each urgency profile takes the form of a graph, and the graphs tend to have inflection points representing moments (timing) when the value goes through significant change.  
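One way to picture an urgency profile is as a handful of inflection points with a relative value attached to each.  The sketch below uses invented dates and values for a yearly “peak lodgement” style feature; it is my own rendering of the idea, not Josh’s actual profiles:

```python
from datetime import date

# An urgency profile expressed as inflection points: (date, relative value of
# delivering on or after that date). Dates and values are invented for a
# yearly "peak lodgement" style feature.

inflection_points = [
    (date(2017, 1, 1), 10),  # well before the peak: full value
    (date(2017, 5, 1), 4),   # peak under way: much of the value has gone
    (date(2017, 7, 1), 1),   # peak missed: barely worth delivering this year
]

def relative_value(delivered_on: date) -> int:
    """Relative value of delivery on a given date (steps down at each inflection)."""
    value = inflection_points[0][1]
    for point_date, point_value in inflection_points:
        if delivered_on >= point_date:
            value = point_value
    return value

print(relative_value(date(2016, 12, 1)))  # 10
print(relative_value(date(2017, 6, 1)))   # 4
```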

This is what I would do:
  • When first assessing COD for an item (be it an Epic or a Feature), assign an "Urgency Profile" to it and make explicit the dates believed to be associated with its inflection points.
  • Eliminate the time criticality component of the COD formula.
  • Separate the User and Business Value components.  Most implementations I work with tend to do this, defining "Customer Value" and "Business Value" to separate considerations of customer loyalty from hard-nosed underlying business value.  This would also open the door to introducing quantification more easily (with relative values based on dollar-value brackets, perhaps).
  • Make explicit in the guidance the fact that when applying the Proxy to your organisation you need to consider the relative weighting of the proxy components.
  • When assessing the Customer, Business and Risk Reduction/Opportunity Enablement values, do so in light of "today's urgency".
  • Based on the identified inflection points on the urgency profile, flag the date when the COD of the item needs to be re-assessed.

This solves two fundamental problems:
  • It ensures we have a systematic (and meaningful) approach to considering time sensitivity and its impact on COD, without suffering from the flaws of the current approach.
  • It establishes a rhythm for re-assessment, sadly lacking in most current implementations of the SAFe COD model, which tend to be "set and forget".
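Pulling the proposal together as a sketch: the field names, weightings and scales below are illustrative assumptions of mine, not official SAFe guidance, but they show how the pieces (urgency profile, separated value components, organisation-specific weighting, re-assessment date) would hang together:

```python
from dataclasses import dataclass
from datetime import date

# Sketch of the modified proxy: time criticality is gone, Customer and Business
# value are separated, components carry an organisation-specific weighting, and
# every assessment records its urgency profile and a re-assessment date.
# Field names, weights and scales are illustrative assumptions, not SAFe guidance.

@dataclass
class CodAssessment:
    item: str
    urgency_profile: str     # e.g. "peak lodgement", "legislative deadline"
    customer_value: int      # relative scale, scored at today's urgency
    business_value: int      # relative scale, scored at today's urgency
    risk_opportunity: int    # risk reduction / opportunity enablement
    reassess_on: date        # next inflection point on the urgency profile

    def proxy_cod(self, weights=(1.0, 1.0, 1.0)) -> float:
        wc, wb, wr = weights  # relative weighting chosen by the organisation
        return wc * self.customer_value + wb * self.business_value + wr * self.risk_opportunity

assessment = CodAssessment(
    item="Pre-fill lodgement data",
    urgency_profile="peak lodgement",
    customer_value=8, business_value=5, risk_opportunity=2,
    reassess_on=date(2017, 3, 1),  # re-score before the peak inflection point
)
print(assessment.proxy_cod(weights=(1.5, 1.0, 0.5)))  # 18.0
```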