Saturday, January 14, 2017

Revamping SAFe's Program Level PI Metrics Part 1/6 - Overview

"Performance of management should be measured by potential to stay in business, to protect investment, to ensure future dividends and jobs through improvement of product and service for the future, not by the quarterly dividend" - Deming

Whilst the Scaled Agile Framework (SAFe) has evolved significantly over the years since inception, one area that has lagged is that of metrics. Since the Agile Release Train (ART) is the key value-producing vehicle in SAFe, I have a particular interest in Program Metrics - especially those produced on the PI boundaries.

In tackling this topic, I have numerous motivations. Firstly, I want to acknowledge that it is easier to critique than create. I have often harassed Dean Leffingwell over the need to revamp the PI metrics, but not until recently have I developed a set of thoughts which I believe meaningfully contribute to progress. Further, I wish to help organisations avoid falling into the all-too-common traps of mistaking velocity for productivity or simply adopting the default “on time, on budget, on scope” and phase gate inheritance. It is one thing to tout Principle 5 – Base milestones on objective evaluation of working systems, and quite another to provide a sample set of measures which provide a convincing alternative to traditional milestones and measures.

Scorecard Design

It is not enough to look at value alone. One must take a balanced view not just of the results being achieved but of the sustainability of those results. In defining the PI scorecard represented here, I was in pursuit of a set of metrics which answered the following question:

"Is the ART sustainably improving in its ability to generate value through the creation of a passionate, results-oriented culture relentlessly improving both its engineering and product management capabilities?"

After significant debate, I settled on four quadrants, each focused on a specific aspect of the question above: Business Impact, Culture, Quality and Speed.

For each quadrant, I have defined both a basic and an advanced set of metrics. The basics represent “the essentials”, the bare minimum that should be measured for a train. However, if one desires to truly use metrics to both measure and identify opportunities for improvement, some additional granularity is vital – and this is the focus of the additional advanced metrics.

Business Impact

Whilst at first glance this quadrant might look sparse, the trick is in the “Fitness Function”. Wikipedia defines it as “a particular type of objective function that is used to summarise, as a single figure of merit, how close a given design solution is to achieving the set aims”. Jeff Bezos at Amazon quite famously applied it, insisting that every team in the organisation develop a fitness function to measure how effectively they were impacting the customer. It will be different for every ART, but should at minimum identify the key business performance measures that will be impacted as the ART fulfils its mission.
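To make the "single figure of merit" idea concrete, here is a minimal sketch of one possible fitness function. Everything in it is an assumption for illustration only: the `Measure` fields, the weighting scheme and the two example measures are hypothetical, and a real ART would choose its own measures and its own way of combining them.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Measure:
    """One business performance measure (hypothetical structure)."""
    name: str
    baseline: float  # value at the start of the PI
    target: float    # value the ART is aiming for
    actual: float    # value observed at the PI boundary
    weight: float    # relative importance, chosen by the business owners


def progress(m: Measure) -> float:
    """Fraction of the baseline-to-target gap closed, clamped to [0, 1].

    Works for "higher is better" and "lower is better" measures alike,
    because the direction is implied by target relative to baseline.
    """
    gap = m.target - m.baseline
    if gap == 0:
        return 1.0  # assumption: no movement was targeted, so treat as met
    return max(0.0, min(1.0, (m.actual - m.baseline) / gap))


def fitness(measures: list[Measure]) -> float:
    """Weighted average of progress across all measures (0 = none, 1 = all targets hit)."""
    total_weight = sum(m.weight for m in measures)
    return sum(m.weight * progress(m) for m in measures) / total_weight


# Illustrative numbers only, not real data.
example = [
    Measure("conversion_rate_pct", baseline=2.0, target=3.0, actual=2.5, weight=2.0),
    Measure("support_calls_per_week", baseline=400, target=300, actual=350, weight=1.0),
]
print(fitness(example))  # 0.5
```

The weighted average is only one way to collapse several measures into one number; the point is that the ART agrees up front which measures count and how much.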


Culture

The focus in culture is advocacy. Do our people advocate working here? Do our stakeholders advocate our services? Are we managing to maintain a stable ART?


Quality

For quality, our primary question is “are we building quality in?” Unit test coverage demonstrates progress with unit test automation, while “Mean time between Green Builds” and “MTTR from Red Build” provide good clues as to the establishment of an effective Continuous Integration mindset. From there we look at late phase defect counts and validation capacity to understand the extent to which our quality practices are “backloaded” – in short, how much is deferred to “end-to-end” feature validation and pre-release validation activities. And finally, we are looking to see incidents associated with deployments dropping.
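As a rough illustration of the two Continuous Integration measures, the sketch below derives them from a build history. The definitions used here are assumptions, not the definitive ones: "Mean time between Green Builds" is taken as the average gap between consecutive green builds, and "MTTR from Red Build" as the average time from the first red build in a broken streak to the next green build. The `(timestamp, status)` input format is also hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_between_green_builds(builds):
    """Average gap between consecutive green builds, or None if fewer than two.

    `builds` is a list of (timestamp, status) tuples sorted by timestamp,
    where status is "green" or "red" (assumed format).
    """
    greens = [ts for ts, status in builds if status == "green"]
    gaps = [later - earlier for earlier, later in zip(greens, greens[1:])]
    if not gaps:
        return None
    return sum(gaps, timedelta()) / len(gaps)


def mean_time_to_recover_from_red(builds):
    """Average time from the first red build of a streak to the next green build."""
    recoveries = []
    red_since = None
    for ts, status in builds:
        if status == "red" and red_since is None:
            red_since = ts  # the build has just broken
        elif status == "green" and red_since is not None:
            recoveries.append(ts - red_since)  # the build is fixed again
            red_since = None
    if not recoveries:
        return None
    return sum(recoveries, timedelta()) / len(recoveries)


# Illustrative history only, not real data.
start = datetime(2017, 1, 9, 9, 0)
history = [
    (start, "green"),
    (start + timedelta(hours=1), "red"),
    (start + timedelta(hours=2), "red"),
    (start + timedelta(hours=4), "green"),
    (start + timedelta(hours=5), "green"),
]
print(mean_time_between_green_builds(history))  # 2:30:00
print(mean_time_to_recover_from_red(history))   # 3:00:00
```

Trending both numbers across PIs matters more than any single value: shrinking gaps between green builds and faster recovery from red suggest the CI habit is taking hold.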


Speed

This quadrant is focused on responsiveness - how rapidly can our ART respond to a newly identified opportunity or threat? Thus, we start with Feature Lead Time - "how fast can we realise value after identifying a priority feature?". Additionally, we are looking for downward trends in time spent “on the path to production” and mean time to recover from incidents, and an upward trend in frequency of deployments, as our DevOps work pays dividends.
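Feature Lead Time can be sketched as simple date arithmetic. The version below assumes lead time runs from the day a feature is identified as a priority to the day it is released; those two boundary events, and the tuple format, are assumptions for illustration, and an ART may well choose different start and end points.

```python
from datetime import date
from statistics import median


def feature_lead_times(features):
    """Lead time in days for each released feature.

    `features` is a list of (identified_on, released_on) date pairs (assumed
    format); features not yet released have released_on set to None and are skipped.
    """
    return [
        (released_on - identified_on).days
        for identified_on, released_on in features
        if released_on is not None
    ]


# Illustrative dates only, not real data.
features = [
    (date(2017, 1, 2), date(2017, 2, 13)),
    (date(2017, 1, 9), date(2017, 3, 6)),
    (date(2017, 1, 16), None),  # still in progress
]
lead_times = feature_lead_times(features)
print(lead_times)          # [42, 56]
print(median(lead_times))  # 49.0
```

The median is used here rather than the mean so that one unusually slow feature does not dominate the figure; either choice is defensible as long as it is applied consistently from PI to PI.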


In parts 2 through 5 of this series, I will delve into each quadrant in turn, exploring the definition of and rationale for each measure, and in part 6 I will wrap it all up with a look at usage of the complete dashboard.

Series Context

Part 1 – Introduction and Overview (You are here)
Part 2 – Business Impact Metrics
Part 3 – Culture Metrics
Part 4 – Quality Metrics
Part 5 – Speed Metrics 
Part 6 – Conclusion and Implementation 

"Short term profits are not a reliable indicator of performance of management. Anybody can pay dividends by deferring maintenance, cutting out research, or acquiring another company" – Deming

