The ART of SAFe: Revamping SAFE's Program Level PI Metrics

“Base controls on relative indicators and trends, not on variances against plan” – Bjarte Bogsnes, Implementing Beyond Budgeting

Introduction

The series began with an overview of a metric model defined to address the following question:

"Is the ART sustainably improving in its ability to generate value through the creation of a passionate, results-oriented culture relentlessly improving both its engineering and product management capabilities?"

The ensuing posts delved into the definitions and rationale for the Business Impact, Culture, Quality and Speed quadrants. In this final article, I will address dashboard representation, implementation and application.

Dashboard Representation

The model is designed such that the selected set of metrics will be relatively stable unless the core mission of the ART changes. The only expected change would result from either refinement of the fitness function or incorporation of the advanced measures as the ART becomes capable of measuring them.

Given that our focus is on trend analysis rather than absolutes, my recommendation is that for each measure the dashboard reflects the vale for the PI just completed, the previous PI and the average of the last 3 PI’s. Given the assumption that most will initially implement the dashboard in Excel (sample available here), I would further suggest the use of conditional formatting to color-code movement (dark green for strongly positive through dark red for strongly negative).

Implementation

In The Art of Business Value, Mark Schwartz proposes the idea of “BI-Driven Development (BIDD?)”. His rationale? “In the same sense that we do Test-Driven-Development, we can set up dashboards in the BI or reporting system that will measure business results even before we start writing our code”.

I have long believed that if we are serious about steering product strategy through feedback, every ART should either have embedded analytics capability or a strong reach into the organisation’s analytics capability. While the applicability extends far beyond the strategic dashboard (ie per Feature), I would suggest the more rapidly one can move from a manually collated and completed spreadsheet to an automated analytics solution the more effective the implementation will be.

Virtually every metric on the dashboard can be automatically captured, whether it be from the existing enterprise data-warehouse for Business Metrics, the Feature Kanban in the agile lifecycle management tool, Sonarqube, the logs of the Continuous Integration and Version Control tools or the Defect Management System. Speed and Quality will require deliberate effort to configure tooling such that the metrics can be captured, and hints as to approach were provided in the rationales of the relevant deep-dive articles. NPS metrics will require survey execution, but are relatively trivial to capture using such tools as Survey Monkey.

Timing

I cannot recommend base-lining your metrics prior to ART launch strongly enough. If you do not know where you are beginning, how will you understand the effectiveness of your early days? Additionally, the insights derived from the period from launch to end of first PI can be applied in improving the effectiveness of subsequent ART launches across the enterprise.

With sufficient automation, the majority of the dashboard can be in a live state throughout the PI, but during the period of manual collation the results should be captured in the days leading up to the Inspect & Adapt workshop.

Application

The correct mindset is essential to effective use of the dashboard. It should be useful for multiple purposes:

Enabling the Portfolio to steer the ART and the accompanying investment strategy
Enabling enterprise-level trend analysis and correlation across multiple ARTs
Improving the effectiveness of the ART’s Inspect and Adapt cycle
Informing the strategy and focus areas for the Lean Agile Centre of Excellence (LACE)

Regardless of application specifics, our focus is on trends and global optimization. Are the efforts of the ART yielding the desired harvest, and are we ensuring that our endeavors to accelerate positive movement in a particular area are not causing sub-optimizations elsewhere in the system?

It is vital to consider the dashboard as a source not of answers, but of questions. People are often puzzled by the Taiichi Ohno quote “Data is of course important … but I place the greatest emphasis on facts”. Clarity lies in appreciating his emphasis on not relying on reports, but rather going to the “gemba”. For me, the success of the model implementation lies in the number and quality of questions it poses. The only decisions made in response to the dashboard should be what areas of opportunity to explore – and of course every good question begins with why. For example:

Why is our feature execution time going down but our feature lead time unaffected?
Why has our deployment cycle time not reduced in response to our DevOps investment?
Why is Business Owner NPS going up while Team NPS is going down?
Why is our Program Predictability high but our Fitness Function yield low?
Why is our Feature Lead Time decreasing but our number of production incidents rising?

Conclusion

It’s been quite a journey working through this model, and I’m grateful for all the positive feedback I have received along the way. The process has inspired me to write a number of supplementary articles.

The first of these is a detailed coverage of the Feature Kanban (also known as the Program Kanban). Numerous people have queried me as to the most effective way of collecting the Speed Metrics, and this becomes trivial with the development an effective Feature Kanban (to say nothing of the other benefits).

I’ve also wound up doing a lot of digging into “Objectives and Key Results” (OKR’s). Somehow the growing traction of this concept had passed me by, and when my attention was first drawn to it I panicked at the thought it might invalidate my model before I had even finished publishing it. However, my research concluded that the concepts were complementary rather than conflicting. You can expect an article exploring this to follow closely on the heels of my Feature Kanban coverage.

There is no better way to close this series than with a thought from Deming reminding us of the vital importance of mindset when utilising any form of metric.

“People with targets and jobs dependent upon meeting them will probably meet the targets – even if they have to destroy the enterprise to do it.” – W. Edwards Deming

2 comments:

Tim WalkerMarch 2, 2017 at 6:19 AM
Love this! Curious why unit test coverage is included. I only ask this because that's a metric that can be completely misused. We can have 100% test coverage but *not assert anything*. I'd like to see "Running Tested Features" as described in: http://ronjeffries.com/xprog/articles/jatrtsmetric/ Thank you for the great post.

The ART of SAFe

Saturday, February 18, 2017

Revamping SAFE's Program Level PI Metrics - Conclusion