Sunday, February 26, 2017

Getting from Idea to Value with the ART Program Kanban

Introduction


Readers of this blog will be no strangers to the fact that I'm a strong believer in the value of kanban visualization at all levels of the SAFe implementation.  Further, if you've been reading my Metrics Series you may have been wondering how to collect all those Speed Metrics.

Given that the Feature is the primary unit of "small batch value flow" in SAFe, effective application of kanban tools to the feature life-cycle is critical in supporting study and improvement of the flow of value for an Agile Release Train (ART).

The first application for most ARTs is enabling flow in the identification and preparation of Features for PI planning.  Many ARTs emerge from their first PI planning promising themselves to do a better job of starting early on identifying and preparing features for their next PI, only to hit panic stations two weeks out from the start of PI 2.  Introducing a kanban to visualize this process is extremely valuable in creating the visibility and momentum needed to solve the problem.

However, I vividly remember the debate my colleague +Em Campbell-Pretty and I had with +Dean Leffingwell and +Alex Yakyma  regarding their proposed implementation of the SAFe Program Kanban over drinks at the 2015 SAFe Leadership Retreat in Scotland.  Their initial cut positioned the job of the program kanban as "delivering features to PI planning", whilst we both felt the life-cycle needed to extend all the way to value realization.  This was in part driven by our shared belief that a feature kanban made a great visualization to support Scrum-of-Scrums during PI execution but primarily by our drive to enable optimization of the full “Idea to Value” life-cycle.   Dean bought in and adjusted the representation in the framework (the graphic depicting the backlog as an interim rather than an end-state was in fact doodled by Em on her iPad during the conversation).

A good Kanban requires a level of granularity appropriate to exposing the bottlenecks, queues and patterns throughout the life-cycle.  Whilst the model presented in SAFe acts much as the Portfolio Kanban in identifying the overarching life-cycle states, it leaves a more granular interpretation as an exercise for the implementer.

Having now built (and rebuilt) many Program Kanban walls over the years while coaching, I've come to a fairly standard starting blueprint (depicted below).  This article will cover the purpose and typical usage of each column in the blueprint.



Note: My previous article on Kanban tips-and-tricks is worthwhile pre-reading in order to best understand and leverage the presented model.  Avatars should indicate both Product Management team members and Development teams associated with the Feature as it moves through its life-cycle.

Background thinking

The fundamental premise of lean is global optimization of the flow from idea to value.  Any truly valuable Feature Kanban will cover this entire life-cycle.  The reality is that many ARTs do not start life in control of the full cycle, but I believe this should not preclude visualization and monitoring of the full flow.  In short, “don’t let go just because you’re not in control”.  If you’re not following the feature all the way into production and market realization you’re missing out on vital feedback.

The Kanban States


Funnel

This is the entry point for all new feature ideas.  They might arrive here as features decomposed out of the Epic Kanban, features decomposed from Value Stream Capabilities, or as independently identified features.  In the words of SAFe, "All new features are welcome in the Feature Funnel".
No action occurs in this state; it is simply a queue with (typically) no exit policies.

Feature Summary

In this state, we prepare the feature for prioritization.  My standard recommendation is that ARTs adopt a "half to one page summary" Feature Template (sample coming soon in a future article).

Exit Policies would typically dictate that the following be understood about the feature in order to support an effective Cost of Delay estimation and WSJF calculation:

  • Motivation (core problem or opportunity)
  • Desired outcomes
  • Key stakeholders and impacted users
  • Proposed benefits (aligned to Cost of Delay drivers)
  • Key dependencies (architectural or otherwise)
  • Very rough size.

Prioritization


Features are taken to a Cost of Delay estimation workshop, their WSJF is calculated, and they are either rejected or approved to proceed to the backlog.

Exit Policies would typically indicate:

  • Initial Cost of Delay agreed
  • WSJF calculated
  • Feature has requisite support to proceed.
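
For those who like to see the arithmetic, here is a minimal sketch of the standard WSJF calculation (Cost of Delay as the sum of user-business value, time criticality and risk reduction/opportunity enablement, divided by job size).  The feature names and relative estimates are purely illustrative, not from a real backlog.

```python
# Minimal WSJF sketch.  Relative estimates (e.g. modified Fibonacci) per feature.
features = [
    # (name, user-business value, time criticality, risk reduction/opportunity, job size)
    ("Single sign-on",  8, 13, 5,  8),
    ("Usage analytics", 5,  3, 8,  5),
    ("Bulk import",    13,  8, 2, 20),
]

def wsjf(value, criticality, risk_opportunity, job_size):
    """Weighted Shortest Job First = Cost of Delay / Job Size."""
    cost_of_delay = value + criticality + risk_opportunity
    return cost_of_delay / job_size

# Rank the backlog candidates, highest WSJF first.
for name, *estimates in sorted(features, key=lambda f: wsjf(*f[1:]), reverse=True):
    print(f"{name:20s} WSJF = {wsjf(*estimates):.2f}")
```

In practice the relative estimates come out of the Cost of Delay workshop itself; the point of the sketch is simply that once they exist, the ranking is trivial to produce.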

Backlog

This is simply a holding queue.  We have a feature summary and a calculated WSJF.  Features are stored here in WSJF order, but held to avoid investing more work in analysis until the feature is close to play.  If applying a WIP limit to this state, it would likely be based on ART capacity and limited to 2-3 PIs' worth of capacity.

Exit Policies would typically surround confirmation that the Feature has been selected as a candidate for the next PI and any key dependencies have been validated sufficiently to support the selection.  I find most Product Management teams will make a deliberate decision at this point rather than just operating on “pull from backlog when ready”.

Next PI Candidate

Again, this state is simply a holding queue.  Movement from the “Backlog” to this state indicates that the Feature can be pulled for “Preparation” when ready.

Generally, there are no exit policies, but I like to place a spanning WIP limit over this and the following state (Preparing).  The logical WIP limit is based on capacity rather than number of features, and should roughly match the single-PI capacity of the ART.
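
To illustrate what a capacity-based spanning limit might look like in practice, the sketch below checks whether pulling another feature into "Preparing" would push the combined load of the two states past a single PI's worth of capacity.  The board snapshot, feature sizes and capacity figure are all hypothetical.

```python
# Hypothetical board snapshot: normalized feature sizes currently sitting in each state.
board = {
    "Next PI Candidate": [40, 25, 30],
    "Preparing": [60, 35],
}

SINGLE_PI_CAPACITY = 200  # illustrative figure for one PI's worth of ART capacity

def can_pull(feature_size, board, capacity=SINGLE_PI_CAPACITY):
    """Spanning WIP check across 'Next PI Candidate' and 'Preparing'."""
    committed = sum(sum(sizes) for sizes in board.values())
    return committed + feature_size <= capacity

print(can_pull(10, board))  # True:  190 + 10 fits within 200
print(can_pull(30, board))  # False: 190 + 30 exceeds 200
```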

Preparing

Here, we add sufficient detail to the Feature to enable it to be successfully planned.  The Exit Policy is equivalent to a “Feature Definition of Ready”.  Typically, this would specify the following:
  • Acceptance Criteria complete
  • Participating Dev Teams identified and briefed
  • Dependencies validated and necessary external alignment reached
  • High-level Architectural Analysis complete
  • Journey-level UX complete
  • Required Technical Spikes complete

This is the one state in the Feature Kanban that is almost guaranteed to be decomposed to something more granular when applied.  The reality is that feature preparation involves a number of activities, and the approach taken will vary significantly based on context.   A decomposition I have often found useful is as follows:

  • Product Owner onboarding (affected Product Owners are briefed on the Feature by Product Management and perform some initial research, particularly with respect to expected benefits)
  • Discovery Workshops (led by Product Owner(s) and including affected team(s), architecture, UX and relevant subject matter experts to explore the feature and establish draft acceptance criteria and high level solution options)
  • Finalization (execution of required technical spikes, validation of architectural decisions, finalization of acceptance criteria, updates to size and benefit estimates).


Planned 

The planning event itself is not represented on the Kanban, but following the conclusion of PI planning all features which were included in the plan are pulled from "Preparing" to "Planned".

This is a queue state.  Just because a feature was included in the PI plan does not mean teams are working on it from Day 1 of the PI.  We include it deliberately to provide more accuracy (particularly with respect to cycle time) to the following states.  There are generally no exit policies.
  

Executing 

A feature is pulled into this state the instant the first team pulls the first story for it into a Sprint, and the actual work done here is the build/test of the feature.

Exit policies are based on the completion of all story-level build/test activities and readiness for feature-level validation.  Appropriate WIP limit strategies for this state will emerge with time and study.  In the beginning, the level of WIP observed here provides excellent insight into the alignment strategy of the teams and how well they applied Feature WIP concepts during PI planning.

Feature Validation

A mature ART will eliminate this state (given that maturity includes effective DevOps).  However, until such time as the ART reaches maturity, the type of activities we expect to occur here are:

  • Feature-level end-to-end testing
  • Feature UAT
  • Feature-level NFR validation

Exit Policies for this state are equivalent to a “Feature Definition of Done”.  They are typically based around readiness of the feature for Release-level hardening and packaging.   The size of the queue that builds in the "Done" side of this state will provide excellent insight into the batch-size approach being taken to deployments (and "time-in-state" metrics will reveal hard data about the cost of delay of said batching).
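
As a sketch of how such "time-in-state" data might be derived, the snippet below walks a feature's transition log and reports the days spent in each state.  The log format, state names and dates are assumptions for illustration rather than a prescription for any particular tool.

```python
from datetime import date

# Hypothetical transition log for a single feature: (state entered, date entered).
transitions = [
    ("Executing",          date(2017, 1, 9)),
    ("Feature Validation", date(2017, 2, 3)),
    ("Release Validation", date(2017, 2, 20)),
    ("Deployment",         date(2017, 3, 6)),
]

def days_in_state(transitions, as_of):
    """Days spent in each state, using the next transition (or 'as_of') as the exit date."""
    exits = transitions[1:] + [(None, as_of)]
    return {state: (exited - entered).days
            for (state, entered), (_, exited) in zip(transitions, exits)}

print(days_in_state(transitions, as_of=date(2017, 3, 7)))
# {'Executing': 25, 'Feature Validation': 17, 'Release Validation': 14, 'Deployment': 1}
```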

Release Validation

Once again, a mature ART will eliminate this state.  Until this maturity is achieved we will see a range of activities occurring here around pre-deployment finalization.

Exit Policies will be the equivalent of a "Release Definition of Done", and might include:

  • Regression Testing complete
  • Release-level NFR Validation (e.g. Penetration, Stress and Volume Testing) complete
  • Enterprise-level integration testing complete
  • Deployment documentation finalized
  • Deployment approvals sought and granted, and deployment window approved


The set of features planned for release packaging will be pulled as a batch from "Feature Validation" into this state, and the set of features to be deployed (hopefully the same) will progress together to “Done” once the exit policies are fulfilled.

Deployment

Yet another "to-be-eliminated" state.  When the ART's DevOps strategy matures, this state will last seconds - but in the meantime it will often last days.  The batch of features sitting in "Release Validation" will be simultaneously pulled into this state at the commencement of Production Deployment activities, and moved together to Done at the conclusion of post-deployment verification activities.

Exit Policies will be based on enterprise deployment governance policy.  For many of my clients, they are based on the successful completion of a Business Verification Testing (BVT) activity in which a number of key business SMEs manually verify a set of mission-critical scenarios prior to signalling successful deployment.

Operational Readiness

This state covers the finalization of operational readiness activities.  An ART that has matured well along Lean lines will already have performed much of the operational readiness work prior to deployment, but we are interested in the gap between "Feature is available" and "Feature is realizing value".   Typical activities we might see here depend on whether the solution context is internal or external, but might include:

  • Preparation and introduction of Work Instructions
  • Preparation and Delivery of end-user training
  • Preparation and execution of marketing activities
  • Education of sales channel

Exit Policies should be based around “first use in anger” by a production user in a real (non-simulated) context.

Impact Validation

A (hopefully not too) long time ago, our feature had some proposed benefits.  It's time to see whether the hypothesis was correct (the Measure and Learn cycles in Lean Startup).  I typically recommend this state be time-boxed to 3 months.  During this time, we are monitoring the operational metrics which will inform the correlation between expected and actual benefits.
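
A minimal sketch of what that monitoring might boil down to, assuming the feature summary recorded a measurable benefit hypothesis; the metric, baseline and figures below are invented for illustration.

```python
# Hypothetical benefit hypothesis (from the feature summary) vs. observed result
# at the end of the 3-month validation window.
hypothesis = {
    "feature":  "Self-service password reset",
    "metric":   "password-related service desk calls per month",
    "baseline": 1200,
    "expected": 400,
}
observed = 650  # pulled from operational reporting at the end of the time-box

expected_reduction = hypothesis["baseline"] - hypothesis["expected"]  # 800
actual_reduction   = hypothesis["baseline"] - observed                # 550
realization        = actual_reduction / expected_reduction           # ~0.69

print(f"{hypothesis['feature']}: {realization:.0%} of hypothesised benefit realized")
```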

Whilst learning should be harvested and applied regularly throughout this phase, it should conclude with some form of postmortem, with participants at minimum including Product Management and Product Owners but preferably also relevant subject matter experts, the RTE and representative team members.  Insights should be documented, and fed back into the Program Roadmap.

Exit Policies would be based upon the completion of the “learning validation workshop” and the incorporation of the generated insights into the Program Roadmap.

Done

Everybody needs a "brag board"!


Conclusion

Once established, this Kanban will provide a great deal of value.  Among other things, it can support:

  • Visualization and maintenance of the Program Roadmap
  • Management of the flow of features into the PI
  • Visualization of the current state of the PI to support Scrum of Scrums, PO Sync, executive gemba walks, and other execution steering activities.
  • Visualization and management (or at least monitoring) of the deployment, operational rollout and outcome validation phases of the feature life-cycle.
  • Collection of cumulative flow data and an abundance of Lead and Cycle time metrics at the Feature level (a minimal sketch follows).
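
On that last point, a cumulative flow snapshot is nothing more than a daily count of features per state, which the wall (or the tooling behind it) already provides.  The feature IDs and states below are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical: the state each feature occupies on a given snapshot day.
# A real CFD simply records one such row per day and plots the stacked counts.
features = {
    "F-101": "Executing",
    "F-102": "Executing",
    "F-103": "Feature Validation",
    "F-104": "Planned",
    "F-105": "Deployment",
}

STATES = ["Planned", "Executing", "Feature Validation",
          "Release Validation", "Deployment", "Done"]

def cfd_row(features, snapshot):
    """One row of cumulative flow data: count of features per state on the snapshot day."""
    counts = Counter(features.values())
    return {"date": snapshot, **{state: counts.get(state, 0) for state in STATES}}

print(cfd_row(features, snapshot=date(2017, 2, 26)))
```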


Saturday, February 18, 2017

Revamping SAFe's Program Level PI Metrics - Conclusion

“Base controls on relative indicators and trends, not on variances against plan” – Bjarte Bogsnes, Implementing Beyond Budgeting

Introduction

The series began with an overview of a metric model defined to address the following question:
"Is the ART sustainably improving in its ability to generate value through the creation of a passionate, results-oriented culture relentlessly improving both its engineering and product management capabilities?"
The ensuing posts delved into the definitions and rationale for the Business Impact, Culture, Quality and Speed quadrants.  In this final article, I will address dashboard representation, implementation and application.

Dashboard Representation

The model is designed such that the selected set of metrics will be relatively stable unless the core mission of the ART changes.  The only expected change would result from either refinement of the fitness function or incorporation of the advanced measures as the ART becomes capable of measuring them.


Given that our focus is on trend analysis rather than absolutes, my recommendation is that for each measure the dashboard reflects the value for the PI just completed, the previous PI and the average of the last 3 PIs.   Given the assumption that most will initially implement the dashboard in Excel (sample available here), I would further suggest the use of conditional formatting to color-code movement (dark green for strongly positive through dark red for strongly negative). 
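
A minimal sketch of that representation, assuming each measure's PI-by-PI history is available as a simple series; the metric names and values below are invented.

```python
# Hypothetical PI-by-PI history for a few dashboard measures (oldest first).
history = {
    "Feature Lead Time (days)": [140, 125, 118, 103],
    "Team NPS":                 [10, 15, 12, 25],
    "Prod Deploys per PI":      [1, 1, 2, 3],
}

def trend_row(values, lower_is_better=False):
    """Latest PI, previous PI, trailing 3-PI average and % movement (positive = improving)."""
    current, previous = values[-1], values[-2]
    rolling = sum(values[-3:]) / 3
    movement = (current - previous) / abs(previous) * 100
    if lower_is_better:
        movement = -movement
    return current, previous, round(rolling, 1), round(movement, 1)

print(trend_row(history["Feature Lead Time (days)"], lower_is_better=True))
# (103, 118, 115.3, 12.7) -> a 12.7% improvement
```

The percentage movement in the final column is what the conditional formatting would shade from dark green through dark red.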

Implementation

In The Art of Business Value, Mark Schwartz proposes the idea of “BI-Driven Development (BIDD?)”.  His rationale?  “In the same sense that we do Test-Driven-Development, we can set up dashboards in the BI or reporting system that will measure business results even before we start writing our code”.

I have long believed that if we are serious about steering product strategy through feedback, every ART should either have embedded analytics capability or a strong reach into the organisation’s analytics capability.  While the applicability extends far beyond the strategic dashboard (i.e. per Feature), I would suggest that the more rapidly one can move from a manually collated and completed spreadsheet to an automated analytics solution, the more effective the implementation will be.

Virtually every metric on the dashboard can be automatically captured, whether it be from the existing enterprise data-warehouse for Business Metrics, the Feature Kanban in the agile lifecycle management tool, Sonarqube, the logs of the Continuous Integration and Version Control tools or the Defect Management System.  Speed and Quality will require deliberate effort to configure tooling such that the metrics can be captured, and hints as to approach were provided in the rationales of the relevant deep-dive articles.  NPS metrics will require survey execution, but are relatively trivial to capture using such tools as Survey Monkey.

Timing

I cannot recommend base-lining your metrics prior to ART launch strongly enough.  If you do not know where you are beginning, how will you understand the effectiveness of your early days?  Additionally, the insights derived from the period from launch to end of first PI can be applied in improving the effectiveness of subsequent ART launches across the enterprise.

With sufficient automation, the majority of the dashboard can be in a live state throughout the PI, but during the period of manual collation the results should be captured in the days leading up to the Inspect & Adapt workshop.

Application

The correct mindset is essential to effective use of the dashboard.  It should be useful for multiple purposes:
  • Enabling the Portfolio to steer the ART and the accompanying investment strategy
  • Enabling enterprise-level trend analysis and correlation across multiple ARTs
  • Improving the effectiveness of the ART’s Inspect and Adapt cycle
  • Informing the strategy and focus areas for the Lean Agile Centre of Excellence (LACE)
Regardless of application specifics, our focus is on trends and global optimization.  Are the efforts of the ART yielding the desired harvest, and are we ensuring that our endeavors to accelerate positive movement in a particular area are not causing sub-optimizations elsewhere in the system?

It is vital to consider the dashboard as a source not of answers, but of questions.   People are often puzzled by the Taiichi Ohno quote “Data is of course important … but I place the greatest emphasis on facts”.   Clarity lies in appreciating his emphasis on not relying on reports, but rather going to the “gemba”.  For me, the success of the model implementation lies in the number and quality of questions it poses.  The only decisions made in response to the dashboard should be what areas of opportunity to explore – and of course every good question begins with why.   For example:
  • Why is our feature execution time going down but our feature lead time unaffected?
  • Why has our deployment cycle time not reduced in response to our DevOps investment?
  • Why is Business Owner NPS going up while Team NPS is going down?
  • Why is our Program Predictability high but our Fitness Function yield low?
  • Why is our Feature Lead Time decreasing but our number of production incidents rising?

Conclusion

It’s been quite a journey working through this model, and I’m grateful for all the positive feedback I have received along the way.   The process has inspired me to write a number of supplementary articles.  

The first of these is a detailed coverage of the Feature Kanban (also known as the Program Kanban).  Numerous people have asked me about the most effective way of collecting the Speed Metrics, and this becomes trivial with the development of an effective Feature Kanban (to say nothing of the other benefits).

I’ve also wound up doing a lot of digging into “Objectives and Key Results” (OKRs).  Somehow the growing traction of this concept had passed me by, and when my attention was first drawn to it I panicked at the thought that it might invalidate my model before I had even finished publishing it.  However, my research concluded that the concepts were complementary rather than conflicting.  You can expect an article exploring this to follow closely on the heels of my Feature Kanban coverage.

There is no better way to close this series than with a thought from Deming reminding us of the vital importance of mindset when utilising any form of metric.
“People with targets and jobs dependent upon meeting them will probably meet the targets – even if they have to destroy the enterprise to do it.” – W. Edwards Deming
 

Sunday, February 5, 2017

Revamping SAFe's Program Level PI Metrics Part 5/6 - Speed

“Changing the system starts with changing your vantage point so you can ‘see’ the system differently.  Development speed is often attributed to quick decisions.  Early definition of the requirements and freezing specification quickly are often highlighted as keys to shortening the product development cycle.  Yet the key steps required to bring a new product to market remain the creation and application of knowledge, regardless of how quickly the requirements are set.  The challenge in creating an effective and efficient development system lies in shortening the entire process.” – Dantar Oosterwal, The Lean Machine.

Series Context

Part 1 – Introduction and Overview
Part 2 – Business Impact Metrics
Part 3 – Culture Metrics
Part 4 – Quality Metrics
Part 5 – Speed Metrics (You are here)
Part 6 – Conclusion and Implementation


Introduction

As mentioned in my last post, the categorization of metrics went through some significant reshaping in the review process.  The “Speed” (or “Flow”) quadrant didn’t exist, with its all-important metrics divided between “Business Impact” and “Deployment Health”.  

Lead Time is arguably the most important metric in Lean, as evidenced by Taiichi Ohno’s famous statement that “All we are doing is looking at the customer time line, from the moment the customer gives us the order to the point when we collect the cash”.  Not only does it measure our (hopefully) increasing ability to respond rapidly to opportunity, but it is also a critical ingredient in enabling a focus on global rather than local optimization.

In this quadrant, the core focus is to take two perspectives on Lead Time.  The first (Feature Lead Time) relates to the delivery of feature outcomes, and the second (MTTR from Incident) to our ability to rapidly recover from production incidents.

The other proposed metrics highlight the cycle time of key phases in the idea-to-value life-cycle as an aid to understanding “where we are slow, and where we are making progress”.  In particular, they will highlight failure to gain traction in XP and DevOps practices.

There is, however, a caveat.  Many (if not most) Agile Release Trains do not begin life in control of the entire idea-to-value life-cycle.  On the one hand, it’s very common for features to be handed off to an enterprise release management organisation for production release.  On the other, whilst Lean Principles are at the heart of SAFe, the framework centers on hardware/software development.  The (traditionally business) skill-sets in areas such as operational readiness, marketing and sales required to move from “Deployed product” to “Value generating product” are nowhere on the big picture. 

ARTs focused on bringing to life the SAFe principles will address these gaps as they inspect and adapt, but in the meantime  there is a temptation to “not measure what we are not in control of”.  As a coach, I argue that ARTs should “never let go until you’ve validated the outcome”.  You may not be in control, but you should be involved – if for nothing else than in pursuit of global optimization.  

Basic Definitions


Basic Metrics Rationale

Average Feature Lead Time (days)

This is the flagship metric.   However, the trick is determining "when the timer starts ticking".   For an ART maintaining the recommended 3-PI roadmap, feature lead time would rarely be shorter than a depressing 9 months.  
To measure it, one needs two things: a solid Feature Kanban, and agreement on which stage triggers the timer.  A good feature kanban will of necessity be more granular than the sample illustrated in the framework (fuel for a future post), but the trigger point I most commonly look for is "selection for next PI".  In classic kanban parlance, this is the moment when a ticket moves from "backlog" to "To Do", and in most ARTs it triggers the deeper preparation activities necessary to prepare a feature for PI planning.  The end-point for the measure is the moment at which the feature starts realizing value; this is dependent on solution context, often triggered by deployment for digital solutions but after business change management activities for internal solutions.
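
A minimal sketch of the measurement itself, assuming the kanban tooling can export the date each feature was selected for the next PI and the date it first realized value; the feature IDs and dates are invented.

```python
from datetime import date
from statistics import mean

# Hypothetical export: per feature, the date it was selected for the next PI
# ("timer starts") and the date it first realized value ("timer stops").
features = [
    {"id": "F-101", "selected_for_pi": date(2016, 9, 26), "value_realized": date(2017, 1, 20)},
    {"id": "F-102", "selected_for_pi": date(2016, 9, 26), "value_realized": date(2017, 2, 10)},
]

lead_times = [(f["value_realized"] - f["selected_for_pi"]).days for f in features]
print(f"Average Feature Lead Time: {mean(lead_times):.0f} days")
```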

Average Deployment Cycle Time (days)

This metric was inspired by the recently released DevOps Handbook by Gene Kim and friends.  In essence, we want to measure “time spent in the tail”.  I have known ART after ART that accelerated their development cycle whilst never making inroads on their path to production.  If everything you build has to be injected into a 3-month enterprise release cycle, it’s almost pointless accelerating your ability to build!  
Whilst our goal is to measure this in minutes, I have selected days as the initial measure because, for most large enterprises, the starting point will be weeks if not months.

Average Mean Time to Restore (MTTR) from Incident (mins)

When a high severity incident occurs in production, how long does it take us to recover?  In severe cases, these incidents can cause losses of millions of dollars per hour.  Gaining trust in our ability to safely deploy regularly can only occur with demonstrated ability to recover fast from issues.  Further, since these incidents are typically easy to quantify in bottom-line impact, we gain the ability to start to measure the ROI of investment in DevOps enablers.

Prod Deploys Per PI (#)

Probably the simplest measure of all listed on the dashboard - how frequently are we deploying and realizing value?

Advanced Definitions


Advanced Metrics Rationale

Average Feature Execution Cycle Time (days)

This is one of the sub-phases of the lead time which are worth measuring in isolation, and is once again dependent on the presence of an appropriately granular feature kanban.  
The commencement trigger is "first story played", and the finalization trigger is "feature ready for deployment packaging" (satisfies Feature Definition of Done).  The resultant measure will be an excellent indicator of train behaviors when it comes to Feature WIP during the PI.  Are they working on all features simultaneously throughout the PI or effectively collaborating across teams to shorten the execution cycles at the feature level?

One (obvious) use of the metric is determination of PI length.  Long PIs place an obvious overhead on Feature Lead Time, but if average Feature Execution Cycle Time is 10 weeks it’s pointless considering an 8-week PI.  

Average Deploy to Value Cycle Time (days)

This sub-phase of feature lead time measures "how long a deployed feature sits on the shelf before realizing value".  
The commencement trigger is "feature deployed", and the finalization trigger is "feature used in anger".  It will signal the extent to which true system level optimization is being achieved, as opposed to local optimization for software build.  In a digital solution context it is often irrelevant (unless features are being shipped toggled-off in anticipation of marketing activities), but for internal solution contexts it can be invaluable in signalling missed opportunities when it comes to organizational change management and business readiness activities.

Average Deployment Outage (mins)

How long an outage will our users and customers experience in relation to a production deployment?  Lengthy outages will severely limit our aspirations to deliver value frequently.  

Conclusion

We’ve now covered all 4 quadrants and their accompanying metrics.  The next post will conclude the series with a look at dashboard representation, implementation and utilisation. 

“High performers [in DevOps practices] were twice as likely to exceed profitability, market share, and productivity goals.  And, for those organizations that provided a stock ticker symbol, we found that high performers had 50% higher market capitalization growth over three years.” – Gene Kim, Jez Humble, Patrick Debois, John Willis, The DevOps Handbook