Monday, March 4, 2019

Effective Change Control "by Release" in SAFe

“Control by Release … Control by turning loose within well-understood parameters.  Control by trusting the process.  This doesn’t mean anything goes” – Artful Making, Rob Austin and Lee Devin

Under a “traditional” delivery model, organizations employ a combination of gating and detailed scope, cost and timeline commitments to facilitate governance, change control and executive oversight of strategic product spend. 

With SAFe, the move to Lean Budgeting and outcome-based metrics enables the elimination of much of the overhead and inefficiency inherent in the traditional approach.  SAFe 4.6 crystallized this with the introduction of the four Portfolio Guardrails:

  1. Guiding investments by horizon
  2. Optimizing value and solution integrity with capacity allocation
  3. Approving significant initiatives
  4. Continuous Business Owner engagement

The opening quote is from one of the books that profoundly influenced me early in my Agile career – Artful Making.  The notion of “Control by Release” has been a core component of my coaching toolkit for many years, particularly in the context of decentralization.  In this post, I will explore the “Control by Release” aspect of each guardrail as we determine “what to do” then move beyond into the tools that help us during PI execution while we “are doing”.

Guiding Investments by Horizon

This guardrail controls the percentage of investment capacity that can be consumed by a particular horizon of investment idea.  “Release” is achieved by not constraining the ideas under consideration, but "control" ensures innovative horizon 3 and “coming attraction” horizon 2 ideas are not drowned out by the all-consuming “demand of today” in horizon 1.
 
Horizon allocation authority often exceeds the remit of the Portfolio itself, requiring validation at the enterprise level – particularly given the high-risk nature of horizon 3 investments.  However, once established, it releases the Portfolio team to pursue more futuristic investments within the agreed constraint.


Optimizing value and solution integrity with capacity allocation

Another percentage-based control tool, this is typically applied more at the Solution and Program levels.  A classic scenario is B2B digital products – often stuck in the following dilemma:

  • “Without the B2B providers we can’t attract B2C consumers”
  • “Without B2C consumers we can’t sign up the B2B providers”

This is a level well above “what feature should we build” – it’s a strategic tool.  “For the next 2 PIs, we want to invest heavily in the B2B space with 70% of our capacity”.  We thus achieve "control" by specifying the percentage of ART capacity to be consumed by B2B features and "release" by the fact we have not actually talked about what B2B features should be considered.

This decision is often not within the remit of Product Management.  It will need validation by their Business Owners, Portfolio Management or some combination of the two – but once set they are released to find the best features to use that capacity for.


Approving Significant Initiatives

The Product Manager for an ART has significant remit when it comes to feature selection.  In mature SAFe most features built are independent features identified in service of product strategy rather than derived from Epics.  The Product Manager is empowered to identify Features and facilitate their prioritization (through WSJF), but accountable for the economic outcomes achieved by the ART justifying the spend.  There is a level of safety and control in this as we know that the Feature should have a quantifiable, rapidly measurable benefit and has to be small – small enough to fit in a single PI for a single ART. 

However, what happens when the Product Manager encounters a big idea – something too big to be a Feature: an Epic?  This attracts more due diligence, approval and monitoring at the Portfolio level.  “Before you start down this path, we need to validate it.”

The Product Manager is “released” to pursue a range of small valuable investments, but works within the “control” that if the investment passes a certain threshold higher authorities must become involved. 


Continuous Business Owner Engagement

Late last year I dedicated a whole post to this.  How do we “release” the Product Manager to do the job they’ve been appointed to do?  By providing the “control” of engagement with executive business owners for involvement in moments where key constraints are established.

There are 4 clearly defined “control” events for this in SAFe:

  • Business Owners collaborate to define the Cost of Delay for candidate Features and thus “control” the priorities whilst having “released” the Product Manager to find features to present for prioritization.
  • Business Owners collaborate to “control” tradeoff decisions during Management Review at PI planning whilst having “released” the teams to identify the tradeoffs required, then accept the committed objectives to establish a “control” around commitment expectations for the PI.
  • The System Demo (attended by the Business Owners) demonstrates progress in the previous iteration, and should illustrate alignment to the agreement reached at PI planning.
  • The PI Demo, where achieved outcomes are assessed against committed objectives by the Business Owners.

Beyond Portfolio Guardrails

Each of the guardrails provides controls on what we choose to do, at varying levels of seniority, regularity and granularity.  However, what happens once we are “doing”?  Or, in more traditional language, “how do we apply change control during execution?”
 
The “hippy agilist” inside me gets nervous about tackling the second half of this post.  Surely I’m not going to start talking about the dreaded “change request”.    What about Agile principle 2 – “Welcome changing requirements, even late in development”?  Unfortunately, I have probably seen this principle abused more than any other (albeit, the sustainable pace of principle 8 is definitely the most commonly ignored – typically in the context of abuse of principle 2).  All too often, it is interpreted as “I thought this was agile and I could change my mind anytime I liked” without accepting that the change might impact cost, timeframe or other commitments.

So, how do we enable the “release” to embrace changing requirements whilst maintaining the “control” to ensure this occurs in a responsible fashion?

Establishing Guardrails for ART execution

We have the hints for these controls from the Portfolio.  A decision-making framework must address the following:

  • At what point does a Team need to escalate a decision or notification to the Program level?
  • At what point does an ART need to escalate a decision or notification to their Business Owners?

The answer in both cases lies in the controls we have established:

  • Committed PI Objectives
  • Capacity Allocation

Both have been validated and accepted by the Business Owners.  The ART has been “released” to make any change it likes within these controls.  For example, it has pre-agreed with the Business Owners that if the PI runs into challenges the stretch objectives can be sacrificed without consultation.  Beyond that, some example “control triggers” are worth examining:

Capacity Allocation for BAU

The application of capacity allocation to BAU/Unplanned work was explored in detail in my recent post on BAU and Unplanned work, but to illustrate a change control example:

  • A team was given a capacity allocation of 10% for BAU/Unplanned work
  • The Product Owner is now “released” to prioritize any stories they feel are important (be it minor enhancements, tech debt remediation or minor production defect fixes) so long as they fall within 10% of the team’s capacity each iteration.
  • In the event that BAU and Unplanned work threatens to exceed the capacity constraint (eg a string of urgent minor enhancement requests from a noisy stakeholder), we trigger the “control”.  It’s no longer a team decision, and must be escalated to Program Level.
  • At the Program Level, there may be a “System-level” solution to it that can still be constrained to the ART.  For instance, redirecting some of the work to another team or sacrificing a stretch objective due to the importance of the unplanned work.
  • If there is no ART-level solution that doesn’t compromise committed objectives and the Program Team cannot politically refuse the BAU work, it must be escalated to the Business Owners to make the decision: “Are we willing to sacrifice a strategic committed objective for this unplanned work or will we exercise our executive authority to defer the unplanned work?”
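
The escalation path above can be sketched as a small decision function.  This is only an illustration under my own assumptions: the names (`DecisionLevel`, `escalation_level`), the use of story points as the unit of capacity and the boolean flag for “an ART-level solution exists” are all hypothetical, not part of SAFe.

```python
from enum import Enum


class DecisionLevel(Enum):
    TEAM = "Team (Product Owner decides)"
    PROGRAM = "Program Level"
    BUSINESS_OWNERS = "Business Owners"


def escalation_level(unplanned_points: float,
                     team_capacity_points: float,
                     allocation_pct: float,
                     art_level_fix_available: bool) -> DecisionLevel:
    """Return who owns the decision on this iteration's BAU/unplanned load.

    Assumption: capacity and unplanned work are both measured in story points.
    """
    allowance = team_capacity_points * allocation_pct / 100
    if unplanned_points <= allowance:
        # Within the agreed allocation: the Product Owner is "released".
        return DecisionLevel.TEAM
    if art_level_fix_available:
        # e.g. redirect work to another team or sacrifice a stretch objective.
        return DecisionLevel.PROGRAM
    # A committed objective is at risk: executive decision required.
    return DecisionLevel.BUSINESS_OWNERS


# A team with 40 points of capacity and a 10% BAU allocation (4 points).
within = escalation_level(3, 40, 10, art_level_fix_available=True)
solved_on_art = escalation_level(6, 40, 10, art_level_fix_available=True)
needs_executives = escalation_level(6, 40, 10, art_level_fix_available=False)
```

The point of encoding it at all is that the trigger is agreed in advance, so nobody has to argue about whether a given request warrants escalation.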

Significant change in Committed Objective

We know that teams don’t commit to delivering either their plans or their stories at PI Planning – they commit to achieving their committed objectives.  The stories and plans constitute a current belief as to the best way to achieve the objectives.  There’s more to this story than meets the eye though:

  • Whilst objectives are meant to be “SMART”, they rarely are.  This leaves interpretation very much open, and it’s easy for a Product Owner to wind up in a world of pain with subject matter experts, stakeholders and product managers arguing that there is implicit scope in the objective that the team never planned for.
  • The objectives are “in a system context”.  Business Owners accept “the overall objectives for the ART”, and make tradeoff decisions to optimize the whole rather than the parts.  Implicit in the process of arriving at this is that the team’s plan reflects the rough percentage of the team or ART capacity that will be consumed achieving the objective (e.g. in the plan, there were 78 points of estimated stories associated with it).  We know estimates are guesses, but if we are suddenly dealing with 150 points of stories there is no doubt we are in a world of pain.  This could occur due to ugly surprises, missed stories or interpretation arguments.  Whether it’s caused by dysfunction between the Product Owner and Team, Product Owner and Product Manager, or Product Owner and Stakeholder, the inflated objective may compromise a number of others.

A common technique is to establish a “size threshold” type control.  For example, in the event that the total estimated size for the Feature varies by more than 20% from that established at PI planning, it should be escalated to Program Level for review.  Often, it can be solved at the Program Level through effective tradeoff decisions (by resetting expectations, swinging other teams to the rescue or sacrificing stretch objectives).  However, if it cannot be resolved we’re now in a position where following through on one committed objective compromises one or more others – and that’s a Business Owner decision.  Once again, it should be escalated.  They may lend their executive support to downscoping the work, or elect to sacrifice another objective to support it continuing in its inflated state.
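
The size threshold is simple arithmetic, sketched below with hypothetical function names of my own.  The 20% default comes from the example above; your ART would agree its own figure.

```python
def size_variance_pct(planned_points: float, current_points: float) -> float:
    """Percentage by which the current estimate differs from the PI planning estimate."""
    return abs(current_points - planned_points) / planned_points * 100


def needs_program_review(planned_points: float,
                         current_points: float,
                         threshold_pct: float = 20.0) -> bool:
    """True when the variance exceeds the agreed size threshold."""
    return size_variance_pct(planned_points, current_points) > threshold_pct


# The scenario from the text: 78 points planned, 150 points now on the table.
blown_out = needs_program_review(78, 150)   # variance is roughly 92%
# A modest drift from 78 to 90 points (about 15%) stays with the team.
modest = needs_program_review(78, 90)
```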

An objective cannot be met

This seems to happen most often due to an issue with an external dependency (e.g. delayed infrastructure, rejection of design by ARB).  As soon as the ART is aware that an objective cannot be met, this must be escalated to the Business Owners.  Notwithstanding the importance of transparency, it’s quite possible that they can swing their executive weight behind moving the blockage.

Conclusion

If I had read this post 10 years ago, my response would have been “that’s far too structured to be agile”.   My views have become a little less simplistic over the years as I’ve worked with large organizations and government agencies who need evidence of documented controls in place and are looking for “simpler, less wasteful, but still responsible”.

But, to be honest, what motivates me most to address it is helping “teams in pain”.  I’ve worked with a number of ARTs in recent years who have struggled massively with achieving 50% “Release Predictability” let alone 80% even with obscene levels of overtime.   There are numerous contributing issues, but uncontrolled scope creep is a recurring culprit.  Teams trying to collaborate with their Product Owners accept too much change in the form of the Product Owner’s bright new ideas.  Product Owners struggle to push back when their Product Manager keeps reinterpreting the goalposts of their Features thanks to fuzzy PI objectives and absent/incomplete Feature Acceptance Criteria.   Product Owners struggle to say no when senior stakeholders come up with great ideas.  It’s not fun, and teams enjoy neither the overtime pressure nor the feeling of failure when they miss their objectives time after time.
 
The truth is, every change involves a decision and every decision is a tradeoff decision.   A team will make a trade-off that leads to team-level optimization, a Program Team will make a trade-off that leads to ART-level optimization, and Business Owners will optimize still more broadly.  Good backlog discipline and effective leveraging of the controls built into SAFe enable the right trade-offs to be made at the right level at the right time, and satisfy the auditors that effective controls are in place!

Monday, February 25, 2019

Practical Finance and Cost Allocation in SAFe



SAFe provides some wonderful yet daunting guidance when it comes to funding and the application of Lean Budgets.   As of SAFe 4.6, we have 3 key tenets of Lean Budgeting:

  1. Fund Value Streams, not projects
  2. Guide Investments by horizon
  3. Participatory Budgeting

My purpose in this article is not to restate the SAFe recommendations.  They’re well documented in the Lean Budget article.  As always, however, I’d rather talk practice than theory – in this case in the area of “Fund Value Streams, not projects”.

In summary, the theory is that we provide guaranteed funding to establish a standing capacity (in the form of an Agile Release Train), then use a combination of “guardrails”, Epic level governance and Product Management accountability to ensure that this funded capacity is well used (building valuable features) and tune it over time based on results achieved.

When I take a room of leadership through this approach it’s usually pretty confronting, and often provokes strong reactions.  Then I share a story from an early SAFe implementation.  It was with an organization that didn’t have Lean or Agile friendly funding; we just had a stable ART with stable teams that were fed by projects.  And we built a record for “100% on time/on budget”.  We did it in a very simple way.  When something was delivered under budget, we still charged the quoted price, and as such we were able to build up a buffer fund.  When something ran over budget, we drew down on the buffer fund.  It was closely managed, and it all came out in the wash!  By the conclusion of the story, that same room that was reacting strongly a few minutes earlier is suddenly grinning (in some cases rather nervously).  I’m pretty sure every organization I’ve worked with in the past 10 years has survived historically using some variation of this technique.

Some SAFe implementations are lucky enough to start with capacity funded ARTs fully following the framework guidance, but in most cases they’re still realistically project funded.  It’s either a really big amorphous project being used as a masking umbrella to the standing funding model, or quite literally the single ART is being funded by numerous projects and needing to charge back.

And to take it a bit further, even if we are capacity funded we usually need more granular data on how the money is spent.  At minimum, we generally have some work which is Opex funded and some Capex funded.  We also want some data on how much our features are costing.   Hopefully we’re not falling into the trap of funding feature by feature, but when your ART is costing $3-5M per PI the organization will require us to be able to break down where those millions are going.

About now, the timesheet police come out!  Somebody somewhere figures out a set of WBS codes the ART should be charging against (I recently heard an example where a theoretically capacity funded ART had up to 15 WBS codes per Feature), and everyone on the ART spends some time every week trying to break up the 40 hours they worked across a bewildering array of WBS codes that will then be massaged, hounded and reconciled by a small army in a PMO.

There’s a much simpler, less wasteful way.  We’ve used it time and again, blessed by Finance department after Finance department – once they understand the SAFe framework.  I’m going to start at the team level, then build it up to the ART.

Team Level Actuals

We usually work this on a “per Sprint” basis, and need to know two things:

  • What did the team cost?
  • What did they work on?

What did the team cost?

Figuring out what the team cost should be straightforward.  Given that we know if you’re part of an Agile team you’re 100% dedicated, all we need is your daily rate and the days you worked.  How this daily rate is calculated for permanent employees versus contract/outsource staff varies too much to detail here, but for each of your agile teams you should know your “burn rate per sprint”.  You then need a way of dealing with leave and (hopefully not) overtime.  We’re not trying to totally kill timesheeting here, but we can be much more simplistic about it.  Each team member should only have one WBS code to allocate their time against – the one that says “I worked a day as a member of this team”.  Thus, using your knowledge of burn rates and actual days worked you have total cost for the team for the sprint.
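
As a minimal sketch of that calculation: the sprint cost is just daily rate multiplied by days worked, summed over the team.  The names, rates and days below are made up purely for illustration, and the `TeamMember` structure is my own assumption about how you might hold the timesheet extract.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TeamMember:
    name: str
    daily_rate: float    # however Finance calculates it for this person
    days_worked: float   # days booked to the single "member of this team" WBS code


def team_sprint_cost(members: Sequence[TeamMember]) -> float:
    """Total cost of the team for one sprint."""
    return sum(m.daily_rate * m.days_worked for m in members)


# Hypothetical three-person team over a 10-day sprint; one person took a day of leave.
team = [
    TeamMember("Dev 1", daily_rate=1000, days_worked=10),
    TeamMember("Dev 2", daily_rate=800, days_worked=9),
    TeamMember("Tester", daily_rate=1200, days_worked=10),
]
sprint_cost = team_sprint_cost(team)  # 10,000 + 7,200 + 12,000
```

Leave and overtime fall out naturally because they change the days worked rather than requiring extra WBS codes.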

What did they work on?

If you’re part of an Agile team, the only thing you could possibly work on is your backlog!  So, we know that the entire team cost should be allocated against their backlog.  From a funding perspective, aggregating this is based on effective categorization of backlog items: by parent Feature, item type or both.  Consider the following example:


[Figure: The Sprint in Aggregate (Velocity: 28)]


Based on a team burn rate of $100K/sprint, we then have:

  • Feature A: $36,000
  • Feature B: $46,000
  • Production Defects: $11,000
  • BAU: $7,000
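
The arithmetic behind those figures is a pro-rata split of the sprint cost by accepted story points.  In the sketch below the per-category point totals (10, 13, 3 and 2) are my assumption: they are one set of values that adds up to the velocity of 28 and reproduces the dollar amounts above when rounded to the nearest $1,000.  The function name is hypothetical.

```python
from typing import Dict


def allocate_sprint_cost(points_by_category: Dict[str, float],
                         sprint_cost: float) -> Dict[str, float]:
    """Split the sprint cost across categories in proportion to story points."""
    total_points = sum(points_by_category.values())
    return {
        category: sprint_cost * points / total_points
        for category, points in points_by_category.items()
    }


# Assumed point totals per category (they sum to the velocity of 28).
points = {
    "Feature A": 10,
    "Feature B": 13,
    "Production Defects": 3,
    "BAU": 2,
}
costs = allocate_sprint_cost(points, sprint_cost=100_000)

# Round to the nearest $1,000 for reporting, as in the list above.
rounded = {category: round(cost, -3) for category, cost in costs.items()}
```

Rounding happens only at the reporting step; the journals themselves would normally use the unrounded amounts so that they still sum to the sprint cost.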

If your world can’t cope with ragged hierarchies, you can create “Fake Features” for the purpose of aggregation.  I quite commonly see features like:

  • Feature C: “PI 3.1 Production Defect Fixes”
  • Feature D: “PI 3.1 BAU Work”

The richness and detail of your aggregation basically depends on your “Story Type” classifications.  For example, typically work done on exploration/discovery can't be capitalized, which would lead you to introduce a story type of “Exploration”.

Applying the results

At this point, we have all the costs against the temporary WBS associated with their team.  All that remains is to journal the costs across from the temporary WBS to the real WBS based on your aggregates.

If you’re clever, you’ll automate the entire process.  Extract the timesheet data, extract the backlog data, calculate the journals and submit!

Dealing with the naysayers

At this point, some people get worried.  The size of the stories was estimated, not actual.  What happens if Story 1 was estimated as 3 points and was actually 5?  What about stories that didn’t get accepted?  Can Fibonacci sizing really be accurate enough?  Our old timesheets recorded their time spent against different activities in minutes!

It’s time for a reality check.  I’m going to illustrate with a story.  I was recently facilitating some improvement OKR setting with the folks responsible for timesheets in a large division (still mostly waterfall).  One of the objectives they wanted to set involved reducing the number of timesheet resets.  I asked what a timesheet reset was and why it was important to reduce them.  Turned out it was when a timesheet had been submitted and was some way through the approval process when they realized there was an error and it needed to be reset to be fixed and resubmitted.  Obviously, this was a pain.  I asked them how often it happened and why?  The response: “Usually when there’s a public holiday we get a lot of them.  Everyone enters 8, 8, 8, 8, 8 (hours worked) and the approver approves on autopilot then half way through their approval run remembers the public holiday!”

Everyone (and most especially finance) knows timesheet data is approximate at best.  The person filling it out knows they worked 8 hours, guesses their way through how much they spent on what, and finds whatever hours are left unaccounted for and chooses something to attach them to!  And the person approving it does little review, knowing they have no meaningful way to validate the accuracy.

While the backlog aggregation will always have a level of inaccuracy, few will argue that it is any less accurate than the practices employed by their timesheet submitters (unless you’re a law firm steeped in the 6-minute charging increment).  And at this level, your CFO should realize that any variation is not material to the overall investment and is accepted under GAAP (Generally Accepted Accounting Principles).

Moving from Team to Train

On the surface, life gets a little more complex moving from team to train.  You have lots of people in supporting roles not necessarily working on specific features.  Again, however, it’s not all doom and gloom when you look at prevailing practices.

In many an IT organization, pre-Agile daily rates attract a “tax” used to cover the costs of those less directly attributable such as management, governance, QA teams and the like.  Every project they’ve ever estimated has attracted a percentage surcharge (or a series thereof).

In the same vein, we can calculate a “burdened run rate” for the teams.  We do this by taking the burn rate for every member of the ART not associated with a delivery team, summing them, then distributing them across the team burn rates.  In theory, they exist because the teams couldn’t be delivering value without them – so they must be contributing in some fashion.  Consider the following example:


Support costs can either be distributed proportionally to burn rate or on a flat per team rate (usually based on discussion with Finance).  Assuming a flat per team rate, we can restate the example above as:



This becomes more nuanced in the case of a support team who does directly cost attributable work.  The classic example is the System Team.  They should be spending a certain percentage of their capacity in general support and the rest building enabler features (hopefully DevOps enabler features).  In this case, we can use the team-level backlog aggregation approach illustrated above provided we can see their support work clearly categorized so we know which percentage of their cost to distribute to burdened burn rates and which percentage to attribute directly to features.
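
To make the mechanics concrete, here is a sketch of both distribution options plus the System Team split.  Every number and name in it is hypothetical and chosen only to keep the arithmetic easy to follow; it is not the example from the original tables.

```python
from typing import Dict, Tuple


def split_support_team_cost(team_cost: float, support_pct: float) -> Tuple[float, float]:
    """Split a support team's cost into (general support, directly attributable) parts."""
    to_burden = team_cost * support_pct / 100
    return to_burden, team_cost - to_burden


def burdened_burn_rates(team_burn: Dict[str, float],
                        support_cost: float,
                        proportional: bool = False) -> Dict[str, float]:
    """Add support costs to team burn rates, flat per team or proportional to burn rate."""
    if proportional:
        total_burn = sum(team_burn.values())
        return {team: rate + support_cost * rate / total_burn
                for team, rate in team_burn.items()}
    flat_share = support_cost / len(team_burn)
    return {team: rate + flat_share for team, rate in team_burn.items()}


# Hypothetical per-sprint figures.
team_burn = {"Team A": 100_000, "Team B": 80_000, "Team C": 120_000}
other_support = 24_000  # e.g. RTE and Product Management

# System Team costs 90,000 and spends 40% of its capacity on general support.
to_burden, to_features = split_support_team_cost(90_000, support_pct=40)

support_cost = other_support + to_burden
flat = burdened_burn_rates(team_burn, support_cost)
proportional = burdened_burn_rates(team_burn, support_cost, proportional=True)
```

With these numbers the flat option adds 20,000 to every team, while the proportional option adds 20,000, 16,000 and 24,000 respectively.  The remaining 54,000 of System Team cost is attributed to enabler features through backlog aggregation.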

All that remains is for our supporting staff to timesheet against a temporary “I worked on this ART” WBS, and we have the means to attribute our costs at the ART level just as we did for the team.

We get one other thing for free.  I like the word "Burdened" when it comes to burn rates.  There's usually a world of potential waste to be eradicated in burden costs.  Calculating some heuristics allows your CFO to start asking some rather pointed questions about whether ARTs are really "running lean".

Conclusion

I work with one small client who has none of the “enterprise fiscal responsibilities” of most SAFe implementations.  In theory, they have no need for any of the above discipline and in fact ran their early implementation without it.  But then they wanted to start analyzing cost/benefit on the features they were building.  In fact, it was the delivery folks who wanted to know the answers so they could change up the cost/benefit conversations when feature requests came in.

I don’t think it matters how “ideally Lean or Agile” you are, if you’re in the type of enterprise using SAFe you will need to be able to allocate your costs across Opex and Capex, and need to analyze your Feature costs to tune your investment strategy and provide data to your improvement initiatives.  The techniques illustrated in this article require good backlog discipline and some walking through to get blessing from your Finance departments, but they’re far from rocket science.  And best of all, they work regardless of whether you’re project funded, capacity funded, or some combination of the two!

Because, in the end, my experience of Leaning up governance has taught me one thing: until you have a viable alternative that enables the organization to fulfil its fiscal and governance responsibilities, you’ll never dislodge the onerous, wasteful practices of the past.

Friday, February 15, 2019

Dealing with Unplanned and/or BAU work in SAFe

In the Agile world in general, we have long preached the move from “Project mindset” to “Product mindset”.   “Wouldn’t the world be simpler if we just talked about work instead of projects and BAU?” is a mantra on many an agilist’s lips. 

Whilst the notion of forming teams and trains that “just do the most important work regardless of its nature” is a great aspiration, it comes with a number of caveats:

  • Funding and capitalization are generally significantly different for the two
  • Planning and commitment are difficult when some (or much) of the team’s work is unplanned

Enterprises have typically solved for the problem through structural separation.  The first step into Agile is often to move from separate “Plan”, “Build” and “Run” structures to separate “Plan and Build” and “Run” structures.  Projects are fed through “Plan and Build”, then after some warranty period transitioned to “Run”.  Funding is separate, and “Run” is driven more by SLAs than plans.

A truly product-oriented mindset requires the establishment of teams and ARTs that can “Plan, Build and Run”, and this post will tackle in-depth the issue of planning and commitment and introduce some tools for tackling the funding side of the equation.

Funding

I’ll tackle the topic of funding in greater detail in a future post, but the short version follows.  If a backlog item is categorized, the categories can be mapped to funding constructs.  We can then take the burn-rate for a team, the percentage of its capacity dedicated to each funding construct, and allocate funding accordingly.

Planning and Commitment

Both the PI cadence of SAFe and the Sprint cadence of Scrum seem to invalidate the incorporation of BAU.  After all, if we fix our Feature priorities for 8-12 weeks in SAFe and our Story priorities for 2 weeks in Scrum how do we deal with the unplanned?

Known BAU work can be represented by planned backlog items, but the answer to unplanned work lies in the effective utilization of Capacity Allocation.  We can reserve a given percentage of the team (or train’s) capacity for unplanned work, and plan and commit based on the remaining capacity. 

Team-level Illustration: Production Defects

One of the first benefits we find with persistent teams is that we can feed production defects back to the team responsible for introducing them.  This provides them with valuable feedback, typically dramatically improving quality. 

We might reserve 10% of team capacity to cater for this.  Thus, if the team’s velocity is 40 they would only plan to a velocity of 36 and reserve 4 points for production defects. 

Mechanically, the following occurs:

  • If less than 4 points of production defects arrive, the team pulls forward work from the following sprint.
  • If more than 4 points of defects arrive, the Product Owner makes an informed decision: defer new feature work or defer low-priority defects.
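
The reservation arithmetic and the two rules above can be sketched as follows.  The function names are hypothetical, and the “exactly on the reserve” branch is my own addition since the rules above only cover the under and over cases.

```python
from typing import Tuple


def plan_capacity(velocity: float, reserve_pct: float) -> Tuple[float, float]:
    """Return (points to plan against, points reserved for unplanned work)."""
    reserved = velocity * reserve_pct / 100
    return velocity - reserved, reserved


def sprint_action(defect_points: float, reserved_points: float) -> str:
    """Describe what happens given the production defects that actually arrived."""
    if defect_points < reserved_points:
        return "pull forward work from the following sprint"
    if defect_points > reserved_points:
        return "Product Owner decides: defer new feature work or defer low-priority defects"
    return "reserve fully used, nothing to rebalance"  # assumption, not in the rules above


planned, reserved = plan_capacity(velocity=40, reserve_pct=10)  # (36.0, 4.0)
quiet_sprint = sprint_action(2, reserved)
noisy_sprint = sprint_action(7, reserved)
```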

My preferred implementation of this technique is slightly different.  A number of times, we have reserved the 10% for a combination of Production Defects and Innovation.  If the team has shipped clean code, they get to work on their innovation ideas rather than pulling forward work!

ART-level illustration: BAU work

When staffing ARTs, we often find that some (or many) of the key staff are only available “if they bring their BAU work with them”.  In these cases, we plan the known BAU work and apply PI-level capacity allocation based on the percentage of their capacity we feel is needed to cater to expected “unplanned BAU” loads and withhold this when planning out the PI.

Dealing with fluctuations in unplanned work levels at the ART/PI level is a little more consequential.  Whilst the sprint-to-sprint mechanism of the production defect illustration still applies, we need to be monitoring for potential impact on PI objectives.

  • If less than the expected amount of unplanned work arrives for the team, we have the option to either use the spare capacity to absorb work from other teams struggling with their PI objectives or pull forward Features from future PIs.
  • If more than the expected amount arrives, we are monitoring impact on committed objectives.  We can cater to a certain amount by sacrificing capacity allocated to stretch objectives, but if we are at risk of compromising committed objectives this should trigger a management decision to determine whether to defer or deflect the unplanned work or compromise a committed objective due to the significance of the unplanned work.

Discipline is a must

Applying these techniques will quickly run into a challenge.  Teams are often sloppy with BAU/unplanned work.  They “just do it”, viewing the effort of creating, sizing and running backlog items for it as unnecessary overhead.  This leaves us without the visibility required for the deliberate, proactive decision making illustrated above.  Often, somewhat embarrassingly, we wind up at the end of the Sprint or PI apologizing for missing a commitment “because BAU was more than expected” without any hard data to back it up.  Even more importantly, we have not given the Product Owner/Product Manager/Business Owners the opportunity to intervene and deflect the unplanned work so that we could maintain the commitment.

Further, I find most teams dramatically underestimate the capacity consumed by BAU work.  We’ve routinely worked with teams who set a capacity of 30% aside for BAU, then when they’ve finally missed enough objectives to buy into actually tracking their BAU work find it to be 50-60%. 

However, the true benefit of discipline goes further – the data generated is a goldmine.


Reaping the Benefit of Discipline

Whilst the first benefit of discipline is obviously that of gaining an accurate understanding of your capacity and being able to more confidently make and keep commitments, exponential gains can be realized once you start to analyze the data generated.  A key first step is developing an awareness of failure demand and value demand.

Failure Demand vs Value Demand

“Failure demand is demand caused by a failure to do something or do something right for the customer” – John Seddon

The first illustration that was given to me for failure demand many years ago was in the context of call centers.  It’s the 2nd and 3rd phone call you have to make because your issue wasn’t fully resolved on the first call.   If we take a typical agile team or ART, we can find many examples:

  • A late-phase defect is caused by failure to “build quality in”.
  • A production defect is caused by failure to deploy a quality product.
  • A request for information is caused by failure to have provided that information previously or failure to have made the requester aware of where the information is published.
  • An issue is often caused by failure to effectively mitigate a risk.
  • Time spent issuing reminders or nagging is failure demand, as more effectively establishing the awareness of the “why” and clearly setting the expectation would have avoided it.
  • Managing the politics of a missed commitment results from both failure to meet the commitment and failure to effectively manage the possibility that the commitment would be compromised.

“Value demands are demands from customers that we ‘want’, the reason we are in business” – John Seddon
Value demand for teams and ARTs should be obvious – the features and stories the teams are working on!  However, this can become a little more nuanced very quickly:

  • Is work done on an improvement initiative value demand?  Our customer probably didn’t directly ask for it.  In fact, many improvement initiatives are effectively failure demand as they are driven by addressing previous failures.
  • A great deal of BAU/Unplanned work is falsely perceived as value demand.  “I run this script or extract every morning”, “We produce and consolidate this report every month” are all great examples.  In theory someone values the result of the script or extract, and values the report – but the need to dedicate capacity to it results from a failure to automate it, or failure to fix a broken process.

Applying the Insights from Demand Patterns

Assuming we’ve had the discipline to channel all demand on a team through their backlog, and the further discipline to categorize it appropriately as failure or value demand, we can now start to drive significant improvement on the following basis:

  • If I reduce failure demand, I have more capacity to devote to value demand
  • If I find a more effective way to respond to value demand, I have more capacity to devote to value demand
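
As a minimal sketch of what this categorisation makes possible, assume each backlog item records a size and a demand type.  The `BacklogItem` fields, the "value"/"failure" tags and the sample numbers below are hypothetical, chosen only for illustration.  Summing by type shows how much capacity failure demand is actually consuming:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BacklogItem:
    title: str
    points: float   # capacity consumed, in whatever unit the team sizes with
    demand: str     # "value" or "failure" (hypothetical tags)
    planned: bool   # False for BAU/unplanned work


def capacity_share(items):
    """Return the fraction of total points consumed by each demand type."""
    totals = defaultdict(float)
    for item in items:
        totals[item.demand] += item.points
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {demand: points / grand_total for demand, points in totals.items()}


# Illustrative data only, not taken from a real team
sprint = [
    BacklogItem("New booking story", 8, "value", True),
    BacklogItem("Production defect fix", 5, "failure", False),
    BacklogItem("Manual monthly report", 3, "failure", False),
    BacklogItem("Search improvement story", 4, "value", True),
]

shares = capacity_share(sprint)
for demand, share in sorted(shares.items()):
    print(f"{demand}: {share:.0%}")
```

Tracking the same shares Sprint over Sprint is one way to see whether efforts to reduce failure demand are actually freeing capacity for value demand.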

In “Four Types of Problems: from reactive troubleshooting to creative innovation”, Lean expert Art Smalley defines a hierarchy of problem types and accompanying resolution strategies.  Three of these are pertinent to this situation:

  • Type 1: Troubleshooting – “Reactive problem solving based upon quick responses to immediate symptoms”.
  • Type 2: Gap from Standard – “Structured problem solving focused on problem definition, goal setting, root causes analysis, counter-measures, checks, standards and follow-up activities”.
  • Type 3: Target Condition – “Continuous improvement that goes beyond existing performance of a stable process or value stream.  It seeks to eliminate waste, overburden, unevenness, and other concerns systemically,  rather than responding to one specific problem”.

When you form a good Agile team, their ability to jump to each other’s aid, rally around problems and move from individual work to teamwork tends to exhibit a lot of troubleshooting – particularly in the case of unplanned work.  Good troubleshooting skills are fundamental to any team.  As Smalley comments, “to address each [issue] with a deeper root cause problem-solving approach would require tracking and managing a problem list that runs, literally, hundreds of miles long.  No organization can hold that many problem-solving meetings … in an efficient manner”.

Our response to most failure demand is to apply troubleshooting techniques.  However, while these will help us survive the prevailing conditions they won’t help us change them.  Change requires the use of Type 2 problem solving techniques.  We need to leverage our data to identify recurring trends, and act to remove the root cause of the failure demand.  Smalley devotes great attention to problem definition, and opens with two pieces of critical advice when framing the problem for attention:

  • “The first step is to clarify the initial problem background using facts and data to depict the gap between how things should be (current standard) versus how they actually are (current state).”
  • “Why does this problem deserve time and resources?  How does it relate to organizational priorities?  Strive to show why the problem matters or else people might not pay attention or might question the problem-solving effort.”

As we are successful with the reduction of failure demand with our Type 2 activities, we can move on to Type 3 problem solving, driving activity to establish new target conditions.  If we accurately understand the capacity being devoted to various types of value demand we can more accurately assess whether the value being generated justifies the capacity being consumed – triggering informed continuous improvement.   An enterprise PMO we have been working with provided a wonderful example recently:

They had historically applied a QA process to every project the organization ran.  This, of course, was characterized as “BAU” work.  It had to be done every time a project passed through a particular phase in lifecycle.  As they gathered data on how much of their capacity it actually consumed, they started to question the value proposition.  How regularly did the QA check actually expose an issue?  What were the typical consequences of the issues exposed?  What other high-value discretionary activities were unable to proceed due to capacity constraints?  Eventually, they were able to make an informed decision to move to a sampling approach, freeing up more capacity to devote to high-value initiatives they had been frustrated by an inability to proceed with.

Conclusion

Capacity allocation allows us to deal with BAU/Unplanned work, but my experience has been that it never works well without the accompanying discipline of actually channeling that work formally through your backlog.  It might require some creativity to make it meaningful (e.g. a single backlog item for the sprint representing the capacity devoted to a daily BAU activity).  Beginning with the reduction of failure demand in BAU/Unplanned work will both improve performance and free capacity which can then be devoted to true continuous improvement initiatives.

However, the usefulness of the Sprint or PI cadence-driven cycle seems to fall apart at the point where more than 30-40% of capacity is being reserved for unplanned work.  Some form of cadence-driven alignment cycle will always be valuable, but adaptation from the standard events and agendas will be necessary to make them meaningful and Kanban is far more likely to provide a useful lifecycle model.  The ARTs I have worked with in this situation have tended to wind up with shortened planning events far more focused on “priority alignment” than detailed planning.

Above all, the benefit comes from the mindfulness generated in the presence of data reflecting “where you really spend your time” as opposed to “what your value priorities are”, and the accompanying discipline of acting on that data to achieve better alignment.

Saturday, October 27, 2018

Unleash your ART with the right Business Owners

For years, I wondered why Dean Leffingwell had to use another name for stakeholders with the "Business Owner" construct in SAFe.   My description in training used to run "an ART will have many stakeholders, but somewhere will be the 4-6 whose support is critical to your success - they're your Business Owners".

At some point it dawned on me.  If you consider the size of investment involved in running an Agile Release Train, under any legacy delivery model there would have been some form of executive steering committee.  Through this lens, you are offering the ART Business Owner construct as a Lean|Agile alternative to the traditional steering committee.

I have been rewarded by deliberate focus on formalizing the Business Owner construct.  It has unlocked one of the key decentralization puzzles I have encountered in my years with SAFe.  At some point in the journey of envisioning an ART, you are looking for a Product Manager.  A single person with a huge amount of delegated responsibility as the strategic voice of the customer.  Finding that one person the enterprise agrees can effectively represent all the competing interests for the ART's strategy and capacity can be almost impossible.  The problem also rears its head for the Release Train Engineer (RTE).  Most ARTs will be working on multiple strategic assets with a diverse ownership throughout the IT organisation - owners who often have significant concerns about others working on their platforms.

I have come to think of the RTE and Product Manager as empowered delegates of the Business Owners for the ART.  The Business Owners are then collectively accountable for rapid response to any decision escalated to them.  When these 2 layers of leadership operate effectively together, magic happens.

When and How to select Business Owners

The journey of an ART begins with leadership training and a vision workshop.  We are looking for a group of leaders and stakeholders collectively aligned in their understanding of the framework and with a shared vision for what the ART will be.

It is ideal for Business Owners to be a part of this early journey, so the time to map candidate business owners is when you're determining the attendees for the Leading SAFe training and the vision workshop.

When selecting business owner candidates, an overall stakeholder map can be a useful input.  You need 2 types of representative:

  • Business Stakeholders
    • The owners of the operational value stream(s) the ART will service
    • Stakeholders with policy, regulatory or other assurance concerns in relation to the ART's mission
    • Owners of funding (if not covered above)
  • Technology Stakeholders
    • Designated owners of Strategic Technology Assets the ART will impact
    • Line Managers of staff the ART is likely to recruit

What level executive?

If you look too junior for your Business Owners, they won't have the teeth to fulfil their mission.  Conversely, looking too senior will give you authority but is unlikely to yield the time commitments you will be pursuing.

Whilst most organisations have their own leadership hierarchy, I find the sweet spot in large organisations is "the most junior level still considered to be an executive" - just above the line where senior middle management begins.  Considering this to be a "level one exec", I take one extra step - find the level 2 exec in that area and seek a nominated delegate.

How many do I want?

There is no set answer to how many Business Owners an ART should have, but I have some rules of thumb based on experience.  On a number of occasions just two were nominated - the Business Sponsor and the Technology Owner.  These ARTs struggled, partially due to a lack of diversity of voice and nearly always because the Business Sponsor had numerous peers who had interests they did not feel the sponsor adequately represented.

So, I worry if I see fewer than 4 Business Owners.  On the other hand, too many is also a huge problem.  It's incredibly difficult to reach and maintain alignment and move on critical decisions with a large and unfocused group.  I once had the misfortune of trying to facilitate an ART vision workshop with 30 candidate Business Owners and convergence was near impossible.

You want enough to be appropriately representative (usually at least 4) and not so many you'll struggle to make decisions (usually painful to move beyond 8-10).
 

The importance of setting expectations

Your prospective executive business owners need to understand what comes with the job.  As a collective the organisation is making them accountable for fiscal oversight and executive championing of the ART's mission.

They will attend and actively participate in a number of SAFe events, with a not inconsiderable accompanying time investment.  They will be needed for at least the following:

  • Participation in the ART vision workshop
  • Participation in the Program Backlog prioritisation workshop(s) each PI
  • Attendance at PI planning, validation and acceptance of objectives and participation in the management review
  • Attendance at System Demos (as part of their oversight obligations)
  • Attendance at and participation in Inspect and Adapt
  • Periodic participation in governance and escalation workshops with respect to ART performance.
  • Being the 'voice of the ART' in their area of the business.


They will need to understand the purpose of each of these, their role, and the ensuing time commitments.

I believe the position of Business Owner on an ART should be formally nominated and accepted, and would go so far as to suggest formalising a Terms of Reference for the Business Owner committee.

Leveraging your Business Owners

The primary times your train and Business Owners will intersect will be through the afore-mentioned SAFe events.   Following are some of the patterns I have found beneficial.

The ART Vision Workshop

Every ART should start with a clear vision, and this is the workshop that first crystallizes it (documented here).  Agreement is reached on Customers, Success Measures, Vision Statements and the nature of the solution to be built, and the business owner roles are confirmed.  As the executive owners of the ART, it is critical not only that the business owners have a voice in shaping the vision but that clarity and alignment are reached on how the ART's success will be measured.

Program Backlog Prioritisation Workshops 

In preparation for this workshop, the Product Manager has been decomposing Epics, talking to customers, and reflecting on learning from shipped features to identify and shape a set of candidate features to add to the Program Backlog.  Each feature has a value proposition, and in SAFe we use a contextualised Cost of Delay (COD) estimation and the Weighted Shortest Job First algorithm to arrive at economic priorities.

In this workshop, new features are evaluated to determine their COD and existing backlog features where conditions might have changed are re-assessed.  Based on known information about feature value proposition, Business Owners use agile estimation to come into alignment on the cost of delay.  This enables the Product Manager to fulfil their responsibility to maintain an economically prioritised backlog whilst leveraging the diversity and organisational authority of the business owners to make the assessments.
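
A minimal sketch of the arithmetic behind that prioritisation, assuming the standard SAFe formulation in which Cost of Delay is the sum of three relative estimates (User/Business Value, Time Criticality, Risk Reduction/Opportunity Enablement) and WSJF is Cost of Delay divided by job size.  The `Feature` class, the feature names and the scores are invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    business_value: int      # User/Business Value (relative estimate)
    time_criticality: int    # Time Criticality (relative estimate)
    risk_opportunity: int    # Risk Reduction/Opportunity Enablement
    job_size: int            # relative size; must be greater than zero

    @property
    def cost_of_delay(self) -> int:
        # CoD is the sum of the three relative components
        return self.business_value + self.time_criticality + self.risk_opportunity

    @property
    def wsjf(self) -> float:
        # Weighted Shortest Job First: Cost of Delay divided by job size
        return self.cost_of_delay / self.job_size


def prioritise(features):
    """Order features with the highest WSJF first."""
    return sorted(features, key=lambda f: f.wsjf, reverse=True)


# Hypothetical scores agreed by the Business Owners in the workshop
backlog = [
    Feature("Online reservations", 8, 5, 3, 8),
    Feature("Loyalty points", 5, 2, 1, 2),
    Feature("Menu redesign", 3, 1, 2, 6),
]

ranked = prioritise(backlog)
for feature in ranked:
    print(f"{feature.name}: CoD={feature.cost_of_delay}, WSJF={feature.wsjf:.2f}")
```

Note how the smallest feature rises to the top despite having only half the Cost of Delay of the largest one: dividing by job size favours small, valuable work.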

PI Planning

Little needs to be added to existing material in this space.  Business Owners have a critical role to play both interacting with teams on an ad hoc basis and in a planned fashion through plan reviews and the management review.   The PI Planning event provides amazing "Gemba time" for executives, with much to be learned by observing and interacting as the teams build their plans.

The event commences with a set of priorities that have been validated and endorsed by the business owners.  By the end of day 1, there will be various challenges to that vision.  Regularly, hard decisions will need to be made, and made on the night.   Management Review provides the forum for the Business Owners to revisit their earlier "value proposition discussion" in the light of emerging information and provide rapid response with the same authority that drove the initial priorities.

On the second day, the all-important "Business Owner walk-around" occurs.  The Business Owners visit each team and conduct a review and assessment of the PI objectives.  It both establishes direct executive/team engagement and feedback, and permits clarification of objective intent - maximising the chance both the team and their stakeholders leave the room aligned in their commitment expectations.

System Demo

The primary "Execution reporting" vehicle for the ART culturally is the System Demo.  Every 2 weeks, it provides a transparent checkpoint based on demonstrated functionality and linkage back to PI objectives.  It also acts as a platform for feedback.   As implementations are demonstrated, where they differ from the vision of the business owner or expose new challenges or insights this is the forum to both trigger and receive that feedback.

Inspect and Adapt

Business Owners have a role to play in all 3 phases of the Inspect and Adapt workshop.  During the PI Demo, they are gaining an understanding of "what has been achieved" in the PI with reference back to the objectives committed at PI Planning.  During the quantitative measurement phase, they are both reviewing ART PI Metrics and evaluating performance of teams against their committed objectives based on the Demo.

Finally, during the problem solving workshop they provide the all-important executive lens to root cause analysis and brainstorming.  It provides them with invaluable time "at the Gemba", and insights into enabling initiatives that should be prioritised.

Governance

Each of the SAFe events described above qualifies as a governance event.  Collectively, they enable the Business Owners to steer ART strategy, provide rapid decision-making response at key moments in the lifecycle and maintain visibility through System Demos and Inspect and Adapt.

However, it is also typical to establish some form of cadence-based "ART Steering" forum.  In previous versions of SAFe this was known as "Release Management", but I suspect it has receded from visibility in the framework due to confusion over the title.  A monitoring and escalation forum needs to exist for the ART during PI execution.  It is informed by the standard SAFe events, but facilitates monitoring and follow-through of the various risks and issues that emerge.

Enabling

To succeed in its mission, the ART will need to collaborate across the organisation - particularly for exploration activities.  Business Owners provide a built-in channel of access to the areas that will typically be critical.  Through their understanding of the focus areas for the ART, they can proactively plan who from their areas should be involved and how.

Conclusion

I have become so convinced of the power of a well-identified, engaged group of Business Owners that it is one of my top priorities when shaping a potential ART these days.  I have found it to be a critical key in unlocking empowerment and ability to move for Product Managers and Release Train Engineers, whilst providing the Business Owners a great sense of comfort in the level of insight and input they have.

Monday, January 8, 2018

Effective Feature Templates for SAFe

Introduction

Features are the key vehicle for value flow in SAFe, yet they are also the source of much confusion amongst those implementing it.  The framework specifies that “Each feature includes a Benefit Hypothesis and acceptance criteria, and is sized or split as necessary to be delivered by a single Agile Release Train (ART) in a Program Increment (PI)”.

It is obvious that you need a little more information about the feature, and on what feels like countless occasions I have facilitated the definition of a Feature Template with a Product Management group.  People in classes ask me for a sample, and of course the only samples I have belong to my clients and aren’t mine to share.

The new emphasis on Lean UX in SAFe 4.5 finally inspired me to put some time into crafting a Feature Template of my own that I could share.  The result is a synthesis of recurring patterns I have observed in my coaching, and focuses on “essential components” with guidance on additional information that might be required.

How much detail is needed, and by when?

I divide the refinement lifecycle of the Feature into two phases.  
  • Prior to WSJF assessment
  • Prior to PI Planning
There is a level of detail required for a feature to enable Cost of Delay and sizing assessment.  Information is inventory, and we want a lightweight holding pattern until the Feature’s priorities indicate it as needing to be prepared for PI planning.

To this end, my template focuses on taking a canvas approach to supporting WSJF assessment, and I provide some guidance on likely extensions of detail in readiness for PI Planning.

Feature Canvas


Problem Statement
Leveraging the work of Jeff Gothelf in Lean UX, we base the feature on a definition of the problem it is designed to address.  Gothelf provides two excellent template statements here:

New Product:
“The current state of the [domain] has focussed primarily on [customer segments, pain points, etc].
What existing products/services fail to address is [this gap]
Our product/service will address this gap by [vision/strategy]
Our initial focus will be [this segment]”
Existing Product:
“Our [service/product] is intended to achieve [these goals].
We have observed that the [product/service] isn’t meeting [these goals] which is causing [this adverse effect] to our business.
How might we improve [service/product] so that our customers are more successful based on [these measurable criteria]?”
Feature Hypothesis
Again, leveraging Gothelf’s work we form a hypothesis as to the impact our Feature might achieve.  The template takes the form:
“We believe this [business outcome] will be achieved if [these users] successfully achieve [this user outcome] with [this feature]”.
Objectives and Key Results (OKRs)
Features are intended to provide verifiable impact.  This detail is critical to enabling effective Cost of Delay estimation and the post-deployment verification of impact.  We want to ensure that quantifiable movement of identified Leading Indicators supports the ongoing evolution of our Product Management strategy.

As previously documented in this post, I have become quite a fan of the OKR model initiated at Intel and popularized by Google and find this a useful discipline in defining feature impacts.  

If OKRs seem a little daunting, you could instead list Leading Indicators and expected movements in this section.

Cost of Delay Components
The effectiveness/objectivity of Cost of Delay estimation workshops is heavily driven by the data on the table.  The 3 sections for User/Business Value, Timing Criticality and Risk Reduction/Opportunity Enablement provide opportunity to highlight supporting data for the assessment of the three cost of delay components.

Key Subject Matter Experts
I rarely if ever work with an ART where Product Management is self-sufficient with their domain expertise.  Deliberate identification of and engagement with subject matter experts early in the lifecycle of a feature is critical.

External Dependencies
Nothing brings a feature unstuck faster than unidentified external dependencies.  These should be flushed out early, and inform prioritization and roadmapping discussions.

Non Functional Requirements
We know that an ART will have a standing set of non functional requirements that applies to all features, but occasionally features come with specific NFRs.
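
For those who keep their features in a tool rather than on a wall, the canvas sections above can be sketched as a simple data structure.  This is only an illustration: the `FeatureCanvas` class, its field names and the choice of which sections must be filled in before a WSJF workshop are all assumptions, and the sample text is invented.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureCanvas:
    problem_statement: str = ""
    feature_hypothesis: str = ""
    okrs: list = field(default_factory=list)
    user_business_value: str = ""
    timing_criticality: str = ""
    risk_reduction_opportunity_enablement: str = ""
    key_subject_matter_experts: list = field(default_factory=list)
    external_dependencies: list = field(default_factory=list)
    non_functional_requirements: list = field(default_factory=list)

    # Assumption: the sections a Cost of Delay discussion leans on most.
    # Tune this list to suit your own ART.
    REQUIRED_FOR_WSJF = (
        "problem_statement",
        "feature_hypothesis",
        "okrs",
        "user_business_value",
        "timing_criticality",
        "risk_reduction_opportunity_enablement",
    )

    def missing_for_wsjf(self):
        """List the required sections that are still empty."""
        return [name for name in self.REQUIRED_FOR_WSJF if not getattr(self, name)]


# Invented example content
canvas = FeatureCanvas(
    problem_statement="Diners cannot book a table outside opening hours.",
    feature_hypothesis=(
        "We believe bookings will increase if diners can successfully "
        "reserve a table online."
    ),
)

print(canvas.missing_for_wsjf())
```

A check like this keeps the canvas lightweight while making it obvious which gaps need closing before the feature goes to a Cost of Delay workshop.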

Sample Completed Canvas

It was incredibly difficult to invent a sample feature because my head kept running to real features, but eventually I settled on a fictitious feature for restaurant reservations.


A glimpse at how you might visualise your next WSJF estimation workshop


Detail beyond the Canvas

As a feature is selected as a candidate for an upcoming PI, it triggers the collection of additional framing detail.  How much or how little detail is appropriate tends to vary ART by ART and at different stages in their evolution.  

At a minimum, it will require acceptance criteria.  However, some other things to consider would include:
  • User Journeys:  Some framing UX exploration is often very useful in preparing a Feature, and makes a great support to teams during PI planning.  
  • Architectural Impact Assessment: Some form of deliberate architectural consideration of the potential impact of the feature is critical in most complex environments.  It should rarely be more than a page – I find a common approach is one to two paragraphs of text accompanied by a high level sequence diagram identifying expected interactions between architectural layers.
  • Change Management Impacts: How do we get from deployed software to realised value?  Who will need training?  Are Work Instructions required?  

Tuning your Template

Virtually every ART I have worked with has oscillated between “too much up-front information” and “not enough up-front information”.  You want to know enough to enable teams and product owners to effectively plan and execute iteratively, yet not so much that you constrain the opportunity for teams and product owners to innovate and take ownership of their interpretation.

Every PI is a learning opportunity.  Take stock a week after PI planning and look at both the information you wished you’d had, and your observations as to the value proposition of the information you had provided.  

Then, take stock again late in the PI.  Look at how the features played out over the PI, and the moments you wish you could have avoided 😊  

Who completes the Canvas/Template?

The Product Manager is the content authority at the Program Backlog level, hence they are the ultimate owners.  However, one of the nice influences Lean UX has brought to SAFe 4.5 is a real emphasis on “collaborative design”.   In avoiding the waste of knowledge handoff, the best people to work through the majority of the detail (including preparing the canvas itself) are the product owners and teams likely to be implementing the feature.

 
 

Friday, July 7, 2017

Facilitating Release Train Design with the ART Canvas Part 3 - Launch Planning

Following a hectic couple of months launching four new ARTs, it’s time to conclude my three-part series on facilitating Release Train design with the ART Canvas.  Covering the facilitation of a 1-day workshop, the previous posts dealt with creation of a shared ART Vision accompanied by a supporting design.  In this conclusion, I tackle the closing session of the day: the preparation of a launch plan.

Before commencing the launch planning, we generally shrink the audience.  We release the stakeholders who were critical to establishing a shared vision and design, and bring our focus in to the leadership team who will be executing the launch preparation activities.

Agile is a team sport, and the ART is intended to be a self-organizing team of agile teams.  So, your leadership team needs to work as an agile team too!   The vision workshop is essentially a team formation activity for the leaders, who then have the chance to operate as an agile team as they prepare for the launch.

Since our final objective for the workshop is to generate a launch plan, the metaphor that makes most sense is that of a short “enablement PI” which is executed by the leadership team with the support of the product owners.  Thus, the final segment of the workshop is a mini PI Planning.  I open with the following challenge objective for their enablement PI:

“Your product is an Agile Release Train that can fulfil the vision we’ve defined here today.  8 weeks from now (or whatever launch date we’ve set), 150 people are going to walk into a room to spend 2 days in their first PI Planning.  What will make this a success?”
To assist them in their planning, I then distribute the following Features before sending them into breakouts to create a set of stories and objectives that will realize the objective:







The depth to which the plan is developed depends on our time-boxing.  I’ve generally run this workshop over the course of a single day, which means their set of stories and backlog will be very rudimentary given the constrained time-box they will have for breakouts.  I have increasingly become tempted to extend it to a second day, allowing time to create a more effective plan, establish PI objectives, identify dependencies and create a Program Board.  This would also allow some time on the 2nd day to work through some team formation activities with the Leadership Team such as:

  • Development of a social contract
  • Design of the team Kanban
  • Planning of team demo approach (the team should be demonstrating back to stakeholders on launch preparation progress)
  • Agreement on standup timing
  • Iteration planning for the first iteration

Closing the session

In closing, we should have the following outcomes:

  • A name and launch date for the ART
  • A completed ART canvas, ready to be hung on a wall (and eventually digitised)
  • An agreement for the newly formed ART leadership team to function as an agile team during the launch preparation
  • An appropriately scary launch preparation backlog that motivates movement by the leadership team and is accompanied by an agreed “Agile monitoring” process using Team Demos to enable progress transparency and feedback. 

I like to close out with a final check-in.  I love the definition of consensus Jean Tabaka taught me: “I can live with that, and support it”.  It’s a great way to bring a day of very hard work to a close, perhaps using a Roman vote.  “We’ve formed a vision, an ART design and a plan today.  There’s a lot of hard work ahead of us but I’d like to check that we have achieved consensus before we leave this room.”



Sunday, June 11, 2017

Reflecting on the 2017 SAFe Leadership Retreat

Earlier this week, I spent a few days at the very picturesque Dalmahoy Hotel in Edinburgh attending the 2017 SAFe Leadership Retreat.  This was our 3rd gathering, having travelled to Banff (Canada) in 2016 and Crieff (Scotland) in 2015.

If I had to describe the retreat in one word, it would be “community”.  Given the general application of SAFe in large enterprise and the prevailing “partner consultancy” based model, creation of an environment where both consultants and practitioners come together to transcend boundaries and share is no mean feat.  Full credit must go to Martin Burns and his better half Lucy for their tireless efforts in this regard.

As always, the retreat was attended by Dean Leffingwell along with a number of other representatives from SAI.  Also in attendance was a mix of SAFe Fellows, SPCTs, consultants and practitioners along with a “chaos monkey” in the form of Simon Wardley.

Having now had a few days to reflect on the experience, I’d like to share some of the key themes that emerged.

SAFe “next”

In the opening session, Dean spent a couple of hours running us through the soon-to-be-released next version of SAFe.  Whilst we’re under NDA when it comes to specifics, I can say that the changes were enthusiastically received – with much improved guidance in some critical areas and, best of all, for the first time ever a simpler big picture!

SAFe beyond technology

Whilst the framework is squarely focused on the HW/FW/SW side of “Lean Product Development”, those with a truly lean mindset know that there’s a lot more than technology involved in the creation of a Lean Enterprise and optimization of the flow from Idea to Value.  I’ve long held the belief that great ARTs extend beyond technology into Sales, Marketing, Business Enablement and Business Change Management and it was great to see many others not just talking about this but doing it.

HR as an enabler rather than an impediment in a Lean-Agile transformation

We were lucky to have a number of attendees who were either practicing HR professionals or came from an HR background, and numerous open-space sessions devoted considerable attention to Lean|Agile and the world of HR.  Whilst Fabiola Eyholzer’s guidance article last year made a great start on this front, many are grappling with the practical realities of such questions as how to address career progression in the ScrumMaster/RTE and Product Manager/Owner spaces.  Hopefully in the coming months we’ll see some of the outcomes of those discussions synthesized and in the public domain.

SAFe for Systems of Record

When it comes to true Business Agility, there are always Systems of Record involved.  A number of sessions and discussions focused on this, with a particularly robust session on SAP.   The general conclusion was that whilst it’s a lot more convoluted than digital, many are doing it successfully and common patterns are emerging.

Active Learning in SAFe courses

Many of the attendees were passionate believers in experiential and/or active learning who have struggled with the orientation of the SAFe courseware towards “lecture-style” training.  The great news for all of us is that this is becoming a significant focus area for SAI.  The newly introduced RTE course is a radical departure from the past, and the preview we were given of the new Leading SAFe course shows marked improvement in this direction.

SAFe is taking off in Europe

I’ve been coming to Europe every few months in a SAFe context for a couple of years now (starting with Crieff in 2015), and it has clearly been lagging the US and Australian markets in enterprise adoption.   But it appears the time has come.  Of the roughly 50 attendees at this year’s summit, perhaps 40 were European based and the vast majority were actively involved in significant SAFe implementations – a radical departure from 2015.

Closing Thoughts

Great agile is all about collaboration, relationship and learning.  The manifesto expressed it beautifully with the words “We are uncovering better ways of developing software by doing it and helping others do it”.   This year’s retreat lived the philosophy, and I enjoyed deepening existing relationships, forming new ones, sharing my experiences and learning from others’.  Bring on 2018!