Initial adoption contextAt the time of adoption, the group was modelled roughly as follows:
- 5 project-based Scrum teams all with the same team-shape, following a reasonably similar model and in the final stages of delivery for their projects
- 1 project-based ‘pseudo-Kanban’ team with 3-4 months remaining on delivery of their project
- 2 stakeholder-aligned ‘pseudo-Kanban’ teams with very different team shapes and 4-5 months of work remaining in their pipeline
- A newly formed ‘System Team’ working on implementing a continuous integration capability
- 10-15 projects running under an outsourced/offshored waterfall model
- A group of system analysts fulfilling requirements and design elaboration for the waterfall projects
- 18 project managers
- Establish a shared understanding of SAFe fundamentals
- Determine the initial implementation model and approach to transitioning/maturing beyond initial implementation
- Determine the most effective organisational structure for the group
- Immediately establish an Agile Release Train involving the 5 Scrum teams, converting them from project based to stable feature teams. All new work entering the group would be fed to the train.
- Manage the outsourced waterfall projects through to conclusion.
- Allow the ‘pseudo-Kanban’ teams to conclude their current commitments, whilst gradually reshaping them to be ready to join the train upon completion.
- Agile Pipeline Management Services (APMS) – composed of the group of project managers from the Scrum teams and the system analysts. Primary responsibility: Manage preparation of the pipeline of work for PSI’s and make sure the teams had the support they needed in delivery.
- Development Services – run by the equivalent of the “Release Train Engineer” and composed of the feature teams. Primary responsibility: Deliver PSI outcomes
- Deployment Services – run by the deployment manager and incorporating the System Team. Primary responsibility: Liaise with enterprise release management and operations and ensure smooth deployment and ongoing continuous integration capability uplift.
- Transition Services – composed of the ‘pseudo-Kanban’ teams and the outsourced waterfall projects/PM’s. Primary responsibility: Ensure graceful conclusion of inflight work whilst preparing the teams for transition onto the train.
For the first month or so, it was ‘business as usual’ for the feature teams. They were already operating on the same iteration cadence, and all had a number of sprints committed in order to deliver on existing commitments. The primary focus area was the formation of the APMS group, but before delving into that I’ll briefly cover the feature teams.
Early days for the feature teamsThe feature teams basically had two things to do:
- Become ‘stable feature teams’ instead of project teams
- Normalise sizing.
The second was painful. The concept of story point normalisation is perhaps one of the most controversial in SAFe. Having been reluctantly convinced of the need by Dean, I still found that no matter how carefully you approach it you will still be impacted by most of the evils used to argue against the concept. We began by using the past 8 iterations’ metrics for every team to construct a normalisation formula. Simplistically, some teams downshifted their sizing one notch on the Fibonacci scale and other upshifted by one. It was deceptively simple, and one of the key lessons learnt in the time since is that we should have far more actively coached and retrospected on sizing practices.
Early days for APMSThe APMS team had 4 initial objectives:
- Establish the Epic Kanban system reflecting both the active work in the feature teams and work in the pipeline
- Prepare the ‘work in pipe’ and logistics for the first PSI
- Work with the rest of the COE (in particular the PMO) to agree governance, financial and other details of how the release train would mesh with the broader enterprise.
- Continue to provide project management support to the feature teams
APMS Challenge 2 – Feature Grooming: How much is enough?
The second challenge lay in the approach to grooming features for the PSI. The reality of the funding model and the dependency complexity of the work required a fair amount of analysis and estimation work to create ‘ready to play’ features. The best way to summarise this issue lies in the fundamental paradox of SAFe. Part of the reason the model is so powerful is because when you present it to program/project managers and architects they can readily map it to their existing mental models. On the other hand, the most common criticism I hear from agilists is that it ‘looks too formal and RUP-like’. What we experienced was that the intent of the model is too easily interpreted in practice into waterfall-like behaviours.
The original feature grooming model involved the APMS group identifying the business intent and features for the Epic, resolving the high level architecture and feeding it through a feature team for ‘enough discovery work to size it to +-30% confidence estimates’. The teams would thus hold a portion of their velocity available to be dedicated to these activities, which would be done in parallel with fulfilling their delivery commitments. What eventuated was the analysts/architects in the APMS group identifying and specifying the features and handing fully articulated designs to the teams who then simply estimated the design they’d been handed and handed their estimates back to the PM. This caused no end of problems, and was eventually heavily course corrected as you’ll see once we reach the current-day model.
APMS Challenge 3 – The Product Owner construct
Prior to the adoption of the release train, finding the right product owners for agile initiatives had been a serious challenge for the group. Given the nature of a data warehouse, there are extremely diverse groups of stakeholders for any given initiative and finding someone with sufficient diversity of domain knowledge, sufficient availability and sufficient empowerment had been entirely unsuccessful. Dean’s “Product Manager/Product Owner” separation was the key to unlocking the puzzle. Whilst we couldn’t employ it exactly as specified in the framework, the underlying principal of separation of concerns was a huge enabler. In standard SAFe, the Product Manager is basically the overall decision maker on prioritisation and scope for the release train whilst the product owner travels with the team. In our context, there could be 15 different funding sources active at once and prioritisation and scope had to belong to the sponsors providing the money.
The solution adopted was to employ “Epic Owners” and “Feature Owners”. The Epic Owner was the person providing the funding for an Epic, and was looked to for engagement at key points and strategic prioritisation and scope calls for their Epic. The Feature Owner, on the other hand, was to have deep domain expertise on a particular feature, have delegated authority within the scope of the feature and be far more actively engaged with the team whilst their feature was in play. The construct resolved a vast number of issues, particularly in addressing the contrast between availability and authority.
APMS Challenge 4: Can we really do PSI’s?
I’ve left the biggest challenge for last. SAFe deliberately separates the deployment cycle from the PSI cycle. Whilst the PSI produces a ‘potentially shippable increment’, the framework allows for either deployment on the PSI boundary, multiple PSI’s before a deployment or multiple deployments within a PSI. Our intent at launch was to run 8-week PSI’s and use the PSI as a planning boundary with multiple deployments happening in any given PSI.
The first difficulty we encountered was feature shaping. Given that the standard guidance for a feature is ‘small enough to be delivered in a single PSI but large enough to represent a significant outcome for the business’, there was a clash between good feature shaping and deployment window requirements. Available deployment windows were specified by enterprise release management, with 1 per month for minor deployments without significant integration requirements and a varying cadence for ‘enterprise deployments’ with major integration involved. Further, the deployment window for a feature was generally dictated by external dependencies. It was impossible to find a PSI cadence which aligned well with deployment boundaries, and shaping features to PSI’s was also largely impossible as we would regularly encounter features which would need to be deployed 1 iteration into a PSI but needed 2-3 iterations to implement.
Feature shaping was difficult but not insurmountable – in the end, we could have resolved it by allowing ‘cross-PSI’ features whilst still maintaining strong size limitations. However, the ‘bridge-too-far’ lay in establishment of sufficient funded pipeline. Once again driven by enterprise level gating cycles and funding processes, an 8 week PSI that fully funded and committed the entire release train was just not emerging.
By this point, however, we had established significant momentum with every aspect of the release train other than the PSI launch. The pipe was operating, the work was flowing into the teams, deployments were going out the door and momentum was tangible. One large leadership workshop later, we made the call: “know that PSIs will start eventually, but in the meantime develop compensating mechanisms and keep maturing the release train”.
Our focus then turned to compensating mechanisms – what were the key things we lost by not having the PSI, and how could we utilise our knowledge of the principals to achieve the purpose. To my mind, we identified the first 3 of the 5 key ingredients initially:
- The power of having the whole release train in a planning workshop together from the perspective of developing a sense of ‘release train as one large team with shared outcomes’ as well as ‘release train as team of teams’
- Employing a deliberate set of disciplines to groom out 4 iterations worth of backlogs at the release train level and ensure that visibility, dependency, risks and issues were fully articulated at the story level for all active features.
- Release train level ‘Inspect and Adapt’ cycles
- Maintenance of a strategic view of capability buildout on the platform that effectively capitalises on synergies between features.
- Strategic showcasing and communication of PSI level outcomes to draw together the ‘whole of product’ view of progress for broad stakeholder groups and avoid fragmented and isolated demonstration and communication feature by feature/epic by epic to narrow stakeholder groups.
The first strategy was the concept of ‘PSI-lite’, better known as ‘Unity Day’. A 1-hour ‘whole of train’ session at the start of each sprint aimed at building a sense of ‘train as team’ and ensuring everyone started the sprint on the same page. These have been incredibly powerful from a cultural transformation perspective. They are always attended by numerous visitors, and repeat attendees regularly comment on the tangible change in atmosphere over time. (for a deeper treatment, see this post)
The second strategy involved the establishment of ‘Team Loco 131’, composed of the extended leadership group and functioning as a ‘program level continuous improvement team’. Their stories are generated through learning generated by various forms of retrospective, and the leaders are fully bought into the lean management model of taking responsibility for improving the system of work.
Whilst the story of addressing backlog grooming challenges is too long and torturous to describe here, I will offer the following. Many organisations find the investment of putting a whole program and set of stakeholders in a room for 2 days planning every 8-10 weeks unpalatable. For Strategic Delivery, this commitment was one of the major sticking points in management workshops – although the general manager bought in and was willing to support it, most of her leadership group felt it was unjustifiable. I’ve heard similar stories from many other SAFe Program Consultants. Our journey, painful lessons and ongoing investment required to compensate in the area of release train level backlog grooming/planning have utterly convinced me of the vital importance of this activity. To put my systems thinking hat on, failure to implement effective PSI planning creates an insane amount of failure demand across the entire release train. Weighing the cost of the failure demand against the investment required for the PSI planning days provides a scale very heavily tipped towards the investment.
The Program Kanban
The second point is that whilst this blog series focuses on wall visualisations, every wall depicted is replicated in Rally with far more underlying detail. The Rally Portfolio Management capability is a key enabler in effective management and measurement of the system, whilst the walls are vital for tactile insight and facilitation of communication.
The Epic Kanban is, obviously, dedicated to the flow of an Epic through the release train value chain. Every card on the wall is an Epic. The value chain is divided into 5 phases, each with a number of states:
- Initiate Phase: Transition from demand management and connection with stakeholders.
- Discovery Phase: Feature elaboration and estimation refinement
- Launch Phase: Approvals, funding and preparation to drop into teams for implementation
- Evolve Phase: Implementation
- Operate Phase: Deployment and follow-through to ensure happy users and successful transition to operations.
On the day PSI’s become viable for this train, the phases and states will change very little. Initiate & Discover effectively involve the identification, sizing and prioritisation of features in preparation for PSI injection and maintenance of roadmap, Launch equates to locking in PSI content and executing PSI planning, and Evolve is delivery of the PSI.
The Initiate Phase
- Execute handover with the demand management group (‘PMO’ and ‘Validate Entry’ states)
- Connect with the Epic Owner, ensure their value drivers for the epic are clearly articulated and introduce them to the operating model (‘Envision’ state)
- Identify candidate feature owners ( ‘Envision’ state)
- Determine the feature team which will execute discovery, the sprint when it will occur, and ensure logistics are organised for discovery to proceed smoothly. (‘Prepare discovery’ state)
The Discovery Phase
Discovery is led by the feature teams, and involves working with the Epic and Feature owners to ‘discover the features’ and explore the feature level requirements sufficiently to articulate and size the outcomes. A feature team will typically have roughly 10% of their sprint capacity dedicated to discovery work, with discovery for most epics concluding within a single sprint. The flow of states is as follows:
- Kickoff Discovery: Workshop with Epic Owner and all Feature Owners to gain understanding of the vision for the epic, determine the exploration workshops required, identify key risks/issues and ensure the stakeholders understand the discovery process.
- Explore Features: Detailed workshops with feature owners to validate the feature composition, high level stories and feature level acceptance criteria.
- Consolidation: Consolidation of the outputs from feature exploration workshops, estimation and preparation of a high level release plan.
- Discovery Outcome: Presentation back to Epic and Feature Owners of consolidated discovery outputs for validation and agreement of outcomes
Looking at the wall, you'll see Astrotrain finalising Discovery for one Epic, Kaizen (3rd row) finalising one and kicking off another and Maglev consolidating one set of outputs whilst kicking off another. It's unusual for a team to have more than one Epic in Discovery at a time, in this case it's likely that they are small Epics which minimal feature elaboration required.
Launch concludes with evolve preparation. The feature team or teams to execute evolve are identified, logistics are organised for the release planning session at the start of evolve, the commencement sprint is locked in and the feature teams re-familiarise themselves with the discovery outcomes.
The key aspect to evolve preparation is feature team selection. Obviously, the ideal scenario is for the entire epic to land with a single feature team and for the evolve feature team to be the same one which executed discovery. However, reality intervenes. Whilst evolve capacity availability is utilised in selecting the discovery team, when the launch phase is protracted that team is often not available for evolve. In these scenarios, some capacity is reserved in the backlog of the discovery team to ensure they are available to support the kickoff and rapid knowledge uptake of the evolve team(s). Likewise, where the Epic is too large to be executed by a single feature team, work will be spread across feature teams on a feature by feature basis.
Looking at the photo, you'll notice a few things. Firstly, queues are very visible :) Secondly, although the APMS team runs a kanban system for their work (as hinted at by the epic level cumulative flow diagram and control chart on the wall), they also track the sprint (or iteration) cadence. Their priorities on any given day are driven by where in the sprint cadence the feature teams are. Finally you'll see the cards with post-its on them ('1', '2' and '4'). These would have been identified at the standup that day as needing urgent attention.
- Plan Release: Execute the release planning which would normally occur in a PSI planning session. The stories identified in discovery are broken down into ‘sprint-size’ and scheduled across sprints based on available capacity to create an initial release plan. One of the key outcomes of this workshop is establishment of working agreements with the feature owners. They know which sprints their feature will be active in, when acceptance testing for their feature will occur and can plan their availability accordingly.
- Implement: Build and acceptance testing of features.
- Integrate and Prep Deployment: This phase has a split personality depending on deployment type. For an ‘independent deployment’, it takes roughly a week and involves standard integration, deployment hardening and operational handover preparation activities. For deployments associated with an enterprise release (75-80%), it lasts 4-5 sprints. The work cannot be accelerated, as the timing and nature of involvement is dependent on enterprise level co-ordination. Teams thus wind up with a trickle-feed capacity reservation during this time.
The daily rhythm of all the groups will be covered in the next post, but it’s worth making a few concluding comments here. The first is to play back to Dean’s concept of the release train as a ‘self organising program’. Effective visualisation of the epic lifecycle is a vital ingredient, as it generates so much ‘knowledge at a glance’ of the state of play of the entire value chain. Within seconds, the group can identify what epics are where in the value chain, who has carriage, where the queues are and what needs focus. And of course, metric analysis through cumulative flow and cycle time charting at the Epic level. All of this across a system of work that at any given time generally has 30-40 epics somewhere in the value chain.
I'll close with a hint of things to come when we talk about results in the final post for the series. You might remember a mention of 18 project managers in the initial context at the beginning of the post - there are now 3.
PS Part 4 in the series is now posted and deals with the program level feature wall and in-play work.