Weighted Shortest Job First is a process utilised in SAFe® to bring Economic Logic, facts and figures, to prioritisation. It is a useful, and very powerful, mechanic but a lot of the subtleties around how it works and how it should be facilitated are absent from the training and explanatory material which concentrate on “how to do the process” rather than “why the process works”. Whilst I can, and do, discuss these topics in depth in any Implementing SAFe® training that I run, once I have stopped speaking those words are lost to the ether. These blog posts are intended to capture the explanations that explore the subtleties of WSJF and act as a more permanent record.
This blog assumes a familiarity with WSJF; it is described in the Scaled Agile Framework at: https://www.scaledagileframework.com/wsjf/
What is the Determining Factor for Getting High Scores?
Job Size is the determining factor.
The Job Size, the effort estimate, is below the dividing line; if that number gets very big then the resultant WSJF score gets very small.
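As a minimal sketch of that arithmetic, assuming the standard SAFe formula (the three inputs above the line are summed into a Cost of Delay and divided by Job Size), the effect of a growing Job Size looks like this. The function name and the scores are invented for illustration only.

```python
def wsjf(business_value, time_criticality, risk_opportunity, job_size):
    """Cost of Delay (the three inputs above the line) divided by Job Size."""
    if job_size <= 0:
        raise ValueError("job_size must be positive")
    cost_of_delay = business_value + time_criticality + risk_opportunity
    return cost_of_delay / job_size


# Same Cost of Delay (8 + 5 + 3 = 16), very different Job Sizes.
small_slice = wsjf(8, 5, 3, job_size=2)   # 16 / 2
big_bundle = wsjf(8, 5, 3, job_size=20)   # 16 / 20

print(small_slice, big_bundle)
```

With identical numbers above the line, the piece of work that is ten times larger scores ten times lower.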
If you want to game the system and get your work prioritised, then you need to submit smaller pieces of work. This is the first subtle, but hugely important point, that needs to be properly communicated to everyone involved with WSJF. Historically the Stakeholders are used to a completely different mechanic, project approvals. The game with project approvals was to gather as many requirements together as you possibly could, maybe add a few extra sacrificial requirements that you expected to lose later but wanted as a buffer to protect your core requirements, and then you would try to squeeze this “Maximum Approvable Set” through the approvals process. If you brought the “Maximum Approvable Set” to a WSJF where would it end up? At the bottom, somewhere deep, deep down in a sub-basement.
|We were working with a Release Train in a major European Investment Bank. We explained to the Stakeholders of the Train, the Epic Owners, how an Agile Release Train works and the purpose of WSJF. The nice thing about working with senior staff in a bank is that they either have a maths degree or a physics degree, which is just applied maths. Given the explanation of the WSJF algorithm, it didn’t take them long to work out that Job Size was the key to gaming the system and within minutes they were slicing features in order to get the work they really needed prioritised; less relevant work being deferred until sometime in the future.|
With WSJF the game is “What’s the smallest we can do that will deliver the business we all work for some value?” Once that is delivered the Agile Release Train is going to come running back and ask, “What’s the next smallest thing we can do that will deliver our business more value?” Little increments of releasable value that build upon and complement each other. This isn’t how Stakeholders have traditionally worked and they will require support and facilitation from the Product Manager to encourage them to start thinking in, and requesting, incremental pieces that build up to their goals.
Who Provides the Inputs?
Stakeholders to the train, who have been requesting the Features, should know about the first three columns: User/Business Value, Time Criticality and Risk Reduction / Opportunity Enablement. The last of those three might need some technical input, typically from the System Architect, because it’s the Technical Enablers that tend to score high in this category and the System Architect should be able to explain the reduction in risk that will be provided on completion of the Enabler.
The engineering staff know about the effort, they are the experts on getting stuff done. Are we going to run the world’s largest round of Planning Poker with all 100+ engineers on the Train? Probably not. Use the technical representative at the Train Level, the System Architect, to organise a representative group of Engineers (covering all the disciplines Development, Test, Architecture, UI/UX, etc… and the technology stack, Front End, Back End, etc…) to provide some centralised estimates. Centralised estimation is fine for the purposes of Forecasting and Prioritisation, but anywhere that a commitment is made then the estimates must come from the people that are making the commitment.
A standardised Story Point scale is useful because it allows easier forecasting but if the teams and trains have not converged onto a standard scale then pure, simple, relative estimating will still work for prioritisation. The relative order of the results will be the same, the scale of the resulting numbers will be different, the decimal point will have shifted a couple of places, but the order will not have changed.
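A small sketch of why the order survives a change of scale: multiplying every Job Size by the same factor divides every score by that factor, so the ranking is untouched. The `rank` helper, the feature names and the numbers are hypothetical.

```python
def rank(features, size_scale=1):
    """Return feature names ordered by WSJF score, highest first."""
    def score(feature):
        _name, cost_of_delay, job_size = feature
        return cost_of_delay / (job_size * size_scale)
    return [f[0] for f in sorted(features, key=score, reverse=True)]


# (name, Cost of Delay, Job Size) using one team's relative scale.
features = [
    ("A", 16, 8),
    ("B", 13, 2),
    ("C", 21, 5),
]

order_local = rank(features)                    # the team's own scale
order_scaled = rank(features, size_scale=100)   # same sizes, decimal point shifted

print(order_local, order_scaled)
```

Both calls give the same order; only the magnitude of the scores differs. This holds as long as all the Job Sizes being compared were estimated on the same relative scale.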
The second vitally important point to communicate is that the Stakeholders providing the inputs above the line have no direct control over the input below the line, the Job Size. To prioritise their work they need a small job size and the only way to get that is to provide a series of incremental features, each valuable and releasable in their own right but building up to the ultimate set of intended functionality. They need to slice up the full scope of work they require into smaller, individually valuable, pieces.
What Patterns Are Available for Slicing Features?
Richard Lawrence and Peter Green wrote a great piece on how to split stories a number of years ago, which is a MUST READ article for anyone hoping to work in agile ways. If you haven’t seen it already, read it today!
Slicing up features is not dissimilar. As to when to slice a feature, consider this: if a feature is likely to fill a single ART team’s capacity, you would reduce PI risk by splitting that feature up. Not doing so risks the feature not completing, which then incurs the undesirable knock-on effects of reduced train predictability and reduced stakeholder confidence. Below we have summarised some thoughts on how you might consider slicing your features:
|Pattern for Split||Ask yourself if...|
|Behaviours that are Optional||Can you achieve the same outcome using different behaviours?|
|Differing Technologies||Are you looking to develop the feature on differing technologies, e.g. Android, Web and iOS?|
||Does this feature have different user groups, who will approach using this feature differently?|
|Pareto Law of Value||Is nearly 80% of the value coming from 20% of your stories? If so, start with the most valuable story grouping|
|Separate Common Enablers||Does more than one feature require a common enabler to support it?|
||Start with the highest volume use pathway and save others to separate features|
||Perhaps a data subset provides benefit in its own right? If so, start here|
|Go Core First||Is there an obvious core to the work that can be pared down and enhanced later?|
|Now or Later?||Do you have aspects described that can wait until the next PI?|
|Increment through Business Areas||Will the feature lend itself to being released a business area at a time?|
|None of the above…||When you are completely out of ideas, it is time to investigate further before attempting to split. A spike should help you identify some options|
Table 1. Feature Splitting Ideas
When slicing Features it is always important to remember that these are the true releasable elements of the system and so must always provide a robust, usable solution to the users. Features will be built Story by Story, but in themselves must provide a ‘complete’ usable solution. Note that a complete usable solution means ‘no bits missing’, rather than that all possible Stories have been implemented. This means that some of the patterns that can be applied when splitting stories, e.g. defer functionality - implement security later, aren’t applicable for Features, the Feature being a releasable point needs to have all the appropriate security present!
There is another important difference between splitting Stories and slicing Features: when we split a Story we usually split the original Story into a set of similarly sized new Stories that completely replace the original Story; when slicing Features, we usually just find the most important slice and leave the rest to be addressed later. Therefore, always try to slice off the most valuable bit; being small and valuable, it will rise up the WSJF ranking. More slices can be taken off as needed until the value left in the remainder means that it is unlikely ever to be prioritised, at which point it can be discarded.
What stops a Stakeholder from shouting “My feature is 100 times more valuable than anything else”?
The other stakeholders.
Get the gorillas fighting amongst themselves; the group of peers will very quickly rebalance the situation if any one of them gets too far out of line.
WSJF should never be done by the Product Manager alone because they will never win. If they prioritise one Stakeholder’s feature over another Stakeholder’s, then the losing stakeholder will blame them for not prioritising their work. Turn the scenario the other way around and it will be the other stakeholder blaming the Product Manager for not prioritising their work; the Product Manager can never win; they can only sign off ill with stress1.
WSJF is all about generating Stakeholder alignment; the Product Manager facilitates these discussions but lets the Stakeholders argue it out.
Should a Feature with a high Business Value be sliced?
No. Only slice on size.
If the size is too big break it up using the previously mentioned slicing patterns. Small and valuable is just what we are after!
More details on preparing features can be found in a series of blog posts: Preparing Features for PI Planning
How much of a backlog?
Enough to fill the PI, but no more.
Elaborating more than is needed for the PI is wasted effort and that time and effort could be better directed to ensuring the current set of features are properly prepared.
Product Managers, having been in discussion with Stakeholders, should have a clear idea of what should be in or out. Roadmaps are always a good starting point.
If necessary, run a quick sanity check with the Stakeholders: “Do you think there are any features that are more important than the selected set?” They might come back with one or two, add these to the set being prioritised, run the WSJF prioritisation process and once an order has been produced you can work out which of the features now fit within the Agile Release Train’s capacity.
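One way to picture that last step is sketched below: order the candidate set by WSJF, then walk down the list keeping each feature that still fits in the remaining capacity. The `select_for_pi` helper, the feature names and the numbers are all hypothetical, and skipping a feature that does not fit (rather than stopping at it) is an assumption of this sketch, not a rule from SAFe.

```python
def select_for_pi(features, capacity):
    """Walk the WSJF-ordered list, keeping each feature that still fits."""
    ranked = sorted(features, key=lambda f: f["cod"] / f["size"], reverse=True)
    selected, remaining = [], capacity
    for feature in ranked:
        if feature["size"] <= remaining:
            selected.append(feature["name"])
            remaining -= feature["size"]
    return selected, remaining


# cod = Cost of Delay (sum of the three inputs above the line), size = Job Size.
candidates = [
    {"name": "F1", "cod": 18, "size": 3},    # WSJF 6.0
    {"name": "F2", "cod": 15, "size": 5},    # WSJF 3.0
    {"name": "F3", "cod": 16, "size": 8},    # WSJF 2.0
    {"name": "F4", "cod": 20, "size": 20},   # WSJF 1.0
    {"name": "F5", "cod": 3, "size": 2},     # WSJF 1.5
]

selected, spare = select_for_pi(candidates, capacity=20)
print(selected, spare)
```

Here the large feature F4 drops out of the set even though it has the highest Cost of Delay, which is exactly the pressure to slice described earlier.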
What if the Stakeholders don’t like the order?
If the Stakeholders want the set of Features in a different order, then let them order the set however they see fit.
Do you mess with the numbers to try and get the order they want?
No. Do not destroy the truth.
The most important thing about the whole WSJF process is that the Stakeholders agree that this is the right set to take into PI Planning. Is this the set of business needs that they would like addressed in the next Program Increment? Does the set match the capacity available?
The order itself isn’t as important as the set; it is just the order in which the features are placed on the wall in PI Planning and whilst it does indicate business priorities, in a properly empowered train, the teams will be taking those features off the wall in whatever order they choose because they will understand any technical sequencing that might have to occur in order to maximise the amount that the plan can deliver.
The PI Planning process itself is remarkably robust, I have never seen a planning event that hasn’t produced a plan2, but I have seen events where it was impossible to validate whether the plan produced is what the business needed because the business, the Stakeholders, weren’t in agreement over what they needed.
WSJF is a very powerful tool for generating Stakeholder agreement.
The dependencies that we are talking about here are not the collaborations3 between teams that are negotiated and mapped out during PI Planning but pre-existing external dependencies:
|We can’t do X until Mega-Corp Y has delivered an update to our underlying database technology.|
This style of external dependency is known about in advance; if the Stakeholders don’t know about it then the System Architect(s) should. Prior to WSJF, all features should have some degree of technical review. In this instance the System Architect(s) would be adding a note4 to the feature describing the external dependency to inform others in prioritisation or planning discussions.
Is it worth prioritising a Feature whose external dependency is not going to be resolved in this PI?
No. It can’t be delivered, so why waste time discussing it? If the Stakeholders don’t like that then they can go and negotiate the external issue. They have the power, the money, the influence, the big boots, to go knocking on suppliers’ doors and renegotiate the relationship. Don’t expect the engineering staff to be able to change something that’s well beyond their sphere of influence.
When estimating features, they have to be estimated as if they are the only thing in the world; the estimate contains all of the effort to get the business need delivered. However, Features often overlap, there might be work that is common to two or more features. The estimates for the Features with that common work all have to include the effort to do that common work, because until they’ve been prioritised you don’t know which feature is going to be at the top of the priority list and even once a feature has been prioritised there’s no guarantee that it’s going to get picked first in the PI Planning event.
At PI Planning you’ll get the effort back; the teams shouldn’t be doing the same work repeatedly, they’ll do it once for whichever overlapping feature gets planned first and then all the others will benefit and get easier because of that work. Even in Trains which have teams that are Feature orientated and can deliver features end-to-end there are still collaborations because the teams need to work out which team is doing the overlapping work and when.
This is another scenario where the information is often known at the point when the detail of the feature is elaborated. Either the Stakeholders or the System Architects should be able to spot that there is overlap between features and add a note to the feature suggesting that there is overlap with another feature, with a corresponding note on the other feature in reverse. In PI Planning a team pulling that Feature to plan it will see the note and can look to see if another team has already taken the other feature and negotiate with that team accordingly. The note is used to trigger conversations and negotiations in the PI Planning event by suggesting the overlap rather than explicitly dictating a dependency.
The challenge with overlapping work being repeatedly re-estimated is not with prioritisation but with forecasting. If those estimates are used in forecasts, then it will look like the work is going to be done repeatedly and the forecast will be distorted, with the end date being further into the future than it should be. The fix is for the System Architect(s) to spot the overlaps and break out an Enabler feature that can be done in a preceding Program Increment; once it’s done, the estimates for the remaining features get smaller because the common, overlapping work has been dealt with. This is why roadmaps are so important, so that the technical staff can look ahead at what might potentially be happening in the future and add enablers to the Architectural Runway to prepare for that future. It is only worth breaking out the enabler if it can go into an earlier Program Increment; if everything is happening in this Program Increment then whichever Feature goes first carries the burden of the common, overlapping code and the rest will become easier after that point. The Product Manager might need to have a couple of extra features prepared and ready over and above the capacity of the Release Train for the PI, knowing that some of that capacity is overlapping work whose capacity will become available for the extra work during the planning event itself.
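The forecasting distortion can be illustrated with a few lines of arithmetic. In this sketch, which uses invented feature names and numbers and a hypothetical `forecast_effort` helper, each feature carries a standalone estimate that includes the common work, and the overlap notes record how much effort is shared and by which features.

```python
def forecast_effort(standalone_estimates, overlaps):
    """Return (naive total, total with shared work counted only once).

    standalone_estimates: feature name -> estimate as if it were the only feature.
    overlaps: list of (shared_effort, set of feature names that include it).
    """
    naive = sum(standalone_estimates.values())
    # Shared work appears in every overlapping estimate but is only done once,
    # so it has been over-counted (number of features - 1) times.
    over_counted = sum(effort * (len(names) - 1) for effort, names in overlaps)
    return naive, naive - over_counted


estimates = {"X": 8, "Y": 5, "Z": 13}
overlaps = [(3, {"X", "Y"})]   # X and Y both include the same 3 points of work

naive_total, adjusted_total = forecast_effort(estimates, overlaps)
print(naive_total, adjusted_total)
```

The gap between the two totals is the capacity that comes back during planning, which is why it can be worth having a couple of extra features prepared.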
When do you run WSJF?
As late as possible to have the best possible information.
Which features might need to be considered because they didn’t get completed in the last PI? Those Features that weren’t completed get a new estimate of effort to complete (this should be smaller, therefore they’re likely to get prioritised) and they can be considered alongside the other features. The features’ previous values for Business Value, Time Criticality and Risk Reduction / Opportunity Enablement are ignored as this is a different set of features and the numbers are relative to the current set being considered; additionally, time has passed, things have changed, what held true last time might not be true this time.
It is also helpful for the codebase5 to be at standstill for Prioritisation and Planning. It is difficult to estimate the effort for a Feature if the codebase is continually moving because the changes to the codebase could be changing the estimates. Estimates are typically done in advance of the main WSJF session; this still has to happen within the IP Iteration once the codebase has come to a standstill, so it’s typically only a day or two in advance of the WSJF prioritisation.
Only the prioritisation is done last minute; the preparation of the Features is spread across the whole preceding Program Increment. PMs should have a clear idea of what they think is going to go into the prioritisation session; there will be some Features that they are absolutely certain will be in the next PI. Work on those first, then move towards the Features that they’re less certain about until the cut-off limit of what the Agile Release Train can achieve in a Program Increment is reached.
Features should be socialised, not to pre-empt planning, but to de-risk planning by avoiding nasty surprises. We don’t want teams trying to solve features in advance of PI Planning because a) they’ve got the current plan to finish first and b) any pre-work is waste if the team isn’t able to take the feature. In socialising the features, we want the teams to check whether the business needs are adequately described so that in PI Planning they can start the process of solving the problems contained in the feature. If there is an issue with a Feature then the PO/PM group still have time to engage with stakeholders and fix the feature.
When should you socialise the features? Late enough that they’re reasonably well formed but early enough that there is still time to fix any issues discovered. My personal preference is to use the last Backlog Refinement meeting in the last plannable sprint (the one before the IP sprint). At this point in time the Team Backlog should be empty, they are executing the last stories, so instead of refining the team backlog the meeting is repurposed to look ahead at the Features in the Program Backlog.
Prioritisation might not have been done at the point when features are socialised, but the candidate set of features should be relatively well known and can be communicated.
Why does Effort work as a proxy for Duration?
The timebox that is the Program Increment and the rule that “Features must fit within a Program Increment” means that there can never be too much of a discrepancy between duration and effort.
We don’t yet know who will do this work, so calculating an absolute duration is almost impossible; different teams run at different speeds, and to pre-allocate work to teams in order to calculate a duration would break self-organisation in the Planning Event. Therefore effort (or “Job Size”) is used as a proxy for the duration of time the work will take.
The two instances where effort and duration diverge are external dependencies and swarming:
|External dependencies have already been discussed; there is no point prioritising a Feature if the External dependency is not going to be resolved in this Program Increment. If some preparatory work is needed to set an External supplier to do their work, then this preparatory work would be broken out as an Enabler that is executed in a preceding Program Increment prior to the main Business Feature.|
|Swarming is when several teams all work together in parallel on the same piece of work to get it done quicker. This will result in a duration being shorter than the effort estimate indicated but since Features must fit within the timebox, that we can do them quicker within that timebox by swarming isn’t going to affect the total number of features that could be done in that timebox.|
I hope with this blog that we’ve shown that there is a lot more to Weighted Shortest Job First than just filling in the numbers.
There are subtleties around how it can influence the behaviour of the participants and whose behaviour it should be influencing.
We’ve also discussed some of the associated practicalities that help the process run smoothly.
Thanks to my co-teachers on Implementing SAFe® trainings, Ian Spence, Keith de Mendonca and Marika Zep, for their inputs and discussions that have led to this blog.
Thanks to Keith in particular for his review efforts.
#1 Not a joke; I have seen it happen!
#2 I’ve coached or facilitated well over 60 PI Planning events in the last decade
#3 I much prefer the phrase collaboration but Scaled Agile use dependency throughout
#4 Most backlog management tools have a Notes section that is separate from the Acceptance Criteria
#5 If you’re not in the software space then Mechanical/Electrical designs, Documents, whatever