If you only quantify one thing, quantify the Cost of Delay.

—Don Reinertsen

 

Weighted Shortest Job First

Weighted Shortest Job First (WSJF) is a prioritization model used to sequence jobs (e.g., Features, Capabilities, and Epics) to produce maximum economic benefit. In SAFe, WSJF is estimated as the Cost of Delay (CoD) divided by job size.

Agile Release Trains (ARTs) provide an ongoing, continuous flow of work that makes up the Enterprise’s incremental development effort. This flow avoids the overhead and delays caused by the start-stop-start nature of traditional projects, where authorizations and phase gates control the program and its economics.

While this continuous flow model speeds the delivery of value and keeps the system Lean, priorities must be updated continuously to provide the best economic outcomes. In flow, it is the sequencing of jobs, rather than the theoretical return on investment of any individual job, that produces the best result. To that end, WSJF is used to prioritize backlogs by calculating the relative CoD and job size (a proxy for duration). Applying WSJF at Program Increment boundaries keeps backlog priorities current with respect to user and business value, time factors, risk, opportunity enablement, and effort. WSJF also conveniently and automatically ignores sunk costs, a key principle of Lean economics.

Details

Reinertsen describes a comprehensive model, called weighted shortest job first, for prioritizing jobs based on the economics of product development flow [2]. WSJF is calculated as CoD divided by job duration. Jobs that can deliver the most value (or CoD) and are of the shortest duration are selected first for implementation. When applied in SAFe, the model supports a number of additional key principles of product development flow, including:

  • Take an economic view
  • Ignore sunk costs
  • If you only quantify one thing, quantify the Cost of Delay
  • Economic choices must be made continuously
  • Use decision rules to decentralize economic control

The impact of properly applying WSJF can be seen in Figure 1 (see [2] for full discussion).

The areas shaded in blue illustrate the total CoD in each case. Doing the weighted shortest job first delivers the best economics.
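The effect of sequencing can be checked with a small calculation. The sketch below is illustrative only: the three jobs, their CoD rates, and their durations are hypothetical numbers, not the values behind Figure 1, and it assumes that a job keeps accruing its Cost of Delay until it is delivered.

```python
from itertools import permutations

# Hypothetical jobs: (name, cost of delay per week, duration in weeks).
JOBS = [("A", 10, 1), ("B", 3, 3), ("C", 1, 10)]


def total_delay_cost(sequence):
    """Sum each job's CoD rate times the weeks that pass before it is delivered."""
    elapsed = 0
    total = 0
    for _name, cod_per_week, weeks in sequence:
        elapsed += weeks                 # this job is delivered at `elapsed`
        total += cod_per_week * elapsed  # delay cost accrued while it was not done
    return total


def wsjf_order(jobs):
    """Order jobs by CoD divided by duration, highest first."""
    return sorted(jobs, key=lambda job: job[1] / job[2], reverse=True)


# Brute-force every possible sequence to find the cheapest one.
best = min(permutations(JOBS), key=total_delay_cost)

print("WSJF order:   ", total_delay_cost(wsjf_order(JOBS)))
print("Reverse order:", total_delay_cost(list(reversed(wsjf_order(JOBS)))))
print("Cheapest sequence:", [name for name, _, _ in best])
```

With these numbers, the WSJF order (A, B, C) accrues 36 units of delay cost, the reverse order accrues 189, and the brute-force search confirms that no other sequence is cheaper.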

Calculating the Cost of Delay

In SAFe, our jobs are the epics and the features and capabilities we develop, so we need to establish both the Cost of Delay and the duration for each job. There are three primary elements that contribute to the Cost of Delay:

  • User-business value – Do our users prefer this over that? What is the revenue impact on our business? Is there a potential penalty or other negative impact if we delay?
  • Time criticality – How does the user/business value decay over time? Is there a fixed deadline? Will they wait for us or move to another solution? Are there Milestones in the critical path impacted by this?
  • Risk reduction-opportunity enablement value – What else does this do for our business? Does it reduce the risk of this or a future delivery? Is there value in the information we will receive? Will this feature open up new business opportunities?

Moreover, since we are in continuous flow and should have a large enough backlog to choose from, we needn’t worry about the absolute numbers. We can just compare backlog items relative to each other using the modified Fibonacci numbers we use in ‘estimating poker.’ Then the relative Cost of Delay for a job is:

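Cost of Delay = User-business value + Time criticality + Risk reduction-opportunity enablement value

Figure 2. The relative Cost of Delay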
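As a minimal sketch, the relative Cost of Delay can be computed as the sum of the three component ratings. The function name, the parameter names, and the decision to reject ratings outside the modified Fibonacci scale are assumptions made for this illustration.

```python
# Modified Fibonacci scale commonly used in estimating poker.
MODIFIED_FIBONACCI = (1, 2, 3, 5, 8, 13, 20, 40, 100)


def relative_cost_of_delay(user_business_value, time_criticality, rr_oe_value):
    """Add the three relative ratings; each must be on the modified Fibonacci scale."""
    ratings = (user_business_value, time_criticality, rr_oe_value)
    for rating in ratings:
        if rating not in MODIFIED_FIBONACCI:
            raise ValueError(f"{rating} is not a modified Fibonacci number")
    return sum(ratings)


# Hypothetical ratings for one job: 8 + 5 + 2.
print(relative_cost_of_delay(8, 5, 2))
```

Because the ratings are relative, the result only has meaning when compared with other jobs rated against the same backlog.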

Duration

Next, we need to understand job duration. That can be difficult to determine, especially early on, when we may not yet know who is going to do the work or what capacity allocation they might be able to give it. Fortunately, we have a ready proxy: job size. In systems with fixed resources, job size is a good proxy for duration. (If I’m the only one mowing my lawn, and the front yard is three times bigger than the back yard, the front lawn is going to take three times longer to mow.) And we already know how to estimate item size in Story points (see Features). Using job size, we have a reasonably straightforward calculation for comparing jobs via WSJF, as Figure 3 illustrates.

WSJF = Cost of Delay / Job size

Figure 3. A formula for calculating Weighted Shortest Job First

Then, for example, we can create a simple table to compare jobs (three jobs, in this case), as shown in Figure 4.

Figure 4. A sample spreadsheet for calculating Weighted Shortest Job First

To use the table, the team rates each job relative to other jobs on each of the three parameters. (Note: With relative estimating, you look at one column at a time, set the smallest item to a “1,” and then set the others relative to that item.) Then add the three ratings to get the job’s CoD, divide that sum by the size of the job (which can be either a relative estimate or an absolute number based on the estimates contained in the backlog), and use the result as a number that ranks the job’s priority.

The job with the highest WSJF is the next most important item to do.
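The comparison table can be mimicked in a few lines of Python. This is a sketch only: the Job class, its field names, and the three jobs with their ratings are invented for illustration and are not the values from Figure 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    user_business_value: int
    time_criticality: int
    rr_oe_value: int  # risk reduction-opportunity enablement value
    job_size: int

    @property
    def cost_of_delay(self):
        # Relative CoD: the sum of the three relative ratings.
        return self.user_business_value + self.time_criticality + self.rr_oe_value

    @property
    def wsjf(self):
        # WSJF: relative CoD divided by job size.
        return self.cost_of_delay / self.job_size


# Hypothetical backlog; in each column the smallest item is rated 1.
backlog = [
    Job("Feature A", 5, 1, 2, 3),
    Job("Feature B", 1, 8, 1, 1),
    Job("Feature C", 13, 3, 5, 13),
]

# Highest WSJF first: the top row is the next job to do.
ranked = sorted(backlog, key=lambda job: job.wsjf, reverse=True)
for job in ranked:
    print(f"{job.name}: CoD={job.cost_of_delay}, size={job.job_size}, WSJF={job.wsjf:.2f}")
```

With these made-up ratings, Feature B ranks first even though it has the lowest user-business value, because it is time critical and small. Feature C has the highest CoD but ranks last because of its size.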

One outcome of this model is that really big, important jobs have to be divided into smaller, pretty important jobs in order to make the cut against easier ways of making money (that is, small, low-risk jobs that your Customers are willing to pay for now). But that’s just Agile at work. Since the implementation is incremental, whenever a continuing job doesn’t rank well against its peers, you have likely satisfied that particular requirement sufficiently and can move on to the next job.

As we have described, another advantage of the model is that it is not necessary to determine the absolute value of any of these numbers. Rather, you only need to rate the parameters of each job against the other jobs from the same backlog.

Finally, because the backlog estimates should include only the job size remaining, frequent reprioritization means that the system will automatically ignore sunk costs.
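The sunk-cost point can be illustrated with a short sketch. The two jobs and their numbers are hypothetical, and for simplicity the sketch assumes that each job's relative CoD is unchanged at re-estimation time; in practice the CoD ratings would be revisited as well.

```python
def wsjf_remaining(cost_of_delay, remaining_size):
    """WSJF computed from the work still to do, not the work already done."""
    if remaining_size <= 0:
        raise ValueError("remaining_size must be positive")
    return cost_of_delay / remaining_size


# Hypothetical in-flight jobs: (name, relative CoD, original size, size completed).
in_flight = [
    ("Job X", 8, 13, 10),  # large job, mostly finished
    ("Job Y", 8, 5, 0),    # smaller job, not yet started
]

# Only the remaining size (original minus completed) enters the calculation.
scores = {
    name: wsjf_remaining(cod, original - done)
    for name, cod, original, done in in_flight
}
print(scores)
```

Job X was originally the larger job, but only 3 points remain, so it now outranks Job Y. The effort already spent on Job X never appears in the formula.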

A Note on Job Size as a Proxy for Duration

However, we do have to be careful about the proxy we choose for duration. If availability of resources means that a larger job may be delivered more quickly than some other item with about equal value, then we probably know enough about the job to use duration directly and get a more accurate result. (If I can get three people to mow my front lawn while I mow the back, then these items have about the same duration, but not the same cost.) But this is rarely necessary in the flow of value, in part because if there is some small error in selection, that next important job will make its way up soon enough.
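The lawn example can be worked through numerically. In this sketch the CoD ratings are hypothetical, and duration is assumed to be job size divided by the number of people, which presumes the work splits evenly among them.

```python
def wsjf_by_size(job):
    """WSJF using job size as the proxy for duration."""
    return job["cod"] / job["size"]


def wsjf_by_duration(job):
    """WSJF using an estimated duration (assumes work divides evenly among people)."""
    duration = job["size"] / job["people"]
    return job["cod"] / duration


# Hypothetical jobs of equal value: the front lawn is three times the size of the
# back lawn, but three people are available to mow it.
front_lawn = {"cod": 5, "size": 3, "people": 3}
back_lawn = {"cod": 5, "size": 1, "people": 1}

print("By size:    ", wsjf_by_size(front_lawn), wsjf_by_size(back_lawn))
print("By duration:", wsjf_by_duration(front_lawn), wsjf_by_duration(back_lawn))
```

Ranked by size, the back lawn wins clearly. Ranked by duration, the two jobs tie, which matches the observation that they take about the same time but do not cost the same.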


Learn More

[1] Leffingwell, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison-Wesley, 2011.

[2] Reinertsen, Don. Principles of Product Development Flow: Second Generation Lean Product Development. Celeritas Publishing, 2009.

Last update: 16 November, 2017