I went along to see him when he visited Princeton, mainly out of curiosity. It turns out that FogBugz is pretty awesome. Let me highlight a couple of its most endearing features:
– the built-in wiki has the best WYSIWYG editor I’ve seen
– uses backlinking between cases as a simple but handy way of aggregating related bugs
– really low barrier to entry for creating new bug reports
– beautifully integrated with email and discussion lists
Fine. But what really appealed to me was their emphasis on planning, and on tools that enable more accurate estimation of shipping dates (‘evidence-based scheduling’). Firstly, they make it as easy as possible to log one’s activities (‘currently working on blah’), so that the system can keep track of how much time is being spent on each case. Secondly, they require you to estimate up front how long you think each case is going to take.
Now, in principle one’s ship date should be determined by some function of the number of cases left in your milestones list, how long each is projected to take, and the number of developers at your disposal. In reality, of course, these estimates are probably wrong. One can look at the history of an individual’s estimates and compare them to how long things actually took. Perhaps I’m regularly off by a factor of 2 – so if I say something’s going to take four hours, it’s probably going to take the whole day. FogBugz builds a simple regression model, predicting actual duration from estimated duration for each developer, from which it can easily spot this factor-of-two trend.
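A minimal sketch of what such a per-developer correction might look like (FogBugz’s actual model isn’t public here, and the data and developer are made up): fit a single multiplicative factor by least squares through the origin, so actual ≈ factor × estimate.

```python
# Sketch: fit a per-developer correction factor from their track record.
# The history below is hypothetical, purely for illustration.

def fit_velocity_factor(estimates, actuals):
    """Least-squares slope through the origin: actual ~= factor * estimate."""
    numerator = sum(e * a for e, a in zip(estimates, actuals))
    denominator = sum(e * e for e in estimates)
    return numerator / denominator

# A developer's history: said 4h, took 8h; said 2h, took 4h; and so on.
estimates = [4.0, 2.0, 6.0, 1.0]
actuals = [8.0, 4.0, 12.0, 2.0]

factor = fit_velocity_factor(estimates, actuals)
print(factor)  # 2.0 -- reliably off by a factor of two

# Adjusted predictions for the same estimates:
adjusted = [factor * e for e in estimates]
```

With a real track record the points won’t sit exactly on a line, but the fitted slope still captures the systematic bias in that developer’s estimates.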
If I’m reliably off by a factor of two, then that’s actually a kind of good news. It means that the system can be confident of its adjusted estimates. It means that the system can guess when the team is *actually* going to meet its milestones, rather than when they think they’re going to, by using the predicted durations from its regression model.
But things are probably more complicated. What if someone’s estimates are sometimes off by a factor of two, sometimes a factor of four, sometimes even over-compensating in the opposite direction? The variability in the goodness of this person’s estimates is more problematic for our projections. FogBugz’s solution is to use a kind of non-parametric permutation statistic to estimate 5%, 50% and 95% confidence intervals for shipping dates. I think the idea is to create a bajillion synthetic potential futures for each developer, each created by shuffling the evidence from their track record of estimations. We create a bajillion projected ship dates, each a combination of a set of synthetic developer-futures. Then, when the team manager wants to know the likelihood of shipping in time for Christmas, he or she can tell where it lies amidst the distribution of bajillion projected ship dates – if Christmas is earlier than all but a handful of the bajillion projected ship dates, then maybe there’s less than a 5% chance of shipping on time. If Christmas lies somewhere in the middle of the bajillion-strong distribution, then the system might estimate a 50% chance of shipping on time.
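The shuffling idea above can be sketched in a few lines (a toy version under my own assumptions, not FogBugz’s actual algorithm): for each synthetic future, scale every remaining estimate by a velocity (actual/estimated ratio) drawn at random from the developer’s history, sum the results, and then read percentiles off the distribution of totals.

```python
import random

def simulate_ship_dates(remaining_estimates, velocity_history,
                        n_trials=10000, seed=0):
    """Monte Carlo over the developer's track record: each trial scales
    every remaining estimate by a randomly sampled historical velocity
    (actual hours / estimated hours) and sums to one synthetic total.
    Returns the sorted list of totals, in hours."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        total = sum(e * rng.choice(velocity_history)
                    for e in remaining_estimates)
        totals.append(total)
    totals.sort()
    return totals

# Hypothetical developer: sometimes 2x over, sometimes 4x, sometimes under.
velocities = [2.0, 4.0, 0.8, 2.5, 1.5, 3.0]
remaining = [4.0, 8.0, 2.0, 6.0]   # estimated hours of work left

totals = simulate_ship_dates(remaining, velocities)
p5, p50, p95 = (totals[int(q * len(totals))] for q in (0.05, 0.50, 0.95))

# Probability of finishing within some budget, say 40 hours of work:
p_on_time = sum(t <= 40.0 for t in totals) / len(totals)
```

An erratic estimator produces a wide gap between the 5% and 95% totals; a consistent one, even a consistently wrong one, produces a narrow gap, which is exactly why reliable bias is good news for the scheduler.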
This idea of scrambling the past to generate a distribution is at the center of most non-parametric statistical tests, which avoid making assumptions about normal distributions, homoscedasticity and other evil-sounding stuff that’s unlikely to be true in the real world. Anyway, props to Fog Creek for attempting to inject some low-level statistical intelligence in a way that might very plausibly work.
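For the curious, the scramble-the-labels idea in miniature: a two-sample permutation test on toy data, which asks how often a random relabelling of the groups produces a difference in means at least as extreme as the observed one. No normality or equal-variance assumption appears anywhere.

```python
import random

def permutation_test(a, b, n_perm=10000, seed=1):
    """Two-sample permutation test on the difference in means.
    Returns a two-sided p-value estimated from n_perm random shuffles."""
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # scramble the group labels
        perm_a = pooled[:len(a)]
        perm_b = pooled[len(a):]
        diff = sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b)
        if abs(diff) >= abs(observed):
            count += 1
    return count / n_perm

# Toy data: two groups with clearly different means.
a = [9.1, 8.7, 9.5, 8.9, 9.3]
b = [6.2, 5.8, 6.5, 6.0, 6.3]
p = permutation_test(a, b)   # small p: the gap is unlikely under shuffling
```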