Mea Culpa: Previous In-Service Times are Inaccurate

“The sorry bus” by waferboard is licensed under CC BY 2.0.

Due to a software error, the in-service time amounts computed for transit networks in previous posts are inaccurate. Posts where an access measurement is used on its own are not affected. When access is expressed in terms of the amount of in-service time, whether for the entire network or an individual route, this quantity may be incorrect. Specifically, these posts are impacted:

While I stand by the methodology used in these posts, the numeric results from the calculations are flawed, and the conclusions that they reached are not well-supported.

The Impact

This error caused the calculated amount of in-service time run by an agency’s vehicles to sometimes be higher than the actual amount. For networks produced directly from agency data, this occurred only in one specific case. When a transit trip is partially within the studied area, but makes some stops outside of it, the software should only include the portion that operates within the area. For trips originating outside of the area, and later entering it, the calculation of in-service time still included the time spent outside. Cross-boundary trips in the opposite direction, or trips with no service outside the area, were not impacted. For networks that were created by modifying an agency’s existing network, any route truncation would cause the amount of in-service hours to become inaccurate.

I first combined access with in-service time in order to compare Seattle’s transit service to that of six other cities. The comparison led me to believe that King County Metro and Sound Transit were worse than other agencies at allocating transit service within the city in ways that increased overall access. I have yet to rerun these calculations, and, for now, there is no reason to believe that this is true.

Motivated by this now-questionable conclusion, I proposed a redesigned King County Metro bus network for Seattle. For this proposal to be practical, it could not use more in-service hours than the amount already deployed within the city. Unfortunately, the software issue inflated this amount. Therefore, the proposed network—which appeared to be operable with no new funding sources—would actually require a nontrivial increase in Metro’s expenditures.

The software issue does not change the fact that the restructured network yielded a 20% access increase over current weekday service. However, this improvement could not be realized without additional funding. Because of that, I no longer consider the proposal, as currently envisioned, to be viable.

There were less-consequential impacts to the route productivity series. Routes with service outside of Seattle were undervalued in terms of journeys per in-service second and lost journeys per in-service second. In these measurements, only the journeys within Seattle were counted in the numerator, but some of their in-service time outside of the city increased the denominator. The rankings of routes by these measurements should be considered inaccurate until the in-service times for cross-boundary routes are recomputed.
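To make the direction of the distortion concrete, here is a small worked example. The figures are invented purely for illustration and do not describe any real route; the variable names are likewise hypothetical.

```python
# Invented figures for a hypothetical cross-boundary route.
journeys_in_seattle = 1800   # numerator: only journeys within the city
seconds_in_seattle = 3600    # in-service time actually spent in the city
seconds_outside = 1800       # time outside the city, wrongly included

# What the measurement should have been.
actual_rate = journeys_in_seattle / seconds_in_seattle

# What the flawed calculation produced: same numerator, inflated denominator.
seconds_counted = seconds_in_seattle + seconds_outside
reported_rate = journeys_in_seattle / seconds_counted

print(f"actual:   {actual_rate:.3f} journeys per in-service second")
print(f"reported: {reported_rate:.3f} journeys per in-service second")
```

With these numbers the reported rate is two thirds of the actual one, which is why cross-boundary routes ranked lower than they should have.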

How This Happened

In the software, transit trips are stored in a three-tiered manner:

A routing is a list of transit stops.
For each routing there can be many variants, which add arrival and departure timing information. This information is in the form of cumulative time offsets from the beginning of a trip.
For each variant there can be many trips, which contain a starting time, from which the cumulatives are offset, and a route name.

A trip’s base time is combined with the cumulatives in the variant it references, and the stops contained in the routing that the variant references, to produce the actual schedule for a transit vehicle.
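A minimal sketch of this three-tiered layout follows. It assumes Python dataclasses, with times held as integer seconds; the names `Routing`, `Variant`, `Trip`, and `schedule` are illustrative and are not the identifiers used in the actual software.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Routing:
    """Tier 1: an ordered list of transit stops."""
    stops: Tuple[str, ...]


@dataclass(frozen=True)
class Variant:
    """Tier 2: timing for a routing, as (arrival, departure) offsets in
    seconds from the beginning of a trip, one pair per stop."""
    routing: Routing
    cumulatives: Tuple[Tuple[int, int], ...]


@dataclass(frozen=True)
class Trip:
    """Tier 3: a starting time (seconds after midnight, an assumed unit)
    and a route name, applied to a variant."""
    variant: Variant
    base_time: int
    route_name: str


def schedule(trip: Trip) -> List[Tuple[str, int, int]]:
    """Combine base time, cumulatives, and stops into
    (stop, arrival time, departure time) rows."""
    variant = trip.variant
    return [
        (stop, trip.base_time + arrival, trip.base_time + departure)
        for stop, (arrival, departure) in zip(variant.routing.stops,
                                              variant.cumulatives)
    ]


# A three-stop routing, one variant, and one trip starting at 08:00.
routing = Routing(stops=("A", "B", "C"))
variant = Variant(routing, cumulatives=((0, 0), (300, 330), (600, 600)))
trip = Trip(variant, base_time=8 * 3600, route_name="Example")
timetable = schedule(trip)
```

Sharing one variant among many trips keeps the per-trip record small: each additional trip on the same pattern adds only a base time and a route name.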

By convention, the first arrival cumulative in every variant is zero. For the purposes of path finding, there is no need for this to be true. A trip could have a starting time far earlier than its arrival time at the first stop, as long as the cumulatives in the variant accounted for this. There are no advantages to this, though, so when trips are generated, they use the convention. If the convention holds, computing the amount of in-service time consists of going through every trip that is run within the studied area, in the time span under consideration, referencing its variant, and summing all the last departure cumulatives.
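Under the convention, the summation reduces to something like the sketch below. It assumes each variant is represented as a list of (arrival, departure) offsets in seconds, and that the trips have already been filtered to the studied area and time span; `in_service_seconds` is a hypothetical name.

```python
from typing import Sequence, Tuple

Cumulative = Tuple[int, int]  # (arrival, departure) offsets in seconds


def in_service_seconds(trip_variants: Sequence[Sequence[Cumulative]]) -> int:
    """Sum the last departure cumulative of each trip's variant.

    Only correct when every variant's first arrival cumulative is zero.
    """
    return sum(cumulatives[-1][1] for cumulatives in trip_variants)


short_variant = [(0, 0), (300, 330), (600, 600)]
long_variant = [(0, 0), (900, 900)]

# Two trips share the short variant; one trip uses the long variant.
total = in_service_seconds([short_variant, short_variant, long_variant])
```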

Unfortunately, two operations in the software modified trips so that they no longer followed the convention. As mentioned before, these were eliminating stops outside the area of study and creating truncated versions of routes when modifying an existing network. When stops were eliminated from the beginning of a trip, for either reason, the trip would be left in an unconventional form where the first arrival cumulative was no longer zero. Thus the last departure cumulative would no longer signify how much in-service time the trip consumed.
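The effect can be reproduced in a few lines. This is a sketch of the flawed behavior as described above, not the original code; `drop_leading_stops` is a hypothetical name and the timings are invented.

```python
def drop_leading_stops(cumulatives, count):
    """Remove the first `count` stops without adjusting the remaining
    offsets. This mirrors the flawed behavior described above."""
    return cumulatives[count:]


# Invented trip: two stops outside the area (20 minutes), then three
# stops inside it (10 minutes). Offsets are (arrival, departure) seconds.
full_trip = [(0, 0), (600, 600), (1200, 1200), (1500, 1500), (1800, 1800)]

inside_only = drop_leading_stops(full_trip, 2)

first_arrival = inside_only[0][0]    # no longer zero
reported = inside_only[-1][1]        # what the summation would count
actual = inside_only[-1][1] - inside_only[0][0]  # time spent in the area

print(f"first arrival cumulative: {first_arrival}")
print(f"reported in-service seconds: {reported}")
print(f"actual in-service seconds:   {actual}")
```

The 1,200 seconds spent outside the area remain in the reported figure, which is exactly the overcount seen for trips that enter the studied area partway through.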

I realized that this was happening when I recently started working on a second version of the restructured network proposal. In the first version, I created a network that truncated all transit service that extended outside of Seattle. This time, I wanted to deal with cross-boundary service more accurately. Routes that mainly served to get people from outside the city into it would no longer be hypothetically eliminated, but truncated at a point where onward transfers would be available. When I looked at the in-service hours by route, for a calculation that only considered Seattle, I expected to see very low in-service times for routes that briefly crossed into the city before terminating. Instead, I saw amounts that accounted for much more time.

What’s Next

I fixed the software so that new calculations and new network modifications properly create trips and variants that match the convention, and thus yield appropriate in-service times. Existing calculations will remain inaccurate. If this were a piece of production software with paying customers, I would have made the in-service time computation tolerant of the abnormal data stored by previous calculations. This would not be difficult; it would involve subtracting the first arrival cumulative from the last departure cumulative when performing the in-service hours summation. However, this would add an extra computational step that is not necessary for the correctly-stored calculations now being produced. I’ve added a note to previous posts indicating that the data and conclusions may not be accurate.
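Both approaches can be sketched briefly. The tolerant summation follows the description above directly. The `rebase` function is an assumption about what restoring the convention involves: it shifts the offsets so the first arrival is zero and moves the trip's base time forward by the same amount, leaving the absolute schedule unchanged. Both function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Cumulative = Tuple[int, int]  # (arrival, departure) offsets in seconds


def tolerant_in_service_seconds(
        trip_variants: Sequence[Sequence[Cumulative]]) -> int:
    """Sum last departure minus first arrival, so variants that break the
    convention are still counted correctly."""
    return sum(c[-1][1] - c[0][0] for c in trip_variants)


def rebase(base_time: int,
           cumulatives: Sequence[Cumulative]) -> Tuple[int, List[Cumulative]]:
    """Return a (base_time, cumulatives) pair that follows the convention.

    Assumed approach: subtract the first arrival cumulative from every
    offset and add it to the base time, so absolute times do not change.
    """
    shift = cumulatives[0][0]
    return base_time + shift, [(a - shift, d - shift) for a, d in cumulatives]


unconventional = [(1200, 1200), (1500, 1500), (1800, 1800)]
conventional = [(0, 0), (300, 300)]

# 600 seconds from the truncated trip plus 300 from the conventional one.
tolerant_total = tolerant_in_service_seconds([unconventional, conventional])

# Restoring the convention for a trip that started at 08:00.
new_base, new_cumulatives = rebase(8 * 3600, unconventional)
```

The tolerant version costs one extra subtraction per trip, which is the additional step weighed above; rebasing at creation time avoids it by keeping the stored data conventional.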

I still consider it worthwhile to experiment with revising King County Metro’s bus service within Seattle, regardless of whether the present network is particularly inefficient from an access perspective. I must first reestablish the budget of in-service hours, which will be lower than the figures that I have used in the past. It is unlikely that a network subject to this budget will achieve the same 20% access increase as before, but it strikes me as even more unlikely that there is no room to improve the current network at all.