Why is the information listed on your 'live update' digital displays so different / wrong compared to almost every feed plugging the 'data' that you are supposed to be releasing?

stripeycat · May 6, 2018, 9:30am

We are seeing MASSIVE discrepancies in the ‘data’ passed externally and what your live update displays actually show.

There are NO accurate listings on 8 routes. I have stood waiting for these for up to twenty minutes while they magically appear and disappear off both YOUR display and every other feed I am checking, then magically reappear at a ‘fixed’ time only to wander off again. What is going on here? Your telephone service claims to have no knowledge of what is going on and just plays dumb: we only report what we see, not our fault, it is the bus operator’s responsibility, etc. Kindly explain how your data can be so inaccurate. Or is this a question for the ICO?

briantist · May 7, 2018, 1:02pm

In their defence, there are a number of underlying issues which cause the problems we all experience using the APIs from TfL.

One problem is the reliability of GPS. Which isn’t to say that the network of US military satellites providing the signals is anything but totally reliable, but the system, which relies on microsecond timing adjustments to get highly accurate lat, long positions, was designed to target military buildings outside of cities: the lack of a clear view of the whole sky for a London bus in many places can cause known accuracy problems.

The second problem is the reliability of the 3G/4G phone networks for the uploading of GPS location and Oyster information data.

Another issue is that the on-street countdown timers are sent updates only when iBus detects a change to the arrival schedule. If the above two problems stop data coming in, the rest of the system just assumes that everything is fine, thanks.
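A rough sketch of how a feed consumer might guard against that "no news is good news" assumption: treat any prediction that has not been refreshed recently as suspect. The `Prediction` record, its field names and the 90-second threshold are all assumptions made for illustration; none of them come from iBus or the TfL API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical record: field names are illustrative, not the TfL API schema.
@dataclass
class Prediction:
    vehicle_id: str
    expected_arrival: datetime
    last_updated: datetime

def is_stale(pred, now, max_age=timedelta(seconds=90)):
    """True when the prediction has not been refreshed within max_age."""
    return now - pred.last_updated > max_age

def split_fresh_and_stale(preds, now, max_age=timedelta(seconds=90)):
    """Separate predictions that were refreshed recently from those that were not."""
    fresh = [p for p in preds if not is_stale(p, now, max_age)]
    stale = [p for p in preds if is_stale(p, now, max_age)]
    return fresh, stale
```

A display could then grey out or drop the stale entries instead of counting down on data that stopped arriving.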

On the buses, there is no ability to take in data (like Google does via Waze) about the state of traffic flow; the system just waits for GPS data to detect a delay rather than predict it.

Onto trains. One large issue here is that, unlike petrol/diesel machines, they are powered by electricity and can access a lot of torque from a standstill.

Elsewhere, I’ve been looking at the issues with Darwin, the system for National Rail trains. The concepts are very similar to those on the TfL API.

Over the last few years I’ve sat on many station platforms comparing the data in Darwin with what is happening at stations. I’ve also spent time with a GPS recording system monitoring the acceleration and deceleration characteristics of the trains.

One of the reasons that the “public Darwin” system rather blurs the time of a train by +/- 2 minutes is that there are problems with definitions that at first seem very clear.

One way of measuring the “arrival time” of a train is when the front of a train passes the monitoring device at the start of the platform.

However, the Class 345 trains can brake at 0.2m/s² but can cruise at 113m/s, so the train has to be braking for almost 60 seconds.

So getting from 113m/s to a standstill, with the 200-metre-long train fully occupying the platform so the doors can open, takes about 24 seconds.

Now, and only now, can the doors begin to open, and the passengers on the platform will stand aside to let those detraining get off. Once clear, the waiting customers on the platform board the train, and once they’re done the signal is given, the doors close and, when the interlocks are done, the train can start to move.

AC-powered trains - most of those in London - can draw a lot of power to get moving quickly and can get to 35m/s in the first 12 seconds, but then reach a limit and only accelerate at about 0.03m/s²

So, the “departure time” of a train could be the moment the wheels start moving, or when the whole 200m long train has moved off the platform, some 21 seconds later.
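The arithmetic behind braking and platform-clearing times is ordinary constant-acceleration kinematics. A minimal sketch follows; the function names are made up for illustration, and the example inputs are round numbers chosen to make the sums easy, not measurements of any particular train.

```python
import math

def time_to_stop(speed_ms, decel_ms2):
    """Seconds to brake from speed_ms to rest at a constant deceleration."""
    return speed_ms / decel_ms2

def stopping_distance(speed_ms, decel_ms2):
    """Metres covered while braking from speed_ms to rest."""
    return speed_ms ** 2 / (2 * decel_ms2)

def time_to_clear(length_m, accel_ms2):
    """Seconds for a train starting from rest to travel its own length,
    assuming constant acceleration (d = a * t^2 / 2)."""
    return math.sqrt(2 * length_m / accel_ms2)

# Illustrative inputs only (assumed values, not real train data):
print(time_to_stop(20, 0.5))       # 40.0 seconds
print(stopping_distance(20, 0.5))  # 400.0 metres
print(time_to_clear(200, 1.0))     # 20.0 seconds
```

The gap between "wheels start moving" and "whole train off the platform" is the `time_to_clear` figure, which is why the two definitions of departure time differ by tens of seconds.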

Working with the platform interface people, the customer perception of a train being “catchable” is the wheel-to-wheel dwell time: the period from the doors opening to the train moving from the platform.

Standing watching the Darwin data on the platforms leads to the conclusion that “control” thinks a train has arrived about 24-30 seconds before the doors open.

If we are defining the intervals between service departures, a train arriving at the platform isn’t a departure; it is an event some time before the train passes the trackside counter showing it has left.

To this end, we can look at many months of Darwin data to see how much time a “stopping” train takes in seconds between the recorded “darwin arrival” and “darwin departure”.
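As a sketch of that kind of analysis, assuming the Darwin records have already been reduced to pairs of (arrival, departure) timestamps for stopping trains at one platform. That input shape and the function names are assumptions for illustration; the real Darwin feed is structured differently and needs parsing first.

```python
from datetime import datetime
from statistics import median

def dwell_seconds(arrival, departure):
    """Seconds between the recorded arrival and recorded departure."""
    return (departure - arrival).total_seconds()

def median_dwell(records):
    """Median recorded dwell in seconds over (arrival, departure) pairs.

    Pairs with a missing timestamp, or a departure earlier than the
    arrival, are skipped. Returns None if nothing usable remains.
    """
    dwells = [
        dwell_seconds(a, d)
        for a, d in records
        if a is not None and d is not None and d >= a
    ]
    return median(dwells) if dwells else None
```

The median is used rather than the mean so that a handful of very long stops (a held train, a sick passenger) do not drag the typical figure upwards.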

This is why “live” train systems need to compensate to show when trains are in platforms, because the only way to show it is to interpolate from the live actual data.

–

Anyway, back to the data. Given the limitations of data collection (no GPS underground, so it is limited to trains with transponders passing trackside data-collection markers), it’s actually quite hard for even people in the control rooms to know when a platform is over-busy, a person is taken sick, or whatever else is actually delaying the trains.

Poggs · May 7, 2018, 1:44pm

Nice write-up! From my work on the National Rail/Network Rail side of things, trying to record train performance down to the second using train describer stepping is much more difficult than it seems.

There are a number of problems that are difficult to get around which I won’t go into here, but for customer information, it boils down to one thing - being pessimistic in forecasts on the operator’s side, and allowing a little more time on the customer’s side is much better than trying to be ‘more accurate’.

I fully expect the dot-matrix at the gateline at a station to clear trains off that I can’t reasonably reach. If they show a Morden train in 2 minutes and the next in 12, I might be tempted to run down escalators, push past people, generally be annoying because I think I can make the train!
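That "clear off the trains I can’t reach" rule is simple to express. A minimal sketch, assuming each departure is a (destination, minutes-until-departure) pair and that the walk time from the display to the platform is known; both the data shape and the function name are hypothetical.

```python
def reachable_departures(departures, walk_minutes):
    """Keep only departures a passenger at the display could still catch.

    departures: list of (destination, minutes_until_departure) pairs.
    walk_minutes: assumed walking time from the display to the platform.
    """
    return [
        (destination, minutes)
        for destination, minutes in departures
        if minutes >= walk_minutes
    ]

# With an assumed 3-minute walk, the 2-minute train is hidden.
board = [("Morden", 2), ("Morden", 12)]
print(reachable_departures(board, 3))  # [('Morden', 12)]
```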

But to @stripeycat’s point - can you give some concrete examples of where this is happening? Countdown is accurate to the minute or so when I use it on many routes (although on the 17, if I’m at Upper Holloway Station and the bus starts from the previous stop, it’s not always proper accurate, but that boils down to giving out timetabled information versus information you know is based on what has just happened).

mjcarchive · May 8, 2018, 12:33am

There appear to be two questions here. They might be related but it is not clear to me why they must be related.

The first is why different systems supposedly using the same feed show very different results. Another thread contrasts Google with TfL/Citymapper and suggests that Google is less accurate.

The second question is about the quality of the data in the feed itself. As well as the “bus not yet departed” situation, the system gets in a tizzy (technical expression there) when buses are on diversion, unless a special diversion schedule has been input, which is often not the case, even for diversions lasting a few weeks. Bus gets to last stop before diversion at 1100, say. It is then shown as expected at the first stop not served at 1102 (say) and it carries on being shown that way until it gets back to its line of route, which can take some time if the traffic is bad. Those two issues are not helpful to passengers but they are understandable.

What is not understandable is the kind of thing reported on the 222 where certain buses do not report on leaving Uxbridge while some other buses report phantom expected arrival times some way down the route when in reality they are (I think) parked up at Uxbridge. This sort of thing needs to be understood; is it the operator or its drivers doing the wrong thing or is it something wrong with the algorithms? If the source of the problem remains unknown, it will never be put right.

I would echo the call for examples from Stripeycat. In particular, why not identify the 8 routes messed up?

briantist · May 8, 2018, 9:01am

Thanks! Very kind.

I’m always trying to help TOCs with dwell-time issues and moving towards per-second timing, and I hope I don’t go to my grave before the whole of the UK rail network runs like that with moving block signalling. At least the Victoria Line works like that…!

Anyway…

I live in East Village, E20, near the start/end of bus route 97, which the iBus/Countdown system can’t seem to get right within 5 to 15 minutes, when it isn’t just bypassing the whole of E20 for building works. I have tried emails, turning up at TfL developer events and asking too many questions, and also asking customer support and London Travelwatch, and basically it just seems that iBus can’t cope with buses that are within a few stops of their starting point.

Looking at how buses get in and out of Stratford City Bus station with a coding hat on, there are multiple routes that can be taken by the buses (when QEOP events are on), and even when things are normal there are four sets of traffic lights with no bus priority within the first few stops. Being a little bus station (in London terms), there is no on-site controller ensuring comfort/fag breaks don’t go on too long!

So, countdown is great if the buses are every few minutes - because there are so many of them, and it’s also great when the bus is every 15 minutes, because these don’t get overestimated, but the system seems to fail with the 5 to 10 minute services because it’s trying to show travel times in a gridlocked city with random groups of tourists getting buses in groups of 30, oversized pushchairs competing with wheelchair users and the impossibly slow ramps the buses have for them…

The problem for TfL here is that they still need to think they are running a great bus service, which they do compared to almost everywhere else in the UK, when all the arguments have just boiled down to the cladding of the buses (so to speak) so they look great in commercial photo shoots and movies of London but don’t work as well as iBus/Countdown would make you think.

briantist · May 8, 2018, 9:20am

Good points.

Q1) Google has the same data as everyone else. However, it does suck in the timetables from transport providers and mixes them with the live data where that exists. Take the long-standing issue of the common TfL APIs not providing live departure data from terminal stations (Jubilee at Stratford) - Google just mixes in the timetables with the live data feed to fill in the holes.
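A rough sketch of that kind of gap-filling, to show the idea rather than how Google actually does it (which isn’t public here). It assumes both sources have been reduced to dictionaries keyed by a trip identifier; the identifiers, times and function name are made up for illustration.

```python
def merge_departures(timetable, live):
    """Prefer the live departure time; fall back to the timetable.

    timetable, live: dicts mapping a trip id to a departure time string.
    Returns a dict mapping trip id -> (time, source), where source is
    "live" or "timetable", so the consumer can tell which is which.
    """
    merged = {}
    for trip_id, scheduled in timetable.items():
        if trip_id in live:
            merged[trip_id] = (live[trip_id], "live")
        else:
            merged[trip_id] = (scheduled, "timetable")
    return merged

# Hypothetical trips: the second has no live data, so the schedule fills in.
timetable = {"trip-1": "09:00", "trip-2": "09:05"}
live = {"trip-1": "09:02"}
print(merge_departures(timetable, live))
# {'trip-1': ('09:02', 'live'), 'trip-2': ('09:05', 'timetable')}
```

Keeping the source label alongside each time matters: without it, a scheduled time that was never confirmed looks exactly like a live prediction.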

Is that better or worse than mathematically inducing information? Yes, if you’re a user of a transport system, and a big NO if you’re the provider of it. Being told that “everything is fine” invokes Garbage In, Garbage Out.

Q2) You raise an interesting point about “phantom” data. There seem to be two types of this happening. One is understandable - the “false destinations” you see on a stopping service such as the Overground from Watford Junction to Euston, which claims it’s going only to “South Hampstead” because there are fast trains direct to Euston.

I can name several bus routes that are messed up:

the 74 doesn’t stop at Chobham Academy Stop Y (3+ months)
the 308 doesn’t either (for three months or so now)
the 26, 48, 55 don’t stop at Hackney Road Columbia Road (Stop R) more than 3 months now

My concern is that TfL buses just don’t have the resources to track which of their buses aren’t stopping anywhere near some places and are providing all of us developers with poor data to work with.

I understand that bus drivers, who used to carry loads of ready cash around with them, didn’t want bus GPS locations to be made available to muggers; their unions were quite right about that. But who would try to mug a bus driver now, with everyone using Oyster/contactless?

I’m so sorry that the government is cutting the funding of TfL, and I’m happy to say that the developer community would probably do everything they can to help, but we need a set BACKSTAGE PASS and more than free, cold pizza.