Detecting individual trips and "real" arrival time by vehicleId?

puntofisso · January 14, 2024, 3:11pm

Hi there,

PROBLEM
I’m pondering the best practice for identifying real trips by vehicle, along with the relevant timestamp at each station.
E.g. I want to record that vehicle 205 on the Victoria Line started a new trip at Walthamstow Central at 8:01, stopped at Finsbury Park at 8:10, ended this trip at Brixton at 8:50.

POTENTIAL SOLUTION
Now, assuming I haven’t missed an endpoint with actual arrival time, all we have is expected arrival times in

Then, one could work backwards:

regarding the timestamp of arrival at any station, it’s a matter of updating regularly and saving the last timestamp recorded
regarding the identification of a trip, the best way I’ve found thus far is to check if the train inverts the direction inbound/outbound and/or if it reaches the terminus.
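To make the second point concrete, here is a minimal sketch of the splitting logic, assuming the arrivals for one vehicle have already been collected in time order. The `Stop` record, its field names, the direction labels and the `termini` set are all hypothetical names chosen for illustration, not anything the API provides in this shape.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Stop:
    station: str
    arrived: str    # recorded arrival time, e.g. "08:10"
    direction: str  # hypothetical label, e.g. "inbound" / "outbound"


def split_into_trips(stops: List[Stop], termini: Set[str]) -> List[List[Stop]]:
    """Split one vehicle's time-ordered stops into trips.

    A trip is closed when the direction flips or when the vehicle
    arrives at a station listed in `termini`.
    """
    trips: List[List[Stop]] = []
    current: List[Stop] = []
    for stop in stops:
        # A change of direction starts a new trip.
        if current and stop.direction != current[-1].direction:
            trips.append(current)
            current = []
        current.append(stop)
        # Arriving at a terminus ends the trip, unless the trip only
        # just started there.
        if stop.station in termini and len(current) > 1:
            trips.append(current)
            current = []
    if current:
        trips.append(current)
    return trips
```

Because only arrivals are recorded, a trip that begins right after a terminus arrival will start with the first station after that terminus. Trains going off service or turning short would need extra handling on top of this.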

Does this work?
Is this the only way, or is there anything better that could be done?

Thanks,
Giuseppe

briantist · January 15, 2024, 7:22am

Welcome @puntofisso

Our system uses a cron job to capture the necessary TfL data every 5 minutes.

By storing the information every 5 minutes you can capture enough data to be able to show boards (because the arrival times are actual times, not offsets).

If you do this, you should be able to work out the differences between captures and build up a list of which train called where. 5 minutes would allow for the data to appear often enough to capture all the stop data.
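As a rough sketch of the storage side only, each capture could be appended to a JSON Lines file together with the time it was taken. The function names and the file layout here are assumptions for illustration; fetching the data and scheduling the job are left out.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import List


def append_snapshot(path: Path, predictions: List[dict], polled_at: datetime) -> None:
    """Append one capture (the poll time plus the raw predictions) as one line."""
    record = {"polled_at": polled_at.isoformat(), "predictions": predictions}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_snapshots(path: Path) -> List[dict]:
    """Read every stored capture back, oldest first."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Keeping the raw predictions means the "which train called where" list can be rebuilt later with different rules, without having to capture the data again.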

Of course, the TfL system is an “arrival times” system (not departures) and only shows trains once they have left their origin.

Good luck

puntofisso · January 15, 2024, 9:35pm

Hi @briantist,
thanks for your reply

Just for context, I do have a similar cronjob, except it runs every 30 seconds (the maximum frequency allowed).

I could be very wrong, but either I haven’t explained myself properly or things don’t work the way I understand from your description. Do you mean that a prediction for a past transit should re-appear at some point?

That doesn’t seem to be true. For example, I’m following a train that transits through Finsbury Park. At one point, I see it entering Finsbury Park in the API, which says

"vehicleId": "212",
"naptanId": "940GZZLUFPK",
"stationName": "Finsbury Park Underground Station",
"timeToStation": 9,
"currentLocation": "At Platform",

When I query the API again 30 seconds later, the entry for Finsbury Park station has disappeared, and I only get predictions for the next stations.

So how are you suggesting I could get the actuals from this?

By the way, is there any documentation anywhere to help understand how and when the predictions are generated?

Thanks

briantist · January 16, 2024, 7:11am

@puntofisso

OK, this is the TfL “arrivals system” in operation. It’s 100% unable to show you departures (unlike NRE Darwin which shows both so you can work out things like true dwell times).

The moment a train is in the station, it disappears from the data flow: once arrived, it no longer has any need to provide you with any more information.

However, you can just assume that the last predicted arrival time was the time it arrived. This holds even if you poll less frequently, as per my 5-minute suggestion: it would take a substantial systematic failure for Underground trains not to arrive, and when they do arrive, their arrival times will be updated in the feed.
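A minimal sketch of that "last prediction wins" rule, assuming each poll yields a list of dicts carrying the `vehicleId`, `naptanId` and `timeToStation` fields quoted earlier in the thread, with `timeToStation` in seconds. The `ArrivalTracker` class and its attribute names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (vehicleId, naptanId)


class ArrivalTracker:
    """Keep the latest predicted arrival per vehicle and station.

    When a prediction seen in the previous poll is missing from the
    current one, its last estimate is recorded as the arrival time.
    """

    def __init__(self) -> None:
        self.pending: Dict[Key, datetime] = {}
        self.arrived: List[Tuple[str, str, datetime]] = []

    def update(self, polled_at: datetime, predictions: List[dict]) -> None:
        seen: Dict[Key, datetime] = {}
        for p in predictions:
            key = (p["vehicleId"], p["naptanId"])
            # Estimated arrival = poll time + seconds to station.
            seen[key] = polled_at + timedelta(seconds=p["timeToStation"])
        # Anything pending last time but absent now is treated as arrived.
        for key, eta in self.pending.items():
            if key not in seen:
                self.arrived.append((key[0], key[1], eta))
        self.pending = seen
```

A prediction can also vanish for reasons other than arrival (a train taken out of service, say), so the recorded times are estimates; the longer the polling interval, the older the last estimate may be.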

The Liz Line and Overground are on Darwin so you can see the full to-the-second data there if you want those lines.

(urgh, 96 second dwell time!)

But for the tubes (and DLR, but without indexes) it’s arrivals all the way: Working Timetables (WTT)

puntofisso · January 16, 2024, 9:44pm

Thanks @briantist, that makes sense

I’ll keep working on deriving the arrival time as you suggest; it seems good enough for my purposes.

I’m still unsure what’s the best way to identify individual trips – I’ll attempt to use any “inversion” of direction, but I need to look into the data to understand if there are nontrivial edge cases (trains going off service, test trains, etc.).

briantist · January 17, 2024, 8:25am

@puntofisso

Bakerloo. Except during engineering works, all southbound trains go all the way to E&C, but northbound “often terminate at Queen’s Park or Stonebridge Park”.

Central Line. Some can reverse in platform at White City, but otherwise serve West Ruislip and Ealing Broadway in the west…

In the east, trains go to either Hainault or Epping but can turn at other places. Woodford to Hainault is run as a shuttle.

Circle/H&C are two versions of the same physical line, but the services all run end-to-end.

District runs as

Wimbleware - Wimbledon to Edgware Road
Edgware Road to Olympia - but not very often
Ealing Broadway to Upminster but can turn at Tower Hill
Richmond to Upminster but can turn at Barking
Wimbledon to Upminster but can turn at Tower Hill and Barking
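For trip detection, patterns like these could be written down as data so the code knows where a trip may legitimately end. The sketch below encodes only the District patterns listed above; the `DISTRICT_PATTERNS` structure and the `possible_trip_ends` helper are hypothetical names for illustration.

```python
from typing import Dict, List, Set

# The District service patterns as listed above: the two ends of each
# pattern, plus intermediate places where trains can turn back.
DISTRICT_PATTERNS: List[Dict[str, List[str]]] = [
    {"ends": ["Wimbledon", "Edgware Road"], "turn_at": []},
    {"ends": ["Edgware Road", "Olympia"], "turn_at": []},
    {"ends": ["Ealing Broadway", "Upminster"], "turn_at": ["Tower Hill"]},
    {"ends": ["Richmond", "Upminster"], "turn_at": ["Barking"]},
    {"ends": ["Wimbledon", "Upminster"], "turn_at": ["Tower Hill", "Barking"]},
]


def possible_trip_ends(patterns: List[Dict[str, List[str]]]) -> Set[str]:
    """Collect every station where a trip on these patterns can end."""
    ends: Set[str] = set()
    for pattern in patterns:
        ends.update(pattern["ends"])
        ends.update(pattern["turn_at"])
    return ends
```

A set like `possible_trip_ends(DISTRICT_PATTERNS)` can then serve as the list of stations at which a trip may be closed, rather than relying on the two ends of the line alone.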

Jubilee - 99% of trains run Stanmore to Stratford, 1% turn at Wembley Park

Met - like this (note that Watford trains going to Baker Street never go all the way to Aldgate).

In peak, the Met trains use the “fast lines” to skip stops in the “direction of flow” so can physically skip stations.

Northern Line

Battersea Power Station (or Kennington) to Edgware via Camden Town
Battersea Power Station (or Kennington) to High Barnet via Camden Town
Battersea Power Station (or Kennington) to Mill Hill East via Camden Town
Morden to Edgware via Bank and Camden Town
Morden to High Barnet via Bank and Camden Town
Morden to Mill Hill East via Bank and Camden Town
Morden to Edgware via Charing Cross and Camden Town (not often)
Morden to High Barnet via Charing Cross and Camden Town (not often)
Morden to Mill Hill East via Charing Cross and Camden Town (not often)

Piccadilly has two branches in the west and most services go to the airport. The unidirectional loop is an extra thing for your code to do!

Victoria line - every 90 seconds trains run end to end. The Victoria line uses stepping back to allow all trains to run end to end using different drivers for each run.

London Overground - like this…

Liz runs like this

with these timings

Good luck!

puntofisso · January 17, 2024, 5:25pm

That’s amazing, thanks @briantist!