Hi, I’m new to this so forgive me if it’s been asked before. I have searched the forum though, particularly for advice on vehicleIDs that are “000” and the situation on the Met line right now doesn’t seem to fit the answers.
The problem relates to the ‘Arrivals’ API (https://api.tfl.gov.uk/Line/metropolitan/Arrivals) and it affects not just the app I’m tinkering with but all the commercial services, TfL and Citymapper.
Trains start from Aldgate with ‘sensible’ vehicleIDs, a clear destination and direction, and send out arrival predictions for stops on their route. When they get to Harrow on the Hill, they disappear and are replaced by an entry with a vehicleID of 000, no destination and no direction, which puts out predictions for every stop beyond Harrow, on every branch. This set of impossible promises gets sorted out in time, usually after about five minutes, and the train usually reappears, sometimes with a new vehicleID, sometimes not. In particular the Amersham/Chesham trains may not make their mind up as to where they are going until the last minute, making the Amersham and Chesham arrival predictions quite random.
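A minimal sketch of how a client might separate these two kinds of prediction. It assumes the Arrivals response is a JSON array of objects carrying `vehicleId`, `destinationName` and `direction` fields; those field names, and the helper names `fetch_arrivals`, `is_anonymous` and `split_predictions`, are assumptions made for illustration rather than anything confirmed in this thread.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

# URL quoted in the post above.
ARRIVALS_URL = "https://api.tfl.gov.uk/Line/metropolitan/Arrivals"


def fetch_arrivals(url=ARRIVALS_URL):
    """Download the current predictions (assumed to be a JSON array)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def is_anonymous(prediction):
    """True for a prediction with a 000 id and no destination or direction.

    Field names are assumptions about the response shape.
    """
    return (
        prediction.get("vehicleId") in (None, "", "000")
        and not prediction.get("destinationName")
        and not prediction.get("direction")
    )


def split_predictions(predictions):
    """Group attributable predictions by vehicleId; collect the rest."""
    by_vehicle = defaultdict(list)
    anonymous = []
    for prediction in predictions:
        if is_anonymous(prediction):
            anonymous.append(prediction)
        else:
            by_vehicle[prediction.get("vehicleId")].append(prediction)
    return dict(by_vehicle), anonymous
```

Calling `split_predictions(fetch_arrivals())` would return the predictions that can be tied to a vehicleID, plus a list of the ones that cannot be attributed to any particular train. A "000" entry that still has a destination and direction stays in the first group, since it can at least be told apart from its neighbours.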
This obviously doesn’t affect the passengers: the Chesham train goes to Chesham. However, if you look at the TfL or Citymapper boards, you’ll see train arrivals appearing and disappearing without warning or explanation, and this is part of the reason why.
It might have something to do with driver changes at Harrow. It may just be the way that things are and we have to live with it. It does highlight something that would be useful in the Arrivals API (for a number of reasons) and that is some kind of ‘object permanence’. At the moment there’s no way of identifying a physical train - vehicleIDs could do this, but they change and are often duplicated, hence the problems above.
Would it be possible to add a field that gives a uniqueID for a physical train so it can be tracked (and spurious predictions ignored) over its journey?
From what I understand of the Trackernet source system, the setId (which we map to vehicleId in the API) is derived from the signalling systems. A value of 000 is shown where the setId cannot be derived from the signalling feed. Otherwise a 3 digit code relating to the timetabled service is assigned.
You often see a 000 when a line is disrupted and the service controllers adopt a more manual way of routing trains to get service patterns back to normal.
However in the case of the Metropolitan, I’ve been advised that setId cannot always be guaranteed, especially for trains that are located north of Preston Road.
I believe this will improve when the 4LM signalling is fully rolled out but I don’t think there is much that can be done in the current signalling configuration.
Thanks James, that’s helpful. I’m thinking aloud, but might it be possible to just ignore the update to a “000” id if you already have a valid vehicleID, particularly if that update is associated with the absence of a destination, direction etc.? By all means indicate that the status has changed, but keep the identifying information.
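The suggestion above could be sketched roughly as follows. This assumes the system applying the update already knows which earlier record it belongs to, which is exactly what a client of the Arrivals API cannot see. The function name `merge_update`, the `identityStale` flag and the field names are all hypothetical.

```python
ANONYMOUS_ID = "000"
# Assumed names for the identifying fields.
IDENTITY_FIELDS = ("vehicleId", "destinationName", "direction")


def _is_set(value):
    """A value counts as set if it is neither missing, empty nor 000."""
    return value not in (None, "", ANONYMOUS_ID)


def merge_update(previous, update):
    """Apply an update but keep the last known identity if the update
    has lost it.

    `previous` is the earlier record for the same train (or None);
    `update` is the new record. Both are plain dicts.
    """
    merged = dict(update)
    merged["identityStale"] = False
    if previous is None:
        return merged
    lost_id = not _is_set(update.get("vehicleId"))
    if lost_id and _is_set(previous.get("vehicleId")):
        for field in IDENTITY_FIELDS:
            if field in previous and not _is_set(update.get(field)):
                merged[field] = previous[field]
        # Flag that the identity was carried forward, not freshly reported.
        merged["identityStale"] = True
    return merged
```

Non-identifying fields such as the predicted time always come from the update, so the prediction itself stays fresh while the identity is retained and flagged.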
The problem is that these “000” trains beyond Harrow all look the same. It’s not clear if the arrival predictions are coming from one train or more than one, so it’s a guess whether any particular entry is an update to an earlier prediction - and if so, which one … or a new prediction from a new service.
While writing this there was a signal failure between Wembley Park and Aldgate, which caused all the trains on that section to go “000” for a bit, but unlike the ones in the outer suburbs, they had direction and destination information, so it was possible to tell them apart (they’re back to normal now).
I don’t think we can do a lot with the data without upgrading the signalling and the ingest of its data.
The vehicleId is not actually the train’s physical number. It relates to a trip in the timetable and not the physical unit assigned to that trip. I think that there are some extra fields that could be consumed from the source data that may help distinguish the trains. e.g. http://cloud.tfl.gov.uk/TrackerNet/PredictionDetailed/N/BNK
I think we looked a while back at bringing these fields into the Unified API, but I’m not sure what happened to this initiative (probably Covid!). I’ll try to see if that’s still on the backlog.
Thanks again, I appreciate it. On a quick look at that XML file, it looks like ‘LeadingCarNo’ would totally solve the problem. TrainID too, but I’m not so sure what that is.
Graham
The Metropolitan Line seems to have particular issues but - hey - it’s outside my window so I’m plugging on. In the TrackerNet detail feed the TrainIDs are very peculiar, flipping between two values every ten seconds or so. It’s almost as if there’s some extra data hidden in there.
Assuming that things on the line aren’t going to change, could I ask that both LeadingCarNo and LCID are exposed in the Unified API … please? TrackID would be really nice too.
In the meantime I’m not sure whether or not it’s worth putting in effort on the TrackerNet feeds (which are quite fun TBH) since they are threatened with retirement. Any timeframe on that?
I know it’s bad form to reply to your own posts, but I’m laying down a trail for anyone who suffers the same difficulties.
The TrackerNet data doesn’t contain anything that can reliably identify a train. In theory there are plenty of markers that should work, but they don’t:
SetID - often set to 000 and even if set, can be duplicated on different trains at the same time
LCID - promising, but set to 0 about 30% of the time
Leading Car - as for LCID, when it works it’s great, but it’s zero almost half the time
TrainID - This is where it really gets interesting. There are two train IDs for each train. These are always set (it seems to happen automatically). They always start and end with 1 (at least on the Met line). In the arrivals stream they alternate, roughly every 8 seconds, and they don’t appear to be related.
This leads me to speculate that they are generated by physical devices, perhaps one at the front and one at the end of each train. Evidence for this is that they’re sometimes displaying different destination information and not infrequently, different track data.
This looks promising since the values are always set (presumably automatically) although it’s a bit of a coding challenge - involving tuples and dictionaries and whatever. However there’s a further riddle. At Harrow they both disappear and then reappear two minutes later in what is obviously the same train - with different numbers. So … perhaps they are not physical devices … or do they move with driver and guard? On the incoming section between Harrow and Finchley Road it’s possible for one to change and the other to stay the same. Curious. If anyone can tell me what’s going on I’d be grateful.
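One way to approach the ‘tuples and dictionaries’ part is to treat two IDs that swap places in the same slot between consecutive polls as aliases of one train. The sketch below is only a heuristic. It assumes each poll has been reduced to a dict mapping a ‘slot’ to the TrainID seen there. The slot key is hypothetical: any hashable tuple that should stay with one train from one poll to the next, for example station, platform and position in the arrival order. The function name `link_alternating_ids` is made up, and nothing here bridges the gap at Harrow, where both IDs change at once.

```python
from collections import defaultdict


def link_alternating_ids(snapshots):
    """Map each TrainID to the frozenset of IDs believed to be one train.

    `snapshots` is a time-ordered list of dicts, each mapping a slot
    (a hashable tuple chosen by the caller) to the TrainID seen there.
    """
    parent = {}

    def find(train_id):
        # Union-find lookup with path halving.
        parent.setdefault(train_id, train_id)
        while parent[train_id] != train_id:
            parent[train_id] = parent[parent[train_id]]
            train_id = parent[train_id]
        return train_id

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Register every ID so unpaired ones still appear in the result.
    for snapshot in snapshots:
        for train_id in snapshot.values():
            find(train_id)

    # An ID change within the same slot between polls links the two IDs.
    for earlier, later in zip(snapshots, snapshots[1:]):
        for slot, train_id in later.items():
            previous = earlier.get(slot)
            if previous is not None and previous != train_id:
                union(previous, train_id)

    groups = defaultdict(set)
    for train_id in list(parent):
        groups[find(train_id)].add(train_id)
    return {tid: frozenset(groups[find(tid)]) for tid in list(parent)}
```

The obvious weakness is that when one train leaves a slot and the next one takes it over, the two would be linked by mistake, so the slot key and the polling interval would need some care.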
(this started because I want to help a fellow teacher who commutes from Acton to Moor Park. They want to know whether it’s better to cross the line at Rayners Lane, go to Harrow and change again, or just walk to North Harrow. Simple … I thought :-))
Thanks, that’s an interesting article and explains the ‘wrong kind of leaves’ canard, however the devices discussed in the article are all fixed to the track and don’t move with the trains. Whatever is generating the information that I’m seeing is moving.
I think that there are often two overlaid systems in place: one that does the signalling and one that provides the customer service information.
There are systems (Liz Line central section, HS1) where they’re definitely using GSM-R, which contains both the train identifier and the train location, but I’m not sure how much systems outside HS1 actually make use of this.
I did note, for example, that when Liz Line trains were being run between Stratford and Paddington on the previous weekend, terminating on Platforms 5 and 8 at Stratford and turning around there, the CIS got the platform numbers the wrong way around.
This was, I assume, because the signalling was being done manually (Stratford is just outside the core signalling area) but the CIS systems had only the “temporary timetable”.
I guess, in the long run, TfL will move to a system where they can provide the signalling information when it contains the actual train IDs, rather than relying on CIS software that is matching the timetable to what is happening.
You’ll know when you no longer see “Check destination on front of train”