Observations and Questions About London Transport Real-Time Data APIs

Hello Everyone :hugs:,

I’m currently working on a project that uses real-time data from several public transport operators to improve the commuting experience in London. I’ve looked into a couple of the existing APIs, such as TfL’s Unified API, but I still have a few questions and ideas that I’m hoping this community can address.

Real-Time Disruptions Data: Disruption data is available, but I’ve found that it sometimes lacks the detail needed to make informed decisions. Is it possible, for example, to get more precise real-time information about particular disruption types or causes (such as signal failures or accidents)? :thinking: This would significantly improve my application’s predictive power.

Crowding Data: Crowding information is currently available through the API, but I’m wondering whether there are any plans to increase the data’s coverage or accuracy. I’d be really interested in crowding levels for buses or the DLR comparable to those for the London Underground.

Use of API Rate Limits: Can anyone share how they’ve managed API rate limits when scaling an application? :thinking: I’m thinking about building a service that might be heavily used, so I want to make sure everything runs smoothly.

Future Plans: Will TfL be releasing any new data sources or API functionality in the near future? :thinking: It would be beneficial to keep up with these advancements in order to plan for future project improvements.

https://techforum.tfl.gov.uk/t/missing-london-overground-timetable-info/salesforce-developer

Any advice, experiences, or insights would be much appreciated. Also, if anyone knows of people working on related projects, perhaps we could collaborate!

Thank you :pray: in advance.

Hi Joy!

I am going through the API and timetables as well.

For rate limits, I think you can reach out to the TfL team to present your case, and they can enable higher rate limits for you if you need them.

Thanks,
Adarsh

Welcome @Joynic !

There is data in https://api.tfl.gov.uk/line/mode/tube,overground,dlr,elizabeth-line,tram/status (JSON) and Stations, lifts, escalators, works & closures - Transport for London (HTML, but well structured) but the level of information doesn’t list causes.
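A minimal Python sketch of reading that status feed, for anyone who wants a starting point. It pulls out each line's status description plus any free-text `reason` attached to it (assumed to be prose rather than a structured cause code, in keeping with the point above). The field names `lineStatuses`, `statusSeverityDescription` and `reason` are assumptions about the response shape, and `SAMPLE` is a made-up payload for illustration, not real API output.

```python
import json

# Made-up payload shaped like the assumed /line/mode/.../status response.
SAMPLE = json.loads("""
[
  {"id": "victoria", "name": "Victoria",
   "lineStatuses": [{"statusSeverityDescription": "Good Service"}]},
  {"id": "dlr", "name": "DLR",
   "lineStatuses": [{"statusSeverityDescription": "Minor Delays",
                     "reason": "DLR: Minor delays (illustrative text only)."}]}
]
""")


def summarise_status(lines):
    """Return (line name, status description, reason or None) tuples."""
    rows = []
    for line in lines:
        for status in line.get("lineStatuses", []):
            rows.append((
                line.get("name", line.get("id")),
                status.get("statusSeverityDescription"),
                status.get("reason"),  # None when no reason text is present
            ))
    return rows


def disrupted(lines):
    """Keep only entries whose status is not 'Good Service'."""
    return [row for row in summarise_status(lines) if row[1] != "Good Service"]


for name, description, reason in disrupted(SAMPLE):
    print(f"{name}: {description} ({reason})")
```

To try it against the live feed, the JSON body fetched from the URL above can be parsed with `json.loads` and passed to `summarise_status` in place of `SAMPLE`.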

Real-Time Disruptions Data

Any service whose data you can get through Darwin (Overground and Liz Line) will give you live Darwin reason codes.

https://wiki.openraildata.com/index.php/Darwin:Late_Running_reason_codes_and_text

There are usually two codes for each train: one for the cancellation reason and one for the delay reason. The former might be caused by the latter. If you have a rid, there is a Darwin Historical API that can give you the codes for TfL trains going back several years.

Crowding Data

The data is 96 data points per station per day: each 15-minute slot has a relative busyness value. 100% = the maximum for the given station, 0% = the quietest for the given station.
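To make the 96-slot layout concrete, here is a small Python sketch that maps a time of day to its 15-minute slot and looks up the value for that slot. It assumes slot 0 starts at 00:00 and that a day's profile arrives as a plain list of 96 numbers; both are assumptions for illustration, and the function names are hypothetical rather than anything from the API.

```python
from datetime import time

SLOT_MINUTES = 15
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES  # 96 quarter-hour slots


def slot_index(t: time) -> int:
    """Index (0-95) of the slot containing t, assuming slot 0 starts at 00:00."""
    return (t.hour * 60 + t.minute) // SLOT_MINUTES


def slot_label(index: int) -> str:
    """Human-readable 'HH:MM-HH:MM' label for a slot index."""
    if not 0 <= index < SLOTS_PER_DAY:
        raise ValueError(f"slot index out of range: {index}")
    start = index * SLOT_MINUTES
    end = (start + SLOT_MINUTES) % (24 * 60)  # last slot wraps to 00:00
    return f"{start // 60:02d}:{start % 60:02d}-{end // 60:02d}:{end % 60:02d}"


def busyness_at(day_profile, t: time):
    """Look up the relative busyness value for the slot containing t."""
    if len(day_profile) != SLOTS_PER_DAY:
        raise ValueError("expected 96 values, one per 15-minute slot")
    return day_profile[slot_index(t)]
```

For example, `slot_index(time(8, 37))` falls in the slot labelled `08:30-08:45`, so a lookup at 08:37 reads the 35th value in the day's list.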

But yes, it doesn’t account for events like the Notting Hill Carnival and soccer matches, which is a shame.

Use of API Rate Limits

Yes, these are VERY easy to hit when you’re running tests in an app. I personally use a Postgres jsonb column to cache responses that won’t change much (like the Crowding Data) and also use memcached on the server to keep a local copy.
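The caching idea can be sketched in a few lines of Python. This is an in-process stand-in for memcached or a Postgres table, not the setup described above: `TTLCache`, `fetch` and `ttl_seconds` are hypothetical names, and the sketch is not thread-safe. Repeated requests for the same key within the time-to-live are served from memory, so only the first one reaches the upstream API.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache fetch(key) results for ttl_seconds; single-threaded sketch."""

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch      # function that calls the upstream API
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # still fresh: no upstream request
        value = self._fetch(key)  # missing or expired: fetch once and store
        self._store[key] = (now, value)
        return value
```

Slow-changing data such as the crowding profiles can take a long TTL (hours or days), while live status probably wants something closer to a minute; the right values depend on how stale the app can afford to be.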

Future Plans

Let’s ping @jamesevans for this one

Darwin code examples… [image]

Hello and welcome to the forum! Thank you for your feedback and suggestions.

We would love to provide more detailed disruptions data and are certainly exploring ways to achieve this. If we provided data similar to what @briantist mentioned (i.e. delay and cancellation reasons), would that meet your needs?

Crowding data is also something we would like to improve. Crowding data for buses would be rather different to that for London Underground, but it is something we are actively looking at.

For services that will get a lot of use, you might like to cache the data within your infrastructure. That way, if multiple users request the same data, you’re only making a single request.

We constantly look to improve our open data offering. Some of the changes are relatively small, and a lot of work goes into reflecting the evolving transport network (e.g. the changes we’re making to support the London Overground line naming). But we also have a long-term plan to ensure our data continues to be fit for purpose for all our needs, as well as the needs of open data users. I don’t have anything to announce at this time, but any changes are communicated on this forum or our blog.
