Travel time, Distance and Route_id

Hello everyone,

I am a Master’s Degree Student in Statistics, working on the TFL Network for an important project.
I am having trouble using the API tools and would like to ask for your help in accessing the following data:

  • Travel time between all pairs of stations (or at least adjacent ones)
  • Distances between adjacent stations
  • Correspondence between Route_id and the stations each route passes through.

Also, I would like to use this data in R, so I need a format that R can read.

Thank you in advance.

Tom TIVOLI
Trinity College Dublin

Hi Tom, welcome to the forum!

Have you looked at the Journey Planner timetables (which are in TransXChange format) that are linked to from this page?:

Hello and thanks for your answer.

As I said, I am having a lot of trouble using the API tool, as this is my first time dealing with this type of content.
Could you explain to me in more detail how I could get all the data I am looking for? I can only get it one item at a time for now…

@TomT
I am no expert on this, but some time ago I found a website which you may find helpful. It seems the University of Leeds is (was?) working on projects similar to yours.

The TransXChange data can be downloaded from:
http://tfl.gov.uk/tfl/syndication/feeds/journey-planner-timetables.zip
Note that this is a large dataset, which expands to a few GB.

You can refer to the JourneyPatternTimingLink elements:

<JourneyPatternTimingLink id="JPL_1-JUB-_-y05-870844-85-O-2-5">
	<From SequenceNumber="4">
		<Activity>pickUpAndSetDown</Activity>
		<StopPointRef>9400ZZLUKBY2</StopPointRef>
		<TimingStatus>PTP</TimingStatus>
	</From>
	<To SequenceNumber="5">
		<WaitTime>PT1M</WaitTime>
		<Activity>pickUpAndSetDown</Activity>
		<StopPointRef>9400ZZLUWYP2</StopPointRef>
		<TimingStatus>PTP</TimingStatus>
	</To>
	<RouteLinkRef>RL_1-JUB-_-y05-870844-O-2-4</RouteLinkRef>
	<RunTime>PT3M</RunTime>
</JourneyPatternTimingLink>

As indicated by the RunTime element, there is a travel time of three minutes between Kingsbury (9400ZZLUKBY2) and Wembley Park (9400ZZLUWYP2).
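If it helps, here is a minimal Python sketch of pulling the stop pair, RunTime and WaitTime out of such links. It assumes Python 3.8+ (for the `{*}` namespace wildcard in ElementTree), that durations only use hours, minutes and seconds, and it wraps the link above in a made-up `<Sample>` root just so the example runs on its own. The function names are mine, not part of any TfL tool.

```python
import re
import xml.etree.ElementTree as ET

# The link quoted above, wrapped in a dummy <Sample> root (an assumption for
# this sketch; real timetable files have their own root element).
SAMPLE = """<Sample>
<JourneyPatternTimingLink id="JPL_1-JUB-_-y05-870844-85-O-2-5">
  <From SequenceNumber="4">
    <StopPointRef>9400ZZLUKBY2</StopPointRef>
  </From>
  <To SequenceNumber="5">
    <WaitTime>PT1M</WaitTime>
    <StopPointRef>9400ZZLUWYP2</StopPointRef>
  </To>
  <RouteLinkRef>RL_1-JUB-_-y05-870844-O-2-4</RouteLinkRef>
  <RunTime>PT3M</RunTime>
</JourneyPatternTimingLink>
</Sample>"""


def duration_seconds(text):
    """Convert a simple ISO 8601 duration (e.g. PT3M, PT1M30S) to seconds."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def timing_links(root):
    """Return one dict per JourneyPatternTimingLink found under root."""
    rows = []
    # "{*}" matches the tag in any namespace, or in none.
    for link in root.findall(".//{*}JourneyPatternTimingLink"):
        rows.append({
            "from_stop": link.findtext("{*}From/{*}StopPointRef"),
            "to_stop": link.findtext("{*}To/{*}StopPointRef"),
            "run_s": duration_seconds(link.findtext("{*}RunTime")),
            # A missing WaitTime is treated as zero timetabled wait.
            "wait_s": duration_seconds(
                link.findtext("{*}To/{*}WaitTime", default="PT0S")),
        })
    return rows


links = timing_links(ET.fromstring(SAMPLE))
print(links)
```

For a real file you would replace `ET.fromstring(SAMPLE)` with `ET.parse(path).getroot()`.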

For distances between adjacent stations, refer to the RouteLink element:

<RouteLink id="RL_1-JUB-_-y05-870844-O-2-4">
	<From>
		<StopPointRef>9400ZZLUKBY2</StopPointRef>
	</From>
	<To>
		<StopPointRef>9400ZZLUWYP2</StopPointRef>
	</To>
	<Distance>2926</Distance>
	<Direction>outbound</Direction>
</RouteLink>

The Distance element indicates a distance of 2,926 metres between these two stations.
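Since the goal is to use this in R, one possible approach is to join each timing link to its RouteLink via RouteLinkRef and write the result as CSV, which R reads with `read.csv`. The sketch below is only that, a sketch: it assumes Python 3.8+, run times expressed in whole minutes and/or seconds, and a made-up `<Sample>` wrapper around trimmed copies of the two elements above. The column names are my own choice.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Trimmed copies of the two elements above inside a dummy <Sample> root
# (the wrapper is an assumption so the sketch is self-contained).
SAMPLE = """<Sample>
<RouteLink id="RL_1-JUB-_-y05-870844-O-2-4">
  <From><StopPointRef>9400ZZLUKBY2</StopPointRef></From>
  <To><StopPointRef>9400ZZLUWYP2</StopPointRef></To>
  <Distance>2926</Distance>
</RouteLink>
<JourneyPatternTimingLink id="JPL_1-JUB-_-y05-870844-85-O-2-5">
  <RouteLinkRef>RL_1-JUB-_-y05-870844-O-2-4</RouteLinkRef>
  <RunTime>PT3M</RunTime>
</JourneyPatternTimingLink>
</Sample>"""


def run_seconds(text):
    """Parse durations like PT3M or PT2M30S (minutes and seconds only)."""
    minutes, seconds = re.fullmatch(r"PT(?:(\d+)M)?(?:(\d+)S)?", text).groups()
    return int(minutes or 0) * 60 + int(seconds or 0)


root = ET.fromstring(SAMPLE)

# Index route links by id: id -> (from stop, to stop, distance in metres).
route_links = {
    rl.get("id"): (
        rl.findtext("{*}From/{*}StopPointRef"),
        rl.findtext("{*}To/{*}StopPointRef"),
        int(rl.findtext("{*}Distance")),
    )
    for rl in root.findall(".//{*}RouteLink")
}

# Join each timing link to its route link through RouteLinkRef.
rows = []
for link in root.findall(".//{*}JourneyPatternTimingLink"):
    ref = link.findtext("{*}RouteLinkRef")
    if ref in route_links:
        from_stop, to_stop, distance_m = route_links[ref]
        run_s = run_seconds(link.findtext("{*}RunTime"))
        rows.append((from_stop, to_stop, run_s, distance_m))

# Write CSV text; in practice you would write to a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["from_stop", "to_stop", "run_time_s", "distance_m"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The same link can appear in many journey patterns, so on real data you would probably want to de-duplicate the rows before loading them into R.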

The following diagram from the TransXChange schema guide shows how JourneyPatternTimingLink, RouteLink, StopPoint and Line relate:


What I am looking for is a list of correspondences for the routeid column. I have nothing to link this ID to a path in my graph.

This looks like data from NUMBAT, which I am not too familiar with – sorry.

This dataset is produced by our Public Transport Service Planning team. Have you tried contacting them at [email protected]?

This is a great resource! I had a question on this.

Some journey links have a WaitTime tag and some don’t. Could you please let me know how that logic works? Do links with no WaitTime tag imply a wait time of under 1 minute?

Hi. Where there is no WaitTime, it means there is no wait time built into the timetable. Of course, if passengers need to be picked up or dropped off, the service will still wait long enough for that to happen. However, in the case of request stops, the service may well simply pass by the stop.

I see! What would be the wait time for tubes to pick up passengers?

This is not available in our TransXChange data. However, train operators are required to wait the station’s minimum dwell time, and only depart when the signal is green and it is safe to do so.

Is there a way to get the dwell time for a station? Or would I be better off assuming a minimum amount (say 2 mins) for each station unless there is a wait time parameter in the timetables?

I have a spreadsheet “Inter station Database” which I found on FOI-1820-1920 (https://tfl.gov.uk/corporate/transparency/freedom-of-information/foi-request-detail?referenceId=FOI-1820-1920), giving inter station distances and run times, which may be useful. A search for “lu inter station database” will also turn up later FOI requests, which may prove worth investigating.

Thank you! This is useful!

I did a manual timing exercise as well with the good old stopwatch on my phone. Most stations have a dwell time of around 20-25 seconds and some bigger stations like Queen’s Park take around 40 seconds.

That would be a good approximation to model. Now my question is which stations have the 20-second dwell time and which have 40 seconds.

There are a couple of ways of calculating this.

One method I can think of would be to take one of the “Common User Format” timetables, which are equivalent to the PDF Working Timetables but in a data-friendly format. While TfL no longer publishes new versions of these CUF timetables, old versions are still available at http://timetables.data.tfl.gov.uk and contain substantially more information than the TransXChange timetables that are still published. Something like a dwell time at a station isn’t likely to change very often, so this data may still be of use to you despite being a little dated.

In each compressed timetable, you’ll find a file ending in “EVT.csv”, which contains every timetabled “event”, 1 per row. By using the station ID, and arrival/departure times, you should be able to calculate an average dwell time for every station over the course of each day.
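As a rough illustration of that calculation, here is a small Python sketch. Please treat the column names (`station_id`, `arrival`, `departure`), the HH:MM:SS time format and the sample rows as hypothetical: I have made them up for the example, and the real EVT.csv layout should be taken from the technical specification PDF.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up events with hypothetical column names; the real EVT.csv columns
# will differ and need mapping according to the specification document.
SAMPLE_EVENTS = """station_id,arrival,departure
STN_A,08:00:00,08:00:20
STN_B,08:03:00,08:03:40
STN_A,08:10:05,08:10:30
STN_B,08:13:10,08:13:50
"""


def to_seconds(hhmmss):
    """Convert HH:MM:SS to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_dwell(rows):
    """Return the mean dwell time in seconds for each station."""
    dwell = defaultdict(list)
    for row in rows:
        seconds = to_seconds(row["departure"]) - to_seconds(row["arrival"])
        # Skip negative values (e.g. events that span midnight) rather
        # than guessing how to correct them.
        if seconds >= 0:
            dwell[row["station_id"]].append(seconds)
    return {station: mean(values) for station, values in dwell.items()}


avg_dwell = average_dwell(csv.DictReader(io.StringIO(SAMPLE_EVENTS)))
print(avg_dwell)
```

For a real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample.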

The data bucket linked above also includes a technical specification PDF document, which you can use to figure out the more technical aspects of what each column and the values within them mean.


An alternative method would be to use TrackerNet to observe real-time train data and use that to calculate average dwell times. Please note the caveat that TrackerNet data is currently not available due to the ongoing cyber incident.

You may find it useful to use intertube to collect this data, as then you can avoid doing the heavy lifting of converting TrackerNet’s individual platform arrival boards into individual trains that you can track across the line. You may in particular find the TRAIN-STATION-STOP and TRAIN-TRANSIT events useful to calculate the dwell times at stations.

You can find more info about the intertube API here, but note that you would need to contact @eta to get access. I’m sure that if you ask very nicely you may even be able to get a historical dump of data from her rather than waiting for TrackerNet to start working again.


Both of these methods should also assist with one of the objectives you mentioned in your original post, as you would also be able to calculate the travel time between all pairs of stations.
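Once you have run times for adjacent stations, one way to get travel times for all pairs is a shortest-path search from every station. The Python sketch below uses Dijkstra's algorithm and makes several assumptions that you would need to check: links are usable in both directions with the same run time, a flat 30-second dwell is added at each intermediate station, interchange time between lines is ignored, and `STOP_X` / `STOP_Y` with their run times are placeholders I invented. Only the Kingsbury to Wembley Park figure comes from this thread.

```python
import heapq
from collections import defaultdict

# (from stop, to stop, run time in seconds). Only the first edge comes from
# the thread; STOP_X and STOP_Y and their times are invented placeholders.
EDGES = [
    ("9400ZZLUKBY2", "9400ZZLUWYP2", 180),
    ("9400ZZLUWYP2", "STOP_X", 120),
    ("STOP_X", "STOP_Y", 150),
]
DWELL_S = 30  # assumed flat dwell at every intermediate station

# Build an undirected adjacency list (an assumption; real run times may
# differ by direction).
graph = defaultdict(list)
for a, b, run_s in EDGES:
    graph[a].append((b, run_s))
    graph[b].append((a, run_s))


def shortest_times(graph, source, dwell_s):
    """Dijkstra from source; returns travel time in seconds to each stop."""
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        elapsed, stop = heapq.heappop(queue)
        if elapsed > best[stop]:
            continue  # stale queue entry
        for neighbour, run_s in graph[stop]:
            # Charge one dwell per link; the dwell at the destination
            # itself is removed again below.
            candidate = elapsed + run_s + dwell_s
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return {stop: (t - dwell_s if stop != source else 0)
            for stop, t in best.items()}


all_pairs = {stop: shortest_times(graph, stop, DWELL_S) for stop in list(graph)}
print(all_pairs["9400ZZLUKBY2"])
```

With station-specific dwell times you would add the dwell of the station being passed through instead of a constant.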

I’d also suggest looking at the PDF versions of the working timetables, as they each contain both the running time and the physical distance between all pairs of adjacent stations.

This is great! I will check these out!

Thank you so much!