Is there an endpoint for a complete bus timetable?

I would like to pull the entire public-facing timetable for a London bus route, specifically:

  • all arrival times any time of day, for all different day types (e.g. Mon-Fri, schooldays/non-schooldays, Sat, etc)
  • for every StopPoint along the route (i.e. every StopPoint returned in /Line/{busId}/Route/Sequence/{direction})
  • for all directions (inbound/outbound)
  • excluding any locations covered only in light running (such as garages)
  • in a format like json that I can easily parse
  • bonus points if any special statuses like ‘24h daily’, ‘school journeys only’ etc could be bundled in with the response

I can’t see an endpoint for this in the swagger. I can recreate all of this by pulling timetable info for every StopPoint individually (/Line/{busId}/Timetable/{stopId}), but that’s a lot of calls to an endpoint with a 1req/s limit. The giant zip with all timetables in is (according to the site) only covering up to 2018, and the stuff at Bus schedules - Transport for London is incomplete and in pdf format.

Is there an upstream source of this data anyone can point me to? If not, is there any advice for how to scrape all this info without needing many hundreds of API calls?

Cheers,
Michael

Yes mostly.

I use /Line/Mode/{modes}/Route to get a list of Line/OriginatorId/Direction entries and then use those to call all the /Line/Timetable entries, but you should be able to use /Line/{LineId}/Route to the same purpose.
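For anyone following along, here is a rough sketch of that enumeration step in Python. It assumes the parsed /Line/{LineId}/Route response has an `id` plus a `routeSections` list whose entries carry an `originator` stop id; those field names are my assumption, so check a live response first. It only builds the timetable URLs, leaving the fetching (and the throttling to stay under the rate limit) to you.

```python
from urllib.parse import quote

BASE = "https://api.tfl.gov.uk"


def timetable_urls(line_route: dict) -> list[str]:
    """Build one /Line/{id}/Timetable/{stop} URL per distinct originator.

    `line_route` is the parsed JSON from /Line/{id}/Route. The field names
    used here (id, routeSections, originator) are assumptions about that
    response, not something confirmed in this thread.
    """
    line_id = line_route["id"]
    seen = set()
    urls = []
    for section in line_route.get("routeSections", []):
        originator = section.get("originator")
        if not originator or originator in seen:
            continue  # skip blanks and originators already queued
        seen.add(originator)
        urls.append(f"{BASE}/Line/{quote(line_id)}/Timetable/{quote(originator)}")
    return urls


# Hand-written sample in the assumed shape; the stop ids are made up.
sample = {
    "id": "b13",
    "routeSections": [
        {"direction": "outbound", "originator": "490000001A"},
        {"direction": "inbound", "originator": "490000002B"},
        {"direction": "outbound", "originator": "490000001A"},
    ],
}
```

Calling `timetable_urls(sample)` gives two URLs, one per distinct originator. Bear in mind that this inherits any gaps in the originator list.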

Just be aware that the Originator list is not necessarily complete (more of an issue for tube etc.); see this thread for details.

For information (as it relates to a form that wouldn’t meet your needs): TfL started providing bus schedules in PDF form in dribs and drabs in response to FOI requests. Almost inevitably, the volume of requests led to the current situation where all the schedules are available in PDF form from the TfL website. I’m interested that you say it is incomplete - what do you think is missing? The giant zip file consists of xml files (though something tells me that they were actually Excel in disguise - maybe I am misremembering that). They have been converted to PDF. Absolutely everything for all years can be found in PDF form on London Bus Timetable Graveyard.
Note that only a selection of stops are included.

Before going on, what are you actually trying to achieve here? Full timetables showing every stop for every TfL bus route are already available on the Bustimes site. What more are you trying to add to their rather impressive wheel?

I would presume that Bustimes use the xml files in the weekly Journey Planner zip file which is provided to the London Datastore. This contains all stops for each route, excludes dead running and shows days of operation and non-operation (not that I ever quite got the hang of that bit). This is available via JP Datastore.

Hope that is of some use, or at least of some interest.

@mjcarchive ,

When I say incomplete, I am talking solely about the API.

For reasons totally beyond my understanding, TfL have chosen to be selective about the list of Originator and Destination points in /Line/Route.

Why they choose to (presumably manually - who on earth maintains that list, and why?) decide which ones to include, rather than simply throw in all the ones in the underlying timetable data is a mystery to me. Unfortunately it makes the whole dataset (again in API terms) unfit for purpose as a coherent source.

See the last half a dozen or so posts in the link in my previous post for @LeonByford 's view.

@nickp On completeness I was responding to the original post by @MMJZ

Thank you both for the responses.

@mjcarchive re the q about incompleteness, I was using the B13 as my test route and the schedule appears to describe times for only a subset of stops (i.e. the ‘important’ ones), and so has less info than can be pulled using the /Line/Timetable endpoint.

The JP Datastore zip seems to have everything in it - thank you for linking it - so hopefully I’m saved from thousands of API calls and the incomplete data in /Line/Route (thanks for the heads up @nickp)

Looks like I’m going to be stuck spending the next forever parsing the data (a zip containing more zips containing unordered lists of xmls with wild structures and giant sizes is a hilarious challenge) to get something that looks like the bustimes site but at least I can check my work. As for what I want it for - honestly not too sure yet, but the plan was to recreate the complete timetable view first and go from there
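In case it helps with the zips-within-zips part, here is a small Python sketch that walks an archive recursively and yields every XML file it finds. It reads each inner zip fully into memory, which is an assumption about what your machine can cope with given the file sizes; the `!` separator in the reported path is just a convention I picked.

```python
import io
import zipfile
from typing import BinaryIO, Iterator, Tuple


def iter_xml_members(source: BinaryIO, prefix: str = "") -> Iterator[Tuple[str, bytes]]:
    """Yield (path, bytes) for every .xml file, descending into nested zips."""
    with zipfile.ZipFile(source) as archive:
        # Sort the names so the output order is stable between runs.
        for name in sorted(archive.namelist()):
            if name.endswith("/"):
                continue  # directory entry, nothing to read
            data = archive.read(name)
            path = prefix + name
            if name.lower().endswith(".zip"):
                # Recurse into the inner archive held in memory.
                yield from iter_xml_members(io.BytesIO(data), path + "!")
            elif name.lower().endswith(".xml"):
                yield path, data
```

Usage would be along the lines of `with open("journey-planner.zip", "rb") as f: for path, xml in iter_xml_members(f): ...`, where the filename is a placeholder for whatever you downloaded.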

If you just want to see what the zip XML files look like as “readable” html or pdf timetables you could try an old program (TransXChange Publisher) I mentioned in a previous thread. If nothing else it will show just how daunting the task is. Many TfL timetable files are very complicated, varying times with school holidays, different days of the week (within Mon-Fri), etc. The program does its best to cover the variations in a compact format, often with copious footnotes. However, in some cases closer inspection of a table reveals that several different time periods have been concatenated in a misleading order.

@MMJZ - I see. Yes, incomplete in terms of what you want but complete (I think) in the sense that all the schedules that should be there are there. The main purpose of the WTTs as I understand it is to provide route controllers with a succinct view of all aspects of the route’s operation, including driver duties. The modern PDF would have been recognisable to the route controllers (inspectors?) of the 1970s and quite likely to users of handwritten schedules in the 1940s. The key word is “succinct” - producing a document showing all 80 stops or whatever would defeat the point of the exercise.

It is possible to produce a complete all stops timetable for a route. I know that some of the timetables on London Bus Routes are done by parsing the xml into Excel and using the VehicleJourneys and JourneyPatterns branches and in principle that approach could cover all the stops but it needs automation via VBA or similar. Even then, anything beyond doing the basic day types (which London Bus Routes by and large does not) is difficult.
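To illustrate the VehicleJourneys/JourneyPatterns approach outside Excel, here is a minimal Python sketch. It assumes the usual TransXChange layout: each VehicleJourney has a DepartureTime and a JourneyPatternRef, each JourneyPattern lists JourneyPatternSectionRefs, and each section is a chain of JourneyPatternTimingLink elements with From/To StopPointRef and a RunTime. It deliberately ignores wait times, per-journey timing overrides and operating days, so treat it as a starting point only.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

NS = {"txc": "http://www.transxchange.org.uk/"}


def parse_duration(text: str) -> timedelta:
    """Parse the simple ISO 8601 durations used for RunTime, e.g. PT1M30S."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def all_stop_times(root: ET.Element) -> list[list[tuple[str, str]]]:
    """Return one [(stop_ref, 'HH:MM'), ...] list per VehicleJourney."""
    # Map each JourneyPatternSection id to its ordered timing links.
    sections = {}
    for sec in root.iterfind(".//txc:JourneyPatternSection", NS):
        links = []
        for link in sec.iterfind("txc:JourneyPatternTimingLink", NS):
            links.append((
                link.findtext("txc:From/txc:StopPointRef", namespaces=NS),
                link.findtext("txc:To/txc:StopPointRef", namespaces=NS),
                parse_duration(link.findtext("txc:RunTime", namespaces=NS)),
            ))
        sections[sec.get("id")] = links

    # Map each JourneyPattern id to the section ids it references.
    patterns = {
        jp.get("id"): [r.text for r in jp.iterfind("txc:JourneyPatternSectionRefs", NS)]
        for jp in root.iterfind(".//txc:JourneyPattern", NS)
    }

    journeys = []
    for vj in root.iterfind(".//txc:VehicleJourney", NS):
        clock = datetime.strptime(vj.findtext("txc:DepartureTime", namespaces=NS), "%H:%M:%S")
        calls = []
        for sec_id in patterns[vj.findtext("txc:JourneyPatternRef", namespaces=NS)]:
            for from_ref, to_ref, run_time in sections[sec_id]:
                if not calls:
                    # First link of the journey: record the origin stop.
                    calls.append((from_ref, clock.strftime("%H:%M")))
                clock += run_time
                calls.append((to_ref, clock.strftime("%H:%M")))
        journeys.append(calls)
    return journeys
```

Feed it `ET.parse(path).getroot()` for a single route file. Because each journey is kept as an ordered list, a stop that is called at twice in one journey simply appears twice.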

In short, it could be a gargantuan task to produce (and then maintain) it for the entire bus network. Nothing wrong with a challenge but there are challenges and challenges!

@MMJZ In Transxchange format the whole current TfL timetables are updated often at

the whole of the UK is provided by the DfT at

Example of them in use…

  • RTI - Finsbury Park
  • RTI - York

@briantist How does your app select the relevant bus stops for a rail station? I tried it with Surbiton (SUR) and it displayed just one stop (NN) which today has two routes, both towards Kingston. In fact there are 4 more stops (NB, NC, NK, NP), all adjacent and incorporating Surbiton Station in the displayed name. If you “tap to see live feed” it also shows the buses from NB.

@misar Sadly the DfT data is lacking in some places. There is no legal requirement for data about bus services to be provided, so a similar problem happens for Brighton. It’s a bit disappointing but it’s better than nothing. Brighton has an excellent real-time bus data system (which I was somewhat responsible for…) such as https://www.buses.co.uk/live/149000006637

However the https://beta-naptan.dft.gov.uk/Download/National/csv from the DfT does seem to have a complete set of bus stops.

The stops are calculated using radial distances (ie, meters on OS co-ordinates) to the station, but they are limited to those with active timetables.
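Roughly, the radial selection looks like the Python sketch below. It assumes each stop comes with an easting and northing in OS grid metres, so plain Pythagoras gives the distance; the radius value and the tuple layout are illustrative, and the "active timetables" filter is not shown.

```python
import math


def stops_within(stops, easting, northing, radius_m):
    """Return stop codes within radius_m metres of a point, nearest first.

    `stops` is an iterable of (code, easting, northing) tuples in OS grid
    metres, so straight-line distance needs no projection step.
    """
    hits = []
    for code, e, n in stops:
        d = math.hypot(e - easting, n - northing)
        if d <= radius_m:
            hits.append((d, code))
    return [code for d, code in sorted(hits)]
```

For example, with a station at (100, 100) and a 60 m radius, a stop at (130, 140) is 50 m away and is kept, while one at (1000, 1000) is dropped.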

I’m going to draw a line under this one but wanted to feed back for anyone else in the thread or anyone with the same goal of working with this data:

The JP zip has all the info needed to recreate all the timetable views you can get on TfL’s website or on bustimes.org. I was working in C# and used the Windows tool xsd.exe on the TransXChange schema to generate a model and then deserialised timetables into that model. I had to swap a couple of enum types for strings in the generated model, since the enums weren’t defined for some reason, meaning the project wouldn’t build.

Churning through the data at that point is easy enough. The schema explainer document nicely walks through how different constructs link together and seems to have been written by some proper nerd who loved their craft, and I couldn’t be happier for them.

Looking at the B13 as my test case, TfL doesn’t seem to make full use of some of the format’s features. My personal favourite is the nuance where the ‘last bus on (e.g.) Monday’ actually starts on Tuesday (after midnight), and this ‘journey’ can be labelled as ‘shifted 24 hours’ in the format, but TfL instead declares the after-midnight weekday buses as just ‘buses that run Tue-Sat’. This is why the B13 on bustimes has the midnight buses appear before the 5am buses.
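One way to paper over the ordering problem when displaying a table is to sort by "service day" rather than clock time, as in the Python sketch below. The 04:00 cutoff is purely my assumption and would be wrong for a 24-hour route. It also only fixes the order within a table: it does not move the Tue-Sat journeys back onto the Mon-Fri day type.

```python
def service_day_order(times, day_start="04:00"):
    """Sort 'HH:MM' strings so after-midnight journeys come last.

    Times earlier than `day_start` are treated as belonging to the end of
    the previous service day, i.e. as if shifted forward by 24 hours.
    """
    start_h, start_m = map(int, day_start.split(":"))
    cutoff = start_h * 60 + start_m

    def key(t):
        hours, minutes = map(int, t.split(":"))
        total = hours * 60 + minutes
        if total < cutoff:
            total += 24 * 60  # push early-morning times past 23:59
        return total

    return sorted(times, key=key)
```

So `service_day_order(["00:15", "05:00", "23:45", "00:45"])` puts 05:00 first and the two after-midnight departures at the end.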

I assume TfL uses some other format for the primary copy of its timetable data, since its own website can distinguish first and last buses properly.

Main thing I’ve learned is that any hope I had of cleanly representing any given bus route is waaaay beyond me, and apparently beyond everyone else as well. The B buses are all wonderful edge cases for any devs:

  • B11 is a cloverleaf route, doubling back on itself to call two stops twice in the same direction in a single journey
  • B13 and B15 are lollipop routes, though technically all London buses are since none of them actually call at the same stop groups on the return journey
  • B12 is a lollipop route that changes the direction in which it goes round the loop halfway through the day
  • B11, B14, and B16 have sequences of stops traversed in the same order by buses in both directions

Bustimes and the TfL website have a real hard time representing these buses too - so I think I’m only going to make progress by either ignoring bus routes with complicated schedules (i.e. a good chunk of them) or crafting bespoke views for each route.


@MMJZ Thanks for your input. There are many more readers on forums than contributors.
