Bus WTT upload incomplete

This morning’s bus WTT update appears to have fallen over after route N72, thus there is nothing for routes N73-N550 or P3 onwards.

This is not the first time an incomplete update has happened.

Of course I cannot tell whether it is the file (re)creation process that fouled up or the subsequent process of uploading them.

Indeed it looks like something very similar happened last week, as files from N87-SuNt on are dated 8th (the Monday) rather than the 7th.
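
A check along these lines could flag a short upload without anyone having to scroll the bucket. It is only a sketch: it assumes each file's name and last-modified date have already been obtained somehow, and the file names and dates in the example are invented for illustration.

```python
from datetime import date


def stale_files(listing, expected):
    """Return the names of files whose date differs from the expected run date.

    listing:  iterable of (file_name, last_modified_date) pairs.
    expected: the date on which the weekly refresh should have run.
    """
    return [name for name, modified in listing if modified != expected]


# Invented example: the refresh ran on the 7th but two files carry another date.
example = [
    ("route_a.pdf", date(2024, 1, 7)),
    ("route_b.pdf", date(2024, 1, 7)),
    ("route_c.pdf", date(2024, 1, 8)),
    ("route_d.pdf", date(2024, 1, 8)),
]

if __name__ == "__main__":
    missed = stale_files(example, date(2024, 1, 7))
    print(f"{len(missed)} file(s) not refreshed on the expected date: {missed}")
```

An empty result would suggest the run completed; a non-empty one shows where it stopped.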

Upload completed this morning. As it happens there are no additional new files but this will not always be the case.

Can whatever is going wrong on Sundays be corrected at source please?

My mistake - there were additional files. A real pain.

@jamesevans
Not sure who to ping on this.

Same has happened again this week. The process has fallen over between N44 and N5.

That’s three weeks running. It’s possible it was happening unnoticed earlier than that.

It can’t be something as stupid as an upper limit on the number of files that can be created? Can it?

@jamesevans
So near yet so far. This week the problem process fell over between W6 and W7.

Given that the X and Y routes are either defunct or dormant, that should mean that only W7, W8 and W9 have not been refreshed. Of course, the underlying problem, whatever it is, is still there. I hope that someone has been addressing it but the absence of any reply on here doesn’t fill me with confidence on that score.

@mjcarchive
May I ask whether the timetables in question are from one of these sources:
https://tfl.gov.uk/tfl/syndication/feeds/journey-planner-timetables.zip

Or?

@misar
It is the second of those sources but the better way into the 3,500 or so WTTs is via this link to a data bucket
https://bus.data.tfl.gov.uk/
The schedules are the last files to be listed so you have to scroll down a lot.

The advantage of the data bucket is that you can see what is there for all routes in one go. It is also possible to paste the list of all 3,500 into Excel or similar and to download the lot in one go.

You don’t know what is there for a route through the Bus Schedules until you enter a route number to kick off a search. There are an awful lot of routes…

@mjcarchive
Thanks. That page looks rather like a listing of an FTP site which is more convenient for bulk downloads but I have never found any such TfL site.

Thanks @mjcarchive - I’ll speak with @neamanshafiq tomorrow. He has been speaking to our networks team about the job that uploads these files. A tweak was made to that job, so glad to see it’s improved, but maybe another is needed to ensure we’re not cutting off the last few routes.

Many thanks,
James

@misar
Nor have I but I have a macro which does the job. It runs through the list of file names and constructs the appropriate string for each file in turn using WinHTTP. I ripped off the core of the idea from elsewhere and wouldn’t pretend to understand it in detail but it works.
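
For anyone without Excel, the same idea can be sketched in Python. This is not the macro itself, just an illustration of building one URL per file name and fetching it. The `schedules/` prefix is a hypothetical folder name, and it is an assumption that files can be fetched directly from the same host as the listing page.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://bus.data.tfl.gov.uk/"  # listing page mentioned in this thread
PREFIX = "schedules/"  # hypothetical folder name, check against the real listing


def build_url(file_name, base=BASE_URL, prefix=PREFIX):
    """Build the download URL for one file, percent-encoding spaces and the like."""
    return base + quote(prefix + file_name)


def download_all(file_names, dest_dir):
    """Fetch each named file in turn and save it under dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name in file_names:
        with urlopen(build_url(name)) as response:
            (dest / name).write_bytes(response.read())
```

Calling `download_all(names, "wtt")` with the pasted list of names would fetch the lot one at a time; adding a short pause between requests would be kinder to the server.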

The file names give no indication of the date or Service Change Number and as all live files are (or should be) recreated weekly that makes it hard to spot new files. However the first line of each PDF does contain the SCN so I use a macro to generate a file which can be run within Terminal (using mutools) to extract the first line of each to a text file. Read this into the Excel file and comparison with what I already have tells me what is new. I post a list of what is new each week on
https://timetablegraveyard.co.uk/new_wtts.html
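
The comparison step can be sketched roughly as below. The `mutool draw -F txt` call is an assumption about how the first page might be dumped as text, not necessarily the command used above, and both function names are made up for the example.

```python
import subprocess


def first_line_of_pdf(pdf_path):
    """Return the first non-blank line of page 1 of a PDF.

    Assumption: `mutool draw -F txt <file> 1` writes page 1 as plain text
    to standard output. Adjust if your MuPDF version behaves differently.
    """
    result = subprocess.run(
        ["mutool", "draw", "-F", "txt", str(pdf_path), "1"],
        capture_output=True,
        text=True,
        check=True,
    )
    for line in result.stdout.splitlines():
        if line.strip():
            return line.strip()
    return ""


def find_new(current, known):
    """List files whose first line is absent from, or differs from, the known set.

    current, known: dicts mapping file name -> first line of the PDF.
    """
    return sorted(name for name, line in current.items() if known.get(name) != line)
```

`find_new` only needs the two dictionaries, so it works equally well if the first lines come from some other extraction tool.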

Thanks again @mjcarchive for yet more information.

I am beginning to view TfL’s open data like a Russian doll - as fast as I uncover one layer another mystery appears beneath. For example, the schedule pdf files are provided in a format which is useless to the travelling public. The same data is available in their xml (journey-planner-timetables.zip) download and easily converted to “proper” bus timetables in pdf (or html) format. Yet they no longer provide the public with such downloads. I don’t suppose they will tell us but surely one of the TfL staff on here knows a logical explanation?

@misar
I cannot speak for TfL of course,…

The WTT schedules in the form provided are of more interest to enthusiasts than ordinary passengers, as they contain information on vehicle workings and duties. Nevertheless, they can be used to generate “proper” timetables; part of my processing does precisely that for new files in bulk while I am watching Match of the Day or whatever.

The data in the WTTs are not the same as those in the JP files as they exclude most stops. Their genesis was surely in service control and driver timekeeping and for those you really do not want a document with sixty stops each way! Incidentally, the current system for WTTs arose because of the demand generated after TfL started making them available piecemeal in response to FOI requests. As you probably know, only documents that already exist can usually be requested under FOI. Translating them into a quite different format might be commendable in its own right but would not be a requirement under FOI. I’d much rather have them as they are than lose them because the format is not terribly friendly.

The XML Journey Planner files can also be used to generate timetables but I would query whether it is actually any easier! All stops are included, which could be either a curse or a blessing, and it is no longer possible to identify from the files themselves which are the key timing points for which drivers are supposed to observe the times. It is also far from obvious to a newcomer which day post-midnight journeys on a 24 hour service relate to. I’ve never really got a good handle on days of operation and days of non-operation either. There must be solid rules on all that for Journey Planner to work but they are not obvious, at any rate not to me.

Why do TfL no longer provide “proper” timetables? A long time ago, TfL decided that stop-specific timetables were the way to go. To be fair, I think there was research which suggested that many people could not cope with “proper” timetables. It was only when the Journey Planner files were made available as Open Data that it became possible for others to produce “proper” timetables. As others now do this from JP/WTT data (see London Bus Routes for example) there is less need for TfL to do the same thing.

I may be wrong @mjcarchive but I found a method which I think is probably easier.

My original automated processing of the WTT XMLs extracts only the stop and sequence data for a route, not timetables. However, I found a program (TransXChange Publisher) developed long ago for DfT which really does make conversion of TXC XML timetables to pdf or html trivial. There is a Windows GUI with many options (e.g. selection of all or principal stops) or it can be run as a console application (using a special batch file). I use that to enable programming of multiple XML conversions.


They are not as compact as your timetables but are possibly prettier!

PS
Comparing the second table with your timetable the late evening journey times are slightly longer. No idea which version is correct!

@misar
Thanks. I think all the times are consistent between sources - these are Friday times, late Monday to Thursday running times being shorter. The specific extract you have included is MTh at the beginning and Fr at the end. I presume it is possible to do just one or the other in TransXChange. If you can’t it makes for a very confusing timetable. A lot of London routes differ on Fridays.

I didn’t say they were my timetables BTW! I know how to go through the process but the extraction for the timetables on that site is not done by me.

I also note that the extremely verbose timing point descriptions take up an awful lot of screen real estate and that would presumably look worse on a smartphone! And can TransXChange Publisher do repeaters (“every 10 minutes…”)? That makes the pages more concise.
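
For what it is worth, collapsing a regular run of departures into a repeater is not hard to do after the event. The sketch below is a generic illustration, not anything TransXChange Publisher is known to do; the function name and the `min_run` threshold are invented for the example.

```python
def collapse_repeats(times, min_run=4):
    """Collapse runs of evenly spaced departures into "every N minutes" entries.

    times:   sorted departure times in minutes after midnight.
    min_run: how many evenly spaced departures are needed before collapsing.
    Returns a list of ("time", t) or ("every", headway, first, last) tuples.
    """
    if min_run < 2:
        raise ValueError("min_run must be at least 2")
    out = []
    i = 0
    while i < len(times):
        j = i
        headway = None
        if i + 1 < len(times):
            headway = times[i + 1] - times[i]
            j = i + 1
            # Extend the run while the gap to the next departure is unchanged.
            while j + 1 < len(times) and times[j + 1] - times[j] == headway:
                j += 1
        if j - i + 1 >= min_run:
            out.append(("every", headway, times[i], times[j]))
            i = j + 1
        else:
            out.append(("time", times[i]))
            i += 1
    return out
```

A run shorter than `min_run` is left as individual times, since "every 10 minutes" covering two journeys saves no space.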

Having said all that, I thought it would be worth a closer look, but I haven’t yet found a link to the download that doesn’t fall over.

My apologies @mjcarchive, I should have mentioned that virtually all trace of TransXChange Publisher has been removed from the relevant websites. The last remaining “official” link for anything about it seems to be here. That page has a link to download a pdf guidance document with a table showing links to download various versions. The link to DfT’s old NaPTAN website for the final version (2.4.6) still works!

I was surprised to find that the program works well even on 64 bit Windows 11 but its speed decreases rapidly with increasing file size and it takes a VERY long time to process the largest XML files. I did some tests of a few TfL XML timetables using the GUI with the default (512 MB) JRE memory limit:
tfl_8-137-N-y05-61180.xml 965 KB 1 second
tfl_54-240--y05-57630.xml 2 MB 14 seconds
tfl_21-3--y05-61965.xml 7.25 MB 142 seconds
tfl_21-407--y05-5.xml 10 MB 111 seconds
tfl_8-279--y05-62013.xml 13 MB 1030 seconds
It is a 32 bit program but if you have a 64 bit PC install the 64 bit JRE. That allows increasing the maximum JRE memory above 1280 MB which helps with the largest XMLs. I use it to produce any UK bus timetable “on demand” and have no idea how long it would take to convert the entire TfL WTT set.

Hi,

That’s actually an Amazon S3 bucket, so you can easily do bulk downloads manually using an S3 client (I use Cyberduck, though others are available). Sometimes I use Amazon’s command line interface. Or you can do downloads programmatically using their APIs or SDKs that are available for a wide range of programming languages.
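
As a rough sketch of the programmatic route without any SDK: a bucket that allows anonymous listing answers a plain HTTP GET with an XML document, which can be paged through using the standard ListObjectsV2 query parameters. Whether this particular bucket permits that, and what its REST endpoint URL is, are assumptions to check; `endpoint` is left for the caller to supply.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespace used by S3 listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def parse_listing(xml_text):
    """Parse one ListObjectsV2 response.

    Returns ([(key, last_modified), ...], continuation_token_or_None).
    """
    root = ET.fromstring(xml_text)
    keys = [
        (item.findtext(S3_NS + "Key"), item.findtext(S3_NS + "LastModified"))
        for item in root.iter(S3_NS + "Contents")
    ]
    return keys, root.findtext(S3_NS + "NextContinuationToken")


def list_bucket(endpoint, prefix=""):
    """Yield (key, last_modified) for every object under prefix.

    endpoint: the bucket's REST URL (an assumption, supply the real one).
    """
    token = None
    while True:
        params = {"list-type": "2", "prefix": prefix}
        if token:
            params["continuation-token"] = token
        with urlopen(endpoint + "?" + urlencode(params)) as response:
            keys, token = parse_listing(response.read())
        yield from keys
        if not token:
            break
```

The last-modified stamps come back with each key, so the same listing also shows at a glance which files a weekly run has and has not refreshed.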

Despite having worked at TfL for over five years now, our data can feel like a Russian doll to me, too, so that’s perfectly understandable!

As has been mentioned, the bus schedule PDFs only include timing points. For Journey Planner to work, we need to know what time each bus will call at every stop. Therefore, when the timetables are imported into our journey planning system, there is an interpolation process to add times to stops that don’t have any. Furthermore, adjustments to the timetable are sometimes made by our data team to account for disruptions, special events, etc. Therefore, the information in the Journey Planner TransXChange data may differ from what you see in the bus schedule PDFs. There is also, separately, a version of the timetable that powers iBus and Countdown.
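
To illustrate the general idea of interpolation (not the actual method used in the journey planning system, which is not described here), the sketch below spreads the time between two timing points evenly across the stops in between. Even spacing by stop count is an assumption; a real system might weight by distance or observed running times.

```python
def interpolate_times(stops):
    """Fill in missing stop times by even spacing between timing points.

    stops: list of (stop_name, minutes_after_midnight or None); the first and
           last stops must have a time.
    Returns a new list of (stop_name, time) with every gap filled.
    """
    times = [t for _, t in stops]
    if not times or times[0] is None or times[-1] is None:
        raise ValueError("first and last stops must be timing points")
    filled = list(times)
    prev = 0  # index of the most recent timing point
    for i in range(1, len(times)):
        if times[i] is not None:
            gap = i - prev
            for k in range(1, gap):
                # Share the running time equally among the intermediate stops.
                filled[prev + k] = times[prev] + (times[i] - times[prev]) * k / gap
            prev = i
    return [(name, t) for (name, _), t in zip(stops, filled)]
```

For instance, a nine minute section with two untimed stops in the middle would have them placed three and six minutes after the first timing point.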

In terms of publishing timetable information in a human-friendly form, we currently have timetables for each bus route on our website, for example:

The data for this comes from Journey Planner and is essentially equivalent to the information you’d see on a Stop Specific Timetable. Although we have no plans to change how we present timetable information on the website, we aren’t beholden to using this current format. However, I’m not sure there’s much of a need or demand for us to provide the information in a different format. One of the benefits of us providing open data is that people wanting to see a full timetable are already well served by other sites such as bustimes.org.

I hope this has helped a bit. If you have any further questions, just let us know. :slight_smile:

Thanks @LeonByford. Your “insider” information is always really helpful in finding how to do a task or understanding TfL’s system.

I mentioned “proper” bus route timetables because TfL has arguably the most comprehensive bus information in the UK yet seems to be almost unique in no longer providing them. For the UK as a whole there is Traveline plus all the regional, county, city, operator, etc bus information sites. Virtually all of them provide traditional bus route timetables, both online and to download. Personally I prefer them to the “my stop” approach especially as for much of the day at most stops the TfL times are the “every 8 to 10 minutes” variety. I suspect that many Londoners and visitors search the TfL website in vain looking for the proper timetable downloads.

It would not need much extra work to add an index page with links to download html or pdf versions of the weekly xml timetable set. A similar London xml set is provided for upload on TNDS so they are already available from Traveline. :slightly_smiling_face:

@misar @LeonByford
Much the same set of xml files is available without going near Traveline via
http://tfl.gov.uk/journey-planner-timetables.zip
Last time I looked the Traveline links for TfL services just took you back to TfL’s own timetable pages, i.e. access to stop-specifics.

Rather than devoting TfL resource to doing what others are doing for free, another approach might be to insert, somewhere in the timetables section, links to the home pages of sources which effectively do this for free, with the caveat that while they are known to use TfL data, TfL can take no responsibility for their accuracy - or longevity (even sources which have been going for years can fall over if one key individual ceases to be involved).

I was initially intrigued as to how TransXChange Publisher was picking out the key timing points when the JP xml files no longer do so but then I realised that the current 41 file dates back to a time when they did make that distinction. The 279 is a much more recent file. Did your output include all stops? That might be part of the explanation for the grossly inflated (computer) running time.

I am starting to wonder whether code that I or others have written has been reinventing some wheels that we (or at least I) did not know existed. Having said that, if it works reasonably efficiently, it might as well be left alone. What would be useful would be the equivalent of Cyberduck’s S3 capability from an Android phone. Cyberduck is not available for Android and so far the only alternative I have found is FileZilla Pro, which as the name suggests ain’t free.

The idea of providing a full timetable isn’t a bad one, but an initiative like this would require quite a bit of work from multiple teams to design, develop and test the solution. It’s also important to have a support model in place so we have a plan for if something goes wrong. That would also mean taking colleagues away from other important work. So I think it wouldn’t be as straightforward as you may think.

I’ve used an app called BucketAnywhere for S3, which I can use to browse the schedule PDFs. There’s a freeware, ad-supported version, as well as a paid version.