No JP Datastore update this week

I’m pleased to say I am able to import the data once again!


The new N11 file is particularly naff. In one direction it shows exactly the same time at several successive stops. I have checked and this is also how it appears on the stop-specific timetables online. It also seems to be reflected in Journey Planner, so the wrong times are being shown - admittedly a right side error.

Example - for the nine stops between Northfields Station and Hillcrest Road the first Sunday night bus is shown as 0012. Then, hey presto, at Acton Fire Station it has jumped to 0021.
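For anyone wanting to spot this pattern automatically, here is a minimal sketch. It assumes the stop times for one journey have already been extracted into an ordered list of (stop, time) pairs; the function name and the sample data are hypothetical, with only the Northfields Station, Hillcrest Road and Acton Fire Station times taken from the example above.

```python
from itertools import groupby


def repeated_time_runs(stop_times, min_run=2):
    """Return (time, [stops]) for each run of consecutive stops sharing a time.

    stop_times is an ordered list of (stop_name, time_string) pairs for one
    journey. This input shape is an assumption, not the TransXChange layout.
    """
    runs = []
    # groupby only merges *adjacent* items with the same key, which is
    # exactly what "successive stops with the same time" means.
    for time, group in groupby(stop_times, key=lambda pair: pair[1]):
        stops = [stop for stop, _ in group]
        if len(stops) >= min_run:
            runs.append((time, stops))
    return runs


# Illustrative data only: the middle stop name is made up.
sample = [
    ("Northfields Station", "0012"),
    ("Intermediate stop (hypothetical)", "0012"),
    ("Hillcrest Road", "0012"),
    ("Acton Fire Station", "0021"),
]

for time, stops in repeated_time_runs(sample):
    print(f"{len(stops)} successive stops all timed at {time}: {stops}")
```

Raising `min_run` would limit the report to longer runs, which may be useful if two adjacent stops legitimately share a minute.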

@jamesevans @LeonByford

Tuesday’s upload of the Journey Planner zip file to Datastore appears not to have worked.

hi @mjcarchive

Unfortunately, due to availability of staff, it wasn’t possible to get the upload done yesterday. The team are hoping that this will be updated today and we’ll let you know once that’s done.


Still nothing new. Any clearer on an ETA for the file?

hi @mjcarchive

This is now uploaded. Apologies for the delay - we hope to get back to the normal schedule next week.


The Tuesday update is now about 24 hours late. It is hard to schedule work if the uploading of the relevant data files is unreliable. Usually it is reliable but that is the second time in just over a month.


@jamesevans @GerardButler
Still nothing yet.

The broadly equivalent file on the Traveline site is there, dated 12th July so I thought I would use that. Unfortunately it raises even more questions. Of the 742 bus-related files within that Traveline zip file, no fewer than 126 were not there in last week’s TfL file. Possible but unlikely, so I looked at the first few.

Route 104 - new TL file has SCN 48420, not seen in a TfL file since late June. Last week’s TfL file had SCN 48421.

Route 109 - TL has 61279, last seen in a TfL file in early June. Last week’s TfL file had 61968 and 61970.

Route 115 - similar picture. TL has 58971; last week’s TfL file had 58972.

I presume that there are some genuine new files (which is what I am after) but I’m not going to waste more time checking.

On the few occasions I have had to look at the Traveline version before I have not noticed such discrepancies. If there is a good reason for them I can’t think of it, so the question arises - is the file sent to Traveline all that it should be?
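The kind of comparison described above (which files are in one zip but not the other) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and it assumes the two archives use the same file naming, so that a changed SCN shows up as a changed file name.

```python
import zipfile


def member_names(zip_source):
    """Return the set of file names inside a zip (path or file-like object)."""
    with zipfile.ZipFile(zip_source) as zf:
        return {info.filename for info in zf.infolist() if not info.is_dir()}


def compare_archives(old_zip, new_zip):
    """Return (added, dropped): names only in new_zip, and names only in old_zip."""
    old = member_names(old_zip)
    new = member_names(new_zip)
    return sorted(new - old), sorted(old - new)
```

Called as `compare_archives("last_week_tfl.zip", "traveline_london.zip")` (both file names are placeholders), the first list would hold the files that appear only in the Traveline archive and the second those that appear only in last week's TfL file.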

hi @mjcarchive - there was a typo in the filename uploaded. This has been amended and is now available.

Sorry for any inconvenience.


Thanks. I suppose we have all been there. What’s more, even when you have the dodgy file name right in front of you, it still looks right!

I’ll come back to the Traveline file issues later. Also, I think there are two files with inconsistent stop sequences.

timetable v timetables :slight_smile:

For route 366 (file tfl_16-366-_-y05-2.xml), nearly all the inbound stops are unsequenced in nearly all the journey patterns. It doesn’t start till 24th so plenty of time to sort it out.

Hi @mjcarchive, I have had a look at this. The new Summer School Holidays schedules have a different stop sequence from the current School days and weekends schedules, so I have removed the new schedule and extended the current version on an interim basis. There is also a new service change for this route on 29 July to correct routings in the Galleons Reach area, which we are still awaiting, so this should hopefully fix the stop sequencing issue.

Matthew ( cc @GerardButler )

@MScanlon @jamesevans @GerardButler
Thanks for replies.

I have had a closer look at the London zip file on the Traveline site, which is dated 12th July. There are many files in that zip file which have been dropped from the Datastore version and vice versa. I won’t list them but I note one in particular - the 643 - where the operator has changed.

I then opened up a couple of xml files from the Traveline zip file and they showed the Operating Period starting from 23rd May. On the Datastore file the start date is a rolling date; on the current version there is (I think) nothing earlier than 8th July.

My tentative conclusion is that somehow the file sent to Traveline has not been updated properly (if at all) for six weeks or so, even though it has an up-to-date date stamp within the Traveline dataset.

Maybe changes in process mean that the updated file is no longer sent but in that case it should not appear in the Traveline datasets.

I’d be very happy to be told that I am totally wrong on this!

@MScanlon @jamesevans
Past midnight and no sign of the Tuesday evening update again.

Any feedback on my previous post about what the Traveline London zip file shows? I haven’t checked it myself since.

Hi, I can see that the latest version was uploaded yesterday at around midday.

Please note that due to caching, you may need to add on a cache-busting parameter to obtain it, for example:

Thanks for your response. It worked with the precise link in your response (“somerandomstring” being as good a random string as any!) but I am surprised that I had to resort to that.

I am aware of caching without knowing all the ins and outs. Usually for me it is in the context of individual web pages rather than 120 MB files and I note that I have never had this issue with the Datastore file before. I have not changed any local configurations recently so that can’t be it.

My computer was powered off during the afternoon. Unless I have misunderstood, I would have expected any old files to be cleared from the cache (they remain in my Downloads folder of course). Yesterday evening I tried both the links that usually work, that is my usual one
and one
that is the guts of your link but which I had not used for a few weeks. Both produced the old file, with individual files dated 25th July. As usual, each attempt to download took a couple of minutes to complete; would it not just load up the cached version instantaneously if it was available?

I note that is http rather than https but it too works with a random string at the end. I think I was advised to use this link some years ago in preference to the syndication one.

It’s caching on our side, so there is nothing wrong on your end, and there is no way for you to invalidate the cache, other than using one of these random string parameters.

The cache lasts 24 hours, therefore without any parameters, the latest version of the TxC data will be available at most 24 hours after it has been published. But if you require it more immediately, please use a random parameter.
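As a rough illustration of the random-parameter approach, the sketch below appends a fresh random value to a URL's query string. The parameter name `cb`, the function name and the example.com URL are all placeholders chosen for this example; the thread does not say that any particular parameter name is required, only that a random string is used.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_buster(url, param="cb"):
    """Return url with an extra random query parameter appended.

    The parameter name is arbitrary (an assumption for this sketch); the
    point is only that each request URL is unique, so a shared cache keyed
    on the full URL should not have a stored copy of it.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))  # new random value on every call
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL, not the real Datastore link.
print(with_cache_buster("https://example.com/journey-planner-timetables.zip"))
```

Any existing query parameters are kept, so this can be wrapped around whichever link is normally used for the download.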

Thanks - and can I say that I really appreciate your quick and helpful responses on, well, pretty much everything. Which is not to imply that it wasn’t good before!

When I first attempted the download last night it would have been roughly a week since I had obtained a file, which is inconsistent with a cache that lasts just 24 hours, so I remain baffled, if in a different way to before.


Thanks for your kind words.

The cache is shared between all users. Imagine the following timeline (times are just examples):

Tuesday 1st August

10:00 User A requests the file. At this time, the file is not held in the cache, therefore the latest version (from Tuesday 25th July) is returned. This is the version that will be cached for the next 24 hours.

12:54 We publish a new version of the file. The old version continues to be cached.

18:00 User B requests the file. The cache has the old version, therefore the old version is returned.

Wednesday 2nd August

10:00 The version in the cache has now reached its 24 hour limit, therefore the cache is cleared.

11:00 User C requests the file. The cache no longer has a copy of the file, therefore the latest version (which is now the one from Tuesday 1st August) is returned. This version will be held in the cache for the next 24 hours.

As you can see, the version of the file that is returned depends on the timing of the requests that other users make. Therefore, if you want to ensure you receive the latest version, you would need to use a cache-busting parameter.
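The timeline above can be replayed with a toy model of a shared cache. This is a simplified sketch, not a description of the actual caching layer: the class name is hypothetical, the year 2023 is assumed purely so the dates fall on the right weekdays, and a cache-busted request is modelled as simply bypassing the cache.

```python
from datetime import datetime, timedelta


class SharedCache:
    """Toy model of one cache shared by all users, with a fixed lifetime."""

    def __init__(self, ttl=timedelta(hours=24)):
        self.ttl = ttl
        self.cached_version = None
        self.cached_at = None

    def request(self, now, latest_version, bust=False):
        """Return the version a user would receive at time `now`."""
        if bust:
            # Simplification: a unique URL is never in the cache,
            # so the latest published version is served.
            return latest_version
        expired = self.cached_at is None or now - self.cached_at >= self.ttl
        if expired:
            # Cache is empty or stale: fetch the latest and store it.
            self.cached_version, self.cached_at = latest_version, now
        return self.cached_version


cache = SharedCache()

# Tuesday 1st August, 10:00 - User A; only the 25th July file exists.
a = cache.request(datetime(2023, 8, 1, 10, 0), "25 July")
# 12:54 - new version published; from here on the latest is "1 August".
# 18:00 - User B, without and then with a cache-busting parameter.
b = cache.request(datetime(2023, 8, 1, 18, 0), "1 August")
b_bust = cache.request(datetime(2023, 8, 1, 18, 0), "1 August", bust=True)
# Wednesday 2nd August, 11:00 - User C, after the 24 hour limit.
c = cache.request(datetime(2023, 8, 2, 11, 0), "1 August")

print(a, b, b_bust, c, sep=" | ")
```

In this model User B gets the old file only because User A happened to request it before the new version was published, which is the dependence on other users' timing described above.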