No working timetable update 27 October

@neamanshafiq

The update due at 3am on 27th did not happen.

I have no other means of communicating this and if I don’t point it out nothing happens, hence my cluttering up the forum.

@mjcarchive I for one don’t mind your clutter.

You are making a very political point by holding these public servants to account over how well they are performing legal and administrative duties.

I also note that the team is generally doing a good job, but with computer science as it is, you are quite right to point out the GIGO nature of these omissions, errors and so forth.

The good people at TfL, and they are some of the best people I’ve met over the years, are humouring us with this excellent forum.

Please keep up the good work.

Brian

Thanks. When I compare what is available for onward use now with what was available, say, fifteen years ago, I admire what has been achieved - and not just by TfL. That’s why it is so disappointing that this particular system - or the way people interface with it - seems to go wrong in such inexplicable ways. Or maybe not inexplicable but nobody has shared the explanation, or enough information to enable others to contribute to establishing the explanation. If the problem is not understood, it will never be solved. Involve us, we may be able to help.

That of course relates to the incorrect files problem, rather than a simple failure of an update to run, which can presumably be rectified overnight.

Michael

@neamanshafiq

Still nothing

hi @mjcarchive

Neaman is on leave this week. Unfortunately we’re a bit delayed in getting the weekly upload done. I’ll try to get that done today.

Thanks,
James

@jamesevans

Thanks - I’d assumed it was some other team that actually did the re-creation and uploading.

We’re currently grabbing the data manually from the system that produces it as a workaround. Our colleagues who look after the source system are looking into fixing the automation. When Neaman returns, I’ll ask him to update you on the conversations he’s had with the data owners regarding the erroneous data.

Thanks,
James

@jamesevans
Still nothing. Please tell me that this is because the problems are on the point of being resolved!

I did observe that quite a few of the files that should be live are dated 16th October rather than 20th, so that seems to be another issue to add to the list:-
a) updates not happening at all;
b) updates being partial (a new entry this week, pop pickers);
c) current files being overwritten by obsolete ones, some being corrected, some not, some being corrected then overwritten again - and on it goes;
d) some files never getting loaded in the first place (e.g. 111 and 292 from 29th August; a more recent 292 has since been loaded but not so for 111).

The cumulative implication of a) to c) is that there could be far more files than I know about that have not been loaded at all. There is a need to update the “expired schedules” zip file created in 2018 to meet one of the purposes of the WTT upload system, to turn off the flood of FOI requests for individual groups of schedules. I could put in such a request for the 111, for example, but I really don’t want to resort to sticking plasters, particularly as I have no idea how many plasters are needed.

Hi @mjcarchive

I’ve uploaded this week’s schedules. Apologies for the delay - we’ve had some operational work that has taken priority over this.

Please let me know if there are any further issues.

Thanks,
James

Thanks, @jamesevans. I do appreciate that there are more important priorities.

The bad news (though not a surprise) is that a quick analysis suggests a further 381 incorrect overwritings. Obviously the creation stage, not the upload.

Problem b) is no longer there; while there are some files in the Data Bucket dated 16th, these are probably ones which are not really live; they shouldn’t be there but they are not preventing access to anything truly live.

@jamesevans @neamanshafiq
The quick analysis was not misleading. The number of links which take you to an obsolete file is now no fewer than 781. That is over 21% of all links on the WTT pages.

As the usual link Files incorrectly overwritten as at 21st July 2020 explains, this probably understates the magnitude of the issue, though hopefully not by much.

Try 771, not 781! Typical mathematician, can’t add up.

Little change on the 3rd November upload - 759 errors rather than 771.

@neamanshafiq Any news on the conversations with the data owners?

@mjcarchive I have an update this morning. Our colleagues in Bus Operations believe they have resolved the issue now. I’ve just run a fresh upload if you want to check?

@neamanshafiq

Thanks. Everything crossed as I did the download and started the comparisons.

I can see quickly that there is a sizeable swing from fake news to the decent versions. At first glance it trumps most but not all of the problems. I’m not sure it will be enough to enable a claim of victory but some may have corrected in ways not visible to this observer without closer scrutiny.

As long as there is no legal attempt to make me “stop the count” I will update further later today.

@neamanshafiq
I think this will be a whole weekend job to understand properly. At this stage it looks like a job well done but it is quite a complex picture and I have a lot more to assess.

I think what has happened is that getting on for 600 obsolete schedules have been overwritten with correct reinstated records (that is, they have been present before but had been overwritten incorrectly).

I am hoping that most if not all of the other errors have been corrected by loading correct new material (that is, material which has not been present before now). If that is the case then the number of incorrect links could have been as high as 900!

I can see a lot of new files dating from several months ago which clearly weren’t loaded (or created?). What I am not sure of is how many eliminate known errors (as in my analyses) and how many eliminate errors of which I was completely unaware. I see a fresh set of schedules for the 125 from July, for example, but I was unaware that there was anything wrong for that route.

A lot of special schedules have been removed which is fair enough as they often have a mayfly existence.

@neamanshafiq
Almost sorted but not quite a cigar!

There are now just 21 erroneous schedules linked to from the TfL site, a massive improvement of 738 on 759 a few days earlier.

This difference is considerably greater than the number of correct files (572) that have been reincarnated in the latest upload.

I think the reason for this is that a fair number of errors were on “special” schedules which had not operated for some time. The latest upload has seen some 522 files deleted without replacement and it seems reasonable to assume that this deletion process has itself eliminated some more of the errors.

It is also worth noting that 375 completely new schedules have been added. By “completely new” I mean that they have never cropped up before. The majority of these are dated earlier than 17th October so it is pretty clear that they are schedules which should have been uploaded in the past but were not.
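For anyone wanting to reproduce this kind of comparison, a minimal sketch in Python follows. It is not the method actually used for the figures above: the link keys, the file names and the `classify_upload` function are all made up for illustration, and it assumes each link on the WTT pages can be reduced to a single file name per upload.

```python
def classify_upload(previous, current, seen_before):
    """Classify each link by how its target file changed between two uploads.

    previous, current: dict mapping a link key to the file name it points to.
    seen_before: set of file names observed in any upload before `current`.
    All three structures are hypothetical, for illustration only.
    """
    result = {"reinstated": [], "new": [], "deleted": [], "unchanged": []}

    # Links that existed last time but have no file in the latest upload.
    for link in previous:
        if link not in current:
            result["deleted"].append(link)

    for link, new_file in current.items():
        if previous.get(link) == new_file:
            result["unchanged"].append(link)
        elif new_file in seen_before:
            # The file has been present before, so it is a reinstatement.
            result["reinstated"].append(link)
        else:
            # The file has never cropped up before: "completely new".
            result["new"].append(link)
    return result


# Made-up example: link "A" had reverted to an obsolete file A_v1.
seen_before = {"A_v1", "A_v2", "B_v1", "C_v1", "D_v1"}
previous = {"A": "A_v1", "B": "B_v1", "C": "C_v1", "D": "D_v1"}
current = {"A": "A_v2", "B": "B_v2", "D": "D_v1"}

result = classify_upload(previous, current, seen_before)
for category, links in result.items():
    print(category, links)
```

Counting the lists gives the equivalent of the "reincarnated", "completely new" and "deleted without replacement" totals. Deciding whether a reinstated file is actually the correct one still needs a separate check against the schedule dates.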

If that is correct, then the true number of errors before this cleansing of the stables must have been about one thousand, though some of these related to route/day combinations which weren’t really live.

In a way the 21 files that are still incorrect are more interesting than those which are not. They are listed on the usual link but to summarise, 12 have been carried over from last week while 3 (route 362) are newly introduced errors. The remaining six are a bit complicated - previously an old version of an old file was in place. This time the latest version of the old file is in place but it is still the old file (too early).

Two questions arise, I think. First, what was the underlying issue and how has it (almost) been cured? Bear in mind that this is not the first time that this large scale problem has been sorted. If it requires ad hoc manual action to sort it out and relies on consistently getting the right input from humans or another system, isn’t it going to go wrong again? How robust is the solution?

Secondly, if this exercise has uncovered 300 or so files which should have been uploaded in the recent past but were not, how many more are there which are not exposed by this because they are no longer current?

Update. The mass cull of schedules which should not really have been there at all only eliminated some 30 of the 767 errors there last week (I found 8 more when I looked more closely!). I reckon that roughly 200 additional errors were present undetected. So not a nice round thousand but close.
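The arithmetic behind these figures can be checked in a few lines of Python. The variable names are illustrative only, and the 200 is the rough estimate quoted above rather than a measured count.

```python
# Figures quoted in this thread; variable names are illustrative only.
errors_first_count = 759   # errors counted after the 3rd November upload
found_later = 8            # extra errors found on closer inspection
undetected_estimate = 200  # "roughly 200", an estimate, not a measurement
errors_remaining = 21      # errors left after the fresh upload

known_errors = errors_first_count + found_later
estimated_total = known_errors + undetected_estimate
improvement = errors_first_count - errors_remaining

# Breakdown of the remaining errors: carried over, new (route 362), complicated.
breakdown = 12 + 3 + 6

print(known_errors)     # 767
print(estimated_total)  # 967
print(improvement)      # 738
print(breakdown)        # 21
```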

The overnight upload should be interesting…

The overnight upload was actually pretty dull. That’s fine, dull is good sometimes. New schedules added correctly and absolutely no dodgy reincarnations (not even to vote in Michigan). So all the corrections made at the tail end of last week remain in place. On the downside, that means that the very short list of remaining errors remains unchanged as well but I’m not complaining (much).
