Are there any updates on this, and will this be available live via open data? Also, if it is available via open data, is there a way to be notified of updates? (If so, please add me to the list!)
Nice thank you! Taking a brief look over the data but I was wondering what all the headers meant exactly, and also the uncertainties on the various measurements.
Current assumptions/queries:
SITE_NUMBER: ?
SITE_ID: reference ID for different counters. Embankment is 4632, Blackfriars is 4865
SERIAL_NUMBER: ? thought this was total cumulative count (equivalent to “cyclists this year”), but on looking more closely it doesn’t line up, it’s about -40000 off.
DATE: midnight on the day
TIME: time past midnight on the day at which the cyclist went past, stored as a time on 30/12/1899. So the exact timestamp would be DATE + TIME. Don’t see why it isn’t just stored as one?
TimeString: Time past midnight formatted as HH:MM:SS:S00
LANE: ? Seems only available for Embankment, at Blackfriars lane is just 1?
DIRECTION: north/south/east/west + bound. See below for issues
SPEED: speed in km/h (uncertainty?)
SPEED_MPH: speed in mph (uncertainty?)
CLASS_INDEX: identifier for class
CLASS: ? (what is 2N)
LENGTH: bicycle length (uncertainty?)
WHEELBASE: distance between the center of the front and the center of the back wheel? (uncertainty?)
VALIDITY: ?
STRADDLE: ?
OVERLOADED: ?
GROSS: ?
HEADWAY: ?
GAP: ?
TIME_GAP: time since last bike in milliseconds, maximum 10 minutes
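For anyone else parsing these, here is a minimal sketch of how DATE and TIME could be merged into one timestamp. It assumes DATE parses to midnight on the day and TIME parses to a datetime on 30/12/1899 (my reading of the columns, not confirmed by TfL); the function name and the sample values are made up for illustration.

```python
from datetime import datetime

# Assumption: TIME is stored as an offset from this date (my reading of the data).
TIME_BASE = datetime(1899, 12, 30)

def combine_date_and_time(date_midnight: datetime, time_on_base: datetime) -> datetime:
    """Add the time-of-day part of TIME to the midnight value in DATE."""
    return date_midnight + (time_on_base - TIME_BASE)

# Made-up example row: DATE = 23 July 2018, TIME = 18:00:05 on the base date.
stamp = combine_date_and_time(datetime(2018, 7, 23), datetime(1899, 12, 30, 18, 0, 5))
print(stamp.isoformat())
```

Subtracting the base date first means the result is still right if TIME ever carries fractional seconds.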
Also I spotted in Embankment/May there’s Tuesday, Jul 3 2018.xls which should be moved to Embankment/July
EDIT: Also realised Embankment and Blackfriars are the wrong way round and need to be swapped (i.e. all the Blackfriars ones are in the Embankment folder and vice versa).
EDIT: The Embankment data seems heavily skewed towards eastbound. Having processed a few files, I can’t seem to find one with more than 3% of journeys westbound, and I find it hard to believe for every cyclist going west there are 35 going east - unless like birds the cyclists are migrating in the summer… Also the Blackfriars data seems skewed southbound, but not as much; 70-80% heading southbound.
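In case anyone wants to check the split themselves, this is roughly how the share per direction can be computed. It is only a sketch: it assumes a DIRECTION column holding values like "Eastbound", and the sample rows below are invented, not taken from the real files.

```python
import csv
import io
from collections import Counter

def direction_shares(csv_text: str) -> dict[str, float]:
    """Return the fraction of rows per DIRECTION value (lower-cased)."""
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row["DIRECTION"].strip().lower() for row in rows)
    total = sum(counts.values())
    return {d: n / total for d, n in counts.items()} if total else {}

# Invented sample data, just to show the shape of the input.
sample = (
    "SITE_ID,DIRECTION\n"
    "4632,Eastbound\n"
    "4632,Eastbound\n"
    "4632,Eastbound\n"
    "4632,Westbound\n"
)
shares = direction_shares(sample)
print(shares)
```

For a real file, read its contents with `open(path, newline="").read()` and pass the text in.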
Also I think it would be helpful if the files were all CSV (as opposed to XLS), as it’s an open format and easier for applications to parse, and if they were named in a standardised format across the different counters, with an ISO 8601 date at the beginning.
EDIT: been thinking about this, think a sensible naming format would be: http://cycling.data.tfl.gov.uk/CycleCounters/YYYY-MM-DD-SSSS.csv where SSSS is the SITE_ID.
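To make the suggestion concrete, a small sketch of how such a name would be built. This is only my proposed scheme, not a URL pattern TfL actually serves, and the helper name is made up.

```python
from datetime import date

# Proposed (hypothetical) location; nothing is published under this pattern yet.
BASE_URL = "http://cycling.data.tfl.gov.uk/CycleCounters/"

def counter_file_url(day: date, site_id: int) -> str:
    """Build YYYY-MM-DD-SSSS.csv, where SSSS is the SITE_ID."""
    return f"{BASE_URL}{day.isoformat()}-{site_id:04d}.csv"

url = counter_file_url(date(2018, 7, 3), 4632)
print(url)
```

Names in this form sort chronologically as plain strings, which is the main reason for putting the ISO 8601 date first.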
EDIT: the files are actually CSV, but I think they have been renamed to .xls, which results in Excel mangling them. Renaming them back to .csv fixes the issue.
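If you have a folder full of them, something like this does the renaming in bulk. It is a sketch under the assumption that the downloaded files sit in one local folder; it skips anything that starts with the OLE2 signature used by genuine binary .xls files, and the function name is my own.

```python
import tempfile
from pathlib import Path

# First 8 bytes of a genuine binary .xls (OLE2 compound file).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def rename_csv_disguised_as_xls(folder: Path) -> list[Path]:
    """Rename *.xls files that are not real Excel binaries to *.csv."""
    renamed = []
    for path in sorted(folder.glob("*.xls")):
        with path.open("rb") as fh:
            if fh.read(8) == OLE2_MAGIC:
                continue  # real Excel file, leave it alone
        target = path.with_suffix(".csv")
        if target.exists():
            continue  # don't overwrite an existing CSV
        path.rename(target)
        renamed.append(target)
    return renamed

# Demo on a temporary folder with one invented file.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "Tuesday, Jul 3 2018.xls").write_text("SITE_ID,DIRECTION\n4632,Eastbound\n")
    result_names = [p.name for p in rename_csv_disguised_as_xls(folder)]
print(result_names)
```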
Once again thanks so much for publishing this data!
If, as domdomegg notes above, the Blackfriars files are in the Embankment folder, and vice versa, then there may be a date problem too.
Lots of us noted that the Embankment CSH counter tripped 1 million at 6pm on Monday 23 July. It was reported live on several social media platforms.
Taking the file from the Blackfriars folder, and assuming that SERIAL_NUMBER is the count, it doesn’t hit 1 million until 9.36 am in the file for Friday 27 July.
Is there a mis-alignment between the data in the files and observed numbers on the counters’ displays?
Yeah, at first I thought the SERIAL_NUMBER was the cumulative count, but it’s about -40000 (roughly 4 days) off.
I also wonder whether these data dumps come from some feed, or are the result of a TfL employee going to the counter, opening it up and copying some data over. Given the differing file formats and naming schemes, the mix-up between the locations and the out-of-place files, I suspect it’s the latter, which means there’s probably no way to get it live, so @cs3count can remain the source of the latest possible news.
Thanks for all your comments about the counter data. It actually comes from the supplier’s system but real time is not possible at the moment. I’ve passed your other comments on to the data owner. We are still refining this data set so it’s great to have your feedback.