BikePoint data quality

gebn · August 23, 2023, 9:13pm

Hi folks, I’ve written a Prometheus exporter for the BikePoint API to track availability over time. The idea was to potentially raise notifications when a given station is below a threshold of available bikes or docks.
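For context, the threshold check could be sketched roughly as below. This is a minimal illustration, not the exporter's actual code. It assumes each BikePoint place object carries `NbBikes` and `NbEmptyDocks` entries in its `additionalProperties` list; the function names, the sample payload and the default thresholds are all hypothetical.

```python
from typing import Any

# Property keys assumed to be present in a BikePoint place object.
WANTED_KEYS = {"NbBikes", "NbEmptyDocks", "NbDocks"}


def availability(place: dict[str, Any]) -> dict[str, int]:
    """Extract bike and dock counts from one place object."""
    return {
        prop["key"]: int(prop["value"])
        for prop in place.get("additionalProperties", [])
        if prop.get("key") in WANTED_KEYS
    }


def below_threshold(place: dict[str, Any], min_bikes: int = 2, min_docks: int = 2) -> list[str]:
    """Return which resources ("bikes", "docks") are below their threshold.

    A count that is missing from the payload is treated as unknown and
    does not trigger a notification.
    """
    counts = availability(place)
    low = []
    if "NbBikes" in counts and counts["NbBikes"] < min_bikes:
        low.append("bikes")
    if "NbEmptyDocks" in counts and counts["NbEmptyDocks"] < min_docks:
        low.append("docks")
    return low


# Hypothetical sample payload, trimmed to the fields used above.
sample = {
    "id": "BikePoints_1",
    "commonName": "Example Street",
    "additionalProperties": [
        {"key": "NbBikes", "value": "1"},
        {"key": "NbEmptyDocks", "value": "17"},
        {"key": "NbDocks", "value": "19"},
    ],
}
```

With a check like this, a station flapping between a current and a stale count would raise and clear the same notification repeatedly, which is why the data quality matters here.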

Unfortunately, sometimes station data does this:

My hypothesis is that this happens when a subset of backend processes stops receiving new data, causing them to serve the same values over and over again. From this point on, a given API call randomly receives either current or stale numbers, causing the oscillation when graphed over time. This would explain why the “wrong” value is a constant equal to the correct value at the time the pathological behaviour starts.
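To make the hypothesis concrete, here is a small simulation of it. It is only a sketch of the suspected mechanism, not a claim about how the backend is built: the function names, the fraction of requests that hit a stale process and the sample values are all made up for illustration.

```python
import random


def simulate(true_values, stale_from, stale_fraction=0.5, seed=0):
    """Simulate polling an API where some backends freeze at index `stale_from`.

    Before `stale_from`, every response is current. From then on, each
    request has a `stale_fraction` chance of reaching a frozen backend,
    which keeps returning the value that was correct at `stale_from`.
    """
    rng = random.Random(seed)
    frozen = true_values[stale_from]
    observed = []
    for t, current in enumerate(true_values):
        hit_stale = t >= stale_from and rng.random() < stale_fraction
        observed.append(frozen if hit_stale else current)
    return observed


def distinct_wrong_values(true_values, observed):
    """Set of observed values that disagree with the truth at the same time."""
    return {o for t, o in zip(true_values, observed) if o != t}


# Hypothetical station that steadily empties; backends freeze at index 2.
truth = [10, 9, 8, 7, 6, 5, 4, 3]
observed = simulate(truth, stale_from=2)
wrong = distinct_wrong_values(truth, observed)
```

Whatever the random draws are, `wrong` can only ever contain the single frozen value (8 here), which matches the constant “wrong” value in the graphs.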

I’m seeing the same results when querying the API from two different data centres in London. The behaviour also correlates with slightly lower p99 response times, which I suspect are brought down by the requests returning bad values. Perhaps the backend is reading from a fast but stale cache rather than retrieving from the source:

Any help would be much appreciated! I’m happy to run experiments.

jamesevans · August 24, 2023, 11:13am

Hi @gebn - thanks for highlighting. We’ll take a look into this.

Thanks,
James

gebn · August 26, 2023, 10:29pm

Thanks. In case it helps, these are the periods where the behaviour has happened over the past 12 weeks, using the number of changes in bikes available for one station as a heuristic:

2023-06-19 17:23 - 2023-06-23 09:22
2023-06-26 16:12 - 2023-06-28 15:18
2023-06-29 06:02 - 2023-06-29 17:25
2023-06-30 05:39 - 2023-07-05 13:54
2023-08-09 05:34 - 2023-08-09 17:29
2023-08-10 05:44 - 2023-08-15 17:24
2023-08-22 13:01 - 2023-08-24 16:08

All times UTC.
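The heuristic behind the list above can be sketched along these lines. It is a rough illustration rather than the exact query used: the one-hour window and the limit of 20 changes are hypothetical thresholds, and `samples` is assumed to be a time-ordered list of `(timestamp, bikes_available)` pairs for a single station.

```python
from datetime import datetime, timedelta, timezone


def count_changes(samples):
    """Number of times the value differs from the previous sample."""
    values = [value for _, value in samples]
    return sum(1 for a, b in zip(values, values[1:]) if a != b)


def flapping_windows(samples, window=timedelta(hours=1), max_changes=20):
    """Return the start times of windows with more than `max_changes` changes.

    Samples are grouped into fixed windows measured from the first
    timestamp; changes across a window boundary are not counted.
    """
    if not samples:
        return []
    start = samples[0][0]
    buckets = {}
    for ts, value in samples:
        index = (ts - start) // window  # integer window number
        buckets.setdefault(index, []).append((ts, value))
    return [
        start + index * window
        for index, bucket in sorted(buckets.items())
        if count_changes(bucket) > max_changes
    ]
```

A healthy station rarely changes its bike count dozens of times an hour, so a window that does is a reasonable sign that current and stale values are alternating.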

jamesevans · September 6, 2023, 3:44pm

Hi @gebn - we found some intermittent errors in our BikePoint ETL task. We believe the caching layer in front of the source file is causing issues. We’ve bypassed this layer in our dev environment and it seems to be a lot more reliable.

We’ll look to get this into production next week and I’ll give you a further update then.

Thanks,
James