Has this been improved yet? Both bus-stops.csv and bus-sequence.csv contain stops with no NAPTAN code, and also (oddly) 490000805Z is in the file twice! I can also see headings of null and some rows with carriage returns in the NAPTAN field.
Can these be corrected please? It would be terribly onerous to call the Unified API hundreds of times for this data.
The documentation states that these files are generated weekly, but the comments here suggest otherwise. Maybe a header/trailer record could be added with a date in it?
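For anyone who wants to check a downloaded copy for the problems listed above (missing NAPTAN codes, duplicated codes, and line breaks inside the NAPTAN field), here is a rough Python sketch. The function name `check_stops` is made up for illustration, and the column name `Naptan_Atco` is an assumption about the file's header, so pass the real column name if yours differs.

```python
import csv
import io
from collections import Counter


def check_stops(text, code_column="Naptan_Atco"):
    """Report missing, duplicated and line-broken codes in CSV text.

    `code_column` is an assumed header name, not confirmed by the thread.
    """
    # newline="" leaves line endings untouched so the csv module can
    # handle quoted fields that contain carriage returns.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    codes = [(row.get(code_column) or "") for row in reader]

    missing = sum(1 for code in codes if not code.strip())
    line_breaks = [code for code in codes if "\r" in code or "\n" in code]
    counts = Counter(code.strip() for code in codes if code.strip())
    duplicates = sorted(code for code, n in counts.items() if n > 1)

    return {
        "missing": missing,
        "line_breaks": line_breaks,
        "duplicates": duplicates,
    }
```

To run it against a file, read the file with `open(path, newline="")` and pass the contents in as `text`.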
Thanks @JohnSmith - we do encourage people to use the Unified API as these raw files may be deprecated in the future, but I’ll look into this issue and let you know how I get on.
Thanks @jamesevans - I am using the Unified API, but I need greater accuracy than it provides, so I am using this data to help with geolocation of bus stops.
The duplication issue is back again.
Can we please get someone to look at this issue more closely? It’s been showing up weekly for the past month now, and it can be quite time-consuming to resolve.
It seems this may be occurring when bus_stops.csv is updated: your process appears to be appending to the existing CSV rather than overwriting it.
@denis_stih As this has happened more than once, I have put in a failsafe that looks for the last header row in the file (i.e. Stop_Code_LBSL,Bus_Stop_Code) and reads from there, and I think I’ll keep doing that in case this happens again! The data does seem to be correct when read that way.
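In case it helps anyone else, a minimal Python sketch of that kind of failsafe is below. It is not the actual code behind the workaround described above, just one way it might be done: it assumes the whole file fits in memory and that the header text `Stop_Code_LBSL,Bus_Stop_Code` only ever appears at the start of a header row. The function name `rows_after_last_header` is hypothetical.

```python
import csv
import io

# Start of the header row, as quoted in the post above.
HEADER_PREFIX = "Stop_Code_LBSL,Bus_Stop_Code"


def rows_after_last_header(text, header_prefix=HEADER_PREFIX):
    """Parse only the rows that follow the last header row in `text`."""
    # rfind returns the position of the final occurrence, i.e. the start
    # of the most recently appended copy of the data.
    start = text.rfind(header_prefix)
    if start == -1:
        raise ValueError("no header row found")
    # newline="" lets the csv module deal with line endings itself.
    reader = csv.DictReader(io.StringIO(text[start:], newline=""))
    return list(reader)
```

If the file has been appended to, everything before the final header is simply ignored; if it has not, the single header is also the last one, so the result is the same as a normal read.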