Accessibility and Bus Stops - creating A-NaPTAN - we want your thoughts!

Hello lovely people who consume all sorts of nommy data,

[quick refresher - NaPTAN is the data set that has all the Bus Stops, as well as Metro, Rail, Ferry and Airport stops - it is a National data set]

At NaPTAN we are working on getting Accessibility NaPTAN working - we did a technical proof of concept in the first quarter of this year, and we have proved we can infer accessibility information from Street Furniture and publish that linked to the NaPTAN data.

As consumers of the data, and potentially those who will share this information out to Passengers, we’d love to engage with you and get your thoughts on what we are doing, and ensure that all the small #fails and edge cases are captured.

I’ve worked on NaPTAN long enough to know the edge cases always catch you out!

In the April public meetings we talked about the roadmap for the Accessibility work – the short version is we are creating three things first:

  • New Olympic Data – a data set giving basic accessibility information about the bus stops on high frequency routes for all ATCO Codes

  • Data pipeline for Ingesting, holding, inferring and publishing A-NaPTAN

  • CMS for publishing data provenance, inferences, and schemas

The next public meetings are focussed on some Group User Research activities to help us with this work, and set up the rest of the work.

We have a Minimum Valuable Product:

Our MValueP is a bus stop dataset for some LAs that gives minimum (and probabilistic) information about accessibility from a wheelchair user lens. It is served through our extensible data pipeline via an API that a data consumer can use, and the dataset is documented on the NaPTAN site.

If you would like to be involved - there are a few ways:
We have a monthly Newsletter:
You can follow us on EventBrite:
You can also watch all of the previous meetings on YouTube:

Let me know if you are interested in getting involved ([email protected])

We are working with Leon and Gerard from TfL - I wanted to start to reach out to other consumers of the data - collecting edge cases is my thing!


Dr J


I am intrigued by the reference to inferred/probabilistic information on a stop. Does that mean something like “given that we know that the stop has features A, B and C there is an X% probability that it is wheelchair accessible”? If so, is the X% going to be included in what is shown to users? The worst thing would be for a stop to be labelled as accessible with no qualification - then when you get there you find it is not.


That’s a really good question. I’ll try to answer it here in text - it might be best to have a look at some of the videos about Accessibility in the YouTube channel -

We are using the street furniture to infer the accessibility of a stop. Using User Research we have created a MoSCoW for our passenger personas - e.g. Brenda uses a wheelchair and wants to travel to see her friend Beverly in her new house.

For Brenda we know she will need a Kerb that a bus can “kneel” to and deploy a ramp - so a stop that does not have a clear kerb would not be accessible.

We also know the stops which are on main trunk routes (High Frequency of buses) are more likely to have the money spent on them and are more likely to have the infrastructure on the street.

We need to build a data set which can cover the country and provide some information while we work with the 148 Local Transport Authorities (LTA) to get the street furniture data for their local authority - and we know this will take quite some time. Not every LTA is TfL!

To build this data set - we are working on finding the “trunk” stops (those very high frequency stops) - and then we are sampling those and assessing the street furniture. When we have about an 80% chance of seeing the street furniture we expect, we can confidently say to a passenger - these stops are very likely to be accessible and meet your minimal needs (in the case of Brenda this is a Kerb and a Hard Surface).
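A minimal sketch of the sampling step described above, assuming a hypothetical `SampledStop` record and treating the 80% figure as a simple threshold on the share of sampled trunk stops that have both a kerb and a hard surface. The names, the ATCO codes and the exact decision rule are illustrative assumptions, not the NaPTAN team's pipeline.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SampledStop:
    """One surveyed trunk stop (hypothetical structure)."""
    atco_code: str
    has_kerb: bool
    has_hard_surface: bool


def share_meeting_needs(sample: Sequence[SampledStop]) -> float:
    """Fraction of sampled stops with both a kerb and a hard surface."""
    if not sample:
        raise ValueError("sample must not be empty")
    meeting = sum(1 for s in sample if s.has_kerb and s.has_hard_surface)
    return meeting / len(sample)


def likely_accessible(sample: Sequence[SampledStop], threshold: float = 0.8) -> bool:
    """True when the observed share reaches the assumed threshold."""
    return share_meeting_needs(sample) >= threshold


# Made-up sample: four of five stops have the expected street furniture.
sample = [
    SampledStop("HYPO0001", True, True),
    SampledStop("HYPO0002", True, True),
    SampledStop("HYPO0003", True, False),
    SampledStop("HYPO0004", True, True),
    SampledStop("HYPO0005", True, True),
]

print(share_meeting_needs(sample), likely_accessible(sample))
```

A real version would probably want a confidence interval around the observed share rather than a bare proportion, since a small sample can reach 80% by chance.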

We are working with a TRUE/FALSE/NULL for each Passenger Persona - NULL means we have no information. FALSE means we have data that tells us the stop is not going to be accessible. TRUE means we have data that tells us the stop meets their access needs.

The probabilistic data is no longer used once we have actual street furniture data.
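As a rough illustration of the TRUE/FALSE/NULL flag and the "actual data replaces probabilistic data" rule, here is a sketch using Python's `True`/`False`/`None`. The function names are hypothetical, and the handling of partially surveyed stops (kerb known, surface unknown) is an assumption rather than anything the team has confirmed.

```python
from typing import Optional


def brenda_from_survey(
    has_kerb: Optional[bool], has_hard_surface: Optional[bool]
) -> Optional[bool]:
    """Wheelchair-user flag from surveyed street furniture.

    None means that item has not been surveyed.
    """
    if has_kerb is False or has_hard_surface is False:
        return False  # a known missing feature rules the stop out
    if has_kerb is True and has_hard_surface is True:
        return True   # both minimal needs are confirmed
    return None       # not enough surveyed data to say either way


def published_flag(surveyed: Optional[bool], inferred: Optional[bool]) -> Optional[bool]:
    """Surveyed data always wins; the inferred value is only a fallback."""
    return surveyed if surveyed is not None else inferred


# A stop with no survey yet falls back to the inferred value...
print(published_flag(brenda_from_survey(None, None), inferred=True))
# ...but a surveyed stop without a kerb is FALSE regardless of the inference.
print(published_flag(brenda_from_survey(False, True), inferred=True))
```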

Does this make sense?

This is very good for me to try and answer - as one of the things we want to do is clearly explain the provenance, inference and governance of all the data. Your feedback will really help us develop the right level of information and context.


Dr J

The derivation certainly makes sense. What I would worry about - and I am probably misunderstanding the intention here - is presenting probabilistic data as TRUE/FALSE. If it is a highly informed guess, then third-party data users need to understand that and cover that subtlety in a way which conveys it to the end user. What third parties do may not be entirely in your control.

I’m aware that the best (in this case covering every base) can be the enemy of the good and that people want simplicity and aren’t always very good with probabilities. Good luck, anyway.

I wonder whether having five options (FALSE, PROBABLY_FALSE, NULL, PROBABLY_TRUE, TRUE) would solve this problem (with the PROBABLY_ indicating that the information is based on a probabilistic model rather than concrete data).
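To make the five-option idea concrete, here is a small sketch. The enum values are the ones suggested above; the `classify` helper and its `inferred` parameter are hypothetical names used only for illustration.

```python
from enum import Enum
from typing import Optional


class Accessibility(Enum):
    FALSE = "FALSE"
    PROBABLY_FALSE = "PROBABLY_FALSE"
    NULL = "NULL"
    PROBABLY_TRUE = "PROBABLY_TRUE"
    TRUE = "TRUE"


def classify(value: Optional[bool], inferred: bool) -> Accessibility:
    """Map a tri-state value plus its origin onto the five options."""
    if value is None:
        return Accessibility.NULL
    if inferred:
        return Accessibility.PROBABLY_TRUE if value else Accessibility.PROBABLY_FALSE
    return Accessibility.TRUE if value else Accessibility.FALSE


print(classify(True, inferred=True).value)   # from the probabilistic model
print(classify(True, inferred=False).value)  # from surveyed street furniture
```

One consequence of this design is that a consumer who only understands TRUE/FALSE has to decide explicitly what to do with the PROBABLY_ values, rather than silently showing them as certain.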

Or there could otherwise be a separate property that describes whether the data is based on probability or concrete data.
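The separate-property alternative might look something like the sketch below: the flag stays TRUE/FALSE/NULL and a second field says where it came from. The record shape, the field names (`persona`, `accessible`, `basis`) and the values "surveyed" and "inferred" are all assumptions for illustration, not a proposed A-NaPTAN schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

ALLOWED_BASES = ("surveyed", "inferred")


@dataclass(frozen=True)
class PersonaAccess:
    """Accessibility flag for one persona at one stop (hypothetical shape)."""
    persona: str
    accessible: Optional[bool]  # True / False / None (no information)
    basis: Optional[str]        # "surveyed", "inferred", or None when unknown

    def __post_init__(self) -> None:
        # A flag with no information cannot have a basis, and a known
        # flag must say whether it was surveyed or inferred.
        if self.accessible is None and self.basis is not None:
            raise ValueError("basis must be None when accessible is None")
        if self.accessible is not None and self.basis not in ALLOWED_BASES:
            raise ValueError("basis must be 'surveyed' or 'inferred'")


record = PersonaAccess(persona="wheelchair_user", accessible=True, basis="inferred")
payload = json.dumps(asdict(record))
print(payload)
```

Compared with five enum values, this keeps existing TRUE/FALSE/NULL consumers working unchanged, at the cost that they can ignore `basis` entirely.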

Hi Leon, that’s an excellent idea - I will take it to the team and present in the next public meeting (hope to see you there …)

We are thinking about how to demonstrate the “we know we’re likely to be right…” version.