Accessing user accounts for Oyster

Hi there,

What is TFL’s view on providing endpoints that would allow innovation for customers based on their account info?

I wrote a test application before via screen scraping… and then TFL peppered their site with Captcha, which I thought wasn’t great.

Just a side note to pre-answer comments about screen scraping… it was used in banking for years before Open Banking came out, and neither the regulator nor the banks stopped it, so it can’t be as woeful as people might let on.

Hi conor, welcome to the forum.

The terms and conditions of TfL’s APIs explicitly prohibit the use of screen scraping (https://tfl.gov.uk/corporate/terms-and-conditions/transport-data-service#on-this-page-3), and whilst I don’t work for TfL, I can imagine a few reasons why they might consider that prohibition appropriate:

  • Scraping for information such as Oyster journeys, balance, etc… would require the user’s credentials. Those credentials would also let you see payment details, addresses, and so on. This is an absolute security nightmare.
  • Scraping generally incurs a greater load on web servers because of the need not only to collect/calculate data, but then “present” it to the user agent doing the scraping. Load on the backend may (should!) be the same in a decent software architecture, but the added compute power required to serve hundreds/thousands of requests that never get viewed by a user is undesirable at best, and detrimental to service at worst.
  • Relying on scraping puts you at risk of changes to page layout / design causing your scraping to stop working.
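
To illustrate that last point, here is a minimal sketch of how brittle a scraper can be. Everything in it is made up for illustration: the markup, the `balance` class name, and the `extract_balance` helper are assumptions, not anything taken from TfL’s site.

```python
from html.parser import HTMLParser


class BalanceParser(HTMLParser):
    """Collects the text inside the first <span class="balance"> element."""

    def __init__(self):
        super().__init__()
        self._in_balance = False
        self.balance = None

    def handle_starttag(self, tag, attrs):
        # The scraper is hard-wired to one tag name and one class name.
        if tag == "span" and ("class", "balance") in attrs:
            self._in_balance = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_balance = False

    def handle_data(self, data):
        if self._in_balance and self.balance is None:
            self.balance = data.strip()


def extract_balance(html):
    """Hypothetical helper: returns the balance text, or None if not found."""
    parser = BalanceParser()
    parser.feed(html)
    parser.close()
    return parser.balance


# Invented markup, before and after a hypothetical page redesign.
old_page = '<div><span class="balance">12.40</span></div>'
new_page = '<div><p class="card-balance">12.40</p></div>'

print(extract_balance(old_page))  # the hard-wired selector matches
print(extract_balance(new_page))  # same data on the page, but nothing is found
```

The data is identical in both pages; only the tag and class name changed, and that is enough for the scraper to silently return nothing.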

I take on board that you wrote a test application previously, but the fact that TfL added Captcha to the site and broke it probably wasn’t accidental :wink:

There are currently no API endpoints that I’m aware of to let you interrogate Oyster data. That said, if you felt so inclined you might want to have a read of https://blog.tfl.gov.uk/2017/09/15/how-we-built-the-tfl-customer-api/, grab yourself a packet capturing tool, and maybe even subject yourself to a man-in-the-middle attack to see traffic between the app and the APIs in the background. If you do choose to do that, bear in mind that I very strongly suspect it would be a breach of terms and conditions for someone, somewhere. I’m not a lawyer, and also not endorsing it as a technique!

If you only want to process data belonging to you, and you’re happy to wait a week between “dumps”, you might also want to check out the thread “Is there any API for accessing individual TFL customer daily journey with details?”.

HTH