StopPoint Arrivals Old Response Via Code (C#)

Hello,

Navigating to:
https://api.tfl.gov.uk/StopPoint/490001341G/Arrivals?app_id=xxx&app_key=xxx

Returns the following:

$type: "Tfl.Api.Presentation.Entities.Prediction, Tfl.Api.Presentation.Entities",
id: "654406327",
operationType: 1,
vehicleId: "SN13CGZ",
naptanId: "490001341G",
stationName: "Woodgrange Park Station",
lineId: "425",
lineName: "425",
platformName: "G",
direction: "outbound",
bearing: "243",
destinationNaptanId: "",
destinationName: "Clapton, Nightingale Road",
timestamp: "2020-04-27T13:51:42.1974035Z",
timeToStation: 522,
currentLocation: "",
towards: "Forest Gate",
expectedArrival: "2020-04-27T14:00:24Z",
timeToLive: "2020-04-27T14:00:54Z",
modeName: "bus",
timing: {
$type: "Tfl.Api.Presentation.Entities.PredictionTiming, Tfl.Api.Presentation.Entities",
countdownServerAdjustment: "00:00:05.2251553",
source: "2020-04-26T16:32:19.313Z",
insert: "2020-04-27T13:51:58.881Z",
read: "2020-04-27T13:52:04.101Z",
sent: "2020-04-27T13:51:42Z",
received: "0001-01-01T00:00:00Z"
}

Note that the timestamp is for 2020-04-27.

However, if I perform the same request via HttpClient in C#, I receive a response with timestamps of 2020-04-24:

{
"$type": "Tfl.Api.Presentation.Entities.Prediction, Tfl.Api.Presentation.Entities",
"id": "1395977545",
"operationType": 1,
"vehicleId": "SK19FCJ",
"naptanId": "490001341G",
"stationName": "Woodgrange Park Station",
"lineId": "25",
"lineName": "25",
"platformName": "G",
"direction": "inbound",
"bearing": "243",
"destinationNaptanId": "",
"destinationName": "City Thameslink",
"timestamp": "2020-04-24T14:52:23.4782183Z",
"timeToStation": 127,
"currentLocation": "",
"towards": "Forest Gate",
"expectedArrival": "2020-04-24T14:54:30Z",
"timeToLive": "2020-04-24T14:55:00Z",
"modeName": "bus",
"timing": {
	"$type": "Tfl.Api.Presentation.Entities.PredictionTiming, Tfl.Api.Presentation.Entities",
	"countdownServerAdjustment": "00:00:04.8854978",
	"source": "2020-04-24T12:37:46.33Z",
	"insert": "2020-04-24T14:52:56.768Z",
	"read": "2020-04-24T14:53:01.636Z",
	"sent": "2020-04-24T14:52:23Z",
	"received": "0001-01-01T00:00:00Z"
}

}

I can correct the issue in code by adding a cookie to the request (copied from the request sent via the browser). However, this is not an ideal solution.

Curiously, it only appears to happen to a small set of the stop codes that I am using (there may be more that I don’t know of):
Norwood Junction Station:
490001216E
490001216P
490010448S
490001216T

Woodgrange Park Station:
490001341G
490001341H

Could someone please investigate why I am receiving out-of-date information?

Kind Regards,

Ywain

Hi Ywain,

I’m working in C# on .NET Core 3.1 so happy to try and help from a code perspective.

It sounds to me like there’s some client-side caching going on. If you’re adding a cookie, that might be enough to invalidate the caching. Can you post a sample of your code showing your use of HttpClient? I’m using RestSharp for my implementation which hides away some of the implementation detail but happy to try your code and see if I can spot anything.

HTH
Daniel

Welcome @Ywain

This sounds suspiciously like a caching issue, which might be HttpClient being “helpful”.

You might want to try adding a random extra parameter to your calling URI which would defeat any page caching.

I trust you have compared your results with https://api.tfl.gov.uk/swagger/ui/index.html?url=/swagger/docs/v1#!/StopPoint/StopPoint_Arrivals ?

Hi @originalarkus
Thank you for the assist.

var url = "https://api.tfl.gov.uk/StopPoint/490001341G/Arrivals?app_id=xxx&app_key=xxx";
var response = await new HttpClient().GetAsync(url);
response.EnsureSuccessStatusCode();
var json = await response.Content.ReadAsStringAsync();

I’m using .NET Standard 2.0. It does seem like some caching is taking place at the endpoint.

I’m not sure the cookie ‘solution’ would be an appropriate approach, as I don’t know how long it would last before going stale and causing the responses to revert to the old data.

Hope this helps.

Hi @briantist ,

Your suggestion works :slight_smile: It does indeed invalidate the caching.

The new url looks like:
https://api.tfl.gov.uk/StopPoint/490001341G/Arrivals?app_id=xxx&app_key=xxx&t=637235972516073718

t being

DateTime.UtcNow.Ticks
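
For anyone wanting to reproduce this, here is a minimal sketch of the cache-busting approach described above. The class and method names (CacheBuster, WithTicks) are made up for illustration, and the helper assumes the URL already carries a query string, as the StopPoint URL in this thread does.

```csharp
using System;
using System.Globalization;

public static class CacheBuster
{
    // Appends a throwaway "t" parameter holding the tick count, so that
    // every request URL is unique and cannot match a cached entry.
    // Assumption: the URL already contains a "?" query string, so the
    // new parameter is joined with "&".
    public static string WithTicks(string url, DateTime utcNow)
    {
        string ticks = utcNow.Ticks.ToString(CultureInfo.InvariantCulture);
        return url + "&t=" + ticks;
    }
}
```

It could be called as CacheBuster.WithTicks("https://api.tfl.gov.uk/StopPoint/490001341G/Arrivals?app_id=xxx&app_key=xxx", DateTime.UtcNow) just before the request is sent.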

Thank you very much for your suggestion.


Hi Ywain,

For sure, the cookie isn’t ideal but it does prove there’s a caching issue.

I’ve had a look and a test around and it looks to me that HttpClient is behaving exactly as it should do but unfortunately not in the manner we want it to. If you have a look at the Headers returned from the API call, you’ll see “Age” and “Cache-Control”. Age is ticking up second by second and currently sits at around 272287 seconds (as I write this), and the Cache-Control header specifies a cache duration of 604800 seconds. This seems like a huge amount of time. When I test with some other NaptanIds, or with random data to bypass the cache, I get Cache-Control durations of 30 seconds instead. This appears, to me, to be a bug on TfL’s side. You can verify this behaviour yourself by adding the following somewhere after response has been populated:

foreach (var header in response.Headers)
    Console.WriteLine($"{header.Key}:  {string.Join(",", header.Value)}");
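
If you would rather check the two headers in code than read them off the console, something like the following might do. It is only a sketch: the CacheHeaders class and LooksStale method are hypothetical names, and the threshold you pass in is your own choice (30 seconds would match the Cache-Control duration seen on the healthy stops).

```csharp
using System;
using System.Net.Http;

public static class CacheHeaders
{
    // Returns true when the response carries an Age header larger than the
    // threshold we are prepared to accept. A response without an Age header
    // is treated as fresh.
    public static bool LooksStale(HttpResponseMessage response, TimeSpan maxAcceptableAge)
    {
        TimeSpan? age = response.Headers.Age;
        return age.HasValue && age.Value > maxAcceptableAge;
    }

    // Returns the max-age advertised in Cache-Control, or null if absent.
    public static TimeSpan? AdvertisedMaxAge(HttpResponseMessage response)
    {
        return response.Headers.CacheControl?.MaxAge;
    }
}
```

With the values quoted above (Age of 272287 seconds), LooksStale(response, TimeSpan.FromSeconds(30)) would flag the response.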

You can, as @briantist suggested, add a constantly changing variable to the URI you call, but my personal preference would be to avoid this in the long term as it defeats the purpose of the Cloudflare cache. That said, if the cache is broken it has defeated itself anyway…

Not sure how to flag someone at TfL to make them aware of this… @briantist you seem to know people! Have you any suggestions who we can reach out to?

It’s also worth noting that depending on which version of .NET Framework or .NET Core you’re compiling against, you may see a large number of not-quite-closed TCP connections if you create a new HttpClient for each request. May be worth checking out https://softwareengineering.stackexchange.com/questions/330364/should-we-create-a-new-single-instance-of-httpclient-for-all-requests if you haven’t already :slight_smile:
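
As a rough illustration of the shared-instance pattern, here is one way it might look. The TflApi class and GetJsonAsync method are hypothetical names, not part of any TfL library, and a long-running service may want to consider IHttpClientFactory instead.

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public static class TflApi
{
    // A single HttpClient reused for every request, so the underlying
    // connections are pooled rather than opened and abandoned each time.
    private static readonly HttpClient Client = new HttpClient();

    // Fetches the given URL and returns the response body as a string.
    // Throws HttpRequestException on a non-success status code.
    public static async Task<string> GetJsonAsync(string url)
    {
        using (HttpResponseMessage response = await Client.GetAsync(url))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

The earlier snippet would then become var json = await TflApi.GetJsonAsync(url); with no new HttpClient() at the call site.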

Hope this helps.

Thank you very much for your time and effort on this. Will definitely be making some adjustments to the code based on your suggestion.

Indeed, my suggestion was a diagnostic one! Like those don’t always end up in the final code :wink:

I usually ask @jamesevans to join in.

I just found a similar one in my code that has been there 8 weeks now…! So easily done!

Slightly unrelated but I’ve found the /mode/{mode}/arrivals endpoint doesn’t permit cachebusters in the URL! I’ve tried a few parameter names, but this example with a “random” value for “t” returns a 404…

https://api.tfl.gov.uk/Mode/bus/Arrivals&t=1234567890
{"$type":"Tfl.Api.Presentation.Entities.ApiError, Tfl.Api.Presentation.Entities","timestampUtc":"2020-05-14T00:33:53.9751627Z","exceptionType":"EntityNotFoundException","httpStatusCode":404,"httpStatus":"NotFound","relativeUri":"/Mode/bus/Arrivals&t=1234567890","message":"Resource not found: http://api:8001/Mode/bus/Arrivals&t=1234567890"}

We’ve been rumbled!

I think you will find that the first parameter is always a “?” in a properly formed URL, not “&” as you have used… :slight_smile:

https://api.tfl.gov.uk/Mode/bus/Arrivals?t=1234567890
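
To avoid this kind of slip in code, a small helper can pick the separator. This is just a sketch with made-up names (QueryStrings, Append), it ignores URLs containing a "#" fragment, and it only deals with forming the URL, not with whether a given endpoint accepts the extra parameter.

```csharp
using System;

public static class QueryStrings
{
    // Appends name=value to the URL, using "?" for the first parameter
    // and "&" for any subsequent one. Both parts are percent-encoded.
    // Assumption: the URL has no "#fragment" part.
    public static string Append(string url, string name, string value)
    {
        string separator = url.Contains("?") ? "&" : "?";
        return url + separator + Uri.EscapeDataString(name) + "=" + Uri.EscapeDataString(value);
    }
}
```

For example, QueryStrings.Append("https://api.tfl.gov.uk/Mode/bus/Arrivals", "t", "1234567890") produces the "?" form shown above.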

Sorry, I should’ve pointed out I tried both. 404 with a ? too :frowning:

I guess you could change it from a GET to a POST and change a post parameter? Does it support POST?