Cloudflare problems today?

@jamesevans

I was attempting to answer a question in the Forum this morning between 08:01 and 08:06 and I couldn’t get a simple call to work…

https://api.tfl.gov.uk/line/mode/river-bus/status

It’s showing up in my web-browser history as “Error”. I note that at the same time I couldn’t access “HS2 shows off revised plans for Camden ventilation shaft”, and the common thing here is that they both use Cloudflare.

My client is a bit perturbed, as there is no code to deal with calls to api.tfl.gov.uk hanging: they have always returned very quickly and have never hung before.

Do I need to change my code or is this hanging an unwanted behaviour?

Thanks

Hi @briantist

We had some performance issues this morning with the Unified API. We’re aware of the root cause and we’re looking at how to prevent reoccurrence.

We don’t believe there were any issues with Cloudflare at the time.

Apologies for the inconvenience.

Thanks,
James

Hello @jamesevans
Is there any progress? I am getting complaints, as the app cannot receive anything because of this.

Hi @Ajebz

We have had a new incident this afternoon with our API Management system. We’re working with the supplier to investigate the issue and we’ll update you as soon as we can.

Thanks,
James

@jamesevans

If this helps, I can see the problem in CloudWatch…

  • server 1: starts at 2021-09-09T13:36:07.153+01:00, still going on at 2021-09-09T15:35:34.928+01:00
  • server 2: starts at 2021-09-09T14:28:30.877+01:00, still happening at 2021-09-09T15:31:21.643+01:00

I’ve put better error control on “Server 3”, but that just means I’ve seen:

Exception: Operation timed out after 10001 milliseconds with 0 bytes received in /var/www/html/xxx/TfLStatus/UrlGetContentsMemcached.php:53
Stack trace:
#0 /var/www/html/xxx/TfLStatus/UrlGetContentsMemcached.php(23): xxx\TfLStatus\UrlGetContentsMemcached->urlGetContents('https://api.tfl...')
#1 /var/www/html/xxx/TfLStatus/TflStatusSimple.php(49): xxx\TfLStatus\UrlGetContentsMemcached->cacheTFLcall('https://api.tfl...')
#2 /var/www/html/xxx/TfLStatus/TflStatusSimple.php(29): xxx\TfLStatus\TflStatusSimple->getRainbowBoardFromTfL('https://api.tfl...')
#3 /var/www/html/xxx/TfLStatus/TflStatusSimple.php(20): xxx\TfLStatus\TflStatusSimple->getSimpleArray('https://api.tfl...')
#4 /var/www/html/view/HtmlView/TflHelper.php(14): xx\TfLStatus\TflStatusSimple->getLineStatusArray()
#5 /var/www/html/view/HtmlView/CreateStaticView.php(90): view\HtmlView\TflHelper::generateTflForStation('PBL',

At 15:56 I’m still getting:
Exception: Operation timed out after 10001 milliseconds with 0 bytes received in /var/www/html/xxx/TfLStatus/UrlGetContentsMemcached.php:53

Getting multiple reports of no cams at https://www.tfljamcams.net.

@jamesevans, any ETA for resolution please?

@briantist, I store a local copy of the JSON daily.
Is it possible to query the API (PHP) and, after a timeout of n seconds, fall back to the local JSON for camera data?

Did you have anything specific in mind to counter API problems?
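For illustration, the "time out after n seconds, then fall back to the stored JSON" idea can be sketched as below. This is only a sketch in Python rather than PHP, and not anyone's actual code from this thread: the names `fetch_json` and `camera_data`, the 5-second default and the file layout are all assumptions.

```python
import http.client
import json
import urllib.request
from pathlib import Path


def fetch_json(url, timeout_s):
    """Fetch a URL and decode the body as JSON; raises on timeout, network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


def camera_data(url, local_copy, timeout_s=5.0, fetch=fetch_json):
    """Return (data, source), where source is "live" or "local".

    `local_copy` is the path of the daily stored JSON (hypothetical layout).
    `fetch` is injectable so the fallback can be exercised without a network.
    """
    try:
        return fetch(url, timeout_s), "live"
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers urllib's URLError/HTTPError and socket timeouts;
        # ValueError covers a body that is not valid JSON.
        stored = Path(local_copy).read_text(encoding="utf-8")
        return json.loads(stored), "local"
```

A call would look like `data, source = camera_data(api_url, "jamcams.json", timeout_s=3)`, where both arguments are placeholders. Note that the `timeout` passed to `urlopen` applies to each blocking socket operation, not to the request as a whole, so a slowly trickling response can take longer than n seconds overall. In PHP the equivalent knobs would be the cURL options `CURLOPT_CONNECTTIMEOUT` and `CURLOPT_TIMEOUT`.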

Hi @OldManBrook - it’s still with the supplier. It’s a PaaS cloud solution, so we’re a bit hamstrung on what we can do ourselves. It’s escalated as a critical incident with the supplier.

Thanks,

James

Always the Gentleman, thanks @jamesevans

All the best

Just checked in and my app is retrieving the feed ok.

:+1: @jamesevans @briantist @Ajebz

Thanks - we have put in a partial workaround, but there are still some services affected. We’ll update when we have a proper resolution.

Thanks,
James

@jamesevans @OldManBrook
Ah yes same here, starting to receive feeds
:slight_smile:

@jamesevans

Thanks. I know how hard these things are to sort out when the world is watching your every move, millisecond by millisecond.

I’m pleased to say the last error I had was at 17:44:02.546+01:00 and the last actual timeout was at 16:42:10.589+01:00

Fingers crossed my code changes (a 200 ms timeout on the curl requests, and translating HTTP error codes into JSON objects) will stop this breaking everything else should it happen again.
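For anyone wanting to try the same two changes, here is a rough sketch of the pattern: a short request timeout, with every failure turned into a JSON-style object instead of an exception. It is written in Python rather than PHP and is not the code described above; `get_status`, the field names and the error labels are assumptions made for illustration.

```python
import json
import socket
import urllib.error
import urllib.request


def _error(url, kind, detail, status=None):
    """Build a JSON-serialisable error object (field names are illustrative)."""
    return {"ok": False, "url": url, "error": kind, "status": status, "detail": str(detail)}


def get_status(url, timeout_s=0.2, opener=urllib.request.urlopen):
    """Fetch JSON with a short timeout; never raises for network or HTTP failures."""
    try:
        with opener(url, timeout=timeout_s) as response:
            return {"ok": True, "status": response.status, "data": json.loads(response.read())}
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 500 or 503.
        return _error(url, "http", exc.reason, status=exc.code)
    except urllib.error.URLError as exc:
        # Connection-level failure; a connect timeout arrives wrapped in URLError.
        timed_out = isinstance(exc.reason, (TimeoutError, socket.timeout))
        return _error(url, "timeout" if timed_out else "network", exc.reason)
    except (TimeoutError, socket.timeout) as exc:
        # A timeout while reading the body is raised directly.
        return _error(url, "timeout", exc)
    except OSError as exc:
        return _error(url, "network", exc)
    except ValueError as exc:
        # The body arrived but was not valid JSON.
        return _error(url, "bad_json", exc)
```

The caller can then branch on `result["ok"]` and render a "status unavailable" message rather than letting a hung request take the rest of the page down with it. The PHP cURL counterparts of the timeout are `CURLOPT_TIMEOUT_MS` and `CURLOPT_CONNECTTIMEOUT_MS`.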

@jamesevans The problem returned this morning at 08:07:17.923+01:00. Two of my servers are reporting “Operation timed out after 10001 milliseconds with 0 bytes received” when fetching https://api.tfl.gov.uk/line/mode/tube,overground,dlr,tflrail,tram/status and my browser still hasn’t returned anything from the same URL after 60 seconds.

Another server started seeing them at 07:58:45.620+01:00 and has already failed many times, and it’s only 08:25:11.132+01:00.

I’m noting the “Cloudflare Location: London” from my original posting.

Hi @briantist,

Apologies again for the inconvenience.

We’re still looking into the root cause of these API Management issues with the supplier. Unfortunately there is little we can do about them until the supplier can fix them or advise us how to fix them.

We seem to be getting them on some mornings, between around 7:30 and 8:30.

Thanks,
James

@jamesevans Thanks for the update. I will keep my fingers crossed!

We’ve put in a workaround that we believe is robust. We’re monitoring the service (as we always do) and so far so good.

@james tyvm! I’ve seen nothing in the server logs today!
