Weblog entry #263 for simonw
How to detect broken browser caching behaviour?
Posted by simonw on Sun 6 Jul 2008 at 00:10
Looking at the logs from a load peak on one of our web servers, I found a single client that seems to have requested the same files many times over. One common graphic was requested 800+ times in a 2 hour period (I'd have expected once or twice), and every one of those requests got an HTTP 304 response. Many other graphics were similarly requested multiple times.
Clearly this browser has no local caching, or has caching disabled. The User-Agent string suggests IE7, and the user otherwise appears to be using the web site as expected.
I doubt this user alone is the cause of the load peak, but it would seem useful to identify this user and tell him that his browser settings are suboptimal, for both his benefit and ours. The performance of our site must suck for this user, even when everything is working well server side.
I'm sure it isn't hard to invent our own test for this, but it also struck me that we aren't going to be the first people in the world to want one. My attempts to search for such a test failed miserably.
Anyone seen this done before?
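For the log side of such a test, here is a minimal sketch in Python. It assumes the Apache/nginx "combined" log format, and the function name and the threshold of 50 are made up for illustration. It counts how often each client received a 304 for the same path and reports the pairs at or above the threshold.

```python
import re
from collections import Counter

# Assumption: "combined" access log format. Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def repeated_revalidations(lines, threshold=50):
    """Return {(ip, path): count} for clients that received `threshold`
    or more 304 responses for the same path."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status") == "304":
            counts[(m.group("ip"), m.group("path"))] += 1
    return {key: n for key, n in counts.items() if n >= threshold}
```

Passing an open log file (or any iterable of lines) to repeated_revalidations() gives the suspect client/file pairs. Several users behind one proxy or NAT address would be lumped together, so the output is a list of candidates to look at rather than proof of a broken browser.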
Comments on this Entry
Posted by Anonymous (24.6.xx.xx) on Sun 6 Jul 2008 at 21:18
Have you checked to see what your server sends out?
It may not be the client that's broken, but the server.
There are many methods of setting cache rules - Expires headers, max-age, ETag, etc. If your server is setting these incorrectly (e.g. too low a max-age) then the browser will re-validate the content on each hit.
There are a number of utilities that will show you the headers. I'm sure you're familiar with at least some of them.
You can also test your site in IE7 using something like http://webpagetest.org/ - a free site that tests any URL in IE7 and shows you the results, both for a first-time load and subsequent loads so you can test cache effectiveness.
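As a rough illustration of what to look for in those headers, the sketch below (Python; the function name is made up, and it deliberately covers only Cache-Control max-age, no-cache/no-store, and Expires) works out how long a response may be reused before the browser has to re-validate it.

```python
import re
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers):
    """Seconds a response may be reused without revalidation, or None if
    the server sent no explicit expiry information.

    `headers` is a plain dict of response headers; names are matched
    case-insensitively.
    """
    h = {k.lower(): v for k, v in headers.items()}
    cache_control = h.get("cache-control", "").lower()

    if "no-store" in cache_control or "no-cache" in cache_control:
        return 0  # every use has to go back to the server
    m = re.search(r"max-age=(\d+)", cache_control)
    if m:
        return int(m.group(1))  # max-age wins over Expires
    if "expires" in h and "date" in h:
        try:
            delta = parsedate_to_datetime(h["expires"]) - parsedate_to_datetime(h["date"])
        except (TypeError, ValueError):
            return 0  # an unparseable date is treated as already expired
        return max(0, int(delta.total_seconds()))
    return None
```

A result of 0 means the browser is expected to re-validate on every hit. A result of None means the server left it to the browser's own heuristics, which is where clients are most likely to differ from one another.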
We have about 20,000 users a week on that application, and we see this behaviour on only a handful of browsers.
The server could put out more explicit and longer expiry information, but it seems to me that something is very broken if a client requests the same image 800 times in 2 hours when that image isn't explicitly set to not cache.
Well, at least it's sending an If-None-Match or If-Modified-Since request (hence the 304 response). It's more of a problem if you have to actually ship the data; or, worst of all I think, if it caches something that it shouldn't have.
Anyway, I think you need to snoop exactly what the requests look like. Maybe there's some sort of proxy at that address?
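To make the 304 mechanics concrete, here is a simplified sketch in Python of the decision a server makes for a conditional request. The function name and arguments are made up for illustration, and real servers handle more cases than this.

```python
from email.utils import parsedate_to_datetime

def _strip_weak(tag):
    """Drop a leading W/ so weak and strong tags compare equal."""
    return tag[2:] if tag.startswith("W/") else tag

def conditional_status(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200.

    `etag` is the resource's current entity tag (a quoted string) and
    `last_modified` its current Last-Modified value as an HTTP date.
    """
    h = {k.lower(): v for k, v in request_headers.items()}

    if "if-none-match" in h:
        # When both validators are sent, If-None-Match takes precedence.
        candidates = [_strip_weak(t.strip()) for t in h["if-none-match"].split(",")]
        if "*" in candidates or _strip_weak(etag) in candidates:
            return 304
        return 200

    if "if-modified-since" in h:
        try:
            since = parsedate_to_datetime(h["if-modified-since"])
            unchanged = parsedate_to_datetime(last_modified) <= since
        except (TypeError, ValueError):
            return 200  # ignore an unusable date and send the full response
        return 304 if unchanged else 200

    return 200  # unconditional request: the body has to be shipped
```

Either validator produces a 304 in the log, so the status code alone can't tell you which one the client sent. That is why capturing the actual request headers is the useful next step.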