
Zenphoto plugin: HTTP Cache Control

httpCacheControl is a plugin that makes your Zenphoto pages cacheable in order to increase site performance and minimize resource usage.

This is a beta release. httpCacheControl has been tested on my private Zenphoto installation but not on a “live” site. Please send me any feedback, suggestions, improvements, criticisms.

Making pages cacheable by browsers, proxies, and HTTP accelerators can significantly increase your site’s performance and cut wasted resources (CPU cycles, memory, bandwidth, time). While caching can be complicated for sites with time-critical or user-sensitive information, it is a great performance booster with hardly any side effects for relatively static sites. Zenphoto is one such application, as it primarily serves static images.

Even if you cannot allow the possibility of visitors ever seeing stale content, this plugin can improve your site’s performance. httpCacheControl can make your pages cacheable but require that visitors revalidate their cache on every single visit. This plugin can determine if the requested page changed since the last visit using a fraction of the resources it would take to process, output, and transmit the entire page. Thus, within a few milliseconds, httpCacheControl can tell the visitor, “this page hasn’t changed since your last visit; use your cached copy,” or “this page has changed; I will now send you a fresh copy.”

httpCacheControl is designed to be used in the album.php, albumarchive.php, image.php, and index.php files of your theme; in other words, in the ZP_ALBUM, ZP_IMAGE, and ZP_INDEX contexts. The function doConditionalGet() is required and is included in http_cache_control.php. (You can read about doConditionalGet() here and download it separately.)

Example Usage:
Insert the following at the top of index.php.

<?php
include_once('http_cache_control.php');
// get mtime of this file, cacheable by all, fresh for 1 day
httpCacheControl(__FILE__, 'public,max-age=86400');
?>

This will make your index.php cacheable by all and considered fresh for 1 day. After 1 day, the cache will revalidate with Zenphoto and get a fresh copy if needed.

The first required parameter is the path to the file that should be used to calculate Last Modified date, usually “__FILE__”.

The second, optional parameter is the string of valid cache directives for the “Cache-Control” header; the value of max-age, if present, will automatically be used to set the “Expires:” header. The parameter defaults to “public,must-revalidate”, which makes the page cacheable by anyone while requiring caches to revalidate rather than serve stale content. See RFC2616#14.9 for a list of possible cache directives.
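To illustrate how an “Expires:” value could be derived from max-age, here is a minimal sketch. It is not taken from http_cache_control.php; the function names parseMaxAge() and expiresHeaderValue() are hypothetical, and the plugin's actual parsing may differ.

```php
<?php
// Hypothetical helper, not part of the plugin.
// Returns the max-age value in seconds, or null if the directive is absent.
function parseMaxAge(string $cacheControl): ?int
{
    if (preg_match('/(?:^|,)\s*max-age\s*=\s*(\d+)/i', $cacheControl, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper, not part of the plugin.
// Builds an Expires header value (HTTP date in GMT) for the given request
// time, or returns null when the directives contain no max-age.
function expiresHeaderValue(string $cacheControl, int $now): ?string
{
    $maxAge = parseMaxAge($cacheControl);
    if ($maxAge === null) {
        return null;
    }
    return gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT';
}
```

With 'public,max-age=86400' and the current time, expiresHeaderValue() yields a date one day ahead; with the default 'public,must-revalidate' it returns null, so no Expires value would be derived.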

How it works:
This plugin is based on the idea that only a finite number of easily accessible objects can cause Zenphoto pages to change; they are listed below. If none of these objects has changed, chances are that the output of the current PHP page has not changed either.

  • [1] this file, http_cache_control.php (changes in cache control behavior)
  • [2] the file indicated in the first parameter of the function call httpCacheControl($file), generally the theme file that called this function (theme changes)
  • [3] the “options” table in the database (configuration changes)
  • [4] the Gallery directory, Album directory, or Parent directory of the current image depending on the current context (new image or album uploaded)
  • [5] the comments of the current page

Note that checking if these objects have changed is much faster than executing the entire PHP page and checking if the output of the page has changed. Thus, even if you instructed caches to revalidate on every request (Cache-Control: no-cache), this plugin will improve your site’s performance because execution of the entire script is avoided if the given page did not change.

This plugin generates the Last Modified date and ETag of a page from the factors listed above, and compares them with the If-Modified-Since and If-None-Match request headers sent by the client to determine whether Zenphoto should return a “304 Not Modified” header or serve up a full page.
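The comparison described above can be sketched as a pure function. This is an illustration only, not the code shipped in doConditionalGet(): the name isNotModified() is hypothetical, weak ETags (W/"...") are not handled, and requiring both validators to match when both are sent is an assumption based on the RFC2616 #13.3.4 note in the changelog.

```php
<?php
// Hypothetical sketch: decides whether a "304 Not Modified" would be
// appropriate. $ifModifiedSince and $ifNoneMatch are the raw request
// header values, or null when the client did not send them.
function isNotModified(
    int $lastModified,
    string $etag,
    ?string $ifModifiedSince,
    ?string $ifNoneMatch
): bool {
    // An unconditional request always gets the full page.
    if ($ifModifiedSince === null && $ifNoneMatch === null) {
        return false;
    }
    // If-None-Match may carry several comma-separated tags, or "*".
    if ($ifNoneMatch !== null) {
        $candidates = array_map('trim', explode(',', $ifNoneMatch));
        if (!in_array('*', $candidates, true) && !in_array($etag, $candidates, true)) {
            return false;
        }
    }
    // If-Modified-Since is an HTTP date; an unparsable date counts as a miss.
    if ($ifModifiedSince !== null) {
        $since = strtotime($ifModifiedSince);
        if ($since === false || $lastModified > $since) {
            return false;
        }
    }
    // Every validator the client sent matched.
    return true;
}
```

A caller would read the two values from $_SERVER['HTTP_IF_MODIFIED_SINCE'] and $_SERVER['HTTP_IF_NONE_MATCH'], send the 304 status and stop when the function returns true, and otherwise continue rendering the page.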

The Last Modified date of a Zenphoto page is determined by taking the most recent of the Last Modified dates of [1], [2], [3], [4], [5].
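As a rough sketch of “take the most recent date”: the function below collects modification times from file or directory paths plus any timestamps obtained elsewhere (for example from the database) and returns the newest. The name pageLastModified() and its parameters are hypothetical; how the plugin actually reads the “options” table or comment dates is not shown here.

```php
<?php
// Hypothetical sketch, not the plugin's code.
// $paths: files or directories whose mtime should be considered
//         (on typical filesystems a directory's mtime changes when
//         entries are added or removed).
// $extraTimestamps: UNIX timestamps from other sources, e.g. a database.
function pageLastModified(array $paths, array $extraTimestamps = []): int
{
    $timestamps = $extraTimestamps;
    foreach ($paths as $path) {
        $mtime = @filemtime($path);   // false if the path cannot be read
        if ($mtime !== false) {
            $timestamps[] = $mtime;
        }
    }
    // With no usable timestamp at all, fall back to 0 (the UNIX epoch).
    return $timestamps ? max($timestamps) : 0;
}
```

Because the result is a plain UNIX timestamp, it can be formatted for the Last-Modified header with gmdate() or compared directly against a parsed If-Modified-Since value.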

The ETag of a Zenphoto page is set to the MD5 hash of the string concatenation of

  • a) the URL used to request this page ($_SERVER['REQUEST_URI'])
  • b) the version number of this Zenphoto installation
  • c) the Last Modified date of [1]
  • d) the Last Modified date of [2]
  • e) the Last Modified date of [3]
  • f) the Last Modified date of [4]
  • g) the Last Modified date of [5]
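The ETag recipe above might look roughly like this in code. This is a sketch under assumptions: buildPageEtag() is a hypothetical name, the version string and timestamps are passed in by the caller, and the parts are joined without a separator because the description only says “string concatenation”.

```php
<?php
// Hypothetical sketch, not the plugin's code.
// $requestUri: item a), e.g. $_SERVER['REQUEST_URI']
// $version:    item b), the Zenphoto version string
// $mtimes:     items c) to g), Last Modified timestamps of objects [1]..[5]
function buildPageEtag(string $requestUri, string $version, array $mtimes): string
{
    $raw = $requestUri . $version . implode('', $mtimes);
    // ETag values are quoted strings, so wrap the hash in double quotes.
    return '"' . md5($raw) . '"';
}
```

Including the request URI keeps two pages with identical timestamps from sharing a tag, and including the version number invalidates cached copies after an upgrade. Joining the parts with a separator character would avoid the theoretical case where different inputs concatenate to the same string.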

Performance:
According to my rudimentary profiling, execution of httpCacheControl generally takes less than 10 milliseconds on my shared hosting service. Thus, when your client has a good cached copy, this plugin cuts your page’s execution time from hundreds of milliseconds to under 10 ms. When your client has a stale cache and needs a fresh copy, your page takes up to an extra 10 ms to load. Also remember that time is not the only resource saved.
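The trade-off can be put into a small back-of-the-envelope formula: every request pays for the check, and only requests with a stale or missing cache also pay for rendering. The function name and the sample figures below (a 10 ms check, a 300 ms page, 80% of requests revalidating successfully) are illustrative assumptions, not measurements.

```php
<?php
// Hypothetical model of the average server time per request with the plugin.
// $hitRate:  fraction of requests answered with "304 Not Modified" (0.0 to 1.0)
// $checkMs:  cost of the freshness check, paid on every request
// $renderMs: cost of generating the full page, paid only on a miss
function expectedCostMs(float $hitRate, float $checkMs, float $renderMs): float
{
    return $checkMs + (1.0 - $hitRate) * $renderMs;
}

// Assumed figures: 10 ms check, 300 ms page, 80% of requests get a 304.
$average = expectedCostMs(0.8, 10.0, 300.0);   // roughly 70 ms instead of 300 ms
```

Under this model the plugin pays for itself as soon as the hit rate exceeds checkMs / renderMs, which is about 3% for the assumed figures.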

It should be most effective if you deploy a reverse proxy or HTTP accelerator like Squid or Varnish in front of your web server.

Download:

zenphoto_http_cache_control.zip

Known issues:

  • Although the plugin can determine the creation date of the most recent comment of a page, it cannot tell if the comment is edited subsequently because the edit date doesn’t appear to be stored in the database. Thus, a page might be considered fresh even if one of the comments has been updated since the last visit.

Changelog:
December 01, 2007

  • added support for comments; last modified time calculations include the date of the most recent comment

November 30, 2007

  • return mtime in UNIX timestamp instead of RFC1123 format for more flexibility

November 28, 2007

  • first public release


My first code release: Conditional GET PHP function doConditionalGet()

I’ve started a section on this blog called “My Code” that features code I’ve written. My first release is doConditionalGet(), a generic PHP function that implements HTTP’s Conditional GET mechanism. This function facilitates making dynamic PHP pages cacheable by HTTP accelerators, browsers, or other caches. Read more about it here.

Suggestions, comments, criticisms are very much welcome!

Conditional GET PHP function

doConditionalGet() is a generic PHP function that implements HTTP’s Conditional GET mechanism.

Sending the full contents of a webpage over the Internet whenever a client requests that page is a waste of resources (CPU, memory, bandwidth, time) if the client has retrieved this page before and the content has not changed since. The inventors of HTTP came up with an idea to prevent such waste. In a single query, the client (browser, proxy, HTTP accelerator, etc.) can say, “I have a copy of this page from a previous visit. If the page has changed since my last visit, give me a fresh copy. Otherwise, tell me that the page is unchanged and give me nothing.”

HTTP’s cache directives also allow web servers to instruct clients to ask on every visit whether the requested page has changed (Cache-Control: no-cache; must-revalidate additionally forbids serving a cached copy once it has gone stale), or to not ask and simply assume that the content is unchanged for a certain amount of time (Cache-Control: max-age=X).

Web servers like Apache automatically handle the Conditional GET mechanism for static objects like HTML, CSS, JavaScript, and image files. This is not the case with dynamically generated content like PHP pages. Sometimes this is a good thing, because we do not want browsers or proxies to cache dynamic content that changes on every visit, or content that is sensitive or personal and should not be shared by more than one client. However, PHP-generated content is often no different from normal HTML pages that seldom change, or can be allowed to grow “stale” for a certain amount of time. Furthermore, if revalidation of a PHP page can be performed much more quickly than executing the entire page, even revalidation on every visit might make sense, improving site performance while ensuring that visitors never see stale content.

Usage
doConditionalGet() sends the “304 Not Modified” header and aborts script execution if the client has a good cached copy of the given document; otherwise, it returns control to your main script to output content as normal. You must provide the function with a Last Modified date, an ETag, and optionally a freshness duration for the given document. doConditionalGet() must be called before any content has been output. You can either call this function at the top of your script or use the ob_start() function to buffer and delay the output of your script until the end.

Usage Example

// get mtime of some file
$mtime = filemtime($some_file);
// set ETag to MD5 hash of filename+mtime
$etag = '"'.md5($some_file.$mtime).'"';
// make cacheable by anyone, fresh for 1 minute
$cache_control = 'public,max-age=60';
doConditionalGet($mtime, $etag, $cache_control);
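For the ob_start() route mentioned under Usage, here is a minimal sketch of the buffering pattern. It does not call doConditionalGet(); bufferedConditionalResponse() is a hypothetical stand-in that derives the ETag from the buffered output itself (an assumption made for illustration) and returns its decision instead of sending headers, so the flow is easy to follow.

```php
<?php
// Hypothetical sketch of the output-buffering variant.
// $renderPage:  callable that echoes the page as usual
// $ifNoneMatch: raw If-None-Match request header value, or null
function bufferedConditionalResponse(callable $renderPage, ?string $ifNoneMatch): array
{
    ob_start();                        // hold all output in memory
    $renderPage();                     // nothing reaches the client yet
    $body = (string) ob_get_clean();   // fetch the output and stop buffering

    // Assumption for this sketch: the ETag is a hash of the generated body.
    $etag = '"' . md5($body) . '"';

    if ($ifNoneMatch !== null && trim($ifNoneMatch) === $etag) {
        // The client's copy is current: a 304 with an empty body is enough.
        return ['status' => 304, 'etag' => $etag, 'body' => ''];
    }
    return ['status' => 200, 'etag' => $etag, 'body' => $body];
}
```

Note the trade-off: because the page is generated before the decision is made, this variant saves bandwidth but not execution time. Calling the check at the top of the script, as in the example above, avoids running the page at all.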

Please see the script file for more in-depth documentation. Also see the Zenphoto plugin httpCacheControl for a “real-life” example.

Download

do_conditional_get.zip

Changelog
November 30, 2007

  • return mtime in UNIX timestamp instead of RFC1123 format for more flexibility

November 28, 2007

  • rearranged logic of function to improve performance
  • can now set any Cache-Control header with the third parameter
  • automatically set Expires header with max-age value if available

November 27, 2007

  • use GMT instead of local time
  • removed ETag generation to make function more generic
  • comply with the updated RFC2616 #13.3.4 when both IMS and INM headers exist (see discussion)
  • code cleanup and documentation

Credit
Based on the work of
