An AJAX Caching Strategy

May 3, 2006

The ability of AJAX applications to make an HTTP connection behind the scenes to fetch small bits of information is one of the most powerful tools that the modern browser APIs come equipped with. For most browsers (e.g., Firefox 1.5, Safari 2.0, Opera 8.5), the engine behind this task is the window object's XMLHttpRequest (XHR) object. Internet Explorer 5.5-6 exposes the same functionality through an ActiveX object (for example, Microsoft.XMLHTTP) rather than a native property of window, but IE 7 is expected to adopt the same object as the other browsers.
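To make the browser difference concrete, here is a minimal sketch of how code might obtain a request object by hand. The function name createRequestObject and its win parameter are my own inventions for illustration; win stands in for the browser's window object so that the branching logic can be tried outside a browser.

```javascript
// Sketch only: pick whichever request object the browser offers.
// "win" is a stand-in for window (an assumption made for testability).
function createRequestObject(win) {
    if (win.XMLHttpRequest) {
        // Firefox, Safari, Opera: native XMLHttpRequest
        return new win.XMLHttpRequest();
    }
    if (win.ActiveXObject) {
        // IE 5.5-6: the request object is an ActiveX control
        return new win.ActiveXObject("Microsoft.XMLHTTP");
    }
    // No known request object is available
    return null;
}
```

In a real page you would call createRequestObject(window). Libraries such as Prototype hide this branching from the application code.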

Avoiding Willy-Nilly

Making HTTP requests willy-nilly from AJAX applications, however, is almost never a good idea or design decision. The server side of the equation may not be able to handle the flood of requests. The client side of the AJAX application may have some of its requests time out or abort, which will disrupt the user experience that is meant to be AJAX's strength.

Energy Prices

An application that I introduced in a recent article uses XHR to fetch up-to-date oil, gasoline, and other energy prices. These numbers, which are assembled by the U.S. Energy Information Administration (EIA), are in the public domain. My application harvests or "scrapes" them from the appropriate sites. It would be better to connect with an energy web service (HTML scraping is awkward and brittle), but I have not been able to find an open source, free-of-charge one. I'm open to recommendations!

The EIA web pages suit the application's purpose, which is not after all designed to give oil futures traders up-to-the-second data. The application displays the prices in the browser window when the user loads the web page. Figure 1-1 shows what the left side of the page looks like. The left column holds the energy prices, which highlight when the mouse passes over them.

Figure 1-1. Web page doodads fetch energy prices.

Each region or div tag on the page that is devoted to a separate price (e.g., oil) has a Refresh button. This gives the user the option to fetch the latest price. The EIA, however, only updates these prices about once per week. Therefore, it would be wasteful to allow these requests to go through if the user only loaded the web page one hour ago. We know that it is highly probable that the price is exactly the same, so we should leave the currently displayed price alone.

Specify a Time Period for Caching Data

There are always a number of viable solutions to such a problem or requirement. The one I chose was to create an object, using the Prototype open source JavaScript library, that keeps track of when a price was last updated by an HTTP request. The object has a property that represents a range, say 24 hours. If the price was fetched less than 24 hours ago, then the object prevents a new request from being initiated and keeps the displayed price the same.

This is an AJAX caching strategy that is designed to periodically refresh data from a server, but only if the client's version of the data has been cached for a specified period. If the application user leaves the page in her browser for more than 24 hours, then clicks a Refresh button, XHR goes out and grabs a new price for display. We want to give the user the ability to refresh the price without reloading the entire application into the browser, yet we also want to cut down on unnecessary requests.
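The core of the strategy is a single time comparison. The following is a minimal sketch of that check, written as a standalone function; the name isStale and its parameters are hypothetical and do not appear in the application's code.

```javascript
// Sketch only: decide whether a cached value is old enough to re-fetch.
// lastFetchedMs and nowMs are millisecond timestamps (as returned by
// Date.getTime()); maxAgeSeconds is the caching period in seconds.
function isStale(lastFetchedMs, nowMs, maxAgeSeconds) {
    var elapsedSeconds = Math.floor((nowMs - lastFetchedMs) / 1000);
    return elapsedSeconds >= maxAgeSeconds;
}

var ONE_DAY = 60 * 60 * 24; // 24 hours, in seconds
```

With a 24-hour period, a price fetched one hour ago is not stale, so a click on Refresh would leave it alone; a price fetched 25 hours ago is stale, so the click would trigger a new request.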

Import the JavaScript

The web application uses script tags in the HTML to import the necessary JavaScript, including the Prototype library. The first xml.com article I wrote on this application shows how to download and set up Prototype.

<head>...
<script src="http://www.eeviewpoint.com/js/prototype.js" type="text/javascript">
</script>
<script src="http://www.eeviewpoint.com/js/eevpapp.js"
type="text/javascript">
</script>
...</head>

Set Up the Refresh Buttons

First I'll show the code from eevapp.js that I used to get the energy prices. Then I'll show the object that acts as a kind of request filter.

//When the browser has finished loading the web page,
//fetch and display the energy prices
window.onload=function(){
    //generatePrice function takes two parameters:
    //the energy type (e.g., oil) and the id of the
    //span element where the price will be displayed
    generatePrice("oil","oilprice");
    generatePrice("fuel","fuelprice");
    generatePrice("propane","propaneprice");

    //set up the Refresh buttons for
    //getting updated oil, gasoline, and
    //propane prices.
    //Use their onclick event handlers.
    $("refresh_oil").onclick=function(){
        generatePrice("oil","oilprice"); };
    $("refresh_fuel").onclick=function(){
        generatePrice("fuel","fuelprice"); };
    $("refresh_propane").onclick=function(){
        generatePrice("propane","propaneprice"); };
};

The code comments explain what's going on here. The one odd syntactical segment that needs explanation is the $("refresh_oil").onclick type of expression.

Prototype has a $() function that returns a Document Object Model (DOM) Element object. This Element is associated with the HTML id attribute that the code passes into the $() function as a string parameter. This syntax is the equivalent of writing document.getElementById("refresh_oil"), but it's shorter and therefore handier for the developer. What does "refresh_oil" refer to? This is the id of the Refresh button on the web page. See the left column in the browser screen depicted by Figure 1-1 for a view of the Refresh button.

<input type="submit" id="refresh_oil" .../>
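For readers who have not seen $() before, here is a simplified stand-in that captures the behavior described above. It is not Prototype's actual source; the name byId and the doc parameter (a stand-in for the browser's document object) are assumptions made for illustration.

```javascript
// Sketch only: a simplified stand-in for Prototype's $() function.
// A string is treated as an id and looked up in the document;
// anything else is assumed to be an element already and is returned as-is.
function byId(doc, element) {
    if (typeof element == "string") {
        return doc.getElementById(element);
    }
    return element;
}
```

In a browser, byId(document, "refresh_oil") would return the same element as document.getElementById("refresh_oil").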

The code sets the onclick event handler of this button to a function that will optionally seek to grab a new energy price. "Optionally" means that our filter will decide whether or not to launch an HTTP request.

Check the `CacheDecider` Object First

The generatePrice() function first checks with a cache-related object called CacheDecider to determine whether a new HTTP fetch of an energy price can be initiated. Each energy category--oil, gasoline, and propane--has its own CacheDecider object. The code stores these objects in a hash named cacheHash. Therefore, the expression cacheHash["oil"] will return the CacheDecider object for oil prices.
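The listings in this article do not show how cacheHash is declared. A plain JavaScript object used as a string-keyed map is one reasonable assumption, sketched here with a hypothetical helper named lookupOrCreate that is not part of the application.

```javascript
// Assumption: cacheHash is a plain object keyed by energy type.
var cacheHash = {};

// Hypothetical helper: return the entry stored under energyType,
// calling factory() to create and store it on the first request.
function lookupOrCreate(hash, energyType, factory) {
    if (!hash[energyType]) {
        hash[energyType] = factory();
    }
    return hash[energyType];
}
```

Calling lookupOrCreate(cacheHash, "oil", ...) twice returns the same stored object both times, which is the property the application relies on when it asks cacheHash["oil"] for the oil-price filter.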

If the CacheDecider or filter allows the code to make a new HTTP request (CacheDecider.keepData() returns false), then the code calls getEnergyPrice() to refresh, say, the crude oil price. The code comments explain exactly what's going on.

/*Check the CacheDecider object to determine
if a new energy price should be fetched*/
function generatePrice(energyType, elementID){
    //set the local variable cacher to the CacheDecider
    //object associated with oil, gas, or propane
    var cacher = cacheHash[energyType];
    //If this object is null, then a CacheDecider object
    //has not yet been instantiated for the
    //specified energy type.
    if(! cacher) {
        //CacheDecider has a parameter that
        //specifies the number of seconds to keep
        //the data, here 24 hours
        cacher = new CacheDecider(60*60*24);
        //store the new object in the hash with
        //the energy type, say "oil", as the key
        cacheHash[energyType]=cacher;
        //Fetch and display the new energy price
        getEnergyPrice(energyType, elementID);
        return;
    }
    //The CacheDecider has already been instantiated, so
    //check whether the currently displayed energy price
    //is stale and should be updated.
    if(! cacher.keepData()) {
        getEnergyPrice(energyType, elementID);
    }
}

/*
Use Prototype's Ajax.Request object to fetch an energy price.
Parameters:
energyType is oil, fuel, or propane.
elementID is the id of a span element on the web page;
the span will be updated with the new data.
*/
function getEnergyPrice(energyType, elementID){
  //go_url holds the URL of the server component
  new Ajax.Request(go_url, {method: "get",
      parameters: "priceTyp="+energyType,
      onLoading:function(){ $(elementID).innerHTML=
                              "waiting...";},
      onComplete:function(request){
          if(request.status != 200) {
            $(elementID).innerHTML="Unavailable.";
          } else {
            $(elementID).innerHTML=
            request.responseText;}
      }});
}

Prototype's `Ajax` Object

The AJAX-related code using Prototype's Ajax.Request object is only about ten lines long. The code connects with a server, gets a new harvested energy price from the U.S. government, then displays the price without changing any other part of the web page. The code is, however, somewhat confusing to look at. Here's an explanation.

You are probably used to seeing the XMLHttpRequest object being programmed directly. Prototype, however, instantiates this object for you, including dealing with the different browser types the users will show up with. Just creating the new AJAX object, as in new Ajax.Request, sets the HTTP request in motion. The prototype.js file includes the definition of this object. Since our web page imported that JavaScript file, we can create new instances of this object without jumping through any other hoops. Our code doesn't have to deal with the various browser differences involving XHR.

The first parameter to Ajax.Request is the URL of the server component it is connecting with, for example http://www.eeviewpoint.com/energy.jsp. The second parameter is an object literal that holds the request options. Its method property specifies that this is a GET HTTP request type, and its parameters property provides a querystring that will be passed to the server component. The entire URL, therefore, might look like http://www.eeviewpoint.com/energy.jsp?priceTyp=oil.
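To show how the base URL and the querystring combine for a GET request, here is a small sketch. It is not Prototype's implementation; the function name buildGetUrl is hypothetical, and the escaping step is my own addition.

```javascript
// Sketch only: append one name=value pair to a URL for a GET request.
// Uses "?" if the URL has no querystring yet, "&" otherwise, and
// escapes the name and value so that spaces and symbols are safe.
function buildGetUrl(baseUrl, name, value) {
    var separator = baseUrl.indexOf("?") == -1 ? "?" : "&";
    return baseUrl + separator +
        encodeURIComponent(name) + "=" + encodeURIComponent(value);
}

var oilUrl = buildGetUrl("http://www.eeviewpoint.com/energy.jsp",
                         "priceTyp", "oil");
// oilUrl is "http://www.eeviewpoint.com/energy.jsp?priceTyp=oil"
```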

Progress Indicator

A couple of other options let the code developer decide what the application should do at different stages of the request processing. For example, onLoading represents an early stage of the request sequence, after the request has been set up but before the response has arrived. At this stage, our application displays the text "waiting..." where the new price will soon appear. onComplete represents the finished stage of the energy-price request. The function that is associated with onComplete, therefore, determines what happens with the request's return value. The code automatically provides the function with a request parameter that represents the instance of XHR used for the HTTP request.

onComplete:function(request){...

Therefore, in this case, the code gets the return value with the syntax request.responseText. If everything goes well, this property will be the new price, such as 66.07.
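The stage callbacks mentioned above follow the XHR readyState values 0 through 4. The sketch below shows the naming scheme, assuming the stage names that prototype.js lists for Ajax.Request; the array name REQUEST_STAGES and the function callbackNameFor are my own and are not part of the library.

```javascript
// Assumed stage names, indexed by XHR readyState (0 through 4).
var REQUEST_STAGES = ["Uninitialized", "Loading", "Loaded",
                      "Interactive", "Complete"];

// Hypothetical helper: the option name that would be consulted
// for a given readyState, or null for an unknown state.
function callbackNameFor(readyState) {
    var stage = REQUEST_STAGES[readyState];
    return stage ? "on" + stage : null;
}
```

Under this scheme, readyState 1 corresponds to onLoading and readyState 4 corresponds to onComplete, the two options our application supplies.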

The Ajax.Request object also allows the code to take action based on the HTTP response status. This is accomplished by checking the status property of XHR (request.status). A status of 200 signifies that the request was successful. If the status is anything else (404, meaning "Not Found," for example), then the application displays "Unavailable." in the space where the price should be.

We could change this behavior so that an unsuccessful request just leaves the current price display the same.

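//Remember the displayed price so it can be restored on failure
var previousPrice = $(elementID).innerHTML;
new Ajax.Request(go_url, {method: "get",
    parameters: "priceTyp="+energyType,
    onLoading:function(){ $(elementID).innerHTML=
       "waiting...";},
    onComplete:function(request){
        //Show the new price on success; otherwise put the
        //previous price back in place of "waiting..."
        $(elementID).innerHTML = (request.status == 200) ?
            request.responseText : previousPrice;
}});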
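The two failure behaviors discussed here can be compared side by side in a small pure function. This is a sketch for illustration only; the name textToDisplay and the keepOnFailure flag are hypothetical and do not appear in the application.

```javascript
// Sketch only: choose what the price span should show once a
// request completes.
//   status        - the HTTP status code of the response
//   responseText  - the response body (the new price on success)
//   previousText  - what the span showed before the request began
//   keepOnFailure - true to keep the old price when the request fails
function textToDisplay(status, responseText, previousText, keepOnFailure) {
    if (status == 200) {
        return responseText;
    }
    return keepOnFailure ? previousText : "Unavailable.";
}
```

For example, textToDisplay(404, "", "66.07", true) yields "66.07", while passing false as the last argument yields "Unavailable." instead.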

The `CacheDecider` Object

The final piece of the client-side puzzle is the CacheDecider object. Figure 1-2 is a UML diagram depicting this object. I wrote the object in JavaScript using the Prototype library.

Figure 1-2. A class diagram depicting CacheDecider

Here is the code for the object, which appears in the eevapp.js file.

var CacheDecider=Class.create();
CacheDecider.prototype = {
    initialize: function(rng){
        //How long do we want to
        //keep the associated data, in seconds?
        this.secondsToCache=rng;
        this.lastFetched=new Date();
    },
    keepData: function(){
        var now = new Date();
        //Get the difference in whole seconds between the
        //current time and the lastFetched property value
        var secondsDif = Math.floor((now.getTime() -
            this.lastFetched.getTime()) / 1000);
        //If the prior computed value is less than
        //the specified number of seconds to keep
        //a value cached, then return true
        if (secondsDif < this.secondsToCache)  { return true;}
        //the data in the cache will be refreshed or re-fetched,
        //therefore reset the lastFetched value to the current time
        this.lastFetched=new Date();
        return false;
    }
};

The prototype.js file includes an object definition for Class. Calling Class.create() returns a constructor function whose instances are set up by the initialize() method defined on its prototype. Like a constructor in other languages, initialize() is called every time the code creates a new instance of the associated object, and it receives the arguments that were passed to new.

Our initialize() method for CacheDecider initializes two properties: secondsToCache, which represents the number of seconds to cache a value before another can be fetched; and lastFetched, a JavaScript Date object representing the last time the cached value, whatever it is, was refreshed.

CacheDecider does not store a reference to a cached value; other objects use a CacheDecider to store and check time limits or ranges.

If this is not clear, check the explanation beneath the "Check the CacheDecider Object First" subheading.

CacheDecider also has a keepData() method. This method determines whether the number of seconds that has elapsed since the lastFetched date exceeds the number of seconds that the cached value is supposed to be kept.
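To watch keepData() at work without waiting a day, here is a sketch of the same logic written without Prototype and with an injectable clock. TimedGate and nowFn are hypothetical names introduced for this illustration; the application itself uses CacheDecider as listed above.

```javascript
// Sketch only: CacheDecider's logic with a replaceable clock.
// nowFn returns the current time in milliseconds; it defaults to
// the real clock when omitted.
function TimedGate(secondsToCache, nowFn) {
    this.secondsToCache = secondsToCache;
    this.nowFn = nowFn || function () { return new Date().getTime(); };
    this.lastFetched = this.nowFn();
}

TimedGate.prototype.keepData = function () {
    var now = this.nowFn();
    var secondsDif = Math.floor((now - this.lastFetched) / 1000);
    if (secondsDif < this.secondsToCache) { return true; }
    // The caller will re-fetch, so restart the timer
    this.lastFetched = now;
    return false;
};

// Drive the gate with a fake clock
var fakeNow = 0;
var gate = new TimedGate(60 * 60 * 24, function () { return fakeNow; });

fakeNow = 1 * 60 * 60 * 1000;         // one hour later
var afterOneHour = gate.keepData();   // true: keep the displayed price

fakeNow = 25 * 60 * 60 * 1000;        // 25 hours later
var afterOneDay = gate.keepData();    // false: fetch, timer restarts

var rightAfter = gate.keepData();     // true: the new data is fresh
```

Note that the timer restarts as soon as keepData() returns false, whether or not the request that follows succeeds.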

In our application, we hold on to an energy-price value for 24 hours, until it can be refreshed with an HTTP request.

cacher = new CacheDecider(60*60*24);

If this number of seconds has not been exceeded, then keepData() returns true. Otherwise, the lastFetched property is reset to the current time (since the cached value can now be refreshed via an HTTP request), and the method returns false.

The user can update the data whenever they want if they reload the entire web page. This action will initialize anew all of the objects that we have been discussing.

The Server Component

This application uses a kind of "cross-domain proxy" design pattern, as discussed at this informative website: http://ajaxpatterns.org/Cross-Domain_Proxy. Even though my application is accessing information from the U.S. EIA web pages, the scraping is initiated by a component that derives from the same domain (eeviewpoint.com) as our application web page.

The server component can be written in your language of choice for scraping web-page information: PHP, Ruby, Python, ASP.NET, Perl, Java. I chose Java, being partial to servlet/JSP programming and the Tomcat container. The Ajax.Request object connects with a JSP, which uses Java utility classes to harvest the energy information from U.S. EIA sites. See the "Resources" list at the end of this article to learn where to download the Java classes I used.

Resources

Prototype: http://script.aculo.us/
JavaScript code: http://www.parkerriver.com/ajaxhacks/xml_article_js.zip
Java code: http://www.parkerriver.com/ajaxhacks/xml_article_java.zip