Fixing AJAX: XMLHttpRequest Considered Harmful

November 9, 2005

AJAX applications wouldn't be possible (or, at least, wouldn't be nearly as cool) without the XMLHttpRequest object that lets your JavaScript application make GET, POST, and other types of HTTP requests from within the confines of a web browser. All of the most interesting AJAX applications that have appeared in the past couple of years use the XMLHttpRequest object extensively to give users a responsive in-browser experience without the messiness of traditional HTML forms posting.

But the kind of AJAX example that you don't see very often (are there any?) is one that accesses third-party web services, such as those from Amazon, Yahoo, Google, and eBay. That's because all the newest web browsers impose a significant security restriction on XMLHttpRequest: you aren't allowed to make XMLHttpRequests to any server except the one your web page came from. So, if your AJAX application is in the page http://www.yourserver.com/junk.html, any XMLHttpRequest that comes from that page can only go to www.yourserver.com. Too bad -- your application is on www.yourserver.com, but the web service you want is on webservices.amazon.com (in Amazon's case). The XMLHttpRequest will either fail or pop up warnings, depending on the browser you're using.
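To make the restriction concrete, here is a rough sketch of the kind of comparison a browser performs before letting a request through. It is a simplification, not any browser's actual code: the function name isSameOrigin is hypothetical, and it assumes a JavaScript environment that provides the standard URL class.

```javascript
// Simplified approximation of the browser's check: a request is allowed
// only if scheme, host, and port all match those of the page.
// isSameOrigin is a hypothetical helper name, not a browser API.
function isSameOrigin(pageUrl, requestUrl) {
  var page = new URL(pageUrl);
  var target = new URL(requestUrl);
  return page.protocol === target.protocol &&
         page.hostname === target.hostname &&
         page.port === target.port;
}

var page = "http://www.yourserver.com/junk.html";

// A request back to the page's own server passes the check
var sameServer = isSameOrigin(page, "http://www.yourserver.com/proxy_curl.php");

// A request straight to the third-party web service does not
var thirdParty = isSameOrigin(page, "http://webservices.amazon.com/");
```

Every workaround in this article amounts to arranging things so that the request the browser sees is of the first kind.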

On Microsoft's IE 5 and 6, such requests are possible provided your browser security settings are low enough (though most users will still see a security warning that they have to accept before the request will proceed). On Firefox, Netscape, Safari, and the latest versions of Opera, the requests are denied. On Firefox, Netscape, and other Mozilla browsers, you can get your XMLHttpRequest to work by digitally signing your script, but the digital signature isn't compatible with IE, Safari, or other web browsers.

Solutions Worthy of Paranoia

There is hope, or rather, there are gruesome hacks, that can bring the splendor of seamless cross-browser XMLHttpRequests to your developer palette. The three methods currently in vogue are:

1. Application proxies. Write an application in your favorite programming language that sits on your server, responds to XMLHttpRequests from users, makes the web service call, and sends the data back to users.
2. Apache proxy. Adjust your Apache web server configuration so that XMLHttpRequests can be invisibly re-routed from your server to the target web service domain.
3. Script tag hack with application proxy (doesn't use XMLHttpRequest at all). Use the HTML script tag to make a request to an application proxy (see #1 above) that returns your data wrapped in JavaScript. This approach is also known as On-Demand JavaScript.

The basic idea of all three hacks is the same: fool your user's web browser into thinking that the data is coming from the same domain as the web page.

A word of caution here: there is a good reason why XMLHttpRequests are restricted. Allowing them to freely access any domain from within a web page opens up users to potential security exploitation. Not surprisingly, these three hacks, which offload the request to your web server, potentially threaten to compromise your web server's identity, if not its contents. Caution is advised before deploying them.

Application Proxies

An application proxy is a program on your server that handles web service requests for AJAX client applications. You can write the proxy in your favorite programming language; a simple one is easy to write in PHP. The one below, called proxy_curl.php, is for Yahoo's REST web services. For security reasons, I've hardcoded the domain of the web service in the proxy. This proxy requires that your PHP installation have the curl extension installed.

<?php
// Hardcode the hostname
define('HOSTNAME', 'http://api.local.yahoo.com/');

// Get the REST call path from the AJAX application
// (an empty path is used if the parameter is missing)
$path = isset($_GET['path']) ? $_GET['path'] : '';

// The URL to fetch is the hostname + path
$url = HOSTNAME.$path;

// Open the Curl session
$session = curl_init($url);

// Don't return HTTP headers. Do return the contents of the call
curl_setopt($session, CURLOPT_HEADER, false);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);

// Make the call
$xml = curl_exec($session);

// The web service returns XML
header("Content-Type: text/xml");
echo $xml;
curl_close($session);

This is a minimal proxy implementation. Since the proxy is doing the actual web services requests, there's a lot of information about the user state and the success or failure of the request that could be returned to the AJAX application. A few things that you'd want to consider returning or managing are: handling secure (e.g. HTTPS) requests, returning any cookies, handling HTTP POSTs (this one only does GETs), returning the HTTP status code, and managing multiple requests (if your AJAX application makes multiple asynchronous requests, you'd want to identify which request you're receiving).

To call the proxy from your AJAX applications using XMLHttpRequest, make the XMLHttpRequest back to your own server and pass the path of the web service call as a parameter. Since I fetch $_GET['path'] in the proxy above, it's expecting that you'll pass the parameter path in your request. Here, I'm making a request to Yahoo's Geocode REST service. Note that since I'm sending the REST URL through a proxy, it will get decoded twice: once when the proxy receives it and once when the web service does. So, it's necessary to do twice the encoding that you'd normally do with the special characters in the URL: first the address with JavaScript's encodeURI() function, then the whole path with encodeURIComponent().

// Call Yahoo's Geocode REST web service
var proxy_host = 'www.yourserver.com';

// Encode the parameter to the Geocode REST call
var address = encodeURI('1000 S Congress Avenue, Austin, TX');

// Include the path to the web service and double encode the path
var path = "MapsService/V1/geocode?appid=YahooDemo&location=" + address;
var proxy_request = encodeURIComponent(path);

// The actual request
proxy_request = "http://" + proxy_host + "/proxy_curl.php?path=" + proxy_request;
....
....
// Make the request
.....xmlhttprequest(proxy_request)....
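The request step itself is elided above. One way it might look is sketched here; this is an illustration under assumptions, not the article's original code. The function name requestViaProxy, its callbacks, and the optional createXhr parameter (there so a stand-in object can be supplied outside a browser) are all hypothetical. IE 5 and 6 would need an ActiveXObject in place of the native XMLHttpRequest constructor.

```javascript
// Sketch: send a GET to the proxy and hand the parsed XML to a callback.
// requestViaProxy, onSuccess, onError, and createXhr are hypothetical names.
function requestViaProxy(proxyRequest, onSuccess, onError, createXhr) {
  // Use the supplied factory if there is one, else the native object
  var xhr = createXhr ? createXhr() : new XMLHttpRequest();

  // Third argument true = asynchronous request
  xhr.open("GET", proxyRequest, true);

  xhr.onreadystatechange = function () {
    // readyState 4 means the response has been fully received
    if (xhr.readyState !== 4) {
      return;
    }
    if (xhr.status === 200) {
      // The proxy sends text/xml, so the parsed document is in responseXML
      onSuccess(xhr.responseXML);
    } else {
      onError(xhr.status);
    }
  };

  xhr.send(null);
  return xhr;
}
```

In a page, you would call it with the proxy_request URL built above and two functions of your own, one to handle the XML document and one to handle a failed request.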

Apache Proxy

Using an Apache proxy is the easiest and cleanest way of getting around these restrictions. However, it requires access to the main Apache httpd.conf file, and Apache's mod_proxy module must be loaded. Apache proxy directives are not allowed in local .htaccess files, so this method is not a good choice for developers using shared hosting services.

# Pass the call from http://www.yourserver.com/call to http://api.local.yahoo.com
ProxyPass    /call/    http://api.local.yahoo.com/

# Handle any redirects that yahoo might respond with
ProxyPassReverse    /call/    http://api.local.yahoo.com/

Another way to do this is to use Apache's mod_rewrite with the proxy flag, [P], which hands the rewritten request to mod_proxy:

RewriteEngine on 

RewriteRule ^/call/(.*)$ http://api.local.yahoo.com/$1 [P]

(Note: this rewrite rule may be broken across more than one line in your browser; but mod_rewrite rules won't work that way, so be careful.)

The Apache proxy approach is clean and simple. Consider a call to the Yahoo Geocode REST Web service:

http://api.local.yahoo.com/MapsService/V1/geocode?appid=YahooDemo&location=78704

With either of the Apache proxy examples functioning, your AJAX application can make a call to:

http://www.yourserver.com/call/MapsService/V1/geocode?appid=YahooDemo&location=78704

and the request will be seamlessly forwarded, and the results returned, to your AJAX application.
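On the client side, the only change is the URL prefix. A small helper along these lines could do the translation; it assumes the /call/ mapping from the configuration above, and the name toProxiedUrl is hypothetical.

```javascript
// Prefixes taken from the Apache examples above
var SERVICE_PREFIX = "http://api.local.yahoo.com/";
var PROXY_PREFIX = "http://www.yourserver.com/call/";

// toProxiedUrl is a hypothetical helper: it swaps the web service prefix
// for the proxied prefix and leaves the path and query string untouched.
function toProxiedUrl(serviceUrl) {
  if (serviceUrl.indexOf(SERVICE_PREFIX) !== 0) {
    throw new Error("Not a URL this proxy handles: " + serviceUrl);
  }
  return PROXY_PREFIX + serviceUrl.substring(SERVICE_PREFIX.length);
}

var proxied = toProxiedUrl(
  "http://api.local.yahoo.com/MapsService/V1/geocode?appid=YahooDemo&location=78704");
```

Unlike the PHP proxy, no extra encoding is needed here, since the path is part of the URL rather than a parameter.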

Script Tag Hack, or On-Demand JavaScript

The script tag hack is a more complicated approach that involves dynamically generating an HTML script tag and using the src attribute of the tag to make the request. It never makes an XMLHttpRequest, but it's worth looking at because it provides a fairly standard way of making web service requests from within an AJAX application. The code I talk about below is adapted from Darryl Lyons' blog posting on the subject.

The HTML script tag can only return JavaScript. So, to make this approach work, we need to modify the application proxy above to return JavaScript. Two str_replace statements are used to encode any single quotes and strip any newlines that may be in the XML -- either one would break the JavaScript string that the XML is wrapped in. Finally, the Content-Type header is changed from text/xml to text/javascript. I'll call the new proxy proxy_script.php (here, the proxy returns XML, but we could also return JSON or other data encodings). The XML is placed into a JavaScript global variable -- in this case I've given it the imaginative name xml:

...
...
// Make the call
$xml = curl_exec($session);

// Encode the returned XML as a JavaScript variable
$xml = str_replace("'", "&#039;", $xml);
$xml = str_replace("\n", "", $xml);
$xml = 'var xml = \''.$xml.'\';';

// The web service returns JavaScript
header("Content-Type: text/javascript");
echo $xml;
...
...
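The escaping step is easier to reason about in isolation. The sketch below is a JavaScript transliteration of the two str_replace calls, for illustration only (the proxy itself stays in PHP), followed by a stricter variant. Both function names are hypothetical. The stricter variant assumes JSON.stringify is available; it also copes with backslashes and carriage returns, which the simple version passes through and which could still corrupt or break the generated script.

```javascript
// Mirror of the proxy's escaping: encode single quotes, drop newlines,
// then wrap the result in a JavaScript variable declaration.
function wrapXmlNaive(xml) {
  var escaped = xml.replace(/'/g, "&#039;").replace(/\n/g, "");
  return "var xml = '" + escaped + "';";
}

// Stricter alternative: let JSON.stringify produce the string literal,
// so quotes, backslashes, and line breaks are all escaped.
function wrapXmlAsJson(xml) {
  return "var xml = " + JSON.stringify(xml) + ";";
}

var sample = "<Result precision='zip'>\n<Zip>78704</Zip>\n</Result>";
var naiveScript = wrapXmlNaive(sample);
var strictScript = wrapXmlAsJson(sample);
```

Note the difference in what the page receives: the simple version changes the XML (quotes become entities, newlines vanish), while the stricter one hands back the original string unchanged.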

To make a call, you dynamically create a script tag in your browser's DOM and then point the src attribute at the proxy. To demonstrate this as simply as possible, I created a short web page:

<body>
<p><a href="javascript:getLocations()">Click this link</a> to dynamically create a script tag and make
a web service call. A button will appear below after the call has finished.</p>
<br/><br/>
<div id="locationData"></div>
</body>

Clicking on the link above will call the function getLocations() which will dynamically create a script tag and retrieve a web service. The function getLocations() calls the function getDataFromServer():

function getLocations() {
     getDataFromServer("ScriptTagID","http://localhost/proxy_script.php");
}

The function getDataFromServer() takes as arguments the name of the script tag that will be dynamically generated (and later destroyed) and the prefix of the URL to be fetched. getDataFromServer() does all the real work. The script tag in IE 5 and 6 can fetch data asynchronously using a proprietary mechanism described here. To encapsulate that behavior, I check for the presence of an IE browser using the same definition function as Sarissa, renamed to BROWSER_IS_IE. For all other web browsers, the script tag essentially fetches data synchronously, though it's possible to attach pseudo-asynchronous properties to it (this is left as an exercise for the reader).

After the data is fetched via the newly created script tag, the callback function, putXMLhere(), is called, and the fetched data is available.

<script>
var callback = "putXMLhere();";

// The current script tag. Global so that CheckAgain() can reach it.
var oScript;

function getDataFromServer(id, url) {
    // Fetch the element pointed to by the id
    oScript = document.getElementById(id);

    // Point at the document head, which holds the script tag
    var head = document.getElementsByTagName("head").item(0);

    // Destroy the old script tag, if it exists, so we can create a new one
    if (oScript) {
        head.removeChild(oScript);
    }

    // Create the new script tag
    oScript = document.createElement("script");

    // Set up the src attribute of the script tag
    // (no leading slash: the proxy's hostname already ends with one)
    var sendPath = encodeURIComponent("MapsService/V1/geocode?appid=YahooDemo&location=78704");
    var wholeurl = url + "?path=" + sendPath;
    oScript.setAttribute("src", wholeurl);

    // Set the id attribute of the script tag
    oScript.setAttribute("id", id);

    // Append the new script tag, which causes the proxy to be called
    head.appendChild(oScript);

    // Asynchronous script tag properties -- a proprietary IE "feature"
    if (BROWSER_IS_IE) {
        if (oScript.readyState == "loaded") {
            eval(callback);
            oScript.onreadystatechange = null;
        } else {
            oScript.onreadystatechange = CheckAgain;
        }
    // All other web browsers just do the callback function
    } else {
        eval(callback);
    }
}

// Used by IE to handle state changes
function CheckAgain() {
    if (oScript.readyState == "loaded") {
        eval(callback);
        oScript.onreadystatechange = null;
    }
}

// Once the script tag has loaded the data, it's available in the global
// JavaScript variable "xml" which was sent from the proxy.
function putXMLhere() {
    var ohandle = document.getElementById("locationData");
    ohandle.innerHTML = ohandle.innerHTML + "<form><input type='button' value='View XML' onClick='alert(xml); return true;' /></form>";
}
</script>
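For the pseudo-asynchronous variant mentioned earlier, one possible shape is sketched here. It is not the article's code: it assumes the browser fires a load event on script elements (with IE's onreadystatechange kept as a fallback), it takes a function instead of an eval string, and the names loadScript and onLoaded are hypothetical. The document is passed in as a parameter so the function can be exercised with a stand-in object outside a browser.

```javascript
// Sketch: create a script tag and run a callback once it has loaded.
// loadScript and onLoaded are hypothetical names.
function loadScript(doc, id, src, onLoaded) {
  var head = doc.getElementsByTagName("head").item(0);

  // Remove a previous tag with the same id, if there is one
  var old = doc.getElementById(id);
  if (old) {
    head.removeChild(old);
  }

  var script = doc.createElement("script");
  var done = false;

  // Run the callback once, whichever event arrives first
  function finish() {
    if (done) {
      return;
    }
    done = true;
    script.onload = null;
    script.onreadystatechange = null;
    onLoaded();
  }

  // Assumed: browsers that fire a load event on script tags
  script.onload = finish;

  // IE: readyState can reach "loaded" or go straight to "complete"
  script.onreadystatechange = function () {
    if (script.readyState == "loaded" || script.readyState == "complete") {
      finish();
    }
  };

  script.setAttribute("id", id);
  script.setAttribute("src", src);

  // Appending the tag is what triggers the request
  head.appendChild(script);
  return script;
}
```

With this shape, the call in getLocations() would pass document, the tag id, the full proxy URL, and putXMLhere as the callback, and the global xml variable would only be read after the script had arrived.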

Further Plumbing

All of these approaches will provide a seamless, cross-browser AJAX experience for your users while they access third-party web services. But always check with your friendly security administrator before deploying them.

Yes, it's all been done before. These three sites were my major sources of inspiration (or plagiarism): :-)

Darryl Lyons' page on dynamic script tag usage was the inspiration for my investigation.

Premshree Pillai's discussion of the Apache proxy got me looking at that territory.

Think about Cross-Domain Mediation using portlets -- Cross-Domain Mediation is a nice name for "proxy."

Example code for this article: scripthack.zip