Friday, July 22, 2005

AJAX and Web Statistics

Fredric Paul of Techweb brought up a new and interesting perspective on AJAX: how are web statistics and web advertising going to be handled if there are no more pages? The conclusion of the article, in my own words: 'there is a problem, but we might find a way to solve it'. I was surprised by this article: I feel there are so many MORE possibilities with AJAX, not fewer.

With www.backbase.com I've been closely involved in creating and managing one of the largest corporate sites built with AJAX technology (using the Backbase software, of course). We have around 4000 unique visitors per day, from over 160 countries. For web statistics we've been working with standard log analysis software since the start. We're using Sawmill, but any other tool would also work. It gives us all the information that you also get on a 'normal' website. Reading the article on Techweb could make you wonder how this is possible.

I do understand the difficulty of grasping AJAX: it's a popular topic, maybe even a hype. Many people write about it, and not everyone has a solid story. So certain aspects of AJAX that are repeated over and over again are catching on. One of these aspects is the death of the page refresh: with AJAX you can just load new information into an existing web page. The Techweb article therefore assumes that there are no pages anymore, because the URL stays the same. This is not entirely true: you are still loading new information, it's just not an entire page. The new information is also a page on a web server, and each request for it is therefore logged. On the Backbase website we've stored all content pages in one directory, in a hierarchical structure. This is how it looks:

content
    products
        product_overview.xml
        dotnet_edition.xml
        j2ee_edition.xml
        community_edition.xml
    benefits
    etc.

This means that you can easily see which information the user requests, because every request is stored in the web server logs. For web analysis software it's irrelevant whether a request is for a full page or only for the content. It is important, though, that one file contains exactly one article.
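To make this concrete, here is a minimal sketch of the kind of counting a log analysis tool performs on such requests. It is not how Sawmill works internally, and it is not taken from our setup: the Common Log Format input, the "/content/" URL prefix, and the sample lines are all assumptions made for illustration.

```typescript
// Sketch only: the "/content/" prefix, the Common Log Format input, and the
// sample lines below are assumptions for illustration.

// Matches the request and status part of a Common Log Format line, e.g.
//   "GET /content/products/j2ee_edition.xml HTTP/1.1" 200
const REQUEST_PATTERN = /"(?:GET|POST) (\S+) HTTP\/[\d.]+" (\d{3})/;

// Count successful requests per content file.
function countContentHits(
  logLines: string[],
  prefix: string = "/content/"
): Map<string, number> {
  const hits = new Map<string, number>();
  for (const line of logLines) {
    const match = REQUEST_PATTERN.exec(line);
    if (match === null) continue; // not a request line we understand
    const rawPath = match[1] ?? "";
    const status = Number(match[2] ?? "0");
    // Drop any query string so that one article maps to one key.
    const queryStart = rawPath.indexOf("?");
    const path = queryStart === -1 ? rawPath : rawPath.slice(0, queryStart);
    // Only successful requests for content files count as a "page view".
    if (status !== 200 || path.indexOf(prefix) !== 0) continue;
    hits.set(path, (hits.get(path) ?? 0) + 1);
  }
  return hits;
}

// Made-up sample lines, for illustration only.
const sampleLog: string[] = [
  '192.0.2.1 - - [22/Jul/2005:10:00:00 +0000] "GET /content/products/j2ee_edition.xml HTTP/1.1" 200 5120',
  '192.0.2.2 - - [22/Jul/2005:10:00:05 +0000] "GET /content/products/j2ee_edition.xml?ref=home HTTP/1.1" 200 5120',
  '192.0.2.1 - - [22/Jul/2005:10:00:09 +0000] "GET /content/products/dotnet_edition.xml HTTP/1.1" 200 4096',
  '192.0.2.3 - - [22/Jul/2005:10:00:12 +0000] "GET /images/logo.gif HTTP/1.1" 200 900',
  '192.0.2.3 - - [22/Jul/2005:10:00:15 +0000] "GET /content/products/missing.xml HTTP/1.1" 404 0',
];

const contentHits = countContentHits(sampleLog);
```

Because each content file holds exactly one article, the keys of the resulting map line up with articles, which is the same view a page-based report would give.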

An analogy: the frameset tag

This topic brings back memories of a seemingly ancient HTML tag: the frameset. Most sites with frames also had one URL, and new content was loaded into the 'content' frame. This is very similar to a content-rich AJAX site, and it worked fine with web statistics software.

Moving beyond basic statistics with AJAX

Of course, with framesets you could only load HTML pages. With AJAX you can load anything, such as unformatted XML data. In that case you might conclude that you cannot track how users are navigating through this XML data. This is not really a problem. First of all, the amount of data you can load at once into the browser is limited: it takes too long to download large amounts of data, and the browser could run out of memory. More importantly, with AJAX you move more intelligence to the browser, so you can easily track user behavior on the client and periodically send an update to the server. Agreed, there are no standard solutions for this at the moment, but that's definitely going to change. Let's look at an interesting scenario that we worked on a while ago.

A financial services company wanted to optimize the forms on their website. Their ultimate goal was to optimize revenue per form, which meant minimizing the mistakes made by users. They wanted to know how many times a certain form was filled in incorrectly, per user and on average. This information would allow them to continuously improve the form. The statistics they wanted weren't on a page level, but on a form field level. This is exactly what is possible with Rich Internet Applications. It provides the opportunity to bring statistics to the next level.
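A rough sketch of how field-level statistics like these could be collected: the browser records each failed field validation, sends the records to the server in batches, and the server summarizes them per field. This is not the solution we built for that company. The event shape, the class and function names, and the batch size are all hypothetical, and the transport is left as a callback so the sketch runs outside a browser.

```typescript
// Sketch only: event shape, names, and batch size are hypothetical.

// One failed validation of one form field by one user.
interface FieldError {
  userId: string;
  field: string;
}

type SendFn = (batch: FieldError[]) => void;

// Client side: buffer events in the browser and hand them over in batches.
class FieldErrorTracker {
  private buffer: FieldError[] = [];
  private readonly send: SendFn;
  private readonly maxBatchSize: number;

  constructor(send: SendFn, maxBatchSize: number = 20) {
    this.send = send;
    this.maxBatchSize = maxBatchSize;
  }

  // Call this whenever a field fails validation.
  record(userId: string, field: string): void {
    this.buffer.push({ userId, field });
    if (this.buffer.length >= this.maxBatchSize) this.flush();
  }

  // Call this periodically (for example from a browser timer) so that a
  // partially filled batch also reaches the server.
  flush(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.send(batch);
  }
}

interface FieldStats {
  totalErrors: number;            // all failed validations of this field
  usersAffected: number;          // distinct users with at least one error
  averagePerAffectedUser: number; // totalErrors / usersAffected
}

// Server side: summarize all received events per form field.
function summarizeFieldErrors(errors: FieldError[]): Map<string, FieldStats> {
  const perField = new Map<string, Map<string, number>>();
  for (const error of errors) {
    let perUser = perField.get(error.field);
    if (perUser === undefined) {
      perUser = new Map<string, number>();
      perField.set(error.field, perUser);
    }
    perUser.set(error.userId, (perUser.get(error.userId) ?? 0) + 1);
  }

  const summary = new Map<string, FieldStats>();
  perField.forEach((perUser, field) => {
    let total = 0;
    perUser.forEach((count) => {
      total += count;
    });
    summary.set(field, {
      totalErrors: total,
      usersAffected: perUser.size,
      averagePerAffectedUser: total / perUser.size,
    });
  });
  return summary;
}
```

In a real page the `send` callback would post the batch to the server with an XMLHttpRequest, and `flush` would be called from a timer. Note that the average here is taken over users who made at least one mistake in a field; averaging over all visitors would require the total visitor count as well.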

Advertising & AJAX

I didn't understand the message of the Techweb article regarding AJAX and online advertising. Currently advertisements are typically loaded per page. With AJAX you have the flexibility to decide when a new advertisement is displayed: you don't have to link it to a page refresh or to the loading of new information at all. You could put a timer on the advertisements: a new advertisement every minute, for example. I don't get the point of the Techweb article: can someone explain it to me?
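The timer idea can be sketched in a few lines. This is only an illustration under my own assumptions: the function name is hypothetical, and the rotation is written as a pure function of elapsed time so that it does not depend on any particular browser or ad server API.

```typescript
// Sketch only: a hypothetical helper, not part of any ad server API.

// Which advertisement (0-based) should be visible after `elapsedMs`
// milliseconds, if a new one is shown every `intervalMs` milliseconds
// and the list of `adCount` advertisements repeats?
function adIndexAt(elapsedMs: number, intervalMs: number, adCount: number): number {
  if (adCount <= 0 || intervalMs <= 0) {
    throw new RangeError("adCount and intervalMs must be positive");
  }
  // Number of whole intervals that have passed since the page was opened.
  const ticks = Math.floor(Math.max(0, elapsedMs) / intervalMs);
  return ticks % adCount;
}
```

In the browser you would call this from a timer that fires once per interval and load the advertisement with the returned index into the ad area, without touching the rest of the page. With three advertisements and a one-minute interval, the first ad shows during the first minute, the second during the next, and the first one returns after three minutes.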