Thursday, September 22, 2005

JSON vs XML

Dave Johnson has an interesting post about the preferred data transport format for AJAX: is it XML or JSON? He prefers XML. So do I. Dave emphasizes the importance of XML on the client, using a datagrid to illustrate this. Personally I like the clean architecture you get when you simply load XML to the client and transform it into whatever you like using XSLT. The XML family of standards seems to be a very good fit with the browser world, and - according to Dave's article - the performance is also pretty good.
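
To make this a bit more concrete, here is a minimal sketch of that approach, assuming hypothetical file names and the Mozilla/Firefox XSLTProcessor API (IE would use MSXML's transformNode() instead):

    // Load an XML document from the server (synchronous for brevity)
    function loadXml(url) {
        var req = new XMLHttpRequest();
        req.open("GET", url, false);
        req.send(null);
        return req.responseXML;
    }

    var data = loadXml("products.xml");    // the data document
    var stylesheet = loadXml("grid.xsl");  // the XSLT stylesheet

    // Transform the XML into an HTML fragment and insert it into the page
    var processor = new XSLTProcessor();
    processor.importStylesheet(stylesheet);
    var fragment = processor.transformToFragment(data, document);
    document.getElementById("grid").appendChild(fragment);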

Monday, September 12, 2005

AJAX: reducing latency with a CDN

In my previous article about AJAX and latency I talked about the effect of high network latency on AJAX applications. My suggestion was that well-designed AJAX apps can still be more responsive than traditional web applications, even with high latency. Nevertheless, a low-latency connection is always better than a high-latency one, so on backbase.com we're improving the infrastructure: first we measured performance, now we're optimizing the site itself, and we're considering a Content Distribution Network (CDN). A lot of this also applies to regular websites, but I'll highlight some AJAX peculiarities along the way.

Measuring latency

Every now and then we get complaints about the speed of the Backbase website, so we've looked at several well-known performance management tools, such as Gomez and Keynote: they usually measure entire pages, including images, CSS and JavaScript. Although the Gomez service can handle JavaScript, it has problems with some AJAX sites, including our own, so we couldn't reap the benefits of this advanced service. Therefore we've settled for Watchmouse, which offers affordable availability and performance monitoring for single files from 14 world-wide locations. Watchmouse gives us the following data:

  • Connect time: the connection is made
  • First-byte time: the first byte of the response arrives
  • Download time: the entire file is downloaded

We measured the following averages (times in milliseconds, speed in Kb/sec):


Location       Connect   First byte   Download   Kb/sec
Netherlands       5.47       111.41     298.28   237.92
Sweden           28.50       154.00     413.83   149.81
Italy            49.00       225.00     448.57   185.58
New York        134.56       368.11     892.11    92.89
Florida         118.33       318.83     895.00    73.43
California      169.00       419.83     977.00    70.63
Texas           130.00       343.57    1724.86    29.72
Singapore       361.67       869.00    2102.00    32.62
Australia       369.37       763.17    2040.83    33.07

So the farther you get from our server in Amsterdam, the greater the latency. By latency I mean first-byte time (although you can define latency in different ways). These numbers are for a single file. I've made a rough calculation for the total download time of our home page:

download time =
(# of files / 2 concurrent HTTP connections) * first-byte time
+ (download size / download speed)

I'm not sure if this is entirely correct, but it seems to give a good approximation, for example for California:

(70 / 2) * 0.41983 + 200 / 70.63
= 14.7 (latency) + 2.8 (download time)
= 17.5 seconds
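
Expressed as a (hypothetical) helper function, the approximation looks like this; the numbers are the California measurements from the table above:

    // Rough estimate: latency part + transfer part, assuming the browser
    // uses 2 concurrent HTTP connections per host
    function estimateDownloadTime(numFiles, firstByteSec, sizeKb, speedKbSec) {
        var latencyPart = (numFiles / 2) * firstByteSec;
        var transferPart = sizeKb / speedKbSec;
        return latencyPart + transferPart;
    }

    // (70 / 2) * 0.41983 + 200 / 70.63 = 17.5 seconds
    var seconds = estimateDownloadTime(70, 0.41983, 200, 70.63);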

So for a distant visitor, latency accounts for roughly five times as much of the total time as the actual data transfer. It also shows that most web hosting providers offer adequate download speeds: our provider is small, but the connection is fast. We also tested some large providers, and that didn't make much of a difference. Our action points are:

  • Reduce the number of file requests for the home page
  • Implement world-wide caching (CDN)

Reduce the number of files

When we first measured, the number of files was about 87. We've already reduced this number to 68 files by merging many CSS files into 2 files, merging many HTML files, and combining some images. As you can see in the table below, 2/3 of the files are images: because we don't want to change the design, we can't improve much further. We can still merge the JavaScript files, but then we're more or less done. So if we reduce from 87 files to 60 files we already have a 30% improvement. Then it's up to caching for further improvements.

               Size (bytes)   Size (%)   # of files   % of files
Images:             117,197      62.9%           45        66.2%
HTML:                16,695       9.0%           11        16.2%
CSS:                  8,615       4.6%            2         2.9%
JavaScript:          43,753      23.5%           10        14.7%
TOTAL:              186,260     100.0%           68       100.0%

World-wide caching (CDN)

Even if we have a home page with 60 files, the impact of latency is still considerable. Looking at the characteristics of latency, we have to get the files as close as possible to our web site visitors. Enter 'CDN', or Content Distribution Network. A CDN caches the most-requested files from your site on servers scattered across the globe and sends visitors to the nearest cached copy of your web site. If files are not in cache, or if they are dynamic, they are retrieved from the origin server, usually over an optimized connection.

There are a couple of things to consider when using a CDN with an AJAX website:

  • JavaScript cannot load files from other (sub-)domains
  • Some CDNs are more suitable for AJAX sites
  • AJAX actually works very well with CDNs!

The first item is the security-across-domains issue: if your page is on www.domain.com, AJAX cannot load files from www.whatever.com or whatever.domain.com (only with a trick in IE). Why is this important? Because several CDNs give you an image.domain.com subdomain from which all static files are loaded; this subdomain points to the CDN network's dynamic DNS servers. An alternative is to have www.domain.com itself hosted by the CDN's DNS servers: in that case you specify which files should be cached and which should always be fetched from the origin server. I haven't tested this myself yet, so any feedback is welcome.
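
To illustrate the constraint, here is a minimal sketch (all domain names are hypothetical); the page itself is assumed to live on http://www.domain.com/:

    var req = new XMLHttpRequest();

    // Fine: same host as the page, so it also works when www.domain.com
    // resolves to the CDN's servers
    req.open("GET", "http://www.domain.com/data/products.xml", true);

    // Not allowed: blocked by the browser's same-origin policy, even
    // though it is only a subdomain of the same site
    // req.open("GET", "http://images.domain.com/data/products.xml", true);

    req.onreadystatechange = function () {
        if (req.readyState == 4) {
            // handle the response here
        }
    };
    req.send(null);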

The first large CDN was Akamai: they now have 14,000 servers in 65+ countries. In the last couple of years a new type of CDN, optimized for rich media, has been introduced, for example by Limelight. Those providers have fewer servers, but their connections are optimized for fast delivery of large files with high quality of service. A third approach focuses on world-wide distribution of intranet applications (ERP, CRM, etc.): these providers link up a limited number of office locations, and Netli is the pioneer in this area. In our case we have many small files delivered to a lot of visitors from many different locations, which seems to favor the Akamai model. Mirror Image (and possibly Savvis) offer a similar service. Again, feedback is welcome, as we are still selecting a CDN vendor.

Although AJAX creates some constraints for CDN usage, it can also make a CDN more effective than it is for regular websites. This is because you can 'assemble' the page on the client side. In a traditional site, a page is either dynamic or static (I define static as 'the same for each user', so it could be dynamically generated on the server, just not personalized). If it's static it can be cached; if it's dynamic it cannot. An AJAX page can be partly static and partly dynamic, where the dynamic part is retrieved with XMLHttpRequest. This is also the case on Backbase.com: only 1 file is dynamic, the login dialog. So over 98% of the files can be cached. Pretty good.
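
As a minimal sketch of that split (assuming a hypothetical /dynamic/login.xml as the single dynamic file, and a hypothetical renderLoginDialog() function):

    // Everything else on the page is static and can come from the CDN's
    // edge caches; this one request always goes back to the origin server
    function loadLoginDialog() {
        var req = new XMLHttpRequest();
        req.open("GET", "/dynamic/login.xml", true);
        req.onreadystatechange = function () {
            if (req.readyState == 4 && req.status == 200) {
                renderLoginDialog(req.responseXML);
            }
        };
        req.send(null);
    }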

To summarize

If you are running a global AJAX website with a lot of static file requests, first try to reduce the number of files, and then consider a 'traditional' CDN to further improve performance. Be aware of the 'security-across-domains' issue. Once we've implemented a CDN I'll tell you more about the results. If you have good tips for selecting a CDN, let me know!

Tuesday, September 06, 2005

Backbase in Top 10 Ajax apps - twice

Dan Grossman, a venture capitalist from Venrock Associates, created a Top 10 of Ajax applications. Unlike the Ajax app list on Ajax Patterns, this list ranks the applications from 1 to 10. It does not include applications by large companies such as Google or Yahoo!

Number 1 is Kiko, the online calendar, and number 2 is the Backbase RSS Reader, our newest demo application. The Backbase Portal (one of our longest-standing demos) is featured in 8th place, and the recently updated Backbase Explorer is number 3 on the list of honorable mentions.

Personally I also like the Backbase Windows starter kit, although it could be expanded a bit more. The embedded movie is quite funny, by the way.

Monday, September 05, 2005

Improve the Backbase page on Wikipedia...

We need your help: the Backbase page on Wikipedia could be deleted if we don't add more content to it. As Wikipedia is a collaborative effort (and not a marketing brochure), you can help by adding text to this page. To contribute, just go to http://en.wikipedia.org/wiki/Backbase and click on 'edit this page'. Just make sure the information has a neutral point of view and is interesting for anyone who wants to learn more about Backbase.

Thank you!

Sunday, September 04, 2005

AJAX Latency problems: myth or reality?

I've read many articles on AJAX and network latency in the last year, but every article seems to claim something different: one treats latency as a problem specific to AJAX, while another claims that AJAX applications in particular can overcome network latency problems. Yet others take a hands-on approach and build tools to simulate network latency on localhost. Confused...

So I searched Google for 'AJAX Latency' and read many of these articles again. Below is a short overview of what I've found.

The first hit is from Wikipedia (3 Sept 2005):

"An additional complaint about Ajax revolves around latency, or the duration of time it takes a web application to respond to user input. Given the fact that Ajax applications depend on network communication between web browser and web server, any delay in that communication can introduce delay in the interface of the web application, something which users might not expect or understand."

They refer to 'Listen kids, AJAX is not cool':

"If you writing a user interface, make sure it responds in 1/10th of a second. That’s a pretty simple rule, and if you break it, you will distract the user."

I agree with one thing: you do want a response in 1/10th of a second. But is this realistic? Cédric Savarese points out that - even when it's not 1/10th of a second - the user expects to see something loading: he suggests using a loading indicator as a replacement for the traditional page refresh. But then he also mentions another perspective:

"What happens really is that XmlHttpRequest is not used for what it is good at: asynchronous, behind-the-scene, access to the server, but in the context of a synchronous transaction by the user. Users want instant feedback from an application, and a better way to achieve that is by freeing the application from its over-reliance on the server."

And Michael Mahemoff (of Ajax Patterns fame) adds the following:

"It's not an all-or-none thing. With AJAX, you can continue to download in the background if you want. "

So they suggest moving more intelligence to the client, and loading data in a smart way, ideally asynchronously without having the user wait for it: I think that's really what that first 'A' of AJAX is all about. But this could mean that you're loading more data on application startup, using precious bandwidth. Or maybe you're even loading data that the user will never see (pre-loading). I found a very punchy quote in a discussion on TSS:

"These days, bandwidth is cheap, latency expensive."

I can confirm this from my own experience: on www.backbase.com around 80% of the download time is caused by latency, and 20% by download speed (bandwidth). So you're better off loading some extra data up front than making very frequent requests for small files.
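
A minimal sketch of what 'loading data beforehand' can look like in practice (the URL and the cache object are hypothetical):

    // Pre-load data in the background right after the page loads, so it
    // is already on the client when the user asks for it
    var prefetchCache = {};

    function prefetch(url) {
        var req = new XMLHttpRequest();
        req.open("GET", url, true);    // asynchronous: the user never waits for it
        req.onreadystatechange = function () {
            if (req.readyState == 4 && req.status == 200) {
                prefetchCache[url] = req.responseXML;
            }
        };
        req.send(null);
    }

    // one slightly larger request up front instead of many small ones later
    prefetch("/data/product-catalog.xml");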

We have measured latency and download speed with a global performance measurement system (more about this in part 2), but that's not convenient to use during development because it takes at least a couple of days before you have enough data. Then I read the article by Harry Fuecks:

"(...) alot of AJAX development is happening @localhost so these problems simply aren’t showing up."

So he has created an AJAX Proxy to simulate a high-latency environment on localhost: kudos! In another article he indicates that the use of synchronous requests should be avoided at all times. But even asynchronous requests need to be handled carefully: "Can multiple asynchronous XMLHttpRequests be outstanding at the same time?", asks Weiqu Gao. Harry again did some research, and came up with a couple of recommendations (a minimal sketch follows the list):

  • Avoid Uncaught Exceptions: don't call the XMLHttpRequest object when it's still processing another request
  • Implement a solution for timeouts: unlike some other socket client APIs, XMLHttpRequest doesn't handle them automatically
  • Make it possible to abort a request gracefully
  • Make sure that responses arrive in the right sequence
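
Here is a minimal sketch of two of these recommendations - a manual timeout (XMLHttpRequest has no built-in one) and a graceful abort; the 10-second limit and the callback names are assumptions:

    function requestWithTimeout(url, onSuccess, onTimeout) {
        var req = new XMLHttpRequest();
        var timedOut = false;

        // XMLHttpRequest has no timeout of its own, so use a timer
        var timer = setTimeout(function () {
            timedOut = true;
            req.abort();        // give up gracefully
            onTimeout();
        }, 10000);

        req.open("GET", url, true);
        req.onreadystatechange = function () {
            if (req.readyState == 4 && !timedOut) {
                clearTimeout(timer);
                onSuccess(req);
            }
        };
        req.send(null);
    }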

These are the technical aspects. I have found few articles about the usability aspects (probably because I didn't look hard enough). Marco Tabini quotes on his weblog: "one of the fundamental elements of AJAX programming (...) was to always give your users the appropriate feedback, so that they know when something happens." I agree with that: the user should not be surprised by unexpected behavior of the user interface. Interaction designers should therefore also be aware of some of the latency issues. For them I would summarize it as follows:

  • If a user's action causes a server request, don't expect a response within 1/10th of a second: consider showing a 'loading' message (a minimal sketch follows this list)
  • Specify the usage patterns of an application so that the developers know how preloading of data can best be implemented (think Google Maps, which prefetches maps just outside the border of the screen)
  • Be careful with 'hidden' functionality such as auto-save functionality, because it might conflict with other actions the user performs: cooperate closely with the developer(s) to avoid usability problems.
  • Clearly specify the sequence of events, e.g. 'action 1 has to be completed before the user should be allowed to start with action 2', which gives developers relevant information to avoid concurrency issues.
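
For the first point, a 'loading' message can be as simple as this minimal sketch (the element id and the callback are hypothetical):

    function loadWithIndicator(url, callback) {
        var indicator = document.getElementById("loading");
        indicator.style.display = "block";    // show the "Loading..." message

        var req = new XMLHttpRequest();
        req.open("GET", url, true);           // asynchronous, so the UI stays responsive
        req.onreadystatechange = function () {
            if (req.readyState == 4) {
                indicator.style.display = "none";
                callback(req);
            }
        };
        req.send(null);
    }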

And to make the link between technology and usability, here is a quote from Jonathan Boutelle, who already understood all of this more than a year ago:

"Predownloading data is critical to providing low-latency experiences. But blindly downloading data without consideration for how likely the user is to need it is not a scaleable approach. RIA architects will have to consider these issues carefully, and ground their decisions about preloading in user research, in order to create superior user experiences."

After reading all of this I've come to a tentative conclusion: network latency is an important issue to consider during the implementation of an AJAX application, by both the developer and the interaction designer. If you make the wrong decisions, usability can be terrible. If you make the right decisions, AJAX will significantly improve web application usability. It is still a tentative conclusion, because I'm pretty sure I haven't read all the relevant articles: let me know what your thoughts are.