Monday, September 12, 2005

AJAX: reducing latency with a CDN

In my previous article about AJAX and Latency I talked about the effect of high network latency on AJAX applications. My suggestion was that well-designed AJAX apps can still be more responsive than traditional web applications, even with high latency. Nevertheless, a low-latency connection is always better than a high-latency one, so on backbase.com we're improving the infrastructure. First we measured performance; now we're optimizing the site itself and considering a Content Distribution Network (CDN). A lot of this also applies to regular websites, but I'll highlight some AJAX peculiarities along the way.

Measuring latency

Every now and then we get complaints about the speed of the Backbase website, so we looked at several well-known performance management tools, such as Gomez and Keynote. These usually measure entire pages, including images, CSS and JavaScript. Although the Gomez service can handle JavaScript, it has problems with some AJAX sites, including our own, so we couldn't reap the benefits of this advanced service. Instead we settled on Watchmouse, which offers affordable availability and performance monitoring of single files from 14 worldwide locations. Watchmouse gives us the following data:

  • Connect time: how long it takes to establish the connection
  • First-byte time: how long until the first byte of the response arrives
  • Download time: how long until the entire file is downloaded

We measured the following averages (times in milliseconds, throughput in Kb/sec):


Location      Connect   First byte   Download   Kb/sec
Netherlands      5.47       111.41     298.28   237.92
Sweden          28.50       154.00     413.83   149.81
Italy           49.00       225.00     448.57   185.58
New York       134.56       368.11     892.11    92.89
Florida        118.33       318.83     895.00    73.43
California     169.00       419.83     977.00    70.63
Texas          130.00       343.57    1724.86    29.72
Singapore      361.67       869.00    2102.00    32.62
Australia      369.37       763.17    2040.83    33.07

So the farther you get from our server in Amsterdam, the greater the latency. By latency I mean first-byte time (although you can define latency in different ways). These measurements are for a single file. I've made a rough calculation for the total download time of our home page:

download time =
(# of files / 2 concurrent HTTP connections) * first-byte time
+ (download size / download speed)

I'm not sure this is entirely correct, but it seems to give a good approximation. For example, for California:

(70 / 2) * 0.41983 + 200 / 70.63
= 14.7 (latency) + 2.8 (download time)
= 17.5 seconds
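
For what it's worth, here is the same back-of-the-envelope calculation as a small script (just a sketch of the formula above; the figures are the California numbers from the table):

// Rough estimate: requests queue up on 2 concurrent HTTP connections,
// and the payload itself downloads at the measured throughput.
function estimateLoadTime(numFiles, firstByteSec, sizeKb, speedKbSec) {
    var latency = (numFiles / 2) * firstByteSec;  // waiting on first bytes
    var transfer = sizeKb / speedKbSec;           // raw download time
    return latency + transfer;
}

// California: 419.83 ms first-byte time, 70.63 Kb/sec throughput
var seconds = estimateLoadTime(70, 0.41983, 200, 70.63);
alert(seconds.toFixed(1) + ' seconds');  // about 17.5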

So latency is about five times more important than download size. It also shows that most web hosting providers offer adequate download speeds: our provider is small, but the connection is fast. We also tested the speed of some large providers, and it didn't make much of a difference. Our action points are:

  • Reduce the number of file requests for the home page
  • Implement world-wide caching (CDN)

Reduce the number of files

When we first measured, the home page consisted of about 87 files. We've already reduced this to 68 files by merging many CSS files into 2, merging many HTML files, and combining some images. As you can see in the table below, two-thirds of the files are images; because we don't want to change the design, we can't improve much further there. We can still merge the JavaScript files, but then we're more or less done. So if we reduce from 87 files to 60 we already have a 30% improvement. Beyond that, it's up to caching.

              Size (bytes)   Size (%)   # of files   % of files
Images:            117,197      62.9%           45        66.2%
HTML:               16,695       9.0%           11        16.2%
CSS:                 8,615       4.6%            2         2.9%
JavaScript:         43,753      23.5%           10        14.7%
TOTAL:             186,260     100.0%           68       100.0%

World-wide caching (CDN)

Even with a home page of 60 files, the impact of latency is still considerable. Given the characteristics of latency, we have to get the files as close as possible to our web site visitors. Enter the CDN, or Content Distribution Network. A CDN caches the most-requested files from your site on servers scattered across the globe and sends visitors to the nearest cached copy of your web site. Files that are not in the cache, or that are dynamic, are retrieved from the origin server, usually over an optimized connection.

There are a couple of things to consider when using a CDN with an AJAX website:

  • JavaScript cannot load files from other (sub-)domains
  • Some CDNs are more suitable for AJAX sites
  • AJAX actually works very well with CDNs!

The first item is the cross-domain security issue: if your page is on www.domain.com, AJAX cannot load files from www.whatever.com or whatever.domain.com (except with a trick in IE). Why is this important? Because several CDNs give you an image.domain.com subdomain from which all static files are loaded; this subdomain points to the dynamic DNS servers of the CDN network. An alternative is to have www.domain.com itself hosted by the CDN's DNS servers; you then specify which files should be cached and which should be loaded from the origin server. I haven't tested this myself yet, so any feedback is welcome.
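
For the subdomain case there is a well-known workaround that I haven't tried myself, so consider this a sketch (the hostnames are made up): serve a tiny proxy page from the static subdomain in a hidden iframe, relax document.domain to the shared parent domain on both pages, and let the iframe perform the XMLHttpRequest against its own host.

// On http://www.domain.com/ -- the main page:
document.domain = 'domain.com';  // relax to the shared parent domain
var frame = document.createElement('iframe');
frame.style.display = 'none';
frame.src = 'http://static.domain.com/proxy.html';
document.body.appendChild(frame);

// On http://static.domain.com/proxy.html -- the proxy page:
document.domain = 'domain.com';  // must match the main page
function fetchFile(url, callback) {
    // This XHR targets static.domain.com, the iframe's own origin,
    // so the same-origin policy is satisfied.
    var xhr = window.XMLHttpRequest ?
        new XMLHttpRequest() : new ActiveXObject('Microsoft.XMLHTTP');
    xhr.onreadystatechange = function () {
        if (xhr.readyState == 4) callback(xhr.responseText);
    };
    xhr.open('GET', url, true);
    xhr.send(null);
}

Because both documents then share the same document.domain, the main page can call frame.contentWindow.fetchFile(...) once the iframe has loaded. Note that this only works when the page and the CDN host share a parent domain; it doesn't help for a completely unrelated domain like www.whatever.com.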

The first large CDN was Akamai: they now have 14,000 servers in 65+ countries. In the last couple of years a new type of CDN, optimized for rich media, has been introduced, for example by Limelight. Those providers have fewer servers, but connections optimized for fast delivery of large files with a high quality of service. A third approach focuses on world-wide distribution of intranet applications (ERP, CRM, etc.), linking up a limited number of office locations: Netli is the pioneer in this area. In our case we have many small files delivered to many visitors from many different locations, which seems to favor the Akamai model. Mirror Image (and possibly Savvis) offer a similar service. Again, feedback is welcome, as we are still selecting a CDN vendor.

Although AJAX creates some constraints for CDN usage, it also makes a CDN more effective than a regular website does. This is because you can 'assemble' the page on the client side. In a traditional site, a page is either dynamic or static (I define static as 'the same for each user': it may be dynamically generated on the server, just not personalized). If it's static it can be cached; if it's dynamic it cannot. An AJAX page can be partly static and partly dynamic, where the dynamic part is retrieved with XMLHttpRequest. This is the case on Backbase.com: only 1 file, the login dialog, is dynamic. So about 98% of the files can be cached. Pretty good.
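
The client-side assembly looks roughly like this (a sketch, not our actual code; the URL and element id are made up for illustration):

// Fetch the one dynamic fragment -- the login dialog -- from the
// origin server; every other file on the page can come from the CDN.
function loadLoginDialog() {
    var xhr = window.XMLHttpRequest ?
        new XMLHttpRequest() : new ActiveXObject('Microsoft.XMLHTTP');
    xhr.onreadystatechange = function () {
        if (xhr.readyState == 4 && xhr.status == 200) {
            // Inject the personalized HTML into the otherwise static page.
            document.getElementById('loginDialog').innerHTML =
                xhr.responseText;
        }
    };
    xhr.open('GET', '/dynamic/login.html', true);  // hypothetical URL
    xhr.send(null);
}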

To summarize

If you are running a global AJAX website with a lot of static file requests, first try to reduce the number of files, and then consider a 'traditional' CDN to further improve performance. Be aware of the cross-domain security issue. Once we've implemented a CDN I can tell you more about the results. If you have good tips for selecting a CDN, let me know!

7 Comments:

Blogger Edward! said...

Netli can in fact accelerate Internet content. They already do it for Looksmart, Lexis Nexis and a bunch of other companies. I know Netli's a lot easier and faster to implement than a CDN solution like Akamai or Limelight. There's also a form where they'll do a free analysis for you.

5:02 PM  
Anonymous Anonymous said...

You might want to take a look at Cachefly, also. I don't know much about them, just that they're another CDN, with reasonable prices.

11:59 PM  
Anonymous Anonymous said...

Hi,
it happens that I know the CDN world very well.
We are currently Speedera resellers in Paris.
You can drop me an email at dwetzel at altern dot org if you want to discuss it further.
Regards

5:57 AM  
Blogger Mark Ranford said...

Don't go and spend bucks on outdated concepts, please. Look, you are on the edge of the curve in one area, and then looking at a behind-the-curve approach in another. First, P2P-based tech is going to eat those players for lunch. Second, there is a next-generation internet infrastructure in R&D that is already available for use, with top players behind it, e.g. Intel (PLANET). So my advice would be to check out CORAL, one of the implementations: it's free, it's the future, and because of your readership you'd do these new CDN developments a service. I used to work at a CDN startup and yes, we competed against Akamai and Speedera among others, but unfortunately the best tech doesn't always win :-)

7:55 AM  
Anonymous Anonymous said...

Hi Mark,
I partially agree with the last comment. The only issue I see with P2P apps is the low upload bandwidth that end users like me have. What was the name of the CDN startup you worked for?

10:34 AM  
Anonymous Anonymous said...

I agree with Edward that Netli can actually work in your favor, for the following reasons:

(a) Count the number of RTTs required for downloading the entire page. A 'typical' 70 Kb page with a mix of static and dynamic content takes about 30 RTTs to finish downloading, which Netli brings down to 3. There is your biggest saving: avoiding the extra RTTs that go into each retrieval, and hence the latency perceived by users far away. (I measured this myself.)

(b) With CDNs like Akamai, you still have the overhead of warming the caches and repopulating them when the site content changes in any way. Netli offers accelerated transport direct from the origin server, which obviates the need for such mechanisms.

(c) Though Netli started out as a dynamic application delivery service, recently it seems to offer a static content offload much like Akamai's.

Hope this helps.

5:41 PM  
Anonymous Anonymous said...

Actually, Akamai has some good capabilities and performs better than Netli in Netli's niche. We ran a head-to-head trial with both, and Akamai performed better. I had thought Netli would be better in the application performance area, but Akamai proved to be consistently faster. Regarding the points about the overhead of pre-warming the cache, etc.: that actually isn't true. Akamai can be configured to apply different optimizations for cacheable vs. non-cacheable content, and RTT optimizations are built into their edge-to-client mapping system. In any event, they all have their pros and cons, but Akamai seemed to have lots of useful features in our experience.

Also, on Mark Ranford's points re: P2P, from a technical perspective P2P can compete with CDNs, I think. The issue is that customers want the business side of things: reporting, control of content, ability to bill, high availability, etc. These don't seem to be working in P2P's favor at this point. We will see how it evolves, though, and you make some good points.

2:08 PM  
