Good question: We own a load balancer/ADC. Why not just use it to accelerate our site?

Once you’ve been in an industry for a long time, you start hearing the same patterns in client meetings. They seem to have a checklist of objections that they must all get from the same website. (I can’t find the site, so maybe only CTOs and VPs get access.)

Last week, I was in a client meeting and got lobbed a nice softball sales objection that I hear all the time:

“We own load balancers/ADCs (from F5, Citrix, Radware, A10, Cisco, Brocade, Riverbed, etc.). Why shouldn’t we just use them to accelerate our site?”

I relish this question because it lets me pontificate on one of my favorite subjects: all of the differences in culture, technology, and focus between the network players and those of us who toil at the highest layer of the stack.

After answering the question for the client, I realized that I’ve never written a post on ADCs (aka application delivery controllers), what they offer when it comes to performance, and whether or not they’re worth it when it comes to delivering a faster end-user experience. So here goes…

Caveat: I can already see the hate mail piling up. To be clear, I’m NOT saying you don’t need an ADC. I think many sophisticated customers need an ADC. We own a number of them, and they are mission-critical to our business.

The point of this post is to:

  • demystify web acceleration and front-end optimization from an ADC perspective
  • look at ADC features from the perspective of real end-user performance
  • see if an ADC provides any end-user performance value beyond what a properly configured web server provides under normal circumstances.

Defining the web acceleration space

Solutions in this space go by many names: load balancer, application delivery controller, traffic manager, and so on. These days, most people are using the term “application delivery controller” (ADC). An ADC is an appliance or software that sits in front of your web servers in your datacenter. It sees all of the traffic to and from your web servers. Originally, solutions in this space were simple products that were designed to distribute workload across multiple computers. In recent years, the market has evolved to include more sophisticated features.

Here are just a few of the things ADCs do well:

  • load balancing
  • improving the scale of server infrastructure
  • server availability and fault tolerance
  • security features
  • layer 7 routing
  • a whole host of other infrastructure services that are vital to today’s modern network designs

For more, my friends Joe and Mark at Gartner do a great job of describing the main players and highlighting capabilities in the Magic Quadrant for application delivery controllers. While Strangeloop is included in this Magic Quadrant, we are not an ADC. Unlike ADC solutions, we focus on front-end optimization. But we frequently get lumped in with ADCs because, like them, we have an appliance that sits in front of servers and does good stuff for web traffic. The true ADC vendors that matter are F5, Citrix, Radware, A10, Cisco, Brocade, and Riverbed (Zeus).

Let’s clarify our terms: What “performance” means for ADC vendors vs. what it means for the FEO community

Now let’s focus on the performance aspect.

ADCs do not focus on front-end problems, yet the words they use are very similar — and in some cases identical — to the words used by the front-end performance community. We need clarification.

The term “performance” is often used very differently by ADC vendors, so let me be clear: what I am talking about is how fast a page loads on a real browser for a real person in a real-world situation. Although this seems obvious, I constantly hear the terms “performance” or “acceleration” used to represent scale or offload (when functionality is “offloaded” from the server and moved to a device instead), or in some cases technical minutiae that have no discernible impact on the performance of web pages for real users.

From a performance perspective, a typical ADC helps mostly with back-end performance. Back-end performance optimization means optimizing how data is processed on your server before it’s sent to the user’s browser. ADCs provide most of their benefit by offloading jobs from the web server, which in turn allows the web server to focus all of its energy and horsepower on serving pages.

While ADCs contribute to the smooth functioning of a site’s back end, those of us in the performance community have established and re-established that the major problem with user-perceived web performance is not in the back end. According to a recent post by Steve Souders, between 76% and 92% of end-user response time is at the front end. Server overload can happen during crazy traffic spikes (like those experienced by florist sites on Valentine’s Day morning), but the fact is that most web servers are not overloaded most of the time. As a result, most of these offload solutions don’t really help user-perceived web performance on a day-to-day basis.
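To make the implication of those percentages concrete, here is a rough back-of-the-envelope sketch. It assumes the front-end and back-end shares simply add up to 100% of response time, and the function name is mine, purely for illustration. It shows the ceiling on what any back-end-only improvement could deliver, even if back-end time dropped to zero.

```python
def max_saving_from_backend(front_end_share: float) -> float:
    """Upper bound on the fraction of total response time that could be
    saved if back-end time were reduced to zero.

    Assumption: response time = front-end time + back-end time.
    """
    if not 0.0 <= front_end_share <= 1.0:
        raise ValueError("front_end_share must be between 0 and 1")
    return 1.0 - front_end_share


# The two ends of the range quoted above (76% and 92% front end).
for share in (0.76, 0.92):
    ceiling = max_saving_from_backend(share)
    print(f"front end {share:.0%} -> back-end work can save at most {ceiling:.0%}")
```

In other words, under this simple model a perfect back end buys somewhere between 8% and 24% of total response time, and real offload features deliver only a fraction of that ceiling.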

When we do see back-end problems, they’re rarely load related and are more often issues with costly database lookups or other application-logic issues that cause back-end time to be greater than average. And let’s not forget that, when there are back-end problems, they almost always contribute to the time to first byte of the HTML. Static objects are rarely affected by the same issues that typically show up with back-end problems.

If you apply core performance best practices via an ADC, do you get a faster end-user experience than you would via your server?

I took four typical performance optimizations and applied them using both an ADC and a traditional web server under normal load. The goal was to get a side-by-side, before-and-after look at whether or not the ADC delivered a faster end-user experience than the server.

1. Compression

Compression involves encoding information using fewer bits than the original. Compression reduces payload which reduces download time. In the web world, it is often done using gzip or deflate.
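For readers who want to see the payload reduction for themselves, here is a minimal sketch using Python’s standard gzip module. The HTML payload is made up for illustration (repetitive markup, which compresses well); real pages will compress by different amounts.

```python
import gzip

# Hypothetical HTML payload: repetitive markup, similar in spirit to a
# product listing page. Not taken from any real site.
html = ("<div class='product'><span>Item</span></div>\n" * 500).encode("utf-8")

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)

print(f"original: {len(html)} bytes")
print(f"gzipped:  {len(compressed)} bytes ({ratio:.1%} of original)")

# Decompression restores the exact bytes, which is what the browser does.
assert gzip.decompress(compressed) == html
```

The same bytes come out whether the compression runs on the web server or on a device in front of it, which is why the interesting question is where it runs, not whether it helps.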

I have been talking about compression for years. It really helps front-end performance. Compression is available for free on all modern web servers, so the question here is: do I gain any speed by turning on compression in my ADC versus the compression I would have via my web server? Compression helps reduce the download time (the blue bar in the waterfall below).

Observation: We don’t see any material difference when using compression in an ADC versus compression on a web server.

Conclusion: Overall benefit is minimal to none when compared to a normal web server.

2. Multiplexing to the back end

ADCs (as well as some standalone acceleration solutions) use a technique called “TCP multiplexing” (also called HTTP multiplexing, TCP pooling, and connection pooling), which frees the server from having to maintain many, many concurrent connections from many, many clients. Connections take up resources on the server, and servers can’t hold open a very large number of connections at once. Gradually, the server degrades with more and more concurrent connections. With each new connection, after some threshold, it takes a tiny bit more time to open a new one. Ultimately, the server will probably get to a point where it can’t open new connections, so the new connections just hang or get rejected, depending on how the stack handles it.

ADCs handle the concurrent connection problem up front, and allow the server to only have to deal with a smaller number of very long-lasting connections, which the ADC maintains and manages with each of the servers. The main benefits of the ADC in this case are consistency and predictability.
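The idea can be sketched with a toy model. This is not how any particular ADC implements it: the class and connection names are hypothetical, requests are handled one at a time, and no real sockets are opened. It only illustrates that many client requests can be funneled through a handful of long-lived back-end connections.

```python
from collections import deque


class BackendPool:
    """Toy model of TCP multiplexing: many client requests share a small
    set of long-lived connections to the web server."""

    def __init__(self, size: int) -> None:
        # Pretend these connections were opened once, up front.
        self.idle = deque(f"backend-conn-{i}" for i in range(size))
        self.opened = size  # connections the server had to accept

    def handle(self, request: str) -> str:
        conn = self.idle.popleft()  # borrow an already-open connection
        try:
            return f"{request} via {conn}"
        finally:
            self.idle.append(conn)  # hand it back for the next request


pool = BackendPool(size=4)
responses = [pool.handle(f"req-{i}") for i in range(1000)]

# 1,000 client requests, but the server only ever saw 4 connections.
print(len(responses), "requests over", pool.opened, "back-end connections")
```

Without the pool, the server in this model would have had to set up and tear down a connection per client, which is exactly the work the ADC takes off its hands.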

The experiment below was designed to highlight how much TCP multiplexing helps from an immediate end-user performance perspective.

We turned multiplexing off and on for a production site under normal load, and observed the performance benefits. (This site represents a typical mid-tier e-commerce site: 75 HTML page requests per second, which equates to roughly 200 million page views per month.)
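The monthly figure follows from simple arithmetic. The sketch below assumes a constant request rate, a 30-day month, and one HTML page request per page view; those are my simplifications, not measurements.

```python
REQUESTS_PER_SECOND = 75          # HTML page requests, from the site above
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400
DAYS_PER_MONTH = 30               # assumption: a 30-day month

page_views_per_month = REQUESTS_PER_SECOND * SECONDS_PER_DAY * DAYS_PER_MONTH
print(f"{page_views_per_month:,} page views per month")  # 194,400,000
```

That works out to about 194 million, which rounds to the “roughly 200 million” quoted above.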

Without multiplexing:

With multiplexing turned off, we see an average time to first byte on the HTML of 169 ms.

With multiplexing:

We then turned on multiplexing, waited for an hour, and sampled the site again. With multiplexing turned on, we see an average time to first byte on the HTML of 168 ms.

Observations: Concurrent connection management, set-up, and tear down in most modern web servers is efficient. In some edge situations where the servers are under serious load, this feature will show a bigger benefit, but for the most part our findings were that multiplexing to the back end had limited overall performance value. (Note: I’m not going to get into the offload value in this post, but this is not to say that multiplexing is worthless. Remember: I’m only speaking to the end-user performance benefit.)

Conclusion: Immediate acceleration benefit for most sites is minimal to none. However, it’s a good idea to turn on multiplexing anyway, since it will help with edge cases (such as being hit by a flood of traffic). It also helps shield the server from spikes and unusual loads, and provides a more predictable performance environment for the servers.

3. Object and page caching

Object and page caching are aimed at offloading the server and reducing time to first byte. These features are offered by most vendors, but they’re often hard to configure, and so they’re rarely used. Instead, many customers who use a CDN will “offload” caching responsibility to their CDN or to a standalone cache in their network (e.g. Varnish).

When you look at this technique from a waterfall perspective, you see that the performance benefit comes from reducing time to first byte for three reasons:

  • The ADC may return objects faster than your web servers.
  • Offloading the work of serving those objects would allow the web server to focus on serving dynamic content and further improve time to first byte (TTFB) on dynamic objects.
  • Back-end issues can cause the HTML to take a long time for the server to generate, in which case caching the HTML for the page will help, especially with TTFB. But keep in mind that this only works if the HTML is static enough to be cacheable, even if it’s for a few minutes at a time.
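The “cache the HTML for a few minutes at a time” idea from the last bullet can be sketched as a small time-to-live cache. This is a minimal illustration, not any vendor’s implementation: the class name, the render callback, and the 120-second TTL are all hypothetical, and the same logic could live in an ADC, a standalone cache, or the web server itself.

```python
import time
from typing import Callable, Dict, Tuple


class TTLPageCache:
    """Serve a stored copy of a page until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the cache can be tested
        self._store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str, render: Callable[[str], str]) -> str:
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            self.hits += 1          # fresh copy: skip the slow back end
            return entry[1]
        self.misses += 1            # missing or stale: regenerate and store
        html = render(url)
        self._store[url] = (now, html)
        return html


def slow_render(url: str) -> str:
    # Stand-in for expensive page generation (database lookups and so on).
    return f"<html><body>page for {url}</body></html>"


cache = TTLPageCache(ttl_seconds=120)
cache.get("/home", slow_render)   # miss: page is generated
cache.get("/home", slow_render)   # hit: served from the cache
print(cache.hits, cache.misses)
```

Only the misses pay the full generation cost, which is where the time-to-first-byte improvement comes from.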

For example, the time to first byte on the HTML (highlighted in the red box) may decrease with page caching.

Observations: In most cases, I see very little difference between the time to first byte on objects or pages served from an ADC cache versus a web server cache. Page caching can mask back-end issues that cause slow HTML generation, but the page can be cached either at the ADC or the server itself. In many cases, the offload benefit can help busy web farms, but an ADC cache versus your web server cache is not going to buy you much benefit. Also, as mentioned above, for most public-facing sites, site owners rely on the CDN to provide caching.

Conclusion: Performance benefit on most pages <50 ms.

4. TCP/IP optimization

Every vendor will claim a “state of the art” TCP/IP stack and “hundreds” of improvements to make sites faster. Most of these optimizations fall into two categories:

  • Expanding buffer sizes and detecting low latency to manage congestion
  • Ways to limit packet loss and recovery in the case of dropped packets

Obviously, a good TCP/IP stack is important. The question here is: does it materially affect performance when measured against a modern web server with a standard configuration?

As the waterfall below shows, the TCP/IP stack improvement would affect the time to first byte as well as the download time.

Specific to HTTP, implementing keep-alive connections to maintain longer TCP/IP connections with clients is also something ADCs do. We all know the benefit of keep-alive connections, and we know that they help front-end issues since the cost of setting up and tearing down new TCP connections is minimized for the browser. But, like most other issues here, modern web servers are pretty good at using keep-alive connections with clients. So the ADC isn’t improving page performance or helping front-end issues any more than the server would do this with a few configuration tweaks that are probably in place anyway.
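To put a rough number on the keep-alive benefit, here is a deliberately simplified model. It assumes requests are fetched one after another on a single connection, that each new TCP connection costs one extra round trip for the handshake, and it ignores TLS, slow start, and parallel connections. The function name and the sample numbers are made up for illustration.

```python
def sequential_fetch_ms(n_requests: int, rtt_ms: float,
                        server_ms: float, keep_alive: bool) -> float:
    """Estimated total time to fetch n_requests objects one after another.

    Simplified model: one round trip per TCP handshake, plus one round
    trip and server_ms of server time per request.
    """
    if n_requests < 1:
        raise ValueError("n_requests must be at least 1")
    handshakes = 1 if keep_alive else n_requests
    return handshakes * rtt_ms + n_requests * (rtt_ms + server_ms)


# Hypothetical page: 10 objects, 50 ms round trip, 20 ms server time each.
without = sequential_fetch_ms(10, rtt_ms=50, server_ms=20, keep_alive=False)
with_ka = sequential_fetch_ms(10, rtt_ms=50, server_ms=20, keep_alive=True)
print(f"without keep-alive: {without:.0f} ms, with keep-alive: {with_ka:.0f} ms")
```

The saving in this model is one round trip for every request after the first. Note that it is the same whether the keep-alive connection terminates at an ADC or at the web server.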

Observation: Each vendor seems to have a different approach to what comes out of the box. Overall, the TCP/IP stacks of the vendors are much better than out-of-the-box web server stacks, and they do make a difference. The difference is in line with my observations about the TCP stack improvements presented by dynamic site acceleration (DSA).

Conclusion: Performance gain under normal circumstances <150 ms.

Summary

Performance best practice    Performance gain when implemented using an ADC vs. a web server under normal load
Compression                  Minimal to none
TCP multiplexing             Minimal to none (but still good to use in case of traffic spikes)
Object and page caching      <50 ms
TCP/IP optimization          <150 ms

So… what to do with this information?

You need to understand what an ADC will do for you and how it will help user-perceived performance. Buying it for security, scalability, and offload is a very different decision than if you want it purely for acceleration or think it will make your site load 40% faster.

It is very common to lump the benefits of offload and the benefits of site acceleration into one category. You need to separate these categories.

If you’re considering an ADC purchase solely to make your pages faster for your users, I recommend following these steps:

  1. Determine if your web servers are modern and have the capacity to handle your request volume. Also ensure they are using compression and object caching.
  2. Get waterfalls from different locations using WebPagetest. Check out the HTML bar (usually the first bar) and see if reducing the server think time (green: assume 10%, only if you can cache your HTML) and the download time (blue: assume 5% in North America and Europe and 25% in other parts of the world) actually brings you a benefit.
  3. Research the different vendors, or email me. (I know them all and would be happy to help.)
  4. Pick a few ADCs to try.
  5. Test each vendor yourself using an open-source tool like WebPagetest. Be wary of the tests the vendors send you as these often do not reflect how the site performs for a real end user.
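The estimate in step 2 can be turned into a quick calculation. The sketch below just applies the rule-of-thumb percentages from that step to the think time and download time you read off the HTML bar of your own waterfall. The function name, the region labels, and the sample timings are placeholders of mine, not measurements.

```python
def estimated_html_saving_ms(think_ms: float, download_ms: float,
                             html_cacheable: bool, region: str) -> float:
    """Rough saving on the HTML bar, using the percentages from step 2.

    think_ms:    server think time (the green part of the bar)
    download_ms: download time (the blue part of the bar)
    """
    # 10% of think time, but only if the HTML can be cached.
    think_cut = 0.10 if html_cacheable else 0.0
    # 5% of download time in North America and Europe, 25% elsewhere.
    download_cut = 0.05 if region in ("north_america", "europe") else 0.25
    return think_ms * think_cut + download_ms * download_cut


# Hypothetical waterfall reading: 200 ms think time, 100 ms download.
saving = estimated_html_saving_ms(200, 100, html_cacheable=True, region="europe")
print(f"estimated saving: about {saving:.0f} ms")
```

If the number that comes out is small compared with your total page load time, that is a strong hint the purchase won’t move the needle for your users.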

My opinion: I remain skeptical that an ADC can help with end-user performance in any meaningful way.

I see ADCs more as a savvy way to commoditize offload devices than really helping with front-end performance.

As an acceleration play, I see that ADC vendors have fallen way behind. Their tools are very expensive and, when it comes to performance, they offer small incremental performance gains.

I do see a lot of potential for this area of the technology stack. I think it has some promise, but there’s been very little innovation in recent years. There have been no truly exciting developments since Cisco acquired FineGround in 2005 and F5 acquired its Web Accelerator product in 2006. These products, like most others in the space, have not evolved in 5-6 years.

I’d like to be more convinced. If anyone has real-world performance data with compelling evidence that ADC performance gains are significant compared to a well-configured web server under normal circumstances, I’m all ears.
