The experience your customers have while interacting with your company's online presence says so much about your business, its priorities, and your brand. Whether your company conducts online transactions or not, performance optimization has become more of a "need" than a "want". A slow-performing web site is bound to see less engagement among critical audiences, lower transaction volume, degraded brand fidelity, and higher bounce rates. In this post, we will talk about some of the key considerations when evaluating web performance technologies and vendors.
In discussing performance measurement, many of the methodologies described in this post will focus on so-called "synthetic measurements". Although Real User Monitoring (RUM) data is becoming the industry standard for measuring actual user experience, it is not a practical means of gathering data for all use cases. In those cases where real user data cannot be collected, we can fall back to synthetic testing, which has its own methodology.
Akamai conducts many dozens of synthetic tests for customers every week - in addition to constantly collecting RUM data. What follows are some observations to help companies design better evaluation criteria and get better data in cases that dictate synthetic measurement - in this case, when comparing performance configurations across multiple CDNs in a pre-production (i.e. "not live") environment.
Step 1 - Environment Setup
When evaluating multiple CDN configurations for a single web property, each of the CDN vendors should be able to provide a mock testing environment with no changes required from the customer. This translates to a "trial" hostname, which will mimic the live production web site. For example, if we were to evaluate vendors to accelerate www.customer.com, each of the vendors should provide a temporary hostname to test, which looks like www.customer.<cdn_identifier>.com.
Step 2 - Environment Validation
The temporary hostname should be an "exact" match of the live production web site. The list below will serve as a good starting point for validation:
1. The number of objects served over the "trial" hostname should be equal to the number of objects on the production site.
2. The number of bytes served should be equal to or less than the production site's. It could be "less" than the production site based on the compression features enabled by the vendor.
3. All the vendors must have the same set of caching rules applied. Some vendors have the ability to cache dynamic requests (like the base HTML page itself). If a particular vendor is caching while others are not, then the tests will be biased toward the vendor with more aggressive caching rules, so a thorough configuration review of each vendor is required.
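The first two validation checks above can be sketched as a simple comparison script. This is a minimal illustration, not a production tool: `prod` and `trial` are hypothetical manifests of (URL path, bytes served) pairs that you would collect yourself, for example from a crawl or a HAR export of each hostname.

```python
# Sketch: validate a trial hostname against production, assuming each
# manifest is a list of (url_path, bytes_served) pairs. All names and
# figures here are illustrative.

def validate_trial(prod_manifest, trial_manifest):
    prod_paths = {path for path, _ in prod_manifest}
    trial_paths = {path for path, _ in trial_manifest}

    prod_bytes = sum(size for _, size in prod_manifest)
    trial_bytes = sum(size for _, size in trial_manifest)

    return {
        # Check 1: the same set of objects on both hostnames
        "objects_match": prod_paths == trial_paths,
        "missing_on_trial": sorted(prod_paths - trial_paths),
        # Check 2: trial bytes should be <= production (compression may shrink them)
        "bytes_ok": trial_bytes <= prod_bytes,
        "byte_savings": prod_bytes - trial_bytes,
    }

prod = [("/index.html", 50_000), ("/app.js", 120_000), ("/logo.png", 30_000)]
trial = [("/index.html", 18_000), ("/app.js", 40_000), ("/logo.png", 30_000)]
report = validate_trial(prod, trial)
```

A report like this makes the caching-rule review (check 3) easier as well: a vendor whose manifest is missing objects, or whose byte savings look implausibly large, warrants a closer configuration review.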
Step 3 - Performance Test Setup
Once the testing environment is set up accurately, the next step is setting up the actual performance tests. Consider the following key questions before you set up the "right" tests based on what you are trying to learn from the trial:
1. What type of testing?
As we described, when evaluating pre-production performance configurations, there are no 'real users' - hence no real end user data - during the evaluation phase, so we will not be able to compare RUM data. This kind of use case dictates the use of synthetic tests to conduct the evaluation. Synthetic tests provide a "clean room" environment to measure performance. However, even within the family of synthetic tests, there are different types - backbone, last-mile and cellular. It is best to run last-mile tests as they more closely represent end user experiences when compared to backbone tests. If your company has a meaningful number of users connecting over mobile devices, then push for synthetic testing over cellular networks. This will also open up a conversation around mobile acceleration capabilities across vendors.
2. Which testing platform to use?
Generally, you'll benefit from using a third-party testing platform with a history of delivering results within your industry. Do some quick research to see if the testing platform publishes an availability and performance index for companies in your industry. Do not use any internal testing tools specific to a particular vendor, as there might be bias in the setup. A good synthetic testing platform should have a stable set of global test agents (particularly in cities and countries relevant to your business). Care should be taken to make sure that the test agents are simulating end user behaviors on real browsers. Some testing platforms use emulated browsers, while others can test with actual (real) browsers. Emulated browsers capture only network times, whereas real browsers also account for front-end (or "render") time, browser caching, parallel connections, etc. Hence, real browsers should be used where possible.
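The gap between emulated and real-browser measurements can be made concrete with a small sketch. The timing fields below are illustrative (milliseconds), loosely modeled on the phases a typical waterfall chart reports; they are not any particular platform's schema.

```python
# Sketch: why emulated and real-browser measurements differ.
# All timing values are hypothetical, in milliseconds.

def network_time(t):
    # Roughly what an emulated browser reports: network phases only.
    return t["dns"] + t["connect"] + t["ttfb"] + t["download"]

def page_load_time(t):
    # What a real browser additionally accounts for: front-end render work.
    return network_time(t) + t["render"]

sample = {"dns": 40, "connect": 60, "ttfb": 250, "download": 400, "render": 900}
```

With numbers like these, the render phase dominates the experience - which is exactly the part an emulated browser never sees.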
3. From where should the tests be run?
Testing locations should account for both "long haul" delivery (across countries and continents) as well as in-region delivery - again, depending on your business model and the audiences you are trying to reach. For example, if your origin server is hosted in North America and delivers to global audiences, it would be good to validate performance not just from test agents within the US but also from other global locations.
4. What should be tested?
A typical user transaction on the site - consisting of multiple pages that represent a common usage pattern for users completing a specific task or tasks - should be tested. This serves two purposes. First, it simulates the performance gains that an end user will potentially see in a real-world transaction and second, it allows you to get a sense of multiple features that an optimization solution may bring to bear on your site. Some performance optimizations will benefit one type of page, while others may benefit another type of page, depending on the structure and data or object characteristics of the site. Testing a single object to evaluate the benefit of caching or just focusing on a single "base" html page without complex characteristics or tiny API transactions will likely not showcase the true power and relevance of the solution.
5. How long should the tests run?
The testing period should cover both peak and non-peak business days. Data over a period of 3-5 days should provide a good representation. Testing over short periods of time (hours, for instance) might have skewed data due to network inconsistencies and outages or audience drops and spikes.
6. What is the frequency of the tests that need to be run?
High-frequency tests simulate high-traffic scenarios where the content is always fresh in cache, and low-frequency tests simulate low-traffic scenarios where content is fetched from your data center (or "origin") every time. Under a normal scenario, a test that runs every 30 minutes from each of the testing agents gives a reasonably realistic simulation of traffic.
During the testing phase, additional bandwidth utilization at the origin should be expected.
Step 4 - Review of Performance Metrics
The following section provides examples of testing results to help establish criteria for reviewing the performance results.
One useful metric to evaluate is the total response time of the series of pages representing a complete user transaction. Also, the W3C-specified DOM Complete metric can be useful in cases where front-end optimizations (FEO) are applied.
Histogram: Histogram views of the results will be helpful in understanding the distribution of tests represented both before and after applying optimizations:
The first graph, of average response time, does not clearly show which test case performs better... but the second graph, the histogram distribution, shows that the blue series of tests has 15% more of the audience experiencing load times of less than 7 seconds. The histogram view also helps remove the outliers in the tests.
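The effect described above - similar averages hiding very different distributions - is easy to reproduce. The two series below are invented load times (in seconds) chosen to illustrate the point; they are not the data behind the graphs.

```python
# Sketch: why a histogram view beats a bare average.
# Two illustrative series of page load times (seconds): similar means,
# very different shares of users under a 7-second threshold.
from statistics import mean

blue = [3, 4, 5, 6, 6, 6, 6, 24]   # mostly fast, one outlier drags the mean up
red  = [7, 7, 8, 8, 8, 8, 8, 8]    # uniformly mediocre

def share_under(series, threshold):
    return sum(1 for t in series if t < threshold) / len(series)
```

Here `blue` has the better mean only marginally, yet 87.5% of its samples finish under 7 seconds versus 0% for `red` - the kind of difference a distribution view surfaces and an average buries.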
In-Depth analysis: In addition to the overall metrics, time should also be spent on the following:
1. Break down results by testing location: The overall result can be skewed by a few testing locations out-performing or under-performing. A per-location breakdown also helps validate performance from the locations that matter most to the business.
2. Break down results by transaction steps: Some of the performance solutions might not be able to accelerate certain steps in the transaction. For example, the transaction might have a step to upload a file. Not all CDN vendors have an option to accelerate uploads. Looking at the response times by steps will help identify the better vendor based on your business needs.
Availability: An availability drop in the test results is most likely due to script failures rather than CDN server issues. This should not be confused with the uptime of the solution. Performance metrics should be compared only from tests with healthy availability (90%+); the remaining failures can be due to testing platform issues.
Another key metric to note is the offload information. The vendor should be able to provide data on hits and bandwidth offloaded, to help calculate the cost savings. This will help plan the origin infrastructure investment once a vendor is selected.
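Turning vendor-reported offload numbers into a savings estimate is straightforward arithmetic; a sketch follows. All figures, including the per-GB origin cost, are hypothetical placeholders for your own numbers.

```python
# Sketch: turn vendor-reported offload data into savings estimates.
# All figures, including cost_per_origin_gb, are illustrative.

def offload_report(edge_hits, origin_hits, edge_gb, origin_gb, cost_per_origin_gb):
    total_hits = edge_hits + origin_hits
    total_gb = edge_gb + origin_gb
    return {
        # Share of requests the CDN answered without touching origin
        "hit_offload": edge_hits / total_hits,
        # Share of bytes the CDN served from its edge
        "byte_offload": edge_gb / total_gb,
        # Rough origin egress cost avoided over the period
        "origin_egress_saved_usd": edge_gb * cost_per_origin_gb,
    }

report = offload_report(edge_hits=9_500_000, origin_hits=500_000,
                        edge_gb=900, origin_gb=100, cost_per_origin_gb=0.08)
```

Numbers like these feed directly into the origin infrastructure planning mentioned above: higher sustained offload means less origin capacity to provision once a vendor is selected.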
Step 5 - Other Considerations
In addition to performance of the solutions, each vendor should be evaluated based on the following parameters.
Network Scale and Presence:
Many vendors claim to have a global presence, but this presence may be low in areas that matter most to the business. It is essential to learn more about the scale of the network, and the specific deployments.
Product Roadmap:
The preferred vendor should share the product roadmap so that the solution remains future-proof. As the business expands, the vendor must be able to extend services that are beneficial to the business.
Custom Solutions and Support:
Many companies decide to work with a vendor who has dedicated services teams to design and support custom solutions based on the platform. Other companies recognize their need for world-class 24/7 customer support. Still others may be highly self-reliant, relying heavily on self-service and APIs to utilize their CDN of choice. What is appropriate for you and your business will be specific to your needs.
In conclusion, we, as part of the Performance Specialist Team at Akamai, follow and recommend the above best practices with our prospects to help them evaluate the performance of the various optimization solutions available in the market today. This helps them come to their own conclusions regarding the appropriateness of our solution.