Written by: Ritesh Kalyanshetty and Sandeep Singh
Pervasiveness of CDNs within the expanding Media and Entertainment Industry
Media and entertainment companies, including content owners and over-the-top (OTT) service providers, are living in an era marked by a dramatic increase in content consumption, fueled by the widespread availability of online video content and by factors like hyper-connected devices and ubiquitous high-speed data.
To paint a quick picture of the state of affairs, the Subscription Video on Demand (SVoD) market is generating revenues of USD 24.7 Billion in 2019 and is expected to grow to USD 28.2 Billion by 2023 at a CAGR of 3.2%. In terms of consumption, globally, consumers will spend 84 minutes a day watching online video in 2020, up 25% from 2018.
This rapid growth in streaming video traffic, along with viewers' expectations of a high-quality viewing experience across devices and networks, has led media organizations to widely adopt Content Delivery Networks (CDNs) to provide high availability and high performance to end-users.
CDNs today carry a significant portion of the world's Internet traffic and play a ubiquitous role in mitigating the toughest challenges of delivering content over the Internet. By delivering enormous amounts of media at scale, they are an essential tool for offering end-users a seamless video viewing experience.
According to the recently published IDC MarketScape: Worldwide Commercial CDN 2019 Vendor Assessment report, which positioned Akamai in the Leaders category, CDNs will carry 72% of internet traffic by 2022, up from 56% in 2017. The report also mentions that CDNs have become an essential tool to handle the demands created by the massive amount of web content, high-definition (HD) video, and large downloads on the internet today.
With Opportunities Come Challenges - Monitoring CDN Performance
CDNs make it possible to deliver a high-quality viewing experience to end-users by providing a highly distributed edge delivery network with the scale and availability needed to meet viewer demands for quality and performance.
Regional dependencies on an ISP's performance, peering across networks, and regional capacity limitations expose a CDN platform to the risk of outages, slowdowns, or degraded performance in specific regions. A media organization using a CDN's services therefore needs a CDN performance monitoring mechanism that provides visibility into performance, both to mitigate the risk of service disruption and to stave off damage to brand value in today's hyper-connected social media era.
More and more media organizations, especially those catering to a diverse and widespread global audience base, are now adopting multi-CDN technologies to hedge the CDN performance risk and load balance across platforms based on geo-specific performance.
Whenever CDN load balancing solutions come up, the players that immediately come to mind are Cedexis Radar (now Citrix), Conviva Precision, Smart Switch from Youbora (NPAW) and MUX. All of these solutions position themselves as transparent platforms that enable customers to provide a high quality of experience (QoE) for end-users and rationalize their delivery spend. It is critical for media organizations to understand how these solutions differ in the methodologies they use to determine the best-performing CDN, so that the choice of delivery platform is accurate. Otherwise, a false-positive effect could undermine the goal of ensuring the best quality for the end-user.
A Sneak Peek into Cedexis Radar to Measure CDN Performance
Radar is a platform developed by Citrix/Cedexis that provides data regarding the availability and performance for any public infrastructure using either community measurements or private measurements.
The measurements use a set of test objects available across different public infrastructures including Cloud Service Providers and CDNs. Akamai configures these test objects and makes them available as one of the community measurements so any media organization that opts into the Radar community is able to see Akamai's performance data. This methodology enables members of the Radar community or media organizations to get visibility into the performance of platforms, regardless of whether they actively leverage the platform or not. It also provides visibility to organizations into regions where they may not have any users yet but could be considering as a potential market in the future.
Some other solutions use Key Performance Indicators (KPIs) computed from user traffic over recent time intervals to determine the best-performing platform. With these solutions, media organizations get the flexibility to enable constrained optimizations by setting a threshold for how much traffic share a platform should receive, for example in a given geography. This helps organizations manage delivery economics, as well as known capacity and performance limitations of the platform in a given geography.
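To make the idea of a constrained optimization concrete, here is a minimal sketch of how a traffic-share threshold could shape the split across platforms. It is an illustration only and not the algorithm used by any of the products named above: the `Platform` type, the `kpi_score` field, the `max_share_pct` cap and the greedy allocation are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    name: str
    kpi_score: float      # hypothetical KPI score for a geography; higher is better
    max_share_pct: int    # hypothetical cap on traffic share, in whole percent


def allocate_traffic(platforms):
    """Greedy split: the best-scoring platform is filled first, up to its cap.

    Returns a dict of platform name -> traffic share in percent. If the caps
    add up to less than 100, the shares will too.
    """
    remaining_pct = 100
    shares = {}
    for platform in sorted(platforms, key=lambda p: p.kpi_score, reverse=True):
        share = min(platform.max_share_pct, remaining_pct)
        shares[platform.name] = share
        remaining_pct -= share
    return shares


if __name__ == "__main__":
    candidates = [
        Platform("cdn_a", kpi_score=0.92, max_share_pct=60),   # capped at 60%
        Platform("cdn_b", kpi_score=0.88, max_share_pct=100),
        Platform("cdn_c", kpi_score=0.75, max_share_pct=100),
    ]
    print(allocate_traffic(candidates))
```

In this sketch the best-scoring platform does not automatically take all of the traffic: the cap pushes the remainder to the next-best platform, which is how a threshold can reflect delivery economics or a known capacity limit in a geography.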
Akamai's Positioning on Delivery Performance
Akamai strives to provide a high quality of experience to end users by ensuring that the performance of our platform is optimized for any given environment. Some of these optimizations, as explained further on, are evolving as platform features, while others can be implemented on a use-case basis, depending on a specific media organization's content and origin characteristics.
Key areas of focus for Akamai in 2020 will be giving media organizations visibility into platform performance through metrics that directly affect the success of their business, and helping them make timely decisions to manage, optimize, and monetize their content while keeping viewers interested and engaged. Akamai continues to invest in server-side analytics and real-time monitoring of platform health and performance.
Media Reports, which provide server-side visibility into ingest and delivery performance for operational excellence and improved end-user viewing experiences, continue to evolve with additional metrics, dimensions, and features, including KPIs that are closely aligned to Quality of Experience metrics.
A lot of enhancements are in progress around real-time monitoring for the core media delivery products targeting a significant decrease in data latency across all reporting metrics. Data Feeds for these core delivery products are just the beginning of Akamai's low-latency data initiative. Data Feeds will bring near real-time visibility on delivery performance, CDN health, latency, errors and events through raw data logs.
Focus on the Cedexis Methodology
For Media Organizations using Cedexis for CDN performance monitoring, it is important to understand what metrics should be used for the performance measurements and why these metrics are chosen to assess a given platform.
In Cedexis, Radar probes use a standard small test object for availability and RTT probes and a standard large test object to measure throughput. Radar can be configured to measure any of the following parameters to determine the performance of a CDN or other public infrastructure:
- Availability: Availability probes, also known as 'cold start probes', are intended to allow media organizations to warm the provider's caching services. Although a measurement value is associated with this probe, at a high level a media organization can use it to determine whether the provider is available or not.
If a platform is not configured to perform a cold start probe, the results of the RTT probe are used in place of a cold start result to provide the availability metrics.
- RTT/Latency/TTFB: This is the time for a single packet to be returned in response to an HTTP request. It is computed as responseStart - requestStart using the Resource Timing API.
- Throughput: Throughput is measured in kilobits per second for an entire request and response, based on a large test-object download. It is computed as file size (kilobytes) * 8 / (responseEnd - requestStart) using the Resource Timing API, with the elapsed time expressed in seconds.
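The two timing formulas above can be worked through with a small sketch. It assumes the timestamps are in milliseconds, as the Resource Timing API reports them, and converts to seconds for throughput. The function names and the sample timestamps are made up for illustration and are not part of Radar.

```python
def ttfb_ms(request_start_ms, response_start_ms):
    """Time to first byte: responseStart - requestStart, in milliseconds."""
    return response_start_ms - request_start_ms


def throughput_kbps(size_kilobytes, request_start_ms, response_end_ms):
    """Throughput in kilobits per second over the whole request and response.

    size (kilobytes) * 8 gives kilobits; the elapsed time is converted from
    milliseconds to seconds before dividing.
    """
    elapsed_s = (response_end_ms - request_start_ms) / 1000.0
    if elapsed_s <= 0:
        raise ValueError("responseEnd must be later than requestStart")
    return size_kilobytes * 8 / elapsed_s


if __name__ == "__main__":
    # Hypothetical timestamps for one probe of a 100 KB test object.
    request_start, response_start, response_end = 120.0, 165.0, 370.0
    print("TTFB (ms):", ttfb_ms(request_start, response_start))
    print("Throughput (kbps):", throughput_kbps(100, request_start, response_end))
```

Because the throughput window starts at requestStart, it includes the time to first byte. For a small object, the result therefore says as much about latency as about the bandwidth of the path.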
For a media organization, it is important to understand which metric should be used: throughput or TTFB. The specifications of the standard objects on the Akamai platform used for Radar community measurements are as follows:
- Size of the Akamai official public object used for Availability or RTT: 43 bytes
- Size of the Akamai official public object used for Throughput: 100 KB
The Radar community measurements can differ from private measurements, in which case a media organization should ideally choose objects and metrics which are a close reflection of their actual end-user traffic and use-case. The private measurements can also be a reflection of the end-to-end setup and not just a single component in the form of a CDN, Cloud service provider, origin services etc.
Akamai Optimizations and Roadmap
Akamai has been adopting optimization technologies that specifically target the methodology used by Cedexis. Over time, these optimizations tie back into our roadmap, with the focus on continually improving overall delivery performance on the platform. The sections below describe some of these optimization technologies and how they tie into our platform roadmap.
Akamai Optimizations to align to Cedexis Data
Akamai has been exploring the use of multiple TCP congestion control algorithms, and one of the algorithms in scope is Bottleneck Bandwidth and RTT (BBR). BBR, originally developed by Google, continuously estimates the bottleneck bandwidth for the connection based on how much data has been delivered over a recent window of time. It then uses this estimated bottleneck bandwidth, along with the recently observed minimum RTT, to calculate an optimal data delivery rate.
Over time, BBR adjusts a "gain" factor in order to sometimes probe for more available bandwidth and at other times to drain the queue of unnecessary packets. BBR will also occasionally drop the sending rate in order to empty the queue so that it can sample a new minimum RTT which estimates propagation delay.
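The estimation described above can be sketched in a few lines. This is a deliberately simplified model for illustration, not the Linux BBR implementation and not Akamai's configuration: the class name, the fixed sample-count window and the sample values are assumptions. It keeps the maximum recent delivery rate as the bottleneck bandwidth estimate and the minimum recent RTT as the propagation delay estimate, then derives a pacing rate from the gain factor.

```python
from collections import deque


class SimpleBbrEstimator:
    """Toy model of BBR's two estimates (hypothetical, for illustration)."""

    def __init__(self, window=10):
        # A fixed number of recent samples stands in for BBR's time windows.
        self.rate_samples = deque(maxlen=window)  # bytes per second
        self.rtt_samples = deque(maxlen=window)   # seconds

    def on_ack(self, delivered_bytes, interval_s, rtt_s):
        """Record one delivery-rate sample and one RTT sample."""
        self.rate_samples.append(delivered_bytes / interval_s)
        self.rtt_samples.append(rtt_s)

    def bottleneck_bw(self):
        """Estimated bottleneck bandwidth: the best recent delivery rate."""
        return max(self.rate_samples)

    def min_rtt(self):
        """Estimated propagation delay: the smallest recent RTT."""
        return min(self.rtt_samples)

    def pacing_rate(self, gain=1.0):
        """Sending rate; gain > 1 probes for bandwidth, gain < 1 drains the queue."""
        return gain * self.bottleneck_bw()

    def bdp_bytes(self):
        """Bandwidth-delay product: data in flight needed to fill the path."""
        return self.bottleneck_bw() * self.min_rtt()


if __name__ == "__main__":
    est = SimpleBbrEstimator()
    est.on_ack(delivered_bytes=250_000, interval_s=0.25, rtt_s=0.05)
    est.on_ack(delivered_bytes=375_000, interval_s=0.25, rtt_s=0.04)
    est.on_ack(delivered_bytes=300_000, interval_s=0.25, rtt_s=0.06)
    print("bottleneck bandwidth (B/s):", est.bottleneck_bw())
    print("min RTT (s):", est.min_rtt())
    print("probing pacing rate (B/s):", est.pacing_rate(gain=1.25))
    print("BDP (bytes):", est.bdp_bytes())
```

The point of the sketch is that the sending rate is driven by measured delivery rate and RTT rather than by packet loss, which is why BBR behaves differently from loss-based congestion control on paths with random loss.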
On the Akamai platform, we also implemented changes to the TCP Initial Congestion Window. When using the TCP protocol, the congestion window is a state variable that controls how much unacknowledged data can be in-flight between the sender and the receiver. As the name implies, the initial congestion window is the value used as the starting point.
As part of Akamai's initial effort, a combination of BBR and changes to the TCP Initial Congestion Window has had a positive impact on delivery performance on the platform, as reflected in the measured latencies and throughput for Cedexis Radar community measurements. The same gains can possibly be carried forward to private measurements by media organizations. Analysis also shows that returning the object within one RTT improves performance. This is part of Akamai's continuous effort to gather data points so that we are well placed to fine-tune these settings.
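A rough back-of-the-envelope model shows why the initial congestion window matters for a test object of this size. The sketch below counts the round trips needed to deliver an object under idealized TCP slow start. It assumes no packet loss, a window that doubles every round trip, a 1460-byte segment size, and a 100 KB object of 102,400 bytes, and it ignores connection and TLS setup. All of these are simplifying assumptions made for this example, not Akamai's settings.

```python
import math


def round_trips_to_deliver(object_bytes, initial_cwnd_segments, mss_bytes=1460):
    """Round trips to send an object under idealized, loss-free slow start."""
    if initial_cwnd_segments <= 0:
        raise ValueError("initial congestion window must be positive")
    segments_left = math.ceil(object_bytes / mss_bytes)
    cwnd = initial_cwnd_segments
    round_trips = 0
    while segments_left > 0:
        segments_left -= cwnd   # one window of segments is sent per round trip
        cwnd *= 2               # slow start doubles the window each round trip
        round_trips += 1
    return round_trips


if __name__ == "__main__":
    large_object = 100 * 1024   # assumed byte size of the 100 KB throughput object
    for icwnd in (10, 32, 80):
        trips = round_trips_to_deliver(large_object, icwnd)
        print(f"ICWND {icwnd:>2} segments -> {trips} round trip(s)")
    # The 43-byte availability/RTT object fits in a single segment.
    print("43-byte object:", round_trips_to_deliver(43, 10), "round trip(s)")
```

Under these assumptions, a larger initial window cuts the number of round trips for the throughput object, while the small RTT object fits in the first flight regardless. The model leaves out loss, which is exactly where a large initial window can backfire.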
Performance Optimization Roadmap
While the optimizations described above have provided positive delivery performance gains across different network types and client platforms, there are instances in which they have resulted in either a neutral outcome or a decline in performance. As an example, a large Initial Congestion Window (ICWND) for a mobile device on a lossy network might result in a degraded experience. Therefore, Akamai is working towards delivery performance improvements for media organizations by applying optimizations on a per-session basis, taking into account historical connection and statistical data, pre-defined policy, and priority, as applicable to Radar measurements and end-user traffic holistically.
The idea is to invest in developing a self-adapting system that predicts the best transport protocol option for each individual connection. This can be made possible by applying machine learning algorithms to the huge volume of server-side measurements Akamai collects while delivering traffic on its platform. Preliminary analysis shows that 17% - 35% of media traffic would benefit from a custom set of transport-level optimizations compared to the default platform behavior. Such custom optimizations can result in an additional 5% improvement in the median throughput on the platform.
Akamai has had a long and successful track record of manual optimizations deployed on the network for specific use-cases. The effort and investments moving forward are to make such analyses and changes more automated while maintaining and improving platform stability and performance.
Working with Akamai to Optimize Delivery Performance
Along with all the optimizations that Akamai has deployed on its platform and the various efforts and investments in the roadmap, below is a quick checklist of action items for a media organization to derive the best delivery performance out of their Akamai setup:
- Check what metrics are being used to evaluate performance and understand why they have been selected.
- Check if the use of Openmix applies any exceptions, such as "Handicap", that influence the traffic share in certain geographies.
- Contact your Akamai Account Representative to evaluate the possibility of applying BBR and Initial Congestion Window (ICWND) changes in the short-term.
- Set up a session with Akamai to understand how to optimize single-object performance and to learn about Akamai's roadmap for protocol optimization to address this.
Ensuring a high-quality video viewing experience for end-users is a critical success factor for media organizations. The first step towards achieving this is to ensure that the delivery partner is optimally tuned to provide that experience. The optimizations detailed above, along with support from Akamai in implementing them, will go a long way in helping media organizations achieve best-in-class delivery performance.
Source for the SVoD market figures: Statista, SVoD Market Estimate, July 2019