Understanding Customer Perceived Latency

Measuring web performance through the end user’s lens

5 min read · Oct 15, 2019

When we talk about a web app’s performance, many metrics come into the picture: load time, time to first byte, first contentful paint, and so on.

Some of these metrics describe network latency: the time a web page spends waiting for the connection to be established & for files/scripts to be served over the network. Others describe what happens after that, for example time to first paint, first contentful paint, time to interactive & first input delay.

Here are the standard timings reported by the browser window’s performance object:

https://www.w3.org/TR/navigation-timing/timing-overview.png

All of these happen before the page’s onLoad handler fires, but they do not capture the latency that end users actually perceive.

Most of the time, the latency a real user perceives after hitting a URL in the browser will be higher than the time reported for loadEventEnd.
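To make the gap concrete, here is a minimal sketch of how the phases up to loadEventEnd can be derived from navigation timing values. The interface, the function name and the sample numbers are assumptions made for illustration; only the field names come from the Navigation Timing API.

```typescript
// Subset of the Navigation Timing fields, in milliseconds since navigation start.
interface NavTimings {
  fetchStart: number;
  domainLookupStart: number;
  domainLookupEnd: number;
  connectStart: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number;
  loadEventEnd: number;
}

// Durations of each phase, in milliseconds.
interface Breakdown {
  dns: number;        // DNS lookup
  connect: number;    // TCP (and TLS) connection setup
  waiting: number;    // request sent until first response byte
  download: number;   // first response byte until last
  processing: number; // parsing, scripts, rendering until load finishes
  total: number;      // everything the browser reports up to loadEventEnd
}

function breakDown(t: NavTimings): Breakdown {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,
    connect: t.connectEnd - t.connectStart,
    waiting: t.responseStart - t.requestStart,
    download: t.responseEnd - t.responseStart,
    processing: t.loadEventEnd - t.responseEnd,
    total: t.loadEventEnd - t.fetchStart,
  };
}

// Hypothetical sample values, not measurements from a real page.
const sample: NavTimings = {
  fetchStart: 0,
  domainLookupStart: 5,
  domainLookupEnd: 25,
  connectStart: 25,
  connectEnd: 75,
  requestStart: 80,
  responseStart: 280,
  responseEnd: 330,
  loadEventEnd: 1400,
};

console.log(breakDown(sample));
```

In a browser, the same fields are available on the entry returned by performance.getEntriesByType("navigation")[0]. Note that nothing in this breakdown says whether the page was usable at loadEventEnd, which is exactly the gap the rest of this post is about.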

For an end user, the page is useful & completely loaded only when it is usable.

Time to interactive is the point at which a web page’s content is fully loaded, the browser has rendered the page, and there is meaningful & usable content on the page that the user can interact with.

What prevents a web page from being ready, fully loaded & interactive after the browser has downloaded the content?

A busy CPU thread at the time of page load is what causes a page to be fully or partially rendered yet still not usable or interactive. Let us look at some of the reasons for that:

1. There is a lot of JS computation happening on page load.
Take a look at these recommendations to avoid it: https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/javascript-startup-optimization/

2. The web app is image-heavy. In such a case you will see images being rendered with a curtain-drop effect while the page is not usable.

3. The CPU is busy doing something else, so it cannot keep up with the event loop & misses cycles needed to complete the page rendering. (This is mostly environmental & something we, as web app developers, have little control over.) This is what makes it difficult to capture time to interactive as a metric for real user monitoring & to base business decisions on it. Real user devices may vary drastically in terms of RAM/CPU/OS configuration, browser, platform & operating conditions like %CPU busy, etc.

Time to first interactive vs Time to consistently interactive

Ever since time to interactive was coined as a web performance metric, there has been debate about which variant we should care about: time to first interactive (the first time the CPU thread is available to respond to the user’s interaction) or time to consistently interactive (the time after page load from which the CPU is consistently idle & available to respond to the user’s interactions without any delay).

The major difference between the two also lies in the algorithm used to calculate them. We will look at the algorithm at a high level in a short while.

We have these two metrics defined separately because, after page load, there can be deferred/async calls that keep the CPU busy for a while. Consider a page that loads quickly & then makes async calls to fetch data to be rendered in a widget or a sidebar. In such cases the CPU will still be busy performing these operations & may not respond to user actions smoothly.

Vendors who provide time to interactive have chosen one of the two as their standard.

How to capture time to interactive?

Different vendors, like Google, Akamai, etc., have captured time to interactive in different ways & for different purposes. The time to interactive number that we get from Lighthouse in Chrome DevTools is measured under a constant platform & environment.

There are also vendors that provide time to interactive for Real User Monitoring (RUM), like Akamai’s Boomerang.

At a high level, this is what the algorithm looks like:

https://github.com/WICG/time-to-interactive#definition

Please note that since this algorithm waits for a 5-second CPU idle period before reporting time to interactive, the result is closer to time to consistently interactive.

If we instead report the first time at which we see the network & CPU idle after the reference point, we get time to first interactive.
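The difference between the two numbers can be sketched with a single quiet-window search. This is a simplified illustration, not the WICG reference implementation: it assumes the caller has already merged long tasks and network-busy periods into one list of busy intervals, and the names interactiveAt, Interval and observedUntil, the 500ms window and the sample timings are all hypothetical.

```typescript
// A period (in ms since navigation start) during which the page was busy.
interface Interval {
  start: number;
  end: number;
}

// Finds the first point at or after `reference` that is followed by at least
// `quietMs` of idle time. Returns null if no such window was observed yet.
function interactiveAt(
  busy: Interval[],
  reference: number,     // e.g. first contentful paint
  quietMs: number,       // required length of the quiet window
  observedUntil: number, // how far the recording reaches
): number | null {
  const sorted = busy
    .filter((b) => b.end > reference)
    .sort((a, b) => a.start - b.start);

  let candidate = reference;
  for (const b of sorted) {
    // A long enough gap before this busy interval: quiet window found.
    if (b.start - candidate >= quietMs) break;
    // Otherwise the candidate moves to the end of this busy interval.
    candidate = Math.max(candidate, b.end);
  }
  return observedUntil - candidate >= quietMs ? candidate : null;
}

// Hypothetical page: three bursts of work during load, one deferred burst later.
const busyPeriods: Interval[] = [
  { start: 1200, end: 1400 },
  { start: 1500, end: 1700 },
  { start: 3000, end: 3300 },
  { start: 9000, end: 9100 },
];

// A short window (500ms, an assumed value) behaves like "first interactive".
console.log(interactiveAt(busyPeriods, 1000, 500, 15000)); // 1700
// A 5 second window behaves like "consistently interactive".
console.log(interactiveAt(busyPeriods, 1000, 5000, 15000)); // 3300
```

The same recording yields 1700ms with the short window and 3300ms with the 5 second window, because the work at 3000ms interrupts the first idle period before it can count as consistently quiet.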

Many vendors prefer to report time to consistently interactive because of its stability.
https://docs.google.com/document/d/1GGiI9-7KeY3TPqS3YT271upUVimo-XiL5mwWorDUD4c/edit

Tools to capture time to interactive

Google’s Lighthouse (Lab data) https://developers.google.com/web/tools/lighthouse/
Akamai’s Boomerang (RUM) https://developer.akamai.com/tools/boomerang/

Custom Implementation

For my use case, I tried a custom implementation of the algorithm.

Browser support varies for discovering active network requests, for the behavior of timers & for the Long Tasks API. To have RUM in place that supports the majority of browsers, we fall back to alternatives where an API is not supported.
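As one example of such a fallback, CPU busy % can be approximated from how late a repeating timer fires when long tasks cannot be observed directly. The sketch below is an assumption about how this could be done, not the implementation used here; busyPercent and the tick values are hypothetical.

```typescript
// Estimates how busy the main thread was from the timestamps (in ms) at which
// a repeating timer actually fired. If the timer is scheduled every
// `intervalMs`, any extra delay between two ticks is counted as busy time.
function busyPercent(ticks: number[], intervalMs: number): number {
  if (ticks.length < 2) return 0;

  let lateMs = 0;
  for (let i = 1; i < ticks.length; i++) {
    const gap = ticks[i] - ticks[i - 1];
    lateMs += Math.max(0, gap - intervalMs);
  }

  const elapsed = ticks[ticks.length - 1] - ticks[0];
  return elapsed > 0 ? (100 * lateMs) / elapsed : 0;
}

// Hypothetical ticks of a 50ms timer: the third tick arrived 50ms late.
console.log(busyPercent([0, 50, 100, 200], 50)); // 25
```

In a real page the ticks would be collected by recording performance.now() inside a setInterval callback. Browsers may throttle timers in background tabs, so such an estimate is only meaningful while the page is visible.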

1. We look for 4 consecutive 50ms intervals in which:
1.1 No resource is being downloaded.
1.2 CPU busy % is under 25.
1.3 There are no disabled submit buttons on the page.
1.4 The user is not already interacting with the page.
2. If we see a window of 4 cycles where all of the above conditions are met, we declare TTI as the start of the window.
3. Else we move the window ahead.
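The steps above can be sketched as a sliding window over periodic samples. This is a minimal illustration under assumptions: the Sample shape, the field names and the findTti function are hypothetical, and how each sample is collected in the browser is left out.

```typescript
// One observation of the page, taken every 50ms.
interface Sample {
  time: number;                  // ms since navigation start
  resourcesInFlight: number;     // resources currently being downloaded
  cpuBusyPercent: number;        // estimated CPU busy % for this interval
  disabledSubmitButtons: number; // disabled submit buttons on the page
  userInteracting: boolean;      // true if the user is interacting already
}

// Conditions 1.1 to 1.4 for a single interval.
function isQuiet(s: Sample): boolean {
  return (
    s.resourcesInFlight === 0 &&
    s.cpuBusyPercent < 25 &&
    s.disabledSubmitButtons === 0 &&
    !s.userInteracting
  );
}

// Returns the start time of the first run of `windowSize` consecutive quiet
// samples, or null if no such run exists yet.
function findTti(samples: Sample[], windowSize: number = 4): number | null {
  let run = 0;
  let runStart: number | null = null;

  for (const s of samples) {
    if (isQuiet(s)) {
      if (run === 0) runStart = s.time;
      run += 1;
      if (run >= windowSize) return runStart;
    } else {
      // Any busy sample resets the window, which moves it ahead.
      run = 0;
      runStart = null;
    }
  }
  return null;
}

// Hypothetical samples: busy until 100ms, a CPU spike at 200ms, quiet from 250ms.
const quiet = (time: number): Sample => ({
  time,
  resourcesInFlight: 0,
  cpuBusyPercent: 5,
  disabledSubmitButtons: 0,
  userInteracting: false,
});
const samples: Sample[] = [
  { ...quiet(0), resourcesInFlight: 3 },
  { ...quiet(50), resourcesInFlight: 1 },
  quiet(100),
  quiet(150),
  { ...quiet(200), cpuBusyPercent: 60 },
  quiet(250),
  quiet(300),
  quiet(350),
  quiet(400),
];

console.log(findTti(samples)); // 250
```

The two quiet samples at 100ms and 150ms do not count because the CPU spike at 200ms breaks the run, so TTI is reported as 250ms, the start of the first full window of 4 quiet intervals.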

Learnings

“TTI can be difficult to track in the wild.” As a developer, time to interactive is not something I would be very much interested in, because platform/environmental factors can increase the numbers. At the same time, it is good to see how end users perceive the web page by tracking such metrics through real user monitoring. For a business decision-maker, customer perceived latency (CPL) makes it clear where to invest in terms of performance improvements.