

Adapting without assumptions

There has been a lot of talk recently about the Network Info API.

Paul Kinlan published an article about using Service Worker along with the Network Info API to send network information up to the server and let the server adapt its responses to these network info bits. There is also an intent to implement the downlinkMax attribute in the Blink rendering engine.
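
To make that approach concrete, here is a minimal sketch of what such a Service Worker might look like. It is my own illustration, not Paul's exact code: the "Net-Info" header name is made up, and I'm assuming the Network Info API is available in the worker scope.

```typescript
// sw.ts: a rough sketch of forwarding network info to the server.
// The "Net-Info" header name is invented; a real implementation would agree
// on a header with the server (or use Client-Hints instead).
self.addEventListener('fetch', (event: any) => {
  const connection = (self.navigator as any).connection;
  if (!connection) {
    return; // No network info available; let the request go through untouched.
  }

  // Copy the request and attach a coarse description of the connection.
  // (Glossing over edge cases such as navigation requests.)
  const headers = new Headers(event.request.headers);
  headers.set('Net-Info', `type=${connection.type}; downlinkMax=${connection.downlinkMax}`);

  event.respondWith(fetch(new Request(event.request, { headers })));
});
```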

Since I have Opinions™ on the matter, and Twitter and mailing lists aren't always the ideal medium, I wrote them down here.

This is a lengthy post, so if you don't have time to read through it all, its main claims are laid out in the section headings below.

# Current NetInfo API doesn't expose what devs need

The current API is built around the following attributes:

- type, which exposes the kind of connection the device is currently using (e.g. cellular, wifi, ethernet).
- downlinkMax, which exposes the theoretical maximum downlink speed, in Mbps, of the first network hop.
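
For reference, consuming those attributes from script looks roughly like the snippet below. I'm hedging here, since the exact surface has shifted between spec revisions and browsers, and navigator.connection may simply be absent.

```typescript
// Reading the current NetInfo API surface, defensively, since
// navigator.connection may be undefined and downlinkMax may be +Infinity.
const connection = (navigator as any).connection;

if (connection) {
  console.log(`type: ${connection.type}`);               // e.g. "cellular", "wifi"
  console.log(`downlinkMax: ${connection.downlinkMax}`); // theoretical Mbps of the first hop

  // The values can change as the user moves between networks.
  connection.addEventListener('change', () => {
    console.log('connection changed:', connection.type, connection.downlinkMax);
  });
}
```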

The problem with the above is that it rarely provides Web developers with useful and actionable data without them having to make huge (and often false) assumptions about what that info means for the things they actually care about (and which are, for most cases, not the things this API exposes).

If you take a closer look at the downlinkMax table you can see that the info you get from it is dubious at best. If your user is on an Edge network, you would be led to think that their available download speed is 384 kbps. While they most probably don't have that much bandwidth at their disposal, you can use that in order to figure out that they are on a not-so-great network, and change the resources you serve them accordingly.

But, what if they are WiFi-tethering over their 2G phone? In that case, you'd be led to think that the connection type is "WiFi" and the speed is capped at 11 Mbps. Not too shabby.

Except that the user would be experiencing even worse network conditions in the latter case than in the former one, without the developer knowing anything about it.

There are many other cases where looking at downlinkMax will lead you to the wrong conclusions. For example, take the case where your users are on an extremely lossy WiFi network (AKA "hotel/conference WiFi") where their effective bandwidth is very low. Or the case where they are on an HSDPA network which in theory can reach 14.3 Mbps, but in practice they are sharing a cell with thousands of other users, all trying to download cat-based entertainment while they wait for the bus/train/plane. That means the cell's bandwidth is thinly divided between all those users, and the cell's backhaul network (which is fetching those cats from the landline internet) is saturated as well.

In fact, the only case where downlinkMax is useful is in the "user is on an Edge network" case. For everything else, you're out of luck: bad or tethered WiFi, 3G with poor coverage, poor backhaul, etc. will all present themselves as pretty good networks. That means that we could effectively replace downlinkMax with an isUserOnEdge boolean.
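
If you were to write that down, the only robust signal the attribute gives you collapses into something like this (my own tongue-in-cheek sketch; 0.384 Mbps is the table value for Edge):

```typescript
// What downlinkMax-based adaptation effectively boils down to:
// anything above the Edge ceiling tells us very little about the
// network the user is actually experiencing.
function isUserOnEdge(connection: { type?: string; downlinkMax?: number }): boolean {
  return connection.type === 'cellular' &&
         typeof connection.downlinkMax === 'number' &&
         connection.downlinkMax <= 0.384; // 384 kbps, the downlinkMax value for Edge
}
```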

Even if we look at possible downlinkMax improvements using a bandwidth estimate of some sort, according to the current spec the value would still describe only the first network hop, and would say nothing about round-trip times or about congestion further down the path.

All of which leads me to believe that downlinkMax is not providing the information that developers actually need, and makes me worry that the info will be abused by developers (for lack of better network information) if we expose it.

# So, what do developers need?

The general use-case that developers are trying to tackle here is that of content adaptation to the user's conditions. I'd claim that the main use-case is to serve rich content to devices that can handle it right now, while providing a decent and fast experience to devices that, due to some constraint, can't handle the rich content.

Some of the specific use-cases I heard people mention are:

Now, if we take these use-cases into consideration, what are the constraints that we need to expose to developers that would enable them to successfully tackle these use cases?

I think the list would include:

- the network conditions the user is experiencing
- the user's preferences regarding data usage
- the capabilities of the user's device
- the device's battery state
- the monetary cost of traffic

Let's dive into each one of those.

# Network conditions

The current NetInfo API talks about exposing network information, basically mimicking the Android APIs that can give an app developer the same info. So, as we've seen, this info gives the developer the rough network type and the theoretical bandwidth limits of the network the user is on.

But as a developer, I don't much care about which first-hop radio technology is used, nor what its theoretical limit is. What I want to know is "Is the end-to-end network fast enough to deliver all those rich (read: heavy) resources in time for them to provide a pleasant user experience rather than a burdensome one?"

So, we don't need to expose information about the network, as much as we need to expose the product of the overall end-to-end network conditions.

What developers need to know is the network conditions that the user is experiencing, and in most cases, what is their effective bandwidth.

While that's hard to deliver (and I once wrote why measuring bandwidth is hard), the good folks of the Google Chrome net stack are working to prove that hard != impossible. So, it looks like having an in-the-browser end-to-end network estimation is no longer a pipe dream.

Now, once we've estimated the network conditions, should we expose the raw values?

I believe we shouldn't, at least not as a high-level "your bandwidth is X" single number.

The raw network information of incoming effective bandwidth and round-trip-times can be overwhelming, and the potential for misuse is too high. It's also very likely to change rapidly, causing non-deterministic code behavior if exposed through script, and huge variance if exposed through Client-Hints.

What I believe we need to expose is a set of actionable, discrete values, and browsers would "translate" the stream of raw network data into one of those values. That would also enable browsers to start with rough bandwidth estimations, and iterate on them, making sure they're more accurate over time.

As far as the values themselves, I propose something like unusable, bad, decent, good and excellent, because naming is hard.
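
To illustrate what consuming such values might look like, here is a purely hypothetical sketch. The "conditions" attribute and its location on navigator.connection are invented for the sake of the example.

```typescript
// Hypothetical sketch: no browser exposes a discrete "conditions" value today.
// The browser keeps the raw bandwidth/RTT estimates to itself and exposes
// only a coarse verdict that it can refine (and redefine) over time.
type NetworkConditions = 'unusable' | 'bad' | 'decent' | 'good' | 'excellent';

function pickImageVariant(conditions: NetworkConditions | undefined): string {
  switch (conditions) {
    case 'unusable':
    case 'bad':
      return 'hero-low.jpg';
    case 'decent':
      return 'hero-mid.jpg';
    default:
      return 'hero-high.jpg'; // good, excellent, or unknown
  }
}

// "conditions" is an invented attribute name, used here only for illustration.
const conditions = (navigator as any).connection?.conditions as NetworkConditions | undefined;
document.querySelector('img.hero')?.setAttribute('src', pickImageVariant(conditions));
```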

Having discrete and imprecise values also has the advantage of enabling browsers to evolve what these values mean over time, since today's "decent" may very well be tomorrow's "bad". We already have a Web platform precedent for similar discrete values as part of the update-frequency Media Query.

As a bonus, imprecise values would significantly decrease the privacy concerns that exposing the raw bandwidth would raise.

# User preferences

We already have a proposal for this one. It's called the Save-Data header that is part of the Client-Hints specification. It might be a good idea to also expose that to JavaScript.
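
On the server side, honoring that hint could be as simple as the sketch below. The Save-Data: on request header is the real, spec-defined bit; the Node server and file names are just my illustration.

```typescript
// A rough sketch of honoring the Save-Data client hint on the server.
import { createServer } from 'http';

createServer((req, res) => {
  const saveData = String(req.headers['save-data'] ?? '').toLowerCase() === 'on';

  // Let caches know the response varies on the hint.
  res.setHeader('Vary', 'Save-Data');
  res.setHeader('Content-Type', 'text/html');
  res.end(saveData
    ? '<img src="hero-lite.jpg">'   // lighter variant for data-conscious users
    : '<img src="hero-full.jpg">');
}).listen(8080);
```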

The main question that remains here is how we get the user's preferences. As far as I understand, the idea in Chrome is to take advantage of a user's opt-in to their compression proxy as an indication that they are interested in data savings in general.

That's probably a good start, but we can evolve that to be so much smarter over time, depending on many other factors that the browser has about the user. (e.g. geography, data saving settings at the OS level, etc.)

# Device capabilities

The current state of the art at detecting old and busted devices, and avoiding sending them resources that they would choke on (due to constrained CPU and memory), is dubbed "cutting the mustard". While it's a good effort to make do with what we have today, it is (again) making a bunch of potentially false assumptions.

The "cutting the mustard" method means detecting the presence of modern APIs and concluding from their absence that the device in question is old and busted. While their absence can indicate that, their presence doesn't mean that the device is full-powered high-end smartphone. There are many low-end devices out there today with shiny new FirefoxOS installations. Any Android 4 phone may have an always-up-to-date Chrome, regardless of its memory and CPU (which can be extremely low).

Bottom line is: we cannot assume the state of the user's hardware from the state of their software.

On the other hand, exposing all the different metrics that determine the device's capacity is tricky. Do we expose raw CPU cycles? Raw memory? What should happen when CPU or memory are busy with a different app?

The solution to that is not very different from the one for network conditions. We can expose a set of discrete and actionable values, that can evolve over time.

The browsers can estimate the state of current hardware and current available processing power and memory, and "translate" that into a "rank" which would give developers an idea of what they are dealing with, and allow them to adapt their sites accordingly.

Lacking better names, the values could be minimal, low, mid and high.
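
Again, a purely hypothetical sketch just to make the idea concrete; the "deviceRank" attribute below is invented.

```typescript
// Hypothetical sketch: no browser exposes a device "rank" like this today.
// The browser would fold CPU, memory and current load into one coarse value.
type DeviceRank = 'minimal' | 'low' | 'mid' | 'high';

// "deviceRank" is an invented attribute, used only for illustration.
const rank = ((navigator as any).deviceRank ?? 'mid') as DeviceRank;

// Only ship the heavy, animation-laden experience to devices that can take it;
// "fancy-animations.js" is a stand-in for whatever the rich experience needs.
if (rank === 'high') {
  import('./fancy-animations.js').catch(() => { /* degrade silently */ });
}
```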

# Battery state

That's easy, we already have that! The Battery Status API is a Candidate Recommendation specification, and is fully supported in Chrome/Opera and partially supported in Firefox. All that's left is to hope that support in other modern browsers will arrive soon.
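
For completeness, using it looks roughly like this (feature-detected, since support varies):

```typescript
// The Battery Status API, feature-detected since support varies by browser.
if ('getBattery' in navigator) {
  (navigator as any).getBattery().then((battery: any) => {
    // level is a number between 0 and 1; charging is a boolean.
    const lowBattery = !battery.charging && battery.level < 0.2;
    if (lowBattery) {
      // e.g. skip autoplaying video, defer non-critical background work, etc.
      console.log('Battery is low: serving the lighter experience');
    }

    battery.addEventListener('levelchange', () => {
      console.log(`Battery level is now ${Math.round(battery.level * 100)}%`);
    });
  });
}
```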

# Monetary cost

That part is tricky since browsers don't actually have info regarding the data costs, and in many cases (such as tethered WiFi) our assumptions about the cost implications of network type are wrong.

I think that the only way out of this puzzle is asking the user. Browsers need to expose an interface asking the user for their preference regarding cost (e.g. enable them to mark certain WiFi networks as expensive, mark roaming as expensive, etc.).

Another option is to expose a way for developers to ask the user's permission to perform large downloads (e.g. message synchronization, video download, etc.), and the browser can remember that preference for the current network, across multiple sites.
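
There is no such API today, but to make the idea concrete, the developer-facing side might look vaguely like the sketch below. Everything in it, the method name included, is invented.

```typescript
// Entirely hypothetical: no browser exposes a "large download" prompt today.
// The point is that the browser, not the site, owns the cost question and can
// remember the answer per network, across sites.
async function maybeSyncVideos(urls: string[]): Promise<void> {
  const nav = navigator as any;

  // Invented API: ask the user (or a remembered per-network preference)
  // whether a roughly 200 MB download is acceptable on the current connection.
  const allowed = nav.requestExpensiveDownload
    ? await nav.requestExpensiveDownload({ estimatedBytes: 200 * 1024 * 1024 })
    : true; // fall back to today's behavior, where no such API exists

  if (!allowed) {
    console.log('Deferring sync until the user is on a network they consider cheap');
    return;
  }

  await Promise.all(urls.map((url) => fetch(url).then((response) => response.blob())));
}
```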

What we definitely shouldn't do is tell developers that they should deduce cost from the network type being WiFi. Even if this is a pattern often used in the native apps world, it is blatantly wrong: it ignores tethering, as well as the fact that many cellular plans have unlimited data (which brings back memories of me trying to sync music albums over unlimited 4G before a 12-hour flight, only to have the native app in question tell me "we'll sync as soon as you're on WiFi". Argh!).

# Why not progressive enhancement?

Why do we need to expose all that info at all? Why can't we just build our Web sites to progressively enhance, so that content downloads in stages and users get the basic experience before all the fancy stuff arrives? That way, if their network conditions are bad, they still get the basic parts.

Well, progressive enhancement is great for many things, but cannot support some cases of content adaptation without adding extra delays.

# What happens when multiple paths are used?

As pointed out by Ryan Sleevi of Chrome networking fame, multi-path would break any attempts to expose either the available or theoretical network bandwidth. That is absolutely true, and yet another reason why we don't want to expose the raw bandwidth, but a discrete and abstract value instead. The browser can then expose the overall effective bandwidth it sees (aggregated from all network connections), even in a multipath world.

# How do we prevent it being the new User Agent string?

Another concern that was raised is that exposing network information would result in a worse user experience (due to developers abusing the exposed data), and would therefore result in browsers lying about the actual conditions the user is in.

In my view, the doom of the User-Agent string as an API was that it required developers to make assumptions about what that string means for other things that actually matter to them (e.g. feature support).

While I agree with those concerns regarding downlinkMax and type, I believe that as long as we keep the assumptions that developers have to make to a minimum, there's no reason developers would abuse the APIs and harm their users' experience while doing so. That also means that there would be no reason for browsers to eventually lie and provide false API values.

# What about the extensible Web?

Doesn't the concept of exposing a high-level value rather than the raw data stand at odds with the Extensible Web manifesto?

I don't think it does, as long as we also strive to expose the raw data eventually. But exposing the full breadth of network info or device capabilities info is not trivial. It would most probably require an API based on the Performance Timeline, and I suspect it would have some privacy gotchas, since exposing the user's detailed network, CPU and memory usage patterns is bound to have interesting privacy implications.

So, we should totally try to expose a low-level API, but I don't think we should hold off on exposing the high-level info (which I suspect would satisfy most use-cases) until we have figured out how to do that safely.

# To sum it up

I strongly believe that exposing network conditions as well as other factors about the user's environment would provide a solid foundation for developers to better adapt the sites they serve to the user's conditions. We need to be careful about what we expose though, and make sure that it will not result in assumptions, abuse and lies.

Thanks to Paul Kinlan, Tim Kadlec and Jake Archibald for reviewing and commenting on an early draft of this post.
