Quite often people ask me: Why do we need a player? Can’t we just embed an MP4 video using an HTML5 video tag?

Apart from logging statistics, playing commercials and rendering interactive elements over the video (topics beyond the scope of this post), it is all about multiplicity. When publishing video, we are confronted with an array of media formats and delivery methods, and a plethora of browsers and devices.

A player that copes with all those formats, delivery methods, browsers and devices has to decide what to do in which situation. Almost like a human being… Only, in this case we (or you!) want consistent results. That’s what this post is about.

Three is a crowd

Before the advent of the unified player, there were three players: one built in Flash, one built in HTML5/JavaScript, and one that we called the fallback player, which did just what we started this post with: embed an MP4 using an HTML5 video tag. Some script logic decided which of the players to serve, based on the requesting user agent (browser and/or device), and that was it.

This had two obvious drawbacks: new player features had to be implemented at least twice, and once served, the player choice was final.

Use your head!

A better way is to implement the decision process in the player. Here’s what it boils down to:

The first stage is asset selection. The list of available assets is pruned to contain only the assets that the browser/device running the player is capable of playing. The resulting shortlist is presented to the user. By default, the highest-quality asset is selected, constrained by available network bandwidth and screen resolution.
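To make this first stage concrete, here is a minimal sketch in TypeScript. It is not our actual player code: the Asset and Environment shapes, the function names, and the fallback to the lowest-quality asset when nothing fits are all assumptions made for illustration. In a browser, the canPlay probe could wrap HTMLMediaElement.canPlayType, which returns an empty string for types the browser cannot play.

```typescript
// Hypothetical asset description; the field names are assumptions.
interface Asset {
  url: string;
  mimeType: string;    // e.g. "video/mp4"
  bitrateKbps: number; // used here as the measure of quality
  height: number;      // vertical resolution in pixels
}

// Hypothetical view of the browser/device running the player.
interface Environment {
  canPlay: (mimeType: string) => boolean; // capability probe
  bandwidthKbps: number;                  // estimated network bandwidth
  screenHeight: number;                   // screen resolution (vertical)
}

// Prune the list to playable assets, highest quality first.
// filter() returns a new array, so the input list is not modified.
function shortlistAssets(assets: Asset[], env: Environment): Asset[] {
  return assets
    .filter((asset) => env.canPlay(asset.mimeType))
    .sort((a, b) => b.bitrateKbps - a.bitrateKbps);
}

// Default choice: the highest-quality asset that fits both the
// bandwidth and the screen. If none fits, fall back to the
// lowest-quality playable asset (an assumption of this sketch).
function pickDefaultAsset(shortlist: Asset[], env: Environment): Asset | undefined {
  const fitting = shortlist.filter(
    (asset) =>
      asset.bitrateKbps <= env.bandwidthKbps && asset.height <= env.screenHeight
  );
  return fitting[0] ?? shortlist[shortlist.length - 1];
}
```

The shortlist is what the quality menu would show; pickDefaultAsset only decides which entry is preselected.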

The second stage is selection of the display technique. We coined the term “head selection” for this, analogous to multi-head graphics adapters. When more than one head can play the selected asset, head selection prefers HTML5 over Flash. Mind you: when the user selects a different video quality, the head selection is redone. This may result in on-the-fly head switching, something that medical science is only dreaming about!
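Head selection can be sketched along the same lines. Again, this is an illustration under assumptions rather than the real implementation: the Head interface, the supports probe and the selectHead function are hypothetical names, and only the preference for HTML5 over Flash comes from the description above.

```typescript
type HeadName = "html5" | "flash";

// Hypothetical description of a display technique ("head").
interface Head {
  name: HeadName;
  supports: (mimeType: string) => boolean;
}

// Preference order: HTML5 is tried before Flash.
const HEAD_PREFERENCE: HeadName[] = ["html5", "flash"];

// Return the most preferred head that can play the given media type,
// or undefined if no available head can play it.
function selectHead(mimeType: string, heads: Head[]): Head | undefined {
  for (const name of HEAD_PREFERENCE) {
    const head = heads.find((h) => h.name === name && h.supports(mimeType));
    if (head) {
      return head;
    }
  }
  return undefined;
}
```

Because selectHead takes the media type of the currently selected asset, calling it again after a quality change is all it takes to redo the selection. If the answer differs from the head in use, the player switches heads.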

The third stage is the error recovery stage. Ideally it is never reached. In the real world, however, assets that appeared to be playable may fail to play in the head selected for them. When that happens, the remaining asset/head combinations are tried.
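The recovery stage boils down to a loop over the combinations that are left. The sketch below is deliberately simplified and synchronous; a real player would start playback asynchronously and react to media error events. The Combination shape and the tryPlay callback are hypothetical, and the order of the list is assumed to already reflect the preferences of the first two stages.

```typescript
// Hypothetical pairing of an asset with the head that should play it.
interface Combination {
  assetUrl: string;
  head: "html5" | "flash";
}

// Try each combination in order until one plays.
// tryPlay is a hypothetical callback: it returns true when playback
// started, and returns false or throws when it did not.
function playWithRecovery(
  combinations: Combination[],
  tryPlay: (combination: Combination) => boolean
): Combination | undefined {
  for (const combination of combinations) {
    try {
      if (tryPlay(combination)) {
        return combination; // first combination that actually plays
      }
    } catch (_err) {
      // A thrown error counts as a failed attempt; try the next one.
    }
  }
  return undefined; // every combination failed
}
```

Returning undefined leaves it to the caller to decide what to show when nothing plays at all.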

And the rest will follow

Once we’ve got this in place, not only the main program, but also timeline media (picture-in-picture), commercials and derived players benefit. The goal is always to provide the best user experience, by supporting as many screens as possible with optimal video quality, whether that takes HTTP Live Streaming (HLS), Flash, or proprietary display techniques for inline video on iPhone.

Hopefully this clarifies why I think you need a player; it is indispensable in taming the ever-evolving internet.