iPhone 3.0 live HTTP streaming
2009-09-26
For the last few months, news about streaming to iPhone 3.0 has been making the rounds. I've been holding off commenting on it for a while, since I hadn't actually looked into it much and didn't want to base anything on hearsay. And I don't even have - or want - an iPhone!
Last week I took some time to read the IETF draft and the Apple developer introduction.
On my next plane ride I quickly hacked together a simple segmenter in Python, and when I tried it at work the next day, it sort of worked for about a minute.
And yesterday evening, during Nerd Night, I changed my original plans (since Wiebe cancelled, I wasn't going to work on the Spykee robot yet) and decided to get back to hacking on iPhone streaming.
I tweaked mpegtsmux to do something useful with GStreamer's GST_BUFFER_FLAG_DELTA_UNIT, and taught the segmenter to always start a new segment on a non-delta unit. I also switched to a black videotestsrc with a timeoverlay (the normal test pattern seems to trigger a weird bug in our H264 encoder; I'll need some help from our Fluendo codec gurus for that one). With that in place, I started a simple stream last night.
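In rough strokes, it looked something like this sketch (not the actual code: x264enc stands in for our Fluendo encoder, and it assumes the mpegtsmux tweak above, so that the muxer's output buffers carry GST_BUFFER_FLAG_DELTA_UNIT correctly):

```python
#!/usr/bin/env python
# Sketch of a live TS segmenter: cut a new segment only on buffers
# that are NOT delta units, so every segment starts with a keyframe.
import pygst
pygst.require('0.10')
import gst
import gobject

gobject.threads_init()

TARGET = 10 * gst.SECOND  # segment interval recommended by the draft

pipeline = gst.parse_launch(
    'videotestsrc is-live=true pattern=black ! timeoverlay ! '
    'x264enc ! mpegtsmux ! '
    'appsink name=sink emit-signals=true sync=false')

state = {'fd': None, 'index': 0, 'start': 0}

def start_segment():
    if state['fd']:
        state['fd'].close()
    state['fd'] = open('segment-%05d.ts' % state['index'], 'wb')
    state['index'] += 1

def on_new_buffer(sink):
    buf = sink.emit('pull-buffer')
    # only cut on a non-delta unit, so the new segment begins with a
    # keyframe and is independently decodable
    keyframe = not buf.flag_is_set(gst.BUFFER_FLAG_DELTA_UNIT)
    if state['fd'] is None or (keyframe and
                               buf.timestamp - state['start'] >= TARGET):
        state['start'] = buf.timestamp
        start_segment()
    state['fd'].write(buf.data)

pipeline.get_by_name('sink').connect('new-buffer', on_new_buffer)
pipeline.set_state(gst.STATE_PLAYING)
gobject.MainLoop().run()
```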
I left it running for the night.
And this morning when I got up, it was still going strong, and I let it pass the 10-hour mark.
So, a good first step.
I hope to finish up some loose ends over the course of the week to make this work inside Flumotion.
I'll leave you with my first impressions of this Apple creation:
- Naming a draft 'HTTP Live Streaming', pretending this is something new after years of Shoutcast - Icecast - Flumotion, is either plain ignorance or typical Apple hubris. At least qualify the name with something like 'segmented', 'TS', or 'high-latency', Apple. Come on, play nice for once.
- The system is very different from your typical streaming setup. Effectively, it creates a live stream by segmenting a live feed into a sequence of MPEG Transport Stream segments at a regular interval. This has some benefits and drawbacks.
- The key concept is the playlist file, an extension of the .m3u format called .m3u8. This playlist file is the entry point into the stream, as it lists the segments that make up the stream (see the sketch after this list).
- This playlist file can reference other playlist files; this is what enables adaptive-bandwidth streaming (also shown below).
- One clear benefit, and I'm sure the main one Apple had in mind, is that they effectively managed to separate the preparation part from the streaming part - the actual streaming can be handled by any old web server that can serve up files. The benefit is two-fold: first of all, it's easy and cheap to install web servers, and second, you get all the advantages of a bog-standard protocol like HTTP: firewall acceptance, proxy and caching support, edge caching, ... Take for example the fact that a company like Akamai charges more for some streaming protocols because they have to deploy specific servers and can't use all their edge infrastructure for them.
- Another benefit is that you generate the data for your live and on-demand streaming at the same time: the transport segments can be reused as-is for on-demand .m3u8 streams. This blending of live and on-demand is something we had started thinking about with the developers at Flumotion too.
- A third benefit is how easy this system makes load balancing on a platform. In most streaming services, a connection is long-lived and hard to migrate between servers. Since in Apple's live HTTP streaming the stream consists of many short files, you can switch servers simply by updating the playlists, effectively migrating streaming sessions to another machine within a minute.
- As for drawbacks, the biggest one I see is latency. In this system, the latency is at least three times the segmentation interval: the playlist should only contain finished segments, and the spec mandates that the player have at least three segments loaded (one playing, two preloaded). So the recommended interval of 10 seconds gives you at best a 30-second latency. I don't really understand why they didn't work around this limitation somehow (for example, by allowing a growing transport stream in the playlist, marked as such, or by referencing future files, marked as such), because latency is where live iPhone streaming is going to catch the most flak, if our customers' opinions about latency in general are anything to go by.
- Another possible drawback is the typical problem with most HTTP streaming systems: no synchronization between server and client clocks. Computer clocks typically don't run at exactly the same speed, so in practice the client's buffer will eventually underrun (causing pauses) or overrun (usually causing players to stop). This is not that big of a deal, though, and I doubt sessions on the iPhone will be long enough to really make it a problem.
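To make the playlist concept concrete, here is roughly what the playlists from the draft look like (the filenames and bitrates are made up). A live stream playlist lists the last few finished segments, and the server keeps appending segments and advancing the media sequence number:

```
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:2680
#EXTINF:10,
segment-2680.ts
#EXTINF:10,
segment-2681.ts
#EXTINF:10,
segment-2682.ts
```

And a variant playlist, the one enabling adaptive-bandwidth streaming, simply points to one stream playlist per bitrate:

```
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=300000
low/stream.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=800000
high/stream.m3u8
```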
Whether this will become a general-purpose streaming protocol remains to be seen. I would assume that Apple is at least going to make this work in a future update of OSX. For us, though, it is an exciting development, allowing us to showcase the flexibility of our design against this new protocol. And while I saw some fellow GStreamer developers griping about this new way of streaming, there too it should be seen as an advantage: in theory at least, the flexible GStreamer design should make it possible to write a source element for this protocol that abstracts away the streaming implementation and just feeds the re-assembled transport stream downstream, much like a dvb or firewire element would do (see the sketch below).
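Conceptually, the client side of such a source element boils down to a loop like this minimal sketch (the URL is hypothetical, and a real element would also handle variant playlists, EXT-X-ENDLIST, and network errors):

```python
# Poll a live playlist and write the re-assembled transport stream to
# stdout, where a real source element would push it into the pipeline.
import sys
import time
import urllib2
import urlparse

PLAYLIST = 'http://example.com/live/stream.m3u8'  # hypothetical

seen = set()
while True:
    for line in urllib2.urlopen(PLAYLIST).read().splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip tags and blank lines; the rest are URIs
        uri = urlparse.urljoin(PLAYLIST, line)
        if uri not in seen:
            seen.add(uri)
            sys.stdout.write(urllib2.urlopen(uri).read())
            sys.stdout.flush()
    # the draft tells clients to wait on the order of the target
    # duration before reloading the playlist
    time.sleep(10)
```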