Monday, May 25, 2015

Daala and H.264

This is an article I've been thinking over for a while, and I was more or less forced to write it when I saw its ideas taking over another article I was writing about Mozilla and Firefox OS.  So I'll write this here and (hopefully) finish the Mozilla post afterwards.

So, h.264 sucks.  Don't get me wrong, it's a decent codec.  But it's also a bit of a honey pot: it doesn't directly cost you and me, but it does cost device and software manufacturers, so it costs us indirectly in the long run.

While Apple and Google were lining up behind h.264, I think people (myself included) were blinded by its being an "open" video standard.  I presumed that meant license free.  What it really means is that anybody can write their own decoder, but you still have to pay a license fee to ship that or any other decoder.  That's right, it doesn't matter that libx264 is open source, it still costs you if you want to ship it on a device or use it to create a web-based player.  MP3 decoding requires a similar license fee.  I suspect fees like these were a contributing factor in Microsoft discontinuing Windows Media Player.  It also means I was a bit hard on Archos back in the day for charging for extra codec packs.  They might have made some money off them, but they were at least slightly justified in making them optional.

The insidiousness is this:  h.264 is a decent codec, and free for most of us to use.  So of course if we rip video, or edit home movies, we will probably save them in h.264.  I would bet most home video files and torrented video files are h.264.  Blu-ray video is commonly h.264 too, by the way.  So the trick is, by making it free to users, they've helped to ensure that h.264 is what people will use.  Therefore, anybody who wants to create an audio/video platform (be it a game console, phone, web site, or set top box) has to support h.264, and therefore has to pay license fees to MPEG LA, LLC.  (You could argue that your web platform could use whatever codecs you want, but then you'd pay for it in CPU time, both because hardware is commonly optimized for h.264 decoding, and because most source videos would be h.264, so you'd have to decode them anyway.)

And if you think about it, you can see why Apple and Google were all for it.  Google has more than enough money to pay for a license for YouTube, and the same goes for Apple and iTunes.  But it also creates a nice little barrier to entry, keeping other similar sites from popping up and competing.  I'm not sure how much it costs, or whether it's charged per video or as a yearly fee or what.  It may in fact only be a nuisance fee that doesn't block anybody.  (Actually, there's a price breakdown here; it looks like an annual cost, around 14 cents a subscriber at one tier.)

Most Linux distros don't include any non-free software in their official repos.  As you can gather, that's not mainly due to snobbery.  It's because if they shipped non-free components like these decoders with their software, they'd have to pay license fees.  There may be just a little of the most well-meaning snobbery involved, but mostly it's for legal reasons, I'm sure.

The Xiph.Org Foundation, the organization behind Ogg Vorbis and Opus, is now working on Daala (with Mozilla's help).  Daala is the answer to h.264 and h.265.  It's not finished yet (it'll probably take another year or so), but it already outperforms both of the aforementioned MPEG codecs.  I should say, though, that going by the graph in the video linked below, it seems to outperform both only marginally.  Don't expect magic, just be happy that this thing which is completely free is also superior to what others want you to pay for.

If you're interested in Daala, check out this video by Timothy Terriberry, explaining the choices they made in creating the video codec.  It's an hour long but fascinating.  My favorite part is his description of an invertible blocking filter.  The encoder can block up the video to make it easier to process and encode it, and the decoder then deblocks the video by inverting that earlier blocking filter, preserving a lot of the information at a reduced CPU cost.
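
To give a feel for what "invertible" means here, below is a toy sketch in Python.  To be clear, this is not Daala's actual filter, and the function names and the sample row are made up for illustration.  It assumes a single row of integer pixel values split into fixed-size blocks, and it only touches the two samples on either side of each block boundary, using integer lifting steps so that the decoder-side filter undoes the encoder-side filter exactly.

```python
# Toy illustration only: this is NOT Daala's real filter.
# prefilter/postfilter and block_size are hypothetical names for this sketch.

def prefilter(samples, block_size):
    """Encoder side: apply a lifting step across every block boundary."""
    out = list(samples)
    for edge in range(block_size, len(out), block_size):
        a, b = out[edge - 1], out[edge]  # the two samples straddling the boundary
        d = b - a                        # lifting step 1: difference
        s = a + (d >> 1)                 # lifting step 2: rounded-down average
        out[edge - 1], out[edge] = s, d
    return out

def postfilter(samples, block_size):
    """Decoder side: undo the lifting steps in reverse order."""
    out = list(samples)
    for edge in range(block_size, len(out), block_size):
        s, d = out[edge - 1], out[edge]
        a = s - (d >> 1)                 # undo lifting step 2
        b = d + a                        # undo lifting step 1
        out[edge - 1], out[edge] = a, b
    return out

row = [10, 12, 15, 20, 90, 88, 85, 80]   # made-up pixel values, two blocks of 4
filtered = prefilter(row, 4)
restored = postfilter(filtered, 4)
print(filtered)
print(restored == row)
```

Each step only adds or subtracts a value the decoder can recompute, so nothing is lost to rounding.  A real codec would quantize the filtered data in between, and the real filters span more than two samples, but the round trip works on the same principle.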

Another funny bit in the video: Mr. Terriberry explains that one of the ways to avoid patent infringement was to find something all other video codecs do, and simply not do that thing.  There are four areas in which they deviated from the norm.  One of them, Displaced Frame Difference, is so common that their technical writer inserted it into an early draft of their patent application on his own, assuming that since it's a video codec, it must use DFD.

One last note: the x264 and LAME MP3 encoders are both open source, and both are considered by many to be at the top of their fields.  x264 especially is thought to be the best optimized h.264 encoder out there.  I can only imagine what will happen when Daala comes out and these communities start optimizing it.  I can't wait to see.

Til then,

David
