Present Perfect



Collabora and Fluendo collaborate fluently!

Filed under: Fluendo,GStreamer — Thomas @ 13:01


Well, this sure has been a long time in the making.

Fluendo and Collabora have a checkered past which I won't get into, but on paper it has always made sense for these two companies to collaborate on making GStreamer work commercially. One company specializes in products, the other in consulting (I'm sure you can figure out which is which), and they complement each other perfectly in making GStreamer more commercially successful.

I personally have always believed that we need to get GStreamer to other platforms and make it as easy to use as possible there. Windows was an obvious target in the past, and now Android is another. There is a big difference between a successful open source project and a commercially successful one. Flumotion's Andoni Morales, who came with me to the GStreamer 0.11 hackfest in Malaga, is going to be working on this one SDK to rule them all.

Christian beat me to it in the blogosphere, but the word is now officially out! Feel free to read Fluendo's press release.

GStreamer 0.11 Application Porting Hackfest

Filed under: Conference,Flumotion,GStreamer,Hacking,Open Source — Thomas @ 11:16


I'm in the quiet town of Malaga these three days to attend the GStreamer hackfest. The goal is to port applications over to the 0.11 API, which will eventually become 1.0. There are about 18 people here, which is a good number for a hackfest.

The goal for me is to figure out everything that needs to be done to have Flumotion working with GStreamer 0.11. It looks like there is more work than expected, since some of the things we rely on haven't been ported successfully.

Luckily back in the day we spent quite a bit of time to layer parts as best as possible so they don't depend too much on each other. Essentially, Flumotion adds a layer on top of GStreamer where GStreamer pipelines can be run in different processes and on different machines, and be connected to each other over the network. To that end, the essential communication between elements is abstracted and wrapped inside a data protocol, so that raw bytes can be transferred from one process to another, and the other end ends up receiving those same GStreamer buffers and events.

First up, there is the GStreamer Data protocol. Its job is to serialize buffers and events into a byte stream.
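To make the idea of serializing buffers and events into a byte stream concrete, here is a minimal sketch in Python. It is not the real GDP wire format: the type tags, the 5-byte header layout, and the function names are all assumptions made up for illustration.

```python
import struct

# Hypothetical type tags for this sketch; the real GDP header is different.
TYPE_BUFFER = 1
TYPE_EVENT = 2

# 1-byte type tag followed by a 4-byte big-endian payload length.
HEADER = struct.Struct("!BI")


def pack(kind, payload):
    """Frame one item as header + payload bytes."""
    return HEADER.pack(kind, len(payload)) + payload


def unpack_stream(data):
    """Yield (kind, payload) tuples from a concatenation of frames."""
    offset = 0
    while offset < len(data):
        kind, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        yield kind, data[offset:offset + length]
        offset += length


# An event followed by a buffer, flattened into one byte string and back.
stream = pack(TYPE_EVENT, b"caps: audio/x-vorbis") + pack(TYPE_BUFFER, b"\x00\x01\x02")
items = list(unpack_stream(stream))
```

The receiving side only needs the byte stream to rebuild the same sequence of events and buffers, which is what lets a pipeline be split across processes.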

Second, there is the concept of streamheaders (which is related to the DELTA_UNIT flag in GStreamer). These are buffers that always need to be sent at the beginning of a new stream to be able to interpret the buffers coming after them. In 0.10, that meant that at least a GDP version of the caps needed to be in the streamheader (because the other side cannot interpret a running stream without its caps), and in more recent versions a new-segment event as well. These streamheaders are analogous to the new sticky event concept in 0.11: some events, like CAPS, TAG and SEGMENT, are now sticky to the pad, which means that a new element connected to that pad will always see those events to make sense of the new data it's getting.

Third, the actual network communication is done using the multifdsink element (and an fdsrc element on the other side). This element just receives incoming buffers, keeps them on a global buffer list, and sends all of them to the various clients added to it by file descriptor. It understands streamheaders, and makes sure clients get the right ones for wherever they end up in the buffer list. It manages the buffers, the speed of clients, the bursting behaviour, ... It doesn't require GDP at all to work - Flumotion uses this element to stream Ogg, mp3, asf, flv, webm, ... to the outside world. But to send GStreamer buffers, it's as simple as adding a gdppay before multifdsink, and a gdpdepay after fdsrc. Also, at the same level, there are tcpserversink/tcpclientsrc and tcpclientsink/tcpserversrc elements that do the same thing over a simple TCP connection.
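As a rough illustration of the behaviour described for multifdsink, the following Python sketch keeps a global buffer list and makes sure every newly added file descriptor gets the streamheaders first. The class and method names are hypothetical, and it leaves out everything the real element does about slow clients, bursting, and buffer limits.

```python
import os


class ToyMultiFdSink:
    """Toy model: one global buffer list, many clients identified by fd."""

    def __init__(self, streamheaders):
        self.streamheaders = list(streamheaders)  # needed to decode what follows
        self.buffers = []                         # global buffer list
        self.clients = {}                         # fd -> index of next buffer to send

    def add(self, fd):
        # A new client first gets the streamheaders, then joins at the live point.
        for header in self.streamheaders:
            os.write(fd, header)
        self.clients[fd] = len(self.buffers)

    def render(self, buf):
        # Queue the buffer, then push everything still pending to each client.
        self.buffers.append(buf)
        for fd, position in list(self.clients.items()):
            for pending in self.buffers[position:]:
                os.write(fd, pending)
            self.clients[fd] = len(self.buffers)


# Two pipes stand in for two network clients; one joins later than the other.
early_r, early_w = os.pipe()
late_r, late_w = os.pipe()

sink = ToyMultiFdSink([b"HDR"])
sink.add(early_w)
sink.render(b"one")
sink.add(late_w)
sink.render(b"two")

early = os.read(early_r, 1024)
late = os.read(late_r, 1024)

for fd in (early_r, early_w, late_r, late_w):
    os.close(fd)
```

The late client misses the buffer sent before it connected, but still starts with the header it needs to interpret the rest.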

Fourth, there is an interface between multifdsink/fdsrc and Python. We let Twisted set up the connections, and then steal the file descriptors and hand them off to multifdsink and fdsrc. This makes it very easy to set up all sorts of connections (like, say, in SSL, or just pipes) and do things to them before streaming (like, for example, authentication). But by passing the actual file descriptor, we don't lose any performance - the low-level streaming is still done completely in C. This is a general design principle of Flumotion: use Python and Twisted for setup, teardown, and changes to the system, and where we need a lot of functionality and can sacrifice performance; but use C and GStreamer for the lower-level processor-intensive stuff, the things that happen in steady state, processing the signal.
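The split between high-level setup and low-level streaming can be sketched with nothing but the standard library. In this sketch a socket pair stands in for a connection that Twisted would have set up, the one-line password check is a made-up protocol, and a plain function writing to a descriptor stands in for the C element; it assumes a Unix-like system where socket descriptors can be written to with os.write.

```python
import os
import socket


def authenticate(conn, expected=b"secret\n"):
    """Setup phase, done in Python: check a (hypothetical) one-line password."""
    return conn.recv(64) == expected


def stream_to_fd(fd, chunks):
    """Steady-state phase: sees only a raw descriptor, as a C element would.

    Small writes are assumed to complete in one call in this sketch.
    """
    for chunk in chunks:
        os.write(fd, chunk)


def serve(conn, chunks):
    """Authenticate on the high-level object, then hand over the bare descriptor."""
    if not authenticate(conn):
        conn.close()
        return False
    fd = os.dup(conn.fileno())  # "steal" the descriptor
    conn.close()                # the Python wrapper is no longer needed
    try:
        stream_to_fd(fd, chunks)
    finally:
        os.close(fd)
    return True


def read_all(conn):
    """Read until the peer closes the connection."""
    parts = []
    while True:
        data = conn.recv(1024)
        if not data:
            return b"".join(parts)
        parts.append(data)


server, client = socket.socketpair()
client.sendall(b"secret\n")
accepted = serve(server, [b"buffer-1", b"buffer-2"])
received = read_all(client)
client.close()
```

Everything that needs flexibility (who may connect, over what transport) happens before the handover; after it, the streaming code never touches a Python connection object again.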

So, there is work to do in GStreamer 0.11:

  • The GStreamer data protocol has not really been ported. gdppay/depay are still there, but don't entirely work.
  • streamheaders in those elements will need adapting to handle sticky events.
  • multifdsink was moved to -bad and left with broken unit tests. There is now multisocketsink, but sadly it looks like GSocket isn't meant to handle pure file descriptors (which we use, for example, in our component that records streams to disk).
  • 0.11 doesn't have the traditional Python bindings. It uses gobject-introspection instead. That will need a lot of work on the Flumotion side, and ideally we would want to keep the codebase working against both 0.10 and 0.11 as we did for the 0.8->0.10 move. Apparently these days you cannot mix gi-style binding with old-style binding anymore, because they create separate class trees. I assume this also means we need to port the glib2/gtk2 reactors in Twisted to using gobject-introspection.

So it looks like there is a lot of work to be done. Luckily Andoni arrived today too, so we can share some work.

After discussing with Wim, Tim, and Sebastien, my plan is:

  1. create a common base class for multihandlesink, and refactor multisocketsink and multifdsink as subclasses of it
  2. create g_value_transform functions to bytestreams for basic objects like Buffers and Events
  3. use these transform functions as the basis for a new version of GDP, which we'll make typefindable this time around
  4. support sticky events
  5. ignore metadata for now, as it is not mandatory; although in the future we could let gdppay decide which metadata it wants to serialize, so the application can request to do so
  6. try multisocketsink as a transport for inside Flumotion and/or for the streaming components.
  7. In the latter case, do some stress testing - on our platform, we have pipelines with multifdsink running for months on end without crashing or leaking, sometimes going up to 10000 connections open.
  8. make Twisted reactors that work with the gobject-introspection bindings
  9. prototype flumotion-launch with 0.11 code by using gir

That's probably not going to be finished this week, but it's a good start. Last night I started by fixing the unit tests for multifdsink, and with those in place I have now started refactoring multisocketsink and multifdsink. I'll first try and make unit tests for multisocketsink though, to verify that I'm refactoring properly.

GStreamer Conference number 2

Filed under: Conference,Flumotion,GStreamer — Thomas @ 14:28


I'm in Prague right now for the second GStreamer conference. Prague is as pretty as I remember it from eighteen years ago when I was still in high school and we had our yearly school trip.

It's great to see a mix of familiar and new faces again. 11 years ago GStreamer was made public, and I joined a year later around the 0.1.1 release if I recall. And now it's this huge living breathing thing.

Tomorrow I will be giving a talk about Flumotion here, at 12.00 in the main room. If you're interested in GStreamer beyond mere playback, this talk is for you. The only sad part is that my good friend Jan Schmidt will be talking about Bluray at the same time, but I'm relying on Ubicast to record it properly so I can see it later!

Adventures in fingerprinting

Filed under: DAD,Fedora,GStreamer — Thomas @ 20:55


One of the key concepts in my rewrite of DAD is that it should be possible to relate the same track across different files and computers. I have copies of files, and different encodings of the same track, spread across machines. The various applications I use for playback seem to exist in isolation on each machine, so I tend to rate only occasionally, knowing that my ratings aren't centralized. And I get annoyed when Banshee detects three copies of an album, and then orders them by track number, playing each track three times before moving on to the next one.

The logical way to do this is through acoustic fingerprinting. These are algorithms that extract certain features from an audio file and calculate an algorithm-specific 'fingerprint' for it. Usually, these fingerprints are not identical across different encodings of the same file, so you can't look up twins in a list; but the fingerprints can be compared to each other and a 'difference' within a certain confidence interval calculated.
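As a sketch of what comparing two fingerprints might look like, assume a fingerprint is a list of 32-bit integers and that two encodings of the same track differ in only a small fraction of bits. Both the representation and the threshold below are assumptions for illustration; a real service also has to handle fingerprints of different lengths and time offsets between them.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits between two equal-length lists of 32-bit ints."""
    if len(fp_a) != len(fp_b) or not fp_a:
        raise ValueError("fingerprints must be non-empty and of equal length")
    differing = sum(
        bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in zip(fp_a, fp_b)
    )
    return differing / (32 * len(fp_a))


def same_track(fp_a, fp_b, threshold=0.15):
    """Treat two fingerprints as twins if few enough bits differ.

    The threshold is an illustrative guess, not a value from any real service.
    """
    return bit_error_rate(fp_a, fp_b) <= threshold


# Made-up fingerprints: a track, a re-encoding with two flipped bits, and an
# unrelated track.
original = [0xDEADBEEF, 0x12345678, 0x0F0F0F0F]
reencoded = [0xDEADBEEE, 0x12345679, 0x0F0F0F0F]
unrelated = [0x00000000, 0xFFFFFFFF, 0xF0F0F0F0]

is_twin = same_track(original, reencoded)
is_other = same_track(original, unrelated)
```

An exact-match lookup would miss the re-encoded copy entirely, which is why a distance with a tolerance is needed.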

Most fingerprinting algorithms have a library that calculates a fingerprint and then submits it to a complementary web service, which can quickly compare it against known fingerprints to find twins.

In the past, either the client library/application or the web service (or both) was not open enough to be of interest for most Free Software people.

But recently, someone in the #morituri channel mentioned acoustid which only consists of open components. So, that seemed interesting enough to try out!

Chromaprint, the client side, consists of a library, a sample application (linked against FFmpeg), and a Python module with some sample scripts.

There is also a gst-chromaprint GStreamer plug-in on github. (As a side note, amazing to see that GStreamer plug-ins these days come for free! I recall the days when we had to do the work ourselves to write GStreamer plug-ins for libraries.)

So, after giving them a quick test run, I packaged up the whole set, and it's now available for Fedora 14 and 15 in my package repositories.

The chromaprint-tools package contains fpcalc and you need to enable rpmfusion-nonfree to get its ffmpeg dependency.

And after that, I created a Task in DAD for chromaprint, and now I have:

$ dad analyze chromaprint /opt/davedina/audio/albums/Afghan\ Whigs\ -\ Gentlemen/Afghan\ Whigs\ -\ Debonair.ogg
** Message: pygobject_register_sinkfunc is deprecated (GstObject)
/opt/davedina/audio/albums/Afghan Whigs - Gentlemen/Afghan Whigs - Debonair.ogg:
Found 1 results
- Found 4 recordings.
- musicbrainz id: 62b2952a-4605-4793-8b79-9f9745ea5da5
- artist: The Afghan Whigs
- title: Debonair
- musicbrainz id: 8ff78e73-f8bd-4d78-b562-c3e939fb93fb
- artist: The Afghan Whigs
- title: Debonair
- musicbrainz id: a0d5ced6-43e8-450a-bf11-94f1f4520b92
- artist: The Afghan Whigs
- title: Debonair
- musicbrainz id: d01ac720-874c-48d6-95c6-a2cb66f9d5d0
- artist: The Afghan Whigs
- title: Debonair


Now it's time to dump that in the couchdb database backend, and start identifying duplicate tracks.

Acoustid seems to be a relatively young project, but its maintainer is very active on the mailing list and it's filling a hole in the open world that I'm happy to see filled! Thank you Lukas.

Step 1

Filed under: GStreamer,Hacking — Thomas @ 13:24


[root@ana ~]# rpm -Uhv /home/thomas/rpm/RPMS/x86_64/gstreamer011-*
Preparing... ########################################### [100%]
1:gstreamer011 ########################################### [ 33%]
2:gstreamer011-devel ########################################### [ 67%]
3:gstreamer011-debuginfo ########################################### [100%]

