Over the past few years I’ve been quietly exploring ideas for my ideal music application. Back when we all lived together in that great house in Gent, we had a hacky set of PHP and Perl code that let us import music, rate it, and play it back. It worked for our purposes, but it was exactly that: hacky.
Now I’m not saying I got that much better at coding, but I’m sure I improved a little bit. Still, I’ve always put off actually writing the damn code to replace it, and hence I have a bunch of separate music collections – the music I was listening to in that house (properly rated, but very outdated), random collections of downloads, and now the collection of CDs I’ve bought since leaving that house, which never quite made it into my computer and are now being imported by the Lego robot.
Over a year ago, I re-implemented the mixing backend on top of GNonLin, which for the most part works as long as I don’t actually dereference tracks played – something to figure out at some point. I have ideas about a pure web-based mixing backend as well, but I need to learn modern stuff like jQuery first.
But the missing key really was something that handles the database part well enough, because my application should work distributed – it should manage my tracks on all my devices, including all my computers, and be able to figure out that some crappy mp3 of a song on my laptop is the same song as the flac version at home on my NAS. So if I rate that crappy mp3 on my laptop, I want that taken into account when my home machine creates a mix.
And for me, CouchDB promised to fill that niche. Except of course that I spent the last year figuring out how I can marry CouchDB’s approach to replication with my natural desire to denormalize. It turns out that’s possible with CouchDB, but it involves doing a lot of client-side caching (and invalidating/changing on change notifications) and is already pretty slow when I do it for my 14000 test tracks.
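To give an idea of what that caching dance looks like, here’s a minimal in-memory sketch – the dict-backed store and the `on_change` hook are stand-ins for an actual CouchDB database and its `_changes` feed, not my real code:

```python
# Minimal sketch of client-side caching with invalidation driven by
# change notifications.  A dict plays the role of the database, and
# on_change() plays the role of a _changes feed listener.

class DocumentCache:
    def __init__(self, backend):
        self._backend = backend   # maps doc id -> document
        self._cache = {}

    def get(self, doc_id):
        # serve from the cache, falling back to the backend on a miss
        if doc_id not in self._cache:
            self._cache[doc_id] = self._backend[doc_id]
        return self._cache[doc_id]

    def on_change(self, doc_id):
        # a change notification invalidates the cached copy
        self._cache.pop(doc_id, None)

backend = {'artist:nirvana': {'name': 'Nirvana'}}
cache = DocumentCache(backend)
print(cache.get('artist:nirvana')['name'])   # Nirvana

# simulate an update arriving over the change feed
backend['artist:nirvana'] = {'name': 'Nirvana (updated)'}
cache.on_change('artist:nirvana')
print(cache.get('artist:nirvana')['name'])   # Nirvana (updated)
```

The slow part in practice isn’t this logic, it’s doing it for thousands of documents whose parents keep changing.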
So, I’ve decided to experiment in a world where normalization is not needed. I’m just going to pick one central concept (the ‘track’), store as much related data in that document as possible – the fragments of audio files on each computer that represent that track, its ratings, what album it’s on, which artists made it – treat some of those values as caches of the last known value from parent documents, and just go for speed first and see how that goes.
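To make that concrete, a denormalized track document could look something like this – all field names and paths here are hypothetical, just to illustrate the shape:

```python
# A hypothetical denormalized 'track' document: everything related to
# the track lives in one place, and the album/artist fields act as
# caches of the last known value from their parent documents.

track = {
    '_id': 'track:nirvana-all-apologies',
    'fragments': [
        # per-host fragments of audio files that represent this track
        {'host': 'laptop', 'path': '/home/me/music/all-apologies.mp3',
         'start': 0.0, 'end': 230.23},
        {'host': 'nas', 'path': '/mnt/nas/music/All Apologies.flac',
         'start': 0.0, 'end': 230.23},
    ],
    'ratings': {'thomas': 8},
    # cached last-known values from parent documents
    'album': {'id': 'album:in-utero', 'name': 'In Utero'},
    'artists': [{'id': 'artist:nirvana', 'name': 'Nirvana'}],
}

def refresh_album_cache(track, album_doc):
    """Re-fill the cached album fields from a freshly fetched parent."""
    track['album'] = {'id': album_doc['_id'], 'name': album_doc['name']}
```

The cached fields can go stale, but a change notification on the parent is enough to know when to re-fill them.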
Yes, I am going to relax about not having everything perfect on the inside, so I can move on and write some more code that I can actually use.
I very much enjoyed trying to shoehorn CouchDB into my relational worldview, but I want to see what life is like on the other side.
Before, I was also very focused on migrating my old data (from the music I had when I was in the house in Gent) and its ratings. That’s still important to me, but I think right now I’d more enjoy having something that lets me listen to and rate new music. When I originally wrote DAD I didn’t expect to be getting so much music that wasn’t from CDs. That’s obviously not the case anymore, and I’m probably one of the last maniacs still buying CDs and worrying about getting them sample-perfect onto my NAS. In today’s reality I need to deal with having the same track fifteen times, in various qualities, and I wish my computer handled that for me.
As part of this shift in approach, both in how I use CouchDB and in what music I now want to listen to, I’m going to build the code from the opposite end to the one I’ve been working from, focusing on smaller building blocks and getting the experience right. Step one will be collecting the right data about audio files, splitting them into individual fragments, and loading music into the database in two passes. I’ll focus on having small tools that show the application can add tracks quickly and start playing them, filling in the more costly information later, and that the GUI frontend can update these in realtime in the database view.
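In sketch form, the two-pass idea looks like this – the function names and the dict standing in for the database are made up for illustration, but it shows why a track becomes playable before the expensive analysis has run:

```python
# Sketch of a two-pass import: pass one stores just enough to start
# playing the file right away; pass two fills in the costly analysis
# (fragments, levels) afterwards.

def import_pass_one(db, path):
    # cheap: register the file with a stub document immediately
    doc = {'_id': 'file:' + path, 'path': path, 'analyzed': False}
    db[doc['_id']] = doc
    return doc

def import_pass_two(db, doc, analyze):
    # expensive: run the level analysis and fold the results back in
    doc.update(analyze(doc['path']))
    doc['analyzed'] = True
    db[doc['_id']] = doc

db = {}
stub = import_pass_one(db, '/music/song.ogg')
# at this point the track is already in the database and playable
import_pass_two(db, stub, lambda path: {'fragments': [{'start': 0.0}]})
```

A GUI watching the database just sees the document change twice, which is exactly the realtime-update behaviour I want to demonstrate.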
And, as usual, I like to shoehorn in a use for my python command class, so I’ll be using that as a collection point for these little tools as I work my way up.
After plugging in the right plumbing, in twenty minutes I had this on top of my old code:
$ dad analyze level /mnt/nas/media/davedina/audio/albums/Nirvana\ -\ In\ Utero/Nirvana\ -\ All\ Apologies.ogg
** Message: pygobject_register_sinkfunc is deprecated (GstObject)
Successfully analyzed file /mnt/nas/media/davedina/audio/albums/Nirvana - In Utero/Nirvana - All Apologies.ogg.
- fragment 0: 0:00:00.000000000 - 0:03:50.230204081
  - peak 0.240 dB (105.672 %)
  - rms -14.199868248342282 dB
  - peak rms -8.913940439528652 dB
  - 95 percentile rms -12.001385041642244 dB
  - weighted rms -14.202287606952533 dB
  - weighted from 0:00:01.205986394 to 0:03:39.612879818
- fragment 1: 0:23:59.107482993 - 0:31:32.227482993
  - peak 0.526 dB (112.876 %)
  - rms -14.742109190444983 dB
  - peak rms -8.729096757819718 dB
  - 95 percentile rms -11.56951163744373 dB
  - weighted rms -14.742603253857133 dB
  - weighted from 0:23:59.223582765 to 0:31:18.498684807
In case you were wondering, this shows the code correctly determining that the ‘All Apologies’ track on the In Utero CD in fact contains two songs. It always annoys the hell out of me when one of the music players I use plays nothing for 20 minutes just because Kurdt thought that would be amusing all those years ago.
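The real detection runs on level measurements coming out of GStreamer, but the splitting logic itself is simple. Here’s a toy version that finds non-silent fragments in a list of per-block RMS values – the threshold and the block representation are invented for the example, not taken from my code:

```python
# Toy fragment detection: given per-block RMS levels in dB, split the
# stream into fragments separated by runs of silence below a threshold.
# The real code works on GStreamer 'level' messages; this only shows
# the splitting idea.

def split_fragments(levels, threshold_db=-60.0):
    """Return (start, end) block indices of non-silent fragments."""
    fragments = []
    start = None
    for i, level in enumerate(levels):
        if level > threshold_db:
            if start is None:
                start = i          # a fragment begins
        else:
            if start is not None:
                fragments.append((start, i))   # silence ends it
                start = None
    if start is not None:
        fragments.append((start, len(levels)))
    return fragments

# two songs separated by a long stretch of silence
levels = [-20.0] * 5 + [-90.0] * 10 + [-25.0] * 3
print(split_fragments(levels))   # [(0, 5), (15, 18)]
```

Multiply the block indices by the block duration and you get start/end timestamps like the ones in the output above.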
(In case you were really astute, you may have noticed that this code claims that the peak of these fragments is above unity, which you would think is weird and wrong. Monty could give you a long and interesting explanation of how that is in fact natural; every time I read it I still don’t get it, even with my audio engineering background, and I still don’t know whether this apparent peak level is a bad thing. But in practice my playback code auto-levels anyway and consistently reduces the volume of tracks, so I don’t think it matters…)
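For reference, the percentage the tool prints next to the peak appears – judging by the numbers in the output, not by reading the source – to be a plain power-ratio conversion of the dB figure:

```python
import math

# Inferred from the output above: percent = 100 * 10 ** (dB / 10),
# i.e. the dB value is treated as a power ratio.  For fragment 1,
# 0.526 dB gives 112.876 %, exactly as printed; fragment 0 matches
# to within display rounding.

def db_to_percent(db):
    return 100.0 * 10.0 ** (db / 10.0)

def percent_to_db(percent):
    return 10.0 * math.log10(percent / 100.0)

print(round(db_to_percent(0.526), 3))   # 112.876, as in fragment 1
```

Anything above 100 % here is a peak above unity, which is what the parenthetical above is puzzling over.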