thomas.apestaart.org

morituri 0.2.2 “my bad” released

Filed under: morituri,Python,Releases — Thomas @ 22:23

2013-07-30
22:23

The 0.2.1 release contained a bug causing "rip offset find" to fail. That's annoying for new users, so I spent some time repenting in brown paper bag hell, and fixed a few other bugs besides. Hence, my bad.

I can understand that you didn't all mass-flattr the 0.2.1 release - you tried it and you saw the bug! Shame on me.

Well, it's fixed now, so feel free to pour in your flattr love if you use morituri! Just follow this post to my blog and hit the button.

The 0.2.2 packages are in the Fedora 17-18-19 repositories. Enjoy!

Comments (4)

morituri 0.2.1 “married” released

Filed under: Hacking,morituri,Python,Releases — Thomas @ 09:02

2013-07-15
09:02

I finally managed to set aside a few hours this weekend to fix some smaller issues in morituri and put out a new release. (For those who don't know, morituri is an accurate CD ripper for Linux.)

Life's been a little busy lately and my spare-time hacking has been suffering. But I'm happy I got a nice stretch of hacking hours in on morituri, and hope to repeat it in the next few weeks to knock out some more complicated issues, like tackling the reports of problems with the latest pycdio releases.

The most important change is probably the filtering of non-FAT and other special characters, which I ended up doing a lot like sound-juicer does, because I trust Ross to have looked at this in detail.

In addition, after curiously reading Lionel Dricot's posts about Flattr, I decided to get a little more serious about trying Flattr again (I had only flattr'd about 4 things so far due to lack of content). I integrated Flattr in my WordPress install, upgrading it in the process, and installed the Chrome extension, which should give me many more options to flattr other people's content - for example, GitHub repos.

So if you like morituri, go to this post on my website and click the Flattr button you see at the bottom of this post or on the morituri homepage!

I don't expect to get rich off it, but I think it's a nice way of showing you appreciate someone's work.

Comments (5)

morituri and Hidden Track One Audio

Filed under: morituri,Music,Python — Thomas @ 21:08

2013-05-10
21:08

I have tomorrow (Saturday) blocked out for a whole day of morituri hacking as I will be home alone.

One of the things a lot of morituri users are puzzled by is its relentless drive to extract every single sample of audio from the CD. Currently it does so even if it's a really short pre-gap, most likely just the result of an inaccurate master or burn, with no useful audio in it.

For me, that was a design goal of morituri - I want to be able to exactly reproduce a CD as is. That is to say, ripping a CD should extract *all* audio from the CD, and it should be possible to make a copy of that CD and then rip that copy, and end up with exactly the same result as from the original CD. (I'm sure there's a fancy scientific term for that that I can't remember right now)

To a lot of other people, it seems to be annoying and they don't like having those small almost empty files lying around.

So I thought I'd do something about that, and that it might be useful as well to analyze my current collection of tracks and figure out what's in there. Maybe I can find some hidden gems that I hadn't noticed before?

So I added a quick task to morituri that calculates the maximum sample value. (I didn't want to use my own level element in GStreamer for this, as I wanted to make sure it was actual digital zero. This should really be done in an element, but I preferred the five-minute hack for this one.)
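To make the idea concrete, here is a minimal Python sketch of a maximum-sample check. It is not morituri's actual task: it assumes the track has already been decoded to raw signed 16-bit little-endian PCM, and the names max_sample and is_digital_silence are made up for illustration.

```python
import array
import struct
import sys


def max_sample(pcm_bytes):
    """Return the largest absolute sample value in raw signed 16-bit PCM.

    Assumption: pcm_bytes is little-endian and a whole number of samples
    (otherwise array.frombytes raises ValueError).
    """
    samples = array.array("h")  # "h" = signed 16-bit integers
    samples.frombytes(pcm_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # convert the little-endian data to native order
    return max((abs(s) for s in samples), default=0)


def is_digital_silence(pcm_bytes):
    """True only if every single sample is exactly zero."""
    return max_sample(pcm_bytes) == 0


# Tiny hand-made buffers standing in for decoded audio.
quiet = struct.pack("<4h", 0, 0, 0, 0)
audible = struct.pack("<4h", 0, -5, 3, 0)
```

The point of comparing against exactly 0 rather than a loudness threshold is that even very quiet audio is kept, and only true digital silence counts as empty.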

And then I ran:

rip debug maxsample /mnt/nas/media/audio/rip/morituri/own/album/*/00*flac

Sadly, that turned up 0 as the biggest sample for all these tracks!

Wait, what? I spent all that time on getting those secret tracks ripped just to get none? That's not possible! I know some of those tracks!

Maybe the algorithm is wrong. Nope, it works fine on all the regular tracks.

Oh, crap. Maybe morituri has been ripping silence all this time because my CD drive can't get that data off. Yikes, that would be a bit of egg on my face.

No, it works if I check that Bloc Party track I know about.

Ten minutes of staring at the screen to realize that, while I was outputting names from a variable from the for loop over my arguments, the track I was actually passing to the task was always the first one. Duh. Problem solved.
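For the curious, the bug described above has roughly this shape. This is a reconstruction in Python with hypothetical names (analyse_buggy, analyse_fixed, fake_max), not the code that was actually in morituri.

```python
def analyse_buggy(paths, task):
    """Reports every name, but always runs the task on the first path."""
    return [(path, task(paths[0])) for path in paths]  # bug: paths[0]


def analyse_fixed(paths, task):
    """Runs the task on the same path whose name is reported."""
    return [(path, task(path)) for path in paths]


# Hypothetical stand-in for the real analysis: only one file has audio.
fake_max = {"a/00.flac": 0, "b/00.flac": 0, "c/00.flac": 12345}.get
paths = ["a/00.flac", "b/00.flac", "c/00.flac"]
```

With the buggy version every track reports whatever the first one contains, which is how a whole collection can appear silent when the first track happens to be.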

As for what I found in my collection:

a cute radio jingle that brought back memories from a live bootleg I had made myself of Bloem. That's from over ten years ago, but that must have been around the time I learned about the existence of HTOA and wanted to get one in
unknown HTOA tracks on Art Brut's Bang Bang Rock & Roll and Mew's Half the world is watching me; not their best stuff
soundscapey or stage-setting tracks on QOTSA's Songs for the Deaf and Motorpsycho's Angels and Daemons at Play and Blissard; not really worth it (the Blissard track was OK, but really quiet)
Pulp hid a single piano chord in a 2 second pre-gap on This is Hardcore; very curious. It's not an intro to the first track, because it doesn't fit with the sound at all.
Damien Rice hid a demo version of 9 Crimes (the first track) in the pregap; instead of piano and female vocals, he plays guitar and sings all the parts.
Got reacquainted with my favourite HTOA tracks: the orchestral quasi-wordless medley on the Luke Haines/Das Capital disc; the first Bloc Party album with a beautiful instrumental (up there with the hidden track at the end of Placebo's first album; both bands delivering an atypical but stunning moodscape); the beautiful cover of Ben Kenobi's Theme by Arab Strap on the Cherubs EP (no idea why that landed in my album dir, that needs to be fixed); the silly Soulwax skit for their second album.

Of course, Wikipedia has the last word on everything

I note that they think Pulp recorded a cymbal, not a piano. And now that I see the title of the QOTSA hidden track, I get the joke I think.

In total, in my album collection of 1564 full CDs, I have 171 HTOAs ripped, of which 138 are tracks of pure digital silence and only about 11 are actually useful tracks.

I expected to find more gems in my collection. I'll go through EPs, singles and compilations next just to be sure.

But with this code in hand, maybe it's time to add something to morituri to save the silent HTOA tracks as pure .cue information.

Comments (4)

Votes for talks at open source conferences

Filed under: Conference,Python — Thomas @ 12:53

2013-05-07
12:53

I've never been a fan of voting for talks, because it tends to be poorly implemented under the guise of democracy. Of course it's easy for me to talk, I've never organized anything at that scale.

I'll give two examples of why I feel this way, one of which triggered today's blog post.

First off, my colleague Marek submitted a talk to DjangoCon. The talk was about how to use feat (a toolkit we wrote for live transcoding) to serve Django pages, but in such a way that they can use Deferreds to remove the concurrency bottleneck of "1 request at a time" per process running Django.

Personally, I find this one of the most irritating design choices of Django - from the ground up it was built synchronously (which could have been fine in most places). But the fact that, when you get a request, you always have to respond to it synchronously (and block every other request for that process in the meantime) is a design choice that could easily have been avoided.

In our particular use case, it was really painful. If our website has to do an API request to some other service we don't control that can easily take 30 seconds, our process throughput suddenly becomes 2 pages per minute. All the while, the server is sitting there waiting.

Yes, you can throw RAM at the problem and start 30 times more processes; or thread out API requests; or farm it out to Celery, and do some back-and-forthing to see when the call's done. Or do any other number of workarounds for a fundamental design choice.
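The arithmetic behind those numbers is simple enough to write down. The sketch below assumes each process handles exactly one request at a time and that every request takes the same time; the function name pages_per_minute is made up for illustration.

```python
def pages_per_minute(seconds_per_request, processes=1):
    """Throughput when every process serves one blocking request at a time."""
    return processes * 60.0 / seconds_per_request


# One process stuck behind a 30-second API call: 2 pages per minute.
single = pages_per_minute(30)

# The "throw RAM at it" workaround: 30 processes give 60 pages per minute,
# while each of them still spends nearly all of its time waiting.
thirty = pages_per_minute(30, processes=30)
```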

Since we like Twisted, we preferred to throw Twisted at the problem, and ended up with something that worked.
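To show what removing that bottleneck buys you, here is a small sketch using Python's standard asyncio module. To be clear, this is a stand-in and not Twisted, Deferreds or feat; slow_api_call, handle_requests and timed_run are hypothetical names, and the sleep merely simulates waiting on a remote service.

```python
import asyncio
import time


async def slow_api_call(name, delay):
    """Pretend to wait on a remote service without blocking the process."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def handle_requests(names, delay):
    # All requests wait concurrently in one process instead of queueing up.
    return await asyncio.gather(*(slow_api_call(n, delay) for n in names))


def timed_run(names, delay):
    """Run all requests and return (results, elapsed seconds)."""
    start = time.monotonic()
    results = asyncio.run(handle_requests(names, delay))
    return results, time.monotonic() - start
```

Handled one at a time, four requests of 0.05 seconds each would take about 0.2 seconds; handled concurrently, the total stays close to the duration of a single request.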

Anyway, that's a lot of setup to explain what the talk was about. Marek submitted the talk to DjangoCon, and honestly I didn't expect it to get much traction because, when you're inside Django, you think like Django, and you don't really realize that this is a real problem. Most people who do realize it switch away to something else.

But to my surprise, Marek's talk was the most-voted talk! I wish I could link to the results, but of course that vote site is no longer online.

I guess I expected that would mean he'd be presenting at DjangoCon this year. So I asked him today when his talk was, and he said "Oh that's right. I did not get accepted."

Well, that was a surprise. Of course, the organising committee reserves the right to decide on their own - maybe they just didn't like the talk. But if you ask your potential visitors to vote, you'd expect the most-voted talk to make it onto the schedule, no?

The feedback Marek got from them was surprising too, though. Their first response was that this talk was too similar to another talk, titled "How to combine JavaScript & Django in a smart way". Now, I'm not a JavaScript expert, but from the title alone I can already tell that it's very unlikely that these two talks have many similarities beyond the word 'Django'.

After Marek refuted that point, their second reason was that they wanted more experienced speakers (but they didn't ask Marek about his experience), and their third reason was that the talk had already been given at previous editions of DjangoCon US/EU. It's unclear whether they meant his talk or the JavaScript one, but Marek's definitely hadn't been, and we couldn't find any mention of the other talk at previous conferences. I'm also not sure why that even matters one way or the other. (This email thread was in Polish, so I have to rely on Marek's interpretation of it.)

Personally, my reaction would have been to complain to the organizers or Django maintainers. Marek's phlegmatic attitude was much better though - after such an exchange, he simply doesn't want to have anything to do with the conference.

He's probably right - it's hard to argue with someone who doesn't want to invite you and is lying about the reasons.

The second example is BCNDevCon, a great conference here in Barcelona, organized by a guy who used to work for Flumotion and for whom I have enormous respect. I've never seen anyone create such a big conference in so little time.

He believes strongly in the democratic aspect, and as far as I can tell constructs the schedule solely based on the votes.

Sadly I didn't go to the last one, and the reason is simply that I felt that the talks that made it were too obviously corporate. A lot of talks were about Microsoft products, and you could tell that they won votes because people's coworkers voted on talks. I'm not saying that's necessarily wrong - given that he worked at our company and has friends here, I'm sure people working here presenting at his conference have also done vote tending. It's natural to do so. But there should be a way to balance that out.

I think the idea of voting is good, but implementation matters too. Ideally, you would only want people who are actually going to show up to vote. I have no idea how you can ensure that, though. Do you ask people to pre-pay? Do you ask them to commit to pay if at least 50% of their votes make it into the final schedule, Kickstarter-style?

These two examples are on opposite extremes of voting. One conference simply disregards completely what people vote on. If I had voted or bought a ticket, I would feel lied to. Why waste the time of so many people? The other conference puts so much stock in the vote, that I feel the final result was strongly affected. I seriously doubt all those Windows 8 voters actually showed up.

Does anyone have good experiences with conference voting that did work? Feel free to share!

Comments (4)

measuring puppet

Filed under: puppet — Thomas @ 20:58

2013-01-24
20:58

For one of work's projects, we'll soon be working on scaling our platform more, which will require deploying a bunch more machines. For this project, we basically have a local dev platform, an online dev platform, a preproduction platform, and a production platform.

Right now, all these platforms have exactly one host. There are two puppetmasters, one for the dev platforms and one for the pre/pro platforms.

Since deploying a bunch more machines is going to require a lot more puppet running, I want to work on removing as much friction as I can from my puppet work. I do the runs manually as we upgrade platforms during deployment, and a run typically takes well over a minute. For me, that's too long - it causes me to waste time, lose focus, task switch, and forget I should be following up on puppet runs. It makes fine-tuning puppet modules a chore as I hack on them.

So I wanted to start by trimming some of the obvious fat before I segment my puppet config into separately testable pieces. I would have expected puppet apply to actually have something to help with that, but it doesn't.

After thinking it through, I realized I wanted some kind of tool that would timestamp the output of puppet apply --debug, so I could see which of the things it does take more time than others.

I wasn't sure what to google for, but timestamp stdout seemed to bring up some results, and I hit on http://joeyh.name/code/moreutils/ which includes a tool called 'ts', a simple pipe filter that timestamps lines going to stdout.

That was almost good enough. What I really wanted though was to know how much time elapsed since printing the last line. My Perl is rusty, but I managed to quickly cook up a patch that makes it print incremental timestamps.

Now I can do a puppet run like this:
puppet apply --modulepath=`pwd`/modules:`pwd`/dev/modules manifests/site.pp --debug | ts -i "%H:%M:%.S" | egrep -B 1 "^00:00:0[^0]"

00:00:00.001066 debug: Executing 'test -e ../commit && ( test xorigin/master == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/master^0 | head -n 1` == `cat ../commit` ) )'
00:00:02.646908 debug: Service[postfix](provider=redhat): Executing '/sbin/service postfix status'
--
00:00:00.000987 debug: Executing 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:01.871258 debug: Exec[git-checkout-/var/www/partner-test](provider=posix): Executing check 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:00.000942 debug: Executing 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:01.886606 debug: Prefetching aliases resources for mailalias
--
00:00:00.000957 debug: Executing '/usr/sbin/semanage fcontext -l | grep -q '^/home/git/dev(/.*)?''
00:00:01.750281 debug: /Stage[main]/Dexter::Apache/Selinux::Set_fcontext[home-httpd]/Exec[semanage-/home/git/dev-httpd_sys_content_t]/unless: /usr/sbin/semanage: Broken pipe
--
00:00:00.000855 debug: Executing 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:02.064475 debug: /Schedule[puppet]: Skipping device resources because running on a host
--
00:00:00.001048 debug: Executing '/usr/sbin/semanage fcontext -l | grep -q '^/srv/merchant(/.*)?''
00:00:01.750129 debug: /Stage[main]/Partner::Install/Selinux::Set_fcontext[srv-merchant-httpd]/Exec[semanage-/srv/merchant-httpd_sys_content_t]/unless: /usr/sbin/semanage: Broken pipe
--
00:00:00.000861 debug: Executing 'test -e ../commit && ( test xmaster == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify master^0 | head -n 1` == `cat ../commit` ) )'
00:00:01.841316 debug: Exec[git-checkout-/var/www/merchant](provider=posix): Executing check 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:00.000955 debug: Executing 'test -e ../commit && ( test xorigin/release-1.4.x == `cat ../commit` || ( git fetch -a; test `git rev-parse --verify origin/release-1.4.x^0 | head -n 1` == `cat ../commit` ) )'
00:00:01.858206 debug: Service[httpd](provider=redhat): Executing '/sbin/service httpd status'

Some explanation is in order.

The puppet apply is straightforward if you know puppet a little - it will apply a manifest and spit out a lot of debug info.

The output gets piped into ts, which will do incremental timestamping (with -i, which is what my patch adds) according to the specified format (ts by default uses seconds precision, but can do microsecond precision if you use %.S in the format).
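The incremental timestamping itself is a small idea, and it can be sketched in a few lines of Python. This is only an illustration of the technique, not the Perl patch to ts; format_elapsed and stamp_incremental are made-up names, and the clock is passed in so the behaviour can be checked without real delays.

```python
import time


def format_elapsed(seconds):
    """Format a duration as HH:MM:SS.ffffff, similar to "%H:%M:%.S"."""
    total_micros = int(round(seconds * 1_000_000))
    secs, micros = divmod(total_micros, 1_000_000)
    mins, secs = divmod(secs, 60)
    hours, mins = divmod(mins, 60)
    return f"{hours:02d}:{mins:02d}:{secs:02d}.{micros:06d}"


def stamp_incremental(lines, clock=time.monotonic):
    """Prefix each line with the time elapsed since the previous line."""
    last = clock()
    for line in lines:
        now = clock()
        yield f"{format_elapsed(now - last)} {line}"
        last = now


# A fake clock so the example is deterministic: readings at 0.0, 0.5, 2.5 s.
fake_clock = iter([0.0, 0.5, 2.5]).__next__
stamped = list(stamp_incremental(["first", "second"], clock=fake_clock))
```

To use it as a pipe filter, feed it the lines of sys.stdin with the trailing newline stripped and print each result as it is produced.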

Then I grep for all lines that took at least 1 second to be output, and display the line before each of them too: puppet generates output either before or after a possibly long-running task, so the culprit is either on the line before or on the line that took too long. (Strictly speaking, the pattern matches delays from 1 to just under 10 seconds.)
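The same selection can be sketched in Python by comparing numbers instead of matching a pattern on the timestamp text, which also catches steps of 10 seconds or more. The name slow_steps and the (elapsed, text) tuple format are assumptions made for this example, and the sample entries are abbreviated from the kind of output puppet prints.

```python
def slow_steps(entries, threshold=1.0):
    """Yield (previous_text, text) for every line that took >= threshold.

    entries is an iterable of (elapsed_seconds, text) pairs, where
    elapsed_seconds is the time since the previous line was printed.
    """
    previous = None
    for elapsed, text in entries:
        if elapsed >= threshold:
            yield previous, text
        previous = text


# Made-up sample data in the shape of the timestamped debug output.
entries = [
    (0.001, "Executing 'git fetch ...'"),
    (2.6, "Service[postfix]: Executing 'service postfix status'"),
    (0.0009, "Executing 'semanage fcontext -l ...'"),
    (12.0, "semanage: Broken pipe"),
]
pairs = list(slow_steps(entries))
```

Each pair keeps the preceding line next to the slow one, mirroring what egrep -B 1 does in the pipeline.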

In the first section, I doubt service postfix status is to blame, so it's probably my convoluted git updating that takes too long. I need to rework that module so it doesn't fetch on every run.

In the third section, semanage is to blame. Hm, maybe I need to find a different way to look up whether the particular fcontext rule I want to add is already there. I've considered converting it to facts, although that sounds like it would be stretching facts a little - that's a lot of info to store in a fact.

The others are repeats of both, so I know where to start trimming the fat now!

And when all > 1 sec items are gone, time to shave off more below that.

If you want to try out ts with incremental timestamping, it's available in the rebuilt moreutils rpm in my package repositories for CentOS 6 and F16/17/18.

If any puppetmaster (hah!) has good tips on how to debug and measure the catalog generation step (the one on the master), let me know!

Comments (6)

Present Perfect
