Present Perfect

Organizing photo libraries

Filed under: Life,Pictures — Thomas @ 12:51

2013-05-18

The weather's picking up so it's time for spring cleaning around the house. When I moved back to Barcelona three years ago I took with me my old analogue photos and negatives, with the idea of sorting through them at some point and getting them digitized. And while I'm at it, maybe it's time to pull all my various folders of photos together too and organize them.

Well, I finally started. I grouped the negatives, labeled them by year, put them in individual envelopes, and handed them off to a professional lab for scanning after doing a quick test run on one set. (The test turned out great, but it's *really* annoying me that they scan to JPEG by default, charge 40% extra for TIFF, and scan at a resolution that isn't a multiple of 8, which means I can't losslessly rotate the scanned negatives. Yes, I'm anal.)

So now I've pulled together all my various folders of photos, and before I start tagging and stuff like that, I want to organize them in a decent folder layout. Googling for ideas pretty much suggests that the way to go is

YYYY/MM/DD

possibly with a short description alongside the DD.
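As a rough sketch of that layout in Python (the function name `photo_dir` and the example description are made up for illustration; nothing here comes from a particular photo tool):

```python
from datetime import date
from pathlib import PurePosixPath


def photo_dir(taken, description=""):
    """Build a YYYY/MM/DD directory, optionally with a description after the DD."""
    day = f"{taken.day:02d}"
    if description:
        # Append the description to the day component, e.g. "18-spring-cleaning".
        day = f"{day}-{description}"
    return PurePosixPath(f"{taken.year:04d}", f"{taken.month:02d}", day)


print(photo_dir(date(2013, 5, 18)))
print(photo_dir(date(2013, 5, 18), "spring-cleaning"))
```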

I'm not really happy about that, however, because there are certain things I'd like to be able to do:

  • easily see where photos come from - did I make them ? did I get them from someone ? Did I download them from Facebook ?
  • Are these original files from a camera without editing ?
  • Are these the original scans ? From negatives ? From actual photos ? Or are they retouched, rotated, denoised, ...
  • Are these photos SFW ? Can I point my media center slideshow to this directory and have it safely show any photos under it ? (What do you mean, you've never snowboarded at night in only your underwear, and mooning the photographer ?) Or maybe not even SFW, but simply watchable and reasonable quality or subject material?

I realize some of these issues can not be resolved simply with a directory layout. But I'm sure some of you must have had similar issues or come up with a slightly better layout ?

Point me in the right direction please.

morituri and Hidden Track One Audio

Filed under: morituri,Music,Python — Thomas @ 21:08

2013-05-10

I have tomorrow (Saturday) blocked out for a whole day of morituri hacking as I will be home alone.

One of the things a lot of morituri users are puzzled by is its relentless drive to extract every single sample of audio from the CD. Currently it does so even for a really short pre-gap that is most likely just an inaccurate master or burn, with no useful audio in it.

For me, that was a design goal of morituri - I want to be able to exactly reproduce a CD as is. That is to say, ripping a CD should extract *all* audio from the CD, and it should be possible to make a copy of that CD and then rip that copy, and end up with exactly the same result as from the original CD. (I'm sure there's a fancy scientific term for that that I can't remember right now)

To a lot of other people, it seems to be annoying and they don't like having those small almost empty files lying around.

So I thought I'd do something about that, and that it might be useful as well to analyze my current collection of tracks and figure out what's in there. Maybe I can find some hidden gems that I hadn't noticed before?

So I added a quick task to morituri that calculates the maximum sample value (I didn't want to use my own level element in GStreamer for this as I wanted to make sure it was actual digital zero; this should be done in an element instead though, but I preferred the five minute hack for this one).
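The idea behind that check can be sketched in a few lines of Python. This is not morituri's actual task: the function name `max_sample` is hypothetical, and the sketch assumes the audio has already been decoded to raw signed 16-bit little-endian PCM, which is an assumption made here for illustration.

```python
import array
import sys


def max_sample(pcm_bytes):
    """Return the largest absolute sample value in signed 16-bit PCM data."""
    samples = array.array("h")  # "h" is a signed 16-bit integer
    # Ignore a trailing odd byte, which cannot form a complete sample.
    samples.frombytes(pcm_bytes[: len(pcm_bytes) // 2 * 2])
    if sys.byteorder == "big":
        samples.byteswap()  # the input is assumed to be little-endian
    return max((abs(s) for s in samples), default=0)


silence = bytes(8)  # four samples of digital zero
quiet = (0).to_bytes(2, "little", signed=True) + (-3).to_bytes(2, "little", signed=True)
print(max_sample(silence))
print(max_sample(quiet))
```

A track counts as pure digital silence only when the result is exactly 0; any non-zero value, however small, means there is something in there.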

And then I ran:

rip debug maxsample /mnt/nas/media/audio/rip/morituri/own/album/*/00*flac

Sadly, that turned up 0 as the biggest sample for all these tracks!

Wait, what? I spent all that time on getting those secret tracks ripped just to get none? That's not possible! I know some of those tracks!

Maybe the algorithm is wrong. Nope, it works fine on all the regular tracks.

Oh, crap. Maybe morituri has been ripping silence all this time because my CD drive can't get that data off. Yikes, that would be a bit of egg on my face.

No, it works if I check that Bloc Party track I know about.

Ten minutes of staring at the screen to realize that, while I was outputting names from a variable from the for loop over my arguments, the track I was actually passing to the task was always the first one. Duh. Problem solved.
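For the curious, the mistake had roughly this shape. This is a reconstruction for illustration, not the real morituri code; the function names and the sample values are invented.

```python
def process_buggy(paths, task):
    """Report each path, but accidentally hand the first one to the task every time."""
    results = {}
    for path in paths:
        results[path] = task(paths[0])  # bug: should be task(path)
    return results


def process_fixed(paths, task):
    """Report each path with the result for that same path."""
    results = {}
    for path in paths:
        results[path] = task(path)
    return results


levels = {"a.flac": 0, "b.flac": 1200}  # hypothetical max-sample values
print(process_buggy(list(levels), levels.get))
print(process_fixed(list(levels), levels.get))
```

The buggy version prints the right names next to the first file's result, which is exactly why the output looked plausible.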

As for what I found in my collection:

  • a cute radio jingle that brought back memories from a live bootleg I had made myself of Bloem. That's from over ten years ago, but that must have been around the time I learned about the existence of HTOA and wanted to get one in
  • found unknown HTOA tracks on Art Brut's Bang Bang Rock & Roll, Mew's Half the world is watching me; not their best stuff
  • soundscapey or stagesetting tracks on QOTSA's Songs for the Deaf, Motorpsycho's Angels and Daemons at Play and Blissard; not that worth it (the Blissard track was ok, but really quiet)
  • Pulp hid a single piano chord in a 2 second pre-gap on This is Hardcore; very curious. It's not an intro to the first track, because it doesn't fit with the sound at all.
  • Damien Rice hid a demo version of 9 Crimes (the first track) in the pregap; instead of piano and female vocals, he plays guitar and sings all the parts.
  • Got reacquainted with my favourite HTOA tracks: the orchestral quasi-wordless medley on the Luke Haines/Das Capital disc; the first Bloc Party album with a beautiful instrumental (up there with the hidden track at the end of Placebo's first album; both bands delivering an atypical but stunning moodscape); the beautiful cover of Ben Kenobi's Theme by Arab Strap on the Cherubs EP (no idea why that landed in my album dir, that needs to be fixed); the silly Soulwax skit for their second album.

Of course, Wikipedia has the last word on everything

I note that they think Pulp recorded a cymbal, not a piano. And now that I see the title of the QOTSA hidden track, I get the joke I think.

In total, in my album collection of 1564 full CDs, I have 171 HTOAs ripped, of which 138 are tracks of pure digital silence, and only about 11 are actually useful tracks.

I expected to find more gems in my collection. I'll go through EPs, singles and compilations next just to be sure.

But with this code in hand, maybe it's time to add something to morituri to save the silent HTOA tracks as pure .cue information.

Votes for talks at open source conferences

Filed under: Conference,Python — Thomas @ 12:53

2013-05-07

I've never been a fan of voting for talks, because it tends to be poorly implemented under the guise of democracy. Of course it's easy for me to talk, I've never organized anything at that scale.

I'll give two examples of why I feel this way, one of which triggered today's blog post.

First off, my colleague Marek submitted a talk to DjangoCon. The talk was about how to use feat (a toolkit we wrote for live transcoding) to serve Django pages, but in such a way that they can use Deferreds to remove the concurrency bottleneck of "1 request at a time" per process running Django.

To me, this is one of the most irritating design choices of Django: from the ground up it was built synchronously (which could have been fine in most places). But the fact that, when you get a request, you always have to respond to it synchronously (and block every other request for that process in the meantime) is a design choice that could easily have been avoided.

In our particular use case, it was really painful. If our website has to do an API request to some other service we don't control that can easily take 30 seconds, our process throughput suddenly becomes 2 pages per minute. All the while, the server is sitting there waiting.

Yes, you can throw RAM at the problem and start 30 times more processes; or thread out API requests; or farm it out to Celery, and do some back-and-forthing to see when the call's done. Or do any other number of workarounds for a fundamental design choice.

Since we like Twisted, we preferred to throw Twisted at the problem, and ended up with something that worked.
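To make the bottleneck concrete, here is a minimal sketch. It uses neither feat nor Twisted; it swaps in asyncio from the Python standard library as a stand-in for the Deferred-based approach, and the function names and the 0.05 second delay (instead of 30 seconds) are made up for illustration.

```python
import asyncio
import time


async def slow_api_call(delay):
    """Stand-in for a request to an external service we don't control."""
    await asyncio.sleep(delay)
    return "done"


async def serve_one_at_a_time(n, delay):
    # Each page waits for the previous one to finish, as in a blocking worker.
    return [await slow_api_call(delay) for _ in range(n)]


async def serve_concurrently(n, delay):
    # All pages wait on their API calls at the same time in one process.
    return await asyncio.gather(*(slow_api_call(delay) for _ in range(n)))


def timed(coro):
    """Run a coroutine to completion and return the elapsed wall-clock time."""
    start = time.monotonic()
    asyncio.run(coro)
    return time.monotonic() - start


sequential = timed(serve_one_at_a_time(5, 0.05))
concurrent = timed(serve_concurrently(5, 0.05))
print(f"one at a time: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

The one-at-a-time version takes roughly the sum of all the waits, while the concurrent version takes roughly as long as the single slowest call, without adding processes.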

Anyway, that's a lot of setup to explain what the talk was about. Marek submitted the talk to DjangoCon, and honestly I didn't expect it to get much traction because, when you're inside Django, you think like Django, and you don't really realize that this is a real problem. Most people who do realize it switch away to something else.

But to my surprise, Marek's talk was the most-voted talk! I wish I could link to the results, but of course that vote site is no longer online.

I guess I expected that would mean he'd be presenting at DjangoCon this year. So I asked him today when his talk was, and he said "Oh that's right. I did not get accepted."

Well, that was a surprise. Of course, the organising committee reserves the right to decide on their own - maybe they just didn't like the talk. But if you ask your potential visitors to vote, you'd expect the most-voted talk to make it on the schedule no ?

The feedback Marek got from them was surprising too, though. Their first response was that this talk was too similar to another talk, titled "How to combine JavaScript & Django in a smart way". Now, I'm not a JavaScript expert, but from the title alone I can already tell that it's very unlikely that these two talks have many similarities beyond the word 'Django'.

After Marek refuted that point, their second reason was that they wanted more experienced speakers (but they didn't ask Marek for his experience), and their third reason was that the talk had been given at previous editions of DjangoCon US/EU. (It's unclear whether they meant his talk or the JavaScript one, but Marek's definitely wasn't, and we couldn't find any mention of the other talk in previous conferences. I'm also not sure why that even matters one way or the other. This email thread was in Polish, so I have to rely on Marek's interpretation of it.)

Personally, my reaction would have been to complain to the organizers or Django maintainers. Marek's phlegmatic attitude was much better though: after such an exchange, he simply doesn't want to have anything to do with the conference.

He's probably right - it's hard to argue with someone who doesn't want to invite you and is lying about the reasons.

The second example is BCNDevCon, a great conference here in Barcelona, organized by a guy who used to work for Flumotion who I have enormous respect for. I've never seen anyone create such a big conference over so little time.

He believes strongly in the democratic aspect, and as far as I can tell constructs the schedule solely based on the votes.

Sadly I didn't go to the last one, simply because I felt that the talks that made it were too obviously corporate. A lot of talks were about Microsoft products, and you could tell that they won votes because people's coworkers voted on talks. I'm not saying that's necessarily wrong: given that he worked at our company and has friends here, I'm sure people working here presenting at his conference have also done vote tending. It's natural to do so. But there should be a way to balance that out.

I think the idea of voting is good, but implementation matters too. Ideally, you would only want people that actually are going to show up to vote. I have no idea how you can ensure that, though. Do you ask people to pre-pay ? Do you ask them to commit to pay if at least 50% of their votes make it in the final schedule, kickstarter-style ?

These two examples are on opposite extremes of voting. One conference simply disregards completely what people vote on. If I had voted or bought a ticket, I would feel lied to. Why waste the time of so many people? The other conference puts so much stock in the vote, that I feel the final result was strongly affected. I seriously doubt all those Windows 8 voters actually showed up.

Does anyone have good experiences with conference voting that did work? Feel free to share!

If I was 16 years younger…

Filed under: General — Thomas @ 22:30

2013-05-03

I'd totally try and be the intern for pinboard.

The money is great for a summer job, but that's not the important part. pinboard seems interesting, it's a real service, and it's (I assume) small enough to understand from top to bottom. Contrary to, say, a Google Summer of Code project, you get to touch a real existing service, and from what I can tell from the blog you get to do it with a smart and funny guy.

You've got five weeks left; even if you're in the middle of exams right now, apply!

(And if you do, why not add the features to merge and rename tags while you're at it?)
