
Present Perfect

The Art of the Rip

Filed under: DAD,Hacking,Python — Thomas @ 10:40

2009-05-03

While I'm working on the ripping software, I find myself going back and forth between various references to figure out the small details and the pieces that subtly get interpreted differently between them. As is the case with other projects, I can easily see myself forgetting about these details soon, and cursing myself a year from now for not having written down my clear understanding of today.

So, in an effort to appease my future self, I've started writing down a condensed form of the important information I've come across on a wiki page.

On that page, I'm also comparing various ripping programs and how they handle the various details I consider important for correct ripping. I'll use that information and that chart as the basis for the features of my ripping program.

I'm trying to stay as objective as possible on that page, so feel free to tell me about mistakes, omissions, or software I should add.

By now, I have a good set of goals for my ripping program:

  • lossless ripping
  • accuracy is the number one goal
  • speed is always second to accuracy
  • hands-off one-click/command ripping
  • separate ripping from metadata fixing
  • rip hidden track one audio automatically

With this in mind, I thought yesterday about how I could figure out the drive's read offset the way EAC does it. I've come up with a simple program that:

  • checks whether the current CD is in the AccurateRip database
  • if it is, rips the first track with various offsets
  • if any of the AccurateRip checksums match, reports that offset as most likely the read offset for your drive
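The steps above can be sketched roughly as follows. This is a minimal sketch, not the actual ARcalibrate.py: `find_read_offset`, `rip_checksum` and `known_checksums` are hypothetical names, and the ripping step is replaced by a lookup table holding the checksums printed in a run against "Fur and Gold". Unlike the real program, the sketch stops at the first match instead of finishing the scan.

```python
from typing import Callable, Iterable, Optional, Set


def find_read_offset(
    rip_checksum: Callable[[int], int],
    known_checksums: Set[int],
    offsets: Iterable[int],
) -> Optional[int]:
    """Return the first offset whose track 1 checksum is in the database.

    rip_checksum is a hypothetical callback that rips track 1 with the
    given read offset and returns its AccurateRip checksum.
    """
    for offset in offsets:
        if rip_checksum(offset) in known_checksums:
            return offset
    # No match: the offset is outside the scanned range, or this
    # pressing of the disc is not in the database.
    return None


# Stand-in for real ripping: checksums from one run on a +48 drive.
SAMPLE_RUN = {
    46: 0xB880421E,
    47: 0x4A29A173,
    48: 0x903B390E,
    49: 0xE7C008F1,
}

# Only the database response that matched in that run is known here.
KNOWN = {0x903B390E}

detected = find_read_offset(SAMPLE_RUN.__getitem__, KNOWN, range(46, 50))
print("offset of device is", detected)
```

Passing the ripping step in as a callback keeps the search logic testable without a drive attached.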

It took longer to test the program than to write it, since my AccurateRip checksum calculation is currently done purely in Python and thus rather slow.
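To give an idea of why a pure-Python calculation is slow, here is a sketch of what such a loop might look like. It assumes the commonly described first version of the AccurateRip checksum: each stereo sample is treated as one unsigned 32-bit little-endian value, multiplied by its 1-based position in the track, and summed modulo 2^32, with roughly the first five frames of the first track and the last five frames of the last track left out. None of these details come from this post, and `ar_checksum_v1` is a hypothetical name, so treat it as an illustration rather than a reference implementation.

```python
import struct

SAMPLES_PER_FRAME = 588  # stereo samples in one CD frame (1/75 second)


def ar_checksum_v1(pcm: bytes, is_first: bool = False, is_last: bool = False) -> int:
    """Position-weighted checksum over raw 16-bit stereo PCM track data.

    Assumption: this follows public descriptions of the first-version
    AccurateRip checksum; it has not been checked against the database.
    """
    total = len(pcm) // 4  # 4 bytes per stereo sample
    values = struct.unpack("<%dI" % total, pcm[: total * 4])

    # Assumed boundary rules: skip everything before position 5 * 588
    # on the first track, and the final 5 frames on the last track.
    first = 5 * SAMPLES_PER_FRAME if is_first else 1
    last = total - 5 * SAMPLES_PER_FRAME if is_last else total

    checksum = 0
    for position, value in enumerate(values, start=1):
        if first <= position <= last:
            checksum = (checksum + position * value) & 0xFFFFFFFF
    return checksum


# Three tiny fake samples: 1*1 + 2*2 + 3*3
tiny = struct.pack("<3I", 1, 2, 3)
print("%08x" % ar_checksum_v1(tiny))
```

A three-minute track holds close to eight million samples, so a per-sample Python loop like this one adds up quickly when it is repeated for every candidate offset.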

In any case, using Bat For Lashes' "Fur and Gold":

[gst-git] [thomas@ana trunk]$ PYTHONPATH=$PYTHONPATH:`pwd` python examples/ARcalibrate.py
CDDB disc id 8a0aa10b
AccurateRip URL http://www.accuraterip.com/accuraterip/9/f/f/dBAR-011-00112ff9-00976269-8a0aa10b.bin
4 AccurateRip reponses found
ripping track 1 with offset 46
AR checksum calculated: b880421e
ripping track 1 with offset 47
AR checksum calculated: 4a29a173
ripping track 1 with offset 48
AR checksum calculated: 903b390e
MATCHED against response 3
offset of device is 48
ripping track 1 with offset 49
AR checksum calculated: e7c008f1
[gst-git] [thomas@ana trunk]$
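The AccurateRip URL in that output appears to follow a simple pattern: the track count as three decimal digits, then three 8-digit hex ids, with the directory path taken from the last three hex digits of the first id in reverse order. The sketch below rebuilds the URL under that assumption. The pattern is inferred from this single URL, and the names `accuraterip_url`, `disc_id1` and `disc_id2` are hypothetical, so it may not hold in general.

```python
def accuraterip_url(track_count: int, disc_id1: int, disc_id2: int, cddb_id: int) -> str:
    """Build an AccurateRip response URL from the disc's ids.

    Assumption: the layout is inferred from one example URL, not from
    any AccurateRip documentation.
    """
    id1 = "%08x" % disc_id1
    return (
        "http://www.accuraterip.com/accuraterip/%s/%s/%s/dBAR-%03d-%s-%08x-%08x.bin"
        % (id1[-1], id1[-2], id1[-3], track_count, id1, disc_id2, cddb_id)
    )


# Values taken from the output above: 11 tracks, CDDB disc id 8a0aa10b.
url = accuraterip_url(11, 0x00112FF9, 0x00976269, 0x8A0AA10B)
print(url)
```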

I made the program scan from 46 to 49, knowing that my drive has a +48 read offset. Next I'm going to add an option to choose the range and an option to start with the most common offsets, and I'll think about using online databases of drive features to start with the offset most likely to be correct for your drive.
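One way the range and "most common first" options could fit together is sketched below. `candidate_offsets` is a hypothetical helper, and the preferred list is only a placeholder: +48 is my drive's offset, while +6 is there purely for illustration and is not taken from any database.

```python
from typing import Iterable, Iterator, Sequence


def candidate_offsets(
    scan_range: Iterable[int], preferred: Sequence[int] = ()
) -> Iterator[int]:
    """Yield offsets to try: preferred ones first, then the rest in order.

    Preferred offsets outside scan_range are ignored, and no offset is
    yielded twice.
    """
    allowed = list(scan_range)
    allowed_set = set(allowed)
    seen = set()
    for offset in list(preferred) + allowed:
        if offset in allowed_set and offset not in seen:
            seen.add(offset)
            yield offset


# Placeholder preference list, for illustration only.
order = list(candidate_offsets(range(46, 50), preferred=[48, 6]))
print(order)  # [48, 46, 47, 49]
```

Because each candidate costs a full rip of the first track, trying the likely offsets first matters more than the size of the range itself.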

13 Comments »

  1. Isn’t there a way to copy the entire CD and annotate somewhere where the tracks start/end?

    Wouldn’t that be less error-prone and future-proof? Once you have the whole CD, you can fix track offset at a later stage.

    Comment by Pla — 2009-05-03 @ 11:48

  2. @Pla: if you rip with offset 0, and you rip only ‘just enough’, you will not have ripped the final few samples. It would probably be possible if your drive can overread and you put some upper limit on how much to read extra, but it would add a lot of complexity. It’s much simpler to know your read offset from the start and have your rips come out clean without further processing needed.

    Comment by Thomas — 2009-05-03 @ 11:56

  3. If accurate ripping is your number one goal and you are willing to sacrifice speed for it, then I very much recommend Rubyripper: http://code.google.com/p/rubyripper/

    Rubyripper uses cdparanoia in an innovative way, which makes it a very efficient ripper, but it’s quite slow. It helped me rip some damaged CDs which were quite difficult for EAC.

    Comment by KM — 2009-05-03 @ 12:29

  4. @KM: See https://thomas.apestaart.org/thomas/trac/wiki/DAD/Rip#Comparison

    Rubyripper really only adds test and verify to the mix, although it looks like they have a clever way of doing it using cdparanoia. But it doesn’t handle gaps at all afaict, and you have to configure read offset manually.

    Comment by Thomas — 2009-05-03 @ 12:39

  5. You are not alone in caring about such “neurotic analysis”!

    Your goal section is spot on – thank you.

    Personally I’d add the ability to replay the ripped CD (including pre-gaps or gapless, hidden tracks etc.) ‘as the artist intended’ to your goals. i.e. that it should also be possible to play back a ripped CD such that it’s identical to the original CD, not just the underlying data that’s identical (I suspect you intend this, though I’d imagine this requires support at a different level… in the playback tools).

    I’ve always been perplexed as to why people are only interested in playback of tracks, not albums.

    Is gapless a zero length Index 00, or a missing Index 00?

    Comment by kp — 2009-05-03 @ 14:47

  6. Future Thomas will thank you for it.

    (http://www.thepaincomics.com/weekly090218.htm)

    Comment by Paul Collins — 2009-05-04 @ 00:57

  7. Instead of freedb you can also use MusicBrainz (http://musicbrainz.org/); it uses the whole TOC to identify the CD and has more accurate information on it.

    Comment by Kurt Roeckx — 2009-05-04 @ 22:27

  8. multidrive usage would be cool. i have a desktop with two internal cd drives, and i also have an external USB drive. my ideal workflow would be for a drive to open, for me to put in a disk and close the drive, and for it to rip and then eject when done. so every few mins a drive would spit out a disk and i could stick in the new one.

    i think my cpu could keep up. but if not the data could be cached and stuck in a queue

    Comment by ssam — 2009-05-05 @ 01:50

  9. Does your example program rip the entire track each time, with different offsets? Isn’t it possible to read enough samples to cover the range you’re interested in, and then just calculate the checksums starting at a different point?

    Comment by Jan Schmidt — 2009-05-05 @ 14:13

  10. @Jan: yes, that’s something I want to do in the future to optimize it, but only after moving the AccurateRip calculation to pure C.

    Comment by Thomas — 2009-05-05 @ 14:17

  11. Check this out, interesting findings about drive ripping performance:
    http://www.hydrogenaudio.org/forums/index.php?showtopic=71597

    Some kind of multi-session ripping which would compare insecure rips of the same cd with different drives and if there’s a match, combine them to one secure rip, as suggested in post #7, could be useful.

    Comment by KM — 2009-05-07 @ 23:59

  12. @KM: that does look interesting. If I run into a disc that I can’t rip, I’ll try that out and see if I can work it in.

    Comment by Thomas — 2009-05-08 @ 09:47

  13. […] there is no concrete goal set, I’m well on my way. I have various projects going on, from writing a correct ripper (which Linux still lacks) to getting Lego Mindstorms sets to create a CD ripping robot to reviving […]

    Pingback by thomas.apestaart.org » New Year’s Resolutions — 2009-08-16 @ 12:26
