
Present Perfect

ripping CD’s

Filed under: Music — Thomas @ 01:01

2007-08-24
01:01

So, before I shell out cold hard cash for new drives, I thought I should at least first figure out how I'm going to actually rip all my CDs.

If the goal of ripping my CDs is to have a perfect copy on disk, then certain logical conclusions present themselves.

  • I should rip a CD completely as one file; this avoids problems with CDs that have gapless transitions (for example, live CDs). Besides gaps between tracks due to encoding (a lot of lossy compression methods have a windowing algorithm which causes them to actually decode some extra leading and trailing samples), there's also the fact that CD tracks are made up of sectors of 2352 audio bytes, or 588 16-bit stereo samples (at 75 sectors per second). Any audio file that's not an exact multiple of the sector size will not play back gaplessly from a CD, because the last sector gets zero-padded. (Arguably, separate files would probably be fine anyway, given that I would expect a rip from CD to have an exact sector multiple of samples.)
  • Another reason to rip as one file is that I've come to believe that a good music management system should actually separate the concept of "song" from the concept of "medium". A lot of "sound files" actually contain more than one song (think hidden track with lots of silence at the end of a regular CD track), and some sound files contain no songs at all (think spacers between songs and the hidden track). If you don't know what I'm talking about then you haven't been sufficiently pissed off yet at how "All Apologies" comes with this incredibly long stretch of silence before a crap bonus track.
  • So if you're going to separate the concepts of "audio file" and "song" anyway, why not go the whole nine yards and rip the whole CD as one track, which also lets you re-burn a CD from it?
  • "Songs" can then be defined in the management system as a start and end position in that file, defaulting to the actual cue points that the CD knows about.
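To make the sector arithmetic and the "song as a range" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Song` class, `songs_from_cue_points` and `padding_samples` are hypothetical names, not part of any existing tool, and the cue points are made-up sector offsets.

```python
from dataclasses import dataclass

SECTORS_PER_SECOND = 75       # CD audio sector rate
BYTES_PER_SECTOR = 2352       # audio bytes in one sector
BYTES_PER_SAMPLE = 4          # 16-bit stereo: 2 bytes x 2 channels
SAMPLES_PER_SECTOR = BYTES_PER_SECTOR // BYTES_PER_SAMPLE  # 588


@dataclass(frozen=True)
class Song:
    """Hypothetical model: a song is a sector range inside one whole-disc rip."""
    title: str
    start_sector: int  # inclusive
    end_sector: int    # exclusive

    def sample_range(self):
        """Start and end of the song as sample offsets into the big file."""
        return (self.start_sector * SAMPLES_PER_SECTOR,
                self.end_sector * SAMPLES_PER_SECTOR)

    def seconds(self):
        """Duration of the song in seconds."""
        return (self.end_sector - self.start_sector) / SECTORS_PER_SECOND


def songs_from_cue_points(cue_points, disc_end, titles=None):
    """Default mapping: one song between each pair of adjacent cue points."""
    bounds = list(cue_points) + [disc_end]
    if titles is None:
        titles = [f"Track {i + 1}" for i in range(len(cue_points))]
    return [Song(t, s, e) for t, s, e in zip(titles, bounds, bounds[1:])]


def padding_samples(n_samples):
    """Zero samples needed to fill up the last sector when burning."""
    return -n_samples % SAMPLES_PER_SECTOR
```

With made-up cue points at sectors 0, 15000 and 30000 on a 45000-sector disc, `songs_from_cue_points([0, 15000, 30000], 45000)` gives three songs of 200 seconds each. A management system could later replace any of those default ranges with hand-tuned ones, for example to cut the silence out of a hidden-track file, without touching the audio data.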

Of course, it's probably not going to be that easy yet to handle music like this. I had a hard time finding any tools on Linux that would actually generate a .CUE file from a CD in my CD drive. The closest I got was using cdrdao read-toc to generate a .toc file and then converting it with cuetools. Anyone know of other alternatives?

Also, it didn't look like cdrdao deals with pre-gap tracks correctly. For example, take "Any Minute Now" by Soulwax. (I knew that Much Against Everyone's Advice had a pre-gap track, but that CD is locked up in a box in Belgium. The internet told me that this album had one too, and lo and behold, there it was in all its crappiness.) For that disc, cdrdao extracts this:

CD_DA

// Track 1
TRACK AUDIO
NO COPY
NO PRE_EMPHASIS
TWO_CHANNEL_AUDIO
ISRC "BEP010400101"
SILENCE 05:29:03
FILE "data.wav" 0 04:35:07
START 05:29:03

So it marks the pre-gap as silence. Annoying. Also, tracks can have different pre-gap sizes, and it doesn't look like the .toc format takes this into account. Does anyone know?
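For reading the numbers in that .toc excerpt: a small Python sketch, assuming the times are in the usual minutes:seconds:frames notation with 75 frames (sectors) per second. The function names are hypothetical helpers, not part of cdrdao or cuetools.

```python
FRAMES_PER_SECOND = 75  # one frame corresponds to one 2352-byte audio sector


def msf_to_frames(msf):
    """Convert an 'MM:SS:FF' time string to a frame (sector) count."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    if not (0 <= seconds < 60 and 0 <= frames < FRAMES_PER_SECOND):
        raise ValueError(f"not a valid MM:SS:FF time: {msf!r}")
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(total_frames):
    """Convert a frame (sector) count back to an 'MM:SS:FF' string."""
    seconds, frames = divmod(total_frames, FRAMES_PER_SECOND)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes:02d}:{seconds:02d}:{frames:02d}"


pregap = msf_to_frames("05:29:03")  # the SILENCE / START value above
audio = msf_to_frames("04:35:07")   # the length given on the FILE line
track_total = frames_to_msf(pregap + audio)
```

Under that assumption the pre-gap is 24678 sectors, a bit over 329 seconds of audio that the .toc file describes as silence, and the first track as a whole runs to 10:04:10.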

(Also, I had completely forgotten about ISRC codes, and I had *NO* idea they were actually recorded on CD tracks. Never too old to learn. Are FreeDB and MusicBrainz even tracking these codes? Does anyone know of an online database of these things?)

Once I know I have the tools and the format in place to make an exact copy of a CD, I can start looking at what I'll need to do to make a player support this if it doesn't yet, and maybe add GStreamer support if it's not there yet.

To sum up: what do people use on Linux to rip a CD to one big audio file plus a .cue file that can be used to make a track-for-track identical copy of a CD, including pregaps?

27 Comments »

  1. I have the same problem, but I still haven’t found the time to look into it.

    I *think* I’ve read somewhere that flac codec and file format should support this, but I haven’t seen any tools actually using it.

    Also, the Matroska container should be able to hold flac audio files with “chapter” information (I think this is mentioned somewhere in the mkvmerge man page, but I’m not sure), but I haven’t tried that either…

    If you find a solution, please let me know.

    Cheers,
    Petar

    Comment by Petar Vasić — 2007-08-24 @ 01:29

  2. MusicBrainz does not yet track ISRC codes: http://wiki.musicbrainz.org/ISRC.

    Comment by Aurélien Mino — 2007-08-24 @ 01:35

  3. Fixed link: http://wiki.musicbrainz.org/ISRC

    Comment by Aurélien Mino — 2007-08-24 @ 01:36

  4. On your first point: no, you shouldn’t rip CDs as a single track. That’s stupid because it’s hard to seek among tracks in players like Amarok, and it will mess up your statistics. What you should do is encode to Ogg Vorbis or FLAC so you get sample accuracy.

    No sane music player supports your idea of a “filesystem” within a long track, containing several songs. A song is a song is a file, period. If you’re gonna start seeking in long files, why not revert to a CD changer instead?

    I hate when people rip entire CDs as a single track, because when they share it and I download it I have myriad problems trying to listen to it / split the tracks.

    You’re trying to solve a minor technical issue which poses no problem using the right technology (cdparanoia + ogg, both of which were designed for sample accuracy). What are you doing trying to use cdrdao and MP3 in the year 2007? Cuz I have no problem re-recording tracks ripped in my suggested fashion to CD, gapless and everything.

    Please don’t try to drive a nail with a rolled-up newspaper.

    Comment by Rudd-O — 2007-08-24 @ 02:24

  5. Yes, FLAC can do this. I’ve used a tool that did it, but it was some time ago and can’t remember what it’s called…. Sorry!

    Comment by Quentin Hartman — 2007-08-24 @ 02:32

  6. @Rudd-O: you need to relax a little. First of all, the main idea is to have a perfect *storage* copy. From this storage copy I can rip individual compressed tracks just fine.

    Second, I don’t particularly care about Amarok. I’m willing to do some programming work to have this way of storing files supported in music players I use. After all, what’s the difference between playing a CD and playing the bit-for-bit copy of a CD ? A good CD player should be able to play and use both.

    Third, I don’t know what statistics you are worried about but I doubt I would care. Again, it’s just some programming work to support the separation between medium and song – I suggest you re-read what I wrote to really understand it. Again – think “All Apologies” ripped from “In Utero” to see what I mean. You’re wrong – a song is a song is *not* a file. I don’t understand what your comment about CD changers is supposed to mean.

    Fourth, I seriously don’t care about your experience with downloading big audio files – I am talking about *my* music collection and I’m not planning to offer it for download to you. As I already mentioned, one thing I’d be doing is precisely write stuff that supports and handles this kind of file, so you could benefit from that work.

    Fifth, Ogg is not designed for sample accuracy. Flac and Vorbis are. Minor nitpick, but if you are splitting hairs I might as well :) Vorbis is out of the picture – I’m only talking about lossless compression.

    Sixth, why do you bring up MP3 ? I sure didn’t.

    Seventh, your system does not reproduce the kind of disc I just gave as an example. From all your comments it looks like you just came by to rant instead of actually reading what I’m trying to accomplish. Just to spell it out for you – I want something that correctly reproduces pregaps. *Especially* hidden songs in pregaps. Explain how yours handles that.

    Comment by Thomas — 2007-08-24 @ 02:55

  7. It would be nice to have a user-space filesystem that implements a translation from cue & raw images to a standard filesystem.

    Comment by jxb — 2007-08-24 @ 03:36

  8. How about just making a clone out of it with dd?

    dd if=/dev/cdrom of=image.iso

    or similar. Couldn’t you then just mount the image (or create a plugin/script for your favourite music player that does it for you) and use it just like a normal CD in the CD-ROM drive? Is there something I’m missing here about what you want to do with it?

    Comment by zith — 2007-08-24 @ 03:53

  9. “the main idea is to have a perfect *storage* copy. From this storage copy I can rip individual compressed tracks just fine.”

    May be a stupid suggestion but…

    Why don’t you just copy the CDs as .iso files? That way you will have a perfect *storage* copy, and you can mount them using standard tools and rip the tracks as you need.

    Comment by Forbes — 2007-08-24 @ 04:20

  10. I researched this for a while and ended up deciding that what I needed to do was rip into a .flac/.cue-type format, which I could then convert to mp3s via some hand-rolled script. Then I could point my various music players to the mp3 directory instead.

    I also planned to have some third file that would have all the tagging information and would include some methodology to let me override the track information to, for example, say “instead of track 11 being from 11475-18553, make a track 11 from 11475-12511 and a track 12 from 15660-18553”.

    I actually wrote scripts to do all this before deciding it was way too much of a pain in the ass, especially since getting track info from CDDB via a single huge .flac file was now kinda a pain in the ass and I have like 1200 CDs so I didn’t want to have to spend 10 minutes on each of them.

    So now I just rip them to individual flac files :(

    Comment by no one in particular — 2007-08-24 @ 04:45

  11. Maybe this helps:
    http://www.hispalinux.es/~data/abcde.php

    Comment by Benni — 2007-08-24 @ 05:38

  12. It would be great if someone could implement DDP (http://www.dcainc.com/products/ddplicense/index.html) in a free way. DDP allows the creation of a disk image that includes the PQ points, ISRC etc. all in one file. At least one CD mastering facility has used this format to build a player which allows you to check their work before it’s been committed to the manufacturer (http://www.sterling-sound.com/docs/player/). This app plays the DDP image directly, so nuances like pregap etc. are preserved.
    I wrote to the license holders of DDP asking about whether an open source implementation would be ok, and they said that they’d get back to me, but they haven’t yet (after about 2 years). Perhaps someone more familiar with licensing than me could see if it’s allowable?
    Anyway, DDP seems perfect for your application and being able to read/write these images would come with some bonuses for other projects.
    For instance, although ardour can build a toc/cue file from markers, it would be great to be able to directly produce an image suitable for CD manufacture (although they can in reality handle properly formed CUE files).

    Comment by nick_m — 2007-08-24 @ 06:08

  13. Have you tried CueCreator ?
    http://www.extreme-alternative.org/CueCreator.html

    Comment by wannes — 2007-08-24 @ 07:55

  14. I have actually made a home-grown solution for this, using some python programming. The basic idea is that I rip the CD to an archival directory containing .bin/.cue files (from cdrdao, and I don’t think there are any problems with pregaps) and a .meta file with the vorbis tags for all the tracks in a single file. Then I can run a command to create a number of ogg or mp3 files for the individual tracks to be used by various players. Since I usually update the metadata in the individual track files, I also have a way to sync back the updated information to the archival .meta file. So in the end, I have two mirrored directory structures, one with the CD images, and one with the track files, and I have high-level scripts to sync between them.

    Let me know if you want to know more about it. It’s not a packaged solution, but almost.

    Comment by David Kågedal — 2007-08-24 @ 08:58

  15. Hi Thomas,

    I did some searching some time ago and found these scripts quite ok:
    http://www.mail-archive.com/flac@xiph.org/msg00040.html

    They will rip your CDs, get the metadata from MusicBrainz and also generate CUE sheets. You can tell them to use flac2mp3 to also generate mp3s afterwards (I don’t know if they can also generate oggs).

    But I don’t know if this will fulfil your requirements concerning pre-gaps and so on.

    – Daniel

    Comment by Daniel — 2007-08-24 @ 09:10

  16. apt-get install abcde

    In the config file, set OUTPUTFORMAT to flac, MAKECUEFILE to y and ONETRACK to y as well.
    Any CDDB-style server can be used for naming. The resulting flac + cue sheet can be fed as an input to abcde to generate mp3/ogg/…

    L & L

    p2.

    Comment by peter — 2007-08-24 @ 09:22

  17. Here is roughly what I’ve been doing (though with error handling):

    mkcue > $CUE
    LASTTRACK=`grep AUDIO $CUE | tail -n 1 | sed -e 's/.*TRACK \([0123456789]*\) AUDIO.*/\1/'`
    cdparanoia -Z -- -$LASTTRACK $WAV
    flac --cuesheet=$CUE $WAV -o $FLAC -f
    abcde -N -d $FLAC

    That makes a single FLAC per CD, with embedded track listing, then rips oggs from the FLAC files. I initially rip to WAV with the constraint being the speed of CD reading, then run FLAC conversion and ripping jobs overnight.

    If you find a better solution using free software, I’d be interested to improve what I’m doing.

    Comment by Moray — 2007-08-24 @ 12:44

  18. Exact Audio Copy (EAC) seems to be very well regarded. It runs fine under Wine. Apparently it is a no-brainer to get a single flac + cue-sheet. It is a bit more work if you want the cue-sheet embedded. I don’t have details.

    Comment by Bjørn — 2007-08-24 @ 13:08

  19. Probably not the answer you want but Wine + EAC works fine for me.

    Comment by Dennis Laumen — 2007-08-24 @ 13:44

  20. I like albums, and don’t care much for individual songs. But sometimes I need to skip to the beginning of a song (in case my mp3-player resumes right in the middle of a song). I wish there was some kind of uniform album file format that an mp3-player would understand. So instead of the concept of a medium, I’d rather see the concept of an album (a small but subtle difference).

    Comment by eelco — 2007-08-24 @ 17:31

  21. Moray’s suggestion seems very close. The only question I have: are there any players that understand multiple tracks in a single flac file? It looks like GStreamer-based players (Rhythmbox) don’t understand this.

    Comment by Raf — 2007-08-25 @ 02:21

  22. Why not use the Broadcast Wave Format? That will give you uncompressed audio and a standard to follow. Applications just need to support the spec for it. Most Linux apps have no idea what a broadcast wave is though. BWF is a subset of WAV, so most applications would play a BWF file but not be able to detect tracks from the metadata. Check the Wikipedia page for more information on it.

    Formats without support for embedded metadata are stupid and should be avoided where possible. Hey, you could even put a cover photo and lyrics in the BWF format. Proper wave players should just ignore the chunks they don’t understand.

    Comment by Rene Rask — 2007-08-25 @ 14:26

  23. @Rene: I actually feel very strongly the opposite way – a file is better off not containing metadata. As the word says, it is data about data, and should be separate from it. The biggest practical reason for this is the fact that changing the metadata means changing the file.

    Ideally, the data would be just that – the audio data, the integrity of the data would be monitored by the backend (for example, a simple md5sum), and the backend stores the metadata.

    But of course, it’s convenient to have the metadata in the file itself, so that the one file has all its metadata included, which makes it easier to copy and manipulate.

    Comment by Thomas — 2007-08-25 @ 14:51

  24. I’ve just added this kind of a Use Case to the list of use cases that get discussed for the next generation of the player API for elisa:
    https://code.fluendo.com/elisa/trac/wiki/CurrentDevelopment/PlayerNG/Usecases#playingonlypartsoffile

    Comment by Benjamin — 2007-11-07 @ 11:49

  25. I am also looking for how to create an exact bit-for-bit representation of a CD. This includes wanting to retain the error correction codes from the disk. I am planning to re-rip my CD collection (~300) now that storage technology is well past the need to store in MP3 format for my home music system. I want to use a bit-for-bit copy of the disks for 2 primary reasons.

    1) Codecs and technology change over time. I never want to have to go back and feed CDs into my computer a 3rd time. :) Having an exact copy of the disks should avoid this.
    2) CDs do not always follow the standard. Music Companies put intentional errors and hidden items.

    My plan, similar to yours, is to archive the CDs data online, then use the online data to create the music library as WAV or maybe FLAC. These days the on-line storage space for this is not an issue. I am doing this under Debian Linux.

    I am exploring using cdrdao for ripping the CDs to both .bin and .cue files. The concern here is that I am not sure I can have confidence that there are no errors. The initial test I ran tonight is not encouraging.

    Then I will look at either cdemu or a bin2iso (or something else) to convert to WAV, then to FLAC and/or MP3.

    Comment by Chip L — 2007-12-06 @ 08:34

  26. abcde is definitely your best bet; it’s the best we have on Linux. For a little added security when ripping damaged or scratched CDs, I’d recommend adding “-z -X” to your “CDPARANOIAOPTS=” in the config file. It will allow cdparanoia to keep rereading sectors that are misread, and if an accurate read is impossible, it will exit and notify you.

    Comment by VCSkier — 2007-12-14 @ 21:19

  27. Try rubyripper. It will securely rip to a single file (wav or flac or whatever) and also output the proper cuesheet and a log of the ripping. Should also grab pregaps and all that as well.

    Comment by Ceee — 2008-09-01 @ 09:41
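A note on the idea from comment 23, where the audio data stays untouched and a backend keeps the metadata next to a checksum: a minimal Python sketch of that split could look like this. The names `audio_digest`, `read_chunks` and `MetadataStore` are hypothetical, the store is just an in-memory dictionary, and md5 is used only because the comment mentions md5sum.

```python
import hashlib


def audio_digest(chunks):
    """MD5 hex digest over an iterable of byte chunks (the raw audio data)."""
    md5 = hashlib.md5()
    for chunk in chunks:
        md5.update(chunk)
    return md5.hexdigest()


def read_chunks(path, chunk_size=1 << 20):
    """Yield a file's contents in chunks so big rips need not fit in memory."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return
            yield chunk


class MetadataStore:
    """Hypothetical backend: metadata lives here, keyed by the audio digest."""

    def __init__(self):
        self._by_digest = {}

    def set(self, digest, metadata):
        # Changing the metadata never touches the audio file itself.
        self._by_digest[digest] = dict(metadata)

    def get(self, digest):
        # None means unknown audio, or audio that no longer matches its digest.
        return self._by_digest.get(digest)
```

A rip on disk would be registered with something like `store.set(audio_digest(read_chunks("disc.wav")), {"album": "..."})`, where `disc.wav` is a placeholder file name. Retagging then only changes the store, and recomputing the digest later doubles as an integrity check on the archive copy.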
