2007-02-09

When will iTunes grow up?

As I've written about previously, I'm ripping my entire CD collection to mp3. Currently I've got about 33,000 tracks in mp3 format. At home, I don't use any library or database software; I just use TiVo to browse and play back. (TiVo's browsing interface isn't great for large collections either, but that's a topic for another day.)

Ripping all these CDs has been a lot of work, so obviously I'm keeping a backup. Right now, everything fits onto my external drive (a 320GB Iomega), which I fill using Robocopy (an excellent Microsoft replacement for xcopy). A side benefit is that I can bring my entire mp3 collection into work and share it with my coworkers using iTunes.

However, I'm quickly coming to the conclusion that iTunes simply doesn't scale to tens of thousands of tracks. Some of the issues I'm running into:
  • Startup time is very slow. Is it really necessary to load the entire database at startup? My "iTunes Library.itl" file is 37MB, and my "iTunes Music Library.xml" file is 43MB. Uses gobs of memory too - over 100 MB and I'm not even playing anything. Seems like iTunes could just load metadata for the first few hundred tracks in the library, and defer loading the rest until needed - or at least do it in a low-priority background thread.
  • Shutdown time is similarly slow. Even if I shut it down right after starting up, it still needs to "save" something. Why can't it flush database writes during idle time, rather than saving them all until shutdown?
  • About half the time, iTunes crashes when I shut it down. 'Nuff said.
  • Getting album art takes forever - five or six hours with my collection. If the process is interrupted (e.g. iTunes crashes or I go home for the night) then it starts over from the top the next day. It seems to go faster on tracks that it's already got art for, but not that much faster, and I have a significant number of albums for which Apple doesn't have cover art, so those don't go any faster the second time around. At least you can cancel album art retrieval permanently.
  • Calculating gapless playback information takes forever squared, especially since iTunes likes to do it at the same time as it's getting album art. (Both processes seem to go faster if I force them to run sequentially rather than in parallel.) I really wish you could cancel this process permanently, as gapless music makes up only a fraction of my collection.
  • iTunes has no built-in way to clean up orphaned tracks. If I archive something off to DVD-R, I end up with a bunch of exclamation marks in iTunes. I've tried using iTunes Library Updater but it's amazingly slow. One simple way for iTunes to handle this would be to make the exclamation mark column sortable, so that I could delete all the orphaned tracks at once.
  • "Add Folder to Library" seems to crap out without adding the entire folder. For me, it failed around 14,000 tracks. It refused to add any more than 14,000 tracks no matter how many times I tried the feature, and what's weird is that it didn't stop after the first (alphabetically) 14,000 - it loaded a scattering of tracks from A to Z. Perhaps it was the first 14,000 sorted by time or something, but I found a workaround - drag-and-dropping the folder onto iTunes loaded everything.
Are there workarounds for any of these issues? Does iTunes scale any better on a Mac? Sharing is a "killer feature", so I'm stuck with iTunes for the time being.
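
One possible workaround for the orphaned-track problem is to look for the orphans outside iTunes. The sketch below reads "iTunes Music Library.xml" (a property list) and reports every track whose file no longer exists on disk. It assumes each track entry carries "Name" and "Location" keys, with "Location" stored as a file:// URL; the function names are made up for illustration, and it only lists orphans rather than deleting anything from the library.

```python
import plistlib
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def location_to_path(location):
    """Convert a file:// URL from the library into a local filesystem path."""
    # url2pathname also decodes percent-escapes such as %20.
    return Path(url2pathname(urlparse(location).path))


def find_orphans(library_xml):
    """Return the names of tracks whose files are missing from disk."""
    with open(library_xml, "rb") as f:
        library = plistlib.load(f)

    orphans = []
    for track in library.get("Tracks", {}).values():
        location = track.get("Location", "")
        # Skip entries without a local file (an assumption about stream entries).
        if not location.startswith("file://"):
            continue
        if not location_to_path(location).exists():
            orphans.append(track.get("Name", "(unknown)"))
    return orphans


if __name__ == "__main__":
    for name in find_orphans("iTunes Music Library.xml"):
        print(name)
```

With a 43MB XML file this loads the whole library into memory at once, which is tolerable for a one-off report but is not a fix for the scaling problems above.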

2007-01-25

Ripping the world, part 2

Now that I'd chosen a format, I was ready to begin. I quickly realized, however, that my customary techniques needed some refining for large-scale ripping. I use the following tools for ripping CDs to mp3:
  • Exact Audio Copy: For creating digitally exact copies, I haven't found anything better. The interface is (to put it kindly) somewhat dated, and configuration is somewhat involved, but once you get it set up it's great. There was a good configuration guide online at one point, but it's only available via the Wayback Machine now.
  • LAME: Again, not the most user-friendly of tools (although there are front ends) but very powerful and high quality. I use -V0 --vbr-new for my encodings.
My typical usage scenario had been ripping an individual CD, so I had set up EAC to invoke LAME in the background after extracting each track. EAC retrieves artist/album information from freedb and passes it to LAME for use in the ID3 block in the mp3s. With the setup I had at the time, I could rip in secure mode at about 8x, and convert to mp3 at around 10x, so a typical CD took about 10 minutes to convert to mp3. However, since mp3 encoding is very CPU-intensive, I couldn't do a whole lot with my computer during the encoding process, and I was looking for a "batch mode" type of process that I could do while doing email, surfing the web, etc.

The obvious solution was to rip to WAV and put off doing the mp3 encoding until idle time - i.e. overnight. Ripping to WAV can easily be done in the background while using the computer for other activities, and with EAC set to eject the CD when finished, it's very convenient to swap out the disc, alt-tab back to EAC, hit return twice to dismiss the dialogs, and then Ctrl+A, F5, enter to start the next rip. However, WAV files can't store ID3 information so I had to figure out another way to get that information from EAC to LAME.
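
The deferred encoding step could be scripted along these lines. This is a minimal sketch, not the setup described above: it assumes the lame executable is on the PATH, that the WAV files sit in one folder tree, and that each mp3 should be written next to its WAV. The function names and the "rips" folder are hypothetical, and only the -V0 quality setting is passed.

```python
import subprocess
from pathlib import Path


def lame_command(wav_path):
    """Build the LAME command line for one WAV file (mp3 goes next to it)."""
    wav_path = Path(wav_path)
    return ["lame", "-V0", str(wav_path), str(wav_path.with_suffix(".mp3"))]


def encode_folder(folder):
    """Encode every WAV under `folder` that has no matching mp3 yet."""
    for wav in sorted(Path(folder).rglob("*.wav")):
        if wav.with_suffix(".mp3").exists():
            continue  # already encoded on an earlier run
        subprocess.run(lame_command(wav), check=True)


if __name__ == "__main__":
    encode_folder("rips")
```

Skipping files that already have an mp3 means an interrupted overnight run can simply be restarted the next evening.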

I had been using a great metadata editor called MP3Tag, and it has a feature where you can search freedb based on a set of files. (For those who aren't aware, CD information databases such as freedb are indexed via a disc ID that is calculated based upon the number of tracks and the length of each track, so MP3Tag is calculating a disc ID from a virtual CD representing the set of files you select. Pretty clever feature, if you ask me! This is also one reason why using a high-quality ripping program is so crucial - if your track lengths are wrong, then the freedb lookup will fail...)
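
For the curious, here is a sketch of the disc ID calculation as the classic CDDB/freedb scheme is usually described: a checksum over the track start times, the total playing time in seconds, and the track count, packed into eight hex digits. It takes track lengths in CD frames (75 per second) and assumes the standard two-second lead-in before track 1. Whether MP3Tag does exactly this internally is an assumption; the function names are made up for illustration.

```python
FRAMES_PER_SECOND = 75


def digit_sum(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))


def freedb_disc_id(track_lengths_frames, lead_in_frames=150):
    """Compute a CDDB-style disc ID from track lengths given in frames."""
    if not track_lengths_frames:
        raise ValueError("need at least one track")

    # Turn lengths into start offsets, the way a real disc's TOC lists them.
    offsets = []
    position = lead_in_frames
    for length in track_lengths_frames:
        offsets.append(position)
        position += length
    lead_out = position

    checksum = sum(digit_sum(o // FRAMES_PER_SECOND) for o in offsets) % 255
    total_seconds = lead_out // FRAMES_PER_SECOND - offsets[0] // FRAMES_PER_SECOND
    return "%08x" % ((checksum << 24) | (total_seconds << 8) | len(offsets))
```

Because every track's start offset depends on the lengths of all the tracks before it, a small length error early on the disc shifts the later start times too, which can change the ID.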

So, one option would have been to rip to WAV, encode to mp3 sans ID3 information, and then use MP3Tag's virtual disc ID feature to download the ID3 information. However, I found this to be quite tedious, since it had to be done disc-by-disc. Another MP3Tag feature came to the rescue, however! I'd often used MP3Tag to rename files based on their tag information - e.g. to add track numbers to "Leper Messiah.mp3", "Master Of Puppets.mp3", etc., or to clean up ridiculous scene naming conventions like "01-Slayer-Reign_In_Blood-Angel_Of_Death-Release-NuHS.mp3". MP3Tag can do the reverse as well, though - namely extract ID3 information from the filename. I guess I have to credit those wacky scene kids - I realized that I could configure EAC to put all of the relevant metadata into the filename, convert to mp3, and then use MP3Tag to pull the metadata out of the filenames, insert it into the ID3 blocks, and then change back to short filenames like "01 - Angel Of Death.mp3". Best of all, each step could be done in batch mode.
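
To make the filename round trip concrete, here is a small sketch of the same idea in Python. The "Artist - Album - NN - Title.mp3" layout is a hypothetical naming scheme chosen for illustration, not necessarily the one configured in EAC, and the sketch only parses and rebuilds names; it does not write ID3 tags, which MP3Tag handles in the workflow above.

```python
import re

# Hypothetical long-filename layout: "Artist - Album - NN - Title.mp3"
LONG_NAME = re.compile(
    r"^(?P<artist>.+?) - (?P<album>.+?) - (?P<track>\d+) - (?P<title>.+)\.mp3$"
)


def tags_from_filename(filename):
    """Extract artist, album, track number and title, or None if no match."""
    match = LONG_NAME.match(filename)
    if match is None:
        return None
    tags = match.groupdict()
    tags["track"] = int(tags["track"])
    return tags


def short_filename(tags):
    """Build the short 'NN - Title.mp3' name from the extracted tags."""
    return "%02d - %s.mp3" % (tags["track"], tags["title"])


tags = tags_from_filename("Slayer - Reign In Blood - 01 - Angel Of Death.mp3")
print(tags["artist"], "/", tags["album"], "/", short_filename(tags))
```

One weakness of this layout is that an artist or album name containing " - " would be split in the wrong place, so a rarer separator may be safer for a large collection.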

That's the method I've been using so far. I started off strong, and got about three or four hundred discs ripped over the course of a few weeks before getting sidetracked. I can pretty easily do twenty or thirty discs in one sitting, while surfing the web or watching TV, so I'm slowly working my way through the rest of my collection.

By the way, if anybody sees a way to improve my process, I'm all ears! Thanks.

2007-01-24

Ripping the world, part 1

When my second child came along, I had to give up my office, and with it, much of my personal storage space. One casualty was my CD racks, which held my roughly 1200 CDs. (I got my first CD in 1986, so while 1200 may seem like a lot, it's not too insane on an annual basis...) My solution to this dilemma was to move the racks and the empty cases out to the garage, but keep the discs themselves in boxes stored in the closet. This worked reasonably well, except that the inconvenience factor of having to go into the closet to get a disc, coupled with losing the tactile/visual aspect of handling the cases/artwork, ended up distancing me from my music.

This was not good.

New solution: rip all my CDs to digital music files. I briefly considered hiring a service to rip all my CDs, but it was (and remains) prohibitively expensive. So, I steeled myself for the massive task of doing all the ripping myself. My first decision was what format to use, and I considered a few:
  • Uncompressed WAV: Obviously, this would be the highest-quality and most future-proof format. Unfortunately, the storage requirements would be very high, and many playback devices can't handle WAV natively, so I'd end up constantly converting to another format anyways.
  • Lossless compression (e.g. FLAC): This is an intriguing compromise; it would preserve the future-proof quality of WAV, but not eat up so much disk space. Unfortunately, the device support just isn't there, so I'd be converting to WAV and then to mp3 every time I wanted to play something. I have a sneaking suspicion that I'll regret passing on lossless eventually, however.
  • Lossy compression (e.g. mp3): This is what I ended up with; at high bitrates (lame --alt-preset extreme originally, -V0 currently) I can't hear any artifacts in my usual playback environments (iPod over headphones or in the car, TiVo through my stereo at home) and the disk space requirements are fairly modest. The downside is that when (not if, but when) mp3 is surpassed by a better format, I'll be staring a re-rip in the face.
More about the tools and techniques I've developed to streamline the mp3 creation process in another post...