Archiving on physical media for dummies – Part 1 – LTO

As a tech-literate person using a computer and the internet, you already understand the importance of backing up your data. However, backups are not the same as archiving. While backups ensure you have a recent copy of your data to restore from in case of a disaster, archiving focuses on the long-term preservation of information. An archive can serve as an additional layer of assurance that your most valuable files are secure many years down the line, or it can let you hoard even trivial data that you might never look at again without it polluting your actively backed-up data. It will always be there for recovery if you change your mind. Another interesting use case might be the creation of a “digital inheritance”. One technology that has been around for decades and can help you with that is LTO.

I have been using LTO (Linear Tape-Open) for archiving large amounts of RAW photos that I deemed worthless, including lots of individual frames from timelapses and astrophotography. I also archived my large music collection when I abandoned audio files in favor of streaming – now that I am partially going back to local files, this archive is coming in very handy.

What is LTO?

LTO is a high-capacity, robust, cool-looking and cool-sounding tape storage format. It was developed in the late 1990s, the first drives shipped in 2000, and as of today there are nine generations. With each of these generations, capacities and speeds increased and features were added. LTO is particularly suitable for archiving large amounts of personal data due to its longevity, reliability, and cost-effectiveness.

LTO is also shockingly slow and the hardware can be a little unreliable. Because the drives typically use some form of SAS and the different generations of tape cartridges are each only compatible with certain generations of drives, this can make implementing an LTO archiving system a little intimidating for hobbyists. Luckily, the accompanying Linux software stack is robust and easy to use.

Why use LTO for Archiving as a Hobbyist?

If you are still reading this, you are probably some kind of data hoarder like myself. LTO can be, but should not be, part of your regular backup strategy (3-2-1 rule). Instead, it is best considered a type of cold storage: an extension of your storage capacity to be used for rarely accessed data. For that use case, it offers a number of advantages over other media:

  1. Longevity: LTO tapes have a shelf life of up to 30 years, which is significantly longer than traditional hard drives or solid-state drives, as well as many optical media. This makes LTO a reliable option for long-term storage.
  2. Capacity: LTO tapes offer high storage capacities, with the latest LTO-9 generation offering up to 18 TB of native (uncompressed) storage and up to 45 TB with compression. This capacity allows you to archive large amounts of data on a single tape. However, these later generations require very expensive equipment, which is probably out of reach for a more casual home use case. More about the various generations below.
  3. Scalability and cost-effectiveness: Due to the cost of entry, LTO is only worth getting into if you plan to fill a certain number of tapes. Once you get beyond that threshold, however, it is very cheap to add capacity just by buying more tape cartridges.
  4. Coolness: Using LTO at home is a good conversation starter and the whirring of the drive is proven* to have positive impacts on your health, such as lowering of blood pressure and improving your patience.

*not really

Which LTO Generation is the sweet spot?

I got into LTO-4 a number of years ago. At the time it offered the highest cost efficiency for the capacity I was aiming for. Calculating the best option depends on a number of factors, but the TL;DR is that in 2023 the LTO-5 generation is probably the way to go. Older generations may make more sense if you know you only want to store smaller amounts of data. Meanwhile, newer generations are just too expensive to get into right now. Note that if your desired capacity is very low, there are other media that may prove more efficient for archiving, such as DVD-RAM.

TCO Calculation

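TCO = ceil(DAmount / DTapeCapacity) * CTape + CDrive + CController + CCable

Here DAmount is the amount of data to archive, DTapeCapacity the capacity of one tape, and CTape, CDrive, CController and CCable the prices of one tape, the drive, the SAS controller and the cable. The number of tapes is rounded up because you cannot buy a fraction of a tape.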

So if you are interested in archiving 16 TB of data, you are looking at a TCO for LTO-4 of 10 tapes at 12€ each (assuming the compressed capacity of 1.6 TB per tape and current prices on eBay) + 70€ for the drive + around 30€ for controller plus cables = 220€.

For LTO-5 a typical price per tape might be 17€ and you only need 6 of them (at 3 TB compressed capacity per tape). The drive might cost 160€. With the same 30€ for controller plus cables, that’s a total of 292€. I would probably reach for the newer LTO-5 in this case.
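To play with your own numbers, the two calculations above can be reproduced with a few lines of shell. This is only a sketch: the function name tco and its argument order are made up for this example, capacities are given in GB so that integer arithmetic is enough, and the prices are the example prices from the text, not current market data.

```shell
#!/bin/sh
# tco DATA_GB TAPE_GB TAPE_PRICE DRIVE_PRICE EXTRAS_PRICE
# Hypothetical helper: prints the number of tapes and the total cost in euros.
tco() {
    data_gb=$1; tape_gb=$2; tape_price=$3; drive_price=$4; extras_price=$5
    # Integer ceiling division: a partially filled tape still has to be bought.
    tapes=$(( (data_gb + tape_gb - 1) / tape_gb ))
    total=$(( tapes * tape_price + drive_price + extras_price ))
    echo "$tapes tapes, $total EUR"
}

tco 16000 1600 12 70 30     # LTO-4 example: prints "10 tapes, 220 EUR"
tco 16000 3000 17 160 30    # LTO-5 example: prints "6 tapes, 292 EUR"
```

Keep in mind that the compressed capacities are optimistic for data that is already compressed, so it may be worth rerunning the numbers with the native capacities (800 GB for LTO-4, 1500 GB for LTO-5).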

Note that the prices for the drives vary wildly. On eBay LTO-5 drives can go for as low as 100€ and as high as 600€. I recommend watching the market and bidding until you get a cheap one.

One final note on LTO generations: Keep in mind that there are limits on the compatibility between different generations. For example, an LTO-5 drive can only read and write LTO-4 and LTO-5 tapes, as well as read LTO-3 tapes. This can make later upgrading complicated and more costly, so when in doubt, consider a newer generation.

Definitely study the generations and the compatibility notes on Wikipedia first.
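As a rough memory aid, the compatibility rule can be written down as a small shell function. It is a sketch based on my understanding of the published rules: drives up to LTO-7 write one generation back and read two generations back, while LTO-8 and LTO-9 drives only handle their own and the previous generation. The function name lto_compat is made up, and special cases such as LTO-7 Type M media are ignored.

```shell
#!/bin/sh
# lto_compat DRIVE_GEN TAPE_GEN
# Hypothetical helper: prints read-write, read-only or incompatible.
lto_compat() {
    drive=$1; tape=$2
    # A drive never handles tapes newer than itself.
    if [ "$tape" -gt "$drive" ] || [ "$tape" -lt 1 ]; then
        echo incompatible
        return
    fi
    gap=$(( drive - tape ))
    if [ "$gap" -le 1 ]; then
        echo read-write
    elif [ "$gap" -eq 2 ] && [ "$drive" -lt 8 ]; then
        # Reading two generations back was dropped with LTO-8.
        echo read-only
    else
        echo incompatible
    fi
}

lto_compat 5 4    # prints "read-write"
lto_compat 5 3    # prints "read-only"
lto_compat 5 2    # prints "incompatible"
```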

Getting Started with LTO on Linux

Now this is the fun part. To set up your tape drive, simply connect your SAS controller and drive and push in a tape. It should automatically be recognized and available in /dev:

andreas@pennyworth:~$ ls -l /dev/tape/by-id/
total 0
lrwxrwxrwx 1 root root  9 Mar 12 17:34 scsi-3500110a0014d6b46 -> ../../st0
lrwxrwxrwx 1 root root 10 Mar 12 17:34 scsi-3500110a0014d6b46-nst -> ../../nst0
lrwxrwxrwx 1 root root  9 Mar 12 17:34 scsi-HU1132HUE8 -> ../../st0
lrwxrwxrwx 1 root root 10 Mar 12 17:34 scsi-HU1132HUE8-nst -> ../../nst0

Use the nst device if you do not want your drive to rewind the tape automatically after each file transfer. This will save you a lot of time if you plan on saving more than one file in one sitting. Rewinding can take a lot of time. To do it manually, you can eject the tape using the drive’s eject button or you can use:

root@pennyworth:~# mt -f /dev/nst0 rewind

Because the tape is linear, you’ll want to use tar (tape archive, that’s where the name comes from) to archive your files into a large block so the drive doesn’t have to seek – which it really can’t anyway.

Make sure you are using mt-st and not the often preinstalled GNU mt command. GNU mt does not allow you to enable compression, which you probably want to do. On Debian you might run:

root@pennyworth:~# apt install mt-st
root@pennyworth:~# mt -f /dev/st0 compression 1

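Finally, to save and retrieve tars from your tape you can use the commands below. The drive's hardware compression is transparent, so tar needs no compression flag, and the target directory for extraction must already exist:

root@pennyworth:~# tar --create --file=/dev/st0 /root/testfile
root@pennyworth:~# tar -xf /dev/st0 -C /tmp/output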

There’s a lot more to look into such as hardware encryption, but this should cover the basic use case of archiving a directory and retrieving it.
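If you want more than one archive on a tape, the non-rewinding nst device mentioned earlier combines nicely with the positioning commands of mt-st. The following is a minimal sketch, assuming mt-st and GNU tar. The function names tape_append and tape_list are made up for this example, and the TAPE variable simply defaults to the /dev/nst0 device from the listing above.

```shell
#!/bin/sh
# Non-rewinding device, so the tape stays where the last command left it.
TAPE=${TAPE:-/dev/nst0}

# tape_append [TAR ARGUMENTS...]
# Hypothetical helper: moves to the end of the recorded data, then writes a
# new archive there. All arguments are passed on to tar.
tape_append() {
    mt -f "$TAPE" eod &&
    tar --create --file="$TAPE" "$@"
}

# tape_list N
# Hypothetical helper: lists the contents of archive number N (0 = first).
# "asf" rewinds and then skips forward over N filemarks.
tape_list() {
    mt -f "$TAPE" asf "$1" &&
    tar --list --file="$TAPE"
}

# Example usage (needs a real drive with a tape loaded):
#   tape_append /root/testfile
#   tape_append /root/otherfile
#   tape_list 1
```

Always appending at the end of the recorded data is a deliberate choice here: writing anywhere else on a tape makes everything behind that position unreadable.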

A note on drive reliability

I have had two SAS controllers fail, as well as one drive destroying itself and taking a tape with it. Researching online, it appears that I was particularly unlucky, although drive failures in particular do occur. Keep in mind that the drives you might come into contact with may be 10-20 years old and may have had a hard life. An external drive is probably less likely to have a lot of wear, compared to one that was in use in a tape library.

Handling tapes

Since we are talking about archiving data, you’ll want to make sure you store your tapes properly for maximum longevity:

  1. Store tapes in a cool, dry, and dust-free environment to prevent damage and degradation.
  2. Use the unique codes on the label of each tape to document where you stored which data.
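  3. Regularly check the state of your drive and the loaded tape (this reports position and status flags, not a full health check, so also read a tape back from time to time) using:
mt -f /dev/nst0 status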
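For the documentation step, one simple approach is a plain-text catalog with one file per tape label. The sketch below is only one way to do it: the directory ~/tape-catalog, the function names catalog_add and catalog_find, and the label HU0001L5 are all made up for this example.

```shell
#!/bin/sh
# One text file per tape label, stored on regular (backed-up) storage.
CATALOG_DIR=${CATALOG_DIR:-$HOME/tape-catalog}

# catalog_add LABEL ARCHIVE_NUMBER DIRECTORY
# Hypothetical helper: records every file below DIRECTORY as
# "ARCHIVE_NUMBER<TAB>PATH" in the catalog file of the given tape.
catalog_add() {
    label=$1; number=$2; dir=$3
    mkdir -p "$CATALOG_DIR" &&
    find "$dir" -type f |
        awk -v n="$number" '{ print n "\t" $0 }' >> "$CATALOG_DIR/$label.txt"
}

# catalog_find PATTERN
# Hypothetical helper: searches all catalogs. The extra /dev/null makes grep
# print the catalog file name even if there is only one tape so far.
catalog_find() {
    grep -- "$1" "$CATALOG_DIR"/*.txt /dev/null
}

# Example usage:
#   catalog_add HU0001L5 0 /root/photos
#   catalog_find IMG_1234
```

Run catalog_add right after writing an archive, and the catalog tells you which tape to fetch and which archive number to position to, without loading a single tape.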

Is it for you?

Ultimately, whether LTO is a media format worth using as a hobbyist is very subjective. Even data hoarders won’t usually have that much use for this class of medium. You probably also need to want to get into this mildly unusual technology for the fun of it. With that being said, it can be rewarding and efficient for a certain audience of people. I certainly enjoy using it.

