Re: Using Linux for data archival
- From: The Natural Philosopher <a@xxx>
- Date: Sat, 24 Jan 2009 09:07:42 +0000
What I have here may work for you.
Cyber Punk wrote:
I currently have lots of data that I'd like to archive with varying
degrees of reliability. The data I have consists of:
1) Important documents - needs to be encrypted and redundantly stored.
I can use Truecrypt for encryption.
2) Large files of non-essential data I'd just like to have easily
accessible. DVD rips, isos, music.
3) Many small files of non-essential data, such as website wgets or
saved webpages.
Can anyone recommend:
1) The best Linux filesystem to use for such data; doesn't corrupt
easily, remains quick for a few large files/many small files, less
prone to data fragmentation.
2) What open source data archival software I should use that has on
average a high compression ratio and a recovery record to help recover
most/all of the archive in the event of data corruption.
3) Whether it is better to store data as tarfiles & compressed with a
recovery record, or uncompressed without being tarred and no recovery
record.
4) One insidious way hard drives fail is that files start
disappearing. Is there a way of getting Linux to report missing
files?
5) My file bookkeeping was less than perfect and sometimes I have
multiple copies of files of the same name. Is there a way of copying
everything into a large hard drive but getting Linux to only overwrite
clashing file names if they are newer?
6) Whether statistically speaking with some thought of cost, one is
better off with RAID arrays or just backing up key data to DVDs.
Thanks.
I have a Debian Linux server, which holds ALL the data that I don't want to lose and serves three desktop machines over SMB. It's a six-year-old chassis with very little RAM and no screen at all.
Even my mail clients store all the mail on it, and I did toy with web stuff, but decided a list of bookmarks and browsing history wasn't that important.
It has a second hard drive, and every night a cron job uses rdiff to update the second drive so that it is a copy of the first.
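For anyone who wants to see the idea spelled out, here is a minimal sketch of a one-way nightly copy in Python. It is not what rdiff does internally; it is a plain comparison of modification times, and the function name `mirror_newer` and the paths in the comment are made up for illustration. It copies a file only when the backup lacks it or holds an older version, which is also roughly what question 5 asks for, and it never deletes anything from the backup.

```python
import os
import shutil


def mirror_newer(src_root, dst_root):
    """Copy files from src_root to dst_root when missing or older there.

    Returns the list of relative paths that were copied.
    Nothing is ever deleted from dst_root.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            # Copy only if the backup lacks the file or holds an older version.
            if (not os.path.exists(dst)
                    or os.path.getmtime(src) > os.path.getmtime(dst)):
                shutil.copy2(src, dst)  # copy2 also copies the modification time
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied


# Hypothetical call from a nightly cron script (paths are assumptions):
# mirror_newer("/srv/data", "/mnt/backup")
```

Because `shutil.copy2` carries the modification time across, a second run over unchanged data should copy nothing.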
I looked at burning DVDs, but after a short while the cost of doing that was higher than the cost of the second hard drive. And I had filename issues.
The second drive is twice as big as the first one. When the first one fills up, I will make the second one the main data disk, get one twice as big again, and use that as backup.
The advantages of doing this are:-
- Having a file server integrates well with the desktop machines: if you get in the habit of using the server for everything, there is no extra action required to ensure the data is on there.
- Using the second hard drive to back up the first automatically is a huge boon. Unlike RAID, you actually have a *copy* of everything, so if you screw up a file, as I did yesterday, the original is in last night's backup. No need to find a backup and do anything untarrish.
- If either of the disks on the server goes pear-shaped, you have the other. RAID can itself go pear-shaped. I always prefer mirroring to RAID if the data is slow-moving enough. Use RAID to keep a fast-moving data-handling machine up 24x7; don't use it to preserve archival material.
- The data is instantly accessible. No need to store DVDs, find them, insert and fiddle.
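Having a real copy on the second drive also gives a cheap way to spot files that have quietly vanished from the main disk, which was question 4. The sketch below is only an illustration under some assumptions: the backup is a plain directory tree, the backup job never deletes files, and the names `list_files` and `missing_from_main` are hypothetical.

```python
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def missing_from_main(main_root, backup_root):
    """List files that exist in the backup but no longer on the main disk."""
    return sorted(list_files(backup_root) - list_files(main_root))
```

Files you deleted on purpose will show up in the report too, so it is a list to read through, not an alarm.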
If you are truly paranoid, get a friend who is likewise, and back up each other's data over the Internet. That works even if your machines get stolen.