
Re: Git for backup storage



>> `git gc` does delete the old data (if it's not reachable any more).
> And it is very expensive.  My point exactly.

It's fairly expensive indeed, but it's usually an operation that is not
very time-sensitive: it can usually be delayed to a convenient time, and
you can run it infrequently and as a low-priority background task.
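
For instance, here is a minimal sketch of a wrapper you could call from
cron or a systemd timer during off-hours.  The helper names and the
repository path are made up for illustration; it assumes a POSIX `nice`
is available and only uses `git -C <dir> gc --quiet`.

```python
import subprocess
import sys


def gc_command(repo):
    """Build a `git gc` command that runs at the lowest CPU priority."""
    # `nice -n 19` lowers the scheduling priority; `git -C` selects the repository.
    return ["nice", "-n", "19", "git", "-C", repo, "gc", "--quiet"]


def run_gc(repo):
    """Run the garbage collection and raise if git reports an error."""
    subprocess.run(gc_command(repo), check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python low_priority_gc.py /path/to/repo
    run_gc(sys.argv[1])
```

Running it once a week or once a month is typically enough, for the
reason given next.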

A good reason not to run it frequently is that, thanks to the sharing
("deduplication"), there's usually not much garbage to collect: old
backups share most of their data with the newer ones.

[ IOW, often a thousand backups (of the same machine) don't take up
  much more space than a single backup.  ]
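
To make the sharing concrete, here is a toy content-addressed store in
Python.  It is only a sketch of the idea, not Git's actual object format
(Git also hashes a header, compresses objects, and stores trees and
commits, all of which this ignores); the file names and sizes are
invented.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.objects = {}  # hex digest -> content

    def put(self, data):
        key = hashlib.sha1(data).hexdigest()
        self.objects.setdefault(key, data)  # no-op if already stored
        return key

    def stored_bytes(self):
        return sum(len(data) for data in self.objects.values())


def backup(store, files):
    """Store every file and return a snapshot mapping path -> object key."""
    return {path: store.put(content) for path, content in files.items()}


# Hypothetical machine: 100 files, only one of which changes between backups.
files = {f"file{i}": f"contents of file {i}".encode() * 100 for i in range(100)}
store = ObjectStore()

backup(store, files)
size_after_one = store.stored_bytes()

for day in range(1, 1000):
    files["file0"] = f"log entry for day {day}".encode()
    backup(store, files)
size_after_thousand = store.stored_bytes()

print(size_after_one, size_after_thousand)
```

Only the file that changed adds a new object on each run, so the total
after a thousand backups stays close to the size of the first one.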

>> BTW, if you want to (ab)use a Git repository to do backups, you should
>> definitely look at `bup`.
> Thanks, it might be exactly what I am looking for.

Bup uses the same repository format as Git, but has its own
implementation of most operations because Git's performance is tuned for
a very different use-case.  With Bup it's common to have a repository
much larger than 100GB, whereas Git is very rarely used for repositories
of that size.


        Stefan

