Re: Single root filesystem evilness decreasing in 2010? (on workstations) [LONG]
On Thu, 4 Mar 2010, thib wrote:
OTOH - I haven't studied XFS - but from the little overviews I read about
it, I suppose its allocation groups are a way to scale with this problem
(along with other unrelated advantages like parallelism in multithreaded
environments). What happens if a filesystem doesn't have anything like it?
Filesystems will hit scale problems at some point. As you note, AGs in XFS
help it to scale a lot, but you do need to be careful in selecting the
number. Too many and you can become CPU bound.
Maybe no-one cares because we currently don't have filesystems big enough to
actually see the problem?
Some people definitely do.
I agree with that, but I know it's because I, personally, *need* to know
what's going on, all the time. Some people are OK with letting a program
(even such a critical one) do some magic; and without having tested any
"complex" one, I suspect they try to KIS for the user.
The problem is that if a backup system breaks you get to keep both pieces
:) Failing to understand your backup system and how you can DR under the
worst case is a serious risk.
The problem is, if there's a problem with the backup system itself, then
it's going to be a long night. If there's no need for such software, I,
again, agree, there's no use to take risks, even if they're minimal.
Amanda is a good example. It keeps backup state information at the
beginning of the tapes and allows the information to be dumped to a text
file easily. I have done a 10TB SAN DR with Amanda and used printed out
pages of the tape state information to guide me. It was relatively
painless considering the amount of data I was bringing back.
Considering your experience, I have to believe you; we can always backup
very simply, even very large systems. It's just weird to picture, all these
complex backup systems would be useless? (I know, it's not a binary answer,
but you know what I mean.)
I'm not saying they are useless, but organisations do need to take more
time considering DR, I think. Large organisations will have fully operational
DR sites and they can afford to run a database for their backup system
since they can expect at least one of their sites to be operational at any
given time.
I have known people who run a copy of the backup DB on a laptop which is
supposedly kept offsite. These laptops likely come on site occasionally
and they are a prime candidate for bitrot.
Anything that gets between me and data restoration makes me nervous :)
And for those people who think that off-site/off-line backups aren't needed
anymore because you can just replicate data across the network, I'll give
you 5 minutes to find the flaw in that plan :)
I guess I'm perfectly OK with that, but are we still talking about
workstations? :-)
I'm talking about servers. There is no substitute for offsite/offline
backups and there never will be. This is one of the few topics where I
will use absolute statements like this.
You can never predict the nature of the failure. If you try to figure out
how a failure will occur then you will sooner or later run in to a failure
of imagination.
The only way to guarantee against a single disaster of a certain size is
to physically separate the data stores by a sufficient distance and keep
the backups offline.
No technology can change this fundamental truth since our understanding of
the possible failure modes will always be incomplete.
My understanding is that the "cached" column of the output of free(1) is the
sum of all pages, clean and dirty. The "buffers" column would be the
Right. It might be nice if free did display them separately. It would
confuse people less then :) /proc certainly presents the info. Check out
the source of 'free' - it is a really simple application.
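
To illustrate, here is a minimal sketch in Python (not the C that free
itself is written in) which reads the Buffers, Cached and Dirty fields from
/proc/meminfo and reports them separately. The function names are made up
for this example, and the "clean" figure is only a rough estimate, since
Dirty in /proc/meminfo is not restricted to pages counted under Cached.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts. The unit suffix, if any, is simply dropped here.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def cache_breakdown(info):
    """Report buffers, cached and dirty separately.

    Assumption: cached - dirty is used as an approximation of the clean
    page cache. It is not exact, because Dirty can include pages that
    are accounted under Buffers rather than Cached.
    """
    cached = info.get("Cached", 0)
    dirty = info.get("Dirty", 0)
    return {
        "buffers_kb": info.get("Buffers", 0),
        "cached_kb": cached,
        "dirty_kb": dirty,
        "clean_estimate_kb": max(cached - dirty, 0),
    }


def show(path="/proc/meminfo"):
    """Print the breakdown for a running Linux system."""
    with open(path) as f:
        breakdown = cache_breakdown(parse_meminfo(f.read()))
    for key, value in breakdown.items():
        print(f"{key:>18}: {value}")
```

Calling show() on a Linux box prints the four figures, one per line.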
Since there's no "cached" column for the swapspace, I guess no clean page
gets pushed there, although it could be useful if that space is on a
significantly faster volume. Anyway, the "used" column should be the total,
actual swapspace used, so your comment kind of confuses me. Am I really
wrong here?
I'd recommend doing some reading. The cached system memory and the swap
space displayed by free are really unrelated concepts (at least at the
level we're talking about here).
If you want to chat on IRC about fun subjects like caching and swap space
sometime you can find me as Solver on Freenode & OFTC.
Cheers,
Rob
--
Email: robert@timetraveller.org
IRC: Solver
Web: http://www.practicalsysadmin.com
I tried to change the world but they had a no-return policy