
Re: Summary: Moving /tmp to tmpfs makes it useless

2012/6/19 Wouter Verhelst wrote:

>> That's not true. Only applications that are limited by /tmp speed will
>> become faster then. Do you know any such applications?
> Any application which performs I/O anywhere has a chance of being
> limited by it.

In theory. But do you know any applications that are actually limited by it?

> If you write to /tmp on disk and someone or something calls "sync" at
> precisely the wrong moment, you're stuck, and your performance suffers.

I don't know of any examples. Can you suggest one? I mean, which two
programs should I run, one calling "sync" while the other writes large
data to /tmp, so that I can measure the performance difference?
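
One way such a test could be set up is sketched below in Python. It is
only a sketch under assumptions: the names timed_write, sync_loop and
write_with_sync are made up for illustration, a background thread stands
in for the "someone calling sync" program, and the file size and sync
interval are arbitrary. It times the same write with and without the
background sync calls.

```python
import os
import tempfile
import threading
import time


def timed_write(directory, size_mb=64, chunk_mb=1):
    """Write size_mb of zeros to a temp file in `directory`; return seconds.

    No fsync is issued here, which is how most /tmp users behave; the file
    is deleted when it is closed.
    """
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    start = time.perf_counter()
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(size_mb // chunk_mb):
            f.write(chunk)
        f.flush()
    return time.perf_counter() - start


def sync_loop(stop, interval=0.1):
    """Call sync(2) repeatedly until `stop` is set (interval is arbitrary)."""
    while not stop.is_set():
        os.sync()
        stop.wait(interval)


def write_with_sync(directory, **kwargs):
    """Same as timed_write, but with sync() being called in the background."""
    stop = threading.Event()
    worker = threading.Thread(target=sync_loop, args=(stop,))
    worker.start()
    try:
        return timed_write(directory, **kwargs)
    finally:
        stop.set()
        worker.join()
```

Calling timed_write("/tmp") and write_with_sync("/tmp") a few times each,
once with /tmp on disk and once with /tmp on tmpfs, would give four sets
of numbers to compare. Whether the difference is noticeable is exactly
what the test would have to show.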

> I'm not saying the speedup will be extreme, but it will be there, and
> (cumulated over loads of programs using /tmp) it will be significant
> enough to be noticeable.

If it is noticeable, that's great! What should I do to notice it on my
system?

>>> Can I provide a use case where this will matter? Not necessarily. But
>>> then, can you provide a use case where this will *not* matter? Really?
>> Yes. Everything.
> Oh, interesting. You have the data to back that claim up?

Every test I posted to this list. :)

> You make it sound as if those three are the only uses of /tmp. That's
> just not true.

I just named the most popular ones for regular users.

> - The nbd test suite stores fairly large files in $TMPDIR on which it
> then runs nbd-server

How much faster does it become with tmpfs? Can you run e.g. 5 tests in a
row with tmpfs and then 5 tests without it to compare the difference? If
it really benefits from tmpfs, good: it would be the first real-world
program found so far to benefit from tmpfs... Also, how large a tmpfs does
it need? Is it still faster with tmpfs if the tmpfs gets swapped out?
(Set the cpufreq governor to "performance" when doing the test, and stop
`*cron` if the tests run long.)
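
A comparison like that could be scripted roughly as follows. This is a
sketch, not a finished benchmark: time_command and summarize are
hypothetical helper names, and it assumes the program under test honours
$TMPDIR (as the nbd test suite is said to above). It runs a command
several times with $TMPDIR pointing at a given directory and records the
wall-clock time of each run.

```python
import os
import statistics
import subprocess
import time


def time_command(cmd, tmpdir, runs=5):
    """Run `cmd` `runs` times with TMPDIR=tmpdir; return the list of timings."""
    env = dict(os.environ, TMPDIR=tmpdir)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (mean, sample standard deviation); needs at least two timings."""
    return statistics.mean(timings), statistics.stdev(timings)
```

For example, summarize(time_command(["make", "check"], "/mnt/tmpfs-test"))
against summarize(time_command(["make", "check"], "/var/tmp/disk-test")),
where both the command and the two directories are placeholders for
whatever is actually being tested. If the two means differ by less than
the standard deviations, the result says nothing either way.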

> - Any data transformation or filtering which needs to be done in
> multiple passes over a file would use a temporary file for
> intermediate results

That's a theory. Neither you nor I can answer the question of how much
faster such "any data transformation" will become.

> Multi-pass video transcoding is an example of this, and which
> (depending on the codecs used and the hardware on which it runs) could
> certainly be I/O bound.

It's CPU-bound unless you know some *really* fast codec that I'm not
aware of. :) Or do you have some example that I can run and see it
becoming faster with tmpfs myself?

> - using mc on a tar.gz which was compressed with --fast

It could be. Is it just a guess, or have you checked it on some real-world
archives? E.g. my test with a linux-kernel archive did not confirm that.

> The point is that neither you nor I can reasonably be expected to list
> all possible uses of /tmp

That's not the point. The point is to find those uses that are limited by
/tmp I/O speed.

> and that RAM is faster than disk, so that when you access a tmpfs you're
> going to be somewhat faster than when you access a disk-backed filesystem

No. That's a theory, and it's wrong. :) RAM is faster than disk, so
accessing a tmpfs MAY be faster IF you're limited by the I/O speed.
Theories can be wrong; that's why I always ask for examples and tests...

>> Hm, it's a bad idea to use tmpfs with swap... And it's not a good idea
>> to use tmpfs without swap, since it would be too small and may even
>> trigger OOM killer. When is it good to use tmpfs then? ;)
> I never used tmpfs on a system with swap, and I've not often seen the
> OOM killer start doing its job. My current machine has 4GiB of RAM,
> which seems to be more than enough.

Heh. You're saying that "tmpfs is not too bad" for you. But it does not
automatically make it good for you. :)

>> A user cannot break the system by filling /tmp on disk. But he can do
>> that if he fills /tmp on tmpfs. So /tmp on tmpfs adds one more point of
>> failure for servers.
> No, that's not true. The real danger in filling up /tmp is not that
> other processes can't write temporary files anymore (causing a minority
> of processes to hang or die; those who just happen to need temporary
> storage at that point in time), but that no process can write any file
> anymore (causing a significant majority of processes to hang or die).

Hrm... But there are no other directories on the root partition that a
user can write to. If you mean that /tmp on tmpfs prevents /var or /home
from being filled, then that's not true either. Putting /tmp on tmpfs will
force people to use /var/tmp or /home for large files and will (not "can",
but "will", since /var/tmp is not cleaned) eventually fill those
partitions, which is exactly what you were trying to prevent. :)

> Now whether that advantage outweighs the disadvantages you've outlined
> is something we can talk about. However, whether RAM is faster than disk
> isn't;

Why are you limiting it to either "yes" or "no"? There are other options.
It's not about "RAM is faster than disk". It's about "whether any programs
become noticeably faster because RAM is faster than disk".
Applications are not limited to using /tmp only. And /tmp is not limited
to being either on tmpfs or on the root partition. There are other options.

For example, if we find just one package that becomes faster because of
tmpfs, great: we'll patch that package to include a /var/ram directory,
configure it to mount tmpfs there, and use it by default.

If we find 5 such programs, great! We can have a separate package,
`varram.deb`, that sets up tmpfs for /var/ram, and make it a dependency
of those programs. We can even limit /var/ram write access to e.g. a
`varram` group, so that regular users cannot DoS those programs by
filling it.

The only case in which we need /tmp on tmpfs is when more programs benefit
from it than are broken by it. In that case it would be easier to patch
the programs broken by tmpfs than to patch all the programs that become
faster. But the problem is: we have not found ANY programs that become
faster because of tmpfs and are not broken by it. :)
At least I have not seen a reproducible benchmark yet. Have you?

PS: Sometimes I don't understand people. I'm trying to find a solution
that is good/fast for every use-case possible, but people don't want to
name those use-cases... Why?

Isn't it easier to give some reproducible real-world example instead of
writing all those theories and philosophical debates?

Talk is cheap. Show me the code. (c) Linus
