
Summary: Moving /tmp to tmpfs makes it useless



Some people asked for a thread summary. So here it is.

Contents
========
* Short Problem Summary
* My point
* Initial suggestion - RAMTMP=no + d-i extension
* Later suggestion - RAMTMP=auto
* Other ideas
* Alternatives
  - SSD setup - Normal
  - SSD setup - Paranoid
* "/tmp on tmpfs is good" quotes
* "/tmp on tmpfs is bad" quotes
* FAQs

Short Problem Summary
=====================
Problem description in quotes:
  >> then /tmp using tmpfs *will* lead to issues that many wont understand.
  > As will /tmp on a small root partition.
  > As will a small dedicated /tmp partition.
  True. But Debian does not have a small root partition by default. And it
  does not install with a small dedicated /tmp partition by default
  either. Then why does it come with /tmp on tmpfs by default?

Jakub Wilk:
  /tmp on tmpfs is a nice hack (I use it myself!), but it's a terrible default.

Carlos Alberto Lopez Perez:
  Defaults should be sane for most of the cases, and not for corner cases.
  Also defaults should prioritize stability (and non-breakage) over
  performance.

My point
========
(The reason why this thread was started.)
Mounting /tmp on tmpfs may be a good thing in some rare cases, but it's
a bad default. It breaks a lot of things and, more importantly, brings
nothing good.

To stress it again: it's not that tmpfs is never good, it's that /tmp on
tmpfs is never good. It's possible to write some artificial scripts that
work faster on tmpfs, but there are no popular real-world applications that
become better because of _/tmp_ on tmpfs. On the other hand, there are a
lot of popular real-world applications that break or make the system
unstable because of /tmp on tmpfs.

If you keep pushing things further and start to "fix" those apps that
"should not write large files to /tmp" by making them write to another
directory, then you're just making /tmp useless. You could take one more
step, "fix" them all, move all the software out of /tmp, and remove the
directory completely.

It is often said that /tmp on tmpfs solves some problems. Actually it
only gives a false feeling of safety. The problems that it is supposed to
solve are often better solved without /tmp on tmpfs (see "Alternatives").

Initial suggestion
==================
Set RAMTMP=no by default, but extend the Debian installer so that it allows
smart people who really know what they're doing to enable it. In the best
case it should allow configuring tmpfs on any directory (/var/lock, /media,
/opt, etc.), not just /tmp. It would be nice to have the `size` option there
as well.

Later suggestion
================
In addition to RAMTMP=no and RAMTMP=yes, add another value, RAMTMP=auto,
and make it the default. The "auto" mode would mount /tmp on tmpfs only if
/tmp is on the / partition and such a mount increases the free space
available in /tmp.

For example, on a system with a 2TB root fs /tmp will stay on disk, while
on a system with 8GB RAM and <1GB of free disk space /tmp will be mounted
on tmpfs.

That should fix things for systems with a small (or read-only) root
partition without breaking things that were working before.
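
The rule described above could be sketched roughly like this. It is only an
illustration under stated assumptions, not actual initscripts code: the
function name `ramtmp_auto` and its three arguments are hypothetical, and
sizes are taken in KB (as printed by `df -k`).

```shell
#!/bin/sh
# Hypothetical sketch of the proposed RAMTMP=auto rule.
# Arguments:
#   $1  "yes" if /tmp currently lives on the / filesystem, otherwise "no"
#   $2  free space on / in KB
#   $3  size the tmpfs would get in KB (assumption: chosen by the caller)
# Prints "tmpfs" if /tmp should be mounted on tmpfs, "disk" otherwise.
ramtmp_auto() {
    tmp_on_root=$1
    root_free_kb=$2
    tmpfs_kb=$3
    # Switch to tmpfs only when /tmp shares / and tmpfs would offer more room.
    if [ "$tmp_on_root" = yes ] && [ "$tmpfs_kb" -gt "$root_free_kb" ]; then
        echo tmpfs
    else
        echo disk
    fi
}
```

The free space on / could be taken from the "Available" column of
`df -kP /`; how large the tmpfs would be is left to the caller here.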

Other ideas
===========
Other ideas that came up in the thread:
* Add -pipe as default flag to dpkg-buildflags
  (needs some careful checks)
* We may need an [optional] directory mounted on a large tmpfs by default.
  /tmp is not the only place where one can mount tmpfs.
  For programs that always work better with tmpfs, have it mounted at e.g.
  /var/ram, and configure those programs to use /var/ram/ by default
  (we have not identified those programs yet).
* We need a good benchmark. Nobody has done one for /tmp yet. Even for
  Gentoo users (who often have /var/tmp/portage on tmpfs) I could not find
  a detailed benchmark. And a good benchmark is hard to do.
  (If anybody is interested, I can suggest some ideas for such a benchmark.)

Alternatives
============
There are a lot of things you can do with /tmp besides mounting it on tmpfs.

First, the most common one. If you have some program that works better
on tmpfs, and you have enough RAM to put it on tmpfs then... you should
put it on tmpfs. But! It does not mean that you should put /tmp on tmpfs,
just put your program there:
  # mkdir /var/ram
  # mount tmpfs /var/ram -t tmpfs -o size=100G
and run your program in /var/ram. That will allow your program to work
faster, and won't break anything. /tmp is not the only place where you can
mount tmpfs after all. Why would /tmp on tmpfs be better than /var/ram on
tmpfs?

Next, if for some reason you don't want /tmp to be on your root partition
(if you have read-only root, for example) you can:
  # install -d -m 1777 -o root -g root /home/tmp
  # mount --bind /home/tmp /tmp
  [edit /etc/fstab to make the mount permanent]
That gives /tmp on the /home partition without the memory/swap headache.
As a bonus it will obey /home quotas and be cleaned on boot automatically.
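
For the fstab step, a bind mount entry usually has the shape printed by the
sketch below. The helper name `fstab_bind_entry` is hypothetical and only
there for illustration; it prints the line and writes nothing to disk.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab line that bind-mounts $1 on $2.
# Fields: source, mount point, type ("none" for bind), options, dump, pass.
fstab_bind_entry() {
    printf '%s\t%s\tnone\tbind\t0\t0\n' "$1" "$2"
}

# Example: show the entry for /home/tmp -> /tmp.
fstab_bind_entry /home/tmp /tmp
```

The printed line can then be appended to /etc/fstab by hand, after the entry
for /home, so that /home is already mounted when the bind mount happens.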

Also `libpam-tmpdir` allows you to set up separate per-user tmp
directories. But remember that a common tmp dir may still be needed
on multi-user servers (such as ssh servers), so that users can exchange
files with each other.

A common option for servers is to put different things on different
partitions. So when configuring a server you can easily put /home, /var,
/tmp, /usr and /opt on different partitions on disk. That gives you
great flexibility (you can configure separate quotas for the /tmp
partition, which you can't do with tmpfs).

If you hate partitions for some reason, you can create /tmp in a file
(e.g. /var/tmpdir.img) and loop-mount it. This will allow you to
easily resize your /tmp as well.
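
A minimal sketch of the loop-mounted variant, assuming an ext4 image. The
helper name `make_tmp_image` and the sizes are illustrative only. The steps
that need root are left as comments, so nothing gets mounted by accident.

```shell
#!/bin/sh
# Hypothetical helper: create (or grow) a sparse image file.
#   $1  path of the image, e.g. /var/tmpdir.img
#   $2  size as understood by truncate(1), e.g. 4G
make_tmp_image() {
    truncate -s "$2" "$1"
}

# The remaining steps need root, so they are only shown as comments:
#   make_tmp_image /var/tmpdir.img 4G
#   mkfs.ext4 -F /var/tmpdir.img       # -F: the target is a regular file
#   mount -o loop /var/tmpdir.img /tmp
#   chmod 1777 /tmp                    # sticky and world-writable, like /tmp
#
# Growing it later (assumption: ext4, with the image unmounted first):
#   make_tmp_image /var/tmpdir.img 8G
#   e2fsck -f /var/tmpdir.img && resize2fs /var/tmpdir.img
```

Because the file is sparse, it only takes up disk space as /tmp actually
fills up.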

It's also often believed that using /tmp on tmpfs saves disk writes
for short-lived files, which can be useful for SSDs, for example.
That's not really true. The kernel filesystem cache is smart enough:
small short-lived temporary files stay in the disk cache and never hit the
disk (NB: this is true for journaled filesystems as well). But if you
really care about SSD writes, here are two real solutions for that:

Note:
  Before you start, make sure you have something to worry about.
  SSDs often die because of firmware bugs, not because of the write limit.
  E.g. modern SLC SSDs are supposed to live at least 50 years:
    http://www.storagesearch.com/ssdmyths-endurance.html
  Of course it depends on the chip type (SLC, MLC, etc.).

SSD setup - Normal: don't write some things
-------------------------------------------
The idea is simple: you need to find the apps that write the most and
configure them not to do that. To find such apps you can use tools like:
  iostat
  iotop
  btrace
  ...
You may also look at audit-related tools. Or you can write your own
scripts using /proc/diskstats and /proc/PID/io. The next step depends on the
apps you find. For example, you may want to use Private Browsing in Firefox.
Or if you use vim heavily you can disable its swap file (`set noswapfile`
in .vimrc). If you notice that your (K)DE often writes to /var/tmp, you can
mount /var/tmp on tmpfs. :) You may also want to increase the
dirty_*_centisecs timeouts to some larger values.
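
As a starting point for such a script, here is a small sketch that reads
the "sectors written" counter of one device from /proc/diskstats. The
function name `sectors_written` is made up for this example. The optional
second argument exists only so that it can be tried on a saved copy of the
file.

```shell
#!/bin/sh
# Print the number of sectors written so far to device $1 (e.g. "sda").
# In /proc/diskstats field 3 is the device name and field 10 is the
# "sectors written" counter.
# $2 is the stats file to read; it defaults to /proc/diskstats.
sectors_written() {
    awk -v dev="$1" '$3 == dev { print $10 }' "${2:-/proc/diskstats}"
}
```

Calling it twice with a `sleep` in between and subtracting the two numbers
gives the amount written during that interval (these counters are in units
of 512 bytes).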

In my case most writes were done by the browser (so I disabled caching and
symlinked history/cache to tmpfs) and syslog (disabled fsync). It takes
some time to catch all the apps. You may want to write a script and
leave it running in the background, collecting stats of running processes,
so that you can check your writes from time to time.
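
Such a script could look roughly like the sketch below. It is a minimal
example, not the script I actually used; the function names `write_bytes`
and `top_writers` are made up. It reads the `write_bytes` counter from
/proc/PID/io, which is only readable for your own processes unless you are
root.

```shell
#!/bin/sh
# Print the write_bytes counter from a /proc/PID/io style file ($1).
write_bytes() {
    awk '$1 == "write_bytes:" { print $2 }' "$1"
}

# Print "bytes pid command" for every readable process, biggest writers
# first. $1 limits the number of lines (default 10).
top_writers() {
    for io in /proc/[0-9]*/io; do
        [ -r "$io" ] || continue
        pid=${io#/proc/}
        pid=${pid%/io}
        # Reading may still fail (permissions, process gone); skip those.
        bytes=$(write_bytes "$io" 2>/dev/null)
        [ -n "$bytes" ] || continue
        echo "$bytes $pid $(cat "/proc/$pid/comm" 2>/dev/null)"
    done | sort -rn | head -n "${1:-10}"
}
```

Running `top_writers` from cron or in a loop and saving the output gives a
rough picture of which processes write the most over time.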

SSD setup - Paranoid: only write things you need
------------------------------------------------
Assuming that you have a normal non-SSD PC:
0. Split your SSD into two partitions, "metaroot" and "other". The
"metaroot" partition will contain your system, while all the files you
want to save will be on the "other" partition.
1. Create a Debian livecd with all the software you need:
  http://live.debian.net/manual-3.x/html/live-manual.en.html#7
2. Unpack the contents of the livecd to the "metaroot" partition.
3. Manually install a boot loader (grub, syslinux...) to "metaroot".
4. Configure the boot loader to boot your livecd.
5. Boot and use. :)
Whenever you need to write something, write it to the "other" partition. If
you need some software that you forgot to put on the livecd, or if you want
to upgrade, leave a note to yourself on the "other" partition, later rebuild
the livecd and replace the files on the "metaroot" partition with the new
build.

This is harder to configure, but allows you to control every disk write
happening in your system. It also gives you a flexible "undo": if
something goes wrong with a new livecd build, put the older build back.

"/tmp on tmpfs is good" quotes
==============================
No real quotes here. Most of this and other threads were about why
/tmp on tmpfs is not that bad. But there are no real quotes explaining
why it's good.

The most common one is "tmpfs is good, because fsync() is a no-op there".
But since applications usually do not fsync() files in /tmp, it's not
related to /tmp.

Adam Borowski, Roger Leigh and I made some artificial tests proving
that tmpfs itself can be good sometimes. I.e. it may be useful for some
scripts to have a large tmpfs mounted at /var/ram. But these tests do
not explain why _/tmp_ on tmpfs is good.

"/tmp on tmpfs is bad" quotes
=============================
Thomas Goirand:
  On the server side, you can add that MySQL often uses /tmp,
  for example mysqldump does, and files there can be huge too. It
  can be extremely dangerous to have a server's /tmp getting full,
  and swap getting highly used (eg: potentially, your server could
  simply crash).
Bastien ROUCARIES:
  gscan2pdf too, I use it to send my tax receipt and It crash during
  scaning due to this issue (tmpfs full)....
Carlos Alberto Lopez Perez:
  What happens if tmpfs on /tmp is the default?
  * The user will end with a /tmp of 615M.
  * One day the user try to copy a DVD and he is unable to do it #665634
  * One day the user try to stream a movie from youtube or vimeo and
    he is unable to do it #666096
  * The user gives up with Debian and decides to go back to Windows.
  What happens if tmpfs on /tmp is *not* the default?
  * The user is happy with Debian because he can get things done and
    tells its friend about Debian.
Wookey:
  open a .ppt (powerpoint) file in libreoffice. The conversion involves
  writing a file in /tmp/<mktmpdir> for every page/image. To open an
  image-heavy 256Mb .ppt I have lying about here, generates 382MB of
  files in /tmp.
Joey Hess:
  I read the debian-user list and have forwarded a half-dozen cases of
  users experiencing problems with tmpfs /tmp to the BTS.
brian m. carlson:
  For my personal purposes, tmpfs on /tmp is fine.  For shared-hosting
  purposes, tmpfs on /tmp is a DoS waiting to happen.
Russell Coker:
  My workstation unexpectedly went from having 2G of free space on the root
  filesystem for /tmp to 600M of tmpfs.  600M is almost filled by two TED talks
  so with my habits of downloading multiple video files that was never going to
  work.
Steve Langasek:
  The problem is not whether applications gracefully handle ENOSPC.  The
  problem is whether we as a distribution are causing users to hit ENOSPC when
  there's no justifiable reason for it.
Stefan Lippers-Hollmann benchmarked a kernel build on ext4 vs tmpfs.
It's not directly applicable to /tmp (people don't build packages in /tmp
by default), but it shows that heavy tmpfs usage makes the system less
usable.

/tmp on tmpfs also breaks the TMPTIME feature, but I don't have a good quote
for that.

And it just occurred to me that using tmpfs also makes suspend-to-disk
slower.

FAQs
====
There's a major difference between "most programs will still work" and
"most programs will become faster". Most of this thread (and of others) was
about "works for me" and "/tmp on tmpfs is not that bad". I'll mostly skip
that part, and list popular Q/A for "/tmp on tmpfs is good".

Q: Solaris has used /tmp on tmpfs for years.
A: Yes. And Linux has used it on extfs for years.
   Solaris UFS was too slow, they had no choice. But Linux is faster with
   extfs than Solaris with tmpfs: http://www.tux.org/lkml/#s9-12

Q: But still, Solaris uses /tmp on tmpfs and has no problems.
A: That's not true. Solaris users/admins actually have problems because
   of the whole swap space being exhausted by a few ISO files in /tmp.
   But /tmp has been on tmpfs under Solaris for so long that Solaris admins
   are used to checking free space in /tmp and fixing problems that
   Linux users never had.
   I don't want to get used to problems, do you?

Q: Storing files on tmpfs+swap must be faster because it does not have all
   the metadata, inodes, journals and other stuff.
A: It's not that simple. Tmpfs is not just data, it supports directories,
   attributes, etc. It still stores all the metadata, and it actually has
   inodes internally. A swap partition also has its own internal format
   (there are even different swap versions: see v0 and v1 in `man mkswap`).
   I.e. when you write to ext3 you write through a single ext3fs layer,
   but when you write to swap you write through two layers, tmpfs+swap.

   Things get worse when you start reading a file you just wrote. When you
   read from ext3, the oldest part of the file cache is dropped and the data
   is placed in RAM. But reading from swap means that your RAM is full, and
   in order to read a page from swap you must first write another page there.
   I.e. a sequential read from ext3 turns into random write+read from swap.

   Also remember that writing files to ext3 does not affect other running
   processes as much as writing to swap.

Q: /tmp on tmpfs increases app performance.
A: What apps? /tmp is not a performance-critical directory in Linux. Real
   apps don't write there during performance-critical operations. Even if
   they do, they write large files. And large files work faster when
   written to a real disk than when they are swapped out and slow down the
   entire system. The apps that can really benefit from tmpfs are too rare.
   And we're talking about default settings and the most common cases.

Q: Ok. I wrote an optimized application that indeed works faster on tmpfs
A: And I can write an application that works faster on reiserfs than on
   ext3. Will you change *default* root filesystem to reiserfs because of
   my own application that nobody uses? ;)
   You can mount tmpfs to /var/ram and run your application there instead.
   That will make your app happy and won't break other apps.

Q: gcc writes small files in /tmp.
A: Not when used with the -pipe option.

Q: But I can't modify every Makefile in the world.
A: You don't need to. You only need to add -pipe to your *FLAGS. You don't
   build software with default autotools flags (-g -O2) anyway, I guess.
   If you build packages you can write:
     APPEND CFLAGS -pipe
     APPEND CXXFLAGS -pipe
   to /etc/dpkg/buildflags.conf. It may even make building faster.

Q: /tmp on tmpfs reduces the number of HDD spinups on my laptop.
A: Not much, since /tmp is mostly unused by default. But vm.laptop_mode=1
   does it much better than tmpfs. Since laptop_mode works for the entire
   disk, not just /tmp, you don't need to use tmpfs.
   Also check the laptop-mode-tools package.

Q: I care a lot about my / fs and want to use it as rarely as possible.
A: There are a lot of options that would solve your problem without making
   your system unstable because of high memory usage, or breaking programs
   because of no free space in /tmp. Check the "Alternatives" section.

Q: /tmp on tmpfs reduces the number of disk writes.
A: Not much. The kernel cache already does that. And most disk writes
   are not related to /tmp anyway. But if you do worry about disk writes
   check the "Alternatives" section for a better solution.

Q: tmpfs performs much better under heavy fsync()s.
A: True. But applications usually do not fsync() files in /tmp by default.
   So there's no need to change the *default* setting.

Q: /tmp on tmpfs is part of a larger goal of making it easy to have
   read-only root partitions.
A: That's the worst way of having a read-only root partition. :)
   To have a read-only root you need separate partitions for /home and /var.
   And since you already have several partitions, you can create one more
   partition for /tmp. If you don't want to have it as a separate partition
   you can symlink or bind-mount it to e.g. /home/tmp. That will give you
   a fast /tmp that doesn't eat your RAM, cannot cause heavy swapping and
   freezes, and additionally supports quotas (tmpfs does not support them).

Q: Still, /tmp on tmpfs prevents users from filling up the / partition.
A: There are quotas for that. (Also see the next Q/A.)

Q: /tmp on tmpfs prevents even root from filling up the / partition.
A: Nothing prevents root from filling the / partition. :) But there's
   another feature, TMP_OVERFLOW_LIMIT, to help in just that case.

Q: /tmp on tmpfs can be resized, you just need to add more swap.
A: That's why /tmp on disk is better by default: no need to resize it.
   If by default you have a single partition for everything, your /tmp
   is as large as your disk. No need to add more swap, no need to resize,
   no need to worry about it. Ever.

Q: I have a separate /home and I think it's good that /tmp becomes
   useless. Later we'll be able to move it somewhere under $HOME, e.g. as
   XDG_CONFIG_TMPDIR, so that everybody uses something like ~/.cache/tmp
   for that and obeys their quotas.
A: It's pointless to drop /tmp and then reinvent it somewhere else on disk.
   Everyone could just use /tmp on disk...
   But if the point is to have /home and /tmp on the same partition then
   there's no need to change anything, you can get that right now:
   # install -d -m 1777 -o root -g root /home/tmp
   # mount --bind /home/tmp /tmp
   [edit /etc/fstab to make the mount permanent]
   That gives /tmp on the /home partition without the memory/swap headache.
   As a bonus it will be cleaned on boot automatically.

Q: I'm a smart man, I know what I'm doing, which apps I'm breaking and what
   consequences my decision might have, but I still need my /tmp on tmpfs.
A: Then you should do that. In those rare cases when defaults need to be
   changed, they should be changed.


Thanks to everybody for participating in this discussion, spending
your time on tests, and sharing your experience and ideas.
I hope I have not missed any of them.

PS: somebody also posted a summary at https://lwn.net/Articles/499410/

-- 
  Serge

