
Re: do packages depend on lexical order of {daily,weekly,monthly} cron jobs?



On 2019-08-05 17:34, Ian Jackson wrote:
> With current code the options are:
>
> A. Things run in series but with concatenated output and no individual
>    status; vs.
>
> B. Things run in parallel, giving load spikes and possible concurrency
>    bugs.
>
> I can see few people who would choose (B).
>
> People who don't care much about paying attention to broken cron
> stuff, or people who wouldn't know how to fix it, are better served by
> (A).  It provides a better experience.
>
> Knowledgeable people will not have too much trouble interpreting
> combined output, and maybe have external monitoring arrangements
> anyway.  Conversely, heisenbugs and load spikes are still undesirable.
> So they should also choose (A).
>
> IOW reliability and proper operation is more important than separated
> logging and status reporting.

If we are in agreement that concurrency must happen with proper locking and must not depend on accidental linearization, then identifying those concurrency bugs is actually a worthwhile goal in order to achieve reliability, is it not? I thought you would be the first to acknowledge that bugs are worth fixing rather than sweeping them under the rug. We already identified that parallelism between the various stages is undesirable. With a systemd unit you can declare conflicts as well as an ordering if needed.
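A minimal sketch of what I mean, with invented unit names (example-daily.service serialized against a hypothetical other-daily.service):

    # /etc/systemd/system/example-daily.service (hypothetical unit)
    [Unit]
    Description=Example daily maintenance job
    # Never active at the same time as other-daily.service:
    # starting one stops the other.
    Conflicts=other-daily.service
    # If both are queued in the same transaction, run this one second.
    After=other-daily.service

    [Service]
    Type=oneshot
    ExecStart=/usr/local/sbin/example-daily-job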

I also question the "knowledgeable people will not have too much trouble" part. Export state at the most granular level possible and there is no guesswork required. I have no doubt that my co-workers could do that guesswork; but I want their life to be as easy as possible.
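For example (unit name invented again), each job's state can be queried directly:

    $ systemctl status example-daily.service
    $ systemctl show -p ActiveState,SubState,Result example-daily.service
    $ journalctl -u example-daily.service

Exit status, logs and timestamps per job, with no need to fish an individual failure out of concatenated output.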

Similarly, I wonder what the external monitoring would look like, apart from injecting fake jobs around every run-parts invocation. Replacing run-parts with something monitoring-aware? Then why not take the tool that already exists (systemd)?
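As a sketch, assuming a hypothetical notify-admin@.service template that sends the actual alert, failure monitoring becomes a one-line drop-in per job:

    # /etc/systemd/system/example-daily.service.d/monitoring.conf
    [Unit]
    # %n expands to the name of the failing unit.
    OnFailure=notify-admin@%n.service

and the scheduling state of all jobs is one query away:

    $ systemctl list-timers --all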

And finally, the load spikes: upthread it was mentioned that RandomizedDelaySec exists. Generally this should be sufficient to even out such effects. I understand that there is a case where you run a lot of unrelated VMs that you cannot control. In other cases, like laptops and desktops, it is very likely much more efficient to generate the load spike and complete the task as fast as possible, in order to return to the low-power state of (effectively) waiting for input. I suspect the conflict between these two cases could be dealt with by encouraging liberal use of DefaultTimerAccuracySec at the system level. I understand that Debian inherently does not distinguish between the two cases, but I'd still expect a cloud/compute provider to offer default images that could be preconfigured appropriately.
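A sketch of the two knobs, with arbitrary example values:

    # Per timer: spread the start time over a window to even out spikes.
    # /etc/systemd/system/example-daily.timer
    [Timer]
    OnCalendar=daily
    RandomizedDelaySec=1h

And system-wide, in /etc/systemd/system.conf:

    [Manager]
    # Coalesce timer wakeups so that e.g. a laptop batches the work
    # and returns to sleep quickly.
    DefaultTimerAccuracySec=6h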

I apologize for thinking of this in terms of systemd primitives, but the tool was written for a reason and a lot of thought went into it.

Kind regards
Philipp Kern

