
Re: Possible akonadi problem?



On Saturday, 2015-02-07, 14:30:16, Martin Steigerwald wrote:
> Am Samstag, 31. Januar 2015, 10:28:41 schrieb Kevin Krammer:
> > On Friday, 2015-01-23, 09:54:10, Martin Steigerwald wrote:
> > > 1) It is a *cache* and *nothing* else. It never stores any information
> > > inside the cache that isn't stored elsewhere or that it isn't able to
> > > rebuild. That said, in case of issues with the cache, it is possible
> > > to remove it and rebuild it from scratch *without* *any* data loss
> > > involved and *without* having to recreate filter rules.
> > 
> > That's not always possible.
> > Most obvious example is writing data to backends whose actual storage is
> > unreachable, i.e. an IMAP server not reachable due to no network
> > connection.
> 
> Okay, for me that's more of a journal than a cache. But it can be seen as a
> write cache, yes.

Then call it a journal :)
Using a single word to describe the feature set of the application will almost 
always fail, since a single word will hardly cover all aspects properly.

For example it is an access arbiter, but that does not cover any of the 
functionality for offline access.

So any single word will always be inaccurate, but could still be used if the 
context is focusing on a certain feature.

E.g. if the topic is access arbitration, "access arbiter" is descriptive 
enough.

> And it creates problems for backup purposes. At least, if having such a
> kind of journal is unavoidable, I think it should be file based. Like some
> outgoing maildir for mails. 

It is file based.
With the default configuration, payloads up to a size threshold of 4096 bytes 
are kept in the database to speed up access to small data items (contacts, 
calendar entries, mail headers); larger payloads are stored as plain files. 
The threshold is configurable on purpose.
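
To make that concrete, the setting lives in the Akonadi server config; the 
excerpt below is from memory, so please verify the exact section and key names 
against your own installation:

# $XDG_CONFIG_HOME/akonadi/akonadiserverrc (excerpt; names quoted from
# memory, double-check on your system)
[%General]
Driver=QMYSQL
# Payloads up to this many bytes are kept inside the database for fast
# access; anything larger ends up as a plain file under file_db_data.
SizeThreshold=4096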

> How did KMail 1 solve this?

For IMAP it mirrored the account into a local maildir and kept 
synchronization/state information and headers in proprietary binary files 
scattered through that tree.

So the main difference is that the status/cache information is no longer 
scattered and is now queryable.
And it is available to any type of backend (e.g. server based addressbook or 
calendar, non-IMAP mail server).

> Why? I think I wouldn't store a mail that is not stored elsewhere just in
> the database.

Even with the default configuration that does not happen very often, and, as 
said above, it can be ensured through configuration.

> I'd make Akonadi as robust as it can get against a database loss, i.e.
> cache loss. No config, no data, just metadata, ideally only recreatable
> metadata in there. Similar to Baloo. And store everything else with the
> backend storage if possible.

That's how it works. All data is forwarded to the backend, which applies it to 
the best of its capabilities.
There is of course always room for increasing a backend's capabilities, e.g. 
detecting the availability of extended attributes and using them to support 
storage formats that lack certain features (e.g. mbox having no state 
information).
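
As an illustration of that idea (not what the mbox resource actually does 
today), state could be attached to a storage file via extended attributes 
roughly like this; the attribute name and helpers are made up:

import os

def store_state(path: str, state: bytes) -> None:
    """Hypothetical helper: attach state information (e.g. serialized
    read flags) to a storage file via an extended attribute.
    Requires a filesystem and OS with xattr support; real code would
    have to detect that and fall back gracefully."""
    os.setxattr(path, "user.akonadi.state", state)

def load_state(path: str) -> bytes:
    try:
        return os.getxattr(path, "user.akonadi.state")
    except OSError:
        return b""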

> Also treat the backend storage as authoritative. If the backend storage
> has a mail the database does not see, the mail is there. Period.

The basic synchronization implementation works like this (off the top of my 
head):
- get a list of local items, get a list of remote items
- for every item that exists in both, check if remote item is newer/changed
- for every item not in the local list, add
- for every item not in the remote list, remove

Some resources have more advanced implementations, e.g. if the backend allows 
detection of changes based on some criteria (timestamps, monotonically 
increasing identifiers, etc.).
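
In case it helps to see the basic scheme spelled out, here is a rough sketch 
in Python; the names and the "revision" comparison are made up for 
illustration and are not the actual resource API:

from dataclasses import dataclass

@dataclass
class Item:
    remote_id: str
    revision: int        # e.g. a timestamp or modification counter
    payload: bytes = b""

def synchronize(local: dict, remote: dict) -> dict:
    """Basic list-comparison sync as described above (illustrative only)."""
    # items that exist remotely: add if missing locally, refresh if changed
    for rid, remote_item in remote.items():
        if rid not in local or local[rid].revision != remote_item.revision:
            local[rid] = remote_item
    # items that no longer exist remotely: drop the local copy
    for rid in list(local):
        if rid not in remote:
            del local[rid]
    return local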

> I think Akonadi should follow the same robustness principles as for
> example Postfix. It receives the mail, writes it, fsync() it and *only*
> then says "I have it, you can discard it" to the sending mailserver.

That would only work if the backend is always reachable, obviously.

Being offline capable was one of the main design goals, as observations over 
many years had shown that more and more data was not stored locally but on 
servers.

> > > 2) Make it *just plain* obvious *where* Akonadi stores the
> > > configuration and the data. For now it's at least ~/.config/akonadi with
> > > one or two files per resource (the changes.dat there),
> > > ~/.kde/share/config/akonadi*, ~/.kde/share/config/kmail2rc (it
> > > contains references to Akonadi resources), ~/.kde/share/config/ which
> > > contains the local filter rules.
> > 
> > The config for Akonadi is in $XDG_CONFIG_HOME/akonadi.
> > The other locations are those of programs using Qt4 based kdelibs. The
> > switch for XDG_CONFIG_HOME will most likely happen with the first Qt5
> > based version of said programs.
> 
> So the amount of different directories will go down?

Down from two to one?
Maybe; the main change is the common root.

> I am hinting at user introspectability here. Sure, I can understand a
> maildir, but even after some years, Akonadi still puzzles me. There is a
> bug report where moved mails are for a long time just in the database or
> file_db_data and do not appear in the destination maildir. For me that's a
> big, huge no-go.

A move within the same backend or between backends?
Anyway, as you said, a bug.

> Actually I do not see at all, why the mails should be cached within the
> database of file_db_data *at all* on a *local* maildir based move
> operation. Just move them already!

Yes, that's how the Maildir resource handles item moves: it moves the item's 
file.
Any cached content, e.g. headers, remains the same; only the item's source 
location is updated (a change of the item's remote identifier).

Moves between resources are obviously more difficult: a sequence of retrieve, 
add and remove.
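
Simplified, the same-backend case boils down to something like the sketch 
below; the helper name is invented and this is not the actual resource code:

import os

def move_maildir_item(old_path: str, dest_dir: str) -> str:
    """Illustrative only: a move inside the Maildir resource moves the
    message file itself; cached content stays as it is, only the item's
    remote identifier (its location) changes."""
    new_path = os.path.join(dest_dir, os.path.basename(old_path))
    os.rename(old_path, new_path)   # the actual move on disk
    return new_path                 # becomes the item's new remote id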

> > > 5) If you use a database, make perfectly sure that there *never ever*
> > > can be two database processes trying to access the same database. I
> > > have seen this several times with the Akonadi MySQL backend: I had
> > > two mysqld processes. Treat the database as *part* of Akonadi and
> > > make akonadictl *stop* it *or* report a failure when it cannot
> > > stop it. And make akonadictl never start it, if there is still one
> > > running.
> > 
> > My understanding is that the control process sees the subprocess as
> > finished. This will of course be solved by systemd which can terminate
> > subprocesses based on cgroups membership.
> 
> I don't see how systemd is needed for that. And it would be non-portable
> to BSD then.

I said systemd solves this; I didn't say it is needed.
Process group termination is one of the features gained through the kernel's 
cgroup capabilities, and any alternative cgroup manager is likely to offer it 
as well.
Other kernels might have similar or comparable capabilities.

Orphaned processes seem to be a common problem, otherwise there wouldn't have 
been the need to properly implement remedies against them.
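
Just to make the mechanism concrete: a purely hypothetical systemd user unit 
for the control process could rely on the default cgroup-based clean-up like 
this (Akonadi does not ship such a unit, this is only an illustration):

# Hypothetical user unit, for illustration of cgroup-based clean-up only.
[Unit]
Description=Akonadi control process (illustration only)

[Service]
Type=forking
ExecStart=/usr/bin/akonadictl start
ExecStop=/usr/bin/akonadictl stop
# systemd's default KillMode=control-group terminates every process left
# in the unit's cgroup when the unit is stopped or fails, so a stray
# mysqld cannot outlive its control process.
KillMode=control-group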

> But well… I hope Akonadi Next will be much leaner and not use MySQL at
> all.

Some people seem to have it working with PostgreSQL quite well already.

Cheers,
Kevin
