
Re: email backend for fedmsg



On Tue, 24 Mar 2020 at 22:45, Nicolas Dandrimont <olasd@debian.org> wrote:
>
> On Tue, Mar 24, 2020, at 21:51, clime wrote:
> > On Tue, 24 Mar 2020 at 20:40, Nicolas Dandrimont <olasd@debian.org> wrote:
> > >
> > > Hi!
> > >
> > > On Sun, Mar 22, 2020, at 13:06, clime wrote:
> > > > Hello!
> > > >
> > > > Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html -
> > > > fedmsg usage in Debian.
> > > >
> > > > There is a note: "it seems that people actually like parsing emails"
> > >
> > > This was just a way to say that fedmsg never got much of a user base in the services that run on Debian infra, and that even the new services introduced at the time kept parsing emails.
> >
> > Hello Nicolas!
> >
> > Do you remember some such service and how it used email parsing specifically?
>
> I believe that tracker.debian.org was introduced around that time.
>
> At the point it was created, tracker.d.o was mostly consuming emails from packages.debian.org to update its data. These days tracker.d.o has replaced packages.d.o as "email router", in that it receives all the mails from services (e.g. the BTS, the archive maintenance software, buildds, salsa webhooks, ...) and forwards them to the public.
>
> > I am still a bit unclear how email parsing is used in Debian
> > infrastructure, don't get me wrong, I find it elegant
>
> Ha. I find that it's a big mess.
>
> Here's the set of headers of a message I received today from tracker.d.o, which are supposed to make parsing these emails better:
>
> X-PTS-Approved: yes
> X-Distro-Tracker-Package: facter
> X-Distro-Tracker-Keyword: derivatives
> X-Remote-Delivered-To: dispatch@tracker.debian.org
> X-Loop: dispatch@tracker.debian.org
> X-Distro-Tracker-Keyword: derivatives
> X-Distro-Tracker-Package: facter
> List-Id: <facter.tracker.debian.org>
> X-Debian: tracker.debian.org
> X-Debian-Package: facter
> X-PTS-Package: facter
> X-PTS-Keyword: derivatives
> Precedence: list
> List-Unsubscribe: <mailto:control@tracker.debian.org?body=unsubscribe%20facter>
>
> I'll leave you to judge whether this makes sense or not.
>
> (and it turns out that the actual useful payload was just plaintext with no real chance of automated parsing)
>
> > but from what I have found (e.g. reportbug), in the beginning there is an
> > email being sent by some human which will then trigger some automatic
> > action (e.g. putting the bug into db). So it's like you could do all
> > your work simply by sending emails (some of them machine-parsable).
> >
> > So do you have the opposite? I do some clicking action somewhere and
> > it will send an email to a certain mailing list to inform human
> > beings? Or let's not just clicking but e.g. `git push` (something that
> > you can still do from command line).
> >
> > Do you have: I do some clicking action somewhere and it will send an
> > email to a certain mailing list where the email is afterward parsed by
> > another service which will do an action (e.g. launch a build) based on
> > it?
>
> Both of these are somewhat true.
>
> Some examples of email-based behaviors:
>  - Our bug tracking system is fully controlled by email.
>  - Closing a bug in reaction to an upload is done by an email from the archive maintenance system (dak) to the bug tracking system.
>  - Salsa has a webhook service that reacts to UI clicks (e.g. "clicking the merge button") by sending an email to the BTS (e.g. to tag bugs as pending), or to tracker.d.o (for new commit notifications).
>  - Some of our IRC bots are triggered by procmail rules.
>  - At some point mentors.debian.net depended on a NNTP gateway to the debian-devel-changes mailing list to trigger removal of superseded packages (...)
>  - etc. etc.
>
> I'm still not sure where your trail of questions is going? fedmsg in Debian has been dead for years at this point, and there still doesn't seem to be much interest to implement anything beyond email parsing in some of our core systems.

Cool, so basically what I am thinking about is turning what you are
describing into free software, i.e. creating reusable tooling out of
the Debian messaging system: something that a new Linux distribution
can easily start using to connect its services.

I didn't know Debian infra works like this but I find it very
elegant/efficient and I would like the solution you have to be
reusable by others.

So basically the tooling should contain:
- a unified email message format
- a library that can translate a message into a native data
structure (e.g. a dictionary in Python)
- an email receiver that listens for emails coming from the bus and
emits events based on them (this could be part of the library, so
you could attach a callback for incoming messages or just do
blocking waits)
- an email publisher: something that can send a new message into the
bus, i.e. to a preconfigured mail server (a "broker" or "hub")
- a mail server with an HTTP API to manage topic subscriptions (i.e.
add/delete me from a given topic); it would receive a message from a
publisher for a given topic, find out who is subscribed to it,
duplicate the message for each consumer, and send it to them
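
To make the first two items a bit more concrete, here is a minimal
sketch in Python using only the standard library. Everything specific
in it is an assumption of mine, not an existing format: the
"X-Bus-Topic" header name, the JSON body, and the function names
build_message and message_to_dict are all hypothetical.

```python
import json
from email import message_from_string, policy
from email.message import EmailMessage

# Hypothetical header carrying the topic; not an existing convention.
TOPIC_HEADER = "X-Bus-Topic"


def build_message(topic, payload, sender, recipient):
    """Wrap a JSON-serialisable payload in an email for the bus."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = topic
    msg[TOPIC_HEADER] = topic
    # The body is plain text holding a JSON document (an assumption).
    msg.set_content(json.dumps(payload))
    return msg


def message_to_dict(raw):
    """Translate the raw text of a bus email into a plain dictionary."""
    msg = message_from_string(raw, policy=policy.default)
    return {
        "topic": str(msg[TOPIC_HEADER]),
        "sender": str(msg["From"]),
        "payload": json.loads(msg.get_content()),
    }


if __name__ == "__main__":
    # Addresses, topic and payload are made up for illustration.
    raw = build_message(
        "org.example.upload",
        {"package": "facter"},
        "dak@example.org",
        "hub@example.org",
    ).as_string()
    print(message_to_dict(raw))
```

A consumer would then only ever deal with the dictionary, and a human
could still read the same message in an ordinary mail client.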

For the mail server I am thinking about https://www.courier-mta.org/
and using https://www.courier-mta.org/maildropgdbm.html for
subscription management.
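
The fan-out step of the hub could look roughly like the sketch below.
It is only an illustration under assumptions: the in-memory dictionary
stands in for whatever subscription store ends up being used, the HTTP
API and the actual SMTP delivery are left out, and the Hub class and
the "X-Bus-Topic" header are hypothetical names.

```python
from collections import defaultdict
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical header carrying the topic; not an existing convention.
TOPIC_HEADER = "X-Bus-Topic"


class Hub:
    """Toy subscription hub: topic -> set of subscriber addresses."""

    def __init__(self):
        # In-memory stand-in for a persistent subscription database.
        self._subscribers = defaultdict(set)

    def subscribe(self, topic, address):
        self._subscribers[topic].add(address)

    def unsubscribe(self, topic, address):
        self._subscribers[topic].discard(address)

    def fan_out(self, msg):
        """Return one copy of msg per subscriber of its topic."""
        topic = msg[TOPIC_HEADER]
        if topic is None:
            return []  # no topic header, nothing to route
        copies = []
        for address in sorted(self._subscribers.get(str(topic), ())):
            # Re-parse the serialised message to get an independent copy.
            dup = message_from_bytes(msg.as_bytes(), policy=policy.default)
            del dup["To"]
            dup["To"] = address
            copies.append(dup)
        return copies


if __name__ == "__main__":
    hub = Hub()
    hub.subscribe("org.example.upload", "builder@example.org")
    hub.subscribe("org.example.upload", "human@example.org")

    incoming = EmailMessage()
    incoming["From"] = "dak@example.org"
    incoming["To"] = "hub@example.org"
    incoming[TOPIC_HEADER] = "org.example.upload"
    incoming.set_content("{}")

    for copy_ in hub.fan_out(incoming):
        print(copy_["To"])
```

Each returned copy would then be handed to the mail server for
delivery; a real hub would also need loop protection and some form of
authentication for the subscription requests.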

Basically, I thought this could be a new "email backend" in fedmsg
instead of the zeromq one...

I am not very familiar with email technology but I like the idea because:
- if you can set up email for people, you already have the skills to
set it up for services, and vice versa
- one of the communicating agents may be a human being who watches
what is going on in the system by having a dedicated inbox folder for
each type of event (topic) - no amqp/zeromq/mqtt -> email translation
is needed here - everything is just email (except for the irc messages
emitted based on those)
- I think this can be optimized to work very reliably inside one
infrastructure (e.g. debian.org) but at the same time it is easy for
an outside listener to join in with his/her own service and start
acting on Debian events (if the subscription hub is public)
- it uses the most standard and compatible protocol possible (SMTP), so
it shouldn't be an opinionated technology - theoretical message
throughput will be limited because of that (I suspect SMTP is not
extremely fast) but it should still be sufficient to handle all the
distribution events

I am still exploring ideas for a federated message bus, and this is
one of them. Please take this as wild brainstorming; maybe I should
have given this more time to settle in my head but, on the other hand,
I won't mind being pwned too much here.
clime

>
> Bye,
> --
> Nicolas Dandrimont

