
Re: DebianBug links (was Re: Complex conversion issues)



On Sat, Jul 26, 2025 at 09:29:39PM +0800, Maytham Alsudany wrote:
> On Sat, 2025-07-26 at 13:48 +0100, Andrew Sayers wrote:
> > On Sat, Jul 26, 2025 at 11:56:30AM +0800, Maytham Alsudany wrote:
> > > On Fri, 2025-07-25 at 15:21 +0100, Andrew Sayers wrote:
> > [...]
> > > > ## DebianBug links
> > > > 
> > > > https://wiki.debian.org/htdocs/bugstatus.js seems to handle [[DebianBug:]],
> > > > [[https://bugs.debian.org/]], and apparently launchpad as well.
> > > > The obvious MediaWiki equivalent is a gadget[2].
> > > [...]
> > > 
> > > The easiest way I found to do this (where this = adding a strikethrough
> > > to closed bugs) is to run a very small service[9] to check the BTS and
> > > render the necessary HTML, then add an InterWiki connection for that
> > > service with "scary transclusion"[10] enabled. I've tested it on my
> > > local MW instance and it works perfectly. MW also caches it, so it
> > > doesn't contact the server on every page load.
> > > 
> > > This also opens up new opportunities for things like incorporating
> > > dynamic package data -- who knows?
> > 
> > That's a really interesting approach.  Some thoughts...
> > 
> > https://wiki.debian.org/htdocs/bugstatus.js calls
> > https://wiki.debian.org/cgi-bin/bugstatus?bug=<number>
> > I'm not sure where to look for the source of that script, but I assume it
> > uses the Debbugs SOAP interface[11], and presumably caches results?
> 
> https://salsa.debian.org/debian/wiki.debian.org/-/blob/master/bin/bugstatus
> You are correct in that it uses the SOAP interface, but it doesn't cache
> results.

If the current solution doesn't cache, we're presumably looking to replace
a solution that hits the BTS every time someone visits a page with a BTS
link on it.  That suggests...

a) users will see any caching as a regression
b) the BTS admins will see any caching as an improvement
c) if we have to give up and use a JS solution, browsers could
   contact the BTS directly, avoiding the need for the Perl script?

> > Speaking of caching, it would be nice to have a solution that's updated
> > regularly without spamming the BTS server, but I don't see a
> > "bugs updated since <date>" request in the SOAP interface.
> > Anyone object to me submitting a wishlist bug against debbugs?
> > Or am I better off asking on #debbugs instead?

I wasn't aware of the UDD before, but it answers my question with e.g.:

    https://udd.debian.org/bugs/?merged=ign&fnewerval=7&flastmodval=7&rc=1&sortby=last_modified&sorto=desc&format=json
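
As a rough sketch of how a polling script might use that query, the Python
below rebuilds the same URL and picks out the bugs we care about from a
response. I haven't checked what each query parameter means, so they are copied
verbatim from the URL above. The response shape (a JSON list of objects with an
"id" field) is an assumption, and `changed_bugs` and `tracked` are names made
up for illustration.

```python
import json
from urllib.parse import urlencode

UDD_BUGS_URL = "https://udd.debian.org/bugs/"


def recent_bugs_url():
    """Rebuild the example query above; parameters are copied verbatim."""
    params = {
        "merged": "ign",
        "fnewerval": 7,
        "flastmodval": 7,
        "rc": 1,
        "sortby": "last_modified",
        "sorto": "desc",
        "format": "json",
    }
    return UDD_BUGS_URL + "?" + urlencode(params)


def changed_bugs(udd_json, tracked):
    """Return the tracked bug numbers that appear in a UDD response.

    ASSUMPTION: the response is a JSON list of objects, each with an "id".
    """
    records = json.loads(udd_json)
    seen = {int(record["id"]) for record in records}
    return sorted(seen & set(tracked))
```

A polling script would fetch `recent_bugs_url()` on a timer and pass the body
to `changed_bugs()` along with the bug numbers the wiki already links to.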

> > 
> > maytham explained on IRC that scary transclusion has a setting to control
> > how often MW polls the service[12].  That seems like a good plan if we have
> > to use the existing debbugs API, but if debbugs is upgraded to list recent
> > changes, it would be nice to push those to the site a bit faster.
> > How about a solution like this:
> > 
> > 1. when the service is queried, it returns the result then edits
> >    Template:Debbugs/<number> with the same result
> > 2. the service polls debbugs every 60 seconds for recent updates,
> >    and updates any existing Template:Debbugs/<number>
> > 3. Template:DebianBug uses Template:Debbugs/<number> if it exists,
> >    or else scary-transcludes the service
> > 
> > ... which would update links within a minute, without putting much load
> > on either the BTS or wiki servers.
> 
> Wouldn't that just push caching away from the wiki's builtin system to
> the wiki pages? Also seems like it would cause *more* traffic by
> checking debbugs every 60 seconds, when MediaWiki only fetches
> information on demand and caches information for up to 1 hour (by
> default).

I'm not sure I understand the distinction - surely wiki pages *are* the wiki's
builtin caching system?

For clarity, here are some terms I'll try to use consistently in this thread:

* a "pull-based solution" is something like scary transclusion, which 
  makes one small request per record per hour (or day, or whatever)
* a "polling-based solution" is something like the script I'm proposing,
  which makes one big request total per minute (or hour or whatever)
  and scans that request for matches
* a "push-based solution" would be something like a webhook where the remote
  server notifies us when events occur

A pull-based solution would check at most once per hour per bug, whereas a
polling-based solution would make exactly one request site-wide per minute,
whether or not anything has changed.  So it's not immediately obvious which
would generate more total traffic.
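
To put rough numbers on that comparison, here is a back-of-envelope sketch in
Python. It counts requests only, not bytes (a polling request is bigger), and
it takes the worst case for pull, where every cached bug expires and is
re-fetched once per cache period. The function names are made up for
illustration.

```python
def pull_requests_per_hour(unique_bugs_viewed, cache_hours=1.0):
    """Upper bound for pull: each bug is fetched at most once per cache period."""
    return unique_bugs_viewed / cache_hours


def polling_requests_per_hour(poll_interval_seconds=60):
    """Polling makes one site-wide request per interval, regardless of views."""
    return 3600 / poll_interval_seconds


def break_even_bugs(cache_hours=1.0, poll_interval_seconds=60):
    """Number of unique bugs at which both approaches make equal requests."""
    return polling_requests_per_hour(poll_interval_seconds) * cache_hours
```

With a one-hour cache and a one-minute poll, `break_even_bugs()` comes out at
60: below 60 unique bugs viewed per hour, pull makes fewer requests, and above
that polling does.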

You mentioned before that you had access to the current wiki's log
files - could you look for requests to /cgi-bin/bugstatus to see how much the
current wiki is accessing the BTS, and how many unique bugs it's asking about?
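
In case it helps, something like the Python below should be enough to get both
numbers. It only assumes that each log line contains the request URL in the
form /cgi-bin/bugstatus?bug=<number>, as bugstatus.js calls it. The function
name is made up, and I don't know what the real log files look like.

```python
import re
from collections import Counter

# Matches the request made by bugstatus.js and captures the bug number.
BUGSTATUS_RE = re.compile(r"/cgi-bin/bugstatus\?bug=(\d+)")


def bugstatus_stats(log_lines):
    """Return (total requests, unique bugs) for bugstatus hits in the log."""
    counts = Counter()
    for line in log_lines:
        match = BUGSTATUS_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return sum(counts.values()), len(counts)
```

Passing it an open log file (or any iterable of lines) returns the two totals
as a tuple.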

> 
> The BTS handles high traffic well, so I don't think this is an issue
> that needs to be handled, as well as the fact caching already happens.
> 
> The caching period can be decreased, and InterWiki comes with a
> maintenance script to clear all of its cache if needed.

Reducing the cache time to one minute would work as well as polling once per
minute, but I would expect it to trigger frequent re-renders of those pages.
If we start tweaking the timeout, we should keep an eye on MW server load.

> > Finally, how about making the template returned by the service look like:
> > 
> > {{
> >    {{{1}}}
> >    |summary=...
> >    |pending=...
> >    |id=...
> >    |severity=...
> >    ...
> > }}
> > 
> > You could then call it like `{{raw:wiki:debbugs|<number>|MyHandler}}`,
> > which would in turn call Template:MyHandler with the relevant parameters.
> 
> Do you mean add the ability to fetch different parameters from the bug
> system? I don't think this is necessary since no wiki pages currently do
> this and it would only duplicate information. Except maybe the bug title
> can be in the link text when an option is passed?

Having now looked at External Data, I realise I was suggesting a homebrew
version of #display_external_table - let's look at that instead :)

> Yet another interesting possibility that doesn't require running another
> service is the External Data[13] extension, which can pretty much
> achieve the same thing by accessing the BTS SOAP API directly and
> fetching information. It also supports caching[14] and allows for
> different caching expiry times for different URLs and hosts.
> 
> It can even fetch data from databases[15], which opens the possibility
> for querying the UDD mirror (which is a PostgreSQL database).
> 
> We can do some really cool stuff with this :)

Agreed!

External Data is still a pull-based solution, but if circumstances conspired to
need polling (or even pushing), we'd just make a mechanism to purge individual
page caches when values changed.
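
A minimal sketch of that purge mechanism, assuming we go through the standard
MediaWiki action API (action=purge, which has to be a POST). The function name
and the api.php URL are placeholders. I haven't verified whether purging a page
also bypasses External Data's own cache, so treat this as a starting point only.

```python
from urllib.parse import urlencode
from urllib.request import Request


def purge_request(api_url, titles):
    """Build (but do not send) a POST request that purges the given pages."""
    data = urlencode({
        "action": "purge",
        "titles": "|".join(titles),  # the API takes pipe-separated titles
        "format": "json",
    }).encode("utf-8")
    return Request(api_url, data=data, method="POST")
```

The polling script would pass the result to urllib.request.urlopen() for
whichever pages link to a bug that has changed.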

At a pinch, it might even be possible to get data *out* of MediaWiki this way.
For example, I previously mentioned creating a table of ToDo items.
Consider a function call like:

    {{#external_value:Anchor|source=https://wiki.debian.org/todos?page={{PAGENAME}}&text={{{1}}}}}

... an external service could add the text to an internal database,
return an anchor to display on the page, then provide a list of
ToDo items with links from some other query.

That solution couldn't update the external service when the ToDo item is
deleted (because there's no {{#external_value:}} call left).  I've always let
Cargo solve that by magic, but presumably there's an extension or something
that would let us push that kind of page update to an external server.

> 
> --
> Maytham
> 
> > > > [1] https://www.mediawiki.org/wiki/Manual:Interwiki
> > > > [2] https://www.mediawiki.org/wiki/Extension:Gadgets
> > > > [3] https://www.mediawiki.org/wiki/Extension:Emoticons
> > > > [4] https://www.mediawiki.org/wiki/Manual:Table_of_contents#Depth
> > > > [5] https://www.mediawiki.org/wiki/Extension:SyntaxHighlight
> > > > [6] https://www.mediawiki.org/wiki/Template:Hint
> > > [7]  https://wiki2025.debian.org/wiki/Template:DebianIRC
> > > [8]  https://wiki2025.debian.org/wiki/User:Maytha8
> > > [9]  https://salsa.debian.org/Maytha8/iwservice
> > > [10] https://www.mediawiki.org/wiki/Manual:$wgEnableScaryTranscluding
> > [11] https://wiki.debian.org/DebbugsSoapInterface
> > [12] https://www.mediawiki.org/wiki/Manual:$wgTranscludeCacheExpiry
> [13] https://www.mediawiki.org/wiki/Extension:External_Data
> [14] https://www.mediawiki.org/wiki/Extension:External_Data/Caching_data
> [15] https://www.mediawiki.org/wiki/Extension:External_Data/Databases

