
Re: new in UDD: duck importer (URL checker)



Hi Soren,

On 11/06/25 at 09:08 -0700, Soren Stoutner wrote:
> On Tuesday, June 10, 2025 11:52:13 PM Mountain Standard Time Lucas Nussbaum 
> wrote:
> > # statistics
> > 
> > * UDD knows about 207055 URLs
> > * 15207 URLs (7.34%) are failing
> > * 937 (2.37%) source packages failed to be processed by duck (I need to
> >   look into that)
> 
> I noticed that it is trying to check upstream metadata Repository URLs (which, 
> in some cases, are not expected to return a result to a standard web request).
> 
> For example, see the entry for privacy browser:
> 
> https://udd.debian.org/duck/?email1=soren%40debian.org&email2=&email3=&packages=&ignpackages=&format=html#results
> 
> It doesn’t like:
> 
> Repository: https://git.stoutner.com/PrivacyBrowserPC.git
> 
> But it shouldn’t be able to access that unless it is using the git protocol.
> 
> The Repository-Browse URL works as expected:
> 
> Repository-Browse: https://gitweb.stoutner.com/?p=PrivacyBrowserPC.git;a=summary
> 
> https://salsa.debian.org/soren/privacybrowser/-/blob/master/debian/upstream/metadata?ref_type=heads#L7

Thanks for the feedback.

I improved the URL tester to handle the Git protocol: for Vcs-Git URLs,
for example, it now checks that it is talking to a valid Git repository.
I also added an override for the Repository field of upstream metadata,
so that it is checked the same way when its value looks like a Git
repository.
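
For illustration only, here is a minimal Python sketch of what such a
check could look like. It is not the actual duck or UDD code. The
function names, the ".git"-suffix heuristic, and the use of
"git ls-remote" are all assumptions made for this example.

```python
import os
import subprocess
from urllib.parse import urlparse


def looks_like_git_url(url: str) -> bool:
    """Hypothetical heuristic: guess whether a URL points at a Git repository.

    Assumption for this sketch: a git:// scheme, or a path ending in ".git".
    """
    parsed = urlparse(url)
    return parsed.scheme == "git" or parsed.path.rstrip("/").endswith(".git")


def is_valid_git_repo(url: str, timeout: int = 30) -> bool:
    """Return True if `git ls-remote` can list refs at `url` (needs network)."""
    # Only accept URLs with an expected scheme, so the argument passed to
    # git can never be mistaken for a command-line option.
    if urlparse(url).scheme not in ("http", "https", "git"):
        return False
    # Disable interactive credential prompts so the check cannot hang.
    env = {**os.environ, "GIT_TERMINAL_PROMPT": "0"}
    try:
        result = subprocess.run(
            ["git", "ls-remote", "--quiet", url],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            env=env,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # git is not installed, or the remote did not answer in time.
        return False
    return result.returncode == 0
```

With this heuristic, the Repository URL quoted above would be checked as
a Git repository, while the Repository-Browse URL would still go through
the ordinary HTTP check, because its ".git" appears only in the query
string and not in the path.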

It would have been better to name the field "Repository-Git" or
something similar, to avoid having to guess the type of repository from
the URL, but it's probably too late for that.

Lucas

