Re: Ré : new service: orig-check -- Debian Upstream Tarball Checker
On 15/12/25 at 10:52 +0100, William Desportes wrote:
> Hi Lucas,
>
> This is awesome!
> It was the missing piece of reproducible packaging.
Thanks!
> Checking this is quite important, as some changes can get into Debian without anyone noticing.
> I already used such a trick to add back a tests folder before the next upstream release.
>
> I picked some PHP packages to check some results:
(I reordered your points)
> - https://orig-check.debian.net/result/ddd0d6864cad4326cc353efe382dd1bd3b4443c9d29540f51c2ead713920d9f6
> -> The tarballs are not identical, but the contents are. Can you report a different error?
This should have been caught by the attempt at "normalizing" tarballs,
but this failed because the PHP tarballs are strange: they include a
package.xml file at the root.
The normalization algorithm tries to remove the leading directory
(to avoid problems with e.g. Date-1.2.3 vs php-date-1.2.3), but that
failed here because the tarball has several first path components.
I changed this part of the code to skip stripping the first path
component when there are multiple first path components, and now that
package tests successfully (after tarball normalization).
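
To make the rule concrete, here is a minimal sketch of that check. It is
not the actual orig-check code: the function name is made up for this
mail, and it assumes the input is a list of file paths (no directory
entries) taken from the tarball.

```python
from pathlib import PurePosixPath


def strip_common_top_dir(paths):
    """Drop the leading directory only if all paths share the same one.

    Hypothetical helper: 'paths' is assumed to hold file paths only.
    """
    split = [PurePosixPath(p).parts for p in paths]
    first_components = {parts[0] for parts in split if parts}
    # A file directly at the root (e.g. package.xml) is its own first
    # component, so it must also prevent stripping.
    has_root_file = any(len(parts) == 1 for parts in split)
    if len(first_components) != 1 or has_root_file:
        return sorted(paths)
    return sorted(str(PurePosixPath(*parts[1:])) for parts in split)
```

With this, Date-1.2.3/README and php-date-1.2.3/README both normalize to
README, while a tarball that also ships package.xml at the root is left
untouched.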
> - https://orig-check.debian.net/result/9ddbe16a5bc7b2951cb2d1c2d1a81e8558d400b97e561aac8700de7181944664
> -> This one has PEAR packaging applied, it should reproduce 100%, right?
> - Same for https://orig-check.debian.net/result/80129fdbf962e615e123f2ee612858107c1ba4d9509fe67ae40a102a7e9cb508
> - Same for https://orig-check.debian.net/result/87a70c491e083b365692ed799a28dcd18245ebb45474713e88590f10a9a90e6e
> - Same for https://orig-check.debian.net/result/512b6555d406ba4f4d1b9600b83a3efa83c25c371a509a866bdd1dd5b801e1e7
There's a leading php-image-text-0.7.0/ directory in the Debian tarball
(for example, README is php-image-text-0.7.0/Image_Text-0.7.0/README).
Maybe something strange happened when the upstream tarball was imported
into git?
Anyway, now they all pass with the change to normalization.
> - https://orig-check.debian.net/result/93ff5415eca576a32a7787d6edd89a00759d1c6a5839750f9715e9ee64bd3afe
> -> I am not too sure why dscverify fails. But the "debian uupdate" did seem to be a standard. I am removing it from my packages as I cannot see a difference with and without it. How was it supposed to work?
I haven't used it for ages, but my understanding is that it essentially
does what gbp does when doing gbp import-orig: it merges the current
Debian packaging work and the new upstream tarball. But there are other,
more advanced uses of uscan custom scripts, like repacking beyond what
is possible with mk-origtargz.
> - https://orig-check.debian.net/result/1e65c45f230c2d92166506b6e67fd60e250520cb3dac80759444dd77371bccb6
> -> A nice bug to catch on dscverify "Use of uninitialized value $status in pattern match (m//) at /usr/bin/dscverify line 157."
> -> Should my DM key be trusted?
> - https://orig-check.debian.net/result/b4c755e772ac980316df0c58022405bce70b1d06eddc3ce4ea4e0476f0144d02
> -> dscverify should not fail for phpmyadmin, right?
I must work a bit more on the dscverify part. Its results are ignored
currently.
> -> I am not sure why it said "290 - uscan did not produce an orig tarball with matching name". On my workstation it repacks after removing d/copyright excluded files.
The expected orig name is 'phpmyadmin_5.2.2-really+dfsg.orig.tar.xz'.
The file produced by uscan is 'phpmyadmin_5.2.2+dfsg.orig.tar.xz'.
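
For illustration, this is roughly how an expected orig name can be
derived from a Debian version. It is only a sketch, not the orig-check
code. The function names are made up, and the version string
'5.2.2-really+dfsg-1' is an assumption chosen to match the expected
name above.

```python
def upstream_version(debian_version):
    """Return the upstream part of a Debian version string."""
    # Drop an optional epoch ("N:") ...
    without_epoch = debian_version.split(":", 1)[-1]
    # ... and the Debian revision, which follows the last hyphen.
    if "-" in without_epoch:
        return without_epoch.rsplit("-", 1)[0]
    return without_epoch


def expected_orig_name(source, debian_version, compression):
    """Build <source>_<upstream version>.orig.tar.<compression>."""
    return f"{source}_{upstream_version(debian_version)}.orig.tar.{compression}"


# Assumed version string, for illustration only.
expected = expected_orig_name("phpmyadmin", "5.2.2-really+dfsg-1", "xz")
produced = "phpmyadmin_5.2.2+dfsg.orig.tar.xz"
names_match = expected == produced
```

Here names_match is False, because the '-really' part of the upstream
version does not appear in the name of the file produced by uscan.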
> - https://orig-check.debian.net/result/a087e2e6b89cc7314b3c83d1352c2b5bfbccf3471b7a4fbb4c6ac986372a3144
> -> you can see that the normalized compare shows files removed by gbp filter: https://salsa.debian.org/matomo-team/doctrine-cache/-/blob/debian/unstable/debian/gbp.conf?ref_type=heads#L5
> -> I think this might require a fix in your process. Quite a few of my packages have .git* files removed
> - I guess the same applies to https://orig-check.debian.net/result/78e971747e32dc6e7904e79fe6bb988fa568678d0eaf613ab546903658c8ff0f which is not one of mine.
> - Same https://orig-check.debian.net/result/07c57ce108e93ec9cac8e24c744eb6ce1cc8de034ef37e619bc655d4a67b990d
Ouch, that's a path that I had indeed totally missed.
I need to think a bit more about this, but maybe the solution is to add
a whole different process where I would git init, gbp import-orig/gbp
export-orig, and use the resulting tarball for comparison.
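
A lighter alternative to the full gbp round-trip might be to apply the
filter patterns from debian/gbp.conf to the upstream file list before
comparing. The sketch below is only an illustration under assumptions:
the function name is made up, and Python's fnmatch is used as an
approximation, so it does not claim to reproduce gbp's exact matching
rules.

```python
from fnmatch import fnmatchcase


def apply_filters(paths, patterns):
    """Drop paths that match any glob pattern (approximation only)."""

    def excluded(path):
        # Test the full path and each of its components, so that a
        # pattern like ".git*" also catches tests/.gitattributes.
        candidates = [path, *path.split("/")]
        return any(fnmatchcase(c, pat) for c in candidates for pat in patterns)

    return [p for p in paths if not excluded(p)]
```

For example, with the pattern '.git*', both .gitignore and
.github/workflows/ci.yml are dropped from the upstream side, so they no
longer show up as differences.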
> There are quite a lot of "version not found" results on
> https://orig-check.debian.net/statistics Some of them are just very
> old tags that cannot be found by the current uscan. Maybe it would
> be worth tricking uscan by giving it a direct GitHub tag URL, so that
> it would find the orig tarball?
If I understood correctly, uscan with mode=git now tries harder to look
for older tags, at least for GitHub? That's probably the way forward.
> And for some of the scan failures ("504 Gateway Time-out"), could you detect them for github.com and retry a few seconds later? For example for shotcut.
I reprocessed all packages with 5XX errors. I'm not sure yet about a
retry policy, but yes that's something that should be added.
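
As a starting point, such a policy could look like the sketch below.
Nothing like this exists in orig-check yet. The callable 'run' is
hypothetical and stands for one scan attempt returning (ok, message),
and matching a 5XX code in the message is a crude heuristic.

```python
import re
import time

# Crude heuristic: any standalone 5XX number, e.g. "504 Gateway Time-out".
TRANSIENT = re.compile(r"\b5\d\d\b")


def run_with_retry(run, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call run() again after transient failures, with exponential backoff.

    run() is a hypothetical callable returning (ok, message).
    """
    ok, message = False, ""
    for attempt in range(attempts):
        ok, message = run()
        # Stop on success or on a failure that does not look transient.
        if ok or not TRANSIENT.search(message):
            return ok, message
        # Wait 5s, 10s, ... before the next attempt, but not after the last.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return ok, message
```

The delays and the number of attempts are arbitrary defaults. A "404 Not
Found" is returned immediately, since retrying it would not help.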
So, to summarize, the to-do items on my side are:
- uupdate support
- dscverify
- gbp-level filters
Thanks for the feedback!
Lucas