Bug#1042947: UDD: create a duck importer



Hi Lucas,

On 2023-08-03 10:30, Lucas Nussbaum wrote:
> duck-as-a-service (duck.debian.net) has been broken for a long time, and
> the corresponding UDD importer is broken as well (see #949009, #963887).
> In the meantime, duck continued evolving (was rewritten?) and is now
> checking a lot more places for URLs.
> 
> It would probably be useful to re-create a way to provide duck results
> as a service, based on UDD, similarly to what is done for upstream or
> lintian data.
> 
> Ideally, this would be done in cooperation with the duck maintainer to
> do the following changes:
> - in duck, separate the logic to get URLs from sources, from the logic
>   to check those URLs (for example, allow dumping a list of URLs, and
>   also using a list of URLs as source)
> - in duck, provide machine-readable outputs (JSON?)

Currently duck has two features which can help us:

- The `-n` switch, which gets all URLs and prints them to stdout
- The `-l filename` switch, which takes a file with one URL per line
and checks them

Theoretically, the only thing missing is a `--json` switch, which would
change the output from console/text to JSON.

But, as I see it, the `-l` argument is limited in two aspects:

- It provides only the URL, losing the checker type, which is used to
select what kind of validation will be performed.

  For instance, https://salsa.debian.org/rfrancoise/tmux.git of type
VCS-Git would be tested as a standard URL in the `-l` context, instead
of as a Git repository.

- It requires a file

I'm thinking of implementing a new JSON-specific input format
(`--input-json`?), carrying both pieces of information (type and URL),
which would be read from stdin instead of a file.

The format would be as simple as:

```json
[
   {"type": "VCS-Git",
    "url": "https://salsa.debian.org/rfrancoise/tmux.git",
    "filename": "debian/control",  # optional key
    "line_number": 10},            # optional key
   ...
]
```
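
As a rough sketch of how a consumer could read this format (Python, and
purely illustrative: `load_entries` and `REQUIRED_KEYS` are hypothetical
names, not existing duck code, and `--input-json` is only a proposal at
this point), the entries can be parsed and the two mandatory keys
validated like this:

```python
import io
import json

# Keys every entry must carry; "filename" and "line_number" stay optional.
REQUIRED_KEYS = {"type", "url"}


def load_entries(stream):
    """Parse a JSON array of URL entries and validate each one."""
    entries = json.load(stream)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of entries")
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index}: expected an object")
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"entry {index}: missing keys {sorted(missing)}")
    return entries


# In the proposed workflow the stream would be sys.stdin; an in-memory
# stream is used here so the sketch runs on its own.
sample = io.StringIO(
    '[{"type": "VCS-Git",'
    ' "url": "https://salsa.debian.org/rfrancoise/tmux.git",'
    ' "filename": "debian/control", "line_number": 10}]'
)
entries = load_entries(sample)
print(entries[0]["type"], entries[0]["url"])
```

Because the type travels with the URL, a checker could then dispatch on
`entry["type"]` rather than treating everything as a plain URL.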

Following this logic, the JSON output would use the same format, so
that `duck --json -n | duck --input-json` works.

The JSON result would hold an additional dictionary for each URL
entry, named "result", described as follows:

```json
[
   {"type": "VCS-Git",
    "url": "https://salsa.debian.org/rfrancoise/tmux.git",
    "filename": "debian/control",  # optional key
    "line_number": 10,             # optional key
    "result": {
       "state": 0,  # 0 for OK, 1 for Error, 2 for Information
       "detail": "Informative message",
       "certainty": "possible"     # optional key
   }},
   ...
]
```
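
To make the shape of "result" concrete, here is a minimal sketch in
Python. The `State` enum and the `with_result` helper are hypothetical
names used only for illustration; the numeric values follow the
0/1/2 convention described above.

```python
import json
from enum import IntEnum


class State(IntEnum):
    """Numeric states as proposed: 0 OK, 1 Error, 2 Information."""
    OK = 0
    ERROR = 1
    INFORMATION = 2


def with_result(entry, state, detail, certainty=None):
    """Return a copy of entry with a "result" dictionary attached."""
    result = {"state": int(state), "detail": detail}
    if certainty is not None:
        # "certainty" is optional, so it is only emitted when provided.
        result["certainty"] = certainty
    return {**entry, "result": result}


entry = {
    "type": "VCS-Git",
    "url": "https://salsa.debian.org/rfrancoise/tmux.git",
}
checked = with_result(entry, State.OK, "Informative message",
                      certainty="possible")
print(json.dumps([checked], indent=2))
```

The input entry is copied rather than modified, so any keys it already
carries (including the optional ones) pass through to the output
unchanged.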

Let me know what you think of it.

> Then UDD could process source packages to extract URLs, check those URLs
> on a regular basis (similarly to what is done for lintian), and
> publish/export the results in all relevant places.

Best,
-- 
Baptiste Beauplat
