
Bug#1120799: release.debian.org: CI job scheduling is not aware of transitions



Package: release.debian.org
Severity: normal
X-Debbugs-Cc: elbrus@debian.org

The CI jobs scheduled by britney don't take transitions into account, so not all affected packages are pulled from unstable.

As recently discussed in the gdal transition bugreport [0], CI job scheduling needs to be improved to do the right thing for packages like libgdal-grass, which have multiple dependencies affected by ongoing transitions.

The attached script is a proof of concept that implements the algorithm I suggested to solve this issue.

It parses the list of dependencies for the autopkgtests and expands this list to include all transitive dependencies.
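The expansion step can be sketched without UDD as a plain graph walk. This is only an illustration: the dependency map below is made up and heavily truncated, and expand_dependencies is a hypothetical helper, not a function from the attached script.

```python
from collections import deque


def expand_dependencies(direct, depends_map):
    """Return the direct dependencies plus everything reachable from them."""
    seen = set(direct)
    queue = deque(direct)

    while queue:
        package = queue.popleft()
        # Packages missing from the map are treated as having no dependencies.
        for dependency in depends_map.get(package, []):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(dependency)

    return sorted(seen)


# Made-up, truncated dependency data for illustration only.
DEPENDS = {
    'libgdal-grass': ['libgdal-dev', 'grass-core'],
    'grass-core': ['libgdal-dev', 'libc6'],
    'libgdal-dev': ['libc6'],
}

closure = expand_dependencies(['libgdal-grass'], DEPENDS)
print(closure)
```

Tracking visited packages in a set keeps the walk terminating even when the dependency graph contains cycles.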

The list of affected source packages in the transition for the trigger package is parsed from the Ben JSON output files. The state of the transition is determined by the location of the .ben file (ongoing & finished subdirectories).
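A minimal sketch of the file name matching, assuming the directory listings are passed in as a dictionary instead of being read from disk. The helper names and the listing are hypothetical; the pattern mirrors the one in the script, with re.escape() added so that source names containing '+' or '.' are matched literally.

```python
import re


def transition_pattern(trigger, extension='ben'):
    # Matches e.g. gdal.ben, gdal3.12.ben and auto-gdal.ben for trigger 'gdal'.
    return re.compile(
        r'^(?:auto-)?' + re.escape(trigger) +
        r'(?:\d+\S+)?\.' + re.escape(extension) + r'$'
    )


def transition_state(trigger, listing):
    """Return the subdirectory name holding a tracker for trigger, or None."""
    pattern = transition_pattern(trigger)

    for state in ('ongoing', 'finished'):
        if any(pattern.match(name) for name in listing.get(state, [])):
            return state

    return None


# Hypothetical directory contents for illustration only.
LISTING = {
    'ongoing': ['auto-perl.ben'],
    'finished': ['gdal3.12.ben'],
}

print(transition_state('gdal', LISTING))
print(transition_state('perl', LISTING))
print(transition_state('grass', LISTING))
```

Unlike has_ongoing_transition() in the script, this variant returns the state name rather than a boolean, so a caller could treat ongoing and finished transitions differently.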

For the transitive test dependencies, if their source package is affected by the transition, that source package is pinned in the CI job.
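The pinning rule boils down to a set intersection. A rough sketch, assuming the binary to source mapping is already available as a dictionary (the script looks it up in UDD); all data and the sources_to_pin name are made up for illustration.

```python
def sources_to_pin(test_dependencies, binary_to_source, affected_sources):
    """Return the affected source packages that build any test dependency."""
    affected = set(affected_sources)

    pinned = {
        binary_to_source[package]
        for package in test_dependencies
        # Binaries without a known source are skipped.
        if binary_to_source.get(package) in affected
    }

    return sorted(pinned)


# Made-up mapping and transition data for illustration only.
BINARY_TO_SOURCE = {
    'grass-core': 'grass',
    'libc6': 'glibc',
    'libgdal-dev': 'gdal',
    'libgdal-grass': 'libgdal-grass',
}
AFFECTED_SOURCES = ['grass', 'libgdal-grass', 'qgis']
TEST_DEPENDENCIES = ['grass-core', 'libc6', 'libgdal-dev', 'libgdal-grass']

pins = sources_to_pin(TEST_DEPENDENCIES, BINARY_TO_SOURCE, AFFECTED_SOURCES)
print(pins)
```

The trigger itself (gdal here) is not part of the affected list, because it is always pinned separately.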

The debci API is not used currently; the output is printed so it can be copy/pasted into the self-service form, which should suffice for a POC.

The dependency resolution currently uses UDD, because I haven't yet figured out how to use python-apt's apt_pkg to do a Trivial-Only run in a forky chroot on a trixie system. Pointers are very welcome.

Example output for two different packages from the recent gdal transition:

 CI Job Request

 Package Name: libgeo-gdal-ffi-perl
 Suite: testing
 Trigger: gdal/3.12.0+dfsg-1

 Pin Packages:
 package_version: gdal/3.12.0+dfsg-1
 package: gdal
 src:gdal, unstable
 src:libgeo-gdal-ffi-perl, unstable
 
 Extra APT Sources:
 unstable
 testing

This only pulls gdal & libgeo-gdal-ffi-perl from unstable because none of the transitive dependencies were also involved in the transition.

 CI Job Request

 Package Name: libgdal-grass
 Suite: testing
 Trigger: gdal/3.12.0+dfsg-1

 Pin Packages:
 package_version: gdal/3.12.0+dfsg-1
 package: gdal
 src:gdal, unstable
 src:grass, unstable
 src:libgdal-grass, unstable

 Extra APT Sources:
 unstable
 testing

Here all three affected packages are pulled from unstable, which is the only combination that will work, as discussed in all the recent gdal transition bugreports.

I guess when dependency resolution is implemented with python-apt, this should get incorporated in britney2?

How difficult is it to set up a test instance for that? setting-up-britney.rst doesn't look very daunting, but I suspect it leaves out a lot of details.

Before I spend more time on this, I'd like to hear what you think of this approach.

[0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1120361#53

Kind Regards,

Bas
#!/usr/bin/python3 -u

import argparse
import copy
import json
import os
import pprint
import re
import subprocess
import sys

import apt_pkg
import psycopg2
import psycopg2.extras
from debian.deb822 import Deb822
from debian.debian_support import AptPkgVersion


args = None
db = None
cursor = None
releases = {}


def execute_command(cmd, stdin=None):
    if args.verbose:
        print("Executing: %s" % ' '.join(cmd))

    process = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )

    (stdout, stderr) = process.communicate(stdin)

    if args.debug:
        print("returncode: %s" % process.returncode)
        if stdout:
            print("stdout:\n%s" % stdout.decode())
        if stderr:
            print("stderr:\n%s" % stderr.decode())

    return (process, stdout, stderr)


def execute_query(query, param=None):
    global cursor

    if args.debug:
        print("query:\n%s" % query)
        print("param:\n%s" % pprint.pformat(param))

    if param is not None:
        cursor.execute(query, param)
    else:
        cursor.execute(query)

    return cursor


def db_close():
    global db
    global cursor

    if args.verbose:
        print(
            "Closing database connection to %s on %s:%s" % (
                args.db_name,
                args.db_host,
                args.db_port,
            )
        )

    cursor.close()
    db.close()


def db_connect():
    global db
    global cursor

    if args.verbose:
        print(
            "Connecting to %s on %s:%s as %s" % (
                args.db_name,
                args.db_host,
                args.db_port,
                args.db_user,
            )
        )

    db = psycopg2.connect(
        dbname=args.db_name,
        host=args.db_host,
        port=args.db_port,
        user=args.db_user,
        password=args.db_pass,
        cursor_factory=psycopg2.extras.DictCursor,
    )

    cursor = db.cursor()

    return (db, cursor)


def upstream_version_from_revision(version):
    pkg_version = AptPkgVersion(version)

    if args.debug:
        print("version: %s" % version)
        print("upstream_version: %s" % pkg_version.upstream_version)

    return pkg_version.upstream_version


def strip_version(package_version):
    package = re.sub(
        r'/\S+$',
        '',
        package_version,
    )

    if args.debug:
        print("package_version: %s" % package_version)
        print("package: %s" % package)

    return package


def get_affected_dependencies(
    test_dependencies,
    affected_sources,
    release='testing',
):
    global cursor

    affected_dependencies = []

    for package in test_dependencies:
        query = """
            SELECT source
              FROM packages
             WHERE distribution = 'debian'
               AND release = %(release)s
               AND architecture IN ('amd64', 'all')
               AND package = %(package)s
        """
        param = {
            'release': releases[release],
            'package': package,
        }

        execute_query(query, param)

        rows = cursor.fetchall()

        if args.debug:
            print("rows:\n%s" % pprint.pformat(rows))

        for row in rows:
            if (
                row['source'] in affected_sources and
                row['source'] not in affected_dependencies
            ):
                affected_dependencies.append(row['source'])

    affected_dependencies = sorted(affected_dependencies)

    if args.debug:
        print(
            "affected_dependencies:\n%s" % (
                pprint.pformat(affected_dependencies),
            )
        )

    return affected_dependencies


def virtual_to_actual(packages, release='testing'):
    global cursor

    # Iterate over a copy, because the list is modified inside the loop.
    for package in list(packages):
        query = """
            SELECT package
              FROM packages
             WHERE distribution = 'debian'
               AND release = %(release)s
               AND architecture IN ('amd64', 'all')
               AND package = %(package)s
          ORDER BY version DESC
             LIMIT 1
        """
        param = {
            'release': releases[release],
            'package': package,
        }

        execute_query(query, param)

        rows = cursor.fetchall()

        if args.debug:
            print("rows:\n%s" % pprint.pformat(rows))

        if not rows:
            # Possible virtual package

            query = """
                SELECT package
                  FROM packages
                 WHERE distribution = 'debian'
                   AND release = %(release)s
                   AND architecture IN ('amd64', 'all')
                   AND provides ~ %(pattern)s
              ORDER BY version DESC
                 LIMIT 1
            """
            param = {
                'release': releases[release],
                'pattern': r'(^| |,)' + re.escape(package) + r'($| |,)',
            }

            execute_query(query, param)

            rows = cursor.fetchall()

            if args.debug:
                print("rows:\n%s" % pprint.pformat(rows))

            for row in rows:
                if package in packages:
                    packages.remove(package)

                if row['package'] not in packages:
                    packages.append(row['package'])

    return packages


def resolve_dependencies(
    dependencies,
    transitive_dependencies=None,
    release='testing',
):
    global cursor

    if transitive_dependencies is None:
        packages = copy.copy(dependencies)
    else:
        packages = copy.copy(transitive_dependencies)

    transitive_dependencies = []

    for package in packages:
        query = """
            SELECT depends,
                   recommends
              FROM packages
             WHERE distribution = 'debian'
               AND release = %(release)s
               AND architecture IN ('amd64', 'all')
               AND package = %(package)s
          ORDER BY version DESC
             LIMIT 1
        """
        param = {
            'release': releases[release],
            'package': package,
        }

        execute_query(query, param)

        rows = cursor.fetchall()

        if args.debug:
            print("rows:\n%s" % pprint.pformat(rows))

        if not rows:
            # Possible virtual package

            query = """
                SELECT depends,
                       recommends
                  FROM packages
                 WHERE distribution = 'debian'
                   AND release = %(release)s
                   AND architecture IN ('amd64', 'all')
                   AND provides ~ %(pattern)s
              ORDER BY version DESC
                 LIMIT 1
            """
            param = {
                'release': releases[release],
                'pattern': r'(^| |,)' + re.escape(package) + r'($| |,)',
            }

            execute_query(query, param)

            rows = cursor.fetchall()

            if args.debug:
                print("rows:\n%s" % pprint.pformat(rows))

            if not rows:
                raise Exception("Error: Package not found: %s" % package)

        if rows[0]['depends']:
            depends = apt_pkg.parse_depends(rows[0]['depends'])

            for dep in depends:
                pkg = dep[0][0]

                if (
                    pkg not in dependencies and
                    pkg not in transitive_dependencies
                ):
                    transitive_dependencies.append(pkg)

        if rows[0]['recommends']:
            recommends = apt_pkg.parse_depends(rows[0]['recommends'])

            for dep in recommends:
                pkg = dep[0][0]

                if (
                    pkg not in dependencies and
                    pkg not in transitive_dependencies
                ):
                    transitive_dependencies.append(pkg)

    if transitive_dependencies:
        for package in transitive_dependencies:
            dependencies.append(package)

        dependencies = resolve_dependencies(
            dependencies,
            transitive_dependencies,
            release,
        )

    return dependencies


def get_test_dependencies(package_dir):
    control_file = os.path.join(
        package_dir,
        'debian',
        'control',
    )

    build_depends = []
    recommends = []
    packages = []

    with open(control_file) as f:
        for paragraph in Deb822.iter_paragraphs(f):
            if args.debug:
                print("paragraph:\n%s" % pprint.pformat(paragraph))

            for key in paragraph:
                if key.startswith('Build-Depends'):
                    depends = apt_pkg.parse_src_depends(paragraph[key])

                    if args.debug:
                        print("key: %s" % key)
                        print("depends:\n%s" % pprint.pformat(depends))

                    for dependency in depends:
                        package = dependency[0][0]

                        if package not in build_depends:
                            build_depends.append(package)
                elif key == 'Recommends':
                    depends = apt_pkg.parse_depends(paragraph[key])

                    if args.debug:
                        print("key: %s" % key)
                        print("depends:\n%s" % pprint.pformat(depends))

                    for dependency in depends:
                        package = dependency[0][0]

                        if package not in recommends:
                            recommends.append(package)
                elif key == 'Package':
                    if paragraph[key] not in packages:
                        packages.append(paragraph[key])

    if args.debug:
        print("build_depends:\n%s" % pprint.pformat(build_depends))
        print("recommends:\n%s" % pprint.pformat(recommends))
        print("packages:\n%s" % pprint.pformat(packages))

    cmd = [
        args.autodep8,
        package_dir,
    ]

    (process, stdout, stderr) = execute_command(cmd)

    if process.returncode != 0:
        print(
            "Error: Failed to execute: %s (%s)" % (
                ' '.join(cmd),
                process.returncode,
            )
        )

        return

    test_dependencies = []

    for paragraph in Deb822.iter_paragraphs(stdout.decode()):
        if args.debug:
            print("paragraph:\n%s" % pprint.pformat(paragraph))

        # Tests without a Depends field default to '@'.
        depends = apt_pkg.parse_depends(paragraph.get('Depends', '@'))

        if args.debug:
            print("depends:\n%s" % pprint.pformat(depends))

        for dependency in depends:
            package = dependency[0][0]

            if package == '@':
                for package in packages:
                    if package not in test_dependencies:
                        test_dependencies.append(package)
            elif package == '@builddeps@':
                for package in build_depends:
                    if package not in test_dependencies:
                        test_dependencies.append(package)
            elif package == '@recommends@':
                for package in recommends:
                    if package not in test_dependencies:
                        test_dependencies.append(package)
            else:
                if package not in test_dependencies:
                    test_dependencies.append(package)

    test_dependencies = resolve_dependencies(test_dependencies)
    test_dependencies = virtual_to_actual(test_dependencies)
    test_dependencies = sorted(test_dependencies)

    if args.debug:
        print("test_dependencies:\n%s" % pprint.pformat(test_dependencies))

    return test_dependencies


def schedule_autopkgtest(extra_packages=None):
    print("CI Job Request")
    print("")
    print("Package Name: %s" % args.package)
    print("Suite: testing")
    print("Trigger: %s" % args.trigger)
    print("")
    print("Pin Packages:")
    print("src:%s, unstable" % strip_version(args.trigger))
    if extra_packages is not None:
        for package in extra_packages:
            print("src:%s, unstable" % package)
    print("")
    print("Extra APT Sources:")
    print("unstable")
    print("testing")
    print("")


def get_affected_sources():
    json_dir = os.path.join(
        args.transitions_path,
        'json',
    )

    if not os.path.exists(json_dir):
        raise Exception(
            "Error: Ben JSON output directory not found: %s" % (
                json_dir,
            )
        )

    trigger = strip_version(args.trigger)

    affected_sources = []

    for entry in sorted(os.listdir(json_dir)):
        if args.debug:
            print("entry: %s" % entry)

        # gdal.json
        # gdal3.12.json
        # auto-gdal.json
        match = re.search(
            r'^(?:auto-)?' + re.escape(trigger) + r'(?:\d+\S+)?\.json$',
            entry,
        )

        if match:
            json_file = os.path.join(json_dir, entry)

            with open(json_file, 'r') as f:
                data = json.load(f)

                for level in data:
                    for source in data[level]:
                        if source == trigger:
                            continue

                        affected_sources.append(source)

            break

    if args.debug:
        print("affected_sources:\n%s" % pprint.pformat(affected_sources))

    return affected_sources


def has_ongoing_transition():
    if not os.path.exists(args.config_path):
        raise Exception(
            "Error: Ben config directory not found: %s" % (
                args.config_path,
            )
        )

    trigger = strip_version(args.trigger)

    for subdir in [
        'ongoing',
        'finished',
    ]:
        config_dir = os.path.join(args.config_path, subdir)

        if not os.path.exists(config_dir):
            raise Exception(
                "Error: Ben config subdirectory not found: %s" % (
                    config_dir,
                )
            )

        for entry in sorted(os.listdir(config_dir)):
            if args.debug:
                print("entry: %s" % entry)

            # gdal.ben
            # gdal3.12.ben
            # auto-gdal.ben
            match = re.search(
                r'^(?:auto-)?' + re.escape(trigger) + r'(?:\d+\S+)?\.ben$',
                entry,
            )

            if match:
                return True

    return False


def get_source_package(source_details):
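    if not source_details['testing']:
        print("Error: Package not found in testing: %s" % args.package)
        return

    source = source_details['testing']['source']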

    package_dir = os.path.join(
        args.source_path,
        '%(source)s-%(upstream_version)s' % {
            'source': args.package,
            'upstream_version': upstream_version_from_revision(
                source_details['testing']['version'],
            ),
        }
    )

    if args.debug:
        print("package_dir: %s" % package_dir)

    if os.path.exists(package_dir):
        return package_dir

    if source.startswith('lib'):
        subdir = source[:4]
    else:
        subdir = source[0]

    dsc_file = None
    for file in source_details['testing']['files']:
        if file.endswith('.dsc'):
            dsc_file = file
            break

    if dsc_file is None:
        print("Error: No .dsc file for package in testing!")
        return

    url = args.mirror
    if not url.endswith('/'):
        url += '/'
    url += 'pool/%(component)s/%(subdir)s/%(source)s/%(dsc_file)s' % {
        'component': source_details['testing']['component'],
        'subdir': subdir,
        'source': source,
        'dsc_file': dsc_file,
    }

    if args.debug:
        print("url: %s" % url)

    oldpwd = os.getcwd()

    os.chdir(args.source_path)

    cmd = [
        args.dget,
        '-u',
        url,
    ]

    (process, stdout, stderr) = execute_command(cmd)

    # Restore the working directory before any early return.
    os.chdir(oldpwd)

    if process.returncode != 0:
        print(
            "Error: Failed to execute: %s (%s)" % (
                ' '.join(cmd),
                process.returncode,
            )
        )

        return

    return package_dir


def get_source_details(package):
    source_details = {
        'testing': {},
        'unstable': {},
    }

    for key in source_details:
        query = """
            SELECT source,
                   version,
                   files,
                   bin,
                   component
              FROM sources
             WHERE distribution = 'debian'
               AND release = %(release)s
               AND source = %(source)s
          ORDER BY version DESC
             LIMIT 1
        """
        param = {
            'release': releases[key],
            'source': package,
        }

        execute_query(query, param)

        rows = cursor.fetchall()

        if args.debug:
            print('rows:\n%s' % pprint.pformat(rows))

        for row in rows:
            files = []
            for line in row['files'].split('\n'):
                #  ee2c1f9e2e532b1dbdd0ff9792162f1c 2156 libgdal-grass_1.0.4-2.dsc  # noqa: E501
                match = re.search(
                    r'^\s*[0-9a-f]+\s+\d+\s+(\S+)\s*$',
                    line,
                )
                if match:
                    files.append(match.group(1))

            packages = row['bin'].split(', ')

            source_details[key] = {
                'source': row['source'],
                'version': row['version'],
                'files': files,
                'packages': packages,
                'component': row['component'],
            }

    if args.debug:
        print("source_details:\n%s" % pprint.pformat(source_details))

    return source_details


def get_releases():
    global releases

    query = """
        SELECT release,
               role
          FROM releases
         WHERE distribution = 'debian'
           AND role IN ('testing', 'unstable')
    """

    execute_query(query)

    rows = cursor.fetchall()

    if args.debug:
        print('rows:\n%s' % pprint.pformat(rows))

    for row in rows:
        releases[row['role']] = row['release']

    if args.debug:
        print("releases:\n%s" % pprint.pformat(releases))

    return releases


def transition_autopkgtest():
    global releases

    db_connect()

    releases = get_releases()

    source_details = get_source_details(args.package)

    package_dir = get_source_package(source_details)

    if not package_dir:
        db_close()
        return 1

    if not has_ongoing_transition():
        schedule_autopkgtest()
        db_close()
        return 0

    affected_sources = get_affected_sources()

    test_dependencies = get_test_dependencies(package_dir)

    if not test_dependencies:
        db_close()
        return 1

    extra_packages = get_affected_dependencies(
        test_dependencies,
        affected_sources,
    )

    schedule_autopkgtest(extra_packages)

    db_close()
    return 0


def main():
    global args

    default = {
        'db_host': 'udd-mirror.debian.net',
        'db_port': '5432',
        'db_name': 'udd',
        'db_user': 'public-udd-mirror',
        'db_pass': 'public-udd-mirror',
        'dget': '/usr/bin/dget',
        'autodep8': '/usr/bin/autodep8',
        'mirror': 'https://deb.debian.org/debian/',
        'transitions_path': os.path.join(
            os.path.expanduser('~'),
            'ben',
            'transitions',
        ),
        'config_path': os.path.join(
            os.path.expanduser('~'),
            'ben',
            'config',
        ),
        'source_path': os.path.join(
            os.path.expanduser('~'),
            'tmp',
            'debian',
        ),
    }

    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )

    parser.add_argument(
        '--db-host',
        metavar='<ADDRESS>',
        action='store',
        help='UDD database host (default: %s)' % (
            default['db_host'],
        ),
        default=default['db_host'],
    )
    parser.add_argument(
        '--db-port',
        metavar='<PORT>',
        action='store',
        help='UDD database port (default: %s)' % (
            default['db_port'],
        ),
        default=default['db_port'],
    )
    parser.add_argument(
        '--db-name',
        metavar='<NAME>',
        action='store',
        help='UDD database name (default: %s)' % (
            default['db_name'],
        ),
        default=default['db_name'],
    )
    parser.add_argument(
        '--db-user',
        metavar='<USERNAME>',
        action='store',
        help='UDD database username (default: %s)' % (
            default['db_user'],
        ),
        default=default['db_user'],
    )
    parser.add_argument(
        '--db-pass',
        metavar='<PASSWORD>',
        action='store',
        help='UDD database password (default: %s)' % (
            '*' * len(default['db_pass']),
        ),
        default=default['db_pass'],
    )
    parser.add_argument(
        '--dget',
        metavar='<PATH>',
        action='store',
        help='Path to dget executable (default: %s)' % (
            default['dget'],
        ),
        default=default['dget'],
    )
    parser.add_argument(
        '--autodep8',
        metavar='<PATH>',
        action='store',
        help='Path to autodep8 executable (default: %s)' % (
            default['autodep8'],
        ),
        default=default['autodep8'],
    )
    parser.add_argument(
        '-m', '--mirror',
        metavar='<URL>',
        action='store',
        help='URL to Debian package mirror (default: %s)' % (
            default['mirror'],
        ),
        default=default['mirror'],
    )
    parser.add_argument(
        '-t', '--transitions-path',
        metavar='<PATH>',
        action='store',
        help='Path to ben transitions directory (default: %s)' % (
            default['transitions_path'],
        ),
        default=default['transitions_path'],
    )
    parser.add_argument(
        '-c', '--config-path',
        metavar='<PATH>',
        action='store',
        help='Path to ben config directory (default: %s)' % (
            default['config_path'],
        ),
        default=default['config_path'],
    )
    parser.add_argument(
        '-s', '--source-path',
        metavar='<PATH>',
        action='store',
        help='Path to source package directory (default: %s)' % (
            default['source_path'],
        ),
        default=default['source_path'],
    )
    parser.add_argument(
        '-p', '--package',
        metavar='<PACKAGE>',
        action='store',
        help='Package to schedule autopkgtest for',
        required=True,
    )
    parser.add_argument(
        '-T', '--trigger',
        metavar='<PACKAGE>',
        action='store',
        help='Package that triggers the autopkgtest',
        required=True,
    )
    parser.add_argument(
        '-n', '--dry-run',
        action='store_true',
        help='Don\'t schedule CI jobs',
    )
    parser.add_argument(
        '-d', '--debug',
        action='store_true',
        help='Enable debug output',
    )
    parser.add_argument(
        '-v', '--verbose',
        action='store_true',
        help='Enable verbose output',
    )

    args = parser.parse_args()

    if not args.dget:
        print("Error: No dget path specified!")
        return 1
    elif not os.access(args.dget, os.X_OK):
        print("Error: Cannot execute dget: %s" % args.dget)
        return 1

    if not args.autodep8:
        print("Error: No autodep8 path specified!")
        return 1
    elif not os.access(args.autodep8, os.X_OK):
        print("Error: Cannot execute autodep8: %s" % args.autodep8)
        return 1

    if not args.transitions_path:
        print("Error: No transitions path specified!")
        return 1
    elif not os.access(args.transitions_path, os.R_OK):
        print(
            "Error: Cannot read transitions path: %s" % (
                args.transitions_path,
            )
        )
        return 1

    return transition_autopkgtest()


if __name__ == '__main__':
    sys.exit(main())
