Bug#961142: snapshot.debian.org: python3 migration



Hi,

After a couple of months, the migration from Python 2 and Pylons to
Python 3 and Flask is finally done. I've created a merge request,
available at:

https://salsa.debian.org/snapshot-team/snapshot/-/merge_requests/1

The following describes what changed in snapshot:

Snapshot script
===============

The Ruby 'dbi' driver has been replaced by 'pg' ('dbi' was removed from
unstable in 2016). The main differences between the two drivers are:

- 'pg' requires the use of positional arguments
- some functions/attributes are renamed (do/exec, ntuples and cmd_tuples)

I've tested the script both manually and using the frontend test suite.

Snapshot configuration
======================

Since the Ruby 'pg' module understands connection strings, I've removed
the duplicated YAML block information.

DB python upgrade scripts
=========================

In addition to converting the scripts to Python 3, I've updated them so
they can be run either as standalone scripts or imported from another
Python application. This allows running the upgrade scripts from the
frontend test suite when provisioning the test database.
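
As a rough sketch of that standalone-or-imported pattern (not the actual
upgrade code): the function names, the `config` table and the
`db_revision` key below are hypothetical, and sqlite3 stands in for
PostgreSQL so the example runs without a server.

```python
#!/usr/bin/python3
"""Hypothetical upgrade script, usable from the shell or via import."""
import sqlite3
import sys


def upgrade(connection):
    """Apply the schema change on an already-open DB-API connection."""
    cursor = connection.cursor()
    # Illustrative statements only; a real script carries its own SQL.
    cursor.execute(
        "CREATE TABLE IF NOT EXISTS config (name TEXT PRIMARY KEY, value TEXT)"
    )
    cursor.execute("INSERT OR REPLACE INTO config VALUES ('db_revision', '1')")
    connection.commit()


def main(argv):
    """Command-line entry point: open the database, then upgrade it."""
    # Assumption: default to an in-memory database when no path is given.
    path = argv[0] if argv else ":memory:"
    connection = sqlite3.connect(path)
    try:
        upgrade(connection)
    finally:
        connection.close()


# Only runs when executed directly, never when imported.
if __name__ == "__main__":
    main(sys.argv[1:])
```

A test suite can then import the module and call `upgrade()` with its
own connection, without going through the command line.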

Other python scripts (frontend excluded)
========================================

I've updated all other Python scripts that appeared relevant to me,
including scripts in the following directories:

- fsck/
- master/
- mirror/
- fuse/

In fsck, I've also updated the C module to Python 3.

I haven't done extensive or exhaustive testing, but I did run the
scripts to validate their primary use cases.

Dependencies
============

I've updated the dependency list in README.md to match bullseye package
versions (for fuse, ruby, scripts, frontend and fsck).

Frontend
========

Framework and layout
--------------------

I've used the Flask framework to implement the frontend. It's a
minimalist framework, simpler and without any imposed pattern, as
opposed to bigger frameworks such as Django.

It's a good fit for snapshot because the backend is not in the
traditional form of a "Model" using a proper ORM; instead, it is
already optimized, specific, and implemented directly on PostgreSQL
(psycopg2).

Flask provides all the necessary tools to build around snapshot's
existing model implementation, adding the frontend logic and the
template pages.

The following directories show the layout used for the frontend:

- `docs/`, deployment documentation and file header
- `public/`, root web folder
- `snapshot/__init__.py` `wgsi.py`, entry point
- `snapshot/views/`, frontend logic, contains all routes
- `snapshot/controllers/`, intermediary code, between views and models
- `snapshot/models/`, snapshot psycopg2 model
- `snapshot/settings/`, application configuration
- `snapshot/templates/`, jinja2 templates
- `snapshot/static/`, static content
- `snapshot/lib/`, generic code
- `tests/conftest.py`, test entry point
- `tests/controllers/`, test controllers
- `tests/data/`, test static data (package template, keyring, dataset)
- `tests/unit/`, unit and api tests
- `tests/functional/`, user/tools oriented tests (apt, debsnap)

Additionally, the frontend root folder contains licensing,
documentation, packaging and testing files.

Settings
--------

Snapshot frontend settings are loaded from
`snapshot/settings/snapshot.py`. This file does not exist in the
repository and is intended to be a link to whatever environment
configuration file you choose (`develop.py` or `prod.py`).

All environment configuration files load `common.py`, where all defaults
are defined, and override them when necessary.
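
The defaults-plus-overrides idea can be sketched as follows. This is an
illustration only: the setting names and values are hypothetical, and
plain dictionaries stand in for the `common.py`, `develop.py` and
`prod.py` modules.

```python
# Stand-in for common.py: hypothetical default values.
COMMON = {
    "DEBUG": False,
    "TESTING": False,
    "CACHE_TIMEOUT": 600,
}


def environment(**overrides):
    """Return a copy of the common defaults with overrides applied."""
    return {**COMMON, **overrides}


# Stand-ins for develop.py and prod.py: each starts from the defaults
# and changes only what differs in that environment.
DEVELOP = environment(DEBUG=True)
PROD = environment(CACHE_TIMEOUT=3600)
```

Each environment sees every default and only needs to spell out what
differs, and the shared defaults themselves are never modified.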

Behavior changes
----------------

My goal for this migration is to provide a seamless transition for
snapshot users, so I've re-implemented the behavior of the old frontend
as closely as I can.

The generated HTML/JSON, apart from whitespace indentation, should be
identical to the previous version.

Apart from a couple of minor bugs that I fixed, I'm not aware of any
behavior changes from the previous version.
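
For JSON output, one way to check that kind of equivalence is to compare
the decoded values rather than the raw text. The snippet below is only a
sketch with made-up payloads; note that it also ignores object key
order, which is slightly more lenient than "identical minus
indentation".

```python
import json


def same_json(old_body, new_body):
    """True when both response bodies decode to the same JSON value."""
    return json.loads(old_body) == json.loads(new_body)


# Hypothetical responses from the old and the new frontend.
old = '{"result": [{"package": "hello"}]}'
new = '{\n  "result": [\n    {"package": "hello"}\n  ]\n}'
```

Here `same_json(old, new)` holds even though the two strings differ
byte for byte.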

Testing and QA
--------------

I've added a fairly big test suite to ensure that the rewrite does not
come with any regressions. The test suite is written using pytest and
provides almost full coverage of the code (grep `pragma` to see
excluded code and why).

The test suite works as follows:

- Load testing settings from `snapshot/settings/test.py`
- Create a temporary PostgreSQL database (ASCII encoded)
- Import snapshot schema (using db/ files)
- Build testing packages and create pooled archives using data from
  `tests/data/dataset.py` and `tests/data/templates` (aptly is used to
  create the archives)
- Import and index the archives using the snapshot script

Once all those steps are done, a snapshot of the database is made and
restored before each test.

Unit and functional tests are then run. Unit tests cover all routes,
while functional tests aim to cover user use cases (using snapshot with
apt or debsnap).
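
The snapshot-and-restore idea can be sketched as below. This is not the
code from the test suite: sqlite3 stands in for PostgreSQL so the
example runs anywhere, and the table, data and function names are
hypothetical.

```python
import sqlite3


def provision():
    """One-time, expensive setup: create the schema and import data."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE package (name TEXT PRIMARY KEY, version TEXT)")
    db.execute("INSERT INTO package VALUES ('hello', '2.10-2')")
    db.commit()
    return db


def take_snapshot(db):
    """Serialize the provisioned database as a SQL script."""
    return "\n".join(db.iterdump())


def restore(snapshot):
    """Build a fresh database from the saved SQL script."""
    db = sqlite3.connect(":memory:")
    db.executescript(snapshot)
    return db


# Provision once, then keep only the snapshot around.
SNAPSHOT = take_snapshot(provision())


def run_isolated(check):
    """Run one check against a freshly restored database."""
    db = restore(SNAPSHOT)
    try:
        return check(db)
    finally:
        db.close()


def mutating_check(db):
    """Deletes everything, then reports how many rows remain."""
    db.execute("DELETE FROM package")
    return db.execute("SELECT COUNT(*) FROM package").fetchone()[0]


def reading_check(db):
    """Reports how many rows a later check sees."""
    return db.execute("SELECT COUNT(*) FROM package").fetchone()[0]
```

Because every check starts from the restored snapshot, the rows deleted
by `mutating_check` are back when `reading_check` runs afterwards.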

Additionally, a flake8 checker is run and a branch coverage report is
generated in web/app/htmlcov/.

The test suite can be run either by:

- Invoking the `tox` command, which takes care of setting up an
  isolated environment to test with (postgresql is still a requirement)
- Pushing commits to the repository, which triggers the CI

Deployment
==========

Although they require some adjustments, I haven't touched the
deployment files (namely `etc/apache.conf` and `web/deploy`).

From what I can gather, the current frontend setup is split between
several directories:

- /srv/snapshot.debian.org/web/public/static/, for static files
- /srv/snapshot.debian.org/web/public/, for web root
- /srv/snapshot.debian.org/bin/, for the WSGI script
- /srv/snapshot.debian.org/web-app/, for the actual frontend app

It looks like public/ and the WSGI script are actually links to the
web-app subtree.

All of that is split from the local copy of the repository (living in
`/srv/snapshot.debian.org/code/`).

Do you want to use the same setup for the new version? I usually expose
the git repository directly, aliasing /robots.txt, /static and / (to the
wsgi script), and it works. However, that technique might not fit the
security requirements for snapshot.d.o.

In any case, I've removed the egg-info directory from the repository
(its contents being generated files), so it would have to be generated
by the deploy script using a `setup.py sdist`.

What's next
===========

Let me know how to proceed from here. I suppose you will want to review
the code; I'm looking forward to your feedback on that.

Then, do you want to host that version in parallel with the previous
one for a couple of weeks, to allow testing by other people before
switching?

Let me know how I can be of help.

I do intend to address bugs listed for snapshot.debian.org in the BTS
and build on this new version of snapshot.

-- 
Baptiste Beauplat - lyknode
