
Re: Amend Debian Python Proposal to Include More Python Metadata?



> On Jan 22, 2016, at 7:18 PM, Scott Kitterman <debian@kitterman.com> wrote:
> 
> The Zen of Python says, among other things, "There should be one-- and preferably only one --obvious way to do it".  Build systems seem to me like a great place to apply that.


We have a sliding scale of complexity in what a project needs from its build system, which generally breaks down into a few major classes of projects:

1) Projects that are pure python, single source, whose only real “compilation” is shuffling files to the correct location.
2) Projects that are pure python, but require some sort of generation step as part of the build process (2to3, etc).
3) Projects that have some basic C code that needs to be compiled, but which doesn’t link against anything special besides Python itself.
4) Projects that have some basic C code that needs to be compiled, but which links against other libraries like OpenSSL, libpq, etc.
5) Projects that have some basic C code, that needs to be compiled, that links against other libraries, and needs to be able to conditionally link against different libraries based on the capabilities and what is available in the system.
6) Extremely complex projects that need to link against many different libraries, are possibly hard to build, and are possibly not C (e.g., NumPy with its BLAS libraries, Fortran, etc).

The problem with a one-size-fits-all solution is that it’s very difficult to cover all of these cases in a way that is not horrible to use for each particular case. For example, there is currently a build system called flit which doesn’t support anything but building straight to wheels (because we don’t have any sort of sdist 2.0 yet). It doesn’t attempt to serve anything but the first class of projects above, and because of that it is able to offer a very simple and easy-to-use packaging experience for authors. You just add a __version__ = "the version" to the top level of your package, and then drop in a flit.ini that looks something like:

    [metadata]
    module=foobar
    author=Sir Robin
    author-email=robin@camelot.uk
    home-page=http://github.com/sirrobin/foobar

    # If you want command line scripts, this is how to declare them.
    # If not, you can leave this section out completely.
    [scripts]
    # foobar:main means the script will do: from foobar import main; main()
    foobar=foobar:main

And that’s it. You’re done. From there flit can do the rest of the work for you because it didn’t need to concern itself with trying to work on anything complex.
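To make that concrete, here is a rough sketch of how little a tool has to do to collect everything it needs from such a project. This is not flit's actual implementation; the function names read_config and read_version are made up for illustration, and I am assuming the version is a plain string literal assigned at the top level of the module.

```python
import ast
import configparser

# The flit.ini from above, plus a one-line stand-in for the package source.
FLIT_INI = """\
[metadata]
module=foobar
author=Sir Robin
author-email=robin@camelot.uk
home-page=http://github.com/sirrobin/foobar

[scripts]
foobar=foobar:main
"""

MODULE_SOURCE = '__version__ = "0.1"\n'


def read_config(ini_text):
    """Parse the ini text into (metadata, scripts) dictionaries."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    metadata = dict(parser["metadata"])
    # The [scripts] section is optional, so fall back to an empty mapping.
    scripts = dict(parser["scripts"]) if parser.has_section("scripts") else {}
    return metadata, scripts


def read_version(module_source):
    """Find a top-level __version__ assignment without importing the module."""
    for node in ast.parse(module_source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__version__":
                    # literal_eval only accepts literals, so no project code runs.
                    return ast.literal_eval(node.value)
    raise ValueError("no top-level __version__ assignment found")


metadata, scripts = read_config(FLIT_INI)
version = read_version(MODULE_SOURCE)
```

Everything here is static data, which is exactly why a tool aimed only at the first class of projects can stay this small.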

In addition to the above classes of problems, there are other questions, like what the “source of truth” is for your metadata. A common thing people want is to avoid duplicating their version in multiple locations (sometimes that even extends to the tags in version control), but it’s not currently possible to do that in a particularly easy way. Systems like pbr, setuptools_scm, versioneer, etc. all do it, but they all rely on some level of terribleness to deal with the sort of weird inverted control flow we have.
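As an illustration of the "version lives in the VCS tag" idea, here is a minimal sketch that turns the output of `git describe --tags --long` (which has the form `<tag>-<commits since tag>-g<short hash>`) into a version string. The function name and the `.postN+gHASH` scheme are my own assumptions for the example; this is not how pbr, setuptools_scm, or versioneer actually compute their versions.

```python
import re

# Matches e.g. "v1.2.0-3-gabc1234"; a leading "v" on the tag is optional.
_DESCRIBE = re.compile(r"^v?(?P<tag>.+)-(?P<distance>\d+)-g(?P<sha>[0-9a-f]+)$")


def version_from_describe(described):
    """Derive a version string from `git describe --tags --long` output."""
    match = _DESCRIBE.match(described.strip())
    if match is None:
        raise ValueError("unrecognized describe output: %r" % described)
    tag = match.group("tag")
    distance = int(match.group("distance"))
    if distance == 0:
        # We are exactly on the tag, so the tag is the version.
        return tag
    # Hypothetical scheme: mark untagged commits as post-releases of the tag.
    return "%s.post%d+g%s" % (tag, distance, match.group("sha"))
```

The parsing is the easy part. The terribleness comes from having to run something like this from inside setup.py, and from still needing a correct answer in an sdist where there is no repository to ask.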

Once you get to all of the possible options for things people reasonably want to do, you quickly end up in a place where the only reasonable solution is the full complexity of a programming language, like what we have in setup.py. However, that has its own problems, which we’ve discovered over two decades of doing that :) It tends to end up with people writing badly tested ad hoc code that they cargo cult from project to project, and when there’s a problem there ends up being very little understanding of why some code did what it did, since it had been copy/pasted around so much. Pip has to go to a lot of effort to work around common mistakes people make in their setup.py files (like depending on it being invoked with the project root as CURDIR), and most people will never notice that their setup.py is slightly broken.
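The CURDIR mistake usually looks like a bare open("README.rst") in setup.py, which only works when the current directory happens to be the project root. A minimal sketch of the more robust pattern, with a helper name (read_relative) that is purely illustrative:

```python
import os


def read_relative(anchor_file, *parts):
    """Read a text file located relative to anchor_file, not to the CWD."""
    here = os.path.dirname(os.path.abspath(anchor_file))
    with open(os.path.join(here, *parts), encoding="utf-8") as handle:
        return handle.read()
```

In a setup.py this would be called as read_relative(__file__, "README.rst"), so the result no longer depends on where the script was invoked from.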

So basically, by allowing multiple build systems we will enable a world where your bog-standard pure Python project can get an extremely simple tool/UX and projects with crazy hard requirements like NumPy can get something more complex, without the two groups of users fighting each other over simple but limited vs. complex but powerful.

In the end, I think it’s likely we’ll end up with 2-3 really popular build tools: one for the complex projects, one for the simpler projects with some basic C needs, and *maybe* one for pure Python (though that may be able to be handled by the basic C needs one), with a long tail besides, I’m sure.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


