
Re: arch-dependent files in -dev packages



Nicolas Boulenguez wrote:
> With the gnat-4.8 transition, we may consider changing the policy for
> -dev packages to avoid architecture-dependent contents in /usr/share/.
> 
> You may want to (re-)read Björn's idea about projects [π], the
> discussion about arch-dependent sources ([√2] and same thread next
> month [i]), or the wiki about multiarch in Debian [-1].
> 
> [π]  https://lists.debian.org/debian-ada/2011/12/msg00000.html
> [√2] https://lists.debian.org/debian-ada/2012/04/msg00040.html
> [i]  https://lists.debian.org/debian-ada/2012/05/msg00001.html
> [-1] https://wiki.debian.org/Multiarch/Implementation

Another relevant development is that I recently published Comfignat, a
common foundation for build systems built around the GNAT tools. It
supports directories projects and specifies a set of variables that
just happen to be compatible with Fedora:

https://www.rombobjörn.se/Comfignat/#directories_projects

> The gnat package would install a configuration project in
> "/usr/share/ada/adainclude/", inheriting the target architecture at
> each build, and implementing the local directory hierarchy policy.
> 
> I have no idea of the best way to inherit TARGET_ARCH, though. Ideas?

The best I could come up with was a pair of shell script fragments
in /etc/profile.d that define an environment variable:

/etc/profile.d/gnat-project.sh:
HARDWARE_PLATFORM=`uname --hardware-platform`
export HARDWARE_PLATFORM

/etc/profile.d/gnat-project.csh:
setenv HARDWARE_PLATFORM `uname --hardware-platform`

The directories project then picks up the environment variable as an
"external value". That works for builds started manually from an
interactive shell, although users have to use setarch instead of passing
an option such as -m32 to the compiler. I don't think it works in cron
jobs and the like, so it's not perfect, but in those cases there's
always a script or something that can set HARDWARE_PLATFORM.
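
As a minimal sketch of how a directories project might pick up that
environment variable, assuming a hypothetical project name
OS_Directories (the name is still undecided in this thread) and an
assumed fallback value of "unknown" for builds where the variable is
not set:

```gpr
--  Hypothetical sketch: the project name and the "unknown" fallback are
--  assumptions for illustration, not part of any agreed specification.
abstract project OS_Directories is
   for Source_Files use ();

   --  Taken from -XHARDWARE_PLATFORM=... on the builder's command line
   --  if given, otherwise from the environment variable of that name.
   Hardware_Platform := external ("HARDWARE_PLATFORM", "unknown");
end OS_Directories;
```

Because an -X option on the Gnatmake or GPRbuild command line takes
precedence over the environment, a cron job or packaging script could
set the value explicitly without touching /etc/profile.d.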

When we build RPM packages the architecture is available in an RPM
macro, which we use to define HARDWARE_PLATFORM in the standard
parameters that are passed to every Gnatmake or GPRbuild invocation.
Perhaps Debian has something similar that is available when packages are
built?

I would have preferred to have Gnatmake and GPRbuild provide the target
architecture in a variable for project files to use, but as far as I
could find out they don't provide anything like that.

In Debian I suppose you also need the kernel name to be able to figure
out the whole triplet. "uname --kernel-name" seems useful for that
purpose.

> In order to share library projects with at least Fedora, we should
> agree on some details.
> 
> The project name must avoid collision with any user-defined project
> name. The "system" and "variable" words do not carry much information
> in that context. Why not "installed_library_directories.gpr"?

As I wrote last year, the "library" part isn't entirely accurate. If you
think "system" is too vague, then how do you feel about
"operating_system_directories", or "os_directories" for short? Or is
"distribution_directories" better? The name should convey that this
directories project specifies directories that the distribution controls
(specifically its package manager, for distributions that have one).
Other directories projects with other names might specify directories
controlled by the user (under /home), by third-party add-ons
(under /opt), or by the system administrator (under /usr/local for
example).

I usually try to avoid the term "operating system" because people can't
agree on what it means, but I suppose it's at least somewhat more
specific than just "system".

There are a few different use cases to consider:

Case 1: Upstream has no project file, or we need to patch their project
file. Debian and Fedora want to share a project file. This is currently
the common case.

Case 2: Upstream provides project files but has no way of configuring
them. The project files unconditionally require a directories project.
We can safely assume that there are no such upstreams yet, but they
might appear if the idea of directories projects catches on.

Case 3: Upstream has a configurable build system based on Comfignat,
Autoconf or something similar, and the build system supports directories
projects. I hope this case will become more common over time.

In Comfignat I defined a Make variable that lets the installing user
provide the name of the directories project. I had to make using a
directories project optional anyway, as I wanted Comfignat to work in
environments that don't have one, so making the name configurable wasn't
a big addition. That way users can have their own directories projects
in their home directories, and it would be possible to set up multiple
build environments in separate directories.

If an Autoconf-using upstream adds support for directories projects, it
won't be difficult to make the project name configurable there too.

To support cases 1 and 2 we need to agree on the project name. If we
decide to support only case 3, then it doesn't really matter.
Configurable build systems will work fine even if Debian and Fedora use
different names for the directories projects. I think we should try to
support all three cases though.

Either way, we need to agree on which variables a directories project
must provide. Otherwise we'll just make a big mess.

> If I understand well, this project is only useful for compilation. If
> so, bindir and libexecdir, intended as installation destinations,
> should not belong to it. Installation belongs to dpkg/rpm, not to gnat
> tools.

Bindir and Libexecdir are for use in Exec_Dir in build projects – and
that's why "installed_library_directories.gpr" isn't entirely accurate.
A project file that builds some programs that are only meant to be run
by other programs might contain a line like this:

   for Exec_Dir use external("DESTDIR", "") & System_Directories.Libexecdir & "/subdir";

In Fedora Libexecdir is "/usr/libexec" and in Debian it would be
"/usr/lib", and the programs would be installed in the right place in
both distributions.

In case 1 above we need Libexecdir to be able to share build projects.
We can do without it if we only want to share usage projects. In case 2
Libexecdir is the only thing that can tell the upstream build project
where to install this kind of program. Bindir might prove useful for
installing programs in /bin instead of /usr/bin, and who knows, maybe a
need for /usr/bin64 arises some day in some distribution.

If we decide to support only case 3, then we don't need Bindir and
Libexecdir. Comfignat does currently use them but I can change that.

> Defining an "Archincludedir" variable would encourage libraries to use
> the same layout, at least inside a distribution.
> 
> If Debian aims at minimal changes, "/usr/lib/ARCH/ada/adainclude/NAME"
> seems a natural choice.

I fail to imagine why anyone would write architecture-dependent package
specifications in Ada. It shouldn't be hard to encapsulate the
architecture-dependent bits in a package body. But if they do exist then
I suppose we need a place to put them, so I'm willing to standardize
Archincludedir.

> For convenience of programmers and code browsing tools, a symbolic
> link could be provided from Archincludedir/SRC to Includedir/SRC for
> every arch-indep source. Gnat would not be confused because when it
> finds identical file names in two directories, it only considers the
> first one. I mention the idea for the record, but I dislike link
> forests.

I don't much like that. I don't want to encourage people to make a
habit of looking for everything in Archincludedir. Architecture-specific
APIs *should* be inconvenient. It encourages programmers to encapsulate
the architecture-dependent bits properly and keep the API clean.

I suppose we can refrain from banning such links, but let's not make
them mandatory at least.

> Concrete suggestion:
> ----------------------------------------------------------------------
> --  Debian version of /usr/share/ada/adainclude/installed_lib_dirs.gpr
> abstract project Installed_Lib_Dirs is
>    for Source_Files use ();
>    Deb_Host_Multiarch := external ("TARGET_ARCH");
>    Library_Dir           := "/usr/lib/" & Deb_Host_Multiarch;
>    Library_ALI_Dir       := "/usr/lib/" & Deb_Host_Multiarch & "/ada/adalib";
>    Arch_Dep_Source_Dir   := "/usr/lib/" & Deb_Host_Multiarch & "/ada/adainclude";
>    Arch_Indep_Source_Dir := "/usr/share/ada/adainclude";
> end Installed_Lib_Dirs;

Ah, I see that we also need a variable for the ALI directory. I've just
been putting ALI files in subdirectories of Libdir, but that's OK, let's
define a separate variable for this.

I think it could be confusing to have variables with the same names as
the attributes. The difference between Installed_Lib_Dirs.Library_Dir
and Installed_Lib_Dirs'Library_Dir is very subtle. I chose to copy the
variable names from the GNU Coding Standards. Many programmers,
packagers and system administrators are familiar with them due to the
popularity of Autoconf. I still think this is a good choice, and that
new variables we define should be named in the same style. I also don't
want to change the variable names that are already in use in Fedora and
Comfignat, unless there is a very good reason.
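
For illustration, here is a sketch of how the usage project above might
look with GNU-style variable names. The directories project name
OS_Directories and the library name Example_Lib are placeholders, and
Alidir is the variable proposed in the draft specification at the end
of this message:

```gpr
--  Hypothetical usage project; "os_directories" and "example_lib" are
--  placeholder names chosen only for this sketch.
with "os_directories";
library project Example_Lib is
   for Library_Name use "example_lib";
   for Library_Kind use "dynamic";
   for Externally_Built use "true";
   for Source_Dirs use (OS_Directories.Includedir & "/example_lib");
   for Library_Dir use OS_Directories.Libdir;
   for Library_ALI_Dir use OS_Directories.Alidir & "/example_lib";
end Example_Lib;
```

With these names there is no variable that differs from an attribute
only by the choice of "." or "'".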

> ----------------------------------------------------------------------
> --  Common contents for /usr/share/ada/adainclude/NAME.gpr
> --  This project file is designed to help build applications that use NAME.
> --  Here is an example of how to use this project file:
> --  with "NAME";
> --  project Example is
> --     for Main use ("example.adb");
> --  end Example;
> with "installed_lib_dirs";
> with "dep_providing_an_importable_project";
> library project NAME is
>    for Library_Name use project'Name;
>    for Library_Kind use "dynamic";
>    for Externally_Built use "true";
>    for Source_Dirs use
>      (Installed_Lib_Dirs.Arch_Indep_Source_Dir & "/" & project'Name,
>       Installed_Lib_Dirs.Arch_Dep_Source_Dir & "/" & project'Name);
>    for Library_ALI_Dir use Installed_Lib_Dirs.Library_ALI_Dir & "/" & project'Name;
>    for Library_Dir use Installed_Lib_Dirs.Library_Dir;
>    package Linker is
>       for Linker_Options use ("-ldep_providing_no_importable_project");
>    end Linker;
> end NAME;
> ----------------------------------------------------------------------

This example shows a shared library that depends on a second library but
isn't linked to that library. Instead the first library requires that
programs and libraries that use it must link to the second library. I
suppose this is normal for static libraries, but a shared library should
normally be linked to the libraries it depends on.

I actually have a library that needs to do this. The Ada Milter API has
a thread wrapper to work around certain problems in certain versions of
Libgnat. To work reliably the thread wrapper must be statically linked
into the program so that it's present before shared libraries are
loaded, so the usage project Milter_API uses Linker_Options to make the
program link to the thread wrapper. It's a very special case, definitely
not something that a shared library should usually do.

> A connected problem: in worst cases, the list of dependencies
> providing no importable projects may depend on the target
> architecture. As long as #717014 exists, this may happen even for Ada
> sources.
> 
> An arch-indep work-around for this exact bug is to append
> "-Wl,--as-needed -lbar -Wl,--no-as-needed" to linker options (or -lbar
> if --as-needed is already the default). This embeds -lbar when and
> only when needed on the target architecture, but not efficiently.
> 
> I mention the problem here because any better idea is welcome, and
> because the work-around should be documented in policy in the
> paragraph enforcing arch-indep gpr files.

In the case of a bug where a shared library uses another shared library
but isn't linked to it, that would typically be worked around in build
projects rather than usage projects. I think --as-needed is good enough
for this as long as the needed library exists even on architectures
where it isn't needed. Presumably the linker will complain if it can't
find the library.

Another solution could be to select on Hardware_Platform:

   case OS_Directories.Hardware_Platform is
      when "armel" | "armhf" | "ia64" | "m68k" | "mips" | "mipsel" |
           "powerpc" | "s390" | "s390x" | "sparc" =>
         for Library_Options use ("-lm");
      when others =>
         null;
   end case;

I specified in the Comfignat documentation that directories projects
shall define Hardware_Platform, but my intent then was that it should be
used in filenames, for example subdirectories in a build directory. For
use in case statements it's more important to coordinate the possible
values of Hardware_Platform, and I'm not sure how standardized those
architecture labels are.

To avoid the need to define all the possible architecture labels in a
type declaration, the decision could be moved from the build project to
whatever script, makefile or spec file launches the build. Define an
option in the project file:

   type Boolean is ("false", "true");
   Need_Libm : Boolean := external ("need_libm", "false");
   case Need_Libm is
      when "true" =>
         for Library_Options use ("-lm");
      when "false" =>
         null;
   end case;

and then pass -Xneed_libm=true to the builder on those architectures
where it's needed.


Anyway, here's my draft proposal for a specification:

A directories project is a GNAT project file that defines directory
variables for use by other project files. An operating system
distribution may include a directories project named [to be decided].gpr
which specifies the directories where programs and libraries packaged in
that distribution are installed. Users and system administrators may
write other directories projects with other names to encode local
policy.

A directories project shall define the following variables:

Hardware_Platform
   A short string, suitable for use in filenames, that identifies the
   hardware platform (sometimes called the hardware implementation) that
   is currently being compiled for. The directories project may depend
   on external values to know this.

Bindir
   The directory for programs that can be run from a command prompt.

Libexecdir
   The top-level directory for programs that are intended to be run by
   other programs rather than by users.

Libdir
   The directory for binary libraries to be used by other software, and
   the top-level directory for other architecture-specific files.

Alidir
   The parent of libraries' separate library-specific directories for
   Ada library information files.

Includedir
   The top-level directory for architecture-independent source files to
   be used in the compilation of software using libraries.

Archincludedir
   The top-level directory for architecture-specific source files to be
   used in the compilation of software using libraries.
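
A directories project satisfying this draft might look roughly like the
following sketch. The project name is a placeholder, the paths are
Debian-flavoured values taken from the suggestions quoted earlier in
this message, and the two external names are assumptions about what the
build environment provides:

```gpr
--  Hypothetical Debian-style directories project. The project name, the
--  external names and the exact paths are assumptions for illustration.
abstract project OS_Directories is
   for Source_Files use ();

   --  Both values are assumed to be supplied by the environment or by
   --  -X options; the build fails if either is missing.
   Hardware_Platform := external ("HARDWARE_PLATFORM");
   Multiarch         := external ("TARGET_ARCH");

   Bindir         := "/usr/bin";
   Libexecdir     := "/usr/lib";
   Libdir         := "/usr/lib/" & Multiarch;
   Alidir         := Libdir & "/ada/adalib";
   Includedir     := "/usr/share/ada/adainclude";
   Archincludedir := Libdir & "/ada/adainclude";
end OS_Directories;
```

A Fedora version would define the same seven variables with its own
values, for example "/usr/libexec" for Libexecdir, so that project
files written against the variable names work unchanged in both
distributions.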

-- 
Björn Persson
