
Re: own cloud task in tasksel?



On 10/03/2016 6:22 PM, Charles Plessy wrote:
> On Thu, Mar 10, 2016 at 10:18:30AM +0100, Bastian Blank wrote:
>> On Wed, Mar 09, 2016 at 11:51:58PM +0900, Charles Plessy wrote:
>>>
>>> Maybe this problem can be solved by the use of metapackages?  With the
>>> exclusion of cloud-init, specialised kernels etc., can we converge on a
>>> metapackage that would represent the most frequent expectations of
>>> users of non-minimal cloud images in Debian and elsewhere?
>>
>> That's what a task is, a meta-package.
>
> Since debian-boot@l.d.o was CCed, I thought that "task" was employed to
> mean "a metapackage built from the tasksel source package", not just any
> metapackage in general.
>
> So let me rephrase:
>
> Is the proposal to go through tasksel?  If yes, what are the expected
> advantages over the use of an ad-hoc metapackage?
>
> Cheers,


All of what people generally want installed in their EC2 instances can be achieved with a suitable boot-time UserData script that cloud-init runs. As long as the base image carries enough packages to fetch and install additional packages at boot (that is, enough to be configured, with simple settings, to make a request to a Debian repo by some means: SOCKS, an explicitly defined proxy, a private repository, or a direct connection), then that's easy. So, for example, in an EC2 metadata environment, setting the UserData to:

  #!/bin/sh
  apt-get update && apt-get install -y less unattended-upgrades


...will do what many users want quite quickly.

I don't think that requires us to create additional base images catering for various combinations. Sometimes those instances want to download payloads from storage (S3, or other storage external to the instance) that the client fetches using a script. As long as the instance has the necessary tools (curl, wget, awscli in the case of EC2, plus the lower-layer SSL libraries that curl and wget use for HTTPS), that provides ample means to deliver secured, authenticated payloads to the instance. If users want a base image for their own purposes, they can either take a base image, customise it and "create image" from their running server, or use bootstrap-vz to generate their own images with their selected base packages.
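A UserData script can also guard that assumption before attempting any downloads; a minimal sketch (the tool list is illustrative, matching the EC2 tools mentioned above):

  #!/bin/sh
  # Sketch: report whether the fetch tooling a payload needs is
  # present in the base image before trying to pull anything.
  for tool in curl wget; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: present"
    else
      echo "$tool: missing"
    fi
  done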

One item of note that we have included in the AWS EC2 base images is the apt-transport-https package. In environments where the (customer organisational) security policy forbids outgoing HTTP from an instance but permits (possibly limited) HTTPS, this means a simple reconfiguration of the base image's sources.list makes package installation possible without a chicken-and-egg problem.
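Concretely, that reconfiguration is a one-line substitution; a sketch against a temporary copy (the mirror and release names here are illustrative placeholders, not the actual image contents):

  # Sketch: with apt-transport-https installed, an HTTP mirror entry
  # can be flipped to HTTPS with a simple substitution.
  echo 'deb http://deb.debian.org/debian jessie main' > /tmp/sources.list.example
  sed -i 's|http://|https://|' /tmp/sources.list.example
  cat /tmp/sources.list.example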

Should users pre-bake their images? For large organisations, corporates, governments, etc. who may do hundreds of launches per day, possibly in an auto-scaling group: yes, so that the image and all launch-time dependencies are pre-installed and not subject to external services, outside of the cloud provider, that may be unavailable during a launch. I've seen people bootstrap live pulls from external revision-control platforms, and then seen those platforms be down when they've tried to launch. If you want to rely on the stability and availability of an image, you should master your own image and maintain it as an artifact for your environment.
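On EC2, that "create image" step against a running, customised instance is a single API call; a hedged sketch (the instance ID and names are hypothetical placeholders, and valid AWS credentials are assumed):

  # Snapshot the running instance's volumes and register a private AMI
  # that you can then maintain as your own launch artifact.
  aws ec2 create-image --instance-id i-0123456789abcdef0 \
      --name "debian-prebaked-example" \
      --description "pre-baked base image with launch-time dependencies"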

If you're contemplating an ad-hoc meta-package instead, then perhaps that is something more widely applicable, cloud and non-cloud, such as what tasksel already has as tasks? But if a possible "cloud" task were to contain utilities for various cloud vendors, then who decides which vendors and utilities are included, and which are not? And when does a small cloud provider get a task for their cloud environment and APIs? Or do we make "cloud-aws" and "cloud-azure" tasks? If we do that, then why not include those utilities in the base image(s) of those providers to start with? That starts to become bloatware: if I have a package where 90% of the content is for cloud environments I don't use, then I'm paying for the storage of those utilities, times many instances. Even a "cloud-aws" task may include libraries and APIs for things I'll never use: the Go/Python/R/Perl/Java/Node/PHP SDKs, when all I want is less? So updating the above simple UserData script to use tasksel is also trivial:

  #!/bin/sh
  apt-get update && tasksel install <task>


My preference is to keep the number of images per release to a minimum, with small deviations that ensure the "base" image is universally useful in each provider's environment. We were, for a while, generating 4 images in EC2 per release (multiplied by 12 regions). Now we generate one 64-bit x86 image and replicate it (PV virtualisation and i386 are sunsetting on AWS; amd64 HVM is everywhere). If you twist my arm, then perhaps additional base images for new CPU architectures may one day be needed.


  James

--
Mobile: +61 422 166 708, Email: james_AT_rcpt.to
