sakaki wrote: ↑Tue Feb 26, 2019 2:22 pm
What follows is a partial, initial cut of some project goals, scope etc: not a specification!
This is very much a work in progress!
Comments / suggestions are actively invited.
Project working name
Obviously, the most important thing to discuss is the name ^-^ So let's address that first.
Since, following ShiftPlusOne's suggestion, we're aiming to generalize the existing prototype to handle multiple OSes, including 32-bit ones, "raspbian-nspawn-64" isn't going to cut it.
So, I'm going to use pinc (= pi nspawn containers) as a working title for now (pronounced "pink"). If you have an idea for a better name, please chip in!
Overall goals
The overall goals of this project are to:
- allow straightforward integration of non-Raspbian 32 and 64 bit OSes within the (32-bit) Raspbian desktop environment; and,
- facilitate the sharing of such OS images by creators.
By straightforward integration, I mean that pinc users should be able to review the availability of, then install, start, stop and uninstall (instances of) guest OS images using only a simple GUI; further, that once an instance of an image is running, the user should (graphically) be able to open a shell within it; further, that applications installed within such an image instance should 'auto-magically' appear in the user's Raspbian desktop menu, from where they can be launched just as per 'standard' Raspbian apps; further, that when launched such applications should appear on the Raspbian desktop, have the ability to play sound and video, be able to access the network, and be able to operate on files in the user's home directory, just as if running as a normal process owned by that user in the host Raspbian OS.
By facilitate [...] sharing I mean that it should be possible for third-party OS image creators (those who regularly publish in this board's "Operating system distributions -> Other" forum, for example), to easily package, and publicize the availability of, OS images, so that end-users of pinc can "see", download, and use them in the manner just described.
For those coming fresh to this project, please note that a "proof of concept" of at least the menu and app integration components has been created - this is not "vapourware"; please review earlier posts in this thread for details, or see the bootable image available here.
It is intended that (at least for the 1.0 release) guest OSes will be hosted using a systemd-nspawn container.
Subject to QA approval, it is intended that pinc be made available via the official Raspbian repository (see notes on packaging below).
What pinc is not
This project is not:
- NOOBS or PINN: while the OSes to be used will be booted, they will not be natively booted on the hardware, but rather started up inside a systemd-nspawn container. Raspbian-32 will always be the host OS (and it is assumed the user has already successfully configured it for network access etc.)
- KVM: a single kernel will be shared by all OSes (host and guest(s)).
- QEMU: emulation will not be used; guest OSes will run natively (this point might get relaxed; tbd).
- chroot: guest OSes will run fully booted, with their own systemd and dbus, within a separate process namespace.
- Docker: guest OSes will be natively multi-process and do not require any support middleware, other than systemd itself. At least initially, no concept of 'multi-layered' OS images will be supported, for simplicity. This is an OS, not service, deployment approach.
- firejail: while the use of containers adds some security, the primary purpose here is not to facilitate a more unbreakable jail for OSes, but rather to more easily integrate them.
- a CLI toolkit: the focus is rather on a simple system that 'just works', usable even by relatively inexperienced users.
Who will use pinc?
The target audience for pinc includes those who, while wishing to stay within the familiarity of the 32-bit Raspbian desktop, need (or would like) to:
- run apps, or versions of apps, only available on 64-bit (e.g. mongodb-3.2);
- run apps, or versions of apps, only available on other distros (e.g. Arch) or other releases of an OS (e.g. Debian Buster);
- easily test software compatibility across multiple OSes;
- try other OSes without having to restart their Pi;
- try out other OSes that don't have (or have only limited) Pi driver support.
It is hoped that relatively inexperienced users will be able to use pinc successfully.
OS image creators will also use the pinc infrastructure to publicize and (possibly, see below) periodically build and distribute their images. (Such operations, being performed by a relatively small number of relatively knowledgeable individuals, will require the usage of CLI tools.)
Target hardware and software
To natively run 64-bit OSes, a 64-bit kernel is required (under ARMv8-A). That limits availability to the RPi3B, 3B+, 3A+, CM3, CM3L and 2Bv1.2 (per this post, TODO check).
However, if only 32-bit OSes are required, then any RPi should be usable.
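As a rough illustration of the kernel constraint above, the following Python sketch decides which guest word sizes a host could run, based on the machine string the running kernel reports (as from `uname -m`). The table and the function names `supported_guest_bits` and `can_host` are assumptions made for this example; they are not part of pinc, and the real check would also need the per-model verification flagged as TODO above.

```python
import platform

# Assumed mapping from the kernel's machine string to the ARM guest word
# sizes it can run natively. Illustrative only, not a pinc interface.
KERNEL_GUEST_BITS = {
    "aarch64": {32, 64},  # 64-bit kernel: 32-bit and 64-bit ARM userspace
    "armv7l": {32},       # 32-bit kernel: 32-bit userspace only
    "armv6l": {32},
}


def supported_guest_bits(machine=None):
    """Return the set of guest word sizes usable under the given kernel."""
    if machine is None:
        machine = platform.machine()  # same string that `uname -m` prints
    return KERNEL_GUEST_BITS.get(machine, set())


def can_host(guest_bits, machine=None):
    """True if a guest of the given word size could run without emulation."""
    return guest_bits in supported_guest_bits(machine)


if __name__ == "__main__":
    print("64-bit guests possible here:", can_host(64))
    print("32-bit guests possible here:", can_host(32))
```

A GUI could use a check of this kind to grey out 64-bit images when the 32-bit-only package variant (or a 32-bit kernel) is in use.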
TODO check kernel requirements for systemd-nspawn against raspberrypi-kernel config.
The target host OS is 32-bit Raspbian with desktop (LXDE).
The pinc gui will (probably) be written in Python (PyQt?) so as to be straightforwardly portable. Support services (such as the 'reflectors' for .desktop files and /etc/{passwd,shadow,group,gshadow}) will be written in bash.
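To make the .desktop 'reflector' idea a little more concrete, here is a minimal sketch of the rewriting step, in Python purely for illustration (the actual support services are planned in bash). It assumes a hypothetical host-side wrapper command, `pinc-run <instance> <command>`, which does not exist in the prototype under that name; the function name `reflect_desktop_entry` is likewise invented for this example.

```python
def reflect_desktop_entry(text, instance, wrapper="pinc-run"):
    """Rewrite a guest .desktop file so it can sit in the host's menu.

    `wrapper` is a hypothetical host command that runs its arguments
    inside the named container instance.
    """
    out = []
    for line in text.splitlines():
        if line.startswith("Exec="):
            # Prefix the guest command with the (assumed) wrapper.
            command = line[len("Exec="):]
            line = f"Exec={wrapper} {instance} {command}"
        elif line.startswith("TryExec="):
            # The guest binary is not present on the host filesystem,
            # so a TryExec check would wrongly hide the menu entry.
            continue
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    guest_entry = (
        "[Desktop Entry]\n"
        "Name=Chromium\n"
        "TryExec=/usr/bin/chromium\n"
        "Exec=/usr/bin/chromium %U\n"
        "Type=Application\n"
    )
    print(reflect_desktop_entry(guest_entry, "debian-stretch-64"), end="")
```

A real reflector would additionally have to watch the guest's applications directory for changes and handle icons; neither is attempted here.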
Some comments on scope
For 1.0, the following are out of scope:
- guest OSes that don't run systemd
- (gui support for) guest OSes that don't use X11 (but use e.g. wayland natively)
- (sound support for) guest OSes that do not support pulseaudio
- guest OSes shipped in a binary format incompatible with native execution on the Pi (e.g. x86_64)
- guest OSes that require network namespacing to operate (may be relaxed when Raspbian migrates to Buster, please see notes on namespacing later)
- guest OSes that require user namespacing to operate
- guest applications that require MMAL / OpenMAX IL when booted under a 64-bit kernel
TODO: expand
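One way the scope list above might be applied in practice is as a compatibility check against per-image metadata. The sketch below is only that: the `GuestImage` fields and their values are hypothetical (no metadata format has been defined yet), and the MMAL / OpenMAX IL item is omitted because it concerns individual applications rather than whole images.

```python
from dataclasses import dataclass


@dataclass
class GuestImage:
    """Hypothetical per-image metadata; field names are assumptions."""
    init_system: str = "systemd"
    display: str = "x11"
    sound: str = "pulseaudio"
    arch: str = "arm64"
    needs_network_namespace: bool = False
    needs_user_namespace: bool = False


def out_of_scope_reasons(image):
    """Return human-readable reasons why an image falls outside 1.0 scope."""
    reasons = []
    if image.init_system != "systemd":
        reasons.append("guest does not run systemd")
    if image.display != "x11":
        reasons.append("no GUI support for non-X11 guests")
    if image.sound != "pulseaudio":
        reasons.append("no sound support without pulseaudio")
    if image.arch not in ("arm", "arm64"):
        reasons.append("binary format cannot run natively on the Pi")
    if image.needs_network_namespace:
        reasons.append("guest requires network namespacing")
    if image.needs_user_namespace:
        reasons.append("guest requires user namespacing")
    return reasons


if __name__ == "__main__":
    print(out_of_scope_reasons(GuestImage(arch="x86_64")))
```

An empty list means the image passes all of the checks sketched here.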
Some comments on packaging
A 64-bit kernel is required to run 64-bit (unemulated) userspace, and since no official 64-bit kernel is currently provided in the RPF repos, one will be provided as part of this project, using the tarballs from my bcmrpi3-kernel-bis kernel autobuild project. A fair draft prototype of this has already been created by ShiftPlusOne; please see here ff.
The production variant of this will automatically track autobuild releases, and (subject to QA) be made available (as a versioned series of debs) in the official Raspbian repo.
The pinc gui application, support scripts and systemd service files will be published as a second package, with two variants:
- One that can support both 64-bit and 32-bit guest OSes, which will depend on the above 64-bit binary kernel package; and
- One that only supports 32-bit guest OSes, which will not.
Again, subject to QA, the intention is for these to be made available as debs in the official Raspbian repo.
OS images will likely be distributed as compressed, signed root filesystem tarballs, and published by placing an entry in a metadata file visible at a pre-arranged URL (known to the pinc client app). For example, we might follow the Gentoo overlay model here, and have a GitHub site, controlled by RPF, holding an "available_images.xml" (or similar) catalogue file, against which pull requests (PRs) may be made to add entries. (The precise mechanism is TBD, so that versions can be handled efficiently without necessarily requiring a sign-off each time.)
Users will be able to append their own known images via a local file too (requiring no special permissions).
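For illustration, here is how a client might combine the published catalogue with a user's local file. Everything about the format is assumed: the `<catalogue>` / `<image>` element names and the `name`, `version`, `url` and `trust` attributes are invented for this sketch, since the real catalogue format is still TBD. Local entries are only appended; they cannot replace a published entry with the same name and version, which is one possible reading of "append".

```python
import xml.etree.ElementTree as ET


def parse_catalogue(xml_text):
    """Parse an (assumed) catalogue document into {(name, version): entry}."""
    root = ET.fromstring(xml_text)
    images = {}
    for node in root.findall("image"):
        key = (node.get("name"), node.get("version"))
        images[key] = {"url": node.get("url"), "trust": node.get("trust")}
    return images


def merge_catalogues(remote_xml, local_xml):
    """Append local entries to the remote catalogue without overriding it."""
    merged = parse_catalogue(remote_xml)
    for key, entry in parse_catalogue(local_xml).items():
        merged.setdefault(key, entry)  # keep the published entry on a clash
    return merged


if __name__ == "__main__":
    remote = (
        '<catalogue>'
        '<image name="debian-buster-64" version="1" '
        'url="https://example.org/debian-buster-64.tar.xz" trust="official"/>'
        '</catalogue>'
    )
    local = (
        '<catalogue>'
        '<image name="my-spin" version="1" '
        'url="file:///home/alice/my-spin.tar.xz" trust="personal"/>'
        '</catalogue>'
    )
    for key, entry in merge_catalogues(remote, local).items():
        print(key, entry)
```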
On the trustworthiness of images
The format of the image catalogue file is TBD, but will contain a trust level for each image version. At least 3 levels will initially be supported:
- Official: these are images distributed by the RPF themselves, hosted on their servers, and signed with an RPF release key. At least a 64-bit Debian (probably Buster) image will be provided by the time of the 1.0 release in this manner.
- Community: these are images hosted and built on the pinc image server (see below), signed by a "community" key.
- Personal: images hosted on servers other than the above, signed by the contributor themselves.
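The three levels above could be modelled as an ordered enumeration, so that a client can offer a "show only images at or above this level" setting. Note that treating the levels as strictly ordered (Official above Community above Personal) is an assumption of this sketch, as are the names `TrustLevel` and `filter_by_trust` and the dictionary shape of the catalogue entries.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    """Assumed ordering: a higher value means more vetting of the image."""
    PERSONAL = 1
    COMMUNITY = 2
    OFFICIAL = 3


def filter_by_trust(entries, minimum):
    """Keep only the catalogue entries at or above the given trust level."""
    return [
        entry for entry in entries
        if TrustLevel[entry["trust"].upper()] >= minimum
    ]


if __name__ == "__main__":
    catalogue = [
        {"name": "debian-buster-64", "trust": "official"},
        {"name": "arch-spin", "trust": "community"},
        {"name": "my-experiment", "trust": "personal"},
    ]
    for entry in filter_by_trust(catalogue, TrustLevel.COMMUNITY):
        print(entry["name"])
```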
Some comments on infrastructure
I am in discussion currently with a potential sponsor who has expressed interest in funding a three-year VPS contract, to act as a community image server (CiS).
The idea is that this host would be managed by a group of individuals from the Pi user community and (should any wish to do so) some RPF engineers. Users could submit images for hosting on the CiS by providing a bash script (a sort of dockerfile-lite ^-^), which would, when invoked:
- Check if a new image (of the particular target OS) needs to be built, and if so:
- Build it. This would usually involve downloading a baseline image from a well-known official site (or using debootstrap, pacstrap etc to create same), adding some customization to it (additional packages etc.) using only official upstream repos, setting some required baseline configuration to make the image 'pinc-compatible' (see further comments below) and then tarring up the result.
Image tarballs created in this way would then be signed by a "community" key and hosted on the CiS itself. Autobuilds could be created on, say, a monthly cadence (ensuring end-users did not have to spend too much time on the initial package update step, since all provided images would be reasonably 'fresh').
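The "does a new image need to be built?" check is left open above. One simple possibility, sketched here in Python under the assumption that image age alone drives the decision, is to rebuild whenever the last successful build is older than the chosen cadence. The function name and the 30-day default are illustrative; a real script might instead compare upstream package versions.

```python
from datetime import datetime, timedelta


def needs_rebuild(last_built, now, max_age=timedelta(days=30)):
    """Return True if no image exists yet or the last build is too old.

    `last_built` is the time of the last successful build, or None if the
    image has never been built. The 30-day default mirrors the monthly
    cadence suggested in the text and is only an assumption.
    """
    if last_built is None:
        return True
    return now - last_built >= max_age


if __name__ == "__main__":
    now = datetime(2019, 3, 8)
    print(needs_rebuild(None, now))
    print(needs_rebuild(datetime(2019, 2, 26), now))
    print(needs_rebuild(datetime(2019, 1, 1), now))
```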
Those managing the service would only need to review the submitted creation script once, at the outset, to check it didn't do anything obviously malicious, that it only downloaded from official repos etc., and then approve it. Images thereby created would then have an accelerated (or automatic?) path to publication in the catalogue file.
NB: it may be that licensing concerns etc. prohibit such a service getting off the ground, but it'd be a nice 'halfway house' to provide users some level of assurance they aren't downloading anything truly toxic, without RPF having to sign off on everything.
Alternatively: could just run the scripts (a la ebuilds ^-^) on the end-user's system, constructing the desired spin 'on the fly'. In this case, only the scripts would need to be community hosted (after checking), and these could just live in a GitHub repo or similar. This idea breaks down somewhat where compilation etc. needs to be done for any packages, as then compiler and other build-support packages would need to be loaded by each end-user, rather than once at the CiS. TBD.
Namespacing
By default, guest OSes will run in their own process namespace, courtesy of systemd-nspawn.
However, due to issues with bind mounts in /tmp under the Stretch version of systemd (v232), network namespacing will initially not be supported (as the Unix abstract domain sockets for the host X11 server need to remain visible, see e.g. my notes here). This could be relaxed once the move to Buster has been completed. However, once network namespacing is used, a host bridge also needs to be put in place so the veth tunnel traffic from the container can be routed (and inter-container routing is also made more complex). Also, those users who want to do interface specific stuff in an image (for example, kali) will not want network namespacing on. So it should remain an optional feature, even when permitted by the OS.
Due to the relative complexity of specifying 'punchouts' under systemd-nspawn, user namespacing will also not initially be supported. It may be an option for future releases however.
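To show how the namespacing choices might surface in practice, the sketch below assembles a systemd-nspawn command line for a booted guest, with network namespacing as an opt-in switch. The flags used (`--boot`, `--machine=`, `--directory=`, `--network-veth`) are standard systemd-nspawn options, and `--network-veth` implies a private network namespace; the function name, its parameters and the example path are assumptions, and this is not necessarily how the prototype launches its container. No user-namespacing option is emitted, in line with the 1.0 scope.

```python
def nspawn_command(instance, root, network_namespace=False):
    """Build an argv list for booting a guest with systemd-nspawn.

    With network_namespace=False the guest shares the host's network
    namespace, so the host X11 server's abstract sockets stay visible.
    """
    argv = [
        "systemd-nspawn",
        "--boot",                  # run the guest's own init (systemd)
        f"--machine={instance}",   # name of the container instance
        f"--directory={root}",     # root filesystem of the instance
    ]
    if network_namespace:
        # Optional: private network namespace with a veth link to the host.
        argv.append("--network-veth")
    return argv


if __name__ == "__main__":
    # The path below is a placeholder, not a location defined by pinc.
    print(" ".join(nspawn_command("debian-stretch-64",
                                  "/var/lib/machines/debian-stretch-64")))
```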
Preparation of images
Modifying an existing OS root filesystem image for use with pinc should be straightforward. It will look something like:
- Downloading the 'baseline' guest image, then entering it via a chroot (or non-boot systemd-nspawn).
- Updating / downloading package metadata on the guest.
- Installing systemd container support, sudo, pulseaudio and zenity libraries into the guest (plus their deps of course).
- Setting locale and timezone (TBD)
- Adding required scripts etc. (e.g. to ensure pulseaudio is used by default)
- Disabling any services which might try to 'take control' of the RPi3 peripherals on boot
- Adding any 'image maintainer payload' applications and setup (i.e., those specific to a particular 'spin')
- Cleaning up the image (removing root's bash history, package archives, possibly package metadata), exiting the chroot.
- Tarring up the resulting filesystem.
- Signing the tarball.
Note that no additional users, groups etc. need be created as part of the image prep, since pinc's "reflector" services will take care of this auto-magically (this concept is already functional in the prototype, btw).
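The "tarring up" step of the preparation list could look something like the following Python sketch, which packs a prepared root filesystem directory into an xz-compressed tarball and returns its SHA-256 digest (for example, to record in a catalogue entry). The function name `pack_rootfs` is invented here, the choice of xz and SHA-256 is an assumption, and signing itself is deliberately left out. The sketch also makes no attempt to preserve extended attributes, which a production tool would need to consider.

```python
import hashlib
import tarfile


def pack_rootfs(rootfs_dir, tarball_path):
    """Tar up a prepared root filesystem and return the tarball's SHA-256."""
    with tarfile.open(tarball_path, "w:xz") as tar:
        # arcname="." stores paths relative to the root of the image.
        tar.add(rootfs_dir, arcname=".")
    digest = hashlib.sha256()
    with open(tarball_path, "rb") as fh:
        # Hash in 1 MiB chunks so large images need not fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        rootfs = os.path.join(tmp, "rootfs")
        os.makedirs(os.path.join(rootfs, "etc"))
        with open(os.path.join(rootfs, "etc", "hostname"), "w") as fh:
            fh.write("guest\n")
        print(pack_rootfs(rootfs, os.path.join(tmp, "image.tar.xz")))
```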
Images and instances
While we deliberately won't go down the Docker stacked-image rabbit hole for v1.0, it will probably be useful to borrow their terminology of 'images' and 'instances'.
So, when an OS image (probably a tarball, actually, rather than a true filesystem image) is downloaded, it will be stored in read-only form (in /var/lib/pinc or somewhere). Then when an instance of this is created, an OverlayFS will be built, with the 'upper' (copy-on-write) layer in /var/lib/machines, and the 'lower' (read-only) layer being the image itself. This makes it trivial (and very fast) to create multiple instances of an image, once it has been downloaded.
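As a sketch of the image/instance layering just described, the function below builds the mount command for one instance's OverlayFS. The option string (`lowerdir=`, `upperdir=`, `workdir=`) is standard overlay mount syntax, and OverlayFS requires the work directory to be an empty directory on the same filesystem as the upper layer. The function name and every path in the example are placeholders; the real on-disk layout has not been decided.

```python
def overlay_mount_command(lower, upper, work, merged):
    """Build an argv list that mounts an OverlayFS for one instance.

    lower  - read-only image directory, shared by all instances
    upper  - per-instance copy-on-write layer
    work   - empty scratch directory on the same filesystem as `upper`
    merged - mount point where the combined view appears
    """
    options = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, merged]


if __name__ == "__main__":
    # Placeholder paths, chosen only to echo the directories named above.
    argv = overlay_mount_command(
        lower="/var/lib/pinc/images/debian-stretch-64",
        upper="/var/lib/machines/.pinc/debian-stretch-64/upper",
        work="/var/lib/machines/.pinc/debian-stretch-64/work",
        merged="/var/lib/machines/debian-stretch-64",
    )
    print(" ".join(argv))
```

Because only the upper layer is per-instance, creating a second instance amounts to making two empty directories and running one more mount.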
Future versions of pinc could possibly extend this (checkpointing etc.), but we'll keep things simple for now.
It may be beneficial to use SquashFS or similar for the (read-only) images themselves, should the CPU overhead of decompression be outweighed by the benefit of fewer backing-store accesses onto relatively slow media. TBD.
Sketched use cases
An end-user story
Alice is an RPi3B+ user who would like to run a more up-to-date version of the Chromium browser than the one available in the standard Raspbian repo. Having done some research, she finds an appropriate version is available on 64-bit Debian Stretch. She opens the pinc application and sees this is available as an official image, and clicks on it to install. Once downloaded, she is prompted for an instance name, and chooses "debian-stretch-64" (!). The OS is booted in a container (this only takes a few seconds, and consumes minimal system memory) at which point her new instance appears in the pinc GUI. She clicks on it and selects "Open container shell...". A terminal opens inside the container OS instance, and she does a standard "sudo apt-get update && sudo apt-get install -y chromium" to install the browser. Once this completes, a new menu item for her browser appears (auto-magically) inside the "Internet" section of her Raspbian desktop main menu, underneath her existing (32-bit) Chromium menu item. She clicks on the new item and 64-bit Chromium opens on her desktop. She browses to youtube.com - audio and video playback both work! She opens another tab, goes to a google doc and saves it in her home directory - this works too; the document is saved with her user credentials - and she isn't even logged on as "pi".
Alice installs a few more apps inside her container, and then elects, via the pinc GUI, to have the instance auto-started on boot. She powers down her RPi3. Later, she starts it back up again, and when the Raspbian desktop appears, clicks on the menu item to run (64-bit) Chromium. It opens (since the underlying container has been autostarted) with her browsing history intact.
NB, conveniently ^-^, this is a flow that has at least partially been prototyped; you can try it out yourself, using the 'proof of concept' raspbian-nspawn-64 image. Please see my post here.
TODO: add to this with an image creator's story etc.
On future directions
While out of scope for 1.0, there are some directions that could be subsequently explored with pinc:
- The use of emulation: for example, supporting x86_64 linux/systemd OS images via user-mode QEMU and binfmt_misc would be relatively straightforward.
- The integration of Windows applications using wine.
- The integration of non-systemd guest OSes (which are still linux).
- The integration of non-linux guest OSes and apps therein.
- etc.
The first two could probably be integrated fairly straightforwardly within a systemd-nspawn approach. The third is more problematic, and the fourth, challenging.
Nevertheless, targeting only arm/arm64 systemd-based linux OS images for 1.0 still covers a lot of bases. Most of the systems covered in the "Operating system distributions -> Other" board could probably be straightforwardly adapted (Debian, openSUSE, Arch, Ubuntu etc.)
Comments welcome!
To reiterate, at this stage this is very much a rough sketch / work in progress and comments are actively invited. Per the previously published schedule, I would like to complete the review period on this phase by Fri Mar 8. Ideally, please post comments in this thread; alternatively, I can be reached at sakaki@deciban.com.
Best,
sakaki