DIY Docker using Skopeo+OSTree+Runc
Docker is awesome, but what is even more awesome about the UNIX philosophy is that you can combine small tools to create something that works like Docker.
In fact, Dockerlite used BTRFS and LXC to make a toy version of Docker.
In this post we are going to show how one can pull a Docker image and run containers without a Docker daemon. Of course, we do this for fun.
We are going to achieve the following:
- ability to pull docker images
- space efficient storage of images and containers
- even better than Docker (reusing not just layers, but individual files)
- run the container
We are going to use
- OSTree: content addressable storage, git-like for OS Images, space efficient and uses hardlinks
- Skopeo: a way to pull all kinds of images and convert them to all kinds of storage
- Runc (or any other OCI runtime, such as bubblewrap's bwrap-oci)
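To make the "content addressable" and "hardlinks" points concrete, here is a toy sketch using only coreutils. It is not how OSTree lays out its repository; the directory names and the store_file helper are made up for illustration. Each file is stored once under its SHA-256 hash and every copy becomes a hardlink to that single object.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
mkdir -p "$work/objects" "$work/img1" "$work/img2"

# Two "images" that happen to ship an identical file.
echo "same bytes" > "$work/img1/libfoo.so"
echo "same bytes" > "$work/img2/libfoo.so"

# store_file FILE (hypothetical helper): keep one copy per content hash
# and replace FILE with a hardlink to that copy.
store_file() {
  sum=$(sha256sum "$1" | cut -d ' ' -f 1)
  [ -e "$work/objects/$sum" ] || cp "$1" "$work/objects/$sum"
  ln -f "$work/objects/$sum" "$1"
}

store_file "$work/img1/libfoo.so"
store_file "$work/img2/libfoo.so"

# All three names should now show the same inode number.
ls -li "$work/objects" "$work/img1" "$work/img2"
```

Because both files hash to the same value, only one object is kept, which is the file-level reuse mentioned in the goals above.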
In this post we are going to run everything as a regular non-root user (to make it even more challenging).
Let's create a bare OSTree repository (similar to a bare git repository), which stores content-addressable objects rather than the usual files you see in a working tree. Instead of --mode=bare we use --mode=bare-user because we are not running as root.
$ mkdir ostree
$ cd ostree
$ ostree init --mode=bare-user --repo=$PWD
Let's pull 3 small Docker images from Docker Hub
$ skopeo copy docker://redis:alpine ostree:redis@$PWD
$ skopeo copy docker://nginx:alpine ostree:nginx@$PWD
$ skopeo copy docker://busybox:latest ostree:busybox@$PWD
$ ostree refs
The refs command prints something like
ociimage/redis_3Alatest
ociimage/nginx_3Alatest
ociimage/12345.....
ociimage/12345.....
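The "_3A" in those ref names looks like a hex escape for ":" (0x3A in ASCII), with "latest" filled in when the destination has no tag. The helper below is a guess based only on the output above: the name image_ref is made up, and other special characters (for example "/") are not handled.

```shell
#!/bin/sh
# image_ref NAME[:TAG] (hypothetical helper): guess the OSTree ref
# that ends up holding the image manifest.
# Assumption: only ':' needs escaping; this is inferred from the
# "ostree refs" output, not taken from the Skopeo source.
image_ref() {
  case $1 in
    *:*) name=$1 ;;
    *)   name=$1:latest ;;
  esac
  printf 'ociimage/%s\n' "$(printf '%s' "$name" | sed 's/:/_3A/g')"
}

image_ref redis
image_ref nginx:latest
```

The two calls print ociimage/redis_3Alatest and ociimage/nginx_3Alatest, matching the first two refs listed above.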
Now that we have the layers and the metadata, let's look around
$ ostree ls ociimage/redis_3Alatest
d00755 1000 1000 0 /
-00644 1000 1000 1568 /manifest.json
$ ostree cat ociimage/redis_3Alatest /manifest.json
Here is the equivalent of "docker inspect" for an image: the manifest points at a config blob, which we can read by its digest
$ config_hash=$(ostree cat ociimage/redis_3Alatest /manifest.json | jq -r .config.digest | cut -d ':' -f 2)
$ ostree cat ociimage/$config_hash /content | jq .
$ ostree cat ociimage/$config_hash /content | jq .config.Entrypoint
$ ostree cat ociimage/$config_hash /content | jq .config.Cmd
$ ostree cat ociimage/$config_hash /content | jq .config.ExposedPorts
let's create a directory for our container and apply layers one by one inside that directory
$ mkdir -p cont1/rootfs
$ ostree checkout --union ociimage/redis_3Alatest cont1
$ cat cont1/manifest.json | jq -r '.layers[]|.digest' | cut -d ':' -f 2 | while read a; do ostree checkout --union ociimage/$a cont1/rootfs; done
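The one-liner above is dense, so here is a dry-run sketch of its loop that only prints the checkout commands instead of running them. The function name print_checkouts and the shortened digests are made up for illustration; the real digests come from manifest.json.

```shell
#!/bin/sh
# print_checkouts ROOTFS (hypothetical helper): read layer digests on
# stdin, one per line with the base layer first, and print the matching
# checkout command for each.
print_checkouts() {
  while read -r digest; do
    # Strip the algorithm prefix: "sha256:abc..." becomes "abc..."
    printf 'ostree checkout --union ociimage/%s %s\n' "${digest#*:}" "$1"
  done
}

# Shortened placeholder digests standing in for real ones.
printf '%s\n' sha256:aaaa sha256:bbbb | print_checkouts cont1/rootfs
```

Piping the output of the jq command into this function, and then into sh, would be one way to review the commands before running them.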
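Alternatively, we can apply the layers in reverse order (newest first) and use "--union-add" instead of "--union": "--union" overwrites files that already exist, while "--union-add" keeps them and only adds new ones.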
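Why the two orderings give the same tree can be shown with plain directories and cp, without OSTree at all. This is only an analogy for flat directories: the layer names and file names are made up, and "overwrite" and "keep existing" stand in for the two checkout modes.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
mkdir -p "$work/layer1" "$work/layer2"
echo base    > "$work/layer1/app.conf"
echo base    > "$work/layer1/only-in-1"
echo updated > "$work/layer2/app.conf"

# Overwrite-style merge, oldest layer first (the "--union" ordering).
mkdir "$work/a"
for l in layer1 layer2; do
  cp -R "$work/$l/." "$work/a/"
done

# Keep-existing merge, newest layer first (the "--union-add" ordering).
mkdir "$work/b"
for l in layer2 layer1; do
  for f in "$work/$l"/*; do
    [ -e "$work/b/$(basename "$f")" ] || cp -R "$f" "$work/b/"
  done
done

# Both strategies should leave the newer app.conf in place.
diff -r "$work/a" "$work/b" && echo "same tree"
```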
Now we have the Redis root filesystem in "cont1/rootfs", and it takes almost no extra space because the files are merely hardlinks to the objects in the OSTree repository. Before we run it, let's generate the OCI "config.json"
$ cd cont1
$ runc spec --rootless
You can edit the file "config.json", for example you can
- adjust "args": [ ] to be the command to be executed, for example "args": [ "redis-server" ]
- adjust "mounts": [ ... ] to add a tmpfs on "/tmp" and "/var/run" or even "/var"
- adjust "namespaces": [ ... ] to add {"type": "network"} to give the container a separate network stack
- adjust the mapping between users in "linux": { "uidMappings": [ ... ] }; typically the container's root is mapped to the current user
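Instead of editing by hand, the same kind of changes can be scripted with jq, which is already used earlier in this post. The sketch below works on a small stand-in file that contains only the keys being edited, not the full output of "runc spec --rootless"; the tmpfs mount options are an assumption, so adjust them to taste.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)

# Stand-in for the generated file: only the keys edited below.
cat > "$work/config.json" <<'EOF'
{
  "process": { "args": ["sh"] },
  "mounts": [],
  "linux": { "namespaces": [ { "type": "pid" }, { "type": "user" } ] }
}
EOF

# Set the command, add a tmpfs on /tmp, and add a network namespace.
jq '.process.args = ["redis-server"]
    | .mounts += [{
        "destination": "/tmp",
        "type": "tmpfs",
        "source": "tmpfs",
        "options": ["nosuid", "nodev", "mode=1777"]
      }]
    | .linux.namespaces += [{"type": "network"}]' \
  "$work/config.json" > "$work/config.new.json"

# Show the edited parts.
jq '.process.args, .linux.namespaces' "$work/config.new.json"
```

Writing to a new file and reviewing it before replacing "config.json" keeps the original around in case an edit goes wrong.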
To run the container type "runc run" followed by any name like redis
$ runc run redis
In another tab you can enter that container and execute commands using "runc exec"
As you can see, there is no network card. You can make the container use the host network stack by removing the network namespace from "config.json", or you can create and configure a network namespace (requires root). In this article I'm limited to a non-privileged user, which can't create and manage networks. If you have root, you can create a virtual ethernet pair ("veth"), put one end inside this container and attach the other end to a bridge (using brctl), in a way similar to the docker0 bridge.
limitations?
- the rootfs directory is read-only; you can use a union filesystem like overlayfs (requires root) or a userspace alternative like FunionFS or unionfs-fuse
But using the OCI layers as OSTree refs kind of feels strange in OSTree. Why not stack the layers as subsequent commits to get the final image (e.g. 'ociimage/redis_3Alatest')?
It doesn't take advantage of 'ostree log' to show how the final image was created, and it hides the layering behind a couple of commands.
It additionally makes the references list unusable.
Good to understand how OCI images are built, but not practical IMHO.
Perhaps this is more of a Skopeo issue than of your post...