Alpine Linux on ARM (Turris Router)
My home router is a Turris Omnia, which provides the option for running LXC containers; I use this for SSH jumphosts and other such things as belong “on the router itself”.
Last night I decided that it was time to install an Alpine Linux container, to complement the Debian container which has been predominantly used to date. This presented a few issues, but all was done. In this post: networking from no-network, CIDR (classless) routes accepted over DHCP, and other quirks.
Apologies in advance for use of cat straight to a terminal; busybox’s cat(1) does not support -v and I’m choosing to trust the state of the OS at the point in time where I’m running these commands, such that I consider the risk acceptable.
The version presented here is cleaned up, with debugging repetitions removed, showing only what I believe to be the steps needed.
Getting Started
I live in Pittsburgh, PA. The OS is “Alpine”, the nearest semi-famous mountain is Mount Washington (popular with tourists) so we have a container name: washington.
To create the container, I happened to use the Web UI, as it presented a convenient drop-down of options. The interface is through LuCI, the current web front-end to the UCI configuration system. Thus for me, the URL was https://turris.lan/cgi-bin/luci/admin/services/lxc (I mention this so as to note in passing that one of the nice things about the Turris was how easy it was to install a key/cert signed by my personal Certificate Authority).
I chose Alpine 3.7. I did not start it. This was the end of the use of the web interface: CLI all the way from here. With logbooks, so that if I ever need to automate this then I have a playbook to use for building the configuration management system rules. From logbooks of previous installs, I see that I could have avoided the Web UI here too with:
lxc-create -t download -n x -- -l
lxc-create -t download -n washington -- -d Alpine -r 3.7 -a armv7l
Logged into the Turris system as root, the next step is some basic configuration:
grep '^lxc\.network\.hwaddr' /srv/lxc/washington/config
cat /srv/lxc/washington/rootfs/etc/hostname
echo washington > /srv/lxc/washington/rootfs/etc/hostname
I used the MAC address printed by the first command to populate the Single Source of Truth used for generating my home DHCP and DNS server configs. This is easier if you use whatever’s on the Turris; I forget the details, because I disabled much of that stuff: I already have ISC dhcpd and Unbound running at home.
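If you want just the address rather than the whole config line, a small filter does the job. This is a minimal sketch, not something from my logbook: the function name mac_from_config is made up for illustration, and it assumes the usual LXC "key = value" layout.

```shell
# Sketch: print the MAC address from an LXC config read on stdin.
# Assumes a line of the form:
#   lxc.network.hwaddr = 00:16:3e:00:00:01   (made-up example address)
mac_from_config() {
    sed -n 's/^lxc\.network\.hwaddr[[:space:]]*=[[:space:]]*\([0-9A-Fa-f:]*\).*/\1/p'
}
```

Usage would be along the lines of: mac_from_config < /srv/lxc/washington/config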
Next, the basic configuration to get running:
lxc-start -n washington
uci show lxc-auto
uci add lxc-auto container
uci set lxc-auto.@container[-1].name=washington
uci set lxc-auto.@container[-1].timeout=30
uci show lxc-auto
uci commit lxc-auto
lxc-attach -n washington
At this point, assuming lxc-auto is enabled for system boot, the container should start automatically on next boot; I seem to recall that this took much poking and prodding when I first set up a container on this host, because the usual LXC auto-start stuff all exists but is unused, with UCI instead being the only working mechanism.
Networking
Our next problem is that we have no networking, no /etc/network/interfaces, very little in the way of networking tools, and very little installed:
# apk info | sort | xargs
alpine-baselayout alpine-keys apk-tools busybox libc-utils
libressl2.6-libcrypto libressl2.6-libssl musl musl-utils
scanelf zlib
[ manual linebreaks added ]
While https://wiki.alpinelinux.org/wiki/Configure_Networking provides some guidance, it still assumes more of a system than we have. Why are things so bare? https://doc.turris.cz/doc/en/public/lxc_alpine explains that the original package source Turris used dropped all their armhf packages without notice, so Turris switched to the minimal OS images directly from Alpine, which unfortunately are intended for use in other scenarios. So, manual IP configuration and route-table manipulation it is. Adding in the steps we need from both sources and adapting for being inside a container, we have this; I’ve changed some IP details, so we’re assuming that our IP (per DHCP reservation) is 192.168.1.201 in a /24 network with DNS servers on the .10 and .11 IPs and default gateway (the Turris itself) on the .1 IP.
cat /etc/hosts
printf '::1\t\tlocalhost ipv6-localhost ipv6-loopback\nfe00::0\t\tipv6-localnet\n' >> /etc/hosts
for x in 0:mcastprefix 1:allnodes 2:allrouters 3:allhosts; do printf 'ff02::%s\t\tipv6-%s\n' ${x%:*} ${x#*:} ; done >> /etc/hosts
cat /etc/hosts
(printf 'auto lo eth0\n';
printf 'iface lo inet%s loopback\n' '' 6;
printf 'iface eth0 inet%s %s\n' 6 manual '' dhcp;
printf '\tudhcpc_opts -O staticroutes\n' ) \
> /etc/network/interfaces
mount -t proc proc /proc
ifconfig eth0 192.168.1.201 netmask 255.255.255.0 up
route add default gw 192.168.1.1
( printf 'domain lan\n';
printf 'nameserver 192.168.1.%s\n' 10 11;
printf 'options timeout 1 attempts 1 rotate\n') \
> /etc/resolv.conf
apk update
apk upgrade
apk add busybox-initscripts
rc-update add networking && rc-update add bootmisc boot && rc-update add syslog boot
apk add zsh vim curl git acl attr iputils rsync
reboot
As is normal for containers, reboot merely restarts the container, not the outside OS. Above I enable IPv6 and IPv4. Note that Alpine does not appear to support NDP for IP configuration. Note too that for IPv4, I’ve used udhcpc_opts -O staticroutes as an option line in /etc/network/interfaces. I’ve confirmed that this adds to the options on the command-line, instead of replacing them. This is how we tell the Alpine DHCP client that we want to request an extra option, DHCP option 121 per RFC 3442. At this point, the option will be requested and details made available in environment variables to the configuration scripts, but nothing else is done with them.
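The printf tricks used to write /etc/network/interfaces are a little dense, so here is the same generator wrapped in a function that writes to stdout instead of the file, making it easy to inspect the result before committing to it. The function name render_interfaces is made up for this sketch; the printf lines are the ones from the steps above.

```shell
# Sketch: emit the interfaces file to stdout rather than overwriting
# /etc/network/interfaces, so the result can be reviewed first.
render_interfaces() {
    printf 'auto lo eth0\n'
    # printf repeats its format for each argument: '' gives inet, 6 gives inet6
    printf 'iface lo inet%s loopback\n' '' 6
    # arguments are consumed in pairs: (6, manual) then ('', dhcp)
    printf 'iface eth0 inet%s %s\n' 6 manual '' dhcp
    printf '\tudhcpc_opts -O staticroutes\n'
}

# Output, with <TAB> standing for a literal tab:
#   auto lo eth0
#   iface lo inet loopback
#   iface lo inet6 loopback
#   iface eth0 inet6 manual
#   iface eth0 inet dhcp
#   <TAB>udhcpc_opts -O staticroutes
```

The option line has to follow the inet dhcp stanza, since that is the stanza which invokes the DHCP client.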
We can then look at less /var/log/messages and see a bunch of spam filling the logs quickly because /etc/inittab is configured to request getty processes listening on TTYs which don’t exist. vi /etc/inittab and remove the tty5 and tty6 entries. I also commented out 2, 3 and 4: we don’t need these things. One getty should be enough: in fact, it’s probably more than enough, since lxc-attach -n CONTAINER does not use a serial line to connect in. We can comment out all of the getty lines quite safely and reclaim a little RAM.
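For anyone who would rather script that edit than do it in vi, a filter along these lines should work. It is a sketch under an assumption: that the getty entries look like "tty1::respawn:/sbin/getty 38400 tty1", which is how I would expect them to appear but which you should check against your own file. The function name comment_out_gettys is made up.

```shell
# Sketch: comment out every not-yet-commented inittab line mentioning getty.
# Reads inittab content on stdin and writes the edited content to stdout.
comment_out_gettys() {
    # On lines containing "getty", prefix the first character with '#'
    # unless the line already starts with '#'.
    sed '/getty/s/^[^#]/#&/'
}
```

Something like comment_out_gettys < /etc/inittab > /tmp/inittab.new, followed by a diff against the original before copying it into place, keeps the change reviewable.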
To debug what was needed for udhcpc, I read /usr/share/udhcpc/default.script to see how things fit together. Then, taking advantage of “not yet permitting remote login” to do unsafe FS-race-attack-prone things in /tmp, I just ran:
mkdir /etc/udhcpc
echo 'env > /tmp/troll.$$' > /etc/udhcpc/udhcpc.conf
service networking restart
rm /etc/udhcpc/udhcpc.conf
This yields a file in /tmp for each invocation of the udhcpc script, letting us see exactly which variables are exposed each time, and at what stages we’re called. At this point, there was much invocation of strings upon binaries and other debugging techniques to get to where I figured out the option name required: it has been renamed at least once, and many search results online are outdated. What I did get from search-fu was that udhcpc_opts is the relevant option in /etc/network/interfaces.
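With several dump files lying around, it helps to boil them down to the set of variable names seen across all invocations. A minimal sketch, with a made-up function name (dump_var_names); it treats any line of the form NAME=value as a variable and ignores everything else:

```shell
# Sketch: list the distinct variable names found in one or more env dumps.
# Arguments: the dump files to read.
dump_var_names() {
    cat "$@" \
        | sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' \
        | sort -u
}
```

Run as dump_var_names /tmp/troll.* and then grep the individual files for whichever names look interesting.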
Time to create some shell scripts to manage bringing up the CIDR routes. Here’s the setup and the scripts I wrote; please note that the renew and deconfig paths have not been heavily tested. I know that bringing the interface up works well enough.
Of note: DHCP option 121, aka cidr-routes, aka rfc3442-classless-static-routes, is required to include all routes, including the default route, as clients should ignore the normal route option in the presence of DHCP option 121. (This is in contrast to ms-cidr-routes aka ms-classless-static-routes aka private option space usage of code 249 for Microsoft platforms: although it’s documented in MSDN as differing only in option code, various resources online claim that the default route should be omitted from that option. Fun.)
In my code, I do the opposite: I leave the route from DHCP option 3 (RFC 2132) in place, and skip any default route found in DHCP option 121. This avoids either a window in which a freshly turned up interface is turned back down, or a bunch of other complexity to handle replacing the route. In practice, if your default route differs between option 3 and option 121, then you’re crazy and on your own.
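To make the skipping concrete: udhcpc hands the routes over in $staticroutes as a flat, space-separated list of "destination/prefix gateway" pairs, and the job is to drop the pair whose destination is 0.0.0.0/0. The sketch below does that in plain POSIX shell; it is an illustration with a made-up function name (strip_default_route), not the code from my scripts, which use sed instead.

```shell
# Sketch: remove the default route (and its gateway) from a list of
# "dest/len gateway" pairs such as udhcpc exports in $staticroutes.
# Argument: the whole list as one string. Prints the filtered list.
strip_default_route() {
    set -- $1            # deliberately unquoted: split the list into words
    out=''
    while [ $# -ge 2 ]; do
        if [ "$1" != '0.0.0.0/0' ]; then
            out="${out:+$out }$1 $2"
        fi
        shift 2          # move on to the next dest/gateway pair
    done
    printf '%s\n' "$out"
}

# Example (addresses are made up):
#   strip_default_route '0.0.0.0/0 192.168.1.1 10.0.0.0/8 192.168.1.2'
# prints:
#   10.0.0.0/8 192.168.1.2
```

Walking the list pairwise means the position of the default route in the list does not matter.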
cd /etc/udhcpc
for D in post-bound post-renew pre-deconfig; do
  # the hook directories do not exist yet in a freshly created /etc/udhcpc
  mkdir -p $D
  touch $D/cidr-routes
  chmod 755 $D/cidr-routes
  vi $D/cidr-routes
done
/etc/udhcpc/post-bound/cidr-routes
#!/bin/sh
# Install the classless static routes (DHCP option 121) from a new lease.
statedir=/var/lib/udhcpc
togglefile="$statedir/cidr-routes"
cachefile="$statedir/latest.staticroutes"
# When sourced by the renew/deconfig hooks, only the settings above are wanted.
[ -n "${RETURN_AFTER_SETTINGS:-}" ] && return 0
[ -n "${staticroutes:-}" ] || exit 0
[ -d "$statedir" ] || mkdir "$statedir"
printf > "$cachefile" '%s\n' "$staticroutes"
# Drop any default route and its gateway; DHCP option 3 already supplied one.
# the sed relies upon a GNU extension, which is available (addr,+N)
# The padding spaces let the pattern match a default route listed first or last.
case " $staticroutes " in
*\ 0.0.0.0/0\ *) use_routes="$(echo $staticroutes | xargs -n 1 | sed '/^0\.0\.0\.0\/0$/,+1d' | xargs)" ;;
*) use_routes="$staticroutes" ;;
esac
# Generate a small script which can add or remove exactly these routes.
{
  echo '#!/bin/sh'
  echo 'enable() {'
  printf ' ip route add %s via %s\n' $use_routes
  echo '}'
  echo 'disable() {'
  printf ' ip route del %s via %s\n' $use_routes
  echo '}'
  cat <<'EOTAIL'
case "${1:-missing}" in
  missing) ;;
  enable) enable ;;
  disable) disable ;;
  *) echo >&2 'bad arg'; exit 1 ;;
esac
EOTAIL
} > "$togglefile"
chmod 0755 "$togglefile"
exec "$togglefile" enable
/etc/udhcpc/post-renew/cidr-routes
#!/bin/sh
[ -n "${staticroutes:-}" ] || exit 0
RETURN_AFTER_SETTINGS=t
. /etc/udhcpc/post-bound/cidr-routes
unset RETURN_AFTER_SETTINGS
[ -f "$cachefile" ] || exec /etc/udhcpc/post-bound/cidr-routes
previous="$(cat "$cachefile")"
if [ "$staticroutes" = "$previous" ]; then
  # don't force back on; if disabled, leave disabled
  exit 0
fi
"$togglefile" disable
exec /etc/udhcpc/post-bound/cidr-routes
/etc/udhcpc/pre-deconfig/cidr-routes
#!/bin/sh
RETURN_AFTER_SETTINGS=t
. /etc/udhcpc/post-bound/cidr-routes
unset RETURN_AFTER_SETTINGS
[ -x "$togglefile" ] || exit 0
exec "$togglefile" disable
Security Setup
Time for CA certificate configuration, SSH, and so forth. I don’t enable DSA, or RSA, and don’t want to have those sitting around on disk; I also don’t like components auto-creating SSH hostkeys if they think they’re missing: if it gets used, it encourages sloppiness around hostkey verification. If the files disappear, it’s time to use console and restore from backups, or create new ones and distribute the new public keys via secure mechanisms.
PKIX, SSL/TLS
The directory /usr/local/share/ca-certificates exists, is empty, and any .crt files will be picked up by update-ca-certificates and trusted immediately. Note though that the symlink created in /etc/ssl/certs/ has ca-cert- prepended to the filename (of the symlink itself), which was an unexpected difference and caused a few moments of head-scratching as I wondered why my certs hadn’t been picked up.
Before making changes, I used curl to verify that certs issued by public CAs were working, and that those issued by my CA were resulting in failure. I pasted in the certificate of my household Certificate Authority, then ran update-ca-certificates. I then re-ran the curl tests and All Was Good.
OpenSSH
I tend to stick to OpenSSH at this time. Note though that OpenSSH 7.5, as included in Alpine 3.7, has a bug where use of -o VerifyHostKeyDNS=ask (or anything other than =no) will cause a segfault for any host missing SSHFP records in DNS. As far as I can tell, this is entirely an upstream bug and the fix is to upgrade to OpenSSH 7.6. I just left it with this known bug in place.
I also tend to place all authorized keys files into /etc/ssh/userauth/USERNAME instead of ~/.ssh/authorized_keys, as this is more amenable both to central distribution and to using systems such as etckeeper to detect and record changes in how authentication can happen.
apk add openssh
printf '\n\nSSHD_DISABLE_KEYGEN=yes\n' >> /etc/conf.d/sshd
for want in ed25519 ecdsa; do
  ssh-keygen -t $want -f /etc/ssh/ssh_host_${want}_key -N '' -C washington.lan ;
done
tail -n +0 /etc/ssh/ssh_host_*_key.pub
KEYDIR=/etc/ssh/userauth; mkdir $KEYDIR; vi $KEYDIR/root $KEYDIR/troll
vi /etc/ssh/sshd_config
Changes to sshd_config:
# uncomment *only* ECDSA and Ed25519 keys
# Double-check that "PermitRootLogin prohibit-password" is the default, else
# set it.
LogLevel VERBOSE
AuthorizedKeysFile /etc/ssh/userauth/%u
PasswordAuthentication no
ChallengeResponseAuthentication no
AcceptEnv WINDOW TMUX_PANE TZ ITERM_PROFILE COLORFGBG
UseDNS no
AllowUsers *@127.0.0.1 *@::1
AllowUsers root@192.0.2.7 root@198.51.100.49
AllowUsers troll@192.0.2.7 troll@198.51.100.49
Then basic user setup. I couldn’t log in as the troll user at first; after running /usr/sbin/sshd -ddd -p 24 and connecting to that, I saw the permission denied errors trying to access the file; a little more investigation led to this gem:
# ls -ld /
drwx------ 1 root root 108 Mar 18 09:46 /
I have no idea what led to that … “most peculiar setting”.
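When a file is unreadable for no obvious reason, the culprit can be any directory on the way down to it, as it was here. A quick way to check is to list the permissions of every path component. The helper below is a sketch with a made-up name (ancestors); it only accepts absolute paths.

```shell
# Sketch: print an absolute path and all of its parent directories,
# one per line, starting from "/".
ancestors() {
    case $1 in
        /*) ;;
        *) echo >&2 'ancestors: need an absolute path'; return 1 ;;
    esac
    p=$1
    list=$p
    while [ "$p" != / ]; do
        p=$(dirname "$p")
        # prepend, so the result reads from the root downwards
        list="$p
$list"
    done
    printf '%s\n' "$list"
}
```

Something like ancestors /etc/ssh/userauth/troll | xargs ls -ld then shows the mode of each component in one go (this breaks on paths containing whitespace, which is fine for a quick check). A directory without the execute bit for the user in question, such as the 0700 root-owned / above, stands out immediately.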
rc-update add sshd
rc-status
service sshd start
addgroup -g 1000 human
adduser -h /home/troll -g 'Grumpy Troll' -s /bin/zsh -u 3000 -G human troll
# hit enter a couple of times for password
vi /etc/shadow
# on the line for `troll`, replace the second field with just a single '*'
chmod 0755 /
Fin
- To get an openssl binary, use apk add libressl
- The tput command is in ncurses, but I reworked the personal config files git repo (not covered above) to check for existence of the tput command and fall back to an assumption of ANSI escape sequences for color, which seemed eminently sane. If a TTY doesn’t support the ANSI sequences, then the system should have a curses system and working tput installed. If every TTY does, let’s default to sane strings instead of re-deriving them on startup.
- https://turris.lan/cgi-bin/luci/admin/network/firewall/forwards to enable a port-forward from the WAN on a non-standard port, to the IP of the container on port 22
  - Edit /etc/ssh/sshd_config inside the container to change AllowUsers accordingly.
That’s about it. The complexity was all in the debugging to find the key pieces of information needed.
-The Grumpy Troll