Linux Persistent Device Naming

'Persistent device naming' means that a Linux-based operating system names its devices (not just network devices but also block storage devices, video devices, etc.) consistently from boot to boot. This is necessary in order to configure and use them reliably.

Over time the Linux kernel changes the names it uses for devices as well as the order in which they are enumerated. Enumeration order plays an important part in the device name because device drivers typically register a device with a template (e.g. eth%d), and %d is replaced with the next unused number for that template. In other words, which network device is 'eth0' vs 'eth1' depends on which one was registered with the kernel first. To further complicate things, registration order can change more often than you might think, depending on whether device drivers are static or modules. Because kernel init has become more and more parallel, registration order can in some cases change from boot to boot even with the exact same kernel and configuration. The Linux kernel does not specify a device naming policy (and it never will), so it is not the kernel's job to name devices consistently. The only thing that is consistent is the device path, which defines the device in terms of the hardware's device-tree.
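Since the device path is the stable identifier, it helps to be able to look it up for a given interface. The following is a minimal sketch, assuming the usual sysfs layout in which /sys/class/net/<name> is a symlink into /sys/devices/<path>/net/<name>. The function name netdev_path and its optional second argument (an alternate sysfs root, useful for trying it against a copy of the tree) are illustrative and not part of any Gateworks tooling.

```shell
#!/bin/bash
# Print the hardware device path behind a network interface,
# relative to /sys/devices.
# $1: interface name (e.g. eth0)
# $2: sysfs root, defaults to /sys (hypothetical parameter for illustration)
netdev_path() {
        local ifname=$1 sysfs=${2:-/sys} root real
        [ -e "$sysfs/class/net/$ifname" ] || return 1
        root=$(readlink -f "$sysfs") || return 1
        # resolve the class symlink to the real device directory
        real=$(readlink -f "$sysfs/class/net/$ifname") || return 1
        # strip the leading <root>/devices/ and the trailing /net/<name>
        real=${real#"$root"/devices/}
        echo "${real%/net/*}"
}
```

Calling netdev_path eth0 on a running board prints the path that stays the same no matter which name the kernel happened to assign.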

For many years Gateworks has striven to number its Ethernet devices from left to right when viewing the front panel, so that the leftmost RJ45 is 'eth0', the next is 'eth1', etc. Note that boards with network switches have switch port names that are defined by the device-tree (typically 'lan1', 'lan2', etc.), but the embedded network controllers on a board are typically named 'eth%d' and cannot be named via device-tree (multiple attempts over the years to add this to Linux have been rejected on the grounds that the kernel does not dictate naming policy).

This goal has not always been simple, largely because it is not always feasible to control the device registration order:

  • PCI based network controllers of the same type are registered in the order of their PCI bus/device address, which may not match the placement order of the RJ45 jacks (the GW5520 single board computer had 2 PCI based network interfaces and the leftmost one had a larger bus/device than the rightmost one, making the rightmost one register first and become 'eth0')
  • Drivers that support multiple device types may have a hard-coded enumeration order (the CN803x SoC on Newport has a single driver that supports both the RGMII and SGMII interfaces, but despite the Newport boards having the RGMII RJ45 jack first, the SGMII devices are always registered first based on the driver design, so the leftmost RGMII interface gets the last 'eth' device)
  • Drivers that are both built statically into the kernel may still race for device registration order, and the outcome can change from kernel to kernel due to software changes (the GW74xx IMX8MP has two on-board network interfaces handled by two different drivers; these compete for registration order and can flip between boots on modern kernels)

Persistent device naming is not achievable using the kernel alone; it needs help from userspace. Because the kernel does not dictate device naming policy, this task falls to userspace, and it has been that way all along. The kernel issues hotplug events when devices are added, and userspace can act on these to rename a network device from the name the kernel gave it to the name it wants. For some time now this has been handled by the popular 'systemd' software suite, which provides an array of system components for Linux-based operating systems, including device naming and permissions.

Standard Linux distros have used systemd to name network devices according to their type, bus, and port to try to solve the 'persistent device naming' issue. This sounds like a perfect solution but has some serious drawbacks:

  • it does not cover embedded network devices such as those provided by an SoC
  • the device names are not always intuitive

This systemd persistent network device naming feature can be disabled with the kernel argument (bootargs) 'net.ifnames=0', which Gateworks has had in its default kernel bootargs for quite some time because we wanted to simplify network device names by making them all 'eth'. For example, if systemd's persistent network naming were left enabled on a GW73xx-0x board, which has 1 on-board GbE (FEC) NIC and 1 PCIe GbE NIC, you would see them named eth0 (FEC) and enp192s0 (en=ethernet, p192=PCI bus 192 (0xc0), s0=slot 0). Note also that the PCI bus number (192 here) will change if you add a PCIe switch via an add-in card on the board.
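To make the 'enp192s0' breakdown concrete, here is a small sketch that decodes a name of exactly that form and checks a kernel command line for 'net.ifnames=0'. Both function names are made up for illustration, and only the simple en/p<bus>/s<slot> form described above is handled; other systemd name forms are deliberately not parsed.

```shell
#!/bin/bash
# Decode a systemd-style PCI Ethernet name of the form enp<bus>s<slot>.
# (hypothetical helper; handles only this one form)
decode_pci_ifname() {
        local re='^enp([0-9]+)s([0-9]+)$'
        if [[ $1 =~ $re ]]; then
                # bus is shown in decimal in the name; print hex alongside
                printf 'ethernet pci bus=%d (0x%02x) slot=%d\n' \
                        "${BASH_REMATCH[1]}" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
        else
                return 1
        fi
}

# Return success if net.ifnames=0 is present on the kernel command line.
# $1: command line string, defaults to the contents of /proc/cmdline
ifnames_disabled() {
        local cmdline=${1:-$(cat /proc/cmdline)}
        case " $cmdline " in
                *" net.ifnames=0 "*) return 0 ;;
        esac
        return 1
}
```

For the name from the example, decode_pci_ifname enp192s0 prints 'ethernet pci bus=192 (0xc0) slot=0'.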

The systemd suite also includes udev, which gives you the very powerful ability to match hotplug events based on kernel device names, device paths, subsystem names, and event types. Via a configuration file in /etc/udev/rules.d, a matching event can trigger permission changes, device name changes, or anything else you can do in a script.

You can use systemd 'udev rules' to rename network interfaces, but there are limitations such as namespace collisions: you can't 'flip' eth0 and eth1 because the device names you intend to change to already exist. As an example, udev rules can resolve the inconsistent naming of eth0 and eth1 on a GW74xx (caused by the drivers racing for device registration) by naming the devices based on device path:

# cat /etc/udev/rules.d/70-persistent-net.rules
SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc@0/30800000.bus/30bf0000.ethernet/net/eth*", NAME="en0"
SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc@0/30800000.bus/30be0000.ethernet/net/eth*", NAME="en1"

The downside here is that you need to choose name prefixes that are not used by kernel drivers (such as 'eth'), as otherwise you get name collisions. Attempts have been made to use udev rules to rename devices temporarily to unique names and then back to 'eth0', 'eth1', etc., but race conditions in device registration can result in inconsistent behavior.
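Rules like the two above follow a fixed pattern, so they can be generated from a device path and a target name. The sketch below assumes the same rule layout as the GW74xx example; the function name mk_net_rule is hypothetical. It refuses 'eth<N>' targets to reflect the collision problem just described.

```shell
#!/bin/bash
# Print a udev rule that names the interface found under a device path.
# $1: device path relative to /sys/devices (as used in the rules above)
# $2: desired interface name; must not be a kernel-style ethN name
mk_net_rule() {
        local path=$1 name=$2
        case "$name" in
                eth[0-9]*)
                        echo "mk_net_rule: '$name' may collide with kernel names" >&2
                        return 1 ;;
        esac
        printf 'SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/%s/net/eth*", NAME="%s"\n' \
                "$path" "$name"
}
```

The output can be appended to a file such as /etc/udev/rules.d/70-persistent-net.rules. On a running system, 'udevadm info -q path -p /sys/class/net/eth0' is one way to find the DEVPATH to start from.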

A different approach to persistent device naming with systemd is to do it manually after all kernel modules have been loaded (and thus after all devices are registered) but before the devices are used. For example, we can rename network devices between module loading and network device configuration.

Consider the following script (which is in our Ubuntu Venice rootfs):

root@jammy-venice:~# cat /usr/local/sbin/netdevname 
#!/bin/bash

MODEL=$(cat /proc/device-tree/board | tr '\0' '\n')
case "$MODEL" in
        GW740*)
                echo "$0: Adjusting network names for $MODEL"
                eth0=platform/soc@0/30800000.bus/30bf0000.ethernet
                eth1=platform/soc@0/30800000.bus/30be0000.ethernet
                devs="eth0 eth1"
                ;;
esac

[ "$devs" ] || exit 0

# renumber eth devs to above max eth dev
max_eth=$(grep -o '^ *eth[0-9]*:' /proc/net/dev | tr -dc '[0-9]\n' | sort -n | tail -1)
for i in $(seq 0 $max_eth); do
        ip link set "eth$i" down
        ip link set "eth$i" name "eth$((++max_eth))"
done

# renumber eth devs based on the path we defined above
for i in $devs; do
        eval path='$'$i
        devname="$(ls /sys/devices/$path/net | head -1)"
        echo "$0: $i:$path"
        ip link set "$devname" down
        ip link set "$devname" name $i
done

This script determines the board model via device-tree and, for boards that need a little help with device naming (such as the GW74xx with its racy driver registration), defines the list of network devices to name and the device path for each. It then finds the highest-numbered 'eth*' network device in /proc/net/dev and renames every 'eth*' device to a number above that as a temporary measure to avoid namespace collisions. Lastly it iterates over the listed devices, looks each one up by its device path, and gives it the desired name.
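The two-pass idea can be tried without touching any real interfaces. The sketch below is a dry run that only prints the 'ip link' commands. It is a simplification of the script above: instead of looking devices up by sysfs path, it takes hypothetical 'current=desired' pairs as input, and the function name plan_renames is made up for illustration. It needs bash 4 or later for associative arrays.

```shell
#!/bin/bash
# Dry run of the two-pass rename: print commands instead of running them.
# $1: space-separated current names (e.g. "eth0 eth1")
# $2: space-separated "current=desired" pairs
plan_renames() {
        local current=$1 wanted=$2 n max=-1 name pair from tmp
        declare -A moved
        # find the highest ethN index currently in use
        for name in $current; do
                n=${name#eth}
                [ "$n" -gt "$max" ] && max=$n
        done
        # pass 1: park every device on a temporary name above that index
        for name in $current; do
                tmp="eth$((++max))"
                moved[$name]=$tmp
                echo "ip link set $name name $tmp"
        done
        # pass 2: move each parked device to its final name
        for pair in $wanted; do
                from=${pair%%=*}
                echo "ip link set ${moved[$from]} name ${pair#*=}"
        done
}
```

Running plan_renames "eth0 eth1" "eth0=eth1 eth1=eth0" shows how a direct swap, which would collide, becomes four collision-free renames through eth2 and eth3.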

In the case of our Venice Ubuntu image, this script is run by a systemd service that must run before networking (Before=network-pre.target):

root@jammy-venice:~# cat /lib/systemd/system/netdev-naming.service 
[Unit]
Description=Network Device Naming

Before=network-pre.target
Wants=network-pre.target

DefaultDependencies=no
Requires=local-fs.target
After=local-fs.target

[Service]
Type=oneshot

ExecStart=/usr/local/sbin/netdevname

RemainAfterExit=yes

[Install]
WantedBy=network.target
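Because the whole approach depends on the ordering constraint, a quick sanity check of the unit file can be useful when adapting it. This is a minimal sketch that only scans the file for a Before= line naming the given unit; it does not ask systemd what it actually resolved, and the function name unit_ordered_before is hypothetical.

```shell
#!/bin/bash
# Return success if a unit file lists the given unit in a Before= line.
# $1: path to the unit file
# $2: unit name to look for (e.g. network-pre.target)
unit_ordered_before() {
        local file=$1 target=$2 line u
        while IFS= read -r line; do
                case "$line" in
                        Before=*)
                                # a Before= line may list several units
                                for u in ${line#Before=}; do
                                        [ "$u" = "$target" ] && return 0
                                done ;;
                esac
        done < "$file"
        return 1
}
```

For the unit above, unit_ordered_before /lib/systemd/system/netdev-naming.service network-pre.target should succeed.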

You can use various tools to analyze systemd services to see the order they run in:

# create an 'extensive' dump
systemd-analyze dump > systemd.dump
# create a graphical representation
systemd-analyze plot > systemd.svg

Once the netdev-naming.service unit is installed and enabled, its status can be checked after boot:

root@jammy-venice:~# systemctl status netdev-naming.service
● netdev-naming.service - Network Device Naming
     Loaded: loaded (/lib/systemd/system/netdev-naming.service; enabled; vendor preset: enabled)
     Active: active (exited) since Thu 2024-04-04 20:28:48 UTC; 5min ago
   Main PID: 188 (code=exited, status=0/SUCCESS)
        CPU: 145ms

Apr 04 20:28:48 jammy-venice systemd[1]: Starting Network Device Naming...
Apr 04 20:28:48 jammy-venice netdevname[188]: /usr/local/sbin/netdevname: Adjusting network names for GW7401-B
Apr 04 20:28:48 jammy-venice netdevname[188]: /usr/local/sbin/netdevname: eth0:platform/soc@0/30800000.bus/30bf0000.ethernet
Apr 04 20:28:48 jammy-venice netdevname[188]: /usr/local/sbin/netdevname: eth1:platform/soc@0/30800000.bus/30be0000.ethernet
Apr 04 20:28:48 jammy-venice systemd[1]: Finished Network Device Naming.
Last modified on 04/04/2024 08:37:25 PM