Update 3: After four hours of debugging, I'm now sure this has nothing to do with Upstart.

Oh hell, I wonder why I'm always running into strange situations regarding Ubuntu Server, networking and Upstart (I hope it's Upstart ;))

Ok, here we go with the setup:

Imagine you have a server with several ethernet interfaces.

eth0, eth1, eth2, eth3

Now imagine further that eth0 and eth1 are bonded as a port channel with LACP (bond-mode 4), and eth2 and eth3 likewise. Forget the xmit_hash_policy for now (it will be layer3+4, but that is not important right now).

Having Lucid and Upstart in place, the config looks like this:


auto bond0
iface bond0 inet static
   address 192.168.1.10
   netmask 255.255.255.0
   bond-slaves none
   bond-mode 4
   bond-miimon 100

auto bond1
iface bond1 inet static
   address 192.168.1.11
   netmask 255.255.255.0
   bond-slaves none
   bond-mode 4
   bond-miimon 100

auto eth0
iface eth0 inet manual
    bond-master bond0
    bond-primary eth0 eth1

auto eth1
iface eth1 inet manual
    bond-master bond0
    bond-primary eth0 eth1

auto eth2
iface eth2 inet manual
    bond-master bond1
    bond-primary eth2 eth3

auto eth3
iface eth3 inet manual
   bond-master bond1
   bond-primary eth2 eth3

The machine comes up, and I can ping the default interfaces just fine.
So this setup is correct: the access VLANs on the Cisco switch are set correctly, and the EtherChannel config on the switch is correct as well. There we go.

Now I want an active-passive bond (bond-mode 1) on top of the two port-channel bonds. So I changed the config like this:


auto bond0
iface bond0 inet static
   address 0.0.0.1
   netmask 255.255.255.255
   bond-slaves none
   bond-mode 4
   bond-miimon 100

auto bond1
iface bond1 inet static
   address 0.0.0.2
   netmask 255.255.255.255
   bond-slaves none
   bond-mode 4
   bond-miimon 100

auto bond2
iface bond2 inet static
   address 192.168.1.10
   netmask 255.255.255.0
   bond-slaves bond0 bond1
   bond-mode 1
   bond-miimon 100

auto eth0
iface eth0 inet manual
    bond-master bond0
    bond-primary eth0 eth1

auto eth1
iface eth1 inet manual
    bond-master bond0
    bond-primary eth0 eth1

auto eth2
iface eth2 inet manual
    bond-master bond1
    bond-primary eth2 eth3

auto eth3
iface eth3 inet manual
   bond-master bond1
   bond-primary eth2 eth3


On Ubuntu Jaunty, this setup worked out of the box (except that the manual eth* stanzas were not needed there: I had bond-slaves configured directly on bond0 and bond1, but on Lucid it has to be done this way).
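To double-check which interface the file enslaves to which bond, a small parser can help. This is only a sketch in Python 3: the function name `bond_topology` is made up for illustration, and it understands only `iface`, `bond-master` and `bond-slaves` lines, not the full interfaces(5) syntax.

```python
from collections import defaultdict


def bond_topology(text):
    """Map each bond master to the sorted list of its slaves.

    Hypothetical helper: handles only 'iface', 'bond-master' and
    'bond-slaves' lines, not the full interfaces(5) syntax.
    """
    slaves = defaultdict(set)
    current = None  # name of the iface stanza we are currently inside
    for raw in text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if words[0] == "iface" and len(words) > 1:
            current = words[1]
        elif current and words[0] == "bond-master" and len(words) > 1:
            # the slave names its master
            slaves[words[1]].add(current)
        elif current and words[0] == "bond-slaves":
            # the master names its slaves; 'none' lists no slaves here
            slaves[current].update(w for w in words[1:] if w != "none")
    return {master: sorted(members) for master, members in slaves.items()}
```

Fed with the second config above, this should report eth0/eth1 under bond0, eth2/eth3 under bond1, and bond0/bond1 under bond2, which is the layout I am trying to get.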

OK, I reboot the machine; it comes up, but no ping is possible, even though all interfaces are up and running.
Even bond2 has bond0 and bond1 correctly enslaved.
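For reference, the state described here can be read from sysfs. The sketch below (Python 3) collects the bonding attributes of one interface; the function name `bond_state` and the `sysfs_root` parameter are my own inventions so it can be pointed at a test directory, while the attribute files themselves (`bonding/mode`, `bonding/slaves`, `bonding/active_slave`, `operstate`) are the ones the bonding driver exposes under /sys/class/net.

```python
import os


def bond_state(name, sysfs_root="/sys/class/net"):
    """Return mode, slaves, active slave and operstate of one bond."""
    base = os.path.join(sysfs_root, name)

    def read(relpath):
        try:
            with open(os.path.join(base, relpath)) as fh:
                return fh.read().strip()
        except OSError:
            return None  # attribute missing, e.g. interface does not exist

    slaves = read("bonding/slaves")
    return {
        "mode": read("bonding/mode"),
        "slaves": slaves.split() if slaves else [],
        "active_slave": read("bonding/active_slave"),
        "operstate": read("operstate"),
    }
```

Calling `bond_state("bond2")` on the machine shows whether bond2 really lists bond0 and bond1 as slaves and which one it considers active. The same information is also available in human-readable form in /proc/net/bonding/bond2.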

So, now I'm stuck. I think it has something to do with how the NICs and bonds are brought up.

The way it should be:

  1. Upstart starts /etc/init/networking.conf on "local-filesystems and stopped udevtrigger".
    This brings up the bond interfaces bond0, bond1 and bond2.
  2. Upstart then brings up the hardware interfaces eth0, eth1, eth2 and eth3 and enslaves them to bond0 and bond1.
But what about bond2?
bond2 comes up with bond0 and bond1 as slaves, but bond0 and bond1 don't have their own slaves attached yet. So bond2 doesn't know anything about the underlying hardware interfaces.
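To make the ordering I have in mind explicit: every slave should be ready before its master. A minimal sketch in Python 3 of that ordering follows. The function name `bring_up_order` and the dictionary format are hypothetical, and this only illustrates the dependency order, not how Upstart or ifupdown actually schedule things.

```python
def bring_up_order(topology):
    """Return interfaces ordered so that every slave precedes its master.

    `topology` maps a bond to the list of its slaves (hypothetical format).
    """
    order = []
    visiting = set()  # interfaces on the current path, for loop detection

    def visit(iface):
        if iface in order:
            return
        if iface in visiting:
            raise ValueError("bonding loop at " + iface)
        visiting.add(iface)
        for slave in sorted(topology.get(iface, [])):
            visit(slave)  # slaves first
        visiting.discard(iface)
        order.append(iface)  # then the master itself

    for iface in sorted(topology):
        visit(iface)
    return order


# Layout from this question: two LACP bonds under one active-backup bond.
TOPOLOGY = {
    "bond0": ["eth0", "eth1"],
    "bond1": ["eth2", "eth3"],
    "bond2": ["bond0", "bond1"],
}
print(bring_up_order(TOPOLOGY))
# prints ['eth0', 'eth1', 'bond0', 'eth2', 'eth3', 'bond1', 'bond2']
```

So bond2 would be the very last interface to be configured, after both LACP bonds already have their hardware NICs attached.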

How can I tell Upstart to wait for the hardware interfaces before the virtual interfaces are started?
In other words, I need to defer the execution of /etc/init/networking.conf until the hardware interfaces are up and running.

If this worked, I could even get rid of the otherwise unneeded manual eth0/eth1/eth2/eth3 stanzas for the hardware NICs and go back to a saner /etc/network/interfaces configuration.

Help is appreciated.

UPDATE: I uploaded an image of the setup which worked out of the box on Ubuntu Jaunty. So you can imagine what I'm trying to achieve.


UPDATE 2: Found another guy on the Novell forum who had the same problem (http://forums.novell.com/novell-product-support-forums/suse-linux-enterprise-server-sles/sles-networking/398736-bond-bonds-bonding-2-aggregate-bonds-active-backup.html), but in 2009 this setup worked for me (on Ubuntu Jaunty).