Linux Bonding, LLDP, and MAC Flapping

John W Kerns

Sometimes a painfully troublesome networking problem can have a complicated and brain-twisting root cause, one which you dread having to explain to peers and managers. However, sometimes the root cause is dead simple and makes you feel silly for how long it took you to find it.

Today, I had one of the latter and thought I would share it, since my Google-fu failed to turn up a clue and I ended up taking the scenic route to the answer.

In order to understand the problem, we need to review a few networking basics.

Approx Reading Time: 10 Minutes


Linux Bonding 411

NIC bonding in Linux works much like link aggregation does in classic infrastructure networking. Cisco uses “Etherchannels” or “Port-Channels”, some vendors use the term “LAG”, but in the end they accomplish the same goals.

The Similarities

  • Two or more physical interfaces are “bonded” into a single logical interface
  • The logical interface is presented to the operating system
  • All references in the OS for MAC addresses and IP addresses are pointed towards the logical interface
  • The physical interfaces are abstracted away from the rest of the OS, and the bonding driver decides how to distribute traffic across the different physical links
  • Sometimes negotiation protocols like LACP are used on each end of the aggregated link to ensure consistency

The Differences

While an operating system like Linux can do all the standard link aggregation functions a typical network switch can do, it can also do a few other things:

  • Active/Backup Bonding: Only one of the links in the bond is used at a time. All others are in standby and will transmit no traffic unless the active link fails
  • Adaptive Load Balancing (ALB and TLB): Traffic is balanced across the links in the bond, but the MAC addresses of transmitted frames are kept unique to each physical link
    • Adaptive Transmit Load Balancing (TLB): ARP responses for the IP of the bond interface reference the MAC used on the active link only, so inbound traffic always hits only the active link
    • Adaptive Load Balancing (ALB): ARP requests to the bond interface are intercepted by the bonding driver, which answers differently each time by referencing the MAC addresses of the different physical interfaces. This allows different hosts in the broadcast domain to use different NICs in the bond, depending on what was in the ARP response they processed
  • Broadcast: All transmitted frames are sent out of all physical interfaces in the bond. This can be useful for strange network topologies where you are connected to multiple hosts on the different physical interfaces in the bond, and want to treat them as if they are in the same broadcast domain, but without bridging frames between the physical interfaces.
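To check which mode a bond is running in and which link is currently active, the Linux bonding driver exposes a status file per bond under /proc/net/bonding/. The sketch below parses that text in Python. The sample content, the interface names (eth0, eth1), and the MAC addresses are made up for illustration, and the exact set of fields can vary between kernel versions, so treat it as a starting point rather than a complete parser.

```python
# Illustrative sample of /proc/net/bonding/<bond> for an Active/Backup bond.
# Interface names and MAC addresses are hypothetical.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up
Permanent HW addr: 00:11:22:33:44:55

Slave Interface: eth1
MII Status: up
Permanent HW addr: 00:11:22:33:44:66
"""


def parse_bond_status(text):
    """Extract the bonding mode, the active slave, and each slave's permanent MAC."""
    info = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # name of the slave section currently being read
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses stay intact in the value.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Currently Active Slave":
            info["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            info["slaves"][current] = {}
        elif key == "Permanent HW addr" and current is not None:
            info["slaves"][current]["permanent_mac"] = value
    return info


if __name__ == "__main__":
    # On a real host you would read the file instead, for example:
    #   text = open("/proc/net/bonding/bond0").read()   # "bond0" is an assumed name
    status = parse_bond_status(SAMPLE)
    print(status["mode"], "active:", status["active_slave"])
```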

It’s important to point out that the Active/Backup and Adaptive Load Balancing (TLB/ALB) bonding modes are specifically designed to allow the upstream switch to be completely unaware of the load-balancing activity. The switch can treat its side of the bonded links as if they are individual and unrelated, simplifying your network switch configs. Or, at least that is how it is supposed to work until you break things like I did…

 

LLDP on Linux

LLDP (and CDP if you’re Cisco-friendly) has been built into networking hardware for many years now. Being able to see the details of neighbors connected to your switch is invaluable and helps immensely when trying to discover your network topology. In recent years it has become common for end-host operating systems, especially hypervisors, to start supporting LLDP as well.

Many Linux distros have LLDP daemon packages in their default package repositories, and I have started installing LLDP as a standard whenever I install Linux on hardware. If using Ubuntu, it’s as easy as apt install lldpd. Once the service is installed and started, its default configuration sends LLDP frames out of all connected links every 30 seconds. You can see your LLDP neighbors using the lldpctl command.

 

The MAC Flapping Problem

Ethernet switches are pretty ingenious devices. They figure out how to forward frames to different destination MAC addresses by examining the source MAC addresses of the frames received. The assumption is that an interface on the switch will be connected to a NIC which puts frames on the wire sourced from its unique MAC address and will therefore accept frames destined to that unique MAC address sent back over that same link. It’s a simple and brilliant way to automatically discover and build a forwarding table for an Ethernet segment.

However, in modern networks a given source MAC address can move from place to place, whether through a wireless client roaming between access points or a misconfigured Link Aggregation Group. When that happens, switches can have a hard time figuring out how to forward a frame destined for the moving MAC address. This issue is typically called MAC flapping, and it can stem from a variety of scenarios. Sometimes it is unavoidable (like wireless clients roaming between access points), and sometimes it is the result of negligence (like misconfigured LAGs).
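The learning behavior and the flap can be sketched in a few lines of Python. This is a toy model, not how any particular switch is implemented: the class name, port names, and MAC address are all hypothetical, and real switches also age entries out and handle VLANs.

```python
class LearningSwitch:
    """Toy model of source-MAC learning on an Ethernet switch."""

    def __init__(self):
        self.table = {}   # MAC address -> port it was last seen on
        self.moves = []   # recorded (mac, old_port, new_port) flap events

    def receive(self, port, src_mac):
        """Learn (or re-learn) which port a source MAC lives behind."""
        old_port = self.table.get(src_mac)
        if old_port is not None and old_port != port:
            # The same MAC just showed up on a different port: a MAC move.
            self.moves.append((src_mac, old_port, port))
        self.table[src_mac] = port

    def egress_port(self, dst_mac):
        """Port used to forward a frame to dst_mac (None means flood)."""
        return self.table.get(dst_mac)


if __name__ == "__main__":
    mac = "02:00:00:00:00:aa"        # hypothetical host MAC
    sw = LearningSwitch()
    sw.receive("port1", mac)         # normal traffic on the expected port
    sw.receive("port2", mac)         # one stray frame on another port
    print(sw.egress_port(mac))       # traffic for the host now follows the stray frame
    print(sw.moves)
```

The point of the sketch is that a single frame is enough: the table always trusts the most recent source MAC it saw, so one stray frame redirects all traffic for that address.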

 

The Scenic Route

The Symptoms

I was troubleshooting a Linux server which had a couple of different bonds configured. One (used for replication) was using LACP mode and normal LAG-type load balancing, with a corresponding aggregation configuration on the upstream switch. The other bond interface was using Active/Backup mode and was for server OS management; the upstream switch had no special configuration for the bonding, nor should it (remember, this mode is meant to keep any special config off of the upstream switch).

The symptom I was experiencing was intermittent connectivity to the IP of the management bond interface. It would work fine for a few minutes and then act like it dropped off the network for a few minutes, and then all of a sudden it was back online. I checked the logs on the Linux host and the upstream switch and saw no indications of network issues or link failures.

During some troubleshooting I noticed that if I shut the secondary link down on the upstream switch, the server would stop losing connectivity intermittently and would stabilize. I also noticed that when the server went offline (while both links were up), the MAC address on the upstream switch moved from the primary port to the backup port, which is indicative of a MAC flap problem.

The Investigation

This should not happen with a Linux bond interface. In an Active/Backup bond, the backup ports should never send Ethernet frames sourced from the MAC address of the bond interface, as this would cause the upstream switch to start forwarding frames destined for that bond MAC address down the backup link. However, here I was looking at hard evidence of this happening. At this point it was time for a packet capture.

I started a constant ping to the Linux server and then ran a tcpdump on it of all traffic moving over the backup link. I ran the dump in byobu so it would keep running even after I was disconnected from my SSH session. After the server went through a few cycles of going offline and coming back on, I stopped the dump, downloaded the capture file, and opened it in Wireshark.

At first I saw quite a bit of normal background traffic from the network (e.g., BPDUs and assorted broadcasts). I didn’t see any traffic sourced from the Linux host or destined for it. Scrolling down through the buffer, it suddenly started showing all my ping requests coming in over that link, and right before those ping requests: an LLDP advertisement from the server. And the source MAC address of that LLDP advertisement: the MAC of my bond interface.
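The same check can be done programmatically on raw frames. LLDP frames carry EtherType 0x88cc and are normally addressed to the multicast MAC 01:80:c2:00:00:0e, so the Ethernet header is enough to spot them and read their source address. The Python sketch below assumes untagged frames (no 802.1Q header); the function names, the hand-built sample frame, and the bond MAC are hypothetical.

```python
import struct

LLDP_ETHERTYPE = 0x88CC  # EtherType assigned to LLDP


def format_mac(raw):
    """Render 6 raw bytes as a colon-separated lowercase MAC string."""
    return ":".join(f"{byte:02x}" for byte in raw)


def lldp_source_mac(frame):
    """Return the source MAC of an untagged LLDP frame, or None for anything else."""
    if len(frame) < 14:
        return None  # too short to hold an Ethernet header
    # Ethernet header: 6-byte destination, 6-byte source, 2-byte EtherType.
    _dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != LLDP_ETHERTYPE:
        return None
    return format_mac(src)


if __name__ == "__main__":
    bond_mac = "02:00:00:00:00:aa"  # hypothetical MAC of the bond interface
    # Hand-built frame: LLDP multicast destination, bond MAC as source,
    # LLDP EtherType, then an End-of-LLDPDU TLV as a stand-in payload.
    frame = bytes.fromhex("0180c200000e" "0200000000aa" "88cc") + b"\x00\x00"
    if lldp_source_mac(frame) == bond_mac:
        print("LLDP frame on the backup link is sourced from the bond MAC")
```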

The Root Cause and Fix

It made perfect sense. The LLDP service on the host was mistakenly using the MAC address of the bond interface to source LLDP advertisements. Each of these frames, sent once every 30 seconds, would cause a MAC flap on the switch. The flap persisted until the host initiated some other traffic over its active link, which flapped the MAC back to the correct port.
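To make the timing concrete, here is a small Python sketch that takes a list of frames sourced from the bond MAC and reports the windows during which the switch would be forwarding to the wrong port. It is a simplified model with invented timestamps and port names: it ignores LLDP frames sent on the active link, table aging, and anything else a real switch does.

```python
def misdirected_windows(events, active_port):
    """Return (start, end) windows where the bond MAC is learned on the wrong port.

    events: iterable of (time_in_seconds, port) pairs, one per frame the switch
            receives with the bond MAC as its source address.
    An end of None means the MAC is still on the wrong port after the last event.
    """
    windows = []
    start = None  # time the current misdirected window began, if any
    for time_s, port in sorted(events):
        if port != active_port and start is None:
            start = time_s                      # flap away from the active port
        elif port == active_port and start is not None:
            windows.append((start, time_s))     # flap back to the active port
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


if __name__ == "__main__":
    # Hypothetical timeline: LLDP leaves the backup port every 30 seconds,
    # and the host happens to initiate traffic on the active port at 45 s and 100 s.
    events = [
        (0, "backup"), (30, "backup"), (60, "backup"),
        (45, "active"), (100, "active"),
    ]
    for start, end in misdirected_windows(events, active_port="active"):
        print(f"traffic misdirected from t={start}s until t={end}s")
```

In this model the length of each outage depends entirely on how long the host stays quiet on its active link, which fits the irregular symptoms described above.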

In reading the documentation for the lldpcli tool, I found the configuration option below:

configure system bond-slave-src-mac-type value

     Set the type of src mac in lldp frames sent on bond slaves

     Valid types are:
       real  Slave real mac
       zero  All zero mac
       fixed
             An arbitrary fixed value (00:60:08:69:97:ef)
       local
             Real mac with locally administered bit set. If the real mac already has the
             locally administered bit set, fallback to the fixed value.

     Default value for bond-slave-src-mac-type is local.  Some switches may complain when
     using one of the two other possible values (either because 00:00:00:00:00:00 is not
     a valid MAC or because the MAC address is flapping from one port to another). Using
     local might lead to a duplicate MAC address on the network (but this is quite
     unlikely).
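For reference, the "locally administered bit" mentioned above is the second-least-significant bit (0x02) of the first octet of a MAC address. The Python sketch below shows how the local type could derive a source MAC from that description alone. It is my reading of the documentation text, not code taken from lldpd, and the function name is hypothetical.

```python
# Fixed fallback value quoted in the documentation excerpt above.
FIXED_MAC = "00:60:08:69:97:ef"


def local_src_mac(real_mac):
    """Derive a 'local' style source MAC from a slave's real MAC.

    Sets the locally administered bit (0x02 in the first octet). If that bit
    is already set, falls back to the fixed value, as the excerpt describes.
    """
    octets = [int(part, 16) for part in real_mac.split(":")]
    if octets[0] & 0x02:
        return FIXED_MAC
    octets[0] |= 0x02
    return ":".join(f"{octet:02x}" for octet in octets)


if __name__ == "__main__":
    print(local_src_mac("00:11:22:33:44:55"))  # universally administered input
    print(local_src_mac("02:11:22:33:44:55"))  # bit already set, so the fixed value is used
```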

So I simply dropped into the CLI for the LLDP service with lldpcli, ran the command configure system bond-slave-src-mac-type real, exited out, and restarted the LLDP service with systemctl restart lldpd.

I ran a follow-up packet capture and found LLDP advertisements were now being sent out using the MAC of the physical NIC and not the logical bond interface. No more MAC flapping problem.

Make it Permanent

It might seem like we are done here, but the configuration change we just made is ephemeral and will be lost with a reboot of the machine. We need to permanently change this setting by adding a config file to the /etc/lldpd.d/ directory.

The easiest way to do this is to run echo "configure system bond-slave-src-mac-type real" > /etc/lldpd.d/bond_real.conf (as root, and with plain straight quotes) to create a configuration file in the config directory, and then run systemctl restart lldpd.service to restart the service. Now we are done.

 

Conclusion

I love these types of mysteries, maybe because in the end they bring me back to the very foundational concepts of networking. If my upstream switch were in fact a hub, then I would never have had this problem, though perhaps I would have had an entirely different one.

I’m not sure why the default setting in the LLDP service is to use the MAC of the bond interface. Maybe so the neighbor has more accurate information about the host. It seems from now on I need to keep this setting in mind when installing LLDP. However, I’m just happy I learned something new today and got to use some often forgotten, but never irrelevant knowledge to solve a completely unexpected problem.

About John W Kerns: John Kerns is a network and automation engineer for a VAR based in Southern California and has been in the industry for over 12 years. He maintains a few open-source projects on GitHub and writes for Packet Pushers.