Kevin Dorrell, CCIE #20765

09 Apr 2008

VTP Pruning : Is this a bug?

Filed under: IOS Bugs, LAN Switching, VTP — dorreke @ 14:29

I recently introduced VTP pruning on a LAN, and now I have some connectivity problems on certain VLANs.  The more I look at the problems, the more I wonder whether there is some strange behavior in VTP pruning.  The questions I need to answer are:

  1. Is pruning based on whether the switch has any downstream clients; that is, whether there are any active access ports or unpruned downstream trunks on the VLAN?  Or is it based on whether there are any downstream CAM entries for the VLAN?
  2. Is it possible for a switch to prune a VLAN off a trunk that is the root port for that VLAN?

Here is my apparently anomalous situation.  I have four VLANs, 21-24, that serve as point-to-point links between remote sites to carry server heartbeats.  On each of these VLANs, there are only two hosts: one on each site.  Here is the spanning-tree for VLAN 21:

CC80#show spanning-tree vlan 21
 VLAN0021   Spanning tree enabled protocol rstp
   Root ID    Priority    24576
              Address     0007.4f62.a014
              Cost        15
              Port        72 (Port-channel1)
              Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

   Bridge ID  Priority    32789  (priority 32768 sys-id-ext 21)
              Address     001b.2ae8.b280
              Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
              Aging Time 300
 Interface        Role Sts Cost      Prio.Nbr Type
 ---------------- ---- --- --------- -------- --------------------------------
 Gi0/16           Desg FWD 4         128.16   Edge P2p
 Po1              Root FWD 3         128.72   P2p
 Po2              Desg FWD 3         128.80   P2p

Notice one or two things about this.  Firstly, Po1 is the root port.  Secondly, that G0/16 is an “up” access port on this VLAN.  Thirdly, I should mention that Po2 is a trunk to another access layer switch at the same level.  But Po2 spends its time in blocking state on the other switch except in an emergency.  That is why this switch prunes most VLANs off Po2, as we will see.

Now let us look at the trunks:

CC80#show int trunk
 Port        Mode         Encapsulation  Status        Native vlan
 Gi0/3       on           802.1q         trunking      2
 Po1         on           802.1q         trunking      12
 Po2         on           802.1q         trunking      12

 Port        Vlans allowed on trunk
 Gi0/3       1,169
 Po1         1-2,5,12,21-26,169
 Po2         1-2,5,12,21-26,169

 Port        Vlans allowed and active in management domain
 Gi0/3       1,169
 Po1         1-2,5,12,21-26,169
 Po2         1-2,5,12,21-26,169

 Port        Vlansin spanning tree forwarding state and not pruned
 Gi0/3       1,169
 Po1         1-2,5,12,22,24-26,169
 Po2         1

As expected, most of the VLANs are pruned off Po2 as te other end of Po2 is on STPblocking state.  Ignore G0/3; this is a server trunk.  The interesting thing is that the switch has pruned VLAN 21 from the root port trunk, Po1.  Why?  This has effectively cut this switch off from VLAN 21.

VLAN21 is pruned from both Po1 and Po2, and yet it has an access port on it.  Now, that access port, G0/16, is apparently not receiving any MAC traffic from its connected host.  There is nothing in the CAM table except the upstream switch.  But it is still isolated, so it cannot see any traffic from the remote part of the VLAN, so it does not respond:

CC80#show mac-address-table dyn vlan 21
           Mac Address Table
 -------------------------------------------
 Vlan    Mac Address       Type        Ports
 ----    -----------       --------    -----
   21    0016.c73d.a22b    DYNAMIC     Po1

 Total Mac Addresses for this criterion: 1

One last possible clue.  We have four VLANs, each with two host connections.  Two of them work, two of them don’t.  The difference is that they take different paths.  Two of them are rooted on a 4506 running IOS 12.2(25)EWA2, and they work; they are not pruned anywhere between the two sites.  The two that do not work are rooted on a 4003 running CatOS 8.4(5)GLX.

Update 19/04/2008:

I think I have sorted it out, and I think it is a bug in the root switch for VLAN 21. Unlike our other VLANs, VLAN 21 is rooted in a CatOS switch. Due to various circumstances in our network, it tripped over a bug.

The bug is related to one I found a couple of years ago. I found that in a CatOS switch, if you manually disallow the native VLAN (in my case VLAN 12) from a trunk, then it stops the trunk passing BPDUs for VLAN1 as well. At the time, that resulted in a 5-minute meltdown of my network.

Here are the notes I have made for this new bug. Sorry about the generalisations … I do not know the VTP protocol very well yet.

Normally, VTP signalling is carried on the native VLAN of each trunk. By default, the native VLAN is VLAN 1, but you are allowed other values. We use VLAN 12 as native, a VLAN that is unused anywhere on the network. Now, an IOS switch will never prune VLAN 1 from a trunk. Nor will it prune the native VLAN. However, CatOS has a bug: if the (non-1) native VLAN is unused, it will prune it from the trunk regardless of the fact that it is the native. Once the native VLAN is pruned, of course, the VTP signal cannot be propagated to other switches.

It happens that we have one CatOS switch in our core loop. That switch is the root for VLANs 21 and 23. (And fortunately only for VLANs 21 and 23.) Because that root switch had pruned the native VLAN from its trunks, it was no longer able to send VTP unprune signals for VLANs 21 and 23 to its neighbors. Its neighbors therefore pruned VLANs 21 and 23 from the trunks to the root. The result was that there was no connectivity in VLANs 21 and 23, and every switch pruned all ports on those VLANs.

I resolved the problem by rolling back the VTP pruning.

Advertisements

4 Comments »

  1. […] and the other Filed under: Uncategorized — dorreke @ 13:27 Added the resolution to my posting a couple of weeks ago about a VTP pruning problem.  I came to the conclusion it was a bug in CatOS.  CatOS will quite happily prune the native VLAN […]

    Pingback by This, that, and the other « Kevin Dorrell’s CCIE Study Weblog — 19 Apr 2008 @ 13:27

  2. I’ve been looking into this, and before I press enter on the command, was wondering what version of CatOS you are running? I don’t really need the headache of pruning issues if I do this, but need to clear up the amount traffic caused by numerous VLANs.

    Comment by Malcolm White — 08 Jul 2008 @ 06:56

  3. Malcolm,

    The version was 8.4(5)GLX. There are more recent version, but as far as I am aware Cisco had no intention to fix this bug before EOL on the CatOS.

    I think that as long as your native VLAN corresponds to some real ports, you should be OK. The problems start to occur if VTP prunes the native VLAN. In retrospect, I would have done better to use the management VLAN as the native, because that guarantees that each switch has a real presence on the native VLAN, and so it will not get pruned.

    Comment by dorreke — 08 Jul 2008 @ 09:35

  4. P.S. Sorry if the formatting of this posting has changed. WordPress editor has a nasty of removing the CR/LF from router listings, so I had to put them back in by hand.

    Comment by dorreke — 08 Jul 2008 @ 09:38


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: