
The Spanning Tree Protocol (STP, defined in IEEE 802.1D) is run by layer two devices to avoid loops. At layer three, IP includes a TTL (time to live) field that decrements with every hop, so a packet either reaches its destination or is destroyed when the TTL reaches 0. STP is necessary because a layer two frame has no similar field that could be used to identify and remove looping frames.
The solution is to not allow a loop to form. This is accomplished by recognizing potential looping paths and turning off links until no loops exist. For purposes of this discussion, I will use the term “bridge” to generically describe a layer two device, but the same concepts apply to switches.
When a bridged network is first powered on, each of the bridges starts to produce Bridge Protocol Data Units (BPDUs) every two seconds. The initial BPDUs advertise the transmitting bridge as the root of the network with a path cost of zero (if I’m the root, it costs me nothing to get to the root). The transmitting bridge gives itself an eight-byte Bridge ID, which is a concatenation of its two-byte priority and six-byte MAC address. When BPDUs are exchanged, each bridge evaluates received BPDUs for superiority. A BPDU is superior if it advertises a root with a lower Bridge ID. If no other bridge offers a superior BPDU, then this bridge is the root.
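The Bridge ID comparison described above can be sketched in a few lines of Python. This is only an illustration, not how any particular bridge implements it: the function names, the bridge names, and the MAC addresses are made up, and 32768 is used because it is the usual default priority.

```python
def bridge_id(priority, mac):
    """Pack a 2-byte priority and a 6-byte MAC address into one 8-byte integer."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6 or not 0 <= priority <= 0xFFFF:
        raise ValueError("need a 2-byte priority and a 6-byte MAC address")
    # Priority occupies the high-order bytes, so it dominates the comparison.
    return int.from_bytes(priority.to_bytes(2, "big") + mac_bytes, "big")


def elect_root(bridges):
    """Return the name of the bridge with the numerically lowest Bridge ID."""
    return min(bridges, key=bridges.get)


# Hypothetical bridges for illustration only.
bridges = {
    "A": bridge_id(32768, "00:00:0c:aa:aa:aa"),
    "B": bridge_id(32768, "00:00:0c:00:00:01"),
    "C": bridge_id(4096, "00:00:0c:ff:ff:ff"),
}
root = elect_root(bridges)
print(root)
```

Because the priority sits in front of the MAC address, bridge C wins here despite having the highest MAC address. Between A and B, which share a priority, the lower MAC address would decide.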
If another bridge sends a superior BPDU, then the bridge will cease to advertise itself as the root and start to advertise the root named in the superior BPDU, with a cost equal to the cost advertised by the neighbor plus the cost of the link to the neighbor. If BPDUs naming the same root are received on two different ports, then a loop has formed. The bridge will evaluate the two paths that make up the loop and place one in a blocking state, keeping the path with the lowest cost to the root. If the costs are equal, it keeps the path with the lowest transmitting Bridge ID, and if that is also equal, the lowest transmitting port number.
When this process has finished, a loop-free topology will be in place and data transmission can begin. Notice that spanning tree has created a loop-free topology by preventing data transmission on selected lines. Spanning tree is not concerned with maximizing bandwidth or minimizing path length! Spanning tree is only concerned with eliminating loops.
When a link is first activated, it transitions through a listening stage for Forward Delay seconds (Forward Delay defaults to 15) where it sends and receives BPDUs but does not forward data-carrying frames. The spanning tree negotiation takes place during this stage. A port that loses the negotiation is placed in blocking. Otherwise, after the listening stage is through, the port transitions into a learning stage for Forward Delay seconds, in which the bridge builds its CAM table (MAC to port mappings). When the learning stage is complete, the port begins forwarding. Links that are left available to the topology are said to be in a forwarding state, whereas links that are removed from the topology are said to be blocking.
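To make the timing concrete, here is a rough sketch of the stages a newly activated port passes through. The function name and the `wins_negotiation` flag are my own inventions, and the sketch simplifies things by showing a losing port as blocking only once the listening stage ends.

```python
FORWARD_DELAY = 15  # seconds, the default quoted above


def port_timeline(wins_negotiation, forward_delay=FORWARD_DELAY):
    """Return [(seconds_since_link_up, state), ...] for a newly activated port."""
    timeline = [(0, "listening")]
    if not wins_negotiation:
        # Simplification: the port is shown as blocking when listening ends.
        timeline.append((forward_delay, "blocking"))
        return timeline
    timeline.append((forward_delay, "learning"))
    timeline.append((2 * forward_delay, "forwarding"))
    return timeline


for seconds, state in port_timeline(wins_negotiation=True):
    print(f"t={seconds:>2}s  {state}")
```

With the default Forward Delay, a port that stays in the topology does not carry user data until 30 seconds after the link comes up.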
What happens if a bridge receives two equivalent BPDUs? A loop exists, and so one port must be placed in a blocking state.
Which path will be placed in blocking state? The path with the higher cost. If the paths have equivalent cost, then the path with the higher transmitting bridge (neighbor) ID. If the transmitting bridge ID is equal, then the path with the higher port priority and number.
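The tie-break order in that answer maps neatly onto a sorted comparison. The sketch below is a deliberate simplification: it only picks which of several paths to the same root stays open, it ignores the designated-port role that real bridges also track, and the `PathOffer` type, port names, and ID values are all hypothetical.

```python
from typing import NamedTuple


class PathOffer(NamedTuple):
    """What a bridge knows about one path to the root."""
    cost_to_root: int      # advertised cost plus the cost of the local link
    sender_bridge_id: int  # Bridge ID of the neighbor that sent the BPDU
    sender_port: int       # port priority and number on that neighbor
    local_port: str        # our own port name, used only for reporting


def choose_ports(offers):
    """Return (forwarding_port, blocked_ports) for BPDUs naming the same root."""
    # Lowest cost wins, then lowest sender Bridge ID, then lowest sender port.
    ranked = sorted(
        offers,
        key=lambda o: (o.cost_to_root, o.sender_bridge_id, o.sender_port),
    )
    return ranked[0].local_port, [o.local_port for o in ranked[1:]]


# Two equal-cost paths: the neighbor with the lower Bridge ID wins.
offers = [
    PathOffer(38, 0x8000_0000_0C00_0002, 1, "port1"),
    PathOffer(38, 0x8000_0000_0C00_0001, 3, "port2"),
]
keep, blocked = choose_ports(offers)
print(keep, blocked)
```

Here both paths cost 38, so the decision falls to the sender Bridge ID, and port1 ends up blocked. Had the costs differed, the IDs would never have been consulted.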
What are the spanning tree timers? Max Age is the amount of time a bridge waits, since receiving the last BPDU, before declaring a link dead. Max Age is 20 seconds by default. Hello Time is the time between BPDUs. Hello Time is 2 seconds by default. Forward Delay is the time that a bridge spends in each of the transitory states, listening and learning. Forward Delay is 15 seconds by default. Once a root is elected, the root dictates these values to the other bridges by specifying them in BPDUs.
So we’ve discussed how the bridged network comes up and identifies a single path to each end station. How does the spanning tree react to topology changes? For instance, what happens when a link that was working goes down? Consider the following topology – forwarding lines are shown in black and blocking lines are shown in red.
This network has settled on a stable topology, but what happens when the BE link goes down? The bridges continue to produce BPDUs every Hello Time seconds, even on blocking ports. If Bridge E fails to receive a BPDU within Max Age from Bridge B then Bridge E will re-run STP using the BPDUs it is receiving (on the DE and CE links).
How quickly will Bridge E adjust to the lost circuit? First it has to leave its port in blocking for Max Age seconds to make sure that the link is really gone (by default, 20 seconds). Next the port will transition through listening for Forward Delay (15 seconds) and learning for Forward Delay (15 seconds). This means that traffic can be interrupted for up to 50 seconds. This is long enough for TCP connections to be lost and users to receive error messages. There are proprietary ways of reducing this time to almost zero, but there is no multi-vendor specification other than changing the timers, which may not be desirable.
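The 50-second figure is just the three waits added together, which a tiny helper makes explicit. The function name is mine, and the reduced timer values in the second call are hypothetical numbers chosen for illustration, not a recommendation.

```python
def worst_case_outage(max_age=20, forward_delay=15):
    """Seconds until a blocked port forwards: Max Age, then listening, then learning."""
    return max_age + 2 * forward_delay


print(worst_case_outage())                            # default timers
print(worst_case_outage(max_age=6, forward_delay=4))  # hypothetical reduced timers
```

Notice that Forward Delay counts twice, so shaving a second off Forward Delay helps twice as much as shaving a second off Max Age.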
One criticism of Spanning Tree has to do with its use in networks that also use DHCP. When a user first plugs a PC into a port, or when a PC is turned on, the port has to transition through listening and learning before settling into forwarding. If the PC is using DHCP, then the DHCP process can time out before this is finished. There are three possible solutions to this problem: change the timers, turn off spanning tree, or use a proprietary solution.
Changing the spanning tree timers is problematic, in that there are interdependencies between the timer values and the size of the network. It can be done, and guaranteeing that all bridges use the same values is easy since the root dictates these values in its BPDUs. However, this is not the preferred method because it is possible to specify unworkable values and there is a limit to how quickly spanning tree can be asked to re-converge.
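As an example of those interdependencies, the relationships usually quoted from IEEE 802.1D are 2 × (Forward Delay − 1) ≥ Max Age and Max Age ≥ 2 × (Hello Time + 1), all in seconds. The check below is a sketch based on my reading of those two rules; the function name is made up, and the standard itself should be consulted before relying on it.

```python
def timers_are_consistent(hello_time, max_age, forward_delay):
    """Check the two timer relationships commonly quoted from IEEE 802.1D.

    All values are in seconds. This only checks mutual consistency; it does
    not check that the values suit the diameter of a particular network.
    """
    return (
        2 * (forward_delay - 1) >= max_age
        and max_age >= 2 * (hello_time + 1)
    )


print(timers_are_consistent(hello_time=2, max_age=20, forward_delay=15))  # defaults
print(timers_are_consistent(hello_time=2, max_age=20, forward_delay=4))   # too aggressive
```

The defaults pass both checks. Dropping Forward Delay to 4 seconds while leaving Max Age at 20 fails the first one, which is the kind of unworkable combination that is easy to configure by accident.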
Turning off spanning tree is a popular solution, but is akin to playing Russian roulette. The idea is, “Hey, this port will never be used for anything other than a PC, why run spanning tree and take the startup hit? We’ll just turn off STP!” Interesting thought, but what happens when an eager Level 1 tech tries to troubleshoot a problem by plugging in all the loose cables in the wiring closet? One wrong plug and oops! No network.
That leaves proprietary solutions. The problem with these solutions is that you must thoroughly understand how they work if you have a multi-vendor environment. You may need to commit to a single vendor for all switches. If there are no issues with a single-vendor environment, then this is the safest and easiest choice.