Cisco NX-OS Graceful Insertion and Removal (GIR)

If you operate a data-center network with Cisco Nexus switches, you have probably already faced the problem of how to perform maintenance on one of the two switches of a vPC pair, with minimal impact and risk for the production network. Cisco NX-OS includes a feature called “Graceful Insertion and Removal”, or GIR, to help with that. Here is how it works.

Scenario

Let’s take the example below:

Figure: Cisco NX-OS vPC topology

We have two Nexus switches (in NX-OS mode) in a vPC pair, doing layer-2 aggregation and layer-3 routing. They use OSPF as the IGP and BGP towards the upstream provider(s). Here, I show only one external router for simplicity. To simulate the Internet, this router announces the prefix 8.8.8.8/32 to our vPC AS.
On these vPC “core” routers, we have the layer-3 SVIs for the data-center networks, represented here by a single dual-homed Nexus access/leaf switch on the network 10.10.10.0/24.

Despite everything we can read about ACI or VXLAN, this is still a very common architecture in the real world today, at least for small and mid-size data-centers.

Now, let’s say we have to perform hardware maintenance on the switch core-sw-02, and we would like to do this with minimal impact on the production network, or even no impact at all, if possible. In the end, this is one of the main reasons why we have two physically separate switches, right?

 

The manual solution

Let’s see how we can do this task with manual configuration:

  • First, we have to be sure the Internet traffic goes through core-sw-01 and no longer through core-sw-02. The best solution for this is to play with BGP attributes to give a better preference to the core-sw-01 path. To influence the incoming traffic, one solution is to add AS-path prepending to the announcements from core-sw-02 to the external router. We could also play with MED values if we have a single upstream provider, or add special communities to our prefixes, if the upstream provider supports them. Then, for outgoing traffic, we have to assign a lower local-preference value to the routes we receive on core-sw-02. Or, as we use Cisco devices here, we can also use the weight attribute.
    If we choose to simply shut down the BGP session on core-sw-02, there is a risk that the router on the other side does not correctly interpret the BGP notification with the “cease” error code (normally sent when we manually shut down a session), or hits a similar problem. So, there is a risk of losing packets, at least during the BGP hold time (180 seconds on Cisco by default).
    I would avoid shutting down the session or the physical interface, because there are too many unknown parameters on the provider side.
  • Then, we have to be sure the internal traffic bypasses core-sw-02. This involves knowing how vPC and the port-channels of the access/leaf switches forward traffic. To minimize the impact, we have to do a vPC shutdown on core-sw-02. We could maybe also change the load-sharing method on the access/leaf switches, but the side effects would certainly be greater than the impact we are trying to avoid. So we will only do a vPC shutdown.
  • The third problem depends on the IGP. For example, we could have another layer-3 device dual-homed to the core switches. This is not the case in my example above, but it is possible. In that case, we have to be sure all the layer-3 devices on the network have a better metric through core-sw-01 than through core-sw-02. This can be done in different ways, depending on the IGP used.
  • Finally, we can perform the maintenance.
  • Last but not least, we have to do the same operations in reverse order to put the switch back into service.

 

Graceful Insertion and Removal (GIR) overview

To automate this process, Cisco NX-OS introduced the Graceful Insertion and Removal (GIR) function. It is available since software release 7.0(3)I2(1) on the Nexus 9K and 3K platforms, and also on the Nexus 7K (which follows its own release numbering, so check the configuration guide for your platform). GIR lets us perform the entire process above with a single command, and also customize it with our own commands, if necessary.

When we enter the command system mode maintenance in configuration mode, NX-OS puts the switch into maintenance mode by applying what is called the maintenance-mode configuration profile. Then, when we enter no system mode maintenance, NX-OS puts the switch back in service by applying the normal-mode configuration profile.

 

GIR profiles

There are two types of maintenance configuration profiles:

  • The maintenance-mode profile: containing all the commands that will be executed during the GIR activation or graceful removal, when the switch enters maintenance mode.
  • The normal-mode profile: containing all the commands that will be executed during the GIR deactivation or graceful insertion, when the switch returns to normal mode.

We can use the default profiles, or modify them.

The default profiles are generated by the switch, which parses the configuration when we type the command “system mode maintenance” for the first time. The profiles differ depending on which routing protocols are in use. The system generates both a default maintenance-mode profile and a default normal-mode profile.

We can see the maintenance profiles with the command: show maintenance profile
Below, I executed this command on core-sw-01, where maintenance mode has never been used, so we can see the profiles are empty:

NX01# show maintenance profile 
[Normal Mode]

[Maintenance Mode]

NX01#

 

With the default maintenance profile, the active forwarding protocols of the switch are placed in “isolate” state. Here is an example where I type system mode maintenance on core-sw-02 and abort it:

NX02(config)# system mode maintenance

Following configuration will be applied:

router bgp 1
isolate
router ospf 1
isolate
sleep instance 2 20
vpc domain 99
shutdown

NOTE: If you have vPC orphan interfaces, please ensure 'vpc orphan-port suspend' is configured under them, before proceeding further
Do you want to continue (yes/no)? [no] no

As we can see, the system first puts BGP in isolate mode, then OSPF, then waits 20 seconds, and finally does a vPC shutdown. This corresponds to the manual configuration I suggested above.

Routing protocols isolate mode

The isolate mode is used to take the switch out of the active forwarding path. Each protocol uses a different mechanism to influence the forwarding decisions of the remaining devices, so that they do not choose this switch as part of the active path(s):

  • RIP: poison route(s) with highest metric.
  • OSPF: send OSPF LSAs with max metric.
  • EIGRP: poison route(s) with highest metric.
  • IS-IS: refresh LSPs with Overload bit on.
  • BGP: withdraw BGP route(s) advertisements.
  • PIM (in vPC): vPC forwarding role transfer.
  • vPC: shutdown the vPC to bring down the vPC domain on the local switch.
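
To make the OSPF case more concrete, here is a minimal Python sketch of how a neighboring layer-3 device might pick its next hop before and after core-sw-02 is isolated. This is only a toy model, not how NX-OS computes routes: the link cost of 40 and the helper name best_next_hops are hypothetical, while 65535 (0xFFFF) is the maximum OSPF link metric that a router in max-metric state advertises.

```python
# Toy model: a neighbor compares the total OSPF cost via each core switch.
MAX_LINK_METRIC = 0xFFFF  # 65535, advertised by a router in max-metric state


def best_next_hops(path_costs):
    """Return the next hop(s) with the lowest total cost (ECMP if tied)."""
    lowest = min(path_costs.values())
    return sorted(hop for hop, cost in path_costs.items() if cost == lowest)


LINK_COST = 40  # hypothetical cost of each link in this example

# Normal operation: both cores offer the same total cost, so both are used.
normal = {
    "core-sw-01": LINK_COST + LINK_COST,
    "core-sw-02": LINK_COST + LINK_COST,
}

# After "isolate": core-sw-02 advertises its links with the maximum metric.
isolated = {
    "core-sw-01": LINK_COST + LINK_COST,
    "core-sw-02": LINK_COST + MAX_LINK_METRIC,
}

print(best_next_hops(normal))    # ['core-sw-01', 'core-sw-02']
print(best_next_hops(isolated))  # ['core-sw-01']
```

The adjacency with core-sw-02 stays up in both cases; only the advertised cost changes, which is why the neighbors can move away from the switch without a hard failure.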

 

GIR in action

Now, we can execute this command on core-sw-02 to see the results:

NX02(config)# system mode maintenance

Following configuration will be applied:

router bgp 1
isolate
router ospf 1
isolate
sleep instance 2 20
vpc domain 99
shutdown

NOTE: If you have vPC orphan interfaces, please ensure 'vpc orphan-port suspend' is configured under them, before proceeding further
Do you want to continue (yes/no)? [no] yes

Generating before_maintenance snapshot before going into maintenance mode

Starting to apply commands...

Applying : router bgp 1
Applying : isolate
Applying : router ospf 1
Applying : isolate
Applying : sleep instance 2 20
Applying : vpc domain 99
Applying : shutdown

Maintenance mode operation successful.

Waiting 120 seconds to allow network re-routing to occur before releasing CLI
........................done
NX02(maint-mode)(config)#

At that moment, we are in maintenance mode. We can power off the switch or hot-swap line cards without any impact on the production network. All the BGP sessions and OSPF adjacencies are still up, but no route is sent to the external router via BGP, for example:

router# show bgp all sum
BGP summary information for VRF default, address family IPv4 Unicast
BGP router identifier 8.8.8.8, local AS number 100
BGP table version is 5, IPv4 Unicast config peers 2, capable peers 2
2 network entries and 2 paths using 472 bytes of memory
BGP attribute entries [2/320], BGP AS path entries [1/6]
BGP community entries [0/0], BGP clusterlist entries [0/0]

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
172.16.1.1      4     1       8       7        5    0    0 00:01:55 1         
172.16.2.1      4     1       7       7        5    0    0 00:01:54 0   

 

Now, let’s see what happens when we execute the “graceful insertion”:

NX02(maint-mode)(config)# no system mode maintenance 

Following configuration will be applied:

vpc domain 99
  no shutdown
sleep instance 2 20
router ospf 1
  no isolate
router bgp 1
  no isolate

Do you want to continue (yes/no)? [no] yes

Starting to apply commands...

Applying : vpc domain 99
Applying :   no shutdown

Applying : sleep instance 2 20
Applying : router ospf 1
Applying :   no isolate
Applying : router bgp 1
Applying :   no isolate

Maintenance mode operation successful.

Waiting 120 seconds to allow network convergence before generating after_maintenance snapshot
.........................
Generating after_maintenance snapshot
Please use 'show snapshots compare before_maintenance after_maintenance' to check the health of the system
NX02(config)#

Again, we can see the system enters the commands in reverse order, as suggested in the manual configuration above.
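
The relationship between the two default profiles can be sketched in a few lines of Python. This is only an illustration of the "reverse order, negated commands" idea, assuming a profile is a flat list of lines like the one printed above. The function name invert_profile and the SUB_COMMANDS set are hypothetical, and this is not how NX-OS generates its profiles internally.

```python
# Sub-commands that belong to the context line just before them.
SUB_COMMANDS = {"isolate", "shutdown"}

MAINTENANCE_PROFILE = [
    "router bgp 1",
    "isolate",
    "router ospf 1",
    "isolate",
    "sleep instance 2 20",
    "vpc domain 99",
    "shutdown",
]


def invert_profile(maintenance_lines):
    """Build a normal-mode command list from a maintenance-mode one."""
    # Group each context line with the sub-commands that follow it.
    blocks = []
    for line in maintenance_lines:
        if line in SUB_COMMANDS and blocks:
            blocks[-1].append(line)
        else:
            blocks.append([line])

    # Walk the blocks backwards and negate only the sub-commands;
    # standalone lines such as "sleep" are kept unchanged.
    normal_lines = []
    for context, *subs in reversed(blocks):
        normal_lines.append(context)
        normal_lines.extend("no " + sub for sub in subs)
    return normal_lines


for line in invert_profile(MAINTENANCE_PROFILE):
    print(line)
```

Run against the maintenance-mode profile shown earlier, this prints the same sequence as the normal-mode profile: vPC first, then the sleep, then OSPF, then BGP.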

 

GIR custom profiles

Now, let’s see how to change the maintenance-mode and normal-mode profiles.

As we saw above, we can see the current profiles, generated by the system, with the command:

NX02# show maintenance profile 
[Normal Mode]
vpc domain 99
no shutdown
sleep instance 2 20
router ospf 1
no isolate
router bgp 1
no isolate

[Maintenance Mode]
router bgp 1
isolate
router ospf 1
isolate
sleep instance 2 20
vpc domain 99
shutdown

If we need to change or add something, here is the process:

  • Go into configuration mode and type: system mode maintenance always-use-custom-profile
    This way, the system will not generate a new default profile; it will use the one you defined.
  • Update the profiles as you need, with the commands: config maintenance profile maintenance-mode | normal-mode
    Let’s see an example below:
NX02# config maintenance profile maintenance-mode 
Please configure 'system mode maintenance always-use-custom-profile' if you want to use custom profile always for maintenance mode.
Enter configuration commands, one per line. End with CNTL/Z.
NX02(config-mm-profile)# 
NX02(config-mm-profile)# interface e1/1
NX02(config-mm-profile-if-verify)# shut
NX02(config-mm-profile-if-verify)# exit

!--- Here we can see the changes with show maint profile:

NX02(config-mm-profile)# show maint profile 
[Normal Mode]
vpc domain 99
  no shutdown
sleep instance 2 20
router ospf 1
  no isolate
router bgp 1
  no isolate

[Maintenance Mode]
router bgp 1
  isolate
router ospf 1
  isolate
sleep instance 2 20
vpc domain 99
  shutdown
interface Ethernet1/1
  shutdown

NX02(config-mm-profile)#
NX02(config-mm-profile)# end
Exit maintenance profile mode.
NX02#

Any command is accepted in the maintenance profiles. We can also insert a delay before the next change (sleep instance x sec), or execute a Python script (python instance instance-number uri [python-arguments]). But remember to add the matching “no” commands to the normal-mode profile to revert your changes.

To delete the profiles, you can use the command:

NX02# no configure maintenance profile maintenance-mode 
Maintenance mode profile maintenance-mode successfully deleted
Enter configuration commands, one per line. End with CNTL/Z.
Exit maintenance profile mode.
NX02#
NX02# no configure maintenance profile normal-mode 
Maintenance mode profile normal-mode successfully deleted
Enter configuration commands, one per line. End with CNTL/Z.
Exit maintenance profile mode.
NX02#
NX02# show maintenance profile 
[Normal Mode]

[Maintenance Mode]

NX02# 
NX02# config t
Enter configuration commands, one per line. End with CNTL/Z.
NX02(config)# no system mode maintenance always-use-custom-profile 
NX02(config)# 

 

Snapshots

GIR automatically creates a snapshot before and after the maintenance. A snapshot captures the running state of selected features and stores it on persistent storage. This is useful to compare the state of the switch before graceful removal and after graceful insertion.

By entering show snapshots, we see the list of snapshots:

NX02# show snapshots 
Snapshot Name           Time                           Description
------------------------------------------------------------------------------
before_maintenance      Fri Jan 10 18:33:37 2020       system-internal-snapshot
after_maintenance       Fri Jan 10 18:58:42 2020       system-internal-snapshot

 

Compare snapshots

We can quickly compare the snapshots with the compare summary command:

NX02# show snapshots compare before_maintenance after_maintenance summary 

================================================================================
Feature                         before_maintenanceafter_maintenance changed 
================================================================================
basic summary
  # of interfaces                        133             133       
  # of vlans                               2               2       
  # of ipv4 routes vrf default            20              19         *
  # of ipv4 paths  vrf default            22              21         *
  # of ipv4 routes vrf keepalive           8               8       
  # of ipv4 paths  vrf keepalive           8               8       
  # of ipv6 routes vrf default             3               3       
  # of ipv6 paths  vrf default             3               3       
  # of ipv6 routes vrf keepalive           3               3       
  # of ipv6 paths  vrf keepalive           3               3       

interfaces
  # of eth interfaces                    128             128       
  # of eth interfaces up                   7               7       
  # of eth interfaces down               121             121       
  # of eth interfaces other                0               0       

  # of vlan interfaces                     2               2       
  # of vlan interfaces up                  1               0         *
  # of vlan interfaces down                1               2         *
  # of vlan interfaces other               0               0       
NX02#

Custom sections

You can add any section to the snapshots. Any command starting with show can be added to it.

For this, use the command: snapshot section add section “show-command” row-id element-key1 [element-key2]
Here, row-id is the tag of each row entry in the show command’s XML output, and element-key1 and element-key2 are the tags used to distinguish among the row entries. In most cases, only one element needs to be specified.

 

For example, if we want to add a custom section to see the IPv4 prefix information:

  • First, execute the command you want to analyze with XML output (here is an extract of the result):
NX02# show ip route detail vrf all | xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<nf:rpc-reply xmlns="http://www.cisco.com/nxos:1.0:urib" xmlns:nf="urn:ietf:params:xml:ns:netconf:base:1.0">
 <nf:data>
  <show>
   <ip>
    <route>
     <__XML__OPT_Cmd_urib_show_ip_route_command_ip>
      <__XML__OPT_Cmd_urib_show_ip_route_command_unicast>
       <__XML__OPT_Cmd_urib_show_ip_route_command_topology>
        <__XML__OPT_Cmd_urib_show_ip_route_command_l3vm-info>
         <__XML__OPT_Cmd_urib_show_ip_route_command_rpf>
          <__XML__OPT_Cmd_urib_show_ip_route_command_ip-addr>
           <__XML__OPT_Cmd_urib_show_ip_route_command_protocol>
            <__XML__OPT_Cmd_urib_show_ip_route_command_summary>
             <__XML__OPT_Cmd_urib_show_ip_route_command_vrf>
              <__XML__OPT_Cmd_urib_show_ip_route_command___readonly__>
               <__readonly__>
                <TABLE_vrf>
                 <ROW_vrf>
                  <vrf-name-out>default</vrf-name-out>
                  <TABLE_addrf>
                   <ROW_addrf>
                    <addrf>ipv4</addrf>
                    <TABLE_prefix>
                     <ROW_prefix>
                      <ipprefix>0.0.0.0/32</ipprefix>
                      <ucast-nhops>1</ucast-nhops>
                      <mcast-nhops>0</mcast-nhops>
                      <attached>false</attached>
                      <TABLE_path>
                       <ROW_path>
                        <ifname>Null0</ifname>
                        <uptime>PT1H24M38S</uptime>
                        <pref>220</pref>
                        <metric>0</metric>
                        <clientname>broadcast</clientname>
                        <type>discard</type>
                        <ubest>true</ubest>
                       </ROW_path>
                      </TABLE_path>
                     </ROW_prefix>
                     <ROW_prefix>
                      <ipprefix>127.0.0.0/8</ipprefix>
                      <ucast-nhops>1</ucast-nhops>
                      <mcast-nhops>0</mcast-nhops>
                      <attached>false</attached>
                      <TABLE_path>
                       <ROW_path>
                        <ifname>Null0</ifname>
                        <uptime>PT1H24M38S</uptime>
                        <pref>220</pref>
                        <metric>0</metric>
                        <clientname>broadcast</clientname>
                        <type>discard</type>
                        <ubest>true</ubest>
                       </ROW_path>
                      </TABLE_path>
                     </ROW_prefix>
                     <ROW_prefix>
                      <ipprefix>255.255.255.255/32</ipprefix>
                      <ucast-nhops>1</ucast-nhops>
                      <mcast-nhops>0</mcast-nhops>
                      <attached>false</attached>
  • We can see the KEY element for each prefix is: <ROW_prefix>
  • And the route itself is: <ipprefix>

So, to add a “route” section into the snapshot, we have to enter the command:

NX02# snapshot section add route "show ip route detail vrf all" ROW_prefix ipprefix
added section "route"
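
To check a row-id and key choice offline before adding a section, a small Python sketch like the one below can help. It assumes you saved the XML output to a string. SAMPLE is a trimmed, hypothetical extract modeled on the output above, and row_keys is an illustrative helper, not part of NX-OS. The helper ignores XML namespaces so that it also works on the full, namespaced output.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical extract modeled on "show ip route detail vrf all | xml".
SAMPLE = """<TABLE_prefix>
  <ROW_prefix>
    <ipprefix>127.0.0.0/8</ipprefix>
    <ucast-nhops>1</ucast-nhops>
  </ROW_prefix>
  <ROW_prefix>
    <ipprefix>255.255.255.255/32</ipprefix>
    <ucast-nhops>1</ucast-nhops>
  </ROW_prefix>
</TABLE_prefix>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def row_keys(xml_text, row_id, key):
    """Return the text of every 'key' element found directly under a 'row_id' row."""
    root = ET.fromstring(xml_text)
    keys = []
    for elem in root.iter():
        if local_name(elem.tag) != row_id:
            continue
        for child in elem:
            if local_name(child.tag) == key:
                keys.append(child.text)
    return keys


print(row_keys(SAMPLE, "ROW_prefix", "ipprefix"))
# ['127.0.0.0/8', '255.255.255.255/32']
```

If the list contains one distinct value per row, the pair is probably a reasonable choice for row-id and element-key1.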

 

We can see the custom sections of the snapshot with the command:

NX02# show snapshots sections 
user-specified snapshot sections
--------------------------------
[route]
show command: show ip route detail vrf all
row id: ROW_prefix
key1: ipprefix
key2: -

NX02#

 

Custom section demo

Now, let’s create two snapshots including this custom section and compare them:

NX02# snapshot create TEST1 my test 
Executing 'show interface'... Done
Executing 'show ip route summary vrf all'... Done
Executing 'show ipv6 route summary vrf all'... Done
Executing 'show bgp sessions vrf all'... Done
Feature 'eigrp' not enabled, skipping...
Feature 'eigrp' not enabled, skipping...
Executing 'show vpc'... Done
Executing 'show ip ospf vrf all'... Done
Feature 'ospfv3' not enabled, skipping...
Feature 'isis' not enabled, skipping...
Feature 'rip' not enabled, skipping...
Executing user-specified 'show ip route detail vrf all'... Done
Snapshot 'TEST1' created
NX02# 

--- Now I add four static routes to see a difference ---

NX02# config t
Enter configuration commands, one per line. End with CNTL/Z.
NX02(config)# ip route 9.9.9.9 255.255.255.255 172.16.2.2
NX02(config)# ip route 9.9.9.10 255.255.255.255 172.16.2.2
NX02(config)# ip route 9.9.9.11 255.255.255.255 172.16.2.2
NX02(config)# ip route 9.9.9.12 255.255.255.255 172.16.2.2
NX02(config)# end

--- And now I make the 2nd snapshot ---

NX02# snapshot create TEST2 my test 
Executing 'show interface'... Done
Executing 'show ip route summary vrf all'... Done
Executing 'show ipv6 route summary vrf all'... Done
Executing 'show bgp sessions vrf all'... Done
Feature 'eigrp' not enabled, skipping...
Feature 'eigrp' not enabled, skipping...
Executing 'show vpc'... Done
Executing 'show ip ospf vrf all'... Done
Feature 'ospfv3' not enabled, skipping...
Feature 'isis' not enabled, skipping...
Feature 'rip' not enabled, skipping...
Executing user-specified 'show ip route detail vrf all'... Done
Snapshot 'TEST2' created
NX02#

 

Now we can compare them:

NX02# show snapshots compare TEST1 TEST2 ipv4routes

================================================================================
metric                          TEST1           TEST2           changed
================================================================================
# of ipv4 routes                27              31              *

Prefix
--------------------
9.9.9.9/32 prefix not in TEST1 
9.9.9.10/32 prefix not in TEST1 
9.9.9.11/32 prefix not in TEST1 
9.9.9.12/32 prefix not in TEST1 
NX02#
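
Conceptually, this comparison is a set difference keyed on the element chosen for the section (here ipprefix). The Python sketch below illustrates the idea under that assumption. The function compare_prefixes and the sample prefix lists are hypothetical, and the message for a prefix missing from the second snapshot is my guess at a symmetric wording, not captured device output.

```python
import ipaddress


def compare_prefixes(before, after, before_name, after_name):
    """List prefixes that appear in only one of two snapshots."""
    before_set, after_set = set(before), set(after)
    # Sort numerically so that 9.9.9.9/32 comes before 9.9.9.10/32.
    added = sorted(after_set - before_set, key=ipaddress.ip_network)
    removed = sorted(before_set - after_set, key=ipaddress.ip_network)
    report = [f"{prefix} prefix not in {before_name}" for prefix in added]
    report += [f"{prefix} prefix not in {after_name}" for prefix in removed]
    return report


# Hypothetical key lists taken from two snapshots.
test1 = ["10.10.10.0/24", "8.8.8.8/32"]
test2 = test1 + ["9.9.9.9/32", "9.9.9.10/32", "9.9.9.11/32", "9.9.9.12/32"]

for line in compare_prefixes(test1, test2, "TEST1", "TEST2"):
    print(line)
```

With these inputs, the four static routes are reported as not present in TEST1, in the same order as in the output above.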

 

Delete snapshots

To delete all snapshots:

NX02# snapshot delete ALL
All snapshots are successfully deleted
NX02#

 

More resources

Cisco Nexus 7K Series NX-OS Configuration Guide, Release 8.x – GIR Chapter

Cisco Nexus 9K Series NX-OS Configuration Guide, Release 7.x – GIR Chapter

Cisco Nexus 9000 Series GIR white paper (the case studies are great)

Cisco-Live Data center Operations and Maintenance Best Practices (BRKDCT-2458)

Cisco NX-OS Tips and Tricks

 


10 Comments

  1. riyan

    Hi Jerome,

    Interesting content! I have simulated this in a simulator and on a real device. When we go into maintenance mode on the local device (vpc domain – shutdown), why is the peer switch impacted, and why does the service experience downtime?

    • Hi Riyan,

      Thank you for your comment.

      Well, it could be due to several causes, I don’t know the context.
      On my side, I’ve used the maintenance mode several times in production and it was really transparent for directly attached devices in LACP/PortChannel to the vPC.

      Best,
      Jerome

  2. Hi Jerome, so I did a bit more digging and found the following command:
    “system mode maintenance on-reload reset-reason MAINTENANCE”

    This ensures that if it is in maint mode before being reset, it will come up in maint mode. There are various other options for ‘reset-reason’, but clearly this is the one we wanted!

    This seemed to do the trick and all went well with the move of 2 x 7k chassis.

    • Hi James,

      Thanks a lot for your comment!
      I just checked on my lab device, yes this is the option you are looking for:

      (config)# system mode maintenance ?

      always-use-custom-profile Always use custom profile when entering maintenance mode
      dont-generate-profile Do not generate the maintenance/normal-mode profile
      maint-delay Delay to allow protocol reroute before releasing CLI
      non-interactive Do operation non interactively in background
      on-reload On reload maintenance mode configuration
      shutdown Issue shutdown instead of isolate (default)
      snapshot-delay Delay after which after_maintenance snapshot will be taken
      timeout Restart maintenance mode timer with a new value

      Thank you for sharing this.
      Best Regards,
      Jerome

  3. James

    Hi, firstly thanks for the explanation. Do you know if the maintenance mode survives a reboot?
    So if we enable it on a switch, power it off, move it and then power it back on, will it still be in maintenance mode so we can gracefully re-insert it?
    The above mentions you can power off without impact, but when you power back on what mode should you expect?

    • Hi James,

      Thank you for your comment and question. That’s a very good question. I never tried to bring up a switch with system mode maintenance on after a reboot. When I had to power off the switch, I entered system mode maintenance and, once all changes were done, I powered it off without saving the config.

      But I think you can save the configuration with the system mode maintenance on state, yes.
      So, enter the system mode maintenance command, wait until all changes are done, do a write mem, and then you can power off the switch.
      When you power on the switch again, my guess is the config will be saved with the vpc shutdown, the routing protocols in isolate state, and so on.
      And then, after all links and routing protocol neighbors are up, you can do a “no system mode maintenance”.

      Please, add a comment here with your results if you tried.

      Thank you,
      Jerome

  4. Richard

    Great content – much better than the Cisco doc on the same topic.
    Does isolate not affect redistributed routes? Not sure if I’m hitting a bug on a N9K – 9.3.5.
    When I isolate the BGP process, the routes being redistributed from EIGRP stay in the table and show as advertised prefixes on the neighbor.

    • Hi Richard,

      Thank you for your comment.
      I never tried with BGP routes redistributed into EIGRP, so I cannot answer. But my guess is all BGP received routes should be removed, so not redistributed too.

      Best,
      Jerome
