Recovering a Cisco 3850 that’s stuck in Boot Loop/ROMMON

This wasn’t even after an upgrade, just a reload, but the messages below were displayed before an unmount and reboot of the switch:

Start type: SRV_OPTION_RESTART_STATELESS (23)
Death reason: SYSMGR_DEATH_REASON_FAILURE_SIGNAL (2)
Last heartbeat 0.00 secs ago

PID: 9225
Exit code: signal 11 (no core)

Quick steps:

1) Set up a local switch that has the correct image on it as a tftp server. This expedites the fix, as tftp is painfully slow over anything but a local link.

tftp-server flash:[image-name.bin]

2) Configure an interface with an IP to allow for direct connection to the management port of the broken switch.

interface GigabitEthernet1/0/47
 description -= temp for tftp to other switch =-
 no switchport
 ip address 192.168.0.1 255.255.255.0

3) Connect (or ask someone on site to connect) the interface to the management port of the broken switch (duh!)

4) On the broken switch, hold MODE for 10 seconds to interrupt the boot loop, or just wait for 5 or so failures for it to drop to ROMMON.

5) Set up IP info. GW only necessary if you’re tftping from a server outside the local subnet.

switch: set IP_ADDR 192.168.0.2/255.255.255.0

switch: set DEFAULT_ROUTER 192.168.0.1

6) Check for emergency files (you’re looking for cat3k_caa-recovery.bin or similar).

switch: dir sda9:

7) Ping the tftp server

switch: ping 192.168.0.1
ping 192.168.0.1 with 32 bytes of data ...
Up 1000 Mbps Full duplex (port  0) (SGMII)
Host 192.168.0.1 is alive.

8) Start the tftp emergency install. On a local connection this will take 10-20 mins.

switch: emergency-install tftp://192.168.0.1/cat3k_caa-universalk9.SPA.03.03.03.SE.150-1.EZ3.bin
The bootflash will be erased during install operation, continue (y/n)?y
Starting emergency recovery (tftp://192.168.0.1/cat3k_caa-universalk9.SPA.03.03.03.SE.150-1.EZ3.bin)...
Reading full image into memory......................done
Nova Bundle Image
--------------------------------------
Kernel Address    : 0x6042e5d8
Kernel Size       : 0x31794f/3242319
Initramfs Address : 0x60745f28
Initramfs Size    : 0xdbec9d/14412957
Compression Format: .mzip

Bootable image at @ ram:0x6042e5d8
Bootable image segment 0 address range [0x81100000, 0x81b80000] is in range [0x80180000, 0x90000000].
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
File "sda9:cat3k_caa-recovery.bin" uncompressed and installed, entry point: 0x811060f0
Loading Linux kernel with entry point 0x811060f0 ...
Bootloader: Done loading app on core_mask: 0xf

### Launching Linux Kernel (flags = 0x5)



Initiating Emergency Installation of bundle tftp://192.168.0.1/cat3k_caa-universalk9.SPA.03.03.03.SE.150-1.EZ3.bin


Downloading bundle tftp://192.168.0.1/cat3k_caa-universalk9.SPA.03.03.03.SE.150-1.EZ3.bin...

Validating bundle tftp://192.168.0.1/cat3k_caa-universalk9.SPA.03.03.03.SE.150-1.EZ3.bin...
Installing bundle tftp://192.168.0.1/cat3k_caa-universalk9.SPA.03.03.03.SE.150-1.EZ3.bin...
Verifying bundle tftp://192.168.0.1/cat3k_caa-universalk9.SPA.03.03.03.SE.150-1.EZ3.bin...
Package cat3k_caa-base.SPA.03.03.03SE.pkg is Digitally Signed
Package cat3k_caa-drivers.SPA.03.03.03SE.pkg is Digitally Signed
Package cat3k_caa-infra.SPA.03.03.03SE.pkg is Digitally Signed
Package cat3k_caa-iosd-universalk9.SPA.150-1.EZ3.pkg is Digitally Signed
Package cat3k_caa-platform.SPA.03.03.03SE.pkg is Digitally Signed
Package cat3k_caa-wcm.SPA.10.1.130.0.pkg is Digitally Signed
Preparing flash...
Syncing device...
Emergency Install successful... Rebooting
Restarting system.

Once this is done it’ll try to boot again, but it stops at the prompt with the message below. You need to disable manual boot and then boot it yourself.

The system is not configured to boot automatically.  The
following command will finish loading the operating system
software:

    boot

switch: set MANUAL_BOOT no
switch: boot

Also remember to change the confreg value if it’s not 0x102 on the 3850 (it’s the last line of show version). In this case it wasn’t needed.


Configuration register is 0x102

Don’t forget to remove the tftp-server config and temporary stuff. :)
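
As a quick sanity check for step 5, the question of whether the tftp server sits outside the local subnet (and so whether DEFAULT_ROUTER is needed at all) can be worked out with Python's standard ipaddress module. This is only a sketch: the function name needs_default_router is made up for illustration, and the addresses are the ones used in this post.

```python
import ipaddress


def needs_default_router(local_ip, netmask, server_ip):
    """Return True if server_ip is outside the subnet of local_ip/netmask."""
    # ip_interface accepts "address/netmask" and exposes the containing network.
    local_net = ipaddress.ip_interface(f"{local_ip}/{netmask}").network
    return ipaddress.ip_address(server_ip) not in local_net


if __name__ == "__main__":
    # Broken switch at 192.168.0.2/24, tftp server directly attached at 192.168.0.1.
    print(needs_default_router("192.168.0.2", "255.255.255.0", "192.168.0.1"))
```

With the direct connection used here the server is on the same /24, so the gateway setting is harmless but not required.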

Cisco 4900M ROMMON Recovery

Cisco 4900Ms are interesting platforms to work with. They are fantastic as layer 2 devices, but with MLS images, I’ve had some “challenges” in the past that resulted in the need for a ROMMON recovery. One of these issues was the switch deciding to format its flash after a reboot (wonderful!).

The ROMMON recovery procedure is a bit different to the norm (in that you use the management port) so it’s posted below. In this example the TFTP server is 10.10.10.48. All configuration is done via the console port on a remote terminal server.

rommon 1 >set interface fa1 10.0.0.1 255.255.255.0
rommon 2 >set ip route default 10.0.0.254
rommon 3 >boot tftp://10.10.10.48/cat4500e-entservicesk9-mz.122-54.SG1.bin

The switch then boots up via tftp, which may take quite a while depending on bandwidth and latency. As a reference, booting off that image via tftp over a Gig link with 70ms latency took 45 minutes, so you will have to be patient unless you have someone local with a laptop that can run a server to load the image from. You’ll see the following (love the Star Wars references on this platform!):

Tftp Session details are ....

 Filename     : /cat4500e-entservicesk9-mz.122-54.SG1.bin
 IP Address   : 10.0.0.1
 Loading from TftpServer: 10.10.10.48

 Received data packet #  50659

 Loaded 25936915 bytes successfully.

Rommon reg: 0x00004380
Reset2Reg: 0x00001FFF
#
Tatooine controller 0x0B46AD69..0x0B4D21F5 original size:0x0012635C##
Forerunner controller 0x0B4D21F6..0x0B593DC6 original size:0x001CE7CF
##################
diagsk5 version 4.1.6
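
A rough back-of-the-envelope check of why the tftp boot is so slow over a high-latency link. It assumes the default 512-byte TFTP block size from RFC 1350 and one block acknowledged per round trip (no blocksize or windowsize options negotiated, which is an assumption about this ROMMON rather than something verified here). The function names are illustrative.

```python
TFTP_BLOCK_SIZE = 512  # default data block size in RFC 1350


def tftp_block_count(size_bytes, block_size=TFTP_BLOCK_SIZE):
    """Number of DATA packets: full blocks plus one final short (or empty) block."""
    return size_bytes // block_size + 1


def lockstep_transfer_seconds(size_bytes, rtt_seconds, block_size=TFTP_BLOCK_SIZE):
    """Lower bound on transfer time when each block waits for its ACK."""
    return tftp_block_count(size_bytes, block_size) * rtt_seconds


if __name__ == "__main__":
    image_size = 25936915  # bytes, from the "Loaded ... bytes successfully" line
    print(tftp_block_count(image_size))
    print(lockstep_transfer_seconds(image_size, 0.070) / 60)  # minutes at 70ms RTT
```

The block count comes out at 50659, which matches the packet counter in the output above, so the 512-byte assumption looks right. At a 70ms round trip the estimate is roughly 59 minutes, the same ballpark as the 45 minutes seen here; latency, not bandwidth, is what dominates.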

You now have to get an image onto the switch, seeing as the IOS image is only in memory, not on flash. You will also need to create a basic config and save it. Use FTP at the very least to copy the image down or you will be waiting forever. As per a previous post, the config register must be changed on this platform or you will have headaches later!

Switch>en
Switch#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
Switch(config)#ip vrf mgmtVrf
Switch(config-vrf)#vtp domain null
Switch(config)#vtp mode transparent
Setting device to VTP Transparent mode for VLANS.
Switch(config)#interface FastEthernet1
Switch(config-if)# ip vrf forwarding mgmtVrf
Switch(config-if)# ip address 10.0.0.1 255.255.255.0
Switch(config-if)# speed auto
Switch(config-if)# duplex auto
Switch(config-if)#ip route vrf mgmtVrf 10.10.10.0 255.255.255.0 10.0.0.254
Switch(config)#ip ftp source Fa1
Switch(config)#end
Switch#copy ftp://user:password@10.10.10.48//tftpboot/cat4500e-entservicesk9-mz.122-54.SG1.bin bootflash:
Destination filename [cat4500e-entservicesk9-mz.122-54.SG1.bin]?
!!!!(truncated)
[OK - 25936915/4096 bytes]

25936915 bytes copied in 528.374 secs (49088 bytes/sec)
Switch#dir
Directory of bootflash:/

   19  -rw-    25936915  May 13 2013 09:42:20 +00:00  cat4500e-entservicesk9-mz.122-54.SG1.bin

122007552 bytes total (88629248 bytes free)

Switch#conf t
Switch(config)#boot system flash bootflash:cat4500e-entservicesk9-mz.122-54.SG1.bin
Switch(config)#config-register 0x2102
Switch(config)#end
Switch#wr mem
Building configuration...
Compressed configuration from 2735 bytes to 1154 bytes[OK]
Switch#sho bootvar
BOOT variable = bootflash:cat4500e-entservicesk9-mz.122-54.SG1.bin,1;
CONFIG_FILE variable does not exist
BOOTLDR variable does not exist
Configuration register is 0x102 (will be 0x2102 at next reload)
Switch#reload
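
For reference, the difference between 0x102 and 0x2102 is a single bit. The sketch below decodes a config register value using the bit meanings commonly documented for IOS: boot field in the low four bits, 0x0040 ignore startup config, 0x0100 break disabled, 0x2000 fall back to default ROM software if network boot fails. Treat that table as an assumption and check the platform documentation before relying on it; the names FLAGS and decode_confreg are illustrative.

```python
# Assumed bit meanings (generic IOS config register layout, not 4900M-specific).
FLAGS = {
    0x0040: "ignore startup-config",
    0x0100: "break disabled",
    0x2000: "boot default ROM software if network boot fails",
}


def decode_confreg(value):
    """Split a config register value into (boot_field, [flag descriptions])."""
    boot_field = value & 0x000F
    flags = [name for bit, name in sorted(FLAGS.items()) if value & bit]
    return boot_field, flags


if __name__ == "__main__":
    for reg in (0x102, 0x2102):
        boot_field, flags = decode_confreg(reg)
        print(hex(reg), "boot field:", boot_field, "flags:", flags)
    # The only bit that differs between the two values.
    print(hex(0x102 ^ 0x2102))
```

Both values have a boot field of 2 and break disabled; 0x2102 only adds the 0x2000 bit.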