Do NOT install HP OMI agent v12 on NNMi9 Servers!

As per title, do NOT install/upgrade OMI agent on NNMi boxes to v12, as this will overwrite files in the /opt/OV/nonOV/perl directory. NNMi9 relies on Per 5.8.8 however 5.16.0 files will be installed when the OMI agent is installed.

Likely the first issues you will see will be NNMi self monitoring alerts, and then you’ll notice that you can’t run any of the .ovpl commands any more, getting a message like this:

Can't locate OVNNMvars.pm in @INC (@INC contains: /opt/OV/nonOV/perl/a/lib/site_perl/5.16.0/x86_64-linux-thread-multi /opt/OV/nonOV/perl/a/lib/site_perl/5.16.0 /opt/OV/nonOV/perl/a/lib/5.16.0/x86_64-linux-thread-multi /opt/OV/nonOV/perl/a/lib/5.16.0 /opt/OV/nonOV/perl/a/lib/site_perl/5.16.0/x86_64-linux-thread-multi /opt/OV/nonOV/perl/a/lib/site_perl/5.16.0 /opt/OV/nonOV/perl/a/lib/site_perl .) at /opt/OV/bin/nnmversion.ovpl line 19.
 BEGIN failed--compilation aborted at /opt/OV/bin/nnmversion.ovpl line 19.

It wasn’t fun getting back to a known state but we managed in the end by looking at what files the agent had installed/modified and restoring backups. Avoid the pain by not getting into the situation in the first place – also make sure you’re not just backing up data files but binaries as well.

The latest version of NNM10 uses Perl 5.16.0 I believe so this shouldn’t be an issue. I’d personally still avoid installing any product that uses the same file structure.

NNMi 9.24 Custom Poller Bus Adapter Errors

I was getting the following alerts regarding the custom poller on NNMi which was annoying users with the constant on/off status alerts.

“The Performance SPI Custom Poller Bus Adapter has status Critical because the average input queue duration…..”
“The CustomPoller Export Bus Adapter has status Minor because file space limit (2,000 MB) has been reached and older export data files are being removed to make room for new files.”

After some digging, it seems that the older files weren’t being deleted so this issue had actually crept up over time.

Apparently this was fixed by patch 5 but a workaround until then is to delete older files manually as follows. I strongly suggest running the find command without the “| xargs rm” part first to verify that you are indeed only finding regular files within the correct directory.

# cd /var/opt/OV/shared/nnm/databases/custompoller/export/final
# find . -mtime +365 -type f | xargs rm

Check the usage is under the size limit:

du -hs .
649M    .

Spoofing SNMP Traps for testing

Somehow I missed the fact that JunOS allows you to spoof SNMP traps. I discovered this recently and must say it’s very handy, especially when testing new NNMi or other NMS incident configurations. It helpfully populates the varbinds for you with preset values (although you can specify them if desired).

This is done as follows from the JunOS command line:

user@SRX> request snmp spoof-trap ospfNbrStateChange variable-bindings "ospfNbrState = 8"

You can look up varbind names in the appropriate MIB.

A bit easier than the traditional equivalent in shell:

/usr/bin/snmptrap -v 2c -c mycommunity nms.mydomain.com:162 '' .1.3.6.1.2.1.14.16.2.0.2 \
.1.3.6.1.2.1.14.1.1 a 0.0.0.0 \
.1.3.6.1.2.1.14.10.1.1 a "1.2.3.4" \
.1.3.6.1.2.1.14.10.1.2 i "0" \
.1.3.6.1.2.1.14.10.1.3 a "2.3.4.5" \
.1.3.6.1.2.1.14.10.1.6 i "8"

HP NNMi9 False Redundancy Group alerts/Cisco Nexus Duplicate Discoveries

It seems that HP are getting around to fixing the issue of duplicate Cisco Nexus nodes being discovered (it’s not in patch 5) but until then, it’s possible to work around this. Duplicate discoveries play havoc with RRG alerting which isn’t funny when someone’s woken up in the middle of the night for it.

To stop NNMi discovering duplicate nodes once you have the devices in your topology, do the following:

1) Create an Auto-Discovery rule with the lowest ordering in the list (eg: No-AutoDiscover-Rule)
2) Uncheck all options on the left hand pane and uncheck Enable Ping Sweep
3) Add the IP addresses of all the mgmt0 interfaces in the management VRF (Type: Include in rule)

HP NNMi LDAP user account Can’t log in

I had a strange one today where a user set as a Directory service account couldn’t log in. No matter which computer or browser the user was coming from, he couldn’t log in and got the odd message below. The only thing that had happened is that he’d changed his password recently:

ldaperror

I’ve had issues in the past where the AD password has been changed to one that’s been used previously – NNMi must do some sort of caching – but this was different. I was assured that the password hadn’t been used before.

Solution:

1) Delete the User account
2) Recreate the user account and mappings but create it as a local user (uncheck Directory service account and put in a temporary password)
3) Log on with the user name/password from a PC that the user hasn’t used before (ie: your PC)
4) Change account back to a Directory Service Account

The problem should be rectified. Clearing temporary internet files, cookies etc didn’t seem to help at all so again, I think there’s some caching going on within NNMi that breaks things.

Very annoying and I’ll be raising a bug report.