ASA stuck on “Booting system, please wait…” after power cycle.

A Cisco ASA5550 was stuck on “Booting system, please wait…” after a power cycle (physically turning off and on, not just a reload). It was impossible to break into ROMMON from here.

After taking the cover off and doing some experimentation, the issue was found to be a faulty DIMM slot. We removed all the DIMMs and refitted them one at a time. As soon as a DIMM went into slot 1, the firewall failed to boot again. We swapped DIMMs 1 and 2 to rule out a bad module, still no joy.

Removing the DIMM from slot 1 again meant that the firewall came back up.

Solution: If an RMA is going to take time, bring the firewall back up with less memory. Otherwise, swap the thing out straight away as you don’t know what else the power cycle has fried!

You can make your life easier doing the replacement by moving the old ASA’s compact flash card to the external (disk1) slot of the new one. You can then get the right OS on the replacement quickly at least.

NEWASA# copy disk1:/[IOS-image-name.bin] disk0:/[IOS-image-name.bin]
NEWASA# configure terminal
NEWASA(config)# boot system disk0:/[IOS-image-name.bin]
NEWASA(config)# end
NEWASA# wr mem

Note: you can copy the running-config on the old ASA to a visible file on CF (eg: copy running-config disk0:/myconfig.cfg), but copying that from disk1 to running-config on the replacement tends to not work very well if you have TACACS config in place. Good old copy and paste from a backup is the way forward.

Splunk Cisco ASA App – Getting it working!

There are some apps on splunkbase for Cisco firewalls (in particular the Cisco Security Suite and the Cisco ASA App). They work well once set up, but there are a few gotchas that can stop the ASA app from working.

Prerequisites: Install the latest Sideview Utils from http://sideviewapps.com/apps/sideview-utils and install the Google Maps app from splunkbase.

1) Ensure that you have a “firewall” index created and searchable by the appropriate roles. Be careful if the firewall index is owned by another app; if you remove that app then the index will disappear and you’ll wonder why this one is no longer working!

2) Ensure that the source is being tagged for the “firewall” index (if using a forwarder, you need to set index = firewall in the monitor statement)

3) Copy the etc/apps/Splunk_CiscoFirewalls/default/transforms.conf and props.conf files into the etc/apps/Splunk_CiscoFirewalls/local directory, and edit the local version of transforms.conf so that the asa sourcetype is correctly set. The log prefix seems to depend on the ASA software version, so the stanza below has two REGEX lines, one of which is commented out. You may need to swap these around: certainly on 8.2 the log format is ASA- and not ASA--

[force_sourcetype_for_cisco_asa]
DEST_KEY = MetaData:Sourcetype
REGEX = %ASA-\d+-\d+
#REGEX = %ASA--\d+-\d+
FORMAT = sourcetype::cisco_asa

If you really need to cater for both eventualities, then you could use:

REGEX = %ASA--?\d+-\d+
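Before editing transforms.conf it can help to check which prefix your logs actually use. The sketch below runs the three patterns above over a few lines, using Python's re module purely as a test harness (Splunk itself uses PCRE, but these simple patterns behave the same way). The sample log lines, hostnames and message text are made up for illustration and are not taken from a real device.

```python
import re

# The three patterns discussed above.
SINGLE = re.compile(r"%ASA-\d+-\d+")
DOUBLE = re.compile(r"%ASA--\d+-\d+")
EITHER = re.compile(r"%ASA--?\d+-\d+")

# Illustrative sample lines only (hypothetical hosts and messages).
samples = [
    "Jan 10 12:00:01 fw01 %ASA-6-302013: Built outbound TCP connection",
    "Jan 10 12:00:02 fw02 %ASA--6-302013: Built outbound TCP connection",
    "Jan 10 12:00:03 sw01 %SYS-5-CONFIG_I: Configured from console",
]

def classify(line):
    """Report which of the three patterns match a log line."""
    return {
        "single": bool(SINGLE.search(line)),
        "double": bool(DOUBLE.search(line)),
        "either": bool(EITHER.search(line)),
    }

results = [classify(s) for s in samples]
for line, result in zip(samples, results):
    print(result, line)
```

If only the "either" column is true for all of your ASA lines, the combined pattern is the safer choice.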

4) I also came across an issue where the sourcetypes were being correctly set, but the host field was incorrectly being detected as the machine running my light forwarder. I got around this by editing the etc/apps/Splunk_CiscoFirewalls/local/props.conf file and changing the first TRANSFORMS line, adding syslog-host as the final entry:

#[source::...cisco]
TRANSFORMS-force-sourcetype_for_cisco_devices = force_sourcetype_for_cisco_pix, force_sourcetype_for_cisco_asa, force_sourcetype_for_cisco_fwsm, force_sourcetype_for_cisco_acs, force_sourcetype_for_cisco_ios, force_sourcetype_for_cisco_catchall, syslog-host

5) This app also has a cisco “catch-all” sourcetype formatter which may cause problems with other apps (eg: they might expect sourcetype=syslog or cisco_syslog). You may want to comment this out because it’s not exhaustive, so only some of your cisco logs get re-tagged and the rest keep their original sourcetype:

#[force_sourcetype_for_cisco_catchall]
#DEST_KEY = MetaData:Sourcetype
#REGEX = :\s\%((SNMP|CDP|FAN|LINE|LINEPROTO|RTD|SYS|C\d+_[^-]+)-\d+-\S+)
#FORMAT = sourcetype::cisco

Splunk Field Extractions for Juniper SRX

The first two were found elsewhere on the web but I noticed there was no deny event extraction so made my own.

To make your SRX send syslogs, the following example can be modified. You might find it easier to use local facilities to split out your logs by type using syslog-ng. Be sure to monitor performance as you enable logging: lots of logging on an extremely busy firewall may generate a fair bit of extra CPU overhead.

SRX Config

    syslog {
        host 192.168.1.100 {
            any any;
            match RT_FLOW_SESSION;
            facility-override local5;
            source-address 10.0.1.254;
        }
    }

Set your desired policies to log… eg:

edit security policies from-zone trust to-zone untrust
set policy web-traffic-outbound then log session-init session-close
set policy default-drop-trust-untrust then log session-init session-close

Splunk Extractions

Create events

RT_FLOW_SESSION_CREATE:\ssession\screated\s(?P<srx_src_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_src_port>\d+)\D+(?P<srx_dst_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_dst_port>\d+)\s(?P<srx_svc_name>\S+)\s(?P<srx_nat_src_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_nat_src_port>\d+)\D+(?P<srx_nat_dst_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_nat_dst_port>\d+)\s(?P<srx_src_nat_rule_name>\S+)\s(?P<srx_dst_nat_rule_name>\S+)\s(?P<srx_protocol_id>\d+)\s(?P<srx_policy_name>\S+)\s(?P<srx_src_zone>\S+)\s(?P<srx_dst_zone>\S+)\s(?P<srx_sess_id>\d+) 

Close events

RT_FLOW_SESSION_CLOSE:\ssession\sclosed\s(?P<srx_closed_reason>[^:]+)\D+(?P<srx_src_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_src_port>\d+)\D+(?P<srx_dst_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_dst_port>\d+)\s(?P<srx_svc_name>\S+)\s(?P<srx_nat_src_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_nat_src_port>\d+)\D+(?P<srx_nat_dst_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_nat_dst_port>\d+)\s(?P<srx_src_nat_rule_name>\S+)\s(?P<srx_dst_nat_rule_name>\S+)\s(?P<srx_protocol_id>\d+)\s(?P<srx_policy_name>\S+)\s(?P<srx_src_zone>\S+)\s(?P<srx_dst_zone>\S+)\s(?P<srx_sess_id>\d+)\s(?P<srx_pkts_from_client>\d+)\((?P<srx_bytes_from_client>\d+)\)\s(?P<srx_pkts_from_server>\d+)\((?P<srx_bytes_from_server>\d+)\)\s(?P<srx_sess_elapsed_time>\d+) 

Deny Events

RT_FLOW_SESSION_DENY:\ssession\sdenied\s(?P<srx_src_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_src_port>\d+)\D+(?P<srx_dst_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_dst_port>\d+)\s(?P<srx_svc_name>\S+)\s(?P<srx_protocol_id>\d+)\((?P<srx_icmp_type>\d+)\)\s(?P<srx_policy_name>\S+)\s(?P<srx_src_zone>\S+)\s(?P<srx_dst_zone>\S+) 

Policy Action Field (common)

RT_FLOW:\s\S+:\ssession\s(?P<srx_policy_action>\S+) 
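To sanity-check an extraction before pasting it into Splunk, you can run it against a captured log line. The sketch below does this for the deny and policy-action patterns using Python's re module, which accepts the same (?P<name>...) named-group syntax. The sample line is hypothetical: it was written to fit the pattern, so test against real output from your own SRX before trusting it.

```python
import re

# Deny-event and policy-action patterns, copied from the extractions above.
DENY = re.compile(
    r"RT_FLOW_SESSION_DENY:\ssession\sdenied\s"
    r"(?P<srx_src_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_src_port>\d+)\D+"
    r"(?P<srx_dst_ip>\d+\.\d+\.\d+\.\d+)\/(?P<srx_dst_port>\d+)\s"
    r"(?P<srx_svc_name>\S+)\s(?P<srx_protocol_id>\d+)\((?P<srx_icmp_type>\d+)\)\s"
    r"(?P<srx_policy_name>\S+)\s(?P<srx_src_zone>\S+)\s(?P<srx_dst_zone>\S+)"
)
ACTION = re.compile(r"RT_FLOW:\s\S+:\ssession\s(?P<srx_policy_action>\S+)")

# Hypothetical log line, constructed for illustration only.
sample = (
    "Jan 10 12:00:01 srx01 RT_FLOW: RT_FLOW_SESSION_DENY: session denied "
    "10.0.1.20/51234->192.168.1.100/514 None 17(0) "
    "default-drop-trust-untrust trust untrust"
)

def extract(line):
    """Apply each pattern in turn and merge the named groups it captures."""
    fields = {}
    for pattern in (DENY, ACTION):
        match = pattern.search(line)
        if match:
            fields.update(match.groupdict())
    return fields

fields = extract(sample)
for name, value in sorted(fields.items()):
    print(f"{name:22} {value}")
```

An empty result means the pattern did not match, which usually points to a log format difference between Junos versions.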

Next personal project: Write a decent Splunk app for SRX!

Checkpoint Firewall high interrupt CPU%

When this issue occurred, top showed that the large majority of CPU time was being spent servicing interrupts, despite low traffic levels. Failing over to the secondary member of the cluster did not fix the problem; the fault simply moved to the other member. This issue can be reproduced on Nokia IP Appliances running IPSO and newer Checkpoint platforms running Gaia.

last pid: 59653;  load averages:  0.05,  0.07,  0.02   up 571+16:11:35 12:19:30
45 processes:  1 running, 44 sleeping
CPU states:  1.8% user,  0.0% nice,  1.8% system, 86.1% interrupt, 10.4% idle
Mem: 248M Active, 1321M Inact, 218M Wired, 72M Cache, 99M Buf, 143M Free
Swap: 4096M Total, 4096M Free

ps -aux was showing high CPU time consumed by [swi1: net_taskq0].

cpfw[admin]# ps -aux
USER   PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
root    14 98.2  0.0     0    16  ??  RL   10Feb12 65517:46.72 [swi1: net_taskq0]

Running netstat -ni showed errors incrementing on a few interfaces. At first this seemed like a hardware issue so failover to secondary was initiated. The problem moved to the other firewall.

After more digging, the culprit was found to be some new traffic streams of low bandwidth, but extremely high packet rate (in this case, some UDP syslog forwarding to a host beyond the firewall). A layer 3 switch at the source end was also having some issues so some of the traffic patterns may have been anomalous, compounding the issue.

This traffic was not permitted on the firewall, so it was being matched by the drop rule. Having a large rule base seems to make this issue even worse, as traffic arriving at a rate of thousands of packets per second consumes a lot of CPU cycles. Adding a rule to permit the traffic near the top of the rule base dropped CPU usage significantly.

It makes sense to assume that as these streams are hitting the drop rule very frequently, rapid evaluations of the entire rulebase are taking place. The handling of “flows” for UDP traffic is probably more limited than is implied in IPSO/Gaia documentation.
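A back-of-envelope model shows why rule position would matter under that assumption. The sketch below is a toy calculation only: it assumes every packet is compared against rules top-down until the first match, which is the hypothesis above and not a documented description of how IPSO or Gaia evaluate traffic. The packet rate and rulebase size are hypothetical figures.

```python
def comparisons_per_second(packets_per_second, matching_rule_position):
    """Rule comparisons per second if each packet is checked top-down
    until its first matching rule (a simplifying assumption)."""
    return packets_per_second * matching_rule_position

# Hypothetical figures: a 5,000 pps stream against a 400-rule policy.
PPS = 5000
RULEBASE_SIZE = 400

# Stream falls all the way through to the final drop rule.
at_drop_rule = comparisons_per_second(PPS, RULEBASE_SIZE)
# Same stream matched by an explicit rule placed at position 5.
near_top = comparisons_per_second(PPS, 5)

print(f"hitting final drop rule: {at_drop_rule:,} comparisons/s")
print(f"matched at rule 5:       {near_top:,} comparisons/s")
print(f"reduction factor:        {at_drop_rule / near_top:.0f}x")
```

Real firewalls have connection tables and acceleration paths that this ignores, so treat the numbers as an illustration of the trend and not a prediction.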

It is worth enabling monitoring and finding this sort of traffic to allow you to create or move appropriate rules near the top of the rulebase to avoid unnecessary extra processing, especially if your rulebase is in the order of hundreds of rules.
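Once you have drop logs exported somewhere, finding the noisy streams is just a counting exercise. The sketch below is one minimal way to do it, assuming the logs have already been parsed into (source, destination, protocol, port) tuples. The record layout and the addresses are hypothetical and do not reflect any particular Checkpoint export format.

```python
from collections import Counter

# Hypothetical parsed drop records: (src IP, dst IP, protocol, dst port).
drops = [
    ("10.1.1.5", "172.16.0.9", "udp", 514),
    ("10.1.1.5", "172.16.0.9", "udp", 514),
    ("10.1.1.5", "172.16.0.9", "udp", 514),
    ("10.1.2.7", "172.16.0.20", "tcp", 445),
    ("10.1.1.5", "172.16.0.9", "udp", 514),
    ("10.1.3.8", "172.16.0.9", "udp", 161),
]

def top_dropped_flows(records, n=3):
    """Return the n most frequently dropped flows with their hit counts."""
    return Counter(records).most_common(n)

top = top_dropped_flows(drops)
for (src, dst, proto, port), count in top:
    print(f"{count:6d}  {src} -> {dst} {proto}/{port}")
```

Flows that dominate the list are the candidates for an explicit rule near the top of the rulebase.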

I suppose you could conclude that you could quite easily DoS a policy-heavy Checkpoint firewall by throwing a rapid stream of UDP packets at a far-side destination that doesn’t match anywhere in the rulebase. Note that this issue was encountered on an internal firewall where IPS was NOT enabled. IPS may mitigate this problem.

Resolve MAC addresses to Port, IP and DNS Name

Resolving a MAC address to port, IP and DNS or name service name (or more simply for some, resolving MAC to name) is a challenge that every network engineer has come across at some point in their career. It’s easily solved with a bit of thought and logic. Unfortunately the last few products I’ve dealt with for this purpose have either been abandoned or aren’t as multi-vendor as I’d like, so it seems that the only solution is to write your own… bash and expect are sufficient.

If you’re thinking about doing this (and it’s a great learning exercise), you need to get around the following:

– Determining which interfaces are trunks on the switches so you can strip those MAC entries out (CDP works quite well)
– Converting ARP and MAC info into a “clean” format (eg: CatOS and IOS output are in different formats)
– Detecting the fields across various pieces of hardware as display output isn’t always consistent for the same commands
– Inconsistent logins/passwords
– Correlating the IP/MAC/Interface information together. This can be done with the UNIX join command and some awk/sed
– What you do with MACs that don’t resolve to an IP address (I include a flag to print these if required)
– Whether the machine you run DNS queries on will be able to resolve the IPs to PTR records
– If using expect, stripping out stray characters (eg \r) that will mess up your greps and other string searches
– Add plenty of debugging so you can quickly tell why something isn’t working properly
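Several of the points above (cleaning up MAC formats, stripping stray \r characters, dropping trunk ports, joining MAC and ARP data, flagging unresolved MACs) can be sketched in a few lines. My own version uses bash, expect, join and awk; the Python below is only an illustration of the same logic, not that script. The function names, the tuple layouts and the NO_IP placeholder are all assumptions made for this example, and it expects the switch output to have been parsed into tuples already.

```python
import re

def normalize_mac(mac):
    """Convert any common MAC notation to dotted form (xxxx.xxxx.xxxx)."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))

def clean(text):
    """Strip stray carriage returns left behind by expect sessions."""
    return text.replace("\r", "")

def correlate(mac_table, arp_table, trunk_ports, show_unresolved=False):
    """Join MAC-table rows to ARP entries on the normalized MAC.

    mac_table:   iterable of (switch, interface, vlan, mac)
    arp_table:   iterable of (ip, mac)
    trunk_ports: set of (switch, interface) pairs to ignore
    """
    ip_by_mac = {normalize_mac(clean(mac)): clean(ip) for ip, mac in arp_table}
    rows = []
    for switch, interface, vlan, mac in mac_table:
        if (switch, interface) in trunk_ports:
            continue  # MACs learned on trunks belong to other switches
        key = normalize_mac(clean(mac))
        ip = ip_by_mac.get(key)
        if ip is None and not show_unresolved:
            continue
        rows.append((switch, interface, vlan, key, ip or "NO_IP"))
    return rows

# Hypothetical inputs in mixed formats, for illustration only.
mac_table = [
    ("nycsw12", "Fa3/10", "100", "00-60-b0-aa-00-00\r"),
    ("nycsw12", "Gi1/1", "100", "2c41.389e.d19f"),  # trunk port, skipped
    ("nycsw13", "Fa2/4", "100", "5C:26:0A:01:0A:C4"),
]
arp_table = [("192.168.10.30", "0060.b0aa.0000"), ("192.168.10.99", "2c41.389e.d19f")]
trunks = {("nycsw12", "Gi1/1")}

rows = correlate(mac_table, arp_table, trunks, show_unresolved=True)
for row in rows:
    print("{:<12} {:<10} {:<5} {:<15} {}".format(*row))
```

Normalizing both tables to one MAC notation before joining is what makes the lookup reliable across CatOS-style and IOS-style output.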

I used expect to go and grab the ARP, CDP and MAC information seeing as you can’t get all the required information from SNMP on many devices these days. In my case, this results in the following type of output:

Switch       Interface       VLAN  MAC             IP               DNSName
nycsw12      Fa3/10          100   0060.b0aa.0000  192.168.10.30    NO_DNS
nycsw12      Fa2/16          99    1060.4b61.0001  192.168.9.72     nyc-pc573.company.corp.
nycsw12      Fa2/37          101   1060.4b64.0002  192.168.11.78    nyc-pc555.company.corp.
nycsw12      Fa2/42          101   1060.4b68.0003  192.168.11.115   nyc-pc572.company.corp.
nycsw12      Fa2/45          98    1060.4b6a.0004  192.168.8.99     nyc-pc588.company.corp.
nycsw12      Fa2/32          98    1060.4b6a.0005  192.168.8.121    nyc-pc601.company.corp.
nycsw12      Fa3/3           100   2c41.389e.d19f  192.168.10.99    nyc-pc480.company.corp.
nycsw13      Fa2/4           100   5c26.0a01.0ac4  192.168.10.67    nyc-pc246.company.corp.
nycsw13      Fa2/6           100   6c3b.e531.2ddf  192.168.10.85    nyc-pc745.company.corp.

Of course, you can always just use Excel to do a VLOOKUP of your mac-address table output against a sorted table containing all your arp entries, but that’s a bit less automatic.