
NPS Loadbalancing and Monitoring

Last updated 21 Oct 2022

A guide to loadbalancing Microsoft's RADIUS implementation, Network Policy Server (NPS), with Loadbalancer.org appliances and monitoring it with Nagios Core and FreeRADIUS.

Microsoft NPS

This assumes you have already installed the NPS role on the real servers and have an SSL certificate with SANs covering the server names and the service name (eg nps01.domain.com, nps02.domain.com and nps.domain.com), even though the certificate will not be used by Nagios or the loadbalancers. You also need to create an Active Directory (AD) group (eg NPS-Monitor) and an AD user (eg npsmonitor01) that is a member of that group.

From Administrative Tools, open Network Policy Server. Under Templates Management, create a new Shared Secrets template (eg NPS Monitor - SS) containing a shared secret. Then under RADIUS Clients and Servers, create your RADIUS clients and select the NPS Monitor - SS template you just created. If your loadbalancers have multiple NICs and/or are in a clustered configuration, you will need to add a client for each NIC on each appliance. Under Policies, create a new Connection Request Policy (eg NPS Monitor - CP) with the condition that User Name has the value of your AD user (eg npsmonitor01).

Create a new Network Policy (eg NPS Monitor - NP) with the following settings:

Ignore User Dial-In Properties  False
Access Permission               Grant Access
Authentication Method           Unencrypted authentication (PAP, SPAP)
Framed-Protocol                 PPP
Service-Type                    Framed
BAP Percentage of Capacity      Reduce Multilink if server reaches 50% for 2 minutes
Encryption Policy               Enabled
Encryption                      Basic encryption (MPPE 40-bit), Strong encryption (MPPE 56-bit), Strongest encryption (MPPE 128-bit)

Assign an IP address for the loadbalanced service and create a loopback adapter with this address on each NPS server: run hdwwiz, choose to manually select the hardware from a list, then pick Network adapters > Microsoft > Microsoft KM-TEST Loopback Adapter. Once it's created, edit its properties and untick everything except Internet Protocol Version 4 (TCP/IPv4). Edit the IPv4 properties, set the IP address to your loadbalanced service address with subnet mask 255.255.255.255 and, under Advanced > DNS, untick the option to register this address in DNS.

In the NPS management tool, confirm the service is not restricted to a single IP address of the server. By default it is not limited to any specific interface, but you can check under NPS (Local) > Properties > Ports and confirm that only the port numbers are listed, without IP addresses (as described in Microsoft's documentation). Then restart the IAS service (displayed as Network Policy Server) and use netstat to confirm the server is listening on the NPS ports on both addresses.

If you have accounting set up to log to SQL, you need to give the computer account (eg DOMAIN.COM\SERVERNAME$) permissions on the database. You should also set up a job to clear the entries generated by the loadbalancing/monitoring connection request policy, otherwise your logs will fill up with them. NPS can be told to leave a username out of the accounting logs by adding a 'Ping user-name' entry to the registry of the NPS servers (as described in Microsoft's documentation), but that can't be used here: it also makes NPS reject the authentication request, while the FreeRADIUS checks from Nagios and the Loadbalancer.org health checks require the authentication request to succeed.

Loadbalancer.org

On the loadbalancers, create a new Layer 4 service on the UDP ports defined in the NPS management tool under NPS (Local) > Properties > Ports (the defaults are 1812 and 1813), with the forwarding method set to Direct Routing. The official Loadbalancer.org guidance for loadbalancing Always-On VPN recommends a session persistence of 28800 seconds, but I think you could get away with a much lower value. Use a Negotiate health check on port 1812 with the RADIUS (IPv4 only) protocol and enter the credentials for the AD account npsmonitor01 along with the shared secret defined in the shared secret template.

Ensure the AD account is either configured to never be locked out or restricted so it can only log in to the NPS servers. If the health check supplies the wrong password or shared secret, the account can very quickly get locked out, which will cause all future checks to fail and the loadbalanced service to go down. On the NPS servers, you can check the logs to see whether the health checks are failing (eg due to a locked account, or the request not matching a RADIUS client or network policy) under Event Viewer > Custom Views > ServerRoles > Network Policy and Access Services. If the loadbalancers show erratic behaviour, with the real servers being constantly added to and removed from the service, the appliances may require an extra CPU.
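To make it clearer why the health check needs both the account password and the shared secret, here is a minimal Python sketch of a PAP Access-Request as described in RFC 2865. It is not how Loadbalancer.org implements its health check (I don't know that); the function names, the nagios NAS-Identifier and the example credentials are all assumptions made up for illustration.

```python
import hashlib
import os
import struct

ACCESS_REQUEST = 1        # RADIUS packet code (RFC 2865)
ATTR_USER_NAME = 1
ATTR_USER_PASSWORD = 2
ATTR_NAS_IDENTIFIER = 32


def hide_password(password: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Obfuscate a PAP password with the shared secret (RFC 2865 section 5.2)."""
    # Pad with NULs to a multiple of 16 bytes, using at least one block.
    padded = password + b"\x00" * (-len(password) % 16) or b"\x00" * 16
    hidden, previous = b"", authenticator
    for i in range(0, len(padded), 16):
        digest = hashlib.md5(secret + previous).digest()
        block = bytes(p ^ d for p, d in zip(padded[i:i + 16], digest))
        hidden += block
        previous = block
    return hidden


def unhide_password(hidden: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Reverse hide_password, as the RADIUS server does with its copy of the secret."""
    plain, previous = b"", authenticator
    for i in range(0, len(hidden), 16):
        digest = hashlib.md5(secret + previous).digest()
        block = hidden[i:i + 16]
        plain += bytes(h ^ d for h, d in zip(block, digest))
        previous = block
    return plain.rstrip(b"\x00")


def attribute(attr_type: int, value: bytes) -> bytes:
    """Encode one attribute as type, length (including the 2 header bytes), value."""
    return bytes([attr_type, len(value) + 2]) + value


def build_access_request(username, password, secret, identifier=0,
                         authenticator=None, nas_identifier="nagios"):
    """Build the bytes of an Access-Request carrying PAP credentials."""
    if authenticator is None:
        authenticator = os.urandom(16)
    attrs = attribute(ATTR_USER_NAME, username.encode())
    attrs += attribute(ATTR_USER_PASSWORD,
                       hide_password(password.encode(), secret.encode(), authenticator))
    attrs += attribute(ATTR_NAS_IDENTIFIER, nas_identifier.encode())
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, 20 + len(attrs))
    return header + authenticator + attrs


# Hypothetical credentials, not real ones.
packet = build_access_request("npsmonitor01", "example-password", "example-secret")
print(len(packet), "byte Access-Request")
```

Because the password is only XORed with a hash of the shared secret, a server holding a different secret recovers a garbage password rather than an error, which fits with a wrong shared secret being able to lock the account out just like a wrong password.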

Nagios

If you also want to monitor it in Nagios, make sure you create a new RADIUS client on the NPS servers and create a new shared secret to use.

Ensure radiusclient-ng is installed, then set up a config file (eg /etc/radiusclient-ng/radiusclient.conf) in which the authserver and acctserver settings define your NPS servers and the servers setting points to a file containing the shared secrets for those servers. The relevant config file settings should look like this:

# RADIUS server to use for authentication requests. this config
# item can appear more than one time. if multiple servers are
# defined they are tried in a round robin fashion if one
# server is not answering.
# optionally you can specify the port number on which the remote
# RADIUS listens separated by a colon from the hostname. if
# no port is specified /etc/services is consulted for the radius
# service. if this fails also a compiled in default is used.
authserver      nps01.domain.com
authserver      nps02.domain.com
authserver      nps.domain.com


# RADIUS server to use for accounting requests. All that I
# said for authserver applies, too.
acctserver      nps01.domain.com
acctserver      nps02.domain.com
acctserver      nps.domain.com

# file holding shared secrets used for the communication
# between the RADIUS client and server
servers         /etc/radiusclient-ng/servers

Then your servers file should look something like this:

#Server Name or Client/Server pair              Key
#----------------                               ---------------
nps01.domain.com                                LSLvsacklSKL92nl-csk
nps02.domain.com                                LSLvsacklSKL92nl-csk
nps.domain.com                                  LSLvsacklSKL92nl-csk
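It's easy to add a server to one file and forget the other, so here is a rough Python sketch that lists any authserver or acctserver without a matching line in the servers file. The function names are my own, the parsing only covers the simple layouts shown above, and the secrets are placeholders.

```python
def parse_radius_servers(conf_text):
    """Return hostnames from authserver/acctserver lines, dropping any :port suffix."""
    hosts = set()
    for line in conf_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # comment line
        parts = line.split()
        if len(parts) >= 2 and parts[0] in ("authserver", "acctserver"):
            hosts.add(parts[1].split(":", 1)[0])
    return hosts


def parse_secret_hosts(servers_text):
    """Return the server names that have a shared secret in the servers file."""
    hosts = set()
    for line in servers_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # comment line
        parts = line.split()
        if len(parts) >= 2:
            # In a "client/server" pair the server is named after the slash.
            hosts.add(parts[0].split("/")[-1])
    return hosts


def missing_secrets(conf_text, servers_text):
    """List configured servers that have no entry in the servers file."""
    return sorted(parse_radius_servers(conf_text) - parse_secret_hosts(servers_text))


# Example data: the VIP has been left out of the servers file by mistake.
EXAMPLE_CONF = """
authserver      nps01.domain.com
authserver      nps02.domain.com
authserver      nps.domain.com
acctserver      nps01.domain.com
servers         /etc/radiusclient-ng/servers
"""

EXAMPLE_SERVERS = """
#Server Name or Client/Server pair              Key
nps01.domain.com                                example-secret
nps02.domain.com                                example-secret
"""

print(missing_secrets(EXAMPLE_CONF, EXAMPLE_SERVERS))
```

In practice you would read the two files from disk and pass their contents in, rather than using inline strings.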

Make sure you have the check_radius plugin for Nagios then define a command using it:

define command {
  command_name    check_nps
  command_line    $USER1$/check_radius -F /etc/radiusclient-ng/radiusclient.conf -H $ARG1$.domain.com -P 1812 -u $_HOSTNPSUSER$ -p $_HOSTNPSPASS$
}

I've used host macros in this example so that I can use a single service check with variables in the host definitions (I've also manually entered the expiry date of the NPS SSL certificate so it can be monitored as well).

define hostgroup {
  hostgroup_name  nps-servers
  alias           Network Policy Servers
}

define host {
  use             windows-server

  hostgroups      windows-servers,nps-servers,aovpn-group
  host_name       nps01
  alias           NPS01
  address         10.0.0.10
  notes           Network Policy Server
  _npsuser        npsmonitor01
  _npspass        xLiASCPaeocje098
  _npsvip         nps
  _sslcrt         2021-10-28
}

Now we can build a service check against the hosts and also one against the VIP:

define service {
  service_description     NPS Availability
  hostgroup_name          nps-servers
  use                     generic-service
  check_command           check_nps!$HOSTNAME$
  notes                   Checks NPS availability required for AOVPN clients
  check_interval          15
  contacts                +admin
}

define service {
  service_description     NPS VIP Availability
  hostgroup_name          nps-servers
  use                     generic-service
  check_command           check_nps!$_HOSTNPSVIP$
  notes                   Checks NPS VIP $_HOSTNPSVIP$.domain.com availability required for AOVPN clients
  check_interval          15
  contacts                +admin
}
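If it isn't obvious how the command, the host variables and the two service checks fit together, this Python sketch imitates the macro substitution: custom host variables such as _npsuser become $_HOSTNPSUSER$, and whatever follows the ! in check_command becomes $ARG1$. It is only an illustration and not Nagios code; the $USER1$ path and the password are assumed values.

```python
import re


def expand_macros(text, macros):
    """Replace each $NAME$ with its value; unknown macros are left as they are."""
    return re.sub(r"\$([A-Za-z0-9_]+)\$",
                  lambda m: macros.get(m.group(1), m.group(0)), text)


def host_macros(host):
    """Build the macros a host definition provides, including custom variables."""
    macros = {"HOSTNAME": host["host_name"], "HOSTADDRESS": host["address"]}
    for key, value in host.items():
        if key.startswith("_"):
            # _npsuser on a host is referenced as $_HOSTNPSUSER$
            macros["_HOST" + key[1:].upper()] = value
    return macros


def build_command_line(command_line, check_command, host, user_macros):
    """Expand a command_line for one host, given the service's check_command."""
    _name, *args = check_command.split("!")
    macros = dict(user_macros)
    macros.update(host_macros(host))
    # Arguments can contain macros themselves, eg check_nps!$HOSTNAME$
    for number, arg in enumerate(args, start=1):
        macros[f"ARG{number}"] = expand_macros(arg, macros)
    return expand_macros(command_line, macros)


COMMAND_LINE = ("$USER1$/check_radius -F /etc/radiusclient-ng/radiusclient.conf "
                "-H $ARG1$.domain.com -P 1812 -u $_HOSTNPSUSER$ -p $_HOSTNPSPASS$")
USER_MACROS = {"USER1": "/usr/local/nagios/libexec"}  # assumed plugin directory
HOST = {
    "host_name": "nps01",
    "address": "10.0.0.10",
    "_npsuser": "npsmonitor01",
    "_npspass": "example-password",  # placeholder, not a real password
    "_npsvip": "nps",
}

print(build_command_line(COMMAND_LINE, "check_nps!$HOSTNAME$", HOST, USER_MACROS))
print(build_command_line(COMMAND_LINE, "check_nps!$_HOSTNPSVIP$", HOST, USER_MACROS))
```

The first line printed targets the real server and the second targets the VIP, which is the only difference between the two services.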

If you have something like NSClient++ installed on the NPS servers, you can use a PowerShell script to check the certificate instead of entering the date manually:

# common name of certificate
$cn = 'nps.domain.com'

# thresholds in days
$warn = 30
$crit = 14

# find the cert; if several certificates match, use the one that expires last
$certdate = (get-childitem -path cert:\localmachine -recurse | where-object {$_.subject -eq "CN=$cn"} | sort-object NotAfter | select -last 1 -expandproperty NotAfter)

# stop early if no matching certificate was found
if (-not $certdate)
    { echo "UNKNOWN: Unable to determine when certificate for $cn expires"; exit 3 }

# calculate the difference
$datediff = ($certdate - (get-date))

# format the expiry date
$datestring = ($certdate).tostring("dd MMM yyyy")

# set the standard output format
$body = "Certificate for $cn expires on $datestring"

# compare against the thresholds and return the matching Nagios exit code
if ($datediff.totaldays -lt 0)
    { echo "CRITICAL: Certificate for $cn expired on $datestring"; exit 2 }

elseif ($datediff.days -lt $crit)
    { echo "CRITICAL: $body"; exit 2 }

elseif ($datediff.days -lt $warn)
    { echo "WARNING: $body"; exit 1 }

else
    { echo "OK: $body"; exit 0 }
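The threshold logic is the part most likely to go wrong at the boundaries, so here is the same idea as a small Python function that can be tried without a certificate store. It is only a sketch: the function name is made up, and treating a certificate with exactly the warning number of days left as OK is my assumption.

```python
from datetime import datetime, timedelta

# Nagios plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_expiry(not_after, now, warn_days=30, crit_days=14):
    """Return (exit_code, message) for a certificate expiring at not_after."""
    if not_after is None:
        return UNKNOWN, "UNKNOWN: Unable to determine when certificate expires"
    remaining = not_after - now
    datestring = not_after.date().isoformat()
    if remaining < timedelta(0):
        return CRITICAL, f"CRITICAL: Certificate expired on {datestring}"
    body = f"Certificate expires on {datestring}"
    if remaining < timedelta(days=crit_days):
        return CRITICAL, f"CRITICAL: {body}"
    if remaining < timedelta(days=warn_days):
        return WARNING, f"WARNING: {body}"
    return OK, f"OK: {body}"


# Try a few expiry dates around the thresholds.
now = datetime(2022, 10, 21)
for days in (-1, 10, 20, 30, 90):
    code, message = classify_expiry(now + timedelta(days=days), now)
    print(days, code, message)
```

Every branch returns a result, so the check never falls through without an exit code.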

