This version of the documentation is archived and no longer supported. To learn how to upgrade your version of MongoDB Ops Manager, refer to the upgrade documentation.

Diagnostic and Troubleshooting Guide

This document provides troubleshooting advice for common issues encountered when installing the On Prem MMS Monitoring agent. Begin by working through the checklist below to rule out problems that are easily resolved. Questions and answers for issues not caused by simple installation or connectivity problems follow the checklist.

For answers to other questions, see the monitoring FAQ.

Getting Started Checklist

Most problems with MMS result from installation, connectivity, and other easily resolved issues. To begin troubleshooting, complete these tasks:

  1. Authentication Errors
  2. Check Agent Output or Log
  3. Confirm Only One Agent is Active and Running
  4. Ensure Connectivity Between Agent and Monitored Hosts
  5. Ensure Connectivity Between Agent and MMS Server
  6. Allow Agent to Discover Hosts and Collect Initial Data

Installation

Authentication Errors

If your MongoDB instances run with authentication enabled, ensure MMS has the credentials for them. For new hosts, click the Add Host button on the Hosts page, then specify credentials for every host with authentication enabled. For hosts already listed in MMS, click the gear icon to the right of a host name on the Hosts page, then select Edit Host to provide credentials.

Please consult the Authentication Requirements documentation for details about how to use authentication.

Setup Exits with command 'gcc' failed with exit status 1 Error

This error usually indicates Python C extensions cannot be built due to missing dependencies. Type this command to determine your system’s architecture:

uname -a

Debian and Ubuntu users should issue these commands to install any missing Python dependencies:

sudo apt-get install python-setuptools
sudo apt-get install build-essential python-dev

Red Hat, CentOS, and Fedora users should issue these commands to install any missing Python dependencies:

sudo yum install python-setuptools
sudo yum install gcc python-devel

If you install MMS monitoring agents on Windows, see Install the Monitoring Agent on Windows.
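
On Linux, it can help to confirm that the build prerequisites are present before re-running setup. The following is a minimal sketch, not part of the MMS tooling: `have_cmd` is a hypothetical helper name, and the script assumes that `python` on the PATH is the interpreter the agent uses and that it still ships `distutils` (true of Python 2). If `distutils` is unavailable, the header check reports "missing" even when the headers are installed.

```shell
# have_cmd NAME: succeed if NAME is an executable on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the compiler and the interpreter are installed.
for tool in gcc python; do
    if have_cmd "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done

# Ask Python where Python.h should live, then look for the header there.
# The "|| true" keeps the script going when python itself is absent.
inc=$(python -c 'from distutils import sysconfig; print(sysconfig.get_python_inc())' 2>/dev/null || true)
if [ -n "$inc" ] && [ -f "$inc/Python.h" ]; then
    echo "Python headers: found in $inc"
else
    echo "Python headers: missing (install python-dev or python-devel)"
fi
```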

Agent

Check Agent Output or Log

If you continue to encounter problems, check the agent’s output or logs for errors. Here are some errors you might find and their solutions:

AttributeError: 'builtin_function_or_method' object has no attribute 'new'

This error often happens after an MMS agent software upgrade. Usually the agent runs under Python 2.4 and the hmac and hashlib packages are missing. To fix, either install these packages or upgrade to Python 2.5 or greater. For more details, see Install Monitoring Agent.

TypeError: __init__() got an unexpected keyword argument 'ssl'

This error indicates PyMongo is out of date. Upgrade to at least version 2.6.3. The agent cannot connect to hosts with an older version of PyMongo.
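
To see which PyMongo version the agent's interpreter picks up, a check along these lines may help. It is a sketch under assumptions: `python` on the PATH is taken to be the interpreter that runs the agent, GNU `sort -V` must be available, and `version_ge` is a hypothetical helper name used only for illustration.

```shell
# version_ge A B: succeed when version A is greater than or equal to B.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Ask the interpreter for its PyMongo version (empty if PyMongo is missing).
installed=$(python -c 'import pymongo; print(pymongo.version)' 2>/dev/null || true)

if [ -z "$installed" ]; then
    echo "PyMongo not found for this interpreter"
elif version_ge "$installed" 2.6.3; then
    echo "PyMongo $installed is recent enough"
else
    echo "PyMongo $installed is older than 2.6.3; upgrade it"
fi
```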

Confirm Only One Agent is Active and Running

If your monitoring agent can connect to all hosts in your deployment, a single monitoring agent is sufficient. A second monitoring agent can act as a hot standby. Otherwise, multiple agents can cause unexpected problems.

To tell which agent is the Primary Agent, note the Last Ping value in the Monitoring Agents tab on the Hosts page. If there is no Last Ping value for a listed agent, the agent is a standby agent.

When you upgrade a monitoring agent, do not forget to kill the old agent.

If you run a primary agent and a hot standby agent, confirm both agents are the same version.

See Frequently Asked Questions About On Prem MMS Monitoring and Monitor Hosts with On Prem MMS Monitoring for more information.

Ensure Connectivity Between Agent and Monitored Hosts

Ensure the system running the agent can resolve and connect to the MongoDB instances. To confirm, log into the system where the agent is running and issue a command in the following form:

mongo [hostname]:[port]

Replace [hostname] with the hostname and [port] with the port that the database is listening on.

Ensure Connectivity Between Agent and MMS Server

Verify that the monitoring agent can make outbound connections on TCP port 443 to the MMS server (for example, mms.mongodb.com).
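
One way to test the outbound connection from the agent's system is sketched below. It assumes bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command; `can_connect` is a hypothetical helper name and `mms.example.com` is a placeholder for your MMS server. The same function can be pointed at a monitored host's MongoDB port.

```shell
# can_connect HOST PORT: succeed if a TCP connection opens within 5 seconds.
# Uses bash's /dev/tcp redirection, so no extra network tools are required.
can_connect() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# mms.example.com is a placeholder; substitute your MMS server's hostname.
if can_connect mms.example.com 443; then
    echo "port 443 reachable"
else
    echo "cannot reach port 443; check firewalls and DNS"
fi
```

A failure here points at name resolution, routing, or a firewall rule rather than at the agent itself.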

Allow Agent to Discover Hosts and Collect Initial Data

Let the agent run for 5 to 10 minutes so that it can discover hosts and collect initial data.

Hosts

Hosts are not Visible

Problems with the monitoring agent detecting hosts can be caused by a few factors.

Host not added to MMS: In MMS, click the Hosts tab then click the Add Host button. In the New Host window, specify the host type, internal hostname, and port. If appropriate, add the database username and password and whether or not MMS should use SSL to connect with your monitoring agent. Note it is not necessary to restart your monitoring agent when adding (or removing) a host.

Accidental duplicate mongods: If you add the host after a crash and restart the monitoring agent, you might not see the hostname in the MMS Mongos page. MMS detects the host as a duplicate and suppresses its data. To reset, select Settings then Group Settings. Click the Reset Duplicates button.

Too many monitoring agents installed: Only one monitoring agent is needed to monitor all hosts within a single network. You can also use a single monitoring agent when your hosts span multiple data centers, provided that one agent can discover all of them. Check that you have only one monitoring agent, and remove old agents after upgrading the monitoring agent.

A second monitoring agent can be set up for redundancy. However, the MMS monitoring agent is robust. MMS sends an Agent Down alert only when no monitoring agents are available. See Monitoring FAQ and Monitoring Architecture for more information.

Cannot Delete a Host

In MMS, click the Hosts tab and click the gear icon to the right of a hostname and select Remove Host.

In rare cases, a mongod is brought down and the replica set is reconfigured. The down host then cannot be deleted, and MMS returns the error message “This host cannot be deleted because it is enabled for backup.” Contact MMS Support for help in deleting these hosts.

Monitoring Server

Why doesn’t the monitoring server start up and run successfully?

If you use authentication, whether or not you enable backup, confirm these properties are in the <install_dir>/conf/conf-mms.properties file:

mongo.mongoUri=<SetToValidUri>
mongo.replicaSet=<ValidRSIfUsed>

Otherwise, MMS will fail while trying to connect to the default 127.0.0.1:27017 URL.

If you use the MMS <install_dir>/bin/credentialstool to encrypt the password used in the mongo.mongoUri value, also add the mongo.encryptedCredentials key to the <install_dir>/conf/conf-mms.properties file and set the value for this property to true:

mongo.encryptedCredentials=true

For more details, see Authentication Configuration.
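
To confirm which of these properties are actually set, something like the following sketch could be used. `prop_value` is a hypothetical helper name, and `/opt/mms` is only a stand-in for your `<install_dir>`. The script reports whether each key is set without printing its value, since the URI may contain a password.

```shell
# prop_value FILE KEY: print the value of KEY from a Java-style properties
# file, or nothing if the key is absent or commented out.
prop_value() {
    key=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots for the regex
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Hypothetical path; replace /opt/mms with your <install_dir>.
conf=/opt/mms/conf/conf-mms.properties

if [ -r "$conf" ]; then
    for name in mongo.mongoUri mongo.replicaSet mongo.encryptedCredentials; do
        if [ -n "$(prop_value "$conf" "$name")" ]; then
            echo "$name is set"
        else
            echo "$name is not set"
        fi
    done
else
    echo "cannot read $conf"
fi
```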

Munin

Install and configure the munin-node daemon on the monitored MongoDB server(s) before starting MMS monitoring. The MMS agent README file provides guidelines to install munin-node. However, new versions of Linux, specifically Red Hat Linux (RHEL) 6, can generate error messages. See Configure MMS Monitoring for details about monitoring hardware with munin-node.

Restart munin-node after creating links for changes to take effect.

“No package munin-node is available” Error

To correct this error, install the EPEL (Extra Packages for Enterprise Linux) release package so that yum can find munin-node. Type this command:

sudo yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

Then type this command to install munin-node and all dependencies:

sudo yum install munin-node

Non-localhost IP Addresses are Blocked

By default, munin blocks incoming connections from non-localhost IP addresses such as MMS. The /var/log/munin-node/munin-node.log file will display a “Denying connection” error for your non-localhost IP address.

To fix this error, open the munin-node.conf configuration file and comment out these two lines:

allow ^127\.0\.0\.1$
allow ^::1$

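Then add a cidr_allow line to the munin-node.conf configuration file with a pattern that matches your subnet. The value shown here, 0.0.0.0/0, accepts connections from any IPv4 address; use a narrower range where possible: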

cidr_allow 0.0.0.0/0
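
After editing the file, a quick check such as the sketch below can confirm that an active cidr_allow rule is in place. `has_cidr_allow` is a hypothetical helper name, and `/etc/munin/munin-node.conf` is assumed to be the file's location, which may differ on your distribution.

```shell
# has_cidr_allow FILE: succeed if FILE contains an uncommented cidr_allow line.
has_cidr_allow() {
    grep -Eq '^[[:space:]]*cidr_allow[[:space:]]+[^[:space:]]' "$1"
}

# Assumed default location; adjust if your distribution differs.
conf=/etc/munin/munin-node.conf

if [ ! -r "$conf" ]; then
    echo "cannot read $conf"
elif has_cidr_allow "$conf"; then
    echo "cidr_allow rule present:"
    grep -E '^[[:space:]]*cidr_allow[[:space:]]' "$conf"
else
    echo "no cidr_allow rule; remote connections are likely denied"
fi
```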

Verifying iostat and Other Plugins/Services Returns “# Unknown service” Error

The first step is to confirm there is a problem. Open a telnet session to port 4949, the default munin-node port, and fetch the iostat, iostat_ios, and cpu plugins:

telnet HOSTNAME 4949
fetch iostat
fetch iostat_ios
fetch cpu

If any of these fetch commands returns an “# Unknown service” error, create a link to the plugin or service in /etc/munin/plugins/ by typing these commands:

cd /etc/munin/plugins/
sudo ln -s /usr/share/munin/plugins/<service> <service>

Replace <service> with the name of the service that generates the error.
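
When several plugins are missing, the linking step can be repeated in a loop. The following is a sketch only: `link_plugin` is a hypothetical helper name, the two directories are the ones used in the commands above, and the script needs write access to /etc/munin/plugins/, so run it with sudo.

```shell
# link_plugin SRC_DIR DST_DIR NAME: symlink plugin NAME from SRC_DIR into
# DST_DIR unless it is already there. Fails if the plugin does not exist.
link_plugin() {
    if [ ! -e "$1/$3" ]; then
        echo "no such plugin: $1/$3" >&2
        return 1
    fi
    if [ -e "$2/$3" ] || [ -L "$2/$3" ]; then
        echo "$3 already linked"
    else
        ln -s "$1/$3" "$2/$3" && echo "$3 linked"
    fi
}

# Link the three plugins checked in the telnet session; keep going on errors.
for name in iostat iostat_ios cpu; do
    link_plugin /usr/share/munin/plugins /etc/munin/plugins "$name" || true
done
```

Restart munin-node afterwards so that the new links take effect.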