Monitoring

Installation

The monitoring server does not start up successfully

Confirm the URI or IP address for the Ops Manager service is stored correctly in the mongo.mongoUri property in the <install_dir>/conf/conf-mms.properties file:

mongo.mongoUri=<SetToValidUri>

If you don’t set this property, Ops Manager will fail while trying to connect to the default 127.0.0.1:27017 URL.

If the URI or IP address of your service changes, you must update the property with the new address. For example, update the address if you deploy on a system without a static IP address, or if you deploy on EC2 without a fixed IP and then restart the EC2 instance.

If the URI or IP address changes, each user who accesses the service must also update the address in the URL used to connect and in the client-side monitoring-agent.config files.

If you use the Ops Manager <install_dir>/bin/credentialstool to encrypt the password used in the mongo.mongoUri value, also add the mongo.encryptedCredentials key to the <install_dir>/conf/conf-mms.properties file and set the value for this property to true:

mongo.encryptedCredentials=true
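
As a quick way to confirm both properties before starting the server, the following shell sketch reads a key from a properties file and reports whether mongo.mongoUri is set. The helper names get_prop and check_mongo_uri are hypothetical and are not part of Ops Manager; the parsing assumes simple key=value lines and skips comment lines.

```shell
# Hypothetical helper: print the value of one key from a key=value properties file.
get_prop() {
  # $1 = properties file, $2 = key to look up
  awk -v key="$2" '
    /^[ \t]*#/ { next }                      # skip comment lines
    {
      pos = index($0, "=")                   # split on the first "=" only,
      if (pos == 0) next                     # because a URI value may contain "="
      k = substr($0, 1, pos - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)         # trim whitespace around the key
      if (k == key) v = substr($0, pos + 1)  # the last matching line wins
    }
    END { if (v != "") print v }
  ' "$1"
}

# Hypothetical helper: succeed only when mongo.mongoUri has a non-empty value.
check_mongo_uri() {
  # $1 = path to conf-mms.properties
  uri=$(get_prop "$1" mongo.mongoUri)
  if [ -z "$uri" ]; then
    echo "mongo.mongoUri is not set; the default 127.0.0.1:27017 would be used"
    return 1
  fi
  echo "mongo.mongoUri is set"
}
```

For example, run check_mongo_uri "<install_dir>/conf/conf-mms.properties" with your own installation directory substituted. The sketch deliberately does not print the URI, because the value may contain a password.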

Alerts

For resolutions to alert conditions, see also Alert Resolutions.

For information on creating and managing alerts, see Manage Alert Configurations and Manage Alerts.

Cannot Turn Off Email Notifications

There are at least two ways to turn off alert notifications:

Receive Duplicate Alerts

If the notification email list contains multiple email-groups, one or more people may receive multiple notifications of the same alert.

Receive “Host has low open file limits” or “Too many open files” error messages

These error messages appear on the Deployment page, under a host’s name. They appear if the number of available connections does not meet the Ops Manager-defined minimum value. The mongos instance does not generate these errors, so they do not appear in mongos log files.

On a host-by-host basis, the Monitoring Agent compares the number of open file descriptors and connections to the maximum connections limit. The max open file descriptors ulimit parameter directly affects the number of available server connections. The agent calculates whether enough connections exist to meet the Ops Manager-defined minimum value.

In ping documents, the Monitoring Agent examines the serverStatus.connections values for each node. If the sum of the current value and the available value is less than the maxConns configuration value set for a monitored host, the Monitoring Agent sends a Host has low open file limits or Too many open files message to Ops Manager.
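
The comparison described above can be sketched in shell as follows. This is only an illustration of the arithmetic, not the Monitoring Agent's code; the function name connections_below_max is hypothetical, and the three values are assumed to be read by hand from a ping document and the host's configuration.

```shell
# Hypothetical helper, not Monitoring Agent code.
# Succeeds (exit status 0) when current + available is below maxConns,
# which is the condition described above for raising the message.
connections_below_max() {
  # $1 = serverStatus.connections.current
  # $2 = serverStatus.connections.available
  # $3 = maxConns configured for the host
  [ $(( $1 + $2 )) -lt "$3" ]
}

# Usage with made-up numbers: 51 + 768 = 819, which is below 20000.
if connections_below_max 51 768 20000; then
  echo "current + available is below maxConns"
fi
```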

Ping documents are data sent by Monitoring Agents to Ops Manager. To view ping documents:

Note

To access this feature, you must either:

  1. Click the Deployment page.
  2. Click the host’s name.
  3. Click Last Ping.

To prevent this error, we recommend you set ulimit open files to 64000. We also recommend setting the maxConns command in the mongo shell to at least the recommended settings.

To learn more, see the MongoDB ulimit reference page and the MongoDB maxConns reference page.
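
To compare a host's open files limit against the recommended 64000, a minimal shell sketch might look like the following. The function name open_files_limit_ok is hypothetical. It checks whatever value you pass in, so feeding it the output of ulimit -n tests the current shell's limit, which is not necessarily the limit of a running mongod or mongos process.

```shell
# Hypothetical helper: succeed when an open files limit meets a minimum.
open_files_limit_ok() {
  # $1 = limit as printed by `ulimit -n`; $2 = minimum (defaults to 64000)
  min="${2:-64000}"
  case "$1" in
    unlimited) return 0 ;;        # no cap at all satisfies any minimum
    ''|*[!0-9]*) return 2 ;;      # empty or non-numeric input: cannot decide
  esac
  [ "$1" -ge "$min" ]
}
```

For example, open_files_limit_ok "$(ulimit -n)" || echo "open files limit is below 64000" reports a low limit for the user running the command.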

Deployments

Monitoring Agent Fails to Collect Data

Possible causes for this state:

Deployments are not Visible

Problems with the Monitoring Agent detecting deployments can be caused by a few factors:

Deployment not added

To fix this issue:

  1. Click Deployment.
  2. Click the Processes tab.
  3. Click Add Deployment.
  4. In the New Deployment window, specify the:
    • deployment type
    • internal hostname
    • internal port
  5. If appropriate:
    • Add the database username and password.
    • Enable TLS/SSL to connect with your Monitoring Agent.

Note

It is not necessary to restart your Monitoring Agent when adding (or removing) a deployment.

Accidental duplicate mongods

If you add the deployment after a crash and restart the Monitoring Agent, you might not see the hostname on the Deployment page. Ops Manager detects the deployment as a duplicate and suppresses its data.

To reset:

  1. Click Settings.
  2. Click Project Settings.
  3. Click Reset Duplicates.

Monitoring Agents cannot detect deployments

If your deployments exist across multiple data centers, make sure that all of your deployments can be discovered by all of your Monitoring Agents.

Cannot Delete a Deployment

In rare cases, a mongod is brought down and the replica set is reconfigured. The down deployment then cannot be deleted, and Ops Manager returns this error message:

Warning

This deployment cannot be deleted because it is enabled for backup.

Contact MongoDB Support for help in deleting these deployments.

Projects

Additional Information on Projects

Create a project to monitor additional segregated systems or environments for servers, agents, users, and other resources.

Example

Firewalls may separate your deployment among two or more environments. In this case, you would need two or more separate Ops Manager projects.

API keys are unique to each project. Each project requires its own agent with the appropriate API keys. Within each project, the agent needs to be able to connect to all hosts it monitors in the project.

To learn more about creating and managing projects, see Projects.

Munin

Important

As of Automation Agent 2.7.0, hardware monitoring using munin-node is deprecated.

munin-node is a third-party package. For problems related to installing munin-node, see the Munin Wiki.

Install and configure the munin-node service on the MongoDB server(s) to be monitored before starting Ops Manager monitoring. The Ops Manager agent’s README file provides guidelines to install munin-node.

See also

See Configure Hardware Monitoring with munin-node for details about monitoring hardware with munin-node.

Red Hat Enterprise Linux (RHEL 6, 7) can generate the following error messages.

No package munin-node is available Error

To correct this error:

  1. Follow the instructions on the Extra Packages for Enterprise Linux repository wiki page to install the epel-release rpm for your version of Enterprise Linux.

  2. After the package is installed, type this command to install munin-node and all of its dependencies:

    sudo yum install munin-node
    
  3. After munin-node is installed, check whether the munin-node service is running. If it is not, start it. The first command below reports the status and the second starts the service:

    service munin-node status
    service munin-node start
    

Non-localhost IP Addresses are Blocked

By default, munin blocks incoming connections from non-localhost IP addresses. The /var/log/munin/munin-node.log file will display a “Denying connection” error for your non-localhost IP address.

To fix this error, open the munin-node.conf configuration file and comment out these two lines:

allow ^127\.0\.0\.1$
allow ^::1$

Then add a cidr_allow line to the munin-node.conf configuration file with a pattern that matches your subnet. The following example, 0.0.0.0/0, matches every IPv4 address:

cidr_allow 0.0.0.0/0

Restart munin-node after editing the configuration file for changes to take effect.
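
To allow only one subnet rather than every address, the same edit can be sketched as a small shell function that prints a modified copy of the configuration for review. The function name munin_conf_with_cidr is hypothetical, and 192.0.2.0/24 in the usage line is a placeholder subnet, not a recommendation. The sketch comments out every line that begins with "allow " and appends one cidr_allow line.

```shell
# Hypothetical helper: print a copy of munin-node.conf with the "allow" lines
# commented out and a single cidr_allow line appended. Writes to stdout only,
# so the original file is left untouched until you choose to replace it.
munin_conf_with_cidr() {
  # $1 = path to munin-node.conf, $2 = subnet in CIDR notation
  awk '/^allow / { print "# " $0; next } { print }' "$1"
  printf 'cidr_allow %s\n' "$2"
}

# Usage (placeholder path and subnet):
#   munin_conf_with_cidr /etc/munin/munin-node.conf 192.0.2.0/24 > /tmp/munin-node.conf.new
```

Compare the generated file with the original before copying it into place, then restart munin-node as described above.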

“# Unknown service” Error Returned

Verifying iostat and other plugins/services can return this error.

The first step is to confirm there is a problem. Open a telnet session to munin-node on port 4949 (the default/required munin port) and fetch iostat, iostat_ios, and cpu:

telnet HOSTNAME 4949
fetch iostat
fetch iostat_ios
fetch cpu

The iostat_ios plugin creates the iotime chart. The cpu plugin creates the cputime chart.
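
If you prefer a scripted check over an interactive session, the response to a fetch command can be tested for the error text. The sketch below only inspects text on standard input; the function name response_is_unknown_service is hypothetical. The commented usage line assumes the nc (netcat) utility is installed, which this page does not otherwise require.

```shell
# Hypothetical helper: succeed when a munin-node response read from stdin
# contains the "# Unknown service" error line (matched case-insensitively).
response_is_unknown_service() {
  grep -qi '^# unknown service'
}

# Usage, assuming nc is available (HOSTNAME is a placeholder):
#   printf 'fetch iostat\nquit\n' | nc HOSTNAME 4949 | response_is_unknown_service \
#     && echo "iostat is not registered with munin-node"
```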

If any of these telnet fetch commands returns a # Unknown service error, create a link to the plugin or service in /etc/munin/plugins/ by typing these commands:

cd /etc/munin/plugins/
sudo ln -s /usr/share/munin/plugins/<service> <service>

Replace <service> with the name of the service that generates the error.
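
Before creating links, it can help to list which of the expected plugins have no entry in the plugin directory. The following is a minimal sketch; the function name missing_plugins is hypothetical. A dangling symbolic link is also reported as missing, because the test follows links.

```shell
# Hypothetical helper: print each plugin name that has no usable entry
# in the given plugin directory.
missing_plugins() {
  # $1 = plugin directory; remaining arguments = plugin names to check
  dir="$1"
  shift
  for name in "$@"; do
    [ -e "$dir/$name" ] || echo "$name"
  done
}

# Usage:
#   missing_plugins /etc/munin/plugins iostat iostat_ios cpu
```

Each name printed is a candidate for the ln -s command shown above.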

Disk names are not listed by Munin

In some cases, Munin omits disk names that have a dash between the name and a numerical suffix, such as dm-0 or dm-1. There is a documented fix for Munin’s iostat plugin.