Global Alerts

The globalAlerts resource allows you to retrieve and acknowledge alerts that have been triggered by a global alert configuration. You must have the Global Monitoring Admin role to use this resource.

Alert Status

When Ops Manager detects an alert condition, it opens an alert. If the global alert configuration contains no notification delay, the alert status goes immediately to OPEN. If the configuration contains a delay, Ops Manager sets the alert to TRACKING until the delay period ends, after which Ops Manager sets the alert to OPEN if the condition persists.

If the global alert configuration has multiple notifications, each with its own notification delay, Ops Manager uses the smallest delay value to determine when to move an alert from TRACKING to OPEN.
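
The transition rule above can be modeled in a few lines. The following Python sketch is illustrative only: the function alert_status and its parameters are hypothetical and not part of the Ops Manager API. It assumes that delays are expressed in minutes and that a delay of zero means no delay.

```python
def alert_status(notification_delays_min, minutes_since_detected, condition_persists=True):
    """Model the status of a newly detected alert (hypothetical helper).

    notification_delays_min: the delay, in minutes, of each notification in
    the global alert configuration. An empty list means no delay.
    """
    # Ops Manager uses the smallest delay across all notifications.
    smallest_delay = min(notification_delays_min, default=0)

    if smallest_delay == 0:
        return "OPEN"        # no delay: the alert opens immediately
    if minutes_since_detected < smallest_delay:
        return "TRACKING"    # still inside the delay period
    # The delay has ended: the alert opens only if the condition persists.
    # This page does not say which status results otherwise, so the sketch
    # returns None in that case rather than guessing.
    return "OPEN" if condition_persists else None
```

For a configuration with delays of 5 and 10 minutes, the sketch reports TRACKING for the first 5 minutes and OPEN afterwards, provided the condition persists.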

Endpoints

Get All Global Alerts

GET /api/public/v1.0/globalAlerts

To specify alert status, use the status query parameter with one of the following values:

  • TRACKING
  • OPEN
  • CLOSED

GET /api/public/v1.0/globalAlerts?status=STATUS

The status parameter cannot retrieve CANCELLED global alerts.
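
A client can check the status value before sending the request. The following Python sketch builds the request URL and rejects values that the status parameter does not accept. The helper global_alerts_url and the host name used with it are hypothetical.

```python
from urllib.parse import urlencode

# Values the status query parameter accepts, per the list above.
QUERYABLE_STATUSES = {"TRACKING", "OPEN", "CLOSED"}


def global_alerts_url(base_url, status=None):
    """Build the URL for Get All Global Alerts (hypothetical helper)."""
    url = base_url.rstrip("/") + "/api/public/v1.0/globalAlerts"
    if status is None:
        return url
    if status not in QUERYABLE_STATUSES:
        # CANCELLED alerts, for example, cannot be retrieved by status.
        raise ValueError(f"status must be one of {sorted(QUERYABLE_STATUSES)}")
    return url + "?" + urlencode({"status": status})
```

Calling global_alerts_url("https://opsmanager.example.com", "OPEN") yields a URL ending in /globalAlerts?status=OPEN, while passing "CANCELLED" raises ValueError.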

Get a Specific Global Alert

GET /api/public/v1.0/globalAlerts/ALERT-ID

Acknowledge an Alert

To acknowledge a global alert, use PATCH to update the alert’s acknowledgedUntil field. You can optionally update the acknowledgementComment field with a comment. Do not include any other field in the PATCH request.

PATCH /api/public/v1.0/globalAlerts/GLOBAL-ALERT-ID

To acknowledge an alert “forever”, set acknowledgedUntil to a date 100 years in the future. To unacknowledge a previously acknowledged alert, set acknowledgedUntil to a date in the past.

If you add a comment in the acknowledgementComment field, Ops Manager displays the comment next to the message that the alert has been acknowledged.
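
The request body for these cases can be generated programmatically. The following Python sketch is one way to do so. The helper acknowledgement_body is hypothetical, and the sketch assumes that the API accepts UTC timestamps in the form that appears in the response documents on this page (for example, 2016-10-17T15:21:20Z).

```python
import json
from datetime import datetime, timedelta, timezone


def acknowledgement_body(until, comment=None):
    """Return the JSON body for the PATCH request (hypothetical helper).

    until must be a timezone-aware datetime.
    """
    # Only acknowledgedUntil and, optionally, acknowledgementComment are sent.
    stamp = until.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    body = {"acknowledgedUntil": stamp}
    if comment is not None:
        body["acknowledgementComment"] = comment
    return json.dumps(body)


now = datetime.now(timezone.utc)
# Acknowledge "forever": 36525 days is roughly 100 years.
forever = acknowledgement_body(now + timedelta(days=36525), "Known issue")
# Unacknowledge: any date in the past.
unacknowledge = acknowledgement_body(now - timedelta(days=1))
```

Either string can be passed as the --data argument of the PATCH request.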

Sample Entity

The following is an example return document. The fields in a return document depend on the alert type.

{
  "id" : "56e81a12e4b82547f6cd3a7",
  "groupId" : "56cfafabe4b0391691d7c2a5",
  "alertConfigId" : "56a104e0e4b00664f588ac8",
  "created" : "2016-10-15T14:20:02Z",
  "eventTypeName" : "INCONSISTENT_BACKUP_CONFIGURATION",
  "lastNotified" : "2016-10-17T15:21:25Z",
  "replicaSetName" : "testSet",
  "resolved" : "2016-10-17T15:21:20Z",
  "sourceTypeName" : "REPLICA_SET",
  "status" : "CLOSED",
  "tags" : [ ],
  "typeName" : "BACKUP",
  "updated" : "2016-10-17T15:21:20Z",
  "links": [ ... ]
}

Entity Fields

Name Type Description
id string Unique identifier.
groupId string ID of the group that this alert was opened for.
alertConfigId string ID of the global alert configuration that triggered this alert.
created date When the alert was opened.
eventTypeName string

The name of the event that triggered the alert.

  • Host alerts support these values:
    • HOST_DOWN
    • HOST_RECOVERING
    • VERSION_BEHIND
    • HOST_EXPOSED
    • OUTSIDE_METRIC_THRESHOLD
  • Agent alerts support these values:
    • MONITORING_AGENT_DOWN
    • MONITORING_AGENT_VERSION_BEHIND
    • BACKUP_AGENT_DOWN
    • BACKUP_AGENT_VERSION_BEHIND
    • BACKUP_AGENT_CONF_CALL_FAILURE
  • Backup alerts support these values:
    • OPLOG_BEHIND
    • CLUSTER_MONGOS_IS_MISSING
    • RESYNC_REQUIRED
    • BAD_CLUSTERSHOTS
    • RS_BIND_ERROR (global alerts only)
    • BACKUP_TOO_MANY_RETRIES (global alerts only)
    • BACKUP_IN_UNEXPECTED_STATE (global alerts only)
    • LATE_SNAPSHOT (global alerts only)
    • SYNC_SLICE_HAS_NOT_PROGRESSED (global alerts only)
    • BACKUP_JOB_TOO_BUSY (global alerts only)
    • GROUP_TAGS_CHANGED (global alerts only)
  • Group alerts support these values:
    • USERS_AWAITING_APPROVAL
    • USERS_WITHOUT_MULTI_FACTOR_AUTH
  • Replica set alerts support these values:
    • CONFIGURATION_CHANGED
    • PRIMARY_ELECTED
    • TOO_FEW_HEALTHY_MEMBERS
    • TOO_MANY_UNHEALTHY_MEMBERS
    • NO_PRIMARY
  • Sharded cluster alerts support this value:
    • CLUSTER_MONGOS_IS_MISSING
  • User alerts support these values:
    • JOINED_GROUP
    • REMOVED_FROM_GROUP
lastNotified date When the last notification was sent for this alert. Only present if notifications have been sent.
replicaSetName string Name of the replica set. Only present for global alerts of type HOST, HOST_METRIC, BACKUP, and REPLICA_SET.
resolved date When the alert was closed. Only present if the status is CLOSED.
sourceTypeName string

For global alerts of the type BACKUP, the type of server being backed up. Possible values are:

  • REPLICA_SET
  • SHARDED_CLUSTER
  • CONFIG_SERVER
status string

The current state of the alert. Possible values are:

  • TRACKING
  • OPEN
  • CLOSED
  • CANCELLED
tags array of strings The tags associated with the alert.
typeName string This field is deprecated and will be ignored.
updated date When the alert was last updated.
acknowledgedUntil date The date through which the alert has been acknowledged. Only present if the alert has been acknowledged.
acknowledgementComment string The comment left by the user who acknowledged the alert. Only present if the alert has been acknowledged.
acknowledgingUsername string The username of the user who acknowledged the alert. Only present if the alert has been acknowledged.
hostnameAndPort string The hostname and port of each host to which the alert applies. Only present for alerts of type HOST, HOST_METRIC, and REPLICA_SET.
hostId string ID of the host to which the metric pertains. Only present for alerts of type HOST, HOST_METRIC, and REPLICA_SET.
metricName string

The name of the measurement whose value went outside the threshold. Only present if eventTypeName is set to OUTSIDE_METRIC_THRESHOLD.

For possible values, see Measurement Types for Global Alerts on this page.

currentValue object The current value of the metric that triggered the alert. Only present for alerts of type HOST_METRIC.
currentValue.number number The value of the metric.
currentValue.units string

The units for the value. Depends on the type of metric. For example, a metric that measures memory consumption would have a byte measurement, while a metric that measures time would have a time unit. Possible values are:

  • RAW
  • BITS
  • BYTES
  • KILOBITS
  • KILOBYTES
  • MEGABITS
  • MEGABYTES
  • GIGABITS
  • GIGABYTES
  • TERABYTES
  • PETABYTES
  • MILLISECONDS
  • SECONDS
  • MINUTES
  • HOURS
  • DAYS
clusterId string The ID of the cluster to which this alert applies. Only present for alerts of type BACKUP, REPLICA_SET, and CLUSTER.
clusterName string The name of the cluster to which this alert applies. Only present for alerts of type BACKUP, REPLICA_SET, and CLUSTER.
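
Because many of these fields are present only under certain conditions, client code should not assume that they exist. The following Python sketch reads an alert document defensively. The helper names are hypothetical, and the sketch assumes that timestamps use the UTC form shown in the sample entity and that an acknowledgedUntil date in the past means the alert is no longer acknowledged.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a timestamp of the form shown in the sample documents."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def is_acknowledged(alert, now):
    """Return True if the alert is acknowledged at the moment `now`."""
    until = alert.get("acknowledgedUntil")  # absent unless acknowledged
    return until is not None and parse_timestamp(until) > now


def summarize(alert):
    """Build a one-line summary that tolerates absent optional fields."""
    parts = [alert["id"], alert["status"], alert["eventTypeName"]]
    if "metricName" in alert:  # only for OUTSIDE_METRIC_THRESHOLD
        value = alert.get("currentValue", {})
        parts.append(f'{alert["metricName"]}={value.get("number")} {value.get("units")}')
    if "resolved" in alert:  # only when status is CLOSED
        parts.append("resolved " + alert["resolved"])
    return " | ".join(parts)
```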

Measurement Types for Global Alerts

The globalAlerts resource returns measurement types in the metricName field. The field is present only if eventTypeName is set to OUTSIDE_METRIC_THRESHOLD.

Host Measurements

  • ASSERT_REGULAR
  • ASSERT_WARNING
  • ASSERT_MSG
  • ASSERT_USER
Measure the rate of asserts for a MongoDB process, as collected from the MongoDB serverStatus command’s asserts document.
  • BACKGROUND_FLUSH_AVG
Measurement found on the host’s background flush avg chart. To view the chart, see View Metrics.
  • CACHE_BYTES_READ_INTO
  • CACHE_BYTES_WRITTEN_FROM
  • CACHE_USAGE_DIRTY
  • CACHE_USAGE_USED
  • TICKETS_AVAILABLE_READS
  • TICKETS_AVAILABLE_WRITES
Apply to a MongoDB process’s WiredTiger storage engine, as collected from the MongoDB serverStatus command’s wiredTiger.cache and wiredTiger.concurrentTransactions documents.
  • CONNECTIONS
Measures connections to a MongoDB process, as collected from the MongoDB serverStatus command’s connections document.
  • CURSORS_TOTAL_OPEN
  • CURSORS_TOTAL_TIMED_OUT
Measure the number of cursors for a MongoDB process, as collected from the MongoDB serverStatus command’s metrics.cursor document.
  • EXTRA_INFO_PAGE_FAULTS
  • GLOBAL_ACCESSES_NOT_IN_MEMORY
  • GLOBAL_PAGE_FAULT_EXCEPTIONS_THROWN
Measurements found on the host’s Record Stats and Page Faults charts. To view the charts, see View Metrics.
  • GLOBAL_LOCK_CURRENT_QUEUE_TOTAL
  • GLOBAL_LOCK_CURRENT_QUEUE_READERS
  • GLOBAL_LOCK_CURRENT_QUEUE_WRITERS
Measure operations waiting on locks, as collected from the MongoDB serverStatus command. Ops Manager computes these values based on the type of storage engine.
  • GLOBAL_LOCK_PERCENTAGE
Applicable only to hosts running MongoDB 2.0 and earlier. Measures operations waiting on the global lock, as collected from the MongoDB serverStatus command.
  • INDEX_COUNTERS_BTREE_ACCESSES
  • INDEX_COUNTERS_BTREE_HITS
  • INDEX_COUNTERS_BTREE_MISSES
  • INDEX_COUNTERS_BTREE_MISS_RATIO
Measurements found on the host’s btree chart. To view the chart, see View Metrics.
  • JOURNALING_COMMITS_IN_WRITE_LOCK
  • JOURNALING_MB
  • JOURNALING_WRITE_DATA_FILES_MB
Measurements found on the host’s journal - commits in write lock chart and journal stats chart. To view the charts, see View Metrics.
  • MEMORY_RESIDENT
  • MEMORY_VIRTUAL
  • MEMORY_MAPPED
  • COMPUTED_MEMORY
Measure memory for a MongoDB process, as collected from the MongoDB serverStatus command’s mem document.
  • NETWORK_BYTES_IN
  • NETWORK_BYTES_OUT
  • NETWORK_NUM_REQUESTS
Measure throughput for a MongoDB process, as collected from the MongoDB serverStatus command’s network document.
  • OPLOG_SLAVE_LAG_MASTER_TIME
  • OPLOG_MASTER_TIME
  • OPLOG_MASTER_LAG_TIME_DIFF
  • OPLOG_RATE_GB_PER_HOUR
Measurements that apply to the MongoDB process’s oplog.
  • DB_STORAGE_TOTAL
  • DB_DATA_SIZE_TOTAL
Measurements displayed on the host’s db storage chart. To view the chart, see View Metrics.
  • OPCOUNTER_CMD
  • OPCOUNTER_QUERY
  • OPCOUNTER_UPDATE
  • OPCOUNTER_DELETE
  • OPCOUNTER_GETMORE
  • OPCOUNTER_INSERT
Measure the rate of database operations on a MongoDB process since the process last started, as collected from the MongoDB serverStatus command’s opcounters document.
  • OPCOUNTER_REPL_CMD
  • OPCOUNTER_REPL_UPDATE
  • OPCOUNTER_REPL_DELETE
  • OPCOUNTER_REPL_INSERT
Measure the rate of database operations on MongoDB secondaries, as collected from the MongoDB serverStatus command’s opcountersRepl document.
  • DOCUMENT_RETURNED
  • DOCUMENT_INSERTED
  • DOCUMENT_UPDATED
  • DOCUMENT_DELETED
The average rate per second of documents returned, inserted, updated, or deleted for a selected time period. These measurements are found on the host’s Document Metrics chart. To view the chart, see View Metrics.
  • OPERATIONS_SCAN_AND_ORDER
For a selected time period, the average rate per second for operations that perform a sort but cannot perform the sort using an index. This measurement is found on the host’s Scan and Order chart. To view the chart, see View Metrics.
  • AVG_READ_EXECUTION_TIME
  • AVG_WRITE_EXECUTION_TIME
  • AVG_COMMAND_EXECUTION_TIME
Available to hosts running MongoDB v3.4+. The average execution time in milliseconds per read, write, or command operation over the selected time period. These measurements are found on the host’s Operation Execution Times chart. To view the chart, see View Metrics.
  • QUERY_EXECUTOR_SCANNED
The average rate per second to scan index items during queries and query-plan evaluations. This rate is driven by the same value as totalKeysExamined in the output of explain. This measurement is found on the host’s Query Executor chart, accessed when viewing metrics.
  • QUERY_EXECUTOR_SCANNED_OBJECTS
The average rate per second to scan documents during queries and query-plan evaluations. Ops Manager derives the rate using the explain output’s totalDocsExamined value. This measurement is found on the host’s Query Executor chart, accessed when viewing metrics.
  • QUERY_TARGETING_SCANNED_PER_RETURNED
The ratio of the number of index items scanned to the number of documents returned. This measurement is found on the host’s Query Targeting chart, accessed when viewing metrics.
  • QUERY_TARGETING_SCANNED_OBJECTS_PER_RETURNED
The ratio of the number of documents scanned to the number of documents returned. This measurement is found on the host’s Query Targeting chart, accessed when viewing metrics.
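
As a worked illustration of the two query targeting ratios, the following Python sketch computes them from made-up counts. The function name is hypothetical, and how Ops Manager itself treats a period in which no documents are returned is not stated on this page; the sketch returns None in that case.

```python
def scanned_per_returned(scanned, returned):
    """Ratio of items scanned to documents returned (illustrative only)."""
    if returned == 0:
        return None  # the ratio is undefined when nothing is returned
    return scanned / returned


# 5,000 index items scanned to return 50 documents gives a ratio of 100.
index_ratio = scanned_per_returned(5000, 50)
# 200 documents scanned to return 50 documents gives a ratio of 4.
document_ratio = scanned_per_returned(200, 50)
```

A ratio close to 1 means that nearly every item scanned contributed to the result; larger values indicate less selective queries.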

Examples

Get All Global Alerts

Request

curl -i -u "username:apiKey" --digest "https://<ops-manager-host>/api/public/v1.0/globalAlerts"

Response

HTTP/1.1 200 OK

{
  "links" : [ ... ],
  "results" : [ {
    "alertConfigId" : "573b7d12e4b0979a262467c1",
    "created" : "2016-10-18T08:08:08Z",
    "currentValue" : {
      "number" : 143.4739833843463,
      "units" : "RAW"
    },
    "eventTypeName" : "OUTSIDE_METRIC_THRESHOLD",
    "groupId" : "5729fde7e4b0132d250fed5f",
    "hostId" : "63f42376fb735471fe40ec54a7",
    "hostnameAndPort" : "replicaset-shard-00-02:27017",
    "id" : "573b7d2de4b02fd2c93423a6",
    "links" : [ ... ],
    "metricName" : "OPCOUNTER_CMD",
    "lastNotified" : "2016-10-18T19:29:54Z",
    "replicaSetName" : "replicaSet-shard-0",
    "resolved" : "2016-10-18T21:30:04Z",
    "status" : "CLOSED",
    "tags" : [ ],
    "typeName" : "HOST_METRIC",
    "updated" : "2016-10-18T21:30:04Z"
  }, ... ],
  "totalCount" : 8
}
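
Once parsed, a response of this shape can be processed like any other JSON document. The following Python sketch tallies the alerts in results by eventTypeName. The page variable is a trimmed stand-in for a real response, and the sketch assumes that totalCount counts all matching alerts rather than only those in the current response.

```python
from collections import Counter


def count_by_event_type(response):
    """Count the alerts in one response, keyed by eventTypeName."""
    return Counter(alert["eventTypeName"] for alert in response["results"])


# A trimmed stand-in for a response document; real responses carry more fields.
page = {
    "results": [
        {"id": "573b7d2de4b02fd2c93423a6", "eventTypeName": "OUTSIDE_METRIC_THRESHOLD", "status": "CLOSED"},
        {"id": "3b7d2de0a4b02fd2c98146de", "eventTypeName": "OPLOG_BEHIND", "status": "OPEN"},
    ],
    "totalCount": 8,
}
counts = count_by_event_type(page)
# Under the assumption above, a totalCount larger than len(results) suggests
# that this response holds only part of the matching alerts.
more_results_likely = page["totalCount"] > len(page["results"])
```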

Get All Open Global Alerts

Request

curl -i -u "username:apiKey" --digest "https://<ops-manager-host>/api/public/v1.0/globalAlerts?status=OPEN"

Response

HTTP/1.1 200 OK

{
  "links" : [ ... ],
  "results" : [ {
    "alertConfigId" : "5730f5e1e4b030a9634a3f69",
    "clusterId" : "572a00f2e4b051814b144e90",
    "clusterName" : "shardedCluster",
    "created" : "2016-10-09T06:16:36Z",
    "eventTypeName" : "OPLOG_BEHIND",
    "groupId" : "5729fde7e4b0132d250fed5f",
    "id" : "3b7d2de0a4b02fd2c98146de",
    "links" : [ ... ],
    "lastNotified" : "2016-10-10T20:42:32Z",
    "replicaSetName" : "shardedCluster-shard-0",
    "sourceTypeName" : "REPLICA_SET",
    "status" : "OPEN",
    "tags" : [ ],
    "typeName" : "BACKUP",
    "updated" : "2016-10-10T20:42:32Z"
  }, ... ],
  "totalCount" : 3
}

Get a Specific Global Alert

Request

curl -i -u "username:apiKey" --digest "https://<ops-manager-host>/api/public/v1.0/globalAlerts/3b7d2de0a4b02fd2c98146de"

Response

HTTP/1.1 200 OK

{
  "alertConfigId" : "5730f5e1e4b030a9634a3f69",
  "clusterId" : "572a00f2e4b051814b144e90",
  "clusterName" : "shardedCluster",
  "created" : "2016-10-09T06:16:36Z",
  "eventTypeName" : "OPLOG_BEHIND",
  "groupId" : "5729fde7e4b0132d250fed5f",
  "id" : "3b7d2de0a4b02fd2c98146de",
  "links" : [ ... ],
  "lastNotified" : "2016-10-10T20:42:32Z",
  "replicaSetName" : "shardedCluster-shard-0",
  "sourceTypeName" : "REPLICA_SET",
  "status" : "OPEN",
  "tags" : [ ],
  "typeName" : "BACKUP",
  "updated" : "2016-10-10T20:42:32Z"
}

Acknowledge an Alert

Request

curl -i -u "username:apiKey" --digest -H "Content-Type: application/json" -X PATCH "https://<ops-manager-host>/api/public/v1.0/globalAlerts/3b7d2de0a4b02fd2c98146de" --data '
{
  "acknowledgedUntil": "2016-11-01T00:00:00-0400"
}'

Response

HTTP/1.1 200 OK

{
  "alertConfigId" : "5730f5e1e4b030a9634a3f69",
  "clusterId" : "572a00f2e4b051814b144e90",
  "clusterName" : "shardedCluster",
  "created" : "2016-10-09T06:16:36Z",
  "eventTypeName" : "OPLOG_BEHIND",
  "groupId" : "5729fde7e4b0132d250fed5f",
  "id" : "3b7d2de0a4b02fd2c98146de",
  "links" : [ ... ],
  "lastNotified" : "2016-10-10T20:42:32Z",
  "replicaSetName" : "shardedCluster-shard-0",
  "sourceTypeName" : "REPLICA_SET",
  "status" : "OPEN",
  "acknowledgedUntil" : "2016-11-01T00:00:00Z",
  "acknowledgingUsername" : "admin@example.com",
  "tags" : [ ],
  "typeName" : "BACKUP",
  "updated" : "2016-10-10T22:03:11Z"
}
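
For clients that do not use curl, the same request can be assembled with the Python standard library. The following is a minimal sketch, not a tested client: the helper names and the host opsmanager.example.com are hypothetical, and the sketch only builds the request and a Digest-capable opener without sending anything.

```python
import json
import urllib.request


def build_acknowledge_request(base_url, alert_id, acknowledged_until):
    """Build, but do not send, the PATCH request shown above."""
    url = f"{base_url.rstrip('/')}/api/public/v1.0/globalAlerts/{alert_id}"
    body = json.dumps({"acknowledgedUntil": acknowledged_until}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


def build_digest_opener(base_url, username, api_key):
    """Return an opener that answers HTTP Digest challenges, as curl --digest does."""
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, base_url, username, api_key)
    return urllib.request.build_opener(urllib.request.HTTPDigestAuthHandler(passwords))


request = build_acknowledge_request(
    "https://opsmanager.example.com", "3b7d2de0a4b02fd2c98146de", "2016-11-01T00:00:00-0400"
)
opener = build_digest_opener("https://opsmanager.example.com", "username", "apiKey")
# Sending is left commented out because it needs a reachable Ops Manager host:
# with opener.open(request) as response:
#     alert = json.load(response)
```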