
Default Monitor Thresholds

The tables below list the default monitor thresholds that are added implicitly to every environment. As an app owner, review these thresholds and adjust them to suit your app.

Monitor Type	Resource Name	Description	Action
CPU Load Heartbeat	compute	If collection for any of the load metrics (load1, load5 or load15) is missed, raises a missing heartbeat pulse event which makes the compute instance unhealthy.	Unhealthy notification is raised. Repair action is executed on the affected instance.
CPU Load	compute	`'HighLoad' => threshold('1m','avg','load5',trigger('>=',30,3,1),reset('<',15,1,1))` Compute is considered heavily loaded when the load5 average reaches 30 or more, which sets the trigger. The alert resets when the average drops below 15.	Notify only. No action.
CPU Usage	compute	`'HighCpuUsage' => threshold('5m','avg','CpuIdle',trigger('<=',10,15,2),reset('>',15,15,1))` Compute utilization is very high when CpuIdle drops to 10% or below, which means that 90% or more of the CPU is in use.	Notify only. No action.
Socket Connection	compute	No default threshold is defined. The monitor can be set up for different socket states: `TIME_OUT`, `ESTABLISHED`, `CLOSE_WAIT`, etc.
Network	compute	No default threshold is defined.
Filesystem root	volume /	`'LowDiskSpace' => threshold('1m', 'avg', 'space_used', trigger('>=', 90, 5, 2), reset('<', 85, 5, 1))` Compute has low disk space when space_used reaches 90% or more on the root disk /. `'LowDiskInode' => threshold('1m', 'avg', 'inode_used', trigger('>=', 90, 5, 2), reset('<', 85, 5, 1))` Compute is low on inodes when inode_used reaches 90% or more on the root disk /.	Notify only. No action.
System messages	file /var/log/messages
Memory	compute	`'HighMemUse' => threshold('1m', 'avg', 'free', trigger('<', 50000, 5, 4), reset('>', 80000, 5, 4))` Compute is using too much memory when available (free) memory drops below 50 MB. The alert resets when free memory rises above 80 MB.	Notify only. No action.
Process cron	crond process	`'CrondProcessLow' => threshold('1m', 'avg', 'count', trigger('<', 1, 1, 1), reset('>=', 1, 1, 1))` crond process should be running. If not, the process count goes below 1 and raises the alert. `'CrondProcessHigh' => threshold('1m', 'avg', 'count', trigger('>=', 200, 1, 1), reset('<', 200, 1, 1))` crond process count should not be above 200. If found, raises the alert.	Notify only. No action.
Process sendmail	postfix process	`'PostfixProcessLow' => threshold('1m', 'avg', 'count', trigger('<', 1, 1, 1), reset('>=', 1, 1, 1))` postfix process should be running. If not, the process count goes below 1 and raises the alert. `'PostfixProcessHigh' => threshold('1m', 'avg', 'count', trigger('>=', 200, 1, 1), reset('<', 200, 1, 1))` postfix process count should not be above 200. If found, raises the alert.	Notify only. No action.
Process SSH Daemon	sshd process	`'SshdProcessLow' => threshold('1m', 'avg', 'count', trigger('<', 1, 1, 1), reset('>=', 1, 1, 1))` sshd process should be running. If not, the process count goes below 1 and raises the alert. `'SshdProcessHigh' => threshold('1m', 'avg', 'count', trigger('>=', 200, 1, 1), reset('<', 200, 1, 1))` sshd process count should not be above 200. If found, raises the alert.	Notify only. No action.
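
Every entry in the table above has the same shape: a name mapped to `threshold(...)`, which wraps a `trigger(...)` and a `reset(...)`. The following is a minimal sketch of helpers that would produce that shape as plain Ruby hashes. The parameter names (`bucket`, `stat`, `metric`, `operator`, `value`, `duration`, `numocc`) and the `condition` helper are assumptions made for illustration; this page does not define them, and the real helpers may differ.

```ruby
# Hypothetical helpers that mirror the shape of the threshold entries above.
# All parameter names are assumptions; they are not defined on this page.

# Shared builder for the condition that opens (trigger) or closes (reset) an alert.
def condition(operator, value, duration, numocc)
  { 'operator' => operator, 'value' => value,
    'duration' => duration, 'numocc' => numocc }
end

def trigger(operator, value, duration, numocc)
  condition(operator, value, duration, numocc)
end

def reset(operator, value, duration, numocc)
  condition(operator, value, duration, numocc)
end

# bucket: aggregation window, stat: aggregate function, metric: metric name.
def threshold(bucket, stat, metric, trigger_cond, reset_cond)
  { 'bucket' => bucket, 'stat' => stat, 'metric' => metric,
    'trigger' => trigger_cond, 'reset' => reset_cond }
end

# The default CPU Load entry, written exactly as in the table.
thresholds = {
  'HighLoad' => threshold('1m', 'avg', 'load5',
                          trigger('>=', 30, 3, 1), reset('<', 15, 1, 1))
}

high_load = thresholds['HighLoad']
puts "#{high_load['metric']} #{high_load['trigger']['operator']} #{high_load['trigger']['value']}"
# prints: load5 >= 30
```

Reading a default as a nested hash like this makes it easier to see which number to change when tuning a threshold: the second argument of `trigger` is the level that raises the alert, and the second argument of `reset` is the level that clears it.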

Volume /app Thresholds

Monitor Type	Resource Name	Threshold Definition	Description	Action
Filesystem /app	volume		`'LowDiskSpaceCritical' => threshold('1m', 'avg', 'space_used', trigger('>=', 90, 5, 2), reset('<', 85, 5, 1))` Volume has low disk space when space_used reaches 90% or more on the /app volume. `'LowDiskInodeCritical' => threshold('1m', 'avg', 'inode_used', trigger('>=', 90, 5, 2), reset('<', 85, 5, 1))` Volume is low on inodes when inode_used reaches 90% or more on the /app volume.	Notify only. No action.
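
The disk thresholds use different levels for trigger (`>=` 90) and reset (`<` 85), so an alert that has opened stays open until usage falls clearly below the trigger level. The sketch below illustrates only that gap between the two levels. It is a simplification: the class name `HysteresisAlert` is hypothetical, one sample per bucket is assumed, and the last two arguments of `trigger` and `reset` are ignored.

```ruby
# Minimal alert state tracker with separate trigger and reset levels.
# Assumptions: one averaged sample per bucket; the last two arguments of
# trigger(...) and reset(...) are not modeled here.
class HysteresisAlert
  attr_reader :active

  def initialize(trigger_op, trigger_value, reset_op, reset_value)
    @trigger_op = trigger_op
    @trigger_value = trigger_value
    @reset_op = reset_op
    @reset_value = reset_value
    @active = false
  end

  # Feed one sample; returns true while the alert is open.
  def observe(sample)
    if @active
      # An open alert closes only when the reset condition holds.
      @active = false if sample.public_send(@reset_op, @reset_value)
    elsif sample.public_send(@trigger_op, @trigger_value)
      @active = true
    end
    @active
  end
end

# LowDiskSpaceCritical: trigger('>=', 90, ...), reset('<', 85, ...)
alert = HysteresisAlert.new('>=', 90, '<', 85)
states = [80, 91, 88, 86, 84, 89].map { |pct| alert.observe(pct) }
p states
# => [false, true, true, true, false, false]
```

In this run the alert opens at 91%, stays open at 88% and 86% because neither is below 85, and closes at 84%. The final 89% sample does not reopen it because it is still below the trigger level.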

Tomcat Thresholds

Monitor Type	Resource Name	Description	Action
Tomcat process	tomcat-daemon	`'TomcatDaemonProcessDown' => threshold('1m', 'avg', 'up', trigger('<=', 98, 1, 1), reset('>', 95, 1, 1))` The tomcat daemon process is considered down when its availability (`up`) drops to 98% or below. Although the threshold is expressed as a percentage, in practice it fires when the process no longer exists. Do not change the average values to 100%.	Notify only. No action.
JvmInfo	tomcat	`'HighMemUse' => threshold('1m','avg', 'percentUsed',trigger('>=',90,5,1),reset('<',85,5,1))` Note: Values are calculated from http://localhost:#{port}/manager/status?XML=true
ThreadInfo	tomcat	`'HighThreadUse' => threshold('5m','avg','percentBusy',trigger('>=',90,5,1),reset('<',85,5,1))` Note: Values are calculated from http://localhost:#{port}/manager/status?XML=true
RequestInfo	tomcat	No Threshold defined. Note: Values are calculated from http://localhost:#{port}/manager/status?XML=true
Log	tomcat	`'CriticalLogException' => threshold('15m', 'avg', 'logtomcat_criticals', trigger('>=', 1, 15, 1), reset('<', 1, 15, 1))`
AppVersion	tomcat

Artifact App – App-Specific Thresholds

Monitor Type	Resource Name	Threshold Definition	Description	Action
Exception Monitoring	artifact Level	Log path: `/log/logmon/logmon.log`. Pattern to look for: Exception. Threshold: 1 (alert on every occurrence). Severity: Major; Critical if more than 2.	`'CriticalLogException' => threshold('15m', 'avg', 'logtomcat_criticals', trigger('>=', 1, 15, 1), reset('<', 1, 15, 1)), 'logfile' => '/log/apache-tomcat/catalina.out', 'warningpattern' => 'WARNING', 'criticalpattern' => 'CRITICAL'` The three parameters above define the file to be monitored for warning and critical patterns.	Notify only. No action.
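
The `logfile`, `warningpattern` and `criticalpattern` parameters describe a scan of one file for two patterns. As a rough illustration, the sketch below counts matching lines in a list of strings. The function name `count_log_patterns`, the substring matching, and the sample log lines are all assumptions made for this example; the page does not say how the monitor matches patterns.

```ruby
# Count log lines that contain the warning and critical patterns.
# Assumptions: patterns are plain substrings and each line counts at most
# once per pattern. The real monitor's matching rules are not documented here.
def count_log_patterns(lines, warningpattern:, criticalpattern:)
  counts = { warnings: 0, criticals: 0 }
  lines.each do |line|
    counts[:criticals] += 1 if line.include?(criticalpattern)
    counts[:warnings] += 1 if line.include?(warningpattern)
  end
  counts
end

# Made-up sample lines standing in for the contents of the monitored logfile.
sample = [
  'INFO: Server startup in 5123 ms',
  'WARNING: slow response',
  'CRITICAL: connection pool exhausted'
]

counts = count_log_patterns(sample,
                            warningpattern: 'WARNING',
                            criticalpattern: 'CRITICAL')

# CriticalLogException uses trigger('>=', 1, 15, 1): one critical line is enough.
puts counts[:criticals] >= 1 ? 'CriticalLogException would trigger' : 'ok'
```

With a trigger value of 1, a single matching line is enough to raise the alert, so a broad `criticalpattern` such as `Exception` can be noisy for apps that log handled exceptions.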

Apache Server Thresholds

Monitor Type	Resource Name	Threshold Definition	Description	Action
ServerStatus	Apache		`'TooBusy' => threshold('5m','avg','idle_workers',trigger('<',5,5,5),reset('>',5,5,5)), 'HighUserCpu' => threshold('5m','avg','cpu_user',trigger('>',60,5,1),reset('<',60,5,1)), 'HighSysCpu' => threshold('5m','avg','cpu_sys',trigger('>',30,5,1),reset('<',30,5,1))` Note: All the metrics are calculated using http://localhost:#{port}/server-status	Notify only. No action.
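
To make the `TooBusy` check concrete, the sketch below parses a status report and compares the idle worker count with the trigger level of 5. It assumes the key-value text that Apache's mod_status returns for `/server-status?auto`; this page only mentions `/server-status`, so whether the monitor reads that variant, and the mapping from the `IdleWorkers` field to the `idle_workers` metric, are assumptions. The function names and sample values are hypothetical.

```ruby
# Parse "Key: value" lines, as produced by mod_status for /server-status?auto.
# Assumption: the monitor's idle_workers metric corresponds to IdleWorkers.
def parse_server_status(text)
  text.each_line.with_object({}) do |line, fields|
    key, value = line.strip.split(': ', 2)
    fields[key] = value if key && value
  end
end

# TooBusy uses trigger('<', 5, 5, 5): fewer than 5 idle workers.
def idle_workers_low?(fields, limit = 5)
  fields.fetch('IdleWorkers').to_i < limit
end

# Made-up sample report for illustration.
report = <<~STATUS
  BusyWorkers: 47
  IdleWorkers: 3
STATUS

fields = parse_server_status(report)
puts "idle workers: #{fields['IdleWorkers']}, below limit: #{idle_workers_low?(fields)}"
# prints: idle workers: 3, below limit: true
```

Note that `TooBusy` triggers on `<` 5 and resets on `>` 5, so a reading of exactly 5 neither opens nor closes the alert.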

ActiveMQ Thresholds

Monitor Type	Resource Name	Threshold Definition	Description	Action
BrokerStatus	activemq		Note: Metric values are calculated using queues: `<protocol>://<host>:<port>/admin/xml/queues.jsp` and topics: `<protocol>://<host>:<port>/admin/xml/topics.jsp`
Log	activemq		`'CriticalLogException' => threshold('15m', 'avg', 'logtomcat_criticals', trigger('>=', 1, 15, 1), reset('<', 1, 15, 1)), 'logfile' => '/opt/apache-activemq-5.5.1/data/wrapper.log', 'warningpattern' => 'OutOfMemory', 'criticalpattern' => 'OutOfMemory'` The three parameters above define the file to be monitored for warning and critical patterns. Log Path: `/log/logmon/logmon.log` Pattern to look for: Exception.	Notify only. No action.
Memory	activemq	No threshold defined	`'protocol' => 'http', 'port' => '8161', 'path' => '/admin/index.jsp?printable=true'` Note: Metric values are calculated using `<protocol>://<host>:<port>/admin/index.jsp?printable=true`	Notify only. No action.
Process	Daemon		`'ActiveMQDaemonProcessDown' => threshold('1m', 'avg', 'up', trigger('<=', 98, 1, 1), reset('>', 95, 1, 1))`	Notify only. No action.
