How to View Incidents

Review all downtime events, alert triggers, and outage durations across your servers and monitors.

What is an Incident?

An incident is a recorded period during which a server or monitor was in a failure state. Pinguzo opens an incident when:

A monitor check fails — the target URL is down, a port is unreachable, a keyword is missing, etc.
A server goes offline — the Pinguzo Agent stops reporting for several consecutive minutes
A metric threshold is exceeded — CPU, memory, disk, or load passes the threshold set in an alert policy

An incident is automatically resolved when the condition clears — the check passes again, the server comes back online, or the metric drops below the threshold.

Cross-edge verification: Before opening a monitor incident, the detecting edge server sends a spot-check request to one or more peer edge servers, which independently run the same check. An incident is only opened if the peer(s) also confirm the failure. This prevents false positives from edge-specific network issues.
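The verification rule above can be sketched in a few lines. This is only an illustration, not Pinguzo's actual implementation: the function name `should_open_incident` is hypothetical, and the sketch assumes that every queried peer must confirm the failure and that no incident is opened when no peer responds.

```python
from typing import Sequence


def should_open_incident(local_check_failed: bool, peer_failures: Sequence[bool]) -> bool:
    """Decide whether a monitor incident should be opened.

    local_check_failed: result from the detecting edge server.
    peer_failures: one entry per peer edge, True if that peer also saw the failure.
    """
    if not local_check_failed:
        return False
    if not peer_failures:
        # Assumption: with no peer result at all, stay conservative and do not open.
        return False
    # Assumption: all queried peers must agree with the detecting edge.
    return all(peer_failures)
```

A failure seen only by the detecting edge, for example `should_open_incident(True, [False])`, is treated as an edge-specific problem and does not open an incident.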

Opening the Incidents Page

Click Incidents in the left sidebar. The page aggregates incident data from all of your edge servers and displays it in a unified timeline.

Filtering Incidents

Use the filter controls at the top of the page to narrow down the list:

Status Filter

Option	Shows
All	Every incident, open and resolved
Open	Incidents that are still ongoing (the condition has not cleared)
Resolved	Incidents that have ended

Type Filter

Option	Shows
All Types	Server incidents and monitor incidents
Servers	Only incidents related to server metrics (offline, high CPU, etc.)
Monitors	Only incidents related to uptime checks (HTTP errors, ping failures, etc.)
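The two filters combine: an incident is listed only if it matches both the selected status and the selected type. The sketch below shows that logic on an in-memory list. The `Incident` fields and the `filter_incidents` function are hypothetical names chosen for illustration, and an incident is assumed to be open while its `recovered_at` value is empty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Incident:
    resource: str
    resource_type: str  # "server" or "monitor" (assumed values)
    trigger: str
    recovered_at: Optional[str] = None  # None while the incident is still open


def filter_incidents(
    incidents: List[Incident], status: str = "all", resource_type: str = "all"
) -> List[Incident]:
    """Apply the Status and Type filters; "all" disables a filter."""
    result = []
    for inc in incidents:
        is_open = inc.recovered_at is None
        if status == "open" and not is_open:
            continue
        if status == "resolved" and is_open:
            continue
        if resource_type != "all" and inc.resource_type != resource_type:
            continue
        result.append(inc)
    return result
```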

Incident Table Columns

Column	Description
Resource	The server or monitor that triggered the incident. Click the name to open the metrics detail page for that resource.
Trigger	The specific type of failure (see Trigger Types below).
Message	A human-readable description of what went wrong (e.g., "HTTP 503 Service Unavailable", "CPU usage 94.2% for 12 minutes").
Went Down	Timestamp when the incident started (shown in your local timezone).
Recovered	Timestamp when the incident resolved. Shows "—" for open incidents.
Duration	Total downtime duration. For open incidents, this updates live while you view the page.
Status	Open or Resolved
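The Duration column follows directly from the two timestamps: for a resolved incident it is Recovered minus Went Down, and for an open incident the current time stands in for the missing recovery time. A minimal sketch, with hypothetical function names and an assumed display format:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def incident_duration(
    went_down: datetime, recovered: Optional[datetime], now: Optional[datetime] = None
) -> timedelta:
    """Elapsed downtime; open incidents are measured up to `now`."""
    if recovered is not None:
        end = recovered
    else:
        end = now if now is not None else datetime.now(timezone.utc)
    return end - went_down


def format_duration(d: timedelta) -> str:
    """Render a duration as hours, minutes, and seconds (format is an assumption)."""
    total = int(d.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}h {minutes}m {seconds}s"
```

Recomputing `incident_duration` with a fresh `now` on each refresh gives the live-updating behavior described for open incidents.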

Trigger Types

Monitor Triggers

Trigger	Meaning
`http_error`	The HTTP request returned a non-2xx status code (e.g., 404, 500, 503)
`https_error`	The HTTPS request failed or returned a non-2xx status code
`ssl_error`	SSL certificate is invalid, expired, or the handshake failed
`ping_failed`	ICMP echo request received no reply — host is unreachable
`port_unreachable`	TCP connection to the target port was refused or timed out
`keyword_mismatch`	The page loaded (HTTP 200) but the expected keyword was not found in the response body
`dns_failed`	DNS resolution of the hostname failed

Server Triggers

Trigger	Meaning
`offline`	The Pinguzo Agent stopped reporting; the server may be down or unreachable
`high_cpu`	CPU usage exceeded the configured threshold for the specified duration
`high_memory`	Memory usage exceeded the configured threshold for the specified duration
`high_disk`	Disk usage exceeded the configured threshold for the specified duration
`high_load`	Load-per-core exceeded the configured threshold for the specified duration
`no_data`	No data received from the agent for longer than the configured no-data timeout
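The metric triggers fire only when a value stays above the threshold for the specified duration, so a brief spike does not open an incident. The sketch below illustrates one way to evaluate that condition. The function name is hypothetical, and it assumes that "exceeded" means strictly greater than the threshold and that any sample at or below the threshold resets the timer.

```python
from typing import Optional, Sequence, Tuple


def exceeded_for_duration(
    samples: Sequence[Tuple[float, float]], threshold: float, duration_s: float
) -> bool:
    """Return True if the metric stayed above `threshold` for at least `duration_s`.

    samples: (unix_timestamp_seconds, value) pairs, oldest first.
    """
    breach_start: Optional[float] = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # first sample of a new breach
            if ts - breach_start >= duration_s:
                return True
        else:
            breach_start = None  # the metric recovered; start over
    return False
```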

Pagination

The incidents table shows 20 incidents per page. Use the Previous / Next buttons at the bottom to navigate. The total number of incidents matching your current filters is displayed above the table.
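With a fixed page size of 20, the number of pages and the rows on a given page follow from simple arithmetic. The helper names below are hypothetical and pages are assumed to be numbered from 1.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")
PAGE_SIZE = 20  # incidents per page, as stated above


def page_count(total_incidents: int) -> int:
    """Number of pages needed; an empty list still shows one (empty) page."""
    return max(1, math.ceil(total_incidents / PAGE_SIZE))


def page_slice(items: Sequence[T], page: int) -> List[T]:
    """Rows shown on a 1-based page number."""
    start = (page - 1) * PAGE_SIZE
    return list(items[start:start + PAGE_SIZE])
```

For example, 45 matching incidents span three pages, and the third page holds the last five.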

How Incidents Relate to Alerts

An incident is a recorded event. An alert is a policy that decides when to create an incident and who to notify. You can have incidents without alert policies (they still appear in the incidents list), but you will only receive notifications if you have a matching alert policy with contacts configured.

See Configure Alerts to set up notification policies and Configure Contacts to add notification channels.

Incident Lifecycle

Detection: An edge server detects a failure during a check cycle
Verification: Peer edge(s) independently confirm the failure (monitor incidents only)
Incident created: A new incident record is opened in the database
Notifications sent: Matching alert policies trigger contact notifications (email, Slack, Discord, Telegram, webhook)
Condition monitored: Subsequent checks continue; the incident remains open while the failure persists
Recovery detected: The check passes (or metric drops below threshold, or agent reports again)
Incident resolved: The incident is closed with a `recovered_at` timestamp
Recovery notification sent: Contacts configured with recovery notifications are alerted
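The lifecycle above behaves like a small state machine driven by check results. The sketch below is an illustration under stated assumptions, not the product's implementation: `IncidentTracker`, `on_check`, and the event names are hypothetical, and callers are assumed to pass `peers_confirm=True` for server incidents, which skip peer verification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Incident:
    resource: str
    trigger: str
    started_at: datetime
    recovered_at: Optional[datetime] = None

    @property
    def duration(self) -> Optional[timedelta]:
        """Downtime of a resolved incident, or None while it is still open."""
        if self.recovered_at is None:
            return None
        return self.recovered_at - self.started_at


class IncidentTracker:
    """Tracks at most one open incident for a single resource and trigger."""

    def __init__(self, resource: str, trigger: str) -> None:
        self.resource = resource
        self.trigger = trigger
        self.current: Optional[Incident] = None
        self.history: List[Incident] = []

    def on_check(self, failed: bool, peers_confirm: bool, now: datetime) -> List[str]:
        """Process one check cycle and return the lifecycle events it caused."""
        if self.current is None:
            # Detection and verification: open only on a confirmed failure.
            if failed and peers_confirm:
                self.current = Incident(self.resource, self.trigger, started_at=now)
                return ["incident_created", "notifications_sent"]
            return []
        if failed:
            return []  # condition monitored: the incident stays open
        # Recovery detected: close the incident and record the timestamp.
        self.current.recovered_at = now
        self.history.append(self.current)
        self.current = None
        return ["incident_resolved", "recovery_notification_sent"]
```

Repeated failing checks after the incident is created return no events, which mirrors the fact that an ongoing failure keeps one incident open rather than creating new ones.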

Next Steps

Configure Alerts — define when to open incidents and who to notify
Configure Contacts — add email, Slack, Discord, Telegram, or webhook notification channels
View Monitor Metrics — investigate the response time leading up to an incident
View Server Metrics — investigate CPU or memory spikes that triggered a server incident