How to View Incidents
Review all downtime events, alert triggers, and outage durations across your servers and monitors.
What is an Incident?
An incident is a recorded period during which a server or monitor was in a failure state. Pinguzo opens an incident when:
- A monitor check fails — the target URL is down, a port is unreachable, a keyword is missing, etc.
- A server goes offline — the Pinguzo Agent stops reporting for several consecutive minutes
- A metric threshold is exceeded — CPU, memory, disk, or load passes the threshold set in an alert policy
An incident is automatically resolved when the condition clears — the check passes again, the server comes back online, or the metric drops below the threshold.
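The open-and-resolve behavior described above can be sketched as a small state update. This is only an illustration, not Pinguzo's implementation: the `Incident` class, its fields, and the `update_incident` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    # Hypothetical record shape; the real schema is not documented here.
    resource: str
    trigger: str
    is_open: bool = True


def update_incident(current: Optional[Incident], resource: str,
                    trigger: str, failing: bool) -> Optional[Incident]:
    """Open an incident when a failure starts; resolve it when the condition clears."""
    if failing:
        if current is None or not current.is_open:
            # A new failure period begins: open a fresh incident.
            return Incident(resource, trigger)
        return current  # still failing: the incident stays open
    if current is not None and current.is_open:
        current.is_open = False  # condition cleared: resolve automatically
    return current
```

Calling `update_incident` once per check result keeps at most one open incident per resource and trigger in this sketch.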
Opening the Incidents Page
Click Incidents in the left sidebar. The page aggregates incident data from all of your edge servers and displays it in a unified timeline.
Filtering Incidents
Use the filter controls at the top of the page to narrow down the list:
Status Filter
| Option | Shows |
|---|---|
| All | Every incident, open and resolved |
| Open | Incidents that are still ongoing (the condition has not cleared) |
| Resolved | Incidents that have ended |
Type Filter
| Option | Shows |
|---|---|
| All Types | Server incidents and monitor incidents |
| Servers | Only incidents related to server metrics (offline, high CPU, etc.) |
| Monitors | Only incidents related to uptime checks (HTTP errors, ping failures, etc.) |
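To make the way the two filters combine concrete, here is a minimal sketch that applies both to a list of incident records. The dictionary keys (`recovered_at`, `resource_type`) and the filter values are assumptions made for the example, not a documented Pinguzo data format.

```python
from typing import Any, Dict, List


def filter_incidents(incidents: List[Dict[str, Any]],
                     status: str = "all",
                     resource_type: str = "all") -> List[Dict[str, Any]]:
    """Keep incidents matching both the status filter and the type filter."""
    result = []
    for inc in incidents:
        # Assumption: an incident is open while it has no recovery timestamp.
        is_open = inc["recovered_at"] is None
        if status == "open" and not is_open:
            continue
        if status == "resolved" and is_open:
            continue
        if resource_type != "all" and inc["resource_type"] != resource_type:
            continue
        result.append(inc)
    return result
```

Both filters are applied together, so selecting Open and Monitors would return only monitor incidents that have not yet recovered.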
Incident Table Columns
| Column | Description |
|---|---|
| Resource | The server or monitor that triggered the incident. Click the name to open the metrics detail page for that resource. |
| Trigger | The specific type of failure (see Trigger Types below). |
| Message | A human-readable description of what went wrong (e.g., "HTTP 503 Service Unavailable", "CPU usage 94.2% for 12 minutes"). |
| Went Down | Timestamp when the incident started (shown in your local timezone). |
| Recovered | Timestamp when the incident resolved. Shows "—" for open incidents. |
| Duration | Total downtime duration. For open incidents, this updates live while you view the page. |
| Status | Open or Resolved |
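The Duration column can be thought of as the difference between the Recovered and Went Down timestamps, with the current time standing in for Recovered while the incident is open. The sketch below assumes that interpretation; the function names and the output format are illustrative only.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def incident_duration(went_down: datetime,
                      recovered: Optional[datetime],
                      now: Optional[datetime] = None) -> timedelta:
    """Return the downtime so far; open incidents are measured up to `now`."""
    if recovered is not None:
        return recovered - went_down
    if now is None:
        now = datetime.now(timezone.utc)
    return now - went_down


def format_duration(duration: timedelta) -> str:
    """Render a duration as hours, minutes, and seconds (format is an assumption)."""
    total = int(duration.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h {minutes}m {seconds}s"
```

Recomputing `incident_duration` on a timer with a fresh `now` is one way a page could show a live-updating value for open incidents.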
Trigger Types
Monitor Triggers
| Trigger | Meaning |
|---|---|
| http_error | The HTTP request returned a non-2xx status code (e.g., 404, 500, 503) |
| https_error | The HTTPS request failed or returned a non-2xx status code |
| ssl_error | SSL certificate is invalid, expired, or the handshake failed |
| ping_failed | ICMP echo request received no reply; the host is unreachable |
| port_unreachable | TCP connection to the target port was refused or timed out |
| keyword_mismatch | The page loaded (HTTP 200) but the expected keyword was not found in the response body |
| dns_failed | DNS resolution of the hostname failed |
Server Triggers
| Trigger | Meaning |
|---|---|
| offline | The Pinguzo Agent stopped reporting; the server may be down or unreachable |
| high_cpu | CPU usage exceeded the configured threshold for the specified duration |
| high_memory | Memory usage exceeded the configured threshold for the specified duration |
| high_disk | Disk usage exceeded the configured threshold for the specified duration |
| high_load | Load-per-core exceeded the configured threshold for the specified duration |
| no_data | No data received from the agent for longer than the configured no-data timeout |
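If you process incident data in your own scripts, the trigger names in the two tables above can be grouped by resource type. The following is a sketch under that assumption; the trigger strings come from the tables, while the set names and the `resource_type_for` helper are hypothetical.

```python
# Trigger names as listed in the Monitor Triggers and Server Triggers tables.
MONITOR_TRIGGERS = frozenset({
    "http_error", "https_error", "ssl_error", "ping_failed",
    "port_unreachable", "keyword_mismatch", "dns_failed",
})
SERVER_TRIGGERS = frozenset({
    "offline", "high_cpu", "high_memory", "high_disk", "high_load", "no_data",
})


def resource_type_for(trigger: str) -> str:
    """Map a trigger name to the kind of resource that can raise it."""
    if trigger in MONITOR_TRIGGERS:
        return "monitor"
    if trigger in SERVER_TRIGGERS:
        return "server"
    raise ValueError(f"unknown trigger: {trigger}")
```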
Pagination
The incidents table shows 20 incidents per page. Use the Previous / Next buttons at the bottom to navigate. The total number of incidents matching your current filters is displayed above the table.
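As a worked example of the page arithmetic, the sketch below computes how many pages a filtered result set occupies and which rows land on a given page. Only the page size of 20 comes from the text above; the function names are illustrative, and showing a single empty page for zero results is an assumption.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")

PAGE_SIZE = 20  # incidents per page, as stated above


def page_count(total_incidents: int) -> int:
    """Number of pages needed; an empty result still counts as one page here."""
    return max(1, math.ceil(total_incidents / PAGE_SIZE))


def page_slice(items: Sequence[T], page: int) -> List[T]:
    """Return the items shown on a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * PAGE_SIZE
    return list(items[start:start + PAGE_SIZE])
```

For instance, 45 matching incidents span three pages, and the third page holds the last five.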
How Incidents Relate to Alerts
An incident is a recorded event. An alert is a policy that decides when to create an incident and who to notify. You can have incidents without alert policies (they still appear in the incidents list), but you will only receive notifications if you have a matching alert policy with contacts configured.
See Configure Alerts to set up notification policies and Configure Contacts to add notification channels.
Incident Lifecycle
- Detection: An edge server detects a failure during a check cycle
- Verification: Peer edge(s) independently confirm the failure (monitor incidents only)
- Incident created: A new incident record is opened in the database
- Notifications sent: Matching alert policies trigger contact notifications (email, Slack, Discord, Telegram, webhook)
- Condition monitored: Subsequent checks continue; the incident remains open while the failure persists
- Recovery detected: The check passes (or metric drops below threshold, or agent reports again)
- Incident resolved: The incident is closed with a recovered_at timestamp
- Recovery notification sent: Contacts configured with recovery notifications are alerted
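The lifecycle stages above can be modeled as an ordered enumeration, with the verification stage skipped for server incidents because peer confirmation applies to monitor incidents only. This is a conceptual sketch; the `Stage` enum and `lifecycle` function are hypothetical and do not reflect Pinguzo's internal code.

```python
from enum import Enum
from typing import List


class Stage(Enum):
    # Members are declared in the order the stages occur.
    DETECTION = "detection"
    VERIFICATION = "verification"
    INCIDENT_CREATED = "incident_created"
    NOTIFICATIONS_SENT = "notifications_sent"
    CONDITION_MONITORED = "condition_monitored"
    RECOVERY_DETECTED = "recovery_detected"
    INCIDENT_RESOLVED = "incident_resolved"
    RECOVERY_NOTIFICATION_SENT = "recovery_notification_sent"


def lifecycle(resource_type: str) -> List[Stage]:
    """Return the ordered stages for a 'monitor' or 'server' incident."""
    stages = list(Stage)  # Enum iteration preserves declaration order
    if resource_type != "monitor":
        # Peer verification applies to monitor incidents only.
        stages.remove(Stage.VERIFICATION)
    return stages
```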
Next Steps
- Configure Alerts — define when to open incidents and who to notify
- Configure Contacts — add email, Slack, or webhook notification channels
- View Monitor Metrics — investigate the response time leading up to an incident
- View Server Metrics — investigate CPU or memory spikes that triggered a server incident