IT On-Call Policy Template [Free] —...

On-call burnout is a leading reason IT operations staff quit. A poorly designed on-call program leads to exhausted engineers, missed incidents, and attrition that can cost $50,000-$150,000 per replacement hire. A well-structured IT on-call policy addresses this by setting clear expectations for rotation schedules, response times, escalation paths, and compensation. This guide provides a complete, ready-to-use on-call policy template. For additional IT management resources, visit our IT Policy Templates guide and IT Management Policies.

Quick Start: Download our free IT On-Call Policy Template — covers rotation design, escalation procedures, compensation guidelines, and response time SLAs. Customize for your team in under an hour.

What Is an IT On-Call Policy?

An IT on-call policy is a formal document that defines the rules, expectations, and procedures for staff who are designated to respond to IT incidents outside of normal business hours. It covers who is on call, when, how they're contacted, what they're expected to do, and how they're compensated.

Why You Need a Formal On-Call Policy

Without a Policy	With a Policy
Same people always get called	Fair rotation ensures equitable distribution
No clear response time expectations	Defined SLAs by severity level
Engineers don't know when they're "off"	Clear on/off boundaries reduce burnout
Compensation is inconsistent or nonexistent	Fair pay for after-hours work
Escalation is chaotic ("just call whoever")	Structured escalation paths with backup
Burnout and attrition	Sustainable, predictable program

IT On-Call Policy Template

1. Policy Overview

IT ON-CALL POLICY
Version: 1.0
Effective Date: [Date]
Policy Owner: [IT Director / VP of Engineering]
Approved By: [CTO / VP of IT]

PURPOSE:
This policy establishes the framework for IT on-call coverage,
ensuring critical systems are monitored and incidents are resolved
outside of normal business hours. It defines rotation schedules,
response expectations, escalation procedures, and compensation.

SCOPE:
This policy applies to all IT staff who participate in on-call
rotations, including:
- Infrastructure and operations engineers
- Site reliability engineers (SRE)
- Database administrators
- Network engineers
- Security operations staff
- Application support engineers

DEFINITIONS:
- On-call: Designated period during which an engineer must be reachable
  and able to respond to incidents within the defined SLA
- Primary on-call: First responder for all incoming alerts
- Secondary on-call: Backup who is engaged if primary doesn't respond
- Escalation: Process of involving additional personnel when an
  incident exceeds the responder's ability or authority

2. On-Call Rotation Schedules

Rotation Models

Model	How It Works	Best For	Drawbacks
Weekly rotation	One person covers 7 days, then rotates	Small teams (3-5 people)	Full week can be exhausting
Follow-the-sun	Shifts align with local working hours in each time zone, so no one works overnight	Distributed teams across 3+ time zones	Requires global headcount
Split week	Weekdays (Mon-Fri) and weekend (Sat-Sun) are separate shifts	Teams whose weekend load differs significantly from weekdays	More handoffs
Day/night split	Day shift (8am-8pm) and night shift (8pm-8am)	High-volume environments	Requires larger team
Bi-weekly rotation	Two-week rotation with primary and secondary	Medium teams (5-10 people)	Long rotation period

Sample Weekly Rotation Schedule

Week	Primary On-Call	Secondary On-Call	Shift
Feb 10-16	Engineer A	Engineer B	Mon 9am - Mon 9am
Feb 17-23	Engineer B	Engineer C	Mon 9am - Mon 9am
Feb 24-Mar 2	Engineer C	Engineer D	Mon 9am - Mon 9am
Mar 3-9	Engineer D	Engineer E	Mon 9am - Mon 9am
Mar 10-16	Engineer E	Engineer A	Mon 9am - Mon 9am
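
The sample schedule above follows a simple round-robin pattern: each week's secondary becomes the next week's primary. A minimal sketch of that pattern in Python is shown here. The function name `build_rotation` and the year 2025 (chosen because February 10 falls on a Monday that year) are assumptions for illustration, not part of the template.

```python
from datetime import date, timedelta

def build_rotation(engineers, start_monday, weeks):
    """Return (week_start, primary, secondary) tuples for a round-robin rotation."""
    if len(engineers) < 2:
        raise ValueError("need at least two engineers for primary and secondary")
    schedule = []
    for i in range(weeks):
        week_start = start_monday + timedelta(weeks=i)
        primary = engineers[i % len(engineers)]
        # The secondary is next in line, so they become primary the following week.
        secondary = engineers[(i + 1) % len(engineers)]
        schedule.append((week_start, primary, secondary))
    return schedule

team = ["Engineer A", "Engineer B", "Engineer C", "Engineer D", "Engineer E"]
schedule = build_rotation(team, date(2025, 2, 10), weeks=5)  # assumed year
for week_start, primary, secondary in schedule:
    print(week_start.isoformat(), primary, secondary)
```

In practice a paging tool would usually own the schedule; a script like this is mainly useful for drafting or sanity-checking a rotation before entering it.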

Rotation Rules

Minimum team size: 4 people for weekly rotation (ensures no more than 1 week in 4 on call)
Maximum on-call frequency: No more than 1 week in 4 (25% of time)
Holiday coverage: Rotated separately and equitably tracked over the year
Swap policy: Engineers may swap shifts with manager approval and 48-hour notice
New hire exemption: New team members shadow on-call for 2 rotations before going primary
Consecutive limit: No engineer should be primary on-call for more than 7 consecutive days
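
One way to audit a planned rotation against a frequency rule is to count each engineer's share of primary weeks. The sketch below assumes a cap of 1 week in 4, matching the minimum team size rule; the function name `over_limit`, the `max_share` parameter, and the sample data are hypothetical, and the cap can be adjusted to whatever your policy states.

```python
from collections import Counter
from fractions import Fraction

def over_limit(primaries, max_share=Fraction(1, 4)):
    """Return engineers whose share of primary weeks exceeds max_share."""
    counts = Counter(primaries)
    total = len(primaries)
    return sorted(name for name, n in counts.items()
                  if Fraction(n, total) > max_share)

# Eight weeks of primary assignments for a hypothetical four-person team.
weeks = ["A", "B", "C", "D", "A", "B", "A", "C"]
print(over_limit(weeks))  # A covers 3 of 8 weeks, which is above 1 in 4
```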

3. Response Time SLAs

Severity	Definition	Acknowledgment	Response	Resolution Target
P1 — Critical	Complete service outage, data breach, revenue-impacting	5 minutes	15 minutes to begin working	1 hour (mitigate), 4 hours (resolve)
P2 — High	Major feature degraded, significant user impact	15 minutes	30 minutes to begin working	4 hours
P3 — Medium	Minor feature impacted, workaround available	30 minutes	1 hour to begin working	Next business day
P4 — Low	Informational alert, no user impact	1 hour	Next business day	Scheduled maintenance window
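
The acknowledgment column of the table above can be expressed as data, which makes it easy to check whether a page has breached its window. This is a minimal sketch: the names `ACK_SLA` and `ack_breached` are illustrative, and a real paging tool would normally enforce these timers for you.

```python
from datetime import datetime, timedelta

# Acknowledgment windows taken from the SLA table.
ACK_SLA = {
    "P1": timedelta(minutes=5),
    "P2": timedelta(minutes=15),
    "P3": timedelta(minutes=30),
    "P4": timedelta(hours=1),
}

def ack_breached(severity, paged_at, now, acked_at=None):
    """True if the page was not acknowledged within the severity's window."""
    deadline = paged_at + ACK_SLA[severity]
    if acked_at is not None:
        return acked_at > deadline
    # Not acknowledged yet: breached once the deadline has passed.
    return now > deadline

paged = datetime(2025, 1, 1, 2, 0)
six_minutes_later = paged + timedelta(minutes=6)
print(ack_breached("P1", paged, six_minutes_later))  # past the 5-minute window
print(ack_breached("P2", paged, six_minutes_later))  # still inside 15 minutes
```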

Response expectations during on-call:

Must be reachable by phone and paging system at all times during shift
Must be able to access a laptop and VPN within 15 minutes of page
Must not be more than 15 minutes from a reliable internet connection, so that the 15-minute laptop and VPN requirement above can be met
Alcohol consumption must not impair ability to respond and troubleshoot
If temporarily unreachable (medical appointment, flight), secondary must be notified

4. Escalation Procedures

Escalation Matrix

Trigger	Action	Who to Escalate To	Timeline
Primary doesn't acknowledge within SLA	Auto-page secondary on-call	Secondary engineer	Automatic after SLA breach
Secondary doesn't acknowledge	Page on-call manager	Engineering manager	10 minutes after secondary SLA
P1 incident lasting over 30 minutes	Notify management	Engineering manager + IT Director	30 minutes into incident
P1 incident lasting over 1 hour	Executive notification	VP of Engineering / CTO	1 hour into incident
Incident requires cross-team support	Page relevant team's on-call	Other team's primary on-call	As needed
Incident has customer impact	Notify customer success	CS on-call + Communications	Immediately upon confirmed impact

Escalation Procedure

ESCALATION FLOW

1. Alert fires → Primary on-call paged
   ├── Acknowledged within SLA → Primary investigates
   │   ├── Resolved → Document and close
   │   └── Cannot resolve alone → Escalate to secondary/specialist
   └── NOT acknowledged within SLA → Secondary auto-paged
       ├── Acknowledged → Secondary investigates
       └── NOT acknowledged → Manager paged (phone call)

2. For P1 incidents:
   - 15 min: Primary begins investigation
   - 30 min: If unresolved, notify engineering manager and IT Director
   - 60 min: If unresolved, notify VP of Engineering / CTO
   - 90 min: If unresolved, incident commander assigned, war room opened
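
The P1 timeline above can be sketched as a small lookup that returns which steps should already have happened for an unresolved incident. The action labels borrow the wording of the escalation matrix, and the names `P1_TIMELINE` and `due_steps` are hypothetical.

```python
# (minutes into the incident, action) pairs for an unresolved P1.
P1_TIMELINE = [
    (15, "primary begins investigation"),
    (30, "notify management"),
    (60, "executive notification"),
    (90, "assign incident commander and open war room"),
]

def due_steps(minutes_elapsed):
    """Return the P1 steps that should have happened by now."""
    return [action for minute, action in P1_TIMELINE
            if minutes_elapsed >= minute]

for action in due_steps(45):
    print(action)
```

Keeping the thresholds in one list means a change to the policy only has to be made in one place.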

5. On-Call Compensation

On-call compensation varies by company and region. Here are the most common models:

Compensation Models

Model	How It Works	Typical Range	Best For
Flat on-call stipend	Fixed amount per on-call shift regardless of pages	$200-$500/week	Predictable cost, low-volume paging
Hourly rate for active response	Base stipend + hourly pay when actively working incidents	$50-$150/hour (active) + $100-$200/week (standby)	High-volume environments
Comp time (time off in lieu)	No extra pay, but earn time off for on-call work	1:1 or 1.5:1 ratio	Startups, budget-constrained
Percentage of salary	On-call pay as % of base salary	5-15% of base salary while on call	Simplicity, salary-proportional
Hybrid	Flat stipend + additional pay per page/incident	$200/week + $50/incident	Balanced incentive

Sample Compensation Schedule

On-Call Period	Standby Pay	Active Incident Pay	Holiday Multiplier
Weeknight (6pm-8am)	$50/night	$75/hour	N/A
Weekend day (8am-6pm)	$75/day	$75/hour	N/A
Weekend night (6pm-8am)	$75/night	$100/hour	N/A
Holiday (full day)	$150/day	$100/hour	1.5x standby
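
As a worked example, the rates in the sample schedule can be combined into a weekly total. The sketch below assumes a non-holiday week with 5 weeknights, 2 weekend days, and 2 weekend nights of standby, plus a hypothetical 2 hours of weeknight incident work and 1 hour of weekend-night incident work. Holiday pay is left out because how the multiplier applies depends on your policy.

```python
# Rates from the sample compensation schedule (USD).
STANDBY = {"weeknight": 50, "weekend_day": 75, "weekend_night": 75}
ACTIVE_HOURLY = {"weeknight": 75, "weekend_day": 75, "weekend_night": 100}

def weekly_pay(periods, active_hours):
    """Sum standby pay per period and hourly pay for active incident work."""
    standby = sum(STANDBY[p] * n for p, n in periods.items())
    active = sum(ACTIVE_HOURLY[p] * h for p, h in active_hours.items())
    return standby + active

periods = {"weeknight": 5, "weekend_day": 2, "weekend_night": 2}  # assumed week
active = {"weeknight": 2, "weekend_night": 1}                     # assumed hours
print(weekly_pay(periods, active))  # 550 standby + 250 active = 800
```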

Legal considerations:

Non-exempt (hourly) employees must be paid for all time spent responding to incidents per FLSA
Employers may not be legally required to pay exempt employees extra for on-call work, but best practice is to compensate them
Some states have specific on-call pay requirements — check your state labor laws
On-call time where the employee cannot use the time freely may count as hours worked under FLSA

6. On-Call Tooling Requirements

Tool Category	Purpose	Examples
Paging/alerting	Route alerts to on-call engineer	PagerDuty, Opsgenie, Splunk On-Call (formerly VictorOps)
Monitoring	Detect incidents automatically	Datadog, New Relic, Prometheus/Grafana
Communication	Coordinate during incidents	Slack (incident channel), Microsoft Teams, phone bridge
Runbooks	Step-by-step resolution guides	Confluence, Notion, internal wiki
Incident management	Track and document incidents	Jira, ServiceNow, Rootly, FireHydrant
VPN/Remote access	Access production systems remotely	Company VPN, SSH bastion host

Tooling requirements for on-call engineers:

Company-provided phone or phone stipend ($50-$100/month)
Company laptop with VPN configured and tested
Home internet backup plan (mobile hotspot as fallback)
Access to all production monitoring dashboards
Runbooks accessible from mobile device

7. On-Call Health and Sustainability

Burnout Prevention

Metric	Healthy Target	Action if Exceeded
Pages per on-call shift	Fewer than 10/week	Fix noisy alerts, improve automation
After-hours incidents requiring response	Fewer than 3/week	Improve system reliability
Average incident resolution time	Under 30 minutes	Better runbooks, automation
On-call frequency per engineer	No more than 1 week in 4	Hire additional staff
Engineer satisfaction (survey)	Above 3.5/5	Review rotation, compensation, workload
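
The targets in the table above lend themselves to an automated review at the end of each rotation. The following sketch is one possible shape for that check; the metric keys and the `actions_needed` function are assumptions, and "1 week in 4" is represented as an on-call share of 0.25.

```python
import operator

# (metric key, comparison that means "healthy", target, action if unhealthy)
CHECKS = [
    ("pages_per_week", operator.lt, 10, "Fix noisy alerts, improve automation"),
    ("after_hours_incidents_per_week", operator.lt, 3, "Improve system reliability"),
    ("avg_resolution_minutes", operator.lt, 30, "Better runbooks, automation"),
    ("on_call_share", operator.le, 0.25, "Hire additional staff"),
    ("satisfaction", operator.gt, 3.5, "Review rotation, compensation, workload"),
]

def actions_needed(metrics):
    """Return the follow-up actions for every metric outside its healthy target."""
    return [action for key, healthy, target, action in CHECKS
            if not healthy(metrics[key], target)]

sample = {  # hypothetical numbers for one engineer's rotation
    "pages_per_week": 14,
    "after_hours_incidents_per_week": 2,
    "avg_resolution_minutes": 25,
    "on_call_share": 0.25,
    "satisfaction": 3.2,
}
for action in actions_needed(sample):
    print(action)
```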

After On-Call Recovery

Engineers receive a recovery half-day off after a week of on-call (or a full day if they handled more than 5 incidents)
If called during sleeping hours (midnight-6am), the engineer may start the next workday late
Managers should check in with engineers after particularly heavy on-call weeks
Track cumulative on-call burden per engineer per quarter — adjust if inequitable

8. Runbook Requirements

Every service covered by on-call must have a runbook. No engineer should be paged for a service they don't have documentation for.

Minimum runbook contents:

SERVICE RUNBOOK: [Service Name]
Last Updated: [Date]
Owner: [Team/Engineer]

OVERVIEW:
- What this service does
- Who it serves (internal/external)
- Business impact if down

ARCHITECTURE:
- Infrastructure diagram (or link)
- Dependencies (upstream and downstream)
- Data stores

COMMON ALERTS AND RESOLUTION:
Alert: [Alert name]
  Meaning: [What this alert indicates]
  Impact: [User/business impact]
  Steps:
    1. [First thing to check]
    2. [Second thing to check]
    3. [How to resolve]
  Escalate if: [When to escalate and to whom]

CONTACT INFORMATION:
- Service owner: [Name, phone, Slack]
- Database team: [Contact]
- Vendor support: [Contact, account #, SLA]

ROLLBACK PROCEDURES:
- How to roll back the last deployment
- How to fail over to backup
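
A lightweight way to enforce the minimum contents above is to check each runbook for the required section headings before a service is added to on-call coverage. This sketch assumes runbooks are stored as plain text using the exact headings from the template; `missing_sections` is a hypothetical helper name.

```python
REQUIRED_SECTIONS = [
    "OVERVIEW:",
    "ARCHITECTURE:",
    "COMMON ALERTS AND RESOLUTION:",
    "CONTACT INFORMATION:",
    "ROLLBACK PROCEDURES:",
]

def missing_sections(runbook_text):
    """Return the required headings that do not appear on their own line."""
    lines = {line.strip() for line in runbook_text.splitlines()}
    return [section for section in REQUIRED_SECTIONS if section not in lines]

draft = """SERVICE RUNBOOK: Billing API
OVERVIEW:
- Handles invoice generation
ARCHITECTURE:
- See internal diagram
CONTACT INFORMATION:
- Service owner: [Name, phone, Slack]
"""
print(missing_sections(draft))
```

A check like this only confirms that the sections exist; whether the steps inside them are accurate still needs a human review.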

Related Resources

IT Policy Templates: Complete Guide - All IT policy templates in one place
IT Disaster Recovery Plan Template - Business continuity and disaster recovery procedures
IT Security Policy Template - Comprehensive security policy with incident response
IT Service Level Management - SLA templates and service level management
IT Documentation Templates - Runbooks, SOPs, and knowledge base templates

Frequently Asked Questions

How many people do I need for an on-call rotation?

Minimum 4 for a weekly rotation (1 in 4 frequency). Ideally 5-6 so engineers aren't on call more than once a month. If you only have 2-3 people, consider a manager-as-backup model or hiring specifically for operations coverage.

Should on-call be mandatory?

For operations and SRE roles, on-call is typically a job requirement and should be stated in the job description. For other engineering roles, participation may be optional or incentivized with additional compensation. Never spring on-call requirements on existing employees without their agreement.

How do I reduce on-call burnout?

Three things matter most: (1) reduce page volume through better monitoring and automation, (2) compensate fairly, and (3) limit frequency to no more than 1 week in 4. Engineers burn out from noisy alerts and feeling uncompensated far more than from occasional real incidents.

What if an on-call engineer doesn't respond?

The escalation policy should automatically page the secondary after the SLA expires. If both primary and secondary fail to respond, the manager is paged. Repeated failures to respond should be addressed through performance management, not by removing the engineer from rotation (which overloads others).

Do I need to pay exempt employees for on-call time?

Under FLSA, employers are not legally required to pay exempt employees additional on-call pay. However, best practice, and the standard in the tech industry, is to provide on-call compensation to all participants regardless of exempt status. It's a retention and fairness issue, not just a legal one.
