IT On-Call Policy Template [Free] — Rotation Schedules, Escalation & Compensation
On-call burnout is a leading reason IT operations staff quit. A poorly designed on-call program leads to exhausted engineers, missed incidents, and attrition that can cost $50,000-$150,000 per replacement hire. A well-structured IT on-call policy fixes this by setting clear expectations for rotation schedules, response times, escalation paths, and compensation. This guide provides a complete, ready-to-use on-call policy template. For additional IT management resources, visit our IT Policy Templates guide and IT Management Policies.
Quick Start: Download our free IT On-Call Policy Template — covers rotation design, escalation procedures, compensation guidelines, and response time SLAs. Customize for your team in under an hour.
What Is an IT On-Call Policy?
An IT on-call policy is a formal document that defines the rules, expectations, and procedures for staff who are designated to respond to IT incidents outside of normal business hours. It covers who is on call, when, how they're contacted, what they're expected to do, and how they're compensated.
Why You Need a Formal On-Call Policy
| Without a Policy | With a Policy |
|---|---|
| Same people always get called | Fair rotation ensures equitable distribution |
| No clear response time expectations | Defined SLAs by severity level |
| Engineers don't know when they're "off" | Clear on/off boundaries reduce burnout |
| Compensation is inconsistent or nonexistent | Fair pay for after-hours work |
| Escalation is chaotic ("just call whoever") | Structured escalation paths with backup |
| Burnout and attrition | Sustainable, predictable program |
IT On-Call Policy Template
1. Policy Overview
IT ON-CALL POLICY
Version: 1.0
Effective Date: [Date]
Policy Owner: [IT Director / VP of Engineering]
Approved By: [CTO / VP of IT]
PURPOSE:
This policy establishes the framework for IT on-call coverage,
ensuring critical systems are monitored and incidents are resolved
outside of normal business hours. It defines rotation schedules,
response expectations, escalation procedures, and compensation.
SCOPE:
This policy applies to all IT staff who participate in on-call
rotations, including:
- Infrastructure and operations engineers
- Site reliability engineers (SRE)
- Database administrators
- Network engineers
- Security operations staff
- Application support engineers
DEFINITIONS:
- On-call: Designated period where an engineer must be reachable
and able to respond to incidents within defined SLA
- Primary on-call: First responder for all incoming alerts
- Secondary on-call: Backup who is engaged if primary doesn't respond
- Escalation: Process of involving additional personnel when an
incident exceeds the responder's ability or authority
2. On-Call Rotation Schedules
Rotation Models
| Model | How It Works | Best For | Drawbacks |
|---|---|---|---|
| Weekly rotation | One person covers 7 days, then rotates | Small teams (3-5 people) | Full week can be exhausting |
| Follow-the-sun | Shifts align with time zones, no overnight | Distributed teams across 3+ time zones | Requires global headcount |
| Split week | Weekdays (Mon-Fri) and weekends (Sat-Sun) are covered by separate shifts | Teams whose weekend workload differs significantly from weekdays | More handoffs |
| Day/night split | Day shift (8am-8pm) and night shift (8pm-8am) | High-volume environments | Requires larger team |
| Bi-weekly rotation | Two-week rotation with primary and secondary | Medium teams (5-10 people) | Long rotation period |
Sample Weekly Rotation Schedule
| Week | Primary On-Call | Secondary On-Call | Shift |
|---|---|---|---|
| Feb 10-16 | Engineer A | Engineer B | Mon 9am - Mon 9am |
| Feb 17-23 | Engineer B | Engineer C | Mon 9am - Mon 9am |
| Feb 24-Mar 2 | Engineer C | Engineer D | Mon 9am - Mon 9am |
| Mar 3-9 | Engineer D | Engineer E | Mon 9am - Mon 9am |
| Mar 10-16 | Engineer E | Engineer A | Mon 9am - Mon 9am |
Rotation Rules
- Minimum team size: 4 people for weekly rotation (ensures no more than 1 week in 4 on call)
- Maximum on-call frequency: No more than 1 week in 4 (25% of time)
- Holiday coverage: Rotated separately and equitably tracked over the year
- Swap policy: Engineers may swap shifts with manager approval and 48-hour notice
- New hire exemption: New team members shadow on-call for 2 rotations before going primary
- Consecutive limit: No engineer should be primary on-call for more than 7 consecutive days
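The sample schedule and rotation rules above can be generated and checked with a short script. This is a minimal sketch, not a replacement for your paging tool's scheduler: the function names are hypothetical, the year 2025 is assumed so that Feb 10 falls on a Monday, and the secondary is assumed to be the next engineer in the list, as in the sample table.

```python
from datetime import date, timedelta


def build_weekly_rotation(engineers, start, weeks):
    """Return one row per week: (shift_start, shift_end, primary, secondary).

    The secondary is the next engineer in the list, so this week's secondary
    becomes next week's primary, as in the sample schedule.
    """
    if len(engineers) < 4:
        raise ValueError("policy minimum is 4 engineers for a weekly rotation")
    rows = []
    for week in range(weeks):
        shift_start = start + timedelta(weeks=week)
        shift_end = shift_start + timedelta(days=6)
        primary = engineers[week % len(engineers)]
        secondary = engineers[(week + 1) % len(engineers)]
        rows.append((shift_start, shift_end, primary, secondary))
    return rows


def primary_share(rows, engineer):
    """Fraction of scheduled weeks in which `engineer` is primary."""
    return sum(1 for row in rows if row[2] == engineer) / len(rows)


team = ["Engineer A", "Engineer B", "Engineer C", "Engineer D", "Engineer E"]
# Assumed year: Feb 10, 2025 is a Monday.
schedule = build_weekly_rotation(team, date(2025, 2, 10), weeks=5)
for shift_start, shift_end, primary, secondary in schedule:
    print(f"{shift_start:%b %d} to {shift_end:%b %d}: {primary} / {secondary}")
```

With five engineers, `primary_share` comes out at one week in five for each person, which stays inside the frequency limit in the rules above.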
3. Response Time SLAs
| Severity | Definition | Acknowledgment | Response | Resolution Target |
|---|---|---|---|---|
| P1 — Critical | Complete service outage, data breach, revenue-impacting | 5 minutes | 15 minutes to begin working | 1 hour (mitigate), 4 hours (resolve) |
| P2 — High | Major feature degraded, significant user impact | 15 minutes | 30 minutes to begin working | 4 hours |
| P3 — Medium | Minor feature impacted, workaround available | 30 minutes | 1 hour to begin working | Next business day |
| P4 — Low | Informational alert, no user impact | 1 hour | Next business day | Scheduled maintenance window |
Response expectations during on-call:
- Must be reachable by phone and paging system at all times during shift
- Must be able to access a laptop and VPN within 15 minutes of page
- Must not be more than 15 minutes from a reliable internet connection (so that the 15-minute laptop and VPN requirement above can be met)
- Alcohol consumption must not impair ability to respond and troubleshoot
- If temporarily unreachable (medical appointment, flight), secondary must be notified
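If you want to check acknowledgment times against the SLA table in this section, the targets can be encoded as data. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and it uses naive `datetime` values for simplicity.

```python
from datetime import datetime, timedelta

# Acknowledgment and response targets copied from the SLA table in this section.
# P4's response target is "next business day", so it has no fixed duration here.
SLA = {
    "P1": {"ack": timedelta(minutes=5), "respond": timedelta(minutes=15)},
    "P2": {"ack": timedelta(minutes=15), "respond": timedelta(minutes=30)},
    "P3": {"ack": timedelta(minutes=30), "respond": timedelta(hours=1)},
    "P4": {"ack": timedelta(hours=1), "respond": None},
}


def ack_deadline(severity, paged_at):
    """Latest time at which an acknowledgment still meets the SLA."""
    return paged_at + SLA[severity]["ack"]


def ack_breached(severity, paged_at, acked_at=None, now=None):
    """True if the page was acknowledged late, or is still unacknowledged past the deadline."""
    effective = acked_at if acked_at is not None else (now or datetime.now())
    return effective > ack_deadline(severity, paged_at)


paged = datetime(2025, 2, 10, 2, 0)
print(ack_deadline("P1", paged))
print(ack_breached("P1", paged, acked_at=datetime(2025, 2, 10, 2, 6)))
```

Keeping the targets in one table means a change to the policy only has to be made in one place.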
4. Escalation Procedures
Escalation Matrix
| Trigger | Action | Who to Escalate To | Timeline |
|---|---|---|---|
| Primary doesn't acknowledge within SLA | Auto-page secondary on-call | Secondary engineer | Automatic after SLA breach |
| Secondary doesn't acknowledge | Page on-call manager | Engineering manager | 10 minutes after secondary SLA |
| P1 incident lasting over 30 minutes | Notify management | Engineering manager + IT Director | 30 minutes into incident |
| P1 incident lasting over 1 hour | Executive notification | VP of Engineering / CTO | 1 hour into incident |
| Incident requires cross-team support | Page relevant team's on-call | Other team's primary on-call | As needed |
| Incident has customer impact | Notify customer success | CS on-call + Communications | Immediately upon confirmed impact |
Escalation Procedure
ESCALATION FLOW
1. Alert fires → Primary on-call paged
   ├── Acknowledged within SLA → Primary investigates
   │   ├── Resolved → Document and close
   │   └── Cannot resolve alone → Escalate to secondary/specialist
   └── NOT acknowledged within SLA → Secondary auto-paged
       ├── Acknowledged → Secondary investigates
       └── NOT acknowledged → Manager paged (phone call)
2. For P1 incidents:
- 15 min: Primary begins investigation
- 30 min: If unresolved, notify engineering manager and IT Director
- 60 min: If unresolved, notify VP of Engineering/CTO
- 90 min: If unresolved, incident commander assigned, war room opened
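The escalation logic can be sketched as two small lookups, one for unacknowledged pages and one for long-running P1 incidents. This is a rough illustration, not how any particular paging tool implements escalation. The notification thresholds follow the escalation matrix, plus the 90-minute step from the flow. Two assumptions are made: the secondary gets the same acknowledgment window as the primary, and an SLA counts as breached once the elapsed time reaches the window. All names are hypothetical.

```python
# Acknowledgment windows in minutes, from the response time SLA table.
ACK_SLA_MIN = {"P1": 5, "P2": 15, "P3": 30, "P4": 60}

# P1 duration thresholds (minutes) and who is notified, from the escalation
# matrix plus the 90-minute step in the flow.
P1_TIMELINE = [
    (30, "Engineering manager + IT Director"),
    (60, "VP of Engineering / CTO"),
    (90, "Incident commander (war room)"),
]


def page_targets(severity, minutes_unacknowledged):
    """Who should have been paged so far if nobody has acknowledged the alert.

    Assumption: the secondary gets the same acknowledgment window as the
    primary, and the manager is paged 10 minutes after that window expires.
    """
    ack = ACK_SLA_MIN[severity]
    targets = ["primary"]
    if minutes_unacknowledged >= ack:
        targets.append("secondary")
    if minutes_unacknowledged >= 2 * ack + 10:
        targets.append("manager")
    return targets


def p1_notifications(minutes_into_incident):
    """Everyone who should have been notified for an unresolved P1 by this point."""
    return [who for threshold, who in P1_TIMELINE if minutes_into_incident >= threshold]


print(page_targets("P1", 12))
print(p1_notifications(45))
```

In practice these thresholds would be configured as an escalation policy in the alerting tool; the sketch is mainly useful for reviewing the timings on paper.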
5. On-Call Compensation
On-call compensation varies by company and region. Here are the most common models:
Compensation Models
| Model | How It Works | Typical Range | Best For |
|---|---|---|---|
| Flat on-call stipend | Fixed amount per on-call shift regardless of pages | $200-$500/week | Predictable cost, low-volume paging |
| Hourly rate for active response | Base stipend + hourly pay when actively working incidents | $50-$150/hour (active) + $100-$200/week (standby) | High-volume environments |
| Comp time (time off in lieu) | No extra pay, but earn time off for on-call work | 1:1 or 1.5:1 ratio | Startups, budget-constrained |
| Percentage of salary | On-call pay as % of base salary | 5-15% of base salary while on call | Simplicity, salary-proportional |
| Hybrid | Flat stipend + additional pay per page/incident | $200/week + $50/incident | Balanced incentive |
Sample Compensation Schedule
| On-Call Period | Standby Pay | Active Incident Pay | Holiday Multiplier |
|---|---|---|---|
| Weeknight (6pm-8am) | $50/night | $75/hour | N/A |
| Weekend day (8am-6pm) | $75/day | $75/hour | N/A |
| Weekend night (6pm-8am) | $75/night | $100/hour | N/A |
| Holiday (full day) | $150/day | $100/hour | 1.5x standby |
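To see what the sample schedule adds up to for one shift, here is a small worked calculation. It is a sketch under stated assumptions: the period names and the `shift_pay` function are hypothetical, the example week is assumed to contain 5 weeknights, 2 weekend days, and 2 weekend nights, and holidays are left out because the table does not say whether the $150/day figure already includes the 1.5x multiplier.

```python
# Rates from the sample compensation schedule:
# (standby pay per period in dollars, active incident pay per hour in dollars).
RATES = {
    "weeknight": (50, 75),
    "weekend_day": (75, 75),
    "weekend_night": (75, 100),
}


def shift_pay(periods, active_hours):
    """Total pay for one on-call shift.

    periods:      {period name: number of standby periods covered}
    active_hours: {period name: hours spent actively working incidents}
    """
    total = 0.0
    for name, count in periods.items():
        standby, _ = RATES[name]
        total += standby * count
    for name, hours in active_hours.items():
        _, hourly = RATES[name]
        total += hourly * hours
    return total


# Assumed week: 5 weeknights, 2 weekend days, 2 weekend nights on standby,
# with 2 hours of incident work on a weeknight and 1.5 hours on a weekend night.
week = {"weeknight": 5, "weekend_day": 2, "weekend_night": 2}
active = {"weeknight": 2, "weekend_night": 1.5}
print(shift_pay(week, active))
```

For this assumed week, standby pay is 5 × $50 + 2 × $75 + 2 × $75 = $550 and active pay is 2 × $75 + 1.5 × $100 = $300, for a total of $850.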
Legal considerations:
- Non-exempt (hourly) employees must be paid for all time spent responding to incidents per FLSA
- Employers are generally not legally required to pay exempt employees extra for on-call work, but best practice is to compensate them
- Some states have specific on-call pay requirements — check your state labor laws
- On-call time where the employee cannot use the time freely may count as hours worked under FLSA
6. On-Call Tooling Requirements
| Tool Category | Purpose | Examples |
|---|---|---|
| Paging/alerting | Route alerts to on-call engineer | PagerDuty, Opsgenie, Splunk On-Call (formerly VictorOps) |
| Monitoring | Detect incidents automatically | Datadog, New Relic, Prometheus/Grafana |
| Communication | Coordinate during incidents | Slack (incident channel), Microsoft Teams, phone bridge |
| Runbooks | Step-by-step resolution guides | Confluence, Notion, internal wiki |
| Incident management | Track and document incidents | Jira, ServiceNow, Rootly, FireHydrant |
| VPN/Remote access | Access production systems remotely | Company VPN, SSH bastion host |
Tooling requirements for on-call engineers:
- Company-provided phone or phone stipend ($50-$100/month)
- Company laptop with VPN configured and tested
- Home internet backup plan (mobile hotspot as fallback)
- Access to all production monitoring dashboards
- Runbooks accessible from mobile device
7. On-Call Health and Sustainability
Burnout Prevention
| Metric | Healthy Target | Action if Exceeded |
|---|---|---|
| Pages per on-call shift | Fewer than 10/week | Fix noisy alerts, improve automation |
| After-hours incidents requiring response | Fewer than 3/week | Improve system reliability |
| Average incident resolution time | Under 30 minutes | Better runbooks, automation |
| On-call frequency per engineer | No more than 1 week in 4 | Hire additional staff |
| Engineer satisfaction (survey) | Above 3.5/5 | Review rotation, compensation, workload |
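The targets in the table above lend themselves to an automated monthly check. The following is a minimal sketch, assuming you already collect these numbers somewhere; the metric key names are hypothetical, and the comparisons mirror the table's wording ("fewer than", "under", "no more than", "above").

```python
# Each entry: (metric key, predicate that is True when healthy, action if exceeded).
# Thresholds and actions are taken from the burnout prevention table.
HEALTH_CHECKS = [
    ("pages_per_week", lambda v: v < 10, "Fix noisy alerts, improve automation"),
    ("after_hours_incidents_per_week", lambda v: v < 3, "Improve system reliability"),
    ("avg_resolution_minutes", lambda v: v < 30, "Better runbooks, automation"),
    ("weeks_on_call_per_4", lambda v: v <= 1, "Hire additional staff"),
    ("satisfaction_out_of_5", lambda v: v > 3.5, "Review rotation, compensation, workload"),
]


def recommended_actions(metrics):
    """Return the table's action for every reported metric that misses its target."""
    return [
        action
        for name, is_healthy, action in HEALTH_CHECKS
        if name in metrics and not is_healthy(metrics[name])
    ]


# Example values (made up for illustration).
sample = {
    "pages_per_week": 14,
    "after_hours_incidents_per_week": 2,
    "avg_resolution_minutes": 45,
    "weeks_on_call_per_4": 1,
    "satisfaction_out_of_5": 3.8,
}
for action in recommended_actions(sample):
    print(action)
```

Metrics that are not reported are skipped rather than treated as failures, so the check can be adopted one metric at a time.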
After On-Call Recovery
- Engineers receive a recovery half-day off after a week of on-call (or full day if more than 5 incidents)
- If called during sleeping hours (midnight-6am), the engineer may start the next workday late
- Managers should check in with engineers after particularly heavy on-call weeks
- Track cumulative on-call burden per engineer per quarter — adjust if inequitable
8. Runbook Requirements
Every service covered by on-call must have a runbook. No engineer should be paged for a service they don't have documentation for.
Minimum runbook contents:
SERVICE RUNBOOK: [Service Name]
Last Updated: [Date]
Owner: [Team/Engineer]
OVERVIEW:
- What this service does
- Who it serves (internal/external)
- Business impact if down
ARCHITECTURE:
- Infrastructure diagram (or link)
- Dependencies (upstream and downstream)
- Data stores
COMMON ALERTS AND RESOLUTION:
Alert: [Alert name]
Meaning: [What this alert indicates]
Impact: [User/business impact]
Steps:
1. [First thing to check]
2. [Second thing to check]
3. [How to resolve]
Escalate if: [When to escalate and to whom]
CONTACT INFORMATION:
- Service owner: [Name, phone, Slack]
- Database team: [Contact]
- Vendor support: [Contact, account #, SLA]
ROLLBACK PROCEDURES:
- How to roll back the last deployment
- How to failover to backup
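One way to enforce the "every service must have a runbook" rule is to check each runbook for the section headings in the template above. This sketch assumes runbooks are stored as plain text with the same headings; the function name and the sample draft are hypothetical.

```python
# Section headings from the minimum runbook template.
REQUIRED_SECTIONS = [
    "OVERVIEW:",
    "ARCHITECTURE:",
    "COMMON ALERTS AND RESOLUTION:",
    "CONTACT INFORMATION:",
    "ROLLBACK PROCEDURES:",
]


def missing_sections(runbook_text):
    """Return the required headings that do not appear on their own line."""
    lines = {line.strip() for line in runbook_text.splitlines()}
    return [section for section in REQUIRED_SECTIONS if section not in lines]


# A made-up, incomplete draft used only to exercise the check.
draft = """
SERVICE RUNBOOK: Example Service
OVERVIEW:
- Handles example requests
ARCHITECTURE:
- Single database
CONTACT INFORMATION:
- Service owner: TBD
"""
print(missing_sections(draft))
```

A check like this only confirms that the sections exist; whether the steps are accurate still needs a human review after incidents.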
Implementing Your On-Call Program
Phase 1: Design (Week 1)
- Determine which services require on-call coverage
- Identify eligible team members (minimum 4 per rotation)
- Choose rotation model based on team size and distribution
- Define severity levels and response time SLAs
- Set compensation structure (get HR and finance approval)
Phase 2: Tooling and Documentation (Week 2)
- Configure paging/alerting tool with on-call schedules
- Create or update runbooks for all covered services
- Set up escalation policies in alerting tool
- Configure monitoring thresholds and alert routing
- Test paging flow end-to-end (including escalation)
Phase 3: Launch (Week 3)
- Distribute on-call policy to all participating engineers
- Conduct on-call training session (runbook walkthrough, tool training)
- Shadow rotation: new on-call engineers paired with experienced ones
- Collect signed acknowledgments
- Begin first rotation
Phase 4: Optimize (Ongoing)
- Review on-call metrics monthly (page volume, MTTA, MTTR)
- Survey on-call engineers quarterly on satisfaction and workload
- Reduce alert noise (eliminate false positives and duplicate alerts)
- Update runbooks after every incident that revealed a documentation gap
- Adjust rotation and compensation annually based on feedback
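For the monthly metrics review, MTTA (mean time to acknowledge) and MTTR (mean time to resolve) can be computed from incident timestamps. This is a minimal sketch: the record fields `paged`, `acknowledged`, and `resolved` are assumed names, both metrics are measured from the time of the page, and the two incidents are invented for illustration.

```python
from datetime import datetime
from statistics import mean


def mtta_mttr(incidents):
    """Return (MTTA, MTTR) in minutes, both measured from the time of the page."""
    ack_minutes = [
        (i["acknowledged"] - i["paged"]).total_seconds() / 60 for i in incidents
    ]
    resolve_minutes = [
        (i["resolved"] - i["paged"]).total_seconds() / 60 for i in incidents
    ]
    return mean(ack_minutes), mean(resolve_minutes)


# Invented example data.
incidents = [
    {
        "paged": datetime(2025, 2, 11, 2, 0),
        "acknowledged": datetime(2025, 2, 11, 2, 4),
        "resolved": datetime(2025, 2, 11, 2, 34),
    },
    {
        "paged": datetime(2025, 2, 13, 14, 0),
        "acknowledged": datetime(2025, 2, 13, 14, 10),
        "resolved": datetime(2025, 2, 13, 15, 0),
    },
]
mtta, mttr = mtta_mttr(incidents)
print(f"MTTA: {mtta:.1f} min, MTTR: {mttr:.1f} min")
```

Most paging tools can export these timestamps, so the same calculation can be run per severity level or per engineer if the export includes those fields.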
Related Resources
- IT Policy Templates: Complete Guide — All IT policy templates in one place
- IT Disaster Recovery Plan Template — Business continuity and disaster recovery procedures
- IT Security Policy Template — Comprehensive security policy with incident response
- IT Service Level Management — SLA templates and service level management
- IT Documentation Templates — Runbooks, SOPs, and knowledge base templates
Frequently Asked Questions
How many people do I need for an on-call rotation?
Minimum 4 for a weekly rotation (1 in 4 frequency). Ideally 5-6 so engineers aren't on call more than once a month. If you only have 2-3 people, consider a manager-as-backup model or hiring specifically for operations coverage.
Should on-call be mandatory?
For operations and SRE roles, on-call is typically a job requirement and should be stated in the job description. For other engineering roles, participation may be optional or incentivized with additional compensation. Never spring on-call requirements on existing employees without their agreement.
How do I reduce on-call burnout?
Three things matter most: (1) reduce page volume through better monitoring and automation, (2) compensate fairly, and (3) limit frequency to no more than 1 week in 4. Engineers burn out from noisy alerts and feeling uncompensated far more than from occasional real incidents.
What if an on-call engineer doesn't respond?
The escalation policy should automatically page the secondary after the SLA expires. If both primary and secondary fail to respond, the manager is paged. Repeated failures to respond should be addressed through performance management, not by removing the engineer from rotation (which overloads others).
Do I need to pay exempt employees for on-call time?
Under the FLSA, employers are generally not legally required to pay exempt employees extra for on-call time. However, best practice (and the standard in the tech industry) is to provide on-call compensation to all participants regardless of exempt status. It's a retention and fairness issue, not just a legal one.