How to Run On-Call Rotations with a 5-Person Engineering Team

With 5 engineers, traditional on-call rotations do not work. A weekly rotation means each person is on call once every 5 weeks, which sounds fine until you factor in vacations, sick days, and the reality that being on call for a full week is exhausting. Burnout sets in fast.

Here is how to build an on-call system that actually works for a small team.

The sustainable rotation model

Instead of weekly rotations, use 2-day rotations with a clear primary and secondary. With 5 engineers, each person is primary for 2 days out of every 10 and secondary for the 2 days before that. That is manageable.

One full 10-day cycle looks like this (the weekdays shift from cycle to cycle, because 10 days is not a whole number of weeks):

  • Monday-Tuesday: Engineer A (primary), Engineer B (secondary)
  • Wednesday-Thursday: Engineer B (primary), Engineer C (secondary)
  • Friday-Saturday: Engineer C (primary), Engineer D (secondary)
  • Sunday-Monday: Engineer D (primary), Engineer E (secondary)
  • Tuesday-Wednesday: Engineer E (primary), Engineer A (secondary)

The secondary only gets paged if the primary does not acknowledge within 10 minutes. This provides coverage without doubling the burden.
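The rotation and the escalation rule above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for your alerting tool's scheduler: the engineer names, the `cycle_start` date, and the function names are all assumptions made for the example.

```python
from datetime import date

# Hypothetical names; replace with your own team.
ENGINEERS = ["A", "B", "C", "D", "E"]
SHIFT_DAYS = 2
ESCALATE_AFTER_MINUTES = 10  # the escalation threshold described above


def on_call_for(day, cycle_start, engineers=ENGINEERS, shift_days=SHIFT_DAYS):
    """Return (primary, secondary) for a given day.

    cycle_start is the first day of the first engineer's primary shift.
    The secondary is always the engineer whose primary shift comes next.
    """
    shift_index = (day - cycle_start).days // shift_days
    primary = engineers[shift_index % len(engineers)]
    secondary = engineers[(shift_index + 1) % len(engineers)]
    return primary, secondary


def page_targets(primary, secondary, minutes_unacknowledged,
                 escalate_after=ESCALATE_AFTER_MINUTES):
    """Return everyone who should have been paged so far for an open alert."""
    targets = [primary]
    if minutes_unacknowledged >= escalate_after:
        targets.append(secondary)
    return targets


if __name__ == "__main__":
    start = date(2024, 1, 1)  # a Monday, chosen only for illustration
    primary, secondary = on_call_for(date(2024, 1, 7), start)
    print(primary, secondary, page_targets(primary, secondary, 12))
```

With a cycle starting on Monday 2024-01-01, the Sunday of that week maps to D as primary and E as secondary, which matches the list above.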

Reducing the on-call burden

The goal is to make on-call boring. Every page should either be actionable (something is actually broken) or eliminated (the alert is noise). Here is how:

Eliminate noisy alerts. If an alert fires more than twice a week and the response is always "ignore it" or "it resolved itself," delete the alert or fix the underlying issue. Alert fatigue is the fastest way to burn out an on-call engineer. Aim for fewer than 2 pages per on-call shift.
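One way to find candidates for deletion is to scan a page log for alerts that match the rule above. The sketch below assumes you can export pages as (alert name, time fired, outcome) tuples; the outcome labels `ignored` and `self_resolved` are hypothetical and would need to match whatever your team actually records.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical outcome labels meaning "nobody had to do anything".
NOISE_OUTCOMES = {"ignored", "self_resolved"}


def noisy_alerts(events, window_end, window_days=7, max_fires=2):
    """Return alert names that fired more than max_fires times in the window
    and whose every outcome was a noise outcome.

    events is an iterable of (alert_name, fired_at, outcome) tuples.
    """
    window_start = window_end - timedelta(days=window_days)
    outcomes = defaultdict(list)
    for name, fired_at, outcome in events:
        if window_start < fired_at <= window_end:
            outcomes[name].append(outcome)
    return sorted(
        name
        for name, seen in outcomes.items()
        if len(seen) > max_fires and all(o in NOISE_OUTCOMES for o in seen)
    )


if __name__ == "__main__":
    log = [
        ("disk_warn", datetime(2024, 1, 2), "ignored"),
        ("disk_warn", datetime(2024, 1, 4), "self_resolved"),
        ("disk_warn", datetime(2024, 1, 6), "ignored"),
        ("api_down", datetime(2024, 1, 3), "fixed"),
    ]
    print(noisy_alerts(log, window_end=datetime(2024, 1, 8)))
```

An alert that fires often but sometimes needs real work is deliberately not flagged here; that one needs a fix to the underlying issue rather than deletion.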

Write runbooks for every alert. Every alert that pages someone should have a corresponding runbook: what the alert means, how to investigate, and how to resolve it. If the on-call engineer can follow the runbook without needing to wake up anyone else, the page is manageable.
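A small check can catch paging alerts that have no runbook or an incomplete one. This sketch assumes runbooks are available as a dictionary keyed by alert name, with the three parts named above stored under the hypothetical keys `meaning`, `investigate`, and `resolve`; adapt it to wherever your runbooks actually live.

```python
# Hypothetical section keys, one per part of a runbook described above.
REQUIRED_SECTIONS = ("meaning", "investigate", "resolve")


def alerts_missing_runbooks(paging_alerts, runbooks):
    """Return {alert_name: [missing sections]} for alerts with gaps.

    paging_alerts is an iterable of alert names that can page someone.
    runbooks maps alert name -> {section name: text}.
    """
    problems = {}
    for alert in paging_alerts:
        runbook = runbooks.get(alert, {})
        missing = [s for s in REQUIRED_SECTIONS if not runbook.get(s, "").strip()]
        if missing:
            problems[alert] = missing
    return problems


if __name__ == "__main__":
    runbooks = {
        "api_down": {
            "meaning": "Health check failing for 5 minutes.",
            "investigate": "Check recent deploys and service logs.",
            "resolve": "Roll back the last deploy.",
        },
        "queue_backlog": {"meaning": "Queue depth above threshold."},
    }
    print(alerts_missing_runbooks(["api_down", "queue_backlog", "db_cpu"], runbooks))
```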

Invest in self-healing. Auto-restart crashed services. Auto-scale when traffic spikes. Auto-rotate credentials before they expire. Every self-healing mechanism you build is a page that never happens.

Set business-hours-only alerts for non-critical issues. Not everything needs to wake someone up at 3 AM. If your staging environment is down, it can wait until morning. If a non-critical batch job fails, it can wait. Reserve after-hours paging for customer-impacting production issues only.
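The routing rule above is simple enough to write down as a function. In this sketch, the 09:00 to 18:00 weekday window, the `environment` and `customer_impacting` fields, and the three return values are assumptions for illustration; most alerting tools let you express the same rule in their own configuration.

```python
from datetime import datetime, time

# Assumed business hours; adjust to your team's working day.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)


def route_alert(environment, customer_impacting, now):
    """Decide what to do with an alert.

    Returns "page_now" for customer-impacting production issues,
    "notify_now" for non-critical issues during business hours,
    and "hold_until_business_hours" for everything else.
    """
    critical = environment == "production" and customer_impacting
    if critical:
        return "page_now"
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    in_hours = BUSINESS_START <= now.time() < BUSINESS_END
    if is_weekday and in_hours:
        return "notify_now"
    return "hold_until_business_hours"


if __name__ == "__main__":
    three_am = datetime(2024, 1, 3, 3, 0)  # a Wednesday
    print(route_alert("staging", False, three_am))
    print(route_alert("production", True, three_am))
```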

Compensating on-call engineers

On-call is work. Compensate it. Common approaches for startups:

  • Per-shift stipend: $200-$500 per on-call shift (2 days). This is the simplest model and the one we recommend.
  • Comp time: A day off for every on-call shift. Works if your team values time more than money.
  • Per-incident bonus: $100-$200 for every incident that requires after-hours work. Incentivizes response but can feel transactional.

Whatever model you choose, make it explicit and consistent. Nothing kills on-call morale faster than the perception that it is uncompensated extra work.
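Before committing to a model, it helps to work out what it adds up to over a year. The arithmetic below assumes the 2-day shifts and 5-person team described earlier, a 365-day year with continuous coverage, and compensation for the primary shift only; the function names are made up for this example.

```python
DAYS_PER_YEAR = 365
SHIFT_DAYS = 2
TEAM_SIZE = 5


def annual_shifts(days_per_year=DAYS_PER_YEAR, shift_days=SHIFT_DAYS,
                  team_size=TEAM_SIZE):
    """Return (primary shifts per year for the team, per engineer)."""
    team_shifts = days_per_year / shift_days
    return team_shifts, team_shifts / team_size


def annual_stipend_cost(stipend_per_shift):
    """Return (team cost per year, per-engineer payout per year)."""
    team_shifts, engineer_shifts = annual_shifts()
    return stipend_per_shift * team_shifts, stipend_per_shift * engineer_shifts


if __name__ == "__main__":
    team_shifts, engineer_shifts = annual_shifts()
    print(f"Shifts per year: {team_shifts} team, {engineer_shifts} per engineer")
    for stipend in (200, 500):
        team_cost, per_engineer = annual_stipend_cost(stipend)
        print(f"${stipend}/shift: ${team_cost:,.0f} team, ${per_engineer:,.0f} per engineer")
```

Under these assumptions the team covers 182.5 primary shifts a year, or 36.5 per engineer. A per-shift stipend therefore costs roughly $36,500 to $91,250 a year across the team, and a day off per shift works out to about 36 days off per engineer per year, which is worth knowing before you promise it.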

Protecting quality of life

Set clear expectations about response time. For most startups, a 15-minute acknowledgment SLA for critical alerts is reasonable. The on-call engineer does not need to be at their desk, but they need to have their phone within reach and be in a state where they can respond.

Never schedule on-call during someone's vacation. This sounds obvious, but we have seen it happen. Use the override feature in your alerting tool (PagerDuty, OpsGenie) to swap shifts when people are away.
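Conflicts are easier to avoid if you look for them ahead of time. The sketch below walks a date range, works out who is primary and secondary each day under a 2-day rotation, and lists every day that overlaps a recorded vacation. The engineer names, the `cycle_start` date, and the shape of the `vacations` dictionary are assumptions; the swap itself would still be made in your alerting tool.

```python
from datetime import date, timedelta

# Hypothetical names; replace with your own team.
ENGINEERS = ["A", "B", "C", "D", "E"]
SHIFT_DAYS = 2


def roles_on(day, cycle_start):
    """Return {"primary": name, "secondary": name} for a given day."""
    shift_index = (day - cycle_start).days // SHIFT_DAYS
    return {
        "primary": ENGINEERS[shift_index % len(ENGINEERS)],
        "secondary": ENGINEERS[(shift_index + 1) % len(ENGINEERS)],
    }


def shifts_needing_override(vacations, start, end, cycle_start):
    """Return (day, engineer, role) for each on-call day inside a vacation.

    vacations maps engineer -> list of (first_day, last_day), inclusive.
    """
    conflicts = []
    day = start
    while day <= end:
        for role, engineer in roles_on(day, cycle_start).items():
            for first, last in vacations.get(engineer, []):
                if first <= day <= last:
                    conflicts.append((day, engineer, role))
                    break
        day += timedelta(days=1)
    return conflicts


if __name__ == "__main__":
    start = date(2024, 1, 1)  # a Monday, chosen only for illustration
    away = {"B": [(date(2024, 1, 2), date(2024, 1, 5))]}
    for conflict in shifts_needing_override(away, start, date(2024, 1, 10), start):
        print(conflict)
```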

Run a weekly on-call review. The outgoing on-call engineer spends 15 minutes reviewing what happened during their shift: what pages fired, how they were resolved, and what can be improved. This creates a feedback loop that continuously reduces the on-call burden.

Need help setting up on-call rotations?

traztech helps startups build sustainable on-call systems. We configure alerting tools, write runbooks, and set up rotations that keep your team healthy and your systems reliable.

From Jacob Masse, founder of traztech.