Exit code 137: killed by SIGKILL (usually out of memory)
Exit code 137 means the process was terminated by SIGKILL (signal 9): under the shell convention, a process killed by signal N reports 128 + N, and 137 = 128 + 9. The most common sender is the Linux out-of-memory (OOM) killer, acting either because the whole machine ran out of memory or because the process exceeded a container (cgroup) memory limit. In that case a plain retry will not help, because the job is hitting a resource ceiling rather than a transient error: you have to reduce the job's memory use or raise the limit.
Why does exit code 137 happen?
Exit code 137 is a signal-termination code. Under the shell convention used by Bash and most job runners, a process killed by signal N exits with status 128 + N. 137 - 128 = 9, and signal 9 is SIGKILL — a signal that cannot be caught, blocked, or ignored, so the process dies immediately with no chance to clean up.
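As a quick check of the 128 + N arithmetic, the sketch below has a child shell send SIGKILL to itself and then decodes the status the parent sees. It assumes a POSIX-style sh with a `kill` builtin; the variable names are illustrative only.

```shell
# Run a child shell that kills itself with SIGKILL (signal 9).
# "|| status=$?" captures the exit status without aborting under "set -e".
status=0
sh -c 'kill -9 $$' || status=$?

# Signal terminations are reported as 128 + N, so subtract 128 to recover N.
signal=$((status - 128))

echo "exit status: $status"
echo "signal number: $signal"
echo "signal name: $(kill -l "$signal")"
```

Some shells also print a "Killed" notice to stderr when the child dies; that notice comes from the parent shell, not from the killed process.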
Something sent that SIGKILL. In a scheduled job the usual senders are:
- The Linux OOM killer: when the machine runs out of memory, the kernel picks a process and kills it with SIGKILL to reclaim memory. You'll see an "Out of memory: Killed process" line in the kernel log (dmesg or journalctl -k).
- A container memory limit: in Docker or Kubernetes, exceeding the container's memory limit (its cgroup limit) triggers an OOM kill of the offending process, surfacing as exit 137 on the container.
- A manual or orchestrated kill -9: something (a deploy script, a watchdog, an operator) sent SIGKILL directly.
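For the container case, the memory ceiling that applies to the current process can usually be read from the cgroup filesystem. The sketch below assumes the common mount points (`/sys/fs/cgroup/memory.max` for cgroup v2, `/sys/fs/cgroup/memory/memory.limit_in_bytes` for cgroup v1); the function name `show_memory_ceiling` is hypothetical, and the paths may differ on other setups.

```shell
# Print the cgroup memory limit visible to this process, if one can be found.
show_memory_ceiling() {
  if [ -r /sys/fs/cgroup/memory.max ]; then
    # cgroup v2: a byte count, or the literal "max" when no limit is set.
    echo "cgroup v2 limit: $(cat /sys/fs/cgroup/memory.max)"
  elif [ -r /sys/fs/cgroup/memory/memory.limit_in_bytes ]; then
    # cgroup v1: a byte count (a very large number means effectively unlimited).
    echo "cgroup v1 limit: $(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)"
  else
    echo "no cgroup memory limit file found at the usual paths"
  fi

  # cgroup v2 also keeps a counter of OOM kills in this cgroup.
  if [ -r /sys/fs/cgroup/memory.events ]; then
    grep '^oom_kill ' /sys/fs/cgroup/memory.events || true
  fi
}

show_memory_ceiling
```

Run inside the container, a nonzero `oom_kill` count suggests that the limit, rather than an explicit kill -9, is what terminated earlier processes.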
How do I confirm it was the OOM killer?
Check the kernel log for the OOM killer's own message. It names the killed PID and the memory it was using:
dmesg -T | grep -i -E "killed process|out of memory"
journalctl -k --since "1 hour ago" | grep -i "out of memory"
# In Kubernetes, the pod's last state shows the reason:
kubectl describe pod <pod> | grep -A3 "Last State" # look for Reason: OOMKilled
If you see "Out of memory: Killed process", the kernel OOM killer did it. If a Kubernetes pod shows Reason: OOMKilled, the container hit its memory limit. If neither appears, something sent SIGKILL explicitly.
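The same decision can be wrapped in a small helper for use in a post-mortem script. This is a minimal sketch: the function name `classify_sigkill` is hypothetical, it only looks for the phrases mentioned above, and the sample line fed to it is a made-up illustration rather than real log output.

```shell
# Classify kernel log text read from stdin: OOM kill, or no OOM evidence.
classify_sigkill() {
  if grep -q -i -E 'out of memory|killed process'; then
    echo "oom-killer"
  else
    echo "no OOM evidence: suspect an explicit SIGKILL"
  fi
}

# Typical use (reading the kernel log may require root on some systems):
#   dmesg -T | classify_sigkill

# Illustrative input only, not a captured log line:
sample='Out of memory: Killed process 4242 (worker)'
verdict=$(printf '%s\n' "$sample" | classify_sigkill)
echo "$verdict"
```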
How do I fix exit code 137?
- Reduce peak memory: stream or batch instead of loading everything into memory at once, process records in chunks, and free large buffers as soon as you're done with them.
- Raise the ceiling: increase the container/cgroup memory limit or move the job to a larger machine if the working set is genuinely that large.
- Cap concurrency: if the job forks workers, each worker holds its own memory — fewer parallel workers means a lower peak.
- Don't just retry: because 137 usually points to a resource ceiling, an automatic retry tends to get killed the same way. Fix the memory pressure first.
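One way to apply the last point in a job wrapper is to retry ordinary failures but stop immediately on 137. The sketch below is an assumption about how such a wrapper might look, not a feature of any particular scheduler; `run_with_retry` and its variables are hypothetical names, and the one-second pause is arbitrary.

```shell
# Retry a command up to max_attempts times, but never retry exit code 137.
# Usage: run_with_retry <max_attempts> <command> [args...]
run_with_retry() {
  max_attempts=$1
  shift
  attempt=1
  while :; do
    rc=0
    "$@" || rc=$?

    if [ "$rc" -eq 0 ]; then
      return 0
    fi

    if [ "$rc" -eq 137 ]; then
      # SIGKILL, most likely memory: retrying would usually hit the same ceiling.
      echo "exit 137 (SIGKILL): not retrying; check memory first" >&2
      return 137
    fi

    if [ "$attempt" -ge "$max_attempts" ]; then
      return "$rc"
    fi

    attempt=$((attempt + 1))
    sleep 1
  done
}

# Example (commented out so the snippet has no side effects):
#   run_with_retry 3 /usr/local/bin/nightly.sh
```

The wrapper returns the job's own exit code, so a caller can still tell a 137 apart from other failures.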
How do I get alerted when a job exits 137?
An OOM-killed job often dies mid-run, so it never reports success — which a heartbeat monitor reads as a miss and alerts on. Have the job ping a monitor only on a clean exit, so a SIGKILL withholds the ping:
# Ping only on a zero exit (&&, not ;). A SIGKILL-ed job sends no ping,
# and the monitor alerts you that the run missed.
/usr/local/bin/nightly.sh && curl -fsS -m 10 --retry 3 "https://ping.cronshield.com/<your-check-id>"
Catch this failure automatically
The free tier gives you a heartbeat endpoint and an email alert when an expected ping doesn't arrive. Paid tiers add the log-aware diagnosis — the last log line and a likely cause in the alert. The heartbeat receiver ships in an upcoming release; see the plans to learn what each tier adds.
Frequently asked questions
- Is exit code 137 always out of memory?
- No, but it usually is. 137 specifically means SIGKILL (signal 9). The OOM killer and container memory limits are the most common senders of SIGKILL to a batch job, but a manual kill -9 or a watchdog can also produce it. Check the kernel log for an "Out of memory" line to confirm.
- What's the difference between exit 137 and exit 143?
- Both are signal terminations. 137 = 128 + 9 = SIGKILL (an immediate, uncatchable kill, often OOM). 143 = 128 + 15 = SIGTERM (a polite termination the process can handle for a graceful shutdown). A 137 job was killed hard; a 143 job was asked to stop.
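The difference between the two codes can be reproduced with a few child shells. This is a small sketch assuming a POSIX-style sh; the variable names are illustrative. The middle case installs a SIGTERM handler to show the graceful path that SIGKILL never allows.

```shell
# Untrapped SIGTERM: the child dies from signal 15 and reports 128 + 15.
term_status=0
sh -c 'kill -TERM $$' || term_status=$?

# Trapped SIGTERM: the handler runs, cleans up, and chooses its own exit code.
handled_status=0
handled_output=$(sh -c 'trap "echo cleaning up; exit 0" TERM; kill -TERM $$; sleep 1') || handled_status=$?

# SIGKILL: no handler can run, so the child reports 128 + 9.
kill_status=0
sh -c 'kill -KILL $$' || kill_status=$?

echo "untrapped SIGTERM -> $term_status"
echo "trapped SIGTERM   -> $handled_status ($handled_output)"
echo "SIGKILL           -> $kill_status"
```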
Primary sources
- GNU Bash Manual — Exit Status (128 + N for signal N) — verified 2026-07-04
- Linux man-pages: signal(7) — SIGKILL is signal 9, cannot be caught — verified 2026-07-04
- Kubernetes docs — container states and OOMKilled reason — verified 2026-07-04