exit-codes

Exit-code reference for linuxguard-agent — universal codes (0/1/2), signal-induced 128+N codes (130 SIGINT, 143 SIGTERM), and per-command divergences.

This page documents the exit codes that linuxguard-agent returns to the calling shell, init system, or container runtime. Codes are deterministic and follow standard POSIX conventions. A few commands deliberately return 0 regardless of what they find: probe is non-fatal by design, and status reports its result on stdout rather than through the exit code (see Per-command divergences).

Universal codes

These codes apply to every subcommand unless a per-command override is documented below.

| Code | Source | Meaning |
| --- | --- | --- |
| 0 | Normal return | Successful operation. The command completed and produced its expected output (or probe ran and emitted JSON regardless of capability check outcome). |
| 1 | log.Fatal(err) after the agent run returns non-nil | General error. Includes: invalid argument value, unknown config key, persist failure, network failure, enrollment failure, PID-file collision in typical service mode, missing required flag, and any unwrapped error returned by a subcommand's Action. |
| 2 | urfave/cli/v2 framework | Argument parsing error. Returned when the CLI parser rejects the command line before the subcommand Action runs (e.g., unknown flag, missing required positional argument). The framework also prints a synopsis to stderr. |
| 130 | os.Exit(128 + int(syscall.SIGINT)) | Process terminated by SIGINT (Ctrl-C in an interactive shell). The agent caught the signal, stored it in caughtSignal, cancelled the agent context, and returned control to main, which explicitly called os.Exit(130). |
| 143 | os.Exit(128 + int(syscall.SIGTERM)) | Process terminated by SIGTERM, i.e. an orchestrator-driven shutdown (systemctl stop linuxguard-agent, docker stop, kubectl delete pod). The agent caught the signal, stored it in caughtSignal, cancelled the agent context, and returned control to main, which explicitly called os.Exit(143). |

The 128 + signum convention

linuxguard-agent re-raises caught signals as exit codes following the standard shell convention:

| Signal | signum (Linux) | Exit code |
| --- | --- | --- |
| SIGINT | 2 | 128 + 2 = 130 |
| SIGTERM | 15 | 128 + 15 = 143 |

The re-raise happens via os.Exit(128+signum) directly from main (NOT from the signal-handling goroutine). The rationale and the history of why an earlier signal.Reset + syscall.Kill(getpid, sig) approach was replaced are documented in signals § Re-raise convention. The short version: under Go 1.25 inside a distroless containerized PID-1 deployment, the runtime's dieFromSignal path fell through to its exit(2) fallback before the synchronous re-raise terminated the process; docker wait reported 2 for every SIGTERM. Direct os.Exit(128+signum) returns the expected 143 deterministically.
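
The arithmetic behind the convention can be sketched in a few lines of shell. The helper names signal_to_exit and exit_to_signum are hypothetical and are not part of linuxguard-agent; the child shell at the end only imitates the agent's behaviour (catch the signal, then exit with 128 + signum explicitly) and is an illustration, not the agent's actual code path.

```sh
#!/bin/sh
# Hypothetical helpers illustrating the 128 + signum convention.

# signal_to_exit SIGNUM: print the exit code conventionally used for SIGNUM.
signal_to_exit() {
    echo $((128 + $1))
}

# exit_to_signum CODE: print the signal number encoded in CODE,
# or return 1 when CODE is outside the signal-induced range.
exit_to_signum() {
    if [ "$1" -gt 128 ] && [ "$1" -le 255 ]; then
        echo $(($1 - 128))
    else
        return 1
    fi
}

signal_to_exit 2     # SIGINT
signal_to_exit 15    # SIGTERM

# A child shell that traps SIGTERM and exits 143 explicitly, loosely
# analogous to the agent calling os.Exit(128+signum) from main.
rc=0
sh -c 'trap "exit 143" TERM; kill -TERM $$; sleep 5' || rc=$?
echo "child exit code: $rc"
```

The `|| rc=$?` form keeps the capture working even in scripts that run under `set -e`.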

Per-command divergences

| Command | Code | Difference from universal table |
| --- | --- | --- |
| probe | 0 | Always 0. Probe is non-fatal by design. Capability check failures are reported as false JSON fields, NOT as a non-zero exit. Even a JSON marshal failure (which should not happen for a struct of basic types) emits {} to stdout and returns nil. CI and Ansible callers MUST parse the JSON output and decide their own pipeline disposition. |
| start | 0 | Never observed in normal long-running service mode. 0 occurs only on --help, --version, or other early-exit paths that bypass the agent worker. A start that returns 0 after running is itself a bug: the agent is expected to either run indefinitely or terminate via signal (130 / 143) or error (1). |
| status | 0 | Always 0. status is informational. The "is the agent running?" answer is conveyed by the stdout text (linuxguard-agent is running with PID: <N> vs linuxguard-agent is not running), not by the exit code. Shell scripts that need a programmatic check should grep the stdout or use pgrep linuxguard-agent. |
| stop | 0 or 1 | 0 when the PID file is absent (agent not running, which the agent treats as a non-error) or when the agent acknowledged SIGTERM and exited within 10 seconds. 1 when the agent did not exit within the 10-second wait window or the PID file was unreadable. |
| set | 0 | 0 even when no agent process is found at the PID file. set persists the new level and reports (persisted; agent not running — applies on next start) to stdout. The persist succeeded; the SIGHUP was skipped because the target did not exist. "Persisted + SIGHUP delivered" and "persisted only" are distinguished by the stdout message, not by the exit code. |
| collect | 0 or 1 | 1 when the --out path is occupied (the O_EXCL open rejects pre-existing files), when the output directory cannot be created with mode 0750, or when archive assembly fails. Otherwise 0 with the bundle path on stdout and sha256=... size=... on stderr. |
| upload | 0 or 1 | 1 when: the bundle file is missing or unreadable, its size disagrees with the manifest (file changed after collect), the presign call (POST /upload-url) fails, the PUT to S3 fails, or the register call (POST /register) fails. Otherwise 0 with bundle_id, object_key, and uploaded_at on stdout. The local bundle file is NEVER deleted regardless of outcome. |
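
Because status always exits 0, a script has to read its stdout. The sketch below assumes the two stdout strings quoted in the status row are emitted exactly as written; agent_pid_from_status is a hypothetical helper name, and canned input stands in for the real command so the example runs without the agent installed.

```sh
#!/bin/sh
# agent_pid_from_status: read status output on stdin, print the PID when the
# agent is running, and return non-zero when no "running" line is found.
# Assumes the stdout wording documented for the status command.
agent_pid_from_status() {
    sed -n 's/^linuxguard-agent is running with PID: \([0-9][0-9]*\).*/\1/p' | grep .
}

# Canned input; in practice the input would be piped from the status command.
printf '%s\n' 'linuxguard-agent is running with PID: 4242' | agent_pid_from_status

if ! printf '%s\n' 'linuxguard-agent is not running' | agent_pid_from_status; then
    echo "agent is not running"
fi
```

On a real host the pipeline would look like `linuxguard-agent status | agent_pid_from_status`, with the pipeline's exit status (taken from `grep`) answering the running-or-not question that the agent's own exit code does not.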

Examples

Read the exit code in a shell
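
A minimal sketch of the capture pattern follows. A stand-in command (`sh -c 'exit 1'`) is used so the snippet runs without the agent installed; in practice that line would be an agent invocation whose code can vary, such as the stop command.

```sh
#!/bin/sh
# $? holds the exit code of the most recent command and is overwritten by
# the next one, so copy it into a variable immediately.

rc=0
# Stand-in for an agent invocation (assumption: replace with the real command).
sh -c 'exit 1' || rc=$?

echo "exit code: $rc"

if [ "$rc" -ne 0 ]; then
    echo "command failed" >&2
fi
```

Capturing with `|| rc=$?` rather than a bare `rc=$?` on the following line keeps the script alive under `set -e`, which would otherwise abort on the failing command before the code is read.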

Use SIGTERM for orchestrated shutdown

systemctl stop delivers SIGTERM via the unit file's KillSignal=SIGTERM (the systemd default). The agent's signal handler catches it and re-raises as 143. ExecMainStatus=143 is the recorded exit code.
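
One way to check the recorded code after a stop is to read the unit's ExecMainStatus property. The sketch below parses a canned property line so it runs without systemd; exec_main_status is a hypothetical helper name, and the unit name linuxguard-agent is taken from the systemctl stop example on this page.

```sh
#!/bin/sh
# exec_main_status: read "Key=Value" property lines on stdin and print the
# value of ExecMainStatus.
exec_main_status() {
    sed -n 's/^ExecMainStatus=//p'
}

# Canned input so the sketch runs anywhere. On a real host the input would
# come from: systemctl show linuxguard-agent --property=ExecMainStatus
status=$(printf 'ExecMainStatus=143\n' | exec_main_status)

if [ "$status" -eq 143 ]; then
    echo "agent exited 143 after SIGTERM"
else
    echo "agent exited with $status"
fi
```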

Detect signal-induced shutdown in CI / orchestrator logs

docker wait <container> and kubectl get pod <pod> -o jsonpath='{.status.containerStatuses[0].state.terminated.exitCode}' both report the integer exit code. Treat 130 and 143 as graceful, signal-induced shutdowns (not failures) when the orchestrator initiated the stop. Treat 1 and 2 as actual errors that warrant log inspection.
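
The rule above can be sketched as a small predicate. is_graceful_stop is a hypothetical helper, and the "stop requested" flag is an assumption about what the calling pipeline already knows; the exit code is hard-coded here so the example runs without a container runtime.

```sh
#!/bin/sh
# is_graceful_stop CODE STOP_REQUESTED
# Succeeds only when CODE is 130 or 143 AND the orchestrator initiated the
# stop (STOP_REQUESTED = "yes"). Anything else is treated as not graceful.
is_graceful_stop() {
    [ "${2:-no}" = "yes" ] || return 1
    case "$1" in
        130|143) return 0 ;;
        *)       return 1 ;;
    esac
}

# Canned value; with Docker it could be obtained via: code=$(docker wait "$container")
code=143

if is_graceful_stop "$code" yes; then
    echo "graceful shutdown (exit $code)"
else
    echo "failure or unexpected exit (exit $code), inspect the logs"
fi
```

Requiring the explicit flag means a 143 that nobody asked for (for example, a stray kill) is still surfaced for inspection rather than silently accepted.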

Distinguish argument parsing errors (2) from runtime errors (1)

Code 2 is a CLI-framework rejection; the subcommand never executed. Code 1 is a runtime error from inside the subcommand's Action. Both surface a non-zero exit, but the distinction is useful when triaging in CI logs.
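
For CI triage, the distinction can be turned into a log hint. triage_hint is a hypothetical helper, and the stand-in command below merely produces exit code 2 so the sketch runs without the agent installed.

```sh
#!/bin/sh
# triage_hint CODE: print a one-line hint for the two error codes.
triage_hint() {
    case "$1" in
        2) echo "usage error: the CLI parser rejected the command line, check flags and arguments" ;;
        1) echo "runtime error: the subcommand ran and failed, inspect the agent logs" ;;
        *) echo "exit code $1 is not an error code covered by this hint" ;;
    esac
}

rc=0
# Stand-in for an agent invocation with a bad flag (assumption).
sh -c 'exit 2' || rc=$?

triage_hint "$rc"
```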


Related: signals | start | probe | support-bundle | CLI Reference