`pg0 info`/`running` reports a postgres instance as healthy with no real connectivity check, masking zombie processes after shared-memory loss #24

**Submitted with the assistance of Sally Sonnet**

## Summary

`pg0 info` / `pg0 list` (and the `running` property on the Python `Pg0` class) report a postgres instance as "running" based on process/port-alive state only — there's no real connectivity check (e.g. `SELECT 1`). This means a postgres backend that's technically alive in `ps` and bound to its port, but unable to actually serve any query, is reported as healthy indefinitely.

## How this happens in practice

On Linux hosts where `systemd-logind`'s default `RemoveIPC=yes` reaps a user's shared-memory segments when their last login session ends (and that user has no `loginctl` linger enabled), an embedded `pg0` postgres instance can lose its shared memory while the OS process itself keeps running. Any subsequent real connection attempt fails:

```
FATAL: could not open shared memory segment "/PostgreSQL.NNNNNNNNNN": No such file or directory
```

But `pg0 info --name <instance>` and `pg0 list` continue to report the instance as `running` with a valid-looking connection URI, because the check never actually opens a connection — it appears to only check that the process exists and the port is listening.
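To illustrate why a port-level check cannot catch this state, here is a minimal sketch of a "port is listening" probe. It is not taken from `pg0`'s source; `port_is_listening` is a hypothetical name, and the assumption is only that the existing check is roughly equivalent. The probe returns `True` for anything that accepts a TCP connection, including a listener that cannot answer a single query.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Liveness-style check (hypothetical): True if anything accepts a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, host unreachable, etc.
        return False


if __name__ == "__main__":
    # A bare listener that speaks no PostgreSQL protocol at all still passes.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        srv.listen(1)
        host, port = srv.getsockname()
        print(port_is_listening(host, port))
```

A postmaster that has lost its shared memory is in the same position as the dummy listener here: the TCP handshake succeeds, and the failure only appears once a backend tries to start a session.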

## Why this matters

For any application that calls `pg0.info().running` (or the CLI equivalent) to decide whether to skip startup and reuse an existing instance, as `hindsight-api`'s embedded-Postgres manager does, an instance in this state is indistinguishable from a healthy one. ("Zombie" here means alive but unable to serve queries, not a defunct process in the `ps` sense.) The application reuses, or tries to reuse, an instance that can never serve a query, and the resulting failure surfaces much later, in application-level code, with nothing pointing back to the fact that `pg0` reported an unhealthy instance as running.

## Verification performed

- Reproduced directly: pulled a healthy instance's shared memory out from under it (via the systemd `RemoveIPC` interaction above), confirmed the process was still alive (`ps`), confirmed `pg0 list` still reported `(running)`, and confirmed a direct `psql` connection failed with the shared-memory error shown above.
- After restarting the instance with `pg0 stop` + `pg0 start` (getting a fresh shared-memory segment), the same instance correctly served real queries again.

## Suggested fix

Have `pg0 info`/`pg0 list`/the `running` property perform a lightweight real connectivity check (e.g. attempt a trivial query via the bundled `psql`, or open a raw libpq connection) rather than relying solely on process-alive + port-listening state.
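As a rough sketch of what such a check could look like from the Python side, the function below shells out to `psql` and only reports success if a `SELECT 1` round-trip actually returns `1`. The function name `can_serve_queries`, its parameters, and the choice to treat every failure as "not healthy" are illustrative assumptions, not part of `pg0`'s API; the `psql` flags (`-X`, `-w`, `-t`, `-A`, `-c`) and the `PGCONNECT_TIMEOUT` environment variable are standard PostgreSQL client features.

```python
import os
import subprocess


def can_serve_queries(uri: str, psql: str = "psql", timeout: float = 5.0) -> bool:
    """Return True only if a real `SELECT 1` round-trip succeeds against `uri`.

    Hypothetical helper: `psql` is the path to a psql binary (for example the
    bundled one), and `uri` is the connection URI reported for the instance.
    """
    # Bound the connection attempt itself via libpq's PGCONNECT_TIMEOUT.
    env = dict(os.environ, PGCONNECT_TIMEOUT=str(max(1, int(timeout))))
    try:
        result = subprocess.run(
            # -X: skip psqlrc, -w: never prompt for a password,
            # -t -A: tuples only, unaligned, so stdout is just the value.
            [psql, "-X", "-w", "-t", "-A", "-c", "SELECT 1", uri],
            capture_output=True,
            text=True,
            timeout=timeout + 1,  # hard cap on the whole subprocess
            env=env,
        )
    except (OSError, subprocess.TimeoutExpired):
        # psql missing or not executable, or the probe hung: not healthy.
        return False
    return result.returncode == 0 and result.stdout.strip() == "1"
```

In the shared-memory-loss case described above, `psql` exits non-zero with the `FATAL` error, so a check of this shape would return `False` where the process/port check returns "running". The cost is one short-lived connection per status call, which may argue for making the deep check opt-in or cached.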

## Environment

- pg0-embedded 0.14.2 (Python SDK), pg0 CLI 0.14.2
- PostgreSQL 18.1.0 (bundled)
- Host: Debian 12 bookworm, x86_64
