docs: document LMCache Prometheus metrics harmless error for vLLM 0.12.0 (#5026)

keivenchang · web-flow · commit 4d0b1a119e7f · 2025-12-19T08:57:40.000-08:00
Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
diff --git a/docs/backends/vllm/LMCache_Integration.md b/docs/backends/vllm/LMCache_Integration.md
@@ -160,6 +160,45 @@ When LMCache is enabled with `--connector lmcache` and `DYN_SYSTEM_PORT` is set,
 
 For detailed information on LMCache metrics, including the complete list of available metrics and how to access them, see the **[LMCache Metrics section](prometheus.md#lmcache-metrics)** in the vLLM Prometheus Metrics Guide.
 
+### Troubleshooting
+
+#### LMCache log: `PrometheusLogger instance already created with different metadata`
+
+You may see an error like:
+
+```text
+LMCache ERROR: PrometheusLogger instance already created with different metadata. This should not happen except in test
+```
+
+**Version note**: We reproduced this behavior with **vLLM v0.12.0**. We have not reproduced it with **vLLM v0.11.0**, so it may be specific to (or introduced in) v0.12.0.
+
+LMCache emits this message when its connector is initialized more than once in the same process (for example, once for a `WORKER` role and later for a `SCHEDULER` role). LMCache uses a process-global singleton for its Prometheus logger, so the second initialization logs this error if its metadata differs from the metadata used in the first.
+
+- **Impact**: This is a log-only error; in our testing it does not prevent vLLM/Dynamo from serving requests. If you rely on LMCache metric labels, be aware that the logger singleton keeps the metadata from the first initialization.
+- **Repro without Dynamo** (vLLM v0.12.0):
+
+```bash
+vllm serve Qwen/Qwen3-0.6B \
+  --host 127.0.0.1 --port 18000 \
+  --gpu-memory-utilization 0.24 \
+  --enforce-eager \
+  --no-enable-prefix-caching \
+  --max-num-seqs 2 \
+  --kv-offloading-backend lmcache \
+  --kv-offloading-size 1 \
+  --disable-hybrid-kv-cache-manager
+```
+
+- **Mitigation (silence)**: set `LMCACHE_LOG_LEVEL=CRITICAL`. This also hides every other LMCache message below `CRITICAL`.
+- **Upstream issue**: [vLLM issue #30996](https://github.com/vllm-project/vllm/issues/30996).
+
+#### vLLM log: `Found PROMETHEUS_MULTIPROC_DIR was set by user`
+
+vLLM v1 uses `prometheus_client.multiprocess` and stores intermediate metric values in `PROMETHEUS_MULTIPROC_DIR`.
+
+- If you **set `PROMETHEUS_MULTIPROC_DIR` yourself**, vLLM warns that the directory must be wiped between runs to avoid stale/incorrect metrics.
+- When running via Dynamo, the vLLM wrapper may set `PROMETHEUS_MULTIPROC_DIR` internally to a temporary directory to avoid vLLM cleanup issues. If you still see the warning, confirm you are not exporting `PROMETHEUS_MULTIPROC_DIR` in your shell or container environment.
+
 ## References and Additional Resources
 
 - [LMCache Documentation](https://docs.lmcache.ai/index.html) - Comprehensive guide and API reference
diff --git a/docs/backends/vllm/prometheus.md b/docs/backends/vllm/prometheus.md
@@ -129,6 +129,12 @@ python -m dynamo.vllm --model Qwen/Qwen3-0.6B --connector lmcache
 curl -s localhost:8081/metrics | grep "^lmcache:"
 ```
 
+### Troubleshooting
+
+Troubleshooting LMCache-related metrics and logs (including `PrometheusLogger instance already created with different metadata` and `PROMETHEUS_MULTIPROC_DIR` warnings) is documented in:
+
+- [LMCache Integration Guide](LMCache_Integration.md#troubleshooting)
+
 **For complete LMCache configuration and metric details**, see:
 - [LMCache Integration Guide](LMCache_Integration.md) - Setup and configuration
 - [LMCache Observability Documentation](https://docs.lmcache.ai/production/observability/vllm_endpoint.html) - Complete metrics reference