feat: SOAP faults, SCPD capabilities, IGD v2, security hardening#10
Open
MorningLightMountain713 wants to merge 37 commits into
Open
Conversation
This could be better implemented using the GetListOfPortMappings api. However, it does require a second parsing of the returned CDATA, and the returned object is of a different format. For miniupnpd, the current implementation will return an error with the text `ArrayIndexInvalid` - so on the first iteration, we can use this to determine if there are no records available, instead of failing on the next test and returning an error - when there isn't. If there is another upnp implementation being used, it will fall back to the prior behavior.
This is handy for a user to be able to cache the router url. Sometimes, multicast will work for the first 300 seconds until the join times out, in this case - the user can cache the URL, and if it fails when using SSDP, it can set the url and try again without multicast.
…y hardening Library rewrite based on survey of 400+ routers across the Flux network (13 router models with full XML fixtures captured from real devices). SOAP fault handling: - Detect faults in both HTTP 500 and HTTP 200 responses - Structured UpnpError class with code, description, action - Single extractFaultInfo/throwSoapFault utility (no duplication) Device info and capabilities: - Parse rootDesc.xml into GatewayDevice (manufacturer, model, daemon info) - Parse SCPD actionList to discover supported actions - ServiceCapabilities with convenience booleans (supportsAddAnyPortMapping, etc.) - Lazy loading with promise caching — no redundant fetches IGD v2 actions (capability-gated): - AddAnyPortMapping (router assigns port if taken) - DeletePortMappingRange (bulk delete) - GetListOfPortMappings (single-call range query) - Actions check SCPD before executing; throw if unsupported or unknown New v1 actions: - GetSpecificPortMappingEntry — O(1) mapping lookup by port - GetStatusInfo — WAN connection status and uptime for reboot detection Security: - XXE protection (processEntities: false on XMLParser) - XML escaping on SOAP argument values (prevents injection) - Response size limits (2MB cap via axios maxContentLength) - getMappings iteration capped at 10,000 (prevents infinite loop from malicious router) Networking: - HTTP Agent with keepAlive: false — prevents Node.js 19+ socket hang up errors against miniupnpd (which always responds with Connection: close) - Single SSDP socket bound to 0.0.0.0 (was N sockets for N interfaces) - Local address resolved via UDP connect (zero-packet kernel route query) — the standard technique used by miniupnpc, Python, Go, Docker, Kubernetes Bug fixes: - normalizeOptions defaults internal port to match external (fixes Sagemcom 402) - parseMapping handles missing NewProtocol gracefully - parseDescription filters non-object entries from service list - Service preference order: WANIPConnection:2 > :1 > WANPPPConnection:1 Tests: - 252 unit tests across 13 router fixtures (170 XML files) - Fixtures from: OPNsense, pfSense 2.7/2.8, ASUS RT-AX55, NEC SH621A1, Sagemcom Livebox, Sagemcom F5685, Nokia IGD v2, MikroTik, Freebox, Linux IGD, Sercomm GPON, Technicolor - Integration test suite updated for new async UpnpInfo API Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add GitHub Actions workflow: build + test on push (Node 18/20/22), publish to npm on GitHub release with provenance - Remove build/ from git tracking (built by CI and npm prepare) - Add build/ to .gitignore - Check in package-lock.json for reproducible installs - Update all dependencies to latest: - axios ^1.6.5 (latest 1.x) - fast-xml-parser ^5.5.9 (major bump from 4.x, fixes CVEs) - typescript ^6.0.2 (major bump from 5.x) - @types/node ^25.5.0 - Update tsconfig for TypeScript 6 compatibility (target es2020, types: ["node"]) - Add prepare script so npm install from git triggers build - Add types field for TypeScript consumers - Package name set to @megachips/nat-upnp for testing before PR to @RunOnFlux - Zero npm audit vulnerabilities Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- actions/checkout@v6, actions/setup-node@v6 (fixes Node.js 20 deprecation warnings) - Test matrix: Node 20, 22, 24 (drop 18, add 24) - Publish on Node 24 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
npm version errors when version unchanged. Add --allow-same-version flag. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Complete README rewrite documenting all new APIs - References @runonflux/nat-upnp as upstream - BlueOak Model License 1.0.0 (modern permissive, MIT-compatible) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Security: - Add HTTP timeout (10s) and redirect limit (2) to all axios requests - Validate SSDP Location header (reject non-http URLs) - Validate Device constructor URL - Guard against empty/IPv6 hostname in resolveLocalAddress - Type-safe axios error handling (isAxiosErrorWithData guard) Race conditions: - SSDP socket only stored after bind succeeds (not before) - Concurrent getGateway() calls share one pending promise - All lazy-cached promises clear on error to allow retry Types: - RawResponse simplified to Record<string, unknown> (was incorrectly typed) - Mapping, StatusInfo, GatewayDevice, ServiceCapabilities use readonly - catch(err: unknown) instead of catch(err: any) - extractFaultInfo typed with Record<string, unknown> Code quality: - Remove dead getService() method from IDevice interface - Error messages include URLs for production debugging - Remove listener before setting ended flag (SSDP cleanup ordering) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Runtime hardening from Node.js expert review: - resolveLocalAddress: add 5s timeout with proper settled guard to prevent socket leak on unreachable IPs - Client.close(): set closed flag, getGateway() throws if called after close - Prevents callers from hanging forever when close() is called during gateway discovery Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
getMappings() previously swallowed ALL errors silently, returning partial results with no indication of incompleteness. Now: - End-of-list errors (713, 714) break cleanly as before - First-iteration errors (empty list) break cleanly as before - Mid-iteration errors (network failure, timeout) THROW so the caller knows they have incomplete data, not an empty router This prevents a dangerous scenario where a network hiccup at iteration 5 of 25 causes the caller to act on 5 mappings thinking that's all of them. Also: - SSDP socket calls unref() after bind so forgetting close() doesn't prevent the Node process from exiting. The socket stays functional but won't keep the event loop alive on its own. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SSDP socket management:
- Create fresh Ssdp instance per gateway discovery instead of holding
one for the Client lifetime. Socket is closed automatically when
discovery completes (success or failure). No socket leak even if
the caller never calls close().
- Removed unref() hack — proper cleanup is better than masking leaks.
- Client.close() now just sets the closed flag (no SSDP to clean up).
Convenience method:
- UpnpInfo.getAll() resolves device, capabilities, and localAddress
in parallel and returns a ResolvedGatewayInfo object. For callers
who don't need lazy loading:
const info = await client.getGateway();
const { device, capabilities, localAddress } = await info.getAll();
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Document getAll() convenience method - Explain SSDP socket auto-cleanup (no leak on forgotten close) - Add Resource Management section - Document getMappings() mid-iteration error behavior - Update Security section with timeout, redirect, and validation details Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add missing error codes 713, 728, 729 to table - Clarify getMapping() can throw UpnpError, not just return null - Document numberOfPorts parameter on getMappingRange - Document default TTL (1800s / 30 minutes) for port mappings Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… logic Bug 1: discoverGateway device event handler had no try-catch. If Device constructor threw (e.g., malformed SSDP Location header), the exception was unhandled, crashing the Node process. The promise never settled and the SSDP socket leaked. Now wrapped in try-catch with proper reject(). Bug 2: getMappings first-iteration error handling was wrong. If the router returned error 501 (Action Failed) on the first iteration, the code silently returned [] instead of throwing. This conflated "unsupported action" with "empty mapping list." Now only 713/714 (end-of-list) cause a break; all other errors throw regardless of iteration index. Both found during final pre-production review round. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- SSDP: require Location header to be present AND start with "http" (was allowing undefined/empty locations through, which would crash Device constructor) - parseDescription: cap recursion at 10 levels to prevent stack overflow from malicious/broken router XML with deeply nested devices Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The original nat-upnp was MIT-licensed and its terms require the notice be preserved in derivative works. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Major rewrite of nat-upnp:
getMapping()for O(1) lookup,UpnpInfo.getAll()convenience method, device info/capabilities via lazy gettersRouter compatibility
Every SOAP action is tested against real XML responses captured from each router. Fixtures cover root description parsing, SCPD capability detection, all core SOAP actions, fault handling, and edge cases.
Additionally, integration tests (
npm run flux-test) have been run against 400+ live Flux nodes with diverse router hardware.Pre-publish setup required
Before the first
npm publishfrom CI after merging, npm trusted publishing must be configured for the@runonflux/nat-upnppackage. This is a one-time, ~2 minute setup:@runonflux/nat-upnp→ "Publishing access"RunOnFlux/nat-upnpci.yml--provenanceand OIDCid-token: writeWithout this, the publish step will fail with a 403 because there's no
NPM_TOKENsecret — the workflow uses OIDC instead.Test plan
npm run test— 400+ unit tests pass against all 13 router fixturesnpm run flux-test— integration tests pass against live routers on the Flux network🤖 Generated with Claude Code