feat(cli): add CLI tool with spinner progress and auto-save HTML #75
chenmofei wants to merge 14 commits into
Conversation
- Add CLI package that converts Markdown to styled HTML via local AI agents
- Support 8 coding-agent CLIs (Claude Code, Codex, Cursor Agent, Gemini, etc.)
- 75 skill templates from next/src/lib/templates/skills/
- Spinner progress indicator with chunk count and elapsed time (zero deps, pure ANSI)
- Auto-save output to <input>.html when input is a file
- --output-dir / -d flag to specify auto-save directory
- Config management (default template, agent, model)
- Stdin support for piping content

Part of: nexu-io/html-anything
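I have long used an editor for writing and managing documents, including loading the OpenClaw workspace directly in the editor to maintain and check its output, so converting directly in the terminal is more convenient. The HTML is also mainly meant to be shared with other people, so all I need from the templates is that they stay consistently clear, uniform, attractive, and standardized; I do not need to re-adjust and polish the page every time.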
mrcfps
left a comment
@chenmofei Thanks for shipping the CLI package — I verified the new workspace package with pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, and a couple of mocked convert runs locally. I found two error-handling gaps that look worth tightening before people start scripting against this CLI.
- extractHtml: return empty string instead of wrapping non-HTML in pre tag, so the CLI correctly surfaces agent errors (rate limits, auth failures) instead of silently saving a valid-looking HTML file around error text
- createSpinner: in the non-TTY branch, still flush the final status message to stderr so CI/piped scripts can diagnose failures
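The extractHtml change above can be illustrated with a minimal sketch. This is not the PR's actual implementation: the detection regex and the decision to slice from the first document marker are assumptions made only to show the "return empty instead of wrapping" contract.

```typescript
// Hypothetical sketch of the "empty string for non-HTML" contract.
// The real extractHtml in the PR may detect HTML differently.
function extractHtml(raw: string): string {
  // Look for the start of an HTML document anywhere in the agent output.
  const start = raw.search(/<!doctype html|<html[\s>]/i);
  // No HTML found: return "" so the caller can fail instead of saving a file.
  if (start === -1) return "";
  return raw.slice(start).trim();
}
```

A caller would then treat an empty result as a failed conversion and exit non-zero, rather than writing error text into a valid-looking .html file.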
mrcfps
left a comment
@chenmofei Thanks for the quick follow-up on the CLI fixes — I re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, and a few mocked convert runs locally. I found one remaining failure-path bug that still looks worth fixing before people script against this command.
            case "stderr":
              break;
            case "done":
              break;
          }
        }
      } catch (err) {
        spinner.stop(`\x1b[31m✗\x1b[0m Error: ${err instanceof Error ? err.message : String(err)}`);
        process.exit(1);
      }

      const elapsed = ((Date.now() - spinner.start) / 1000).toFixed(1);
      spinner.stop(`\x1b[32m✓\x1b[0m Done in ${elapsed}s`);
done.code is ignored here, and the stderr branch above drops the agent's failure output, so a failed child process can still look like a successful conversion. I reproduced this with a fake CLAUDE_BIN that printed valid-looking HTML, wrote rate limit after partial output to stderr, and exited 1: html-anything convert ... still printed ✓ Done, saved input.html, and returned exit code 0. That makes quota/auth/runtime failures indistinguishable from success in the main scripted path, which is exactly where callers rely on the process status. Can we stash stderr, remember the done.code, and treat any non-zero exit as a hard failure before the success spinner/save path runs? If you want to preserve partial HTML for debugging, it would still be safer to print it as diagnostic context while returning non-zero instead of writing it as a successful result.
Fixed — the event loop now tracks both stderr and done.code. If the agent exits non-zero, the CLI reports the exit code and any accumulated stderr, then returns false (so handleConvert exits non-zero without saving a file). Partial HTML is not saved as a successful result.
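A rough sketch of the behaviour described in this reply, assuming the event shapes seen in the diff fragments ("delta", "stderr", "done" with a code). The function name summarizeRun and the result shape are hypothetical, and the real loop consumes a stream rather than an array.

```typescript
// Hypothetical event shapes, modelled on the cases visible in the review diff.
type AgentEvent =
  | { type: "delta"; text: string }
  | { type: "stderr"; text: string }
  | { type: "done"; code: number | null };

interface RunResult {
  ok: boolean;
  html: string;
  stderr: string;
  code: number | null;
}

// Accumulate output and stderr, and only report success on exit code 0.
function summarizeRun(events: AgentEvent[]): RunResult {
  let html = "";
  let stderr = "";
  let code: number | null = null;
  for (const ev of events) {
    switch (ev.type) {
      case "delta":
        html += ev.text;
        break;
      case "stderr":
        stderr += ev.text; // keep the agent's failure output for reporting
        break;
      case "done":
        code = ev.code; // remember the exit status instead of dropping it
        break;
    }
  }
  return { ok: code === 0, html, stderr, code };
}
```

With this shape, the save path runs only when ok is true; otherwise the caller can print code and stderr and return a non-zero status.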
- Agent exit-code & stderr (A): track done.code and stderr; if the agent exits non-zero, report the failure instead of silently saving a (possibly truncated) HTML file with exit 0.
- Format validation (B): reject unknown --format values with a list of supported formats (markdown, text, csv, json).
- Config write guard (C): catch filesystem errors in saveConfig() so disk-full/permission failures show a readable message instead of an uncaught exception.
- Overwrite prompt (D): ask before overwriting an existing output file in TTY mode; skip the prompt (auto-overwrite) when piped/CI.
- EPIPE handler (E): catch broken-pipe errors on stdout so piping to head(1) or early-closing consumers does not print a noisy stacktrace.
- -o/-d conflict (F): error when both --output and --output-dir are set.
- Multi-file support (G): accept multiple positional input files, process each sequentially, then summarise failures.
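Items (B) and (F) are simple flag checks, sketched below under assumptions: the Flags shape, the validateFlags name, and the message wording are all hypothetical. Only the four format names and the two flag names come from the commit message.

```typescript
// Supported formats as listed in the commit message.
const SUPPORTED_FORMATS: readonly string[] = ["markdown", "text", "csv", "json"];

// Hypothetical parsed-flags shape; the real CLI's flag object may differ.
interface Flags {
  format?: string;
  output?: string;
  outputDir?: string;
}

// Returns an error message, or null when the flags are acceptable.
function validateFlags(flags: Flags): string | null {
  if (flags.format !== undefined && !SUPPORTED_FORMATS.includes(flags.format)) {
    return `Unknown format "${flags.format}". Supported formats: ${SUPPORTED_FORMATS.join(", ")}`;
  }
  if (flags.output && flags.outputDir) {
    return "--output and --output-dir cannot be used together";
  }
  return null;
}
```

Returning a message instead of exiting inside the validator keeps the check easy to unit-test; the command handler decides how to print it and which exit code to use.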
Thanks for the review. Beyond the stderr/exit-code fix, I found and addressed several related error-handling gaps in the same pass:
mrcfps
left a comment
@chenmofei Thanks for the follow-up fixes here — I re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, and a mocked batch convert flow locally. I found one remaining batch-output bug in the new multi-file path that can silently replace one generated document with another, so I think this still needs one more pass before merge.
      } else if (inputPath) {
        const basename = path.basename(inputPath, path.extname(inputPath));
        const outputDir = flags.outputDir || process.cwd();
        outputPath = path.resolve(outputDir, `${basename}.html`);
      }

      if (outputPath) {
        if (fs.existsSync(outputPath)) {
          const overwrite = await promptOverwrite(outputPath);
          if (!overwrite) {
The new multi-file flow can silently collapse two inputs into one output here because every file is keyed only by path.basename(inputPath, ext). I reproduced it with two files named dir1/readme.md and dir2/readme.md: html-anything convert dir1/readme.md dir2/readme.md -a deepseek -t blog-post -d out reported success twice but only left a single out/readme.html, containing the second file's HTML. In non-TTY runs this is especially risky because promptOverwrite() auto-returns true, so CI/scripts overwrite the first result without any prompt. Could we make batch outputs collision-safe before writing — for example by preserving each file's relative path under --output-dir, appending a disambiguating suffix, or pre-scanning for duplicate basenames and failing with a clear error?
Fixed with a two-step pre-scan before any agent work starts:
- Basename collision detection — lists conflicting basenames and asks whether to resolve by preserving relative directory paths (e.g. dir1/readme.html, dir2/readme.html).
- Overwrite check — after resolving all output paths, checks whether any target files already exist and asks for confirmation before overwriting.
At any step, choosing N aborts with a clear error before any agent invocation begins.
When multiple input files would produce the same output basename (e.g. dir1/readme.md and dir2/readme.md both -> readme.html), the CLI now pre-scans before any work begins:
1. Collision detection: lists conflicting basenames and asks whether to resolve by preserving relative directory paths (dir1/readme.html).
2. Overwrite check: after resolving all output paths, checks whether any target files already exist and asks for confirmation before overwriting.
3. On N at any step, the CLI aborts with a clear error before any agent work starts.
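The collision-detection step can be sketched as grouping inputs by their extension-less basename. The helper name findBasenameCollisions is hypothetical and is not taken from the PR.

```typescript
import * as path from "node:path";

// Group input files by the basename their output would get, and keep only
// the groups with more than one member. Hypothetical helper, not the PR's code.
function findBasenameCollisions(inputs: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const input of inputs) {
    const name = path.basename(input, path.extname(input));
    groups.set(name, [...(groups.get(name) ?? []), input]);
  }
  const collisions = new Map<string, string[]>();
  groups.forEach((files, name) => {
    if (files.length > 1) collisions.set(name, files);
  });
  return collisions;
}
```

For dir1/readme.md and dir2/readme.md this yields a single entry keyed by readme, which is what the pre-scan would list before asking how to resolve it.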
mrcfps
left a comment
@chenmofei Thanks for iterating on the batch-convert fixes — I re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, and a mocked non-TTY batch convert flow locally. I found two remaining batch-output regressions in the current multi-file path that can still break scripted runs or write outside the requested output directory, so I think this needs one more pass before merge.
    const existingFiles = outputPlan.filter((p) => fs.existsSync(p.outputPath));
    if (existingFiles.length > 0) {
      console.error(`\x1b[33m⚠\x1b[0m The following output files already exist:`);
      for (const p of existingFiles) console.error(` ${p.outputPath}`);
      const ok = await promptYesNo("\x1b[33m⚠\x1b[0m Overwrite? (y/N): ");
      if (!ok) {
        console.error("Aborted.");
        process.exit(1);
      }
This pre-scan changes non-TTY overwrite behavior from “continue and overwrite” to “abort before any work starts”. promptYesNo() resolves false whenever stdin/stderr is not a TTY (cli/src/index.ts:464-466), so a scripted batch run now exits 1 as soon as any target already exists, even though the single-file path still auto-overwrites in non-TTY mode via promptOverwrite() and the PR description calls out CI/piped support. I reproduced it with two existing outputs plus a fake Claude binary: html-anything convert a/one.md b/two.md -a claude -t blog-post -d out printed the existing files list and aborted before invoking the agent. Could this branch reuse the single-file non-TTY behavior (or otherwise skip the confirmation gate outside TTY) so batch conversion stays scriptable?
Good catch — fixed. The batch overwrite pre-scan now only prompts in TTY mode, matching the single-file promptOverwrite() auto-overwrite behaviour. In non-TTY/CI, existing files are silently overwritten (same as before for single-file).
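The TTY gating described in this reply reduces to a small decision function. The sketch below is an assumption about structure only: overwriteAction and its return values are hypothetical names, not the PR's API.

```typescript
type OverwriteAction = "write" | "prompt";

// Decide whether the batch pre-scan should ask before overwriting.
// Outside a TTY there is nobody to answer, so the sketch proceeds directly,
// mirroring the single-file auto-overwrite behaviour described in the review.
function overwriteAction(existingOutputs: string[], isTTY: boolean): OverwriteAction {
  if (existingOutputs.length === 0) return "write";
  return isTTY ? "prompt" : "write";
}
```

In a real command the isTTY argument would probably be derived from process.stdin.isTTY and process.stderr.isTTY; passing it in explicitly makes both branches testable without a terminal.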
    function resolveCollisionOutput(inputPath: string, outputDir: string): string {
      const basename = path.basename(inputPath, path.extname(inputPath));
      const inputDir = path.dirname(inputPath);
      const relativeDir = path.relative(process.cwd(), inputDir);
      if (relativeDir && relativeDir !== ".") {
        return path.resolve(outputDir, relativeDir, `${basename}.html`);
      }
resolveCollisionOutput() uses path.relative(process.cwd(), inputDir) as the disambiguating subdirectory, so any colliding input outside the current working directory produces a leading .. path that escapes the user-selected --output-dir. From the repo root, the current logic maps /tmp/a/readme.md and /tmp/b/readme.md to /Users/tmp/a/readme.html and /Users/tmp/b/readme.html, not under -d dist at all. That turns the collision fix into writes outside the requested destination, which is a pretty risky surprise for the main batch path. Could we keep the disambiguator inside outputDir instead — for example by deriving it from a common input root and stripping leading .. segments, or by appending a sanitized suffix instead of feeding path.relative(process.cwd(), ...) straight into path.resolve()?
Fixed. resolveCollisionOutput now derives relative paths from the common ancestor of all colliding inputs (findCommonPath), and strips any .. segments, so outputs always stay inside --output-dir regardless of where the inputs live.
- Batch overwrite now skips the interactive prompt outside TTY (matching the single-file promptOverwrite auto-overwrite behaviour), so scripted CI runs don't abort when existing outputs are present.
- resolveCollisionOutput now derives relative paths from the common ancestor of all colliding inputs (findCommonPath) instead of cwd, and strips '..' segments so outputs stay inside --output-dir, even when inputs live outside the current working directory.
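One way the common-ancestor approach could look is sketched below. The function names match those mentioned in the commit message, but the signatures and bodies are assumptions, and edge cases such as inputs on different Windows drives are not handled.

```typescript
import * as path from "node:path";

// Longest shared directory prefix of the given directories (sketch only).
function findCommonPath(dirs: string[]): string {
  if (dirs.length === 0) return "";
  const split = dirs.map((d) => path.resolve(d).split(path.sep));
  const first = split[0];
  let shared = first.length;
  for (const parts of split) {
    let i = 0;
    while (i < shared && i < parts.length && parts[i] === first[i]) i++;
    shared = i;
  }
  return first.slice(0, shared).join(path.sep) || path.sep;
}

// Place the output under outputDir, using the path below the common root as a
// disambiguating subdirectory and dropping any ".." segments defensively.
function resolveCollisionOutput(inputPath: string, commonRoot: string, outputDir: string): string {
  const base = path.basename(inputPath, path.extname(inputPath));
  const rel = path.relative(commonRoot, path.dirname(path.resolve(inputPath)));
  const safe = rel.split(path.sep).filter((seg) => seg !== "" && seg !== "." && seg !== "..");
  return path.resolve(outputDir, ...safe, `${base}.html`);
}
```

With inputs /tmp/a/readme.md and /tmp/b/readme.md and -d dist, the common root is /tmp, so the outputs land at dist/a/readme.html and dist/b/readme.html regardless of the current working directory.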
mrcfps
left a comment
@chenmofei Thanks for the follow-up fixes here — I re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, plus a couple of mocked convert runs locally. I found two more correctness issues in the DeepSeek/default-agent paths that look worth tightening before people depend on this CLI in scripts.
      else if (part.kind === "html") safeEnqueue({ type: "html", text: part.text });
      else if (part.kind === "meta") safeEnqueue({ type: "meta", key: part.key, value: part.value });
    }
    if (opts.agent === "aider" || opts.agent === "deepseek") {
aider / deepseek still duplicate the final stdout chunk here. On close we first run parse(stdoutBuf) and enqueue its delta(s), then the if (opts.agent === "aider" || opts.agent === "deepseek") branch appends the same stdoutBuf again. With a fake DEEPSEEK_BIN that prints a single HTML document without a trailing newline, html-anything convert -a deepseek ... saved a file containing two <!DOCTYPE html> blocks, so the generated HTML is invalid for one of the advertised supported agents. Could we make the close path choose one mechanism or the other for these agents — for example, emit the parsed remainder only, or bypass parse(stdoutBuf) entirely when you intentionally forward raw stdout?
Already fixed in 371990b (applied after the revert). The close path now uses if/else: aider/deepseek bypass parse() entirely and enqueue stdoutBuf directly; all other agents go through parse. No more double-enqueue. Covered by 5 regression tests in agents-invoke.test.ts.
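The if/else close path described here can be sketched as a pure function that returns the events to enqueue. The event shapes follow the diff fragment; flushOnClose, RAW_STDOUT_AGENTS, and the injected parse callback are hypothetical names used only for illustration.

```typescript
type StreamEvent =
  | { type: "delta"; text: string }
  | { type: "html"; text: string }
  | { type: "meta"; key: string; value: string };

// Agents whose stdout is forwarded raw, per the review discussion.
const RAW_STDOUT_AGENTS = new Set(["aider", "deepseek"]);

// On process close, pick exactly one mechanism for the leftover buffer:
// raw forwarding for aider/deepseek, the parser for everyone else.
function flushOnClose(
  agent: string,
  stdoutBuf: string,
  parse: (buf: string) => StreamEvent[],
): StreamEvent[] {
  if (stdoutBuf.length === 0) return [];
  if (RAW_STDOUT_AGENTS.has(agent)) {
    return [{ type: "delta", text: stdoutBuf }];
  }
  return parse(stdoutBuf);
}
```

Because the two branches are mutually exclusive, a document printed without a trailing newline is emitted once, which is the property the no-double-enqueue regression tests are meant to protect.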
      process.exit(1);
    }
    const agents = getAvailableAgents();
    const agent = agents.find((a) => a.id === val);
config set-default-agent currently accepts any known ID, including agents marked unsupported or not actually usable in this CLI build. Because findAgent() later reuses config.defaultAgent without filtering unsupported, a user can save hermes/kimi here and then every plain convert run aborts with the ACP-protocol error instead of falling back to a supported installed agent. I reproduced that with HERMES_BIN + DEEPSEEK_BIN: config set-default-agent hermes succeeds, and the next convert immediately fails before reaching DeepSeek. Could we reject unavailable/unsupported agents in this setter (and ideally skip unsupported defaults in findAgent() as a second guard)?
Already fixed in 371990b (applied after the revert). Two guards: 1) config set-default-agent now rejects unavailable and unsupported agents, listing supported alternatives. 2) findAgent default lookup filters !a.unsupported, so a stale default falls through to the next available agent. Covered in index.test.ts.
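The two guards can be sketched as follows. The AgentInfo shape is inferred from the available and unsupported fields mentioned in the thread; the function bodies and message wording are assumptions rather than the PR's code.

```typescript
// Hypothetical agent record, based on the fields named in the review.
interface AgentInfo {
  id: string;
  available: boolean;
  unsupported?: boolean;
}

function usableAgents(agents: AgentInfo[]): AgentInfo[] {
  return agents.filter((a) => a.available && !a.unsupported);
}

// Guard 1: the setter refuses agents that cannot actually run a conversion.
function validateDefaultAgent(agents: AgentInfo[], id: string): string | null {
  const agent = agents.find((a) => a.id === id);
  const alternatives = usableAgents(agents).map((a) => a.id).join(", ");
  if (!agent) return `Unknown agent "${id}". Supported: ${alternatives}`;
  if (!agent.available) return `Agent "${id}" is not installed. Supported: ${alternatives}`;
  if (agent.unsupported) return `Agent "${id}" uses an unsupported protocol. Supported: ${alternatives}`;
  return null;
}

// Guard 2: a stale default falls through to the first usable agent.
function findAgent(agents: AgentInfo[], defaultAgent?: string): AgentInfo | undefined {
  const usable = usableAgents(agents);
  return usable.find((a) => a.id === defaultAgent) ?? usable[0];
}
```

The second guard matters even with the first in place, since a config file edited by hand can still contain an unsupported ID.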
…d default agents
- agents-invoke: aider/deepseek close path now enqueues stdoutBuf directly instead of running it through both parse() AND a raw enqueue, which was producing duplicate HTML (two <!DOCTYPE html> blocks).
- handleConfig set-default-agent: now rejects agents that are not installed (!available) or use an unsupported protocol (unsupported), with a clear error listing available supported alternatives.
- findAgent: when resolving config.defaultAgent, now also filters out unsupported agents so a stale default (e.g. from manual config.json edit) automatically falls through to the next available agent.
…supported default agents" This reverts commit 19636bc.
mrcfps
left a comment
@chenmofei Thanks for the latest fixes here — I re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, and rechecked the DeepSeek/default-agent flows with mocked CLIs on this head. Two correctness issues are still reproducible in the current CLI paths, so I left the concrete follow-ups inline below.
    for (const part of parse(stdoutBuf)) {
      if (part.kind === "delta") safeEnqueue({ type: "delta", text: part.text });
      else if (part.kind === "html") safeEnqueue({ type: "html", text: part.text });
      else if (part.kind === "meta") safeEnqueue({ type: "meta", key: part.key, value: part.value });
    }
    if (opts.agent === "aider" || opts.agent === "deepseek") {
      safeEnqueue({ type: "delta", text: stdoutBuf });
    }
This close-path still duplicates the final stdout chunk for deepseek / aider when the agent finishes without a trailing newline. In this range we first replay parse(stdoutBuf) and then immediately enqueue stdoutBuf again in the special-case branch, so the same HTML is appended twice. Repro on this head: DEEPSEEK_BIN=<fake script printing one HTML document> with node cli/dist/run.js convert ... -a deepseek -o out.html produced an out.html containing two <!DOCTYPE html> blocks. That makes the saved document invalid for one of the advertised supported agents. Could we make this branch choose one mechanism or the other — emit the parsed remainder or forward raw stdout, but not both — and add a no-trailing-newline regression around the DeepSeek/Aider path?
Already fixed in 371990b (applied after the revert). The close path now uses if/else: aider/deepseek bypass parse() entirely and enqueue stdoutBuf directly; all other agents go through parse. No more double-enqueue.
      process.exit(1);
    }
    const agents = getAvailableAgents();
    const agent = agents.find((a) => a.id === val);
config set-default-agent still accepts installed-but-unsupported IDs like hermes, and findAgent() later reuses that saved default through the available-only check on lines 53-55. Repro on this head: with fake HERMES_BIN and DEEPSEEK_BIN, node cli/dist/run.js config set-default-agent hermes succeeds, then a plain convert ... -t blog-post picks Hermes and aborts with the ACP protocol error instead of falling back to DeepSeek. Because the bad value is persisted in ~/.config/html-anything/config.json, every default convert run stays broken until the user resets it manually. Could we reject unsupported/unavailable agents in this setter and keep the default-agent lookup filtered to !unsupported as a second guard?
Already fixed in 371990b (applied after the revert). Two guards:
- config set-default-agent now rejects unavailable and unsupported agents, listing supported alternatives.
- findAgent default lookup filters !a.unsupported, so a stale default falls through to the next available agent.
Verified with HERMES_BIN=/bin/ls: config set-default-agent hermes → "uses an unsupported protocol" + lists alternatives, exit 1.
…ult agents
- agents-invoke: aider/deepseek close path now enqueues stdoutBuf directly instead of running it through both parse() AND a raw enqueue, which was producing duplicate <!DOCTYPE html> blocks.
- findAgent: when resolving config.defaultAgent, now also filters out unsupported agents so a stale default (e.g. from manual config.json edit) automatically falls through to the next available agent.
- handleConfig set-default-agent: now rejects agents that are not installed or use an unsupported protocol, with a clear error listing available supported alternatives.
mrcfps
left a comment
@chenmofei Thanks for the follow-up fixes here — I re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, and a mocked env-override CLI flow on this head. I found one remaining env-override detection gap that looks worth tightening before people rely on wrapper binaries for custom PATH layouts.
      unsupported: unsupported || undefined,
    };
    const override = a.envOverride ? process.env[a.envOverride] : undefined;
    if (override && existsSync(override)) {
detectAgents() only treats *_BIN overrides as valid when existsSync(override) succeeds, so relative command names like GEMINI_BIN=fake-claude get dropped even though resolveBinForAgent() later accepts the same value via resolveOnPath(). On this head I reproduced that with a mock binary on PATH: node cli/dist/run.js agents still shows Gemini as unavailable, config set-default-agent gemini says it is not installed, and convert -a gemini exits with "No available AI agent found" before the runnable override is ever tried. That means wrapper names in *_BIN break detection/configuration flows even though invocation would succeed. Please resolve env overrides here with the same helper used in resolveBinForAgent() (or a shared tryPath() helper) so both detection and execution accept either an absolute path or a command name on PATH consistently.
Fixed. detectAgents() now falls back to resolveOnPath() when existsSync() fails on a *_BIN override, so relative command names on PATH (e.g. GEMINI_BIN=fake-claude) are recognized in detection and config flows — matching what resolveBinForAgent() already does at invocation time.
detectAgents() previously only accepted *_BIN overrides as absolute paths (existsSync). Relative command names like GEMINI_BIN=fake-claude were dropped even though invocation (resolveBinForAgent) can find them on PATH. Now falls back to resolveOnPath() when existsSync fails, so detection and config flows match the actual invoke behaviour.
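The fallback can be sketched with the filesystem check and PATH directories injected as parameters, which keeps the logic testable. resolveOverride is a hypothetical name; the PR's resolveOnPath() helper is not reproduced here, and details such as Windows executable extensions are ignored.

```typescript
import * as path from "node:path";

// Resolve a *_BIN override: accept it as a filesystem path first, then fall
// back to searching the given PATH directories for a command of that name.
function resolveOverride(
  override: string | undefined,
  exists: (p: string) => boolean,
  pathDirs: string[],
): string | null {
  const value = override?.trim();
  if (!value) return null;
  if (exists(value)) return value; // usable as given
  for (const dir of pathDirs) {
    const candidate = path.join(dir, value);
    if (exists(candidate)) return candidate; // found on PATH
  }
  return null;
}
```

A real caller would likely pass existsSync from node:fs and (process.env.PATH ?? "").split(path.delimiter), so detection and invocation share one resolution rule.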
mrcfps
left a comment
@chenmofei Thanks for the latest follow-up here — I re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, and a mocked batch convert flow on this head. I found one remaining non-TTY batch path that still blocks scripted multi-file converts, so I left the concrete follow-up inline below.
chenmofei
left a comment
Thanks for catching this! Fixed in 40dee96: auto-enable relative directory paths for basename collisions in non-TTY mode instead of aborting, matching the deterministic behavior already used for overwrite prompts. This lets scripted workflows like CI safely convert multiple files with the same basename (e.g., docs/readme.md + packages/readme.md) without user input.
Based on all reviewer feedback across 10 rounds, added a complete regression test suite covering every reported failure path:
- extract-html.test.ts (9): non-HTML content returns empty, no scaffold wrapping
- prompt.test.ts (11): TTY/non-TTY behavior for promptYesNo & promptOverwrite
- collision-resolve.test.ts (8): findCommonPath & resolveCollisionOutput edge cases
- agents-detect.test.ts (20): *_BIN env overrides, PATH resolution, unsupported protocols
- agents-invoke.test.ts (19): DeepSeek/Aider close path no double-enqueue, exit code propagation
- index.test.ts (22): param validation, config set-default-agent guards, convert integration

Refactored for testability:
- Extracted collision-resolve.ts (findCommonPath + resolveCollisionOutput)
- Extracted prompt.ts (promptYesNo + promptOverwrite)

All 89 tests pass. Typecheck and build clean.
chenmofei
left a comment
Thanks for all the thorough review rounds. I've taken all 10 rounds of feedback and built a comprehensive regression test suite to prevent similar issues from recurring:
Summary of this push (c7cc672):
Added 89 tests across 6 suites covering every failure path raised:
| Suite | Tests | Covers |
|---|---|---|
| extract-html.test.ts | 9 | Non-HTML content returns empty, no scaffold wrapping |
| prompt.test.ts | 11 | TTY/non-TTY behavior parity for promptYesNo & promptOverwrite |
| collision-resolve.test.ts | 8 | findCommonPath & resolveCollisionOutput edge cases, filtering |
| agents-detect.test.ts | 20 | env override with PATH command names, unsupported protocol detection |
| agents-invoke.test.ts | 19 | DeepSeek/Aider close-path no double-enqueue, exit code propagation |
| index.test.ts | 22 | Param validation, config set-default-agent guards for unsupported/unavailable agents, convert integration |
Refactored for testability:
- Extracted collision-resolve.ts (findCommonPath + resolveCollisionOutput)
- Extracted prompt.ts (promptYesNo + promptOverwrite)

Verification:
- pnpm -F @html-anything/cli test → 89/89 passed
- pnpm -F @html-anything/cli typecheck → clean
- pnpm -F @html-anything/cli build → clean
mrcfps
left a comment
@chenmofei Thanks for the big regression-test pass here — I re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli test, pnpm -F @html-anything/cli typecheck, pnpm -F @html-anything/cli build, and a mocked env-override convert flow on this head. I found one remaining non-blocking mismatch between agent detection and invocation, so I left the concrete follow-up inline below.
    const tryPath = (p: string | undefined): string | null => {
      if (!p) return null;
      const trimmed = p.trim();
      if (!trimmed) return null;
      if (/^([a-zA-Z]:[\\/]|[\\/])/.test(trimmed)) {
        return existsSync(trimmed) ? trimmed : null;
      }
      return resolveOnPath(trimmed);
tryPath() only accepts absolute filesystem paths here, then falls back straight to resolveOnPath(), so relative overrides like DEEPSEEK_BIN=./mock-deepseek or CLAUDE_BIN=./wrappers/claude get treated as missing even though detectAgents() already marked them available with existsSync(override). On this head I reproduced that by creating ./mock-deepseek in the working directory: html-anything agents reports DeepSeek as available, but html-anything convert ... -a deepseek fails with "DeepSeek TUI (`deepseek`) is not installed or not on PATH." before the wrapper is ever spawned. That breaks the new env-override flow for repo-local wrapper scripts and makes detection/configuration disagree with execution. Please resolve relative paths in the invoke path too, for example by treating values starting with `.` or containing a path separator as filesystem paths (e.g. `path.resolve(trimmed)` + `existsSync`) before falling back to resolveOnPath(), and add a regression test that covers a relative `*_BIN` override through `convert`.
Good catch -- fixed in fd2e7d0.
tryPath() in resolveBinForAgent now handles paths containing / or \ or starting with . as filesystem paths, resolving them via path.resolve() + existsSync() before falling back to resolveOnPath(). DEEPSEEK_BIN=./mock-deepseek and CLAUDE_BIN=../wrappers/claude now work at invocation time, consistent with detectAgents().
New regression tests in agents-invoke.test.ts cover ./mock-agent and ../bin/claude relative binOverrides. Total 91/91 passing, typecheck + build clean.
tryPath() in resolveBinForAgent previously only handled absolute paths (starting with / or C:\) and command names on PATH. Relative paths like ./mock-deepseek or ../wrappers/claude fell through to resolveOnPath() which only searches PATH directories, causing a mismatch where detectAgents() reported the agent as available but invokeAgent() could not find it. Now paths containing / or \ or starting with . are resolved via path.resolve() + existsSync(), matching what detectAgents() does.
Two new test cases verify that invokeAgent correctly resolves relative binOverride paths (e.g. ./mock-agent, ../bin/claude) via path.resolve() + existsSync(), matching what detectAgents() already does.
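The relative-path handling described in these commit messages can be sketched as below. The filesystem check and the PATH lookup are injected so the logic can be exercised without real binaries; looksLikeFilesystemPath is a hypothetical helper name, and the real tryPath() closes over existsSync and resolveOnPath() instead.

```typescript
import * as path from "node:path";

// Values containing a separator or starting with "." are treated as paths,
// everything else as a command name to look up on PATH.
function looksLikeFilesystemPath(value: string): boolean {
  return value.startsWith(".") || value.includes("/") || value.includes("\\");
}

function tryPath(
  p: string | undefined,
  exists: (file: string) => boolean,
  resolveOnPath: (name: string) => string | null,
): string | null {
  if (!p) return null;
  const trimmed = p.trim();
  if (!trimmed) return null;
  if (looksLikeFilesystemPath(trimmed)) {
    // Covers absolute paths as well as ./wrapper and ../bin/tool.
    const absolute = path.resolve(trimmed);
    return exists(absolute) ? absolute : null;
  }
  return resolveOnPath(trimmed);
}
```

Resolving to an absolute path before spawning also means the wrapper keeps working if the process later changes its working directory.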
chenmofei
left a comment
Good catch — fixed:
- tryPath() in resolveBinForAgent now treats paths containing / or \ or starting with . as filesystem paths, resolving them via path.resolve() + existsSync() before falling back to resolveOnPath(). This means DEEPSEEK_BIN=./mock-deepseek and CLAUDE_BIN=../wrappers/claude work at invocation time, consistent with how detectAgents() already handles them.
- New regression tests in agents-invoke.test.ts cover:
  - ./mock-agent relative binOverride → resolves and spawns
  - ../bin/claude relative binOverride → resolves and spawns

Total tests now 91/91 passing, typecheck + build clean.
mrcfps
left a comment
@chenmofei I rechecked the current CLI head end-to-end: reviewed the changed ranges across the new package, re-ran pnpm install --frozen-lockfile, pnpm -F @html-anything/cli test, pnpm -F @html-anything/cli typecheck, and pnpm -F @html-anything/cli build, and spot-checked the recent agent-override and batch-output fixes in the current implementation. Everything I checked on fd2e7d0 now looks consistent with the intended CLI behavior. Thanks for pushing through the follow-up fixes and the regression coverage here 🙌
Implements automatic template detection for the CLI, partially resolves nexu-io#60 and supplements the CLI entrypoint introduced in nexu-io#75.
- Add skills-matcher.ts with three-layer matching strategy:
  1. ~80 strong-signal keyword rules (resume→resume-modern, etc.)
  2. Full-template scoring (tags + name + description + scenario)
  3. AI summary fallback only when confidence is low (~0 tokens)
- Add `auto` command: html-anything auto article.md
- Support --force-ai (skip rules) and --show-match-only flags
- Update README with consolidated parameter docs and decision flowchart

Examples:
- html-anything auto resume.md # auto-match + convert
- html-anything auto article.md --show-match-only # preview match only
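The first matching layer (strong-signal keyword rules) might look roughly like the sketch below. Only the resume→resume-modern mapping comes from the commit message; the second rule, the rule shape, and the function name are assumptions, and the scoring and AI-fallback layers are not shown.

```typescript
interface KeywordRule {
  pattern: RegExp;
  template: string;
}

const KEYWORD_RULES: KeywordRule[] = [
  // Mapping named in the commit message.
  { pattern: /\bresume\b/i, template: "resume-modern" },
  // Hypothetical rule, added only to show a second entry.
  { pattern: /\bblog\b/i, template: "blog-post" },
];

// Layer 1: return the first template whose keyword appears in the text,
// or null so the caller can fall through to scoring or the AI fallback.
function matchByKeyword(text: string): string | null {
  for (const rule of KEYWORD_RULES) {
    if (rule.pattern.test(text)) return rule.template;
  }
  return null;
}
```

A null result is the signal for the later layers to run, so the cheap rules never block a better match.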
lefarcen
left a comment
Hey @chenmofei! 👋 This CLI package is a solid addition to the workspace — terminal-first conversion with smart auto-detection across 8 agents, 75 skill templates, spinner progress, and collision-safe batch output covers a real workflow gap for people who live in their editors.
I ran my own pass over the key execution paths: binary resolution in agents-invoke.ts (existsSync guards before spawn, relative-path handling), the --format whitelist, prompt assembly, and the config write guard. No additional blockers on my side. The 10-round review cycle with @mrcfps caught the important edge cases — format validation, batch non-TTY collision handling, overwrite prompts, EPIPE — and you addressed every one cleanly, then backed it all with 91 regression tests. Nice work pushing through that.
One thing worth a quick check before this merges: the PR currently shows a non-CLEAN merge state on GitHub's side, likely from unresolved conversation threads on earlier review commits. If you see any open threads lingering on older pushes, a brief reply there should clear it up. ❤️
* feat(cli): add CLI tool with spinner progress and auto-save HTML - Add CLI package that converts Markdown to styled HTML via local AI agents - Support 8 coding-agent CLIs (Claude Code, Codex, Cursor Agent, Gemini, etc.) - 75 skill templates from next/src/lib/templates/skills/ - Spinner progress indicator with chunk count and elapsed time (zero deps, pure ANSI) - Auto-save output to <input>.html when input is a file - --output-dir / -d flag to specify auto-save directory - Config management (default template, agent, model) - Stdin support for piping content Part of: nexu-io/html-anything * fix(cli): fail hard on non-HTML output, flush status in non-TTY mode - extractHtml: return empty string instead of wrapping non-HTML in pre tag, so the CLI correctly surfaces agent errors (rate limits, auth failures) instead of silently saving a valid-looking HTML file around error text - createSpinner: in the non-TTY branch, still flush the final status message to stderr so CI/piped scripts can diagnose failures * fix(cli): robust error handling, multi-file support, overwrite prompt Agent exit-code & stderr (A): track done.code and stderr; if the agent exits non-zero, report the failure instead of silently saving a (possibly truncated) HTML file with exit 0. Format validation (B): reject unknown --format values with a list of supported formats (markdown, text, csv, json). Config write guard (C): catch filesystem errors in saveConfig() so disk- full/permission failures show a readable message instead of an uncaught exception. Overwrite prompt (D): ask before overwriting an existing output file in TTY mode; skip the prompt (auto-overwrite) when piped/CI. EPIPE handler (E): catch broken-pipe errors on stdout so piping to head(1) or early-closing consumers does not print a noisy stacktrace. -o/-d conflict (F): error when both --output and --output-dir are set. Multi-file support (G): accept multiple positional input files, process each sequentially, then summarise failures. 
* fix(cli): pre-scan batch outputs for basename collisions
When multiple input files would produce the same output basename (e.g. dir1/readme.md and dir2/readme.md both -> readme.html), the CLI now pre-scans before any work begins:
1. Collision detection: lists conflicting basenames and asks whether to resolve by preserving relative directory paths (dir1/readme.html).
2. Overwrite check: after resolving all output paths, checks whether any target files already exist and asks for confirmation before overwriting.
3. On N at any step, the CLI aborts with a clear error before any agent work starts.

* fix(cli): non-TTY batch overwrite + collision output path safety
- Batch overwrite now skips the interactive prompt outside TTY (matching the single-file promptOverwrite auto-overwrite behaviour), so scripted CI runs don't abort when existing outputs are present.
- resolveCollisionOutput now derives relative paths from the common ancestor of all colliding inputs (findCommonPath) instead of cwd, and strips '..' segments so outputs stay inside --output-dir, even when inputs live outside the current working directory.

* fix(cli): deduplicate aider/deepseek close output + reject unsupported default agents
- agents-invoke: aider/deepseek close path now enqueues stdoutBuf directly instead of running it through both parse() AND a raw enqueue, which was producing duplicate HTML (two <!DOCTYPE html> blocks).
- handleConfig set-default-agent: now rejects agents that are not installed (!available) or use an unsupported protocol (unsupported), with a clear error listing available supported alternatives.
- findAgent: when resolving config.defaultAgent, now also filters out unsupported agents so a stale default (e.g. from manual config.json edit) automatically falls through to the next available agent.

* Revert "fix(cli): deduplicate aider/deepseek close output + reject unsupported default agents"
This reverts commit 19636bc.
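The collision handling described above (common ancestor plus stripping `..` segments) might look roughly like the following. The function names `findCommonPath` and `resolveCollisionOutput` come from the commit message, but their signatures here are assumptions, and the sketch works on `/`-separated strings only so that it stays self-contained.

```typescript
// Hypothetical sketch with assumed signatures; POSIX-style "/" separators only.
function splitPath(p: string): string[] {
  return p.split("/").filter((s) => s !== "" && s !== ".");
}

// Returns the directory segments shared by every input file.
function findCommonPath(files: string[]): string[] {
  if (files.length === 0) return [];
  const dirs = files.map((f) => splitPath(f).slice(0, -1)); // drop the file name
  const first = dirs[0];
  let n = 0;
  while (n < first.length && dirs.every((d) => d[n] === first[n])) n++;
  return first.slice(0, n);
}

// Maps one input to an output path under outputDir, keeping only the part of
// the path below the common ancestor and dropping any ".." segments.
function resolveCollisionOutput(input: string, common: string[], outputDir: string): string {
  const rel = splitPath(input).slice(common.length).filter((s) => s !== "..");
  const file = rel[rel.length - 1] ?? "";
  const dot = file.lastIndexOf(".");
  const stem = dot > 0 ? file.slice(0, dot) : file; // keep dotfiles intact
  return [outputDir, ...rel.slice(0, -1), `${stem}.html`].join("/");
}
```

Because the relative part is taken below the common ancestor rather than below the working directory, two inputs such as `../docs/a/readme.md` and `../docs/b/readme.md` still land inside the output directory.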
* fix(cli): deduplicate aider/deepseek output + reject unsupported default agents
- agents-invoke: aider/deepseek close path now enqueues stdoutBuf directly instead of running it through both parse() AND a raw enqueue, which was producing duplicate <!DOCTYPE html> blocks.
- findAgent: when resolving config.defaultAgent, now also filters out unsupported agents so a stale default (e.g. from manual config.json edit) automatically falls through to the next available agent.
- handleConfig set-default-agent: now rejects agents that are not installed or use an unsupported protocol, with a clear error listing available supported alternatives.

* fix(cli): resolve *_BIN env overrides via PATH for detection
detectAgents() previously only accepted *_BIN overrides as absolute paths (existsSync). Relative command names like GEMINI_BIN=fake-claude were dropped even though invocation (resolveBinForAgent) can find them on PATH. Now falls back to resolveOnPath() when existsSync fails, so detection and config flows match the actual invoke behaviour.
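The `*_BIN` fallback described above, "try the override as a path, then search PATH", can be sketched as below. This is an illustration under assumptions: `resolveBinOverride` is a hypothetical name, the existence check is injected as a function instead of calling `existsSync` directly, and PATH entries are split on `:` (POSIX only).

```typescript
// Hypothetical sketch; the file-existence check is injected so the logic can
// be exercised without touching the real filesystem.
type Exists = (p: string) => boolean;

// Search each PATH directory for the command name (POSIX ":" separator assumed).
function resolveOnPath(cmd: string, pathEnv: string, exists: Exists): string | null {
  for (const dir of pathEnv.split(":")) {
    if (dir === "") continue;
    const candidate = `${dir}/${cmd}`;
    if (exists(candidate)) return candidate;
  }
  return null;
}

// Accept the override if it exists as given; otherwise fall back to PATH,
// so a bare command name is detected the same way it would be invoked.
function resolveBinOverride(override: string, pathEnv: string, exists: Exists): string | null {
  if (exists(override)) return override;
  return resolveOnPath(override, pathEnv, exists);
}
```

In real code the `exists` argument would typically be `existsSync` from `node:fs`, and the separator would come from `path.delimiter`.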
* fix: auto-enable relative paths for basename collisions in non-TTY mode

* feat(cli): add comprehensive test framework (89 tests across 6 suites)
Based on all reviewer feedback across 10 rounds, added a complete regression test suite covering every reported failure path:
- extract-html.test.ts (9): non-HTML content returns empty, no scaffold wrapping
- prompt.test.ts (11): TTY/non-TTY behavior for promptYesNo & promptOverwrite
- collision-resolve.test.ts (8): findCommonPath & resolveCollisionOutput edge cases
- agents-detect.test.ts (20): *_BIN env overrides, PATH resolution, unsupported protocols
- agents-invoke.test.ts (19): DeepSeek/Aider close path no double-enqueue, exit code propagation
- index.test.ts (22): param validation, config set-default-agent guards, convert integration
Refactored for testability:
- Extracted collision-resolve.ts (findCommonPath + resolveCollisionOutput)
- Extracted prompt.ts (promptYesNo + promptOverwrite)
All 89 tests pass. Typecheck and build clean.

* fix(cli): resolve relative *_BIN overrides in tryPath
tryPath() in resolveBinForAgent previously only handled absolute paths (starting with / or C:\) and command names on PATH. Relative paths like ./mock-deepseek or ../wrappers/claude fell through to resolveOnPath() which only searches PATH directories, causing a mismatch where detectAgents() reported the agent as available but invokeAgent() could not find it. Now paths containing / or \ or starting with . are resolved via path.resolve() + existsSync(), matching what detectAgents() does.

* test(cli): add relative *_BIN override resolution tests
Two new test cases verify that invokeAgent correctly resolves relative binOverride paths (e.g. ./mock-agent, ../bin/claude) via path.resolve() + existsSync(), matching what detectAgents() already does.

* feat(cli): add auto command with intelligent template matching
Implements automatic template detection for the CLI, partially resolves #60 and supplements the CLI entrypoint introduced in #75.
- Add skills-matcher.ts with three-layer matching strategy:
  1. ~80 strong-signal keyword rules (resume→resume-modern, etc.)
  2. Full-template scoring (tags + name + description + scenario)
  3. AI summary fallback only when confidence is low (~0 tokens)
- Add `auto` command: html-anything auto article.md
- Support --force-ai (skip rules) and --show-match-only flags
- Update README with consolidated parameter docs and decision flowchart
Examples:
html-anything auto resume.md # auto-match + convert
html-anything auto article.md --show-match-only # preview match only

* fix(cli): word-boundary keyword matching, force-ai gating, EPIPE guard
- Add kwMatches() with \b word-boundary for ASCII keywords, substring for CJK
- Remove ambiguous short keywords: "X", "RED", "TODO", "done", "doing", "todo"
- Gate Layer-2 fallback on !forceAi so --force-ai reaches AI summary
- Add EPIPE guard to handleAuto stdin-to-stdout path (matching handleConvert)

* fix(cli): address PR #80 review feedback + add skills-matcher tests
- Add kwMatches() with \b word-boundary for ASCII keywords, substring for CJK
- Remove ambiguous short keywords: "X", "RED", "TODO", "done", "doing", "todo"
- Gate Layer-2 fallback on !forceAi so --force-ai reaches AI summary
- Add EPIPE guard to handleAuto stdin-to-stdout path
- Fix Layer-1 gate: strong-signal matching now works for any content length
- Export kwMatches for unit testing
- Add skills-matcher.test.ts with 39 tests covering kwMatches, strong-signal matching, false-positive prevention, --force-ai path, fallback, and reason output

* feat: rename DeepSeek TUI agent to CodeWhale
Background: deepseek-tui has been officially renamed to CodeWhale (see https://github.com/Hmbown/CodeWhale/releases/tag/v0.8.41). The legacy deepseek and deepseek-tui binaries are deprecation shims that will be removed in v0.9.0.
Changes:
- Add new AgentDef 'codewhale' (bin: codewhale, vendor: CodeWhale)
- Rename 'deepseek' AgentDef id to 'deepseek-tui' (bin: deepseek-tui)
- Both entries use mutual fallbackBins so GUI detects either binary
- Add codewhale branch to buildArgv, parseLineWithState, close-path
- Update error messages to list both codewhale and deepseek-tui
- Update GUI (settings-modal, welcome-modal) with CodeWhale vendor
- Update README with both CodeWhale and DeepSeek TUI rows
- Reserve 'deepseek' id for future official DeepSeek agent
- Add test coverage for both codewhale and deepseek-tui agents
- 133 tests pass, typecheck clean

* fix(next): deduplicate aide/codewhale/deepseek-tui close-path delta
Mirror the CLI's if/else structure so the raw enqueue replaces (rather than stacks onto) the parser dispatch. Previously, the parser would emit the delta first, then the raw enqueue would emit a duplicate, causing doubled tail output in the preview.
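The word-boundary matching named in the `kwMatches()` commits above can be sketched as follows. Only the rule itself (`\b` for ASCII keywords, substring for CJK) comes from the commit messages; the case-insensitive flag, the empty-keyword guard, and the exact signature are assumptions.

```typescript
// Hypothetical sketch of kwMatches; the signature and flags are assumptions.
function kwMatches(text: string, keyword: string): boolean {
  if (keyword === "") return false; // assumed guard: an empty keyword never matches

  // Non-ASCII keywords (e.g. CJK) have no usable \b boundary, so use substring.
  const isAscii = /^[\x00-\x7F]+$/.test(keyword);
  if (!isAscii) return text.includes(keyword);

  // Escape regex metacharacters, then require word boundaries on both sides.
  // This assumes ASCII keywords start and end with a word character.
  const escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`\\b${escaped}\\b`, "i").test(text);
}
```

The boundary check is what prevents a keyword such as "resume" from firing on "presumed", which a plain substring test would match.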
|
Superseded by #87, which has now been merged into #87 was stacked on top of this PR and contains every commit from this one (the full Thanks @chenmofei for the thorough work here and the 10-round review pass with @mrcfps / @lefarcen — none of it was lost, it's all on |
Overview
Adds a standalone CLI tool to html-anything so that users can convert Markdown into polished, styled HTML from the terminal without opening the web interface.
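Features
CLI package (cli/)
1. Set a default template (recommended)
2. Convert a Markdown file
3. View the generated result
# Open the generated HTML in a browser
open output.html
Command reference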
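convert: convert content
- input
- --template <id>, -t
- --agent <id>, -a
- --output <path>, -o (defaults to <input filename>.html; written to stdout when the input comes from stdin)
- --output-dir <dir>, -d
- --model <id>
- --format <type>
templates: list templates
Lists all 75 available templates, grouped by category. The template set as the default is marked (default).
agents: list agents
Lists the AI agent CLIs installed on the system. ✓ means available, ✗ means not installed.
config: configuration management
The configuration file is located at ~/.config/html-anything/config.json.
Changed files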
- cli/: new source files + package.json + tsconfig.json
- pnpm-workspace.yaml: add cli to the workspace
- pnpm-lock.yaml: update the lockfile