Add plain Markdown import and clean export functionality #56

Merged
TheZupZup merged 1 commit into main from claude/add-markdown-export-YLw4K
May 3, 2026
@TheZupZup
Owner

Summary

This PR adds support for importing plain Markdown files (without NexaNote frontmatter) and exporting notes as clean Markdown files suitable for use in Obsidian and other plain-text tools. Plain .md files dropped into the notes/ directory are now automatically discovered and surfaced as notes without being rewritten, while exports produce frontmatter-free output with sanitized filenames and collision handling.

Key Changes

Plain Markdown Import

  • Synthetic ID system: Plain .md files are assigned stable synthetic IDs using the md. prefix followed by URL-safe base64-encoded filenames, allowing the original filename to be recovered without an external index
  • Non-invasive reading: Plain Markdown files are read and synthesized into Note objects on-the-fly without rewriting the files, preserving user edits made outside NexaNote
  • Seamless coexistence: Plain and NexaNote-managed notes (with frontmatter) can coexist in the same directory and are listed together
  • In-place conversion: When a plain Markdown note is edited and saved through NexaNote, it is converted to a managed note with frontmatter in place
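The synthetic ID scheme described above can be sketched roughly as follows. The function names match the helpers listed under Storage Layer Updates, but the bodies are an illustration only: stripping the `=` padding and returning `None` for non-plain IDs are assumptions, not confirmed details of the PR.

```python
import base64
from typing import Optional

PLAIN_MD_PREFIX = "md."  # prefix named in the PR summary


def plain_md_id_from_stem(stem: str) -> str:
    """Encode a filename stem (without the .md extension) as a synthetic ID."""
    encoded = base64.urlsafe_b64encode(stem.encode("utf-8")).decode("ascii")
    # Assumption: '=' padding is stripped so the ID stays compact.
    return PLAIN_MD_PREFIX + encoded.rstrip("=")


def stem_from_plain_md_id(note_id: str) -> Optional[str]:
    """Recover the original stem, or return None for any other kind of ID."""
    if not note_id.startswith(PLAIN_MD_PREFIX):
        return None
    encoded = note_id[len(PLAIN_MD_PREFIX):]
    # Restore the padding that was stripped during encoding.
    padded = encoded + "=" * (-len(encoded) % 4)
    try:
        return base64.urlsafe_b64decode(padded.encode("ascii")).decode("utf-8")
    except (ValueError, UnicodeError):
        # Malformed base64 or non-UTF-8 payload: treat as "not a plain-MD ID".
        return None
```

Because the filename is embedded in the ID itself, the same file always maps to the same ID and no lookup table is needed. Note that decoding alone does not make the stem safe to use as a path component.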

Clean Markdown Export

  • New export.py module: Provides export_note() and export_all() functions that write notes as body-only Markdown files without YAML frontmatter or internal metadata
  • Filename sanitization: sanitize_filename() removes filesystem-unsafe characters (control chars, <>:"/\|?*), collapses whitespace, handles Windows reserved names, and provides a fallback for blank names
  • Collision handling: Duplicate titles are disambiguated with (2), (3) suffixes using case-insensitive matching to prevent overwrites on case-insensitive filesystems
  • Existing file protection: Export never overwrites pre-existing files in the target directory
  • Multi-page support: Multiple pages are joined with blank lines; internal NexaNote markers are stripped
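A minimal sketch of the sanitization and collision rules listed above. `sanitize_filename()` is named in the PR, but the length cap, the fallback name, the trimming of trailing dots, and the `unique_name()` helper are all assumptions made for illustration.

```python
import re

# Control characters plus the characters listed in the PR: <>:"/\|?*
_FORBIDDEN = re.compile(r'[\x00-\x1f<>:"/\\|?*]')
_RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)
_MAX_LEN = 120          # assumption: the PR does not state the actual cap
_FALLBACK = "Untitled"  # assumption: the PR does not state the fallback name


def sanitize_filename(title: str) -> str:
    """Turn a note title into a filename stem that is safe across platforms."""
    cleaned = _FORBIDDEN.sub("", title)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    cleaned = cleaned[:_MAX_LEN].rstrip(" .")
    if not cleaned:
        return _FALLBACK
    if cleaned.upper() in _RESERVED_NAMES:
        return f"_{cleaned}"
    return cleaned


def unique_name(base: str, taken: set) -> str:
    """Append (2), (3), ... until the name is unused, ignoring case.

    `taken` holds lowercased names already exported or already on disk.
    """
    candidate = base
    n = 2
    while candidate.lower() in taken:
        candidate = f"{base} ({n})"
        n += 1
    taken.add(candidate.lower())
    return candidate
```

Seeding `taken` with the lowercased names of files already present in the target directory is one way to get the "never overwrite" behavior from the same loop.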

API Endpoint

  • POST /export/markdown: New endpoint accepts optional target_dir and include_archived parameters, defaults to <data_dir>/export, and returns a report with the count and paths of exported files
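For illustration, a client could build a request to this endpoint as shown here. The base URL is hypothetical, and sending the parameters as a JSON body is an assumption; the PR does not say whether they are body fields or query parameters.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical address of a running instance


def build_export_request(target_dir=None, include_archived=None):
    """Build (but do not send) a POST /export/markdown request.

    Both parameters are optional; omitted ones are left to server defaults.
    """
    payload = {}
    if target_dir is not None:
        payload["target_dir"] = target_dir
    if include_archived is not None:
        payload["include_archived"] = include_archived
    return urllib.request.Request(
        f"{BASE_URL}/export/markdown",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen()` would send it; with no `target_dir`, the server is described as writing to `<data_dir>/export`.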

Storage Layer Updates

  • file_store.py: Added helper functions plain_md_id_from_stem(), stem_from_plain_md_id(), and synthesize_plain_md_note() to handle plain Markdown discovery and synthesis
  • _read_note() and list_notes(): Updated to synthesize plain Markdown notes when deserialization fails (no frontmatter found)
  • get_stats(): Fixed to correctly count plain Markdown notes and their pages

Comprehensive Test Coverage

  • 378-line test suite covering:
    • Plain Markdown ID round-tripping with Unicode and special characters
    • Import of plain .md files with stable IDs and non-invasive reads
    • Coexistence of plain and managed notes
    • In-place conversion on save
    • Filename sanitization edge cases (invalid chars, reserved names, truncation, whitespace)
    • Clean export with collision detection and case-insensitive deduplication
    • API endpoint behavior and internal storage integrity

Notable Implementation Details

  • Plain Markdown files are identified by the absence of NexaNote frontmatter during deserialization, not by a separate index
  • Timestamps for plain Markdown notes are derived from filesystem metadata (st_ctime/st_mtime) so external edits are reflected on the next listing
  • The export pipeline never modifies internal NexaNote storage; all output is written to a separate target directory
  • Soft-deleted notes are automatically excluded from exports via the existing list_notes() filtering
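The timestamp behavior noted above might look something like this. `synthesize_plain_md_note()` is named in the PR, but its signature is not shown there; the `PlainNote` dataclass is a hypothetical stand-in for the project's `Note` type, and using the filename stem as the title is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class PlainNote:
    """Hypothetical stand-in for the project's Note object."""
    title: str
    body: str
    created_at: datetime
    updated_at: datetime


def synthesize_plain_md_note(path: Path) -> PlainNote:
    """Build a note from a plain .md file without writing to the file."""
    text = path.read_text(encoding="utf-8")
    stat = path.stat()
    return PlainNote(
        title=path.stem,  # assumption: the title comes from the filename
        body=text,
        # st_ctime is creation time on Windows, metadata-change time on Unix.
        created_at=datetime.fromtimestamp(stat.st_ctime, tz=timezone.utc),
        updated_at=datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
    )
```

Since nothing is cached, an edit made in another editor changes `st_mtime` and shows up as a newer `updated_at` the next time the file is read.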

https://claude.ai/code/session_017Mf5G58RPUBKqKQL7zET2m

- file_store: detect plain `.md` files (no NexaNote frontmatter), surface
  them as Notes with stable synthetic ids derived from the filename, and
  leave the source file untouched until the user explicitly saves an edit.
- storage/export: new module that writes notes to `<title>.md` files
  containing only the markdown body (no frontmatter). Filenames are
  sanitized (forbidden chars stripped, length capped, Windows reserved
  names handled) and collisions get `(N)` suffixes. Internal storage is
  never touched.
- api/routes: add `POST /export/markdown` to trigger a clean export to a
  caller-supplied dir (or `<data_dir>/export` by default).
- tests: 44 new tests covering plain MD import (legacy + Obsidian style
  coexist, ids stable, files not rewritten on read, save converts
  in-place), filename sanitization (invalid chars, blanks, reserved
  names, length cap), and clean export (body-only output, dedup on
  duplicates, case-insensitive collision check, internal storage
  unchanged, plain MDs round-trip cleanly).
@TheZupZup merged commit 082fdbe into main on May 3, 2026
1 check passed
@TheZupZup deleted the claude/add-markdown-export-YLw4K branch on May 3, 2026 09:47

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d11d84f00d


def _note_path(self, note_id: str) -> Path:
    stem = stem_from_plain_md_id(note_id)
    if stem is not None:
        return self.notes_dir / f"{stem}.md"

P1: Sanitize decoded plain-md stems before filesystem access

The new plain-markdown ID path bypasses _safe_id() and directly uses the base64-decoded stem as a path component, so a crafted note_id like md.<base64('../../outside')> resolves outside notes_dir. This affects all operations that call _note_path() (get_note, save_note, delete_note_permanent, etc.), allowing reads/writes/deletes of unintended files whenever an untrusted or malformed ID reaches storage.
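One possible way to close this gap is to validate the decoded stem before it is joined to the notes directory. The helper name `safe_plain_md_path()` is hypothetical, and this is a sketch of the general technique rather than the fix the project adopted.

```python
from pathlib import Path


def safe_plain_md_path(notes_dir: Path, stem: str) -> Path:
    """Hypothetical guard: map a decoded stem to a file directly in notes_dir.

    Raises ValueError for anything that could escape the directory.
    """
    # Reject empty stems, dot entries, path separators, and NUL bytes up front.
    if (
        not stem
        or stem in {".", ".."}
        or "/" in stem
        or "\\" in stem
        or "\x00" in stem
    ):
        raise ValueError(f"unsafe plain-md stem: {stem!r}")
    root = notes_dir.resolve()
    candidate = (root / f"{stem}.md").resolve()
    # Defense in depth: the resolved file must sit directly inside notes_dir.
    if candidate.parent != root:
        raise ValueError(f"stem escapes notes directory: {stem!r}")
    return candidate
```

With such a check in place, `_note_path()` could call the guard instead of building the path inline, so that `md.<base64('../../outside')>` is rejected before any read, write, or delete.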


Comment on lines +59 to +60
    if cleaned.upper() in _RESERVED_NAMES:
        return f"_{cleaned}"

P2: Reject Windows reserved basenames even with extensions

The reserved-name guard only checks exact matches (e.g., CON) and misses reserved device names followed by extensions (e.g., CON.txt, NUL.tar.gz), which Windows still treats as invalid. As a result, exporting notes with those titles can fail with OSError on Windows despite the function claiming cross-platform-safe filenames.
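A sketch of a stricter guard that compares only the part of the name before the first dot. The helper names are hypothetical, and `_RESERVED_NAMES` is filled in here with the commonly cited device names, since the PR excerpt does not show its contents.

```python
# Assumption: the commonly cited Windows device names.
_RESERVED_NAMES = frozenset(
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_reserved_windows_name(name: str) -> bool:
    """True if the part before the first dot is a reserved device name."""
    base = name.split(".", 1)[0].rstrip(" ")
    return base.upper() in _RESERVED_NAMES


def guard_reserved(cleaned: str) -> str:
    """Prefix an underscore when the name would be rejected on Windows."""
    return f"_{cleaned}" if is_reserved_windows_name(cleaned) else cleaned
```

Titles that merely start with a reserved word, such as `Console` or `Contacts.txt`, pass through unchanged because the whole base name must match.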

