fix: file sync delete, reconnect re-sync, and watcher race #4
Five bugs fixed:

1. **FileSync**: `_on_sync_end` set `_sync_complete` immediately, but the server streams `FileSyncDelete`/`FileSyncUpdate` messages *after* `FileSyncEnd`. `pull`/`sync-once` closed the connection before those messages arrived, so remote deletions were never applied locally. Now mirrors `NoteSync`: tracks expected vs received counts and only marks complete when all messages are processed.
2. **FileSync**: `_on_sync_update` returned early on an unexpected content type without incrementing `_received_modify`, causing `_wait_file_sync` to stall until the 300 s timeout.
3. **Watcher**: `on_moved` did not check `is_ignored`, so server-triggered renames (which temporarily ignore both paths) could fire `on_local_rename` after the ignore window closed, pushing a redundant delete+upload back to the server.
4. **Daemon (`fns run`)**: after a WebSocket reconnect, `_initial_sync` was never retried, so any changes made on other clients during the outage (including deletions) were permanently missed. Added an `on_reconnect` hook to `WSClient`; `SyncEngine` registers a callback that pauses the watcher and re-runs `_initial_sync` on every reconnect.
5. **`sync_once`**: called `_initial_sync` (incremental NoteSync) then immediately called `request_full_sync`, sending two NoteSync requests per session. `sync_once` now calls `request_full_sync` + `file_sync.request_sync` directly.
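Bug 3 above can be illustrated with a minimal sketch. The helper names (`is_excluded`, `is_ignored`, `on_moved`) follow the PR description, but the bodies here are hypothetical stand-ins, not the real watcher implementation:

```python
# Hypothetical sketch of the on_moved fix: both is_excluded and
# is_ignored must be consulted for both ends of a move before a rename
# is pushed upstream. Bodies are illustrative stand-ins only.

IGNORED: set[str] = set()    # paths temporarily ignored during server-driven changes
EXCLUDED = {".git"}          # permanently excluded path prefixes

def is_excluded(rel_path: str) -> bool:
    return any(rel_path.startswith(p) for p in EXCLUDED)

def is_ignored(rel_path: str) -> bool:
    return rel_path in IGNORED

pushed: list[tuple[str, str]] = []

def on_moved(src_rel: str, dest_rel: str) -> None:
    if is_excluded(src_rel) or is_excluded(dest_rel):
        return
    # The missing check: server-triggered renames temporarily ignore
    # both paths, so an event seen during the window must be dropped.
    if is_ignored(src_rel) or is_ignored(dest_rel):
        return
    pushed.append((src_rel, dest_rel))  # would become delete + upload
```

With only the `is_excluded` check (the old behavior), a move between two ignored paths would still be appended to `pushed` and echoed back to the server.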
Reviewer's Guide

Adjusts file synchronization completion tracking, ensures counters are updated for all update/delete paths, adds reconnect-triggered resync, fixes sync-once double NoteSync, and prevents the watcher from reacting to server-driven moves.

Sequence diagram for reconnect-triggered resync

```mermaid
sequenceDiagram
    actor User
    participant SyncEngine
    participant WSClient
    participant Server
    participant VaultWatcher
    participant FileSync
    User->>SyncEngine: run()
    SyncEngine->>WSClient: run()
    WSClient->>Server: connect and authenticate
    Server-->>WSClient: auth_response(success, connectCount=1)
    WSClient->>SyncEngine: set ready_event
    SyncEngine->>SyncEngine: _initial_sync() on first connect
    SyncEngine->>VaultWatcher: start watching vault
    Server--xWSClient: network drop
    WSClient->>WSClient: reconnect logic
    WSClient->>Server: reconnect and authenticate
    Server-->>WSClient: auth_response(success, connectCount>1)
    WSClient->>WSClient: detect reconnect
    WSClient->>SyncEngine: _on_reconnect()
    SyncEngine->>SyncEngine: _watch_enabled = False
    SyncEngine->>VaultWatcher: effectively pause handling via _watch_enabled
    SyncEngine->>SyncEngine: _initial_sync()
    SyncEngine->>FileSync: run incremental file and note sync
    FileSync->>Server: request_sync()
    Server-->>FileSync: FileSyncEnd, updates, deletes
    FileSync->>FileSync: track expected and received counts
    FileSync->>SyncEngine: mark file sync complete
    SyncEngine->>SyncEngine: _watch_enabled = True
    SyncEngine->>VaultWatcher: resume handling via _watch_enabled
    User-->>SyncEngine: daemon continues with up to date state
```
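The reconnect hook in the diagram above can be modeled with a short sketch. Class and method names (`WSClient`, `on_reconnect`, `_initial_sync`, `_watch_enabled`) follow this PR, but the bodies are simplified stand-ins, not the actual implementation:

```python
# Minimal model of the on_reconnect hook: WSClient stores a single
# callback and fires it when an auth response arrives on anything but
# the first connection. Bodies are simplified for illustration.

class WSClient:
    def __init__(self) -> None:
        self._on_reconnect = None
        self._connect_count = 0

    def on_reconnect(self, handler) -> None:
        self._on_reconnect = handler

    def _on_auth_response(self, msg: dict) -> None:
        self._connect_count += 1
        # connect_count > 1 means this session authenticated before,
        # i.e. the socket dropped and came back: treat as a reconnect.
        if self._connect_count > 1 and self._on_reconnect is not None:
            self._on_reconnect()


class SyncEngine:
    def __init__(self, ws_client: WSClient) -> None:
        self.ws_client = ws_client
        self._watch_enabled = True
        self.resync_count = 0
        ws_client.on_reconnect(self._handle_reconnect)

    def _initial_sync(self) -> None:
        self.resync_count += 1   # stand-in for the note + file resync

    def _handle_reconnect(self) -> None:
        # Pause the watcher so files written during resync are not
        # echoed back to the server as local edits.
        self._watch_enabled = False
        self._initial_sync()
        self._watch_enabled = True
```

A first auth response leaves `resync_count` at 0; a second one (the reconnect) triggers exactly one resync and re-enables the watcher.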
Sequence diagram for sync_once flow without double NoteSync

```mermaid
sequenceDiagram
    actor User
    participant SyncEngine
    participant WSClient
    participant Server
    participant NoteSync
    participant FileSync
    User->>SyncEngine: sync_once()
    SyncEngine->>WSClient: run()
    WSClient->>Server: connect and authenticate
    Server-->>WSClient: auth_response(success)
    WSClient->>SyncEngine: set ready_event
    alt sync_notes enabled
        SyncEngine->>NoteSync: request_full_sync()
        NoteSync->>Server: NoteSync request
        Server-->>NoteSync: note updates
        NoteSync->>SyncEngine: mark note sync complete
        SyncEngine->>SyncEngine: _wait_note_sync(timeout=120)
    end
    alt sync_files or sync_config enabled
        SyncEngine->>FileSync: request_sync()
        FileSync->>FileSync: _reset_counters()
        FileSync->>Server: FileSync request
        Server-->>FileSync: FileSyncEnd
        FileSync->>FileSync: record expected counts, _check_complete()
        Server-->>FileSync: FileSyncUpdate and FileSyncDelete messages
        FileSync->>FileSync: increment received counts and _check_complete()
        FileSync->>SyncEngine: mark file sync complete
        SyncEngine->>SyncEngine: _wait_file_sync(timeout=300)
    end
    SyncEngine->>WSClient: close()
    WSClient-->>Server: close connection
    SyncEngine-->>User: sync_once completes with single NoteSync round trip
```
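The fix in this flow can be reduced to a small sketch: `sync_once` asks each subsystem exactly once instead of running `_initial_sync` (which already issues a NoteSync request) and then `request_full_sync` on top. Method names follow the PR; the stub classes are hypothetical:

```python
# Sketch of the fixed sync_once flow: request_full_sync and
# request_sync are called directly, so only one NoteSync request is
# sent per session. Stubs count requests for illustration.

class StubNoteSync:
    def __init__(self) -> None:
        self.requests = 0

    def request_full_sync(self) -> None:
        self.requests += 1


class StubFileSync:
    def __init__(self) -> None:
        self.requests = 0

    def request_sync(self) -> None:
        self.requests += 1


def sync_once(note_sync, file_sync, sync_notes=True, sync_files=True):
    if sync_notes:
        note_sync.request_full_sync()   # exactly one NoteSync request
    if sync_files:
        file_sync.request_sync()        # exactly one FileSync request
```

Under the old behavior the note counter would end at 2 per session; now it ends at 1.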
Sequence diagram for FileSyncEnd and completion tracking

```mermaid
sequenceDiagram
    participant Server
    participant WSClient
    participant FileSync
    Server-->>WSClient: FileSyncEnd(needModifyCount, needDeleteCount)
    WSClient->>FileSync: _on_sync_end(msg)
    FileSync->>FileSync: set _expected_modify, _expected_delete
    FileSync->>FileSync: _got_end = True
    FileSync->>FileSync: if total expected == 0 then _sync_complete = True else _check_complete()
    loop for each modify
        Server-->>WSClient: FileSyncUpdate
        WSClient->>FileSync: _on_sync_update(msg)
        alt chunked transfer
            FileSync->>FileSync: _request_chunk_download(rel_path, data)
            FileSync->>FileSync: _received_modify += 1
            FileSync->>FileSync: _check_complete()
        else direct content
            FileSync->>FileSync: write file or log warning
            FileSync->>FileSync: unignore_file(rel_path)
            FileSync->>FileSync: _received_modify += 1
            FileSync->>FileSync: _check_complete()
        end
    end
    loop for each delete
        Server-->>WSClient: FileSyncDelete
        WSClient->>FileSync: _on_sync_delete(msg)
        FileSync->>FileSync: delete local file if exists
        FileSync->>FileSync: unignore_file(rel_path)
        FileSync->>FileSync: _received_delete += 1
        FileSync->>FileSync: _check_complete()
    end
    FileSync->>FileSync: when received >= expected
    FileSync->>FileSync: log completion and set _sync_complete = True
```
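The counting scheme in the diagram above can be modeled with a simplified tracker. Field names follow the PR; the logic is a sketch, not the actual `fns_cli` implementation:

```python
# Simplified model of the expected-vs-received completion tracking:
# FileSyncEnd records the expected counts, every update/delete path
# increments a received counter, and completion is only declared once
# all expected messages have been processed.

class FileSyncTracker:
    def __init__(self) -> None:
        self._expected_modify = 0
        self._expected_delete = 0
        self._received_modify = 0
        self._received_delete = 0
        self._got_end = False
        self._sync_complete = False

    def on_sync_end(self, need_modify: int, need_delete: int) -> None:
        self._expected_modify = need_modify
        self._expected_delete = need_delete
        self._got_end = True
        self._check_complete()   # covers the nothing-to-do case too

    def on_update(self) -> None:
        # Incremented on every path, including fallbacks that skip the
        # actual write, so the waiter can never stall on a bad message.
        self._received_modify += 1
        self._check_complete()

    def on_delete(self) -> None:
        self._received_delete += 1
        self._check_complete()

    def _check_complete(self) -> None:
        if not self._got_end:
            return   # FileSyncEnd has not arrived yet
        expected = self._expected_modify + self._expected_delete
        received = self._received_modify + self._received_delete
        if received >= expected:
            self._sync_complete = True
```

After `on_sync_end(2, 1)` the tracker stays incomplete until two updates and one delete have been counted; `on_sync_end(0, 0)` completes immediately.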
Updated class diagram for sync, client, and watcher components

```mermaid
classDiagram
    class SyncEngine {
        +config
        +ws_client
        +note_sync
        +file_sync
        +state
        +vault_path
        +_watch_enabled
        +run()
        +sync_once()
        +_initial_sync()
        +_wait_file_sync(timeout)
        +_wait_note_sync(timeout)
    }
    class FileSync {
        +engine
        +vault_path
        +_sync_complete
        +_download_sessions
        +_expected_modify
        +_expected_delete
        +_received_modify
        +_received_delete
        +_got_end
        +register_handlers()
        +request_sync()
        +_reset_counters()
        +_on_sync_update(msg)
        +_on_sync_delete(msg)
        +_on_sync_rename(msg)
        +_on_sync_end(msg)
        +_check_complete()
        +_request_chunk_download(rel_path, data)
        +_collect_local_files()
    }
    class WSClient {
        +config
        +_running
        +_handlers
        +_binary_handler
        +_on_reconnect
        +_msg_queue
        +_ready_event
        +_connect_count
        +on(action, handler)
        +on_binary(handler)
        +on_reconnect(handler)
        +run()
        +close()
        +_raw_send(data)
        +_flush_queue()
        +_on_auth_response(msg)
    }
    class VaultWatcher {
        +engine
        +loop
        +on_created(event)
        +on_deleted(event)
        +on_modified(event)
        +on_moved(event)
        +_rel(path)
        +_schedule(key)
    }
    SyncEngine *-- FileSync
    SyncEngine *-- WSClient
    SyncEngine *-- VaultWatcher
    FileSync --> SyncEngine
    VaultWatcher --> SyncEngine
    WSClient --> SyncEngine
```
Architecture flow diagram for sync engine, watcher, and server

```mermaid
flowchart LR
    subgraph Local
        CLI["CLI commands fns run, fns sync-once"]
        SE["SyncEngine"]
        WS["WSClient"]
        FS["FileSync"]
        NS["NoteSync"]
        VW["VaultWatcher"]
        FSys["Local filesystem"]
    end
    subgraph Remote
        Srv["Sync server"]
    end
    CLI --> SE
    SE --> WS
    SE --> FS
    SE --> NS
    SE --> VW
    WS <--> Srv
    FS <--> Srv
    NS <--> Srv
    VW --> FS
    VW --> NS
    FS <--> FSys
    NS <--> FSys
    WS --> SE
    SE --> VW
    subgraph Reconnect_behavior
        WS -- on_reconnect --> SE
        SE -- _initial_sync on reconnect --> FS
        SE -- _initial_sync on reconnect --> NS
        SE -- _watch_enabled flag --> VW
    end
```
Hey - I've found 1 issue, and left some high level feedback:
- In `SyncEngine.run`'s `_on_reconnect` handler, consider wrapping the `_watch_enabled` flip in a `try/finally` so that the watcher is always re-enabled even if `_initial_sync` raises, to avoid leaving the daemon in a permanently non-watching state.
- The reconnect callback registration in `WSClient` only allows a single `_on_reconnect` handler; if you anticipate multiple components needing reconnect behavior, it may be cleaner to support a list of handlers or a small event mechanism instead of a single stored callable.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `SyncEngine.run`'s `_on_reconnect` handler, consider wrapping the `_watch_enabled` flip in a `try/finally` so that the watcher is always re-enabled even if `_initial_sync` raises, to avoid leaving the daemon in a permanently non-watching state.
- The reconnect callback registration in `WSClient` only allows a single `_on_reconnect` handler; if you anticipate multiple components needing reconnect behavior, it may be cleaner to support a list of handlers or a small event mechanism instead of a single stored callable.
## Individual Comments
### Comment 1
<location path="fns_cli/file_sync.py" line_range="344-354" />
<code_context>
+ return
+ total_expected = self._expected_modify + self._expected_delete
+ total_received = self._received_modify + self._received_delete
+ if total_received >= total_expected:
+ log.info(
+ "FileSync complete: %d modified, %d deleted",
+ self._received_modify, self._received_delete,
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Using '>=' here can hide protocol mismatches or double-processing of messages.
With `total_received >= total_expected`, duplicate or double-counted messages will still mark the sync as complete, hiding protocol mismatches. If the protocol guarantees exact counts, consider using `==` and logging when `total_received > total_expected` so these issues are visible. Even if you keep the tolerance, logging any overshoot would help surface subtle bugs.
```suggestion
def _check_complete(self) -> None:
if not self._got_end:
return
total_expected = self._expected_modify + self._expected_delete
total_received = self._received_modify + self._received_delete
# Not done yet; still waiting for more messages.
if total_received < total_expected:
return
if total_received == total_expected:
log.info(
"FileSync complete: %d modified, %d deleted",
self._received_modify,
self._received_delete,
)
else:
# We received more messages than expected; this may indicate a protocol
# mismatch or double-processing, so surface it explicitly.
log.warning(
"FileSync received more messages than expected "
"(expected=%d, received=%d; modified=%d, deleted=%d). "
"Treating sync as complete.",
total_expected,
total_received,
self._received_modify,
self._received_delete,
)
self._sync_complete = True
```
</issue_to_address>
```python
def _check_complete(self) -> None:
    if not self._got_end:
        return
    total_expected = self._expected_modify + self._expected_delete
    total_received = self._received_modify + self._received_delete
    if total_received >= total_expected:
        log.info(
            "FileSync complete: %d modified, %d deleted",
            self._received_modify, self._received_delete,
        )
        self._sync_complete = True
```
Summary
- The server sends `FileSyncEnd` before the individual `FileSyncDelete` messages. The old code set `_sync_complete = True` on `FileSyncEnd`, causing `pull`/`sync-once` to close the connection before delete messages arrived. Now tracks expected vs received counts (mirroring `NoteSync`) and marks complete only when all messages are processed.
- An early `return` in `_on_sync_update` skipped `_received_modify += 1`, causing `_wait_file_sync` to hang until the 300 s timeout.
- `on_moved` was missing an `is_ignored` check: server-triggered renames temporarily ignore both paths, but `on_moved` only checked `is_excluded`. After the ignore window closed the debounced handler still fired, pushing a redundant `delete + upload` back to the server.
- `_initial_sync` was only called once at startup. After a network drop, deletions and updates on other clients during the outage were permanently missed. Added an `on_reconnect` hook to `WSClient`; `SyncEngine` uses it to pause the watcher and re-run `_initial_sync` on every reconnect.
- `sync_once` double NoteSync: `_initial_sync` (incremental) was called first, then `request_full_sync` was called again in the same session. `sync_once` now calls `request_full_sync` + `file_sync.request_sync` directly without going through `_initial_sync`.

Test plan
- Delete a file on another client, confirm `fns pull` removes the local copy
- Delete a file on another client while `fns run` is connected, confirm the local delete happens live
- Drop the network while `fns run` is active, delete a file on another client, restore the network, confirm the local delete happens after reconnect
- Run `fns sync-once`, confirm notes sync completes with a single NoteSync round-trip (check logs for no duplicate `NoteSync` requests)
- Rename a file from another client while `fns run` is watching; confirm the rename is not pushed back to the server a second time

Summary by Sourcery
Improve file synchronization reliability, especially around deletes, reconnects, and one-off syncs.