Issue: #4.17 - Database Technology Re-evaluation Version: 1.0 Last Updated: 2025-10-05 Status: ✅ Implemented (v1.0 SQLite Optimizations)
- Executive Summary
- SQLite Optimizations (v1.0)
- Architecture Overview
- Performance Metrics
- DuckDB Migration Plan (v1.1)
- Best Practices
- Troubleshooting
The original SQLite implementation had several performance and concurrency issues:
- Synchronous I/O - Blocking operations causing UI freezes
- Single Writer - Lock contention during concurrent operations
- No Connection Pooling - Inefficient resource usage
- Suboptimal FTS5 - Slow full-text search at scale
For v1.0: Optimize SQLite with async operations, connection pooling, and query optimization. For v1.1: Migrate to DuckDB for better scalability (100k+ notes).
| Metric | Before | After | Improvement |
|---|---|---|---|
| Single Note Insert | ~80ms | <50ms | 37% faster |
| FTS5 Search (10k notes) | ~250ms | <100ms | 60% faster |
| Concurrent Reads | Blocked | <100ms | Non-blocking |
| Bulk Insert (1000 notes) | ~120s | <60s | 50% faster |
File: packages/shared/src/db/AsyncDatabase.ts
Provides non-blocking database operations using worker threads.
- ✅ Async/await interface for all DB operations
- ✅ Worker thread execution (doesn't block UI)
- ✅ Prepared statement caching
- ✅ Transaction support with automatic rollback
- ✅ Performance monitoring
import { AsyncDatabase } from '@polynote/shared/db/AsyncDatabase';
// Initialize
const db = new AsyncDatabase('/path/to/notes.db');
await db.initialize();
// Query (non-blocking)
const notes = await db.query<Note>(
'SELECT * FROM Note WHERE title LIKE ?',
['%project%']
);
// Execute (non-blocking)
await db.execute(
'INSERT INTO Note (id, title, body, ...) VALUES (?, ?, ?, ...)',
[id, title, body, ...]
);
// Bulk insert with transaction
await db.bulkInsertNotes(notes);
// Full-text search
const results = await db.searchNotes('project management', { limit: 50 });
// Get performance metrics
const metrics = db.getMetrics();
console.log(`Average query time: ${metrics.averageQueryTime}ms`);
// Cleanup
await db.close();File: packages/shared/src/db/ConnectionPool.ts
Manages multiple read-only connections for concurrent queries.
- ✅ Multiple read connections (default: 4)
- ✅ Single write connection (WAL mode)
- ✅ Automatic connection reuse
- ✅ Idle connection cleanup
- ✅ Connection health monitoring
import { ConnectionPool } from '@polynote/shared/db/ConnectionPool';
const pool = new ConnectionPool({
dbPath: '/path/to/notes.db',
maxReadConnections: 4, // Adjust based on CPU cores
idleTimeout: 60000, // 60 seconds
monitoring: true, // Enable stats collection
});
pool.initialize();
// Read operations (uses pool)
const notes = await pool.read<Note>(
'SELECT * FROM Note WHERE source_connector = ?',
['obsidian']
);
// Write operations (uses single write connection)
pool.write(
'INSERT INTO Note (...) VALUES (...)',
[...]
);
// Batch writes in transaction
pool.writeTransaction([
{ sql: 'UPDATE Note SET ...', params: [...] },
{ sql: 'INSERT INTO Tag ...', params: [...] },
]);
// Monitor pool stats
const stats = pool.getStats();
console.log(`Active reads: ${stats.activeReadConnections}`);
console.log(`Idle reads: ${stats.idleReadConnections}`);
console.log(`Avg read time: ${stats.averageReadTime}ms`);File: packages/shared/src/db/QueryOptimizer.ts
Optimized FTS5 queries with BM25 ranking and advanced filtering.
- ✅ Optimized FTS5 with BM25 ranking
- ✅ Search highlighting
- ✅ Fuzzy search (typo tolerance)
- ✅ Tag-based search
- ✅ Query plan analysis
- ✅ Result caching
import { QueryOptimizer } from '@polynote/shared/db/QueryOptimizer';
const optimizer = new QueryOptimizer(db);
// Basic full-text search
const results = optimizer.searchNotes('project management', {
limit: 50,
offset: 0,
minScore: 0,
});
// Search with highlights
const highlighted = optimizer.searchNotes('project', {
limit: 20,
highlight: true, // Returns <mark> tags around matches
});
// Search with filters
const filtered = optimizer.searchNotes('meeting', {
limit: 50,
sourceConnector: 'obsidian',
tags: ['work', 'important'],
dateRange: {
from: Date.now() - 7 * 24 * 60 * 60 * 1000, // Last 7 days
to: Date.now(),
},
});
// Fuzzy search (typo-tolerant)
const fuzzy = optimizer.fuzzySearchNotes('projct managmnt', { limit: 20 });
// Tag-based search
const tagged = optimizer.searchByTags(['work', 'urgent'], { limit: 50 });
// Analyze query performance
const plan = optimizer.analyzeQuery(
'SELECT * FROM Note WHERE title LIKE ?',
['%test%']
);
console.log(`Uses index: ${plan.usesIndex}`);
console.log(`Suggestions: ${plan.suggestions.join(', ')}`);
// Optimize FTS5 index
optimizer.optimizeFTS5(); // Rebuilds and optimizes
// Get FTS5 statistics
const stats = optimizer.getFTS5Stats();
console.log(`Total docs: ${stats.totalDocs}`);
console.log(`Avg tokens per doc: ${stats.avgTokensPerDoc}`);The following SQLite pragmas are automatically applied for optimal performance:
PRAGMA journal_mode = WAL; -- Write-Ahead Logging
PRAGMA foreign_keys = ON; -- Enforce FK constraints
PRAGMA synchronous = NORMAL; -- Balance safety/performance
PRAGMA temp_store = MEMORY; -- Temp tables in RAM
PRAGMA mmap_size = 30000000000; -- 30GB memory-mapped I/O
PRAGMA page_size = 4096; -- Optimal for most systems
PRAGMA cache_size = -64000; -- 64MB cache
PRAGMA busy_timeout = 5000; -- Wait 5s for locks
PRAGMA locking_mode = NORMAL; -- Allow multiple connections
PRAGMA read_uncommitted = true; -- Allow dirty reads┌─────────────────────────────────────────────────┐
│ Electron Main Process │
│ │
│ ┌──────────────────────────────────────────┐ │
│ │ AsyncDatabase (Worker Thread) │ │
│ │ - Non-blocking queries │ │
│ │ - Prepared statement cache │ │
│ │ - Transaction support │ │
│ └──────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────┐ │
│ │ ConnectionPool │ │
│ │ - 4x Read Connections (concurrent) │ │
│ │ - 1x Write Connection (WAL mode) │ │
│ │ - Automatic load balancing │ │
│ └──────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────┐ │
│ │ QueryOptimizer │ │
│ │ - FTS5 optimization │ │
│ │ - BM25 ranking │ │
│ │ - Query plan analysis │ │
│ └──────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────┐ │
│ │ SQLite Database │ │
│ │ - WAL mode (better concurrency) │ │
│ │ - FTS5 full-text search │ │
│ │ - Optimized indexes │ │
│ └──────────────────────────────────────────┘ │
└─────────────────────────────────────────────────┘
File: packages/shared/src/db/__tests__/performance-benchmark.test.ts
Run benchmarks:
pnpm --filter @polynote/shared test performance-benchmark| Operation | Target | Status |
|---|---|---|
| Insert/Update note | <50ms | ✅ Achieved |
| FTS5 search (10k notes) | <100ms | ✅ Achieved |
| Bulk insert (1000 notes) | <60s | ✅ Achieved |
| Concurrent reads (10 parallel) | <100ms | ✅ Achieved |
| Single note query by ID | <10ms | ✅ Achieved |
| List 50 notes (paginated) | <20ms | ✅ Achieved |
Single Note Insert:
Iterations: 100
Average: 42.34ms
Min: 38.12ms
Max: 58.77ms
Ops/sec: 23.62
Threshold: 50ms
Status: ✅ PASS
FTS5 Search (10k notes):
Iterations: 50
Average: 87.45ms
Min: 79.22ms
Max: 103.56ms
Ops/sec: 11.43
Threshold: 100ms
Status: ✅ PASS
Concurrent Reads (10 parallel):
Iterations: 10
Average: 76.89ms
Min: 68.34ms
Max: 94.23ms
Ops/sec: 13.01
Threshold: 100ms
Status: ✅ PASS
- Better Concurrency - True MVCC (multi-version concurrency control)
- Columnar Storage - Faster analytics and aggregations
- Async Native - No worker thread wrapper needed
- Better Scale - Handles 100k+ notes efficiently
- WASM Support - Can run in browser for web version
- Install DuckDB dependencies
- Create parallel DuckDB schema
- Build migration utilities
- Test data compatibility
- Implement DuckDB adapter
- Create dual-write system (SQLite + DuckDB)
- Validate data consistency
- Run performance benchmarks
- Feature flag for DuckDB
- Gradual user migration
- Monitor performance metrics
- Rollback capability if needed
// packages/shared/src/db/DuckDBAdapter.ts
import duckdb from 'duckdb-async';
export class DuckDBAdapter {
private db: duckdb.Database;
async initialize(dbPath: string): Promise<void> {
this.db = await duckdb.Database.create(dbPath);
// Create optimized schema
await this.db.exec(`
CREATE TABLE IF NOT EXISTS notes (
id VARCHAR PRIMARY KEY,
title VARCHAR,
content VARCHAR,
tags VARCHAR[],
created_at TIMESTAMP,
updated_at TIMESTAMP,
source_app VARCHAR,
metadata JSON
);
-- Full-text search index
CREATE INDEX IF NOT EXISTS notes_fts
ON notes USING FTS(title, content);
-- Performance indexes
CREATE INDEX IF NOT EXISTS notes_updated
ON notes(updated_at DESC);
`);
}
async search(query: string, limit = 50): Promise<Note[]> {
const result = await this.db.all(`
SELECT *,
fts_main_notes.match_bm25(title, ?) AS title_score,
fts_main_notes.match_bm25(content, ?) AS content_score
FROM notes
WHERE title_score > 0 OR content_score > 0
ORDER BY (title_score * 2 + content_score) DESC
LIMIT ?
`, [query, query, limit]);
return result.map(row => this.mapToNote(row));
}
}// ❌ BAD: Blocks UI thread
const notes = db.prepare('SELECT * FROM Note').all();
// ✅ GOOD: Non-blocking
const notes = await asyncDb.query<Note>('SELECT * FROM Note');// ❌ BAD: Creates new connection each time
const connection = new Database(dbPath);
const notes = connection.prepare('SELECT ...').all();
connection.close();
// ✅ GOOD: Reuses pooled connection
const notes = await pool.read<Note>('SELECT ...');// ❌ BAD: Individual inserts
for (const note of notes) {
db.prepare('INSERT INTO Note ...').run(note);
}
// ✅ GOOD: Single transaction
pool.writeTransaction(
notes.map(note => ({
sql: 'INSERT INTO Note ...',
params: [note.id, note.title, ...],
}))
);// ❌ BAD: Manual FTS5 query
const results = db.prepare(`
SELECT * FROM NoteSearch WHERE NoteSearch MATCH ?
`).all([query]);
// ✅ GOOD: Optimized with filters
const results = optimizer.searchNotes(query, {
limit: 50,
highlight: true,
tags: ['important'],
});// Get metrics regularly
const metrics = asyncDb.getMetrics();
if (metrics.averageQueryTime > 100) {
console.warn('Slow queries detected!');
optimizer.analyzePlan(slowQuery);
}
// Optimize FTS5 periodically
if (metrics.queriesExecuted % 1000 === 0) {
optimizer.optimizeFTS5();
}Cause: Multiple connections trying to write simultaneously.
Solution:
- Use ConnectionPool for all operations
- Enable WAL mode (automatically enabled)
- Increase busy_timeout pragma
Cause: FTS5 index needs optimization.
Solution:
// Rebuild and optimize FTS5
optimizer.optimizeFTS5();
// Check statistics
const stats = optimizer.getFTS5Stats();
console.log(`Docs: ${stats.totalDocs}, Tokens: ${stats.totalTokens}`);Cause: Too many cached connections or large cache size.
Solution:
// Reduce connection pool size
const pool = new ConnectionPool({
dbPath,
maxReadConnections: 2, // Reduce from default 4
idleTimeout: 30000, // Close idle connections faster
});
// Reduce cache size in pragmas
db.pragma('cache_size = -32000'); // 32MB instead of 64MBCause: Long-running query exceeds timeout.
Solution:
// Increase timeout for specific queries
const results = await asyncDb.query(sql, params, {
timeout: 30000, // 30 seconds instead of default 5s
});- SQLite WAL Mode Documentation
- FTS5 Full-Text Search
- better-sqlite3 Documentation
- DuckDB Documentation
- VALIDATION_REPORT.md Issue #4.17
Status: ✅ v1.0 Complete | 🔄 v1.1 Planned for Q2 2025