# GitHub Issue Report - Simplified Version
Title
pyseekdb embedded mode crashes with SIGSEGV (exit 139) when importing large collections (>1000 documents)
Environment information
- OS: macOS Darwin 25.3.0 (Linux also tested)
- Python: 3.11.8 (3.14 is not compatible)
- pyseekdb: latest version (pip install)
- Mode: Embedded mode (`pyseekdb.Client(path="./data/db")`)
Problem description
When using pyseekdb's embedded mode to import data, the Python process crashes and returns exit code 139 (SIGSEGV) when the collection data size exceeds about 1000 documents.
Key symptoms
- ❌ No Python exception is raised (the process dies directly with SIGSEGV)
- ❌ No error message is printed
- ❌ Inconsistent: the same data sometimes succeeds and sometimes fails
- ✅ Small datasets (<1000 items) import successfully
Minimal reproduction code
```python
import pyseekdb

# Create client
client = pyseekdb.Client(path="./test.db")
collection = client.get_or_create_collection(name="test_collection")

# Generate test data (2000 items)
documents = [f"Test document {i}" for i in range(2000)]
ids = [f"id_{i}" for i in range(2000)]
metadatas = [{"index": i} for i in range(2000)]

# Batch import
batch_size = 50
for i in range(0, len(documents), batch_size):
    print(f"Batch {i//batch_size + 1}...")
    collection.add(
        ids=ids[i:i+batch_size],
        documents=documents[i:i+batch_size],
        metadatas=metadatas[i:i+batch_size]
    )

print(f"Total: {collection.count()}")
```

Run: `python3 test_crash.py`
Expected: 2000 items successfully imported
Actual: Crash on batch 10-20 (exit code 139)
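For what it's worth, the slicing in the batch loop above can be checked in isolation without pyseekdb, which suggests the fault lies in the native layer rather than in the batching itself. A minimal standalone check of the same range-based slicing:

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items,
    mirroring the range-based slicing in the repro script."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

ids = [f"id_{i}" for i in range(2000)]
chunks = list(batches(ids, 50))
print(len(chunks))                  # 40 batches
print(sum(len(c) for c in chunks))  # 2000 items, none lost or duplicated
```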
Actual test data
| Collection name | Number of documents | Status | Crash point |
|---|---|---|---|
| edd_codes | 3,600 | ✅ Success | - |
| edd_data_items | 600 | ✅ Success | - |
| edd_domains | 600 | ✅ Success | - |
| physical_model_table | 868 | ✅ Success | - |
| logical_model_attribute | 24,669 | ❌ Crash | ~500-1000 |
| physical_model_field | 29,322 | ❌ Crash | ~250 |
Attempted workarounds (all failed)
- Downgrade Python 3.14 → 3.11 (fixed an initial crash, but not this one)
- Reduce batch_size: 1000 → 100 → 50 → 10
- Use a global client (as in the official example)
- Convert numpy types to native Python types
- Chunked import (a fresh process for every 1000 items)
- Checkpoint-based recovery

**All scenarios still crash when importing >1000 items.**
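As a diagnostic aid, the chunked-import attempt above can be hardened so that a SIGSEGV in one chunk does not kill the whole run: execute each chunk in a child interpreter and inspect its exit status. A minimal sketch, where the child payload is a stand-in for the actual per-chunk pyseekdb import:

```python
import subprocess
import sys

def run_chunk(payload: str) -> bool:
    """Run one chunk's import in a fresh interpreter; return True on success.
    subprocess reports death-by-signal as a negative returncode (-11 for
    SIGSEGV); 139 is the shell's encoding of the same signal (128 + 11)."""
    proc = subprocess.run([sys.executable, "-c", payload])
    if proc.returncode in (-11, 139):
        print("chunk crashed with SIGSEGV; caller can retry or skip it")
        return False
    return proc.returncode == 0

# Stand-in payloads: one healthy chunk, one that segfaults itself.
ok = run_chunk("print('chunk imported')")
crashed = run_chunk("import os, signal; os.kill(os.getpid(), signal.SIGSEGV)")
print(ok, crashed)  # True False
```

The parent process survives the child's crash, so a wrapper like this can at least record which chunk triggers the fault.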
Error log
```
$ python import.py
[seekdb] seekdb has opened
Reading Excel file...
Loaded 24669 rows
Building data...
Importing to collection 'test'...
Progress: 50/24669
Progress: 100/24669
Progress: 150/24669
Progress: 200/24669
[Process exits with code 139 - no error message]
```

Expected behavior
It should be possible to reliably import 10,000+ documents, as stated in the official documentation.
Actual behavior
Crash when exceeding ~1000 entries, making embedded mode unusable for medium to large datasets in production environments.
Temporary workaround
Using server mode may be more stable (untested):
```
seekdb server start --port 2881
export SEEKDB_HOST=localhost
export SEEKDB_PORT=2881
python import.py
```

Impact
This issue blocks any production deployments that use pyseekdb embedded mode to process real-world datasets.
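If server mode does prove stable, the import script could select its mode from the environment variables shown above and fall back to embedded mode when they are unset. A sketch of only the mode-selection logic (the `SEEKDB_PATH` variable and the default path are assumptions; the server-mode client constructor is not shown because it is not documented in this report):

```python
import os

def choose_mode():
    """Decide server vs. embedded mode from the environment.
    Returns ("server", host, port) or ("embedded", path, None).
    SEEKDB_PATH is a hypothetical override for the embedded DB path."""
    host = os.environ.get("SEEKDB_HOST")
    port = os.environ.get("SEEKDB_PORT")
    if host and port:
        return ("server", host, int(port))
    return ("embedded", os.environ.get("SEEKDB_PATH", "./data/db"), None)

mode, target, port = choose_mode()
print(mode, target, port)
```

This keeps one import script usable in both modes while the embedded-mode crash is investigated.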
Request:
- Confirm that this is a known issue of embedded mode
- Provide a stable import solution for large data sets
- Fix the SIGSEGV problem in embedded mode, or
- Clearly state in the documentation that embedded mode is not suitable for production environments/large data sets