38 changes: 36 additions & 2 deletions docs/connectors/mssql/index.mdx
@@ -18,6 +18,23 @@ The OLake Go MSSQL Source connector supports multiple synchronization modes. It
- **CDC Only**
- **Full Refresh + Incremental**

:::info **CHUNKING PERFORMANCE FOR TABLES WITHOUT PRIMARY KEYS**

This permission is optional. To speed up chunking for tables without primary keys, grant the following permission:

**SQL Server 2016-2019:**

```sql
GRANT VIEW DATABASE STATE TO <olake_user>;
```

**SQL Server 2022 or later:**

```sql
GRANT VIEW DATABASE PERFORMANCE STATE TO <olake_user>;
```
:::
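
To confirm the grant took effect, a check along the following lines may help. It is a minimal sketch, not part of the OLake documentation: it assumes you run it while connected as the OLake user in the target database, and it relies on the built-in `HAS_PERMS_BY_NAME` function, which evaluates permissions for the current principal.

```sql
-- Run while connected as the OLake user, in the database being synced.
-- 1 = permission is effective, 0 = not granted.
-- NULL typically means this server version does not recognize the permission name
-- (for example, VIEW DATABASE PERFORMANCE STATE on a pre-2022 server).
SELECT
    HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'VIEW DATABASE STATE')
        AS has_view_database_state,
    HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'VIEW DATABASE PERFORMANCE STATE')
        AS has_view_database_performance_state;
```

Only the column that matches your SQL Server version needs to return `1`.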

## Prerequisites

### Version Prerequisites
@@ -159,8 +176,15 @@ To connect the MSSQL source to a read-only secondary replica, set the following
In **OLake UI**, add this under **JDBC URL Parameters** as a key-value pair: `ApplicationIntent` (key) and `ReadOnly` (value).

:::info
- CDC must be enabled on the primary database.
- If you enable **Manage Capture Instance** while connecting to a read-only secondary replica, OLake Go cannot create or drop CDC capture instances automatically because that requires write access on the primary database. In this mode, capture instances must be **created and maintained manually** on the primary database when using a secondary replica for sync.
CDC must be enabled on the primary database.
:::

If you connect to a read-only secondary replica and enable **Manage Capture Instance**, OLake Go prompts for **primary database credentials** so it can create and manage capture instances on the primary. Without primary configuration, capture instances cannot be managed automatically.

If you prefer not to provide primary database details, you can still sync from the secondary replica by managing capture instances manually on the primary and leaving **Manage Capture Instance** disabled.
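
For the manual route, the capture instance can be created with SQL Server's built-in CDC procedures. The following is a sketch under stated assumptions: `dbo.orders` is a hypothetical table name, the statements are run on the primary (not the read-only secondary), and the login has the roles these procedures require (`sysadmin` for database-level enablement, `db_owner` for table-level enablement).

```sql
-- Run on the PRIMARY database, not on the read-only secondary.

-- 1. Check whether CDC is already enabled for the current database.
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = DB_NAME();

-- 2. Enable CDC at the database level if is_cdc_enabled = 0 (requires sysadmin).
EXEC sys.sp_cdc_enable_db;

-- 3. Create a capture instance for one table (requires db_owner).
--    dbo.orders is a hypothetical example; substitute your own schema and table.
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'orders',
    @role_name     = NULL;  -- NULL: no gating role for reading change data

-- 4. List the capture instances that now exist in this database.
EXEC sys.sp_cdc_help_change_data_capture;
```

Repeat step 3 for each table you plan to sync in CDC mode.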

:::note SSH Tunnel
When SSH tunneling is enabled, both the primary and secondary database connections must be reachable through the same bastion host.
:::

### Connection Prerequisites
@@ -283,3 +307,13 @@ check \

---

## Troubleshooting {#troubleshooting}

### 1. High Database CPU usage during Full Refresh

Jobs that sync tables without a primary key can consume more database CPU, because a rowid must be computed for every row.

**Solution:** If CPU usage is high for jobs on tables without a primary key, reduce `max_threads` in the source configuration or reset it to its default value.
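
To see which tables this applies to, a catalog query such as the one below can list the tables that lack a primary key. It is an illustrative sketch that uses only standard SQL Server catalog views and the `OBJECTPROPERTY` function; it is not taken from the OLake connector itself.

```sql
-- List user tables in the current database that have no primary key.
SELECT
    s.name AS schema_name,
    t.name AS table_name
FROM sys.tables AS t
JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id
WHERE OBJECTPROPERTY(t.object_id, 'TableHasPrimaryKey') = 0
ORDER BY s.name, t.name;
```

Jobs that include any of the returned tables are the ones where lowering `max_threads` is most likely to help.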

**If the issue is not listed here, post the query on Slack to get it resolved within a few hours.**

4 changes: 3 additions & 1 deletion docs/release/ingestion/v0.7.0.mdx
@@ -19,6 +19,8 @@ April 21, 2026 – June 13, 2026

5. **MongoDB delete pre-image capture -** <br/> Added support to capture the full document on delete events using `fullDocumentBeforeChange: "whenAvailable"` for MongoDB 6.0+ clusters with pre-images enabled, falling back to `_id`-only `documentKey` when pre-images are unavailable to preserve existing behaviour.

6. **Optimized chunking strategies for MSSQL -** <br/> Adds faster and more efficient chunk planning for MSSQL full-load syncs. Uses page-level metadata to split tables without scanning them (SQL Server 2012+, requires `VIEW DATABASE STATE`; not supported on Azure SQL DB/MI). Falls back to statistical sampling when the primary strategy is unavailable.

### Destinations

1. **Skip equality deletes for CDC inserts post-backfill -** <br/> Equality deletes are now skipped for CDC inserts once the backfill→CDC overlap window is complete, reducing unnecessary write overhead. A new `dedup_inserts` flag on the Iceberg `olake_2pc` table property tracks this — Java sets it to `true` on backfill commit, and Go clears it to `false` after the first successful CDC commit. This applies to both the Arrow and legacy gRPC writers.
@@ -47,4 +49,4 @@ April 21, 2026 – June 13, 2026

10. **Fixed edge cases in `ReformatValue` and `ReformatBool` -** <br/> Corrected two bugs in value reformatting logic, added unit test coverage for `reformat.go`.

11. **Fixed TOAST column values being nulled on update events -** <br/> Unchanged TOAST columns in PostgreSQL update events were incorrectly emitted as `null` when `pgoutput` omitted the column data for unchanged values. For `REPLICA IDENTITY FULL` tables, the fix now preserves the existing value from the old tuple, preventing data loss on updates.