Skip to content

Comments

FXC-5747: Avoid unnecessary deep copies in monitor and unstructured data#3303

Open
momchil-flex wants to merge 2 commits intodevelopfrom
momchil/deep-copy-data
Open

FXC-5747: Avoid unnecessary deep copies in monitor and unstructured data#3303
momchil-flex wants to merge 2 commits intodevelopfrom
momchil/deep-copy-data

Conversation

@momchil-flex
Copy link
Collaborator

@momchil-flex momchil-flex commented Feb 18, 2026

Summary

  • Add deep=False to copy/updated_copy calls across monitor data, simulation data, and unstructured grid data where update values are already new independent objects, avoiding redundant deep copies of potentially large data arrays.
  • Add validate=False where update values are already the correct pydantic type (TCAD symmetry expansion, unstructured grid operations, sim data with model-typed updates).
  • Replace symmetry_expanded_copy with symmetry_expanded in two read-only call sites in FieldData (to_source and _make_adjoint_sources).

Important note on validate=False

validate=False is not safe when update values come from xarray arithmetic/operations (e.g. field / amps, np.conj(field), field * factor, field.isel(...)) because these produce plain xarray.DataArray objects that lose the tidy3d subclass (e.g. ScalarFieldDataArray). Pydantic validation is needed to coerce them back. A comment documenting this has been added near ElectromagneticFieldData.time_reversed_copy.

Files changed

File Changes
data/monitor_data.py deep=False on normalize, time_reversed_copy, grid_corrected_copy, translated_copy, mode reorder/subset/interp, flux/projection normalize; symmetry_expanded for read-only access
data/sim_data.py deep=False, validate=False on renormalize
data/unstructured/base.py deep=False, validate=False on rename, clean, arithmetic, real/imag/abs, sel, isel, non-spatial interp
data/unstructured/triangular.py deep=False, validate=False on reflect (normal-axis, reflection_only)
eme/data/sim_data.py deep=False, validate=False on EME field reconstruction
mode/data/sim_data.py deep=False, validate=False on sort_modes
tcad/data/monitor_data/abstract.py deep=False, validate=False on symmetry_expanded_copy
tcad/data/monitor_data/charge.py deep=False, validate=False on SteadyCapacitanceData.symmetry_expanded_copy

Test plan

  • tests/test_data/test_monitor_data.py — 352 passed
  • tests/test_data/test_sim_data.py — 84 passed
  • tests/test_components/test_heat_charge.py — 57 passed
  • tests/test_components/test_heat.py — 20 passed
  • tests/test_components/test_eme.py — 18 passed
  • Top-level tests/test_eme.py (numerical) — 20 passed

Made with Cursor


Note

Medium Risk
Touches many copy/update pathways for core data containers; incorrect deep=False/validate=False usage could introduce shared-mutation bugs or type-coercion/validator regressions, though changes are largely performance-oriented.

Overview
Reduces memory/time overhead when creating modified copies of large simulation datasets by switching many copy()/updated_copy() call sites to deep=False (and selectively validate=False) across monitor data, simulation data, and unstructured grid data.

Also updates Tidy3dBaseModel.copy docstring to clarify safety tradeoffs for deep and validate, and replaces two read-only uses of symmetry_expanded_copy with symmetry_expanded in FieldData to avoid extra copying when only accessing expanded fields.

Written by Cursor Bugbot for commit 4d88630. This will update automatically on new commits. Configure here.

Copilot AI review requested due to automatic review settings February 18, 2026 09:20
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes performance by avoiding unnecessary deep copies in data update operations across monitor data, simulation data, and unstructured grid data. The changes add deep=False to copy/updated_copy calls where update values are already independent objects, and validate=False where updates are already the correct pydantic type. Additionally, it replaces symmetry_expanded_copy with symmetry_expanded for read-only access patterns.

Changes:

  • Add deep=False to copy operations where update values are new independent objects
  • Add validate=False where update values are already correctly typed (TCAD symmetry expansion, unstructured grid operations, sim data with model-typed updates)
  • Replace symmetry_expanded_copy with symmetry_expanded for read-only field access
  • Document the safety constraint regarding xarray arithmetic operations losing subclass types

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Show a summary per file
File Description
data/monitor_data.py Adds deep=False to various copy operations (normalize, time_reversed_copy, translated_copy, mode operations, flux/projection normalize); switches to symmetry_expanded for read-only access
data/sim_data.py Adds deep=False, validate=False to simulation renormalize operations
data/unstructured/base.py Adds deep=False, validate=False to unstructured data operations (rename, clean, arithmetic, real/imag/abs, sel, isel, interpolation)
data/unstructured/triangular.py Adds deep=False, validate=False to triangular grid reflection operations
eme/data/sim_data.py Adds deep=False, validate=False to EME field reconstruction
mode/data/sim_data.py Adds deep=False, validate=False to mode sorting operations
tcad/data/monitor_data/abstract.py Adds deep=False, validate=False to TCAD symmetry expansion
tcad/data/monitor_data/charge.py Adds deep=False, validate=False to capacitance data symmetry expansion
components/medium.py Changes self.model_fields to type(self).model_fields for correct class attribute access
docs/notebooks Updates subproject commit reference

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Add deep=False to copy/updated_copy calls where update values are already
new objects, avoiding redundant deep copies of large data arrays. Also add
validate=False where update values are already the correct pydantic type
(e.g. TCAD symmetry expansion, unstructured grid operations). Notably,
validate=False is NOT safe when xarray arithmetic produces plain DataArray
objects that need pydantic coercion back to tidy3d subclasses.

Key changes:
- TCAD HeatChargeMonitorData/SteadyCapacitanceData symmetry expansion
- UnstructuredGridDataset operations (rename, clean, arithmetic, sel, isel)
- TriangularGridDataset.reflect for normal-axis reflection_only
- Various MonitorData methods (normalize, time_reversed_copy, etc.)
- SimulationData.renormalize, EME/Mode sim data operations
- Use symmetry_expanded instead of symmetry_expanded_copy for read-only access

Co-authored-by: Cursor <cursoragent@cursor.com>
@momchil-flex momchil-flex force-pushed the momchil/deep-copy-data branch from 15c8e94 to c279242 Compare February 18, 2026 10:07
@momchil-flex
Copy link
Collaborator Author

momchil-flex commented Feb 18, 2026

@yaugenst could you take a look especially at this comment I'm adding to the base model copy method?

One thing this is pointing to is that it would be nice if we can split type coercion "validation" from our custom validators, so we can at least avoid running the latter in some instances.

- Remove validate=False from unstructured base.py updated_copy calls
  (xarray ops return plain DataArray, losing IndexedDataArray).
- Remove validate=False from EME sim_data (drop_vars loses subclass).
- Remove validate=False from SimulationData.renormalize (restores
  _check_normalize_index validator).
- Add deep/validate safety guidance to Tidy3dBaseModel.copy() docstring.
- Remove redundant comment from monitor_data.py.

Co-authored-by: Cursor <cursoragent@cursor.com>
@momchil-flex momchil-flex force-pushed the momchil/deep-copy-data branch from 99596d7 to 4d88630 Compare February 18, 2026 12:49
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

return self.updated_copy(symmetry=(0, 0, 0), **new_field_components)
return self.updated_copy(
symmetry=(0, 0, 0), **new_field_components, deep=False, validate=False
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe validate=False with xarray-derived SpatialDataArray values

Medium Severity

The symmetry_expanded_copy now uses validate=False, but _symmetry_expanded_copy_base processes SpatialDataArray field components through xarray operations (.sortby() inside SpatialDataArray.reflect and .isel() inside sel_inside) which, per the PR's own documentation in base.py, lose the tidy3d DataArray subclass. Without validation, those plain xr.DataArray objects are stored directly instead of being coerced back to SpatialDataArray. This affects TemperatureData and SteadyPotentialData which accept SpatialDataArray fields.

Fix in Cursor Fix in Web

@github-actions
Copy link
Contributor

Diff Coverage

Diff: origin/develop...HEAD, staged and unstaged changes

  • tidy3d/components/data/monitor_data.py (93.3%): Missing lines 658
  • tidy3d/components/data/sim_data.py (100%)
  • tidy3d/components/data/unstructured/base.py (88.9%): Missing lines 436
  • tidy3d/components/data/unstructured/triangular.py (100%)
  • tidy3d/components/eme/data/sim_data.py (100%)
  • tidy3d/components/mode/data/sim_data.py (100%)
  • tidy3d/components/tcad/data/monitor_data/abstract.py (100%)

Summary

  • Total: 30 lines
  • Missing: 2 lines
  • Coverage: 93%

tidy3d/components/data/monitor_data.py

Lines 654-662

  654             if eig_val < 0:
  655                 update[field_name] = field * self.grid_dual_correction
  656             else:
  657                 update[field_name] = field * self.grid_primal_correction
! 658         return field_data.copy(deep=False, update=update)
  659 
  660     @property
  661     def intensity(self) -> ScalarFieldDataArray:
  662         """Return the sum of the squared absolute electric field components."""

tidy3d/components/data/unstructured/base.py

Lines 432-440

  432         result = getattr(ufunc, method)(*inputs, **kwargs)
  433 
  434         if type(result) is tuple:
  435             # multiple return values
! 436             return tuple(self.updated_copy(values=x, deep=False) for x in result)
  437         elif method == "at":
  438             # no return value
  439             return None
  440         else:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants