Shard datasets during training so that RAM usage is independent of world size by HastingsGreer · Pull Request #18 · HastingsGreer/InverseConsistency

HastingsGreer · 2025-10-26T12:02:35Z

No description provided.

* asdf
* beginto harmonize training proceedure
* final training pipe
* Rename preprocess_train_knees.py to preprocess_train_fullres_knees.py
* Create preprocess_train_halfres_knees.py
* details
* brain unstripped
* update old eval notebook

Co-authored-by: Thomas Greer <tgreer@biag-w05.cs.unc.edu>

* cvpr start
* Update TODO.md
* Update TODO.md
* Update TODO.md
* Update TODO.md
* Update and rename TODO.md to README.md
* OAI eval script done
* brain training script- testing if both steps in one script works
* ugh
* visualize training
* training?
* Update cvpr_network.py
* don't say validation
* normal log freq
* Add train config for lung dataset.
* fix shuffle name
* HCP eval
* progress
* Adding support for different dimensions.
* Fix the error " CPU tensor cannot be gathered" when using flips() for 2D and 1D data on 4 GPUS.
* 1. Fix bug - GradientICONSparse running error when applied to 2D images. 2. Add framework parameter to train_two_stage so that we can run the training process on ICON as well.
* Add ablation study script for comparing the training on different resolutions.
* fix preprocess
* script for COPDGene_eval
* batch size + switching images
* HCP ants eval
* COPDGene_eval.py
* OAI_ants_eval_needs_work
* Get code for bending energy or velocity field ablation into the cvpr branch (#50)
* Update network_wrappers.py
* Update network_wrappers.py
* Update network_wrappers.py
* Update networks.py
* Update network_wrappers.py
* Update train.py
* Update losses.py
* name mistake
* crimes
* explain bizzare code in comment
* Update network_wrappers.py
* Update network_wrappers.py
* Update network_wrappers.py
* Update network_wrappers.py
* Add learn2reg abdomenCTCT and NLST dataset helper function to icon data script.
* Add train script for the learn2reg AbdomenCTCT registration task and NLST task.
* Add copdgene train set to data script.
* Add train script for network capacity and network structure ablation study.
* evaluate OAI at half resolution to match prior work
* Add experiment script for comparing convergence speed between icon and gradICON.
* Add option to flips function so that it could print foldings in percentage.
* Add evaluation script for learn2reg abdomenCTCT registration.
* Fix the bug when normalize the intensity to [0,1].
* Fix bug: footsteps is initialized twice. Because utils initializes footsteps when imported.
* Add evaluation script for learn2reg NLST task.
* training scripts for abdomen and learn2reg lung
* abdomen eval fixes
* Add support for specifying output folder via argument list in network structure ablation study scripts.
* folds
* update comparison regularizers
* Update losses.py (#51)
* OAI_eval with torch.grid_sample
* real grid_sample test
* chunkin along
* Asdfafdsa
* synthmorph
* synthmorph
* Add HCP evaluation script for synthmorph.
* Add folding computation into the script.
* Fix Bending Energy.
* Experiment of comparing regularizers with varying lambdas.
* Plotting convergence speed comparison between ICON and GradICON.
* Update requirements.txt
* Update setup.cfg
* Add model statistics computation.
* Clean up the SynthMorph evaluation code.
* Add test code for SynthMorph evaluation code.
* Clean up the notebooks of the varying lambda experiments.
* Add the reason of having a copy of VM UNet to the comments.
* Add description of how to run the model statistics computation script.
* Change the required itk version to 5.3.0
* Add the pretrained models to package.
* Add test script for brain registration.
* Fix bug: Should have used pre-trained model used in test_brain_itk.
* Lossen the test criteria for brain registration so that the test case can pass when ran on cpu.
* Fix the comments so that sphinx can generate documentation.
* Unify the output of flips() function. Now the output should be a detached tensor.

Co-authored-by: Lin Tian <lintian@cs.unc.edu>
Co-authored-by: Raul <sonic1sonic@gmail.com>

* Fix bug: input_channels was not truly reflecting the number of channels of x or y when given (x,y) as the input.
* Remove input_channels argument in UNet2 to avoid potential error caused by inconsistency between the two arguments input_channels and channels.
* Refactor all the similarity loss to inherient from SimilarityBase. SimilarityBase has a member variable called isInterpolated and it indicates whether the similarity loss class requires mask for the interpolated evaluation.
* 1. Fix bug: Should check whether inbounds_tag is None or not. 2. Add assertion to check the shape of image_A and image_B.
* Set correct shape for the inbounds_tag when images have multiple channels.
* Refactor the test script according to the SimilarityBase class.
* Refactor the test script according to the SimilarityBase class.
* Refactor the test script according to the SimilarityBase class.
* To allow using similarity measure defined by user.
* Keep ssd and ssd_only_interpolated for backward compatibility.
* Add itk interface for multi-modality registration task.

* Update cpu-test-action.yml
* Update setup.cfg
* Update setup.cfg
* Update setup.cfg
* Update setup.cfg
* Update setup.cfg
* Update setup.cfg
* Update setup.cfg
* Update setup.cfg
* Update cpu-test-action.yml
* Update cpu-test-action.yml
* Update setup.cfg
* Update requirements.txt
* Update setup.cfg
* Update requirements.txt
* Update setup.cfg
* Update requirements.txt
* Update setup.cfg
* Update cpu-test-action.yml

* Move ConstrICON code into ICON. WIP So that I stop copy-pasting it everywhere.
* Create test_constricon.py
* Update test_constricon.py
* Update test_constricon.py
* Update test_2d_registration_train.py
* Update test_constricon.py
* Update test_constricon.py
* Update test_constricon.py
* Update constricon.py
* Update constricon.py
* Update medical_training.rst
* docs work
* Update test-action.yml
* Update test-action.yml
* maybe that helps
* Update README.md
* Update medical_training.rst
* docs
* tabs
* fix tests
* doc fix
* Update medical_training.rst
* Update data.py
* work for presentation
* fixes
* a
* Update medical_training.rst
* Update medical_training.rst
* Update medical_training.rst
* Add files via upload
* Add files via upload
* updates for longleaf
* unicarl attempt
* README
* fix details
* a
* Update setup.cfg to ban naughty itk 6.0
* prepare for itk 6
* used hasattr wrong
* A
* itk regression fixed?
* datasets
* a
* is training
* Update register.py
* Update register.py

HastingsGreer and others added 30 commits September 23, 2022 17:49

Update train.py

f29536f

Update register_fives.rst (#47)

a2e64ee

Update README.md

6a196aa

Update README.md

66a7238

pip release 1.1

123146e

Update test_brain_itk.py

752532f

make sure we aren't scaling a signed short image to [0, 1) (#53)

343762c

* make sure we aren't scaling a signed short image to [0, 1)
* enable cast
* fix tests
* add new module to doc

Refactor cvpr code to accomondate the new similarity interface: Simil…

e64ffe0

…arity meaure with isInterpolated set to True requires the inbounds_tag to be passed as one extra channel on image_A, otherwise the similarity measure will accept the two images with the same number of channels.

1.Update the test set.

0460a3c

2.Add freesurfer affine evaluation script. 3.Move all the helper functions to helper.py. 4.Add a prepare script so that we process the image for Synthmorph once.

The evaluation scripts of ants and gradicon were saved locally. Now c…

1c91258

…ommit them to git.

Move the initialization of footsteps ahead to prevent the multiple in…

57635f2

…itialization error.

Merge pull request #56 from uncbiag/HCP_fix

17c9c4b

HCP Fix

remove unneeded check (#57)

8dd6b49

I don't think we meant to keep this check after switching to the "getattr" approach

Update README.md (#58)

10f2808

Add GradICON paper and reference.

EHN: Change predefined image types for input images

6b27670

Use input image type instead of a predefined one. Also, refactoring to be compliant with PEP8 (79 characters max length)

vendor_gaussian_kernel

fe441b4

Update losses.py

770256d

Merge pull request #61 from uncbiag/vendor_gaussian_kernel

dc0a850

vendor_gaussian_kernel

Add files via upload

ac39bcd

Merge pull request #59 from curiale/master

cd5d56e

EHN: Change predefined image types for input images

fix shallow copy bug

f59eba0

Update losses.py

2ef1710

This is the configuration we actually used in training

Update lung_ct.py

e7e4604

Update losses.py

0b9361a

Update losses.py

f6d212a

Merge pull request #63 from uncbiag/reduce_memory_footprint

66ad084

Reduce memory footprint

release

7ffed36

HastingsGreer and others added 19 commits December 8, 2023 10:57

1.1.2

ab63d8d

Merge pull request #65 from uncbiag/pip-update-1.1.2

50d2cba

1.1.2

Update losses.py

8dee552

Merge pull request #67 from uncbiag/nanfix

f23a698

Update losses.py

pypi version bump

2302339

fix rtd theme error

6c13d48

pypi 1.1.4

d2fd88d

Add squared lncc and mind-ssc losses (#76)

f11da0b

Co-authored-by: Basar Demir <bdemir@biag-gpu6.cs.unc.edu>

pypi 1.1.5

0ffe3f2

Update README.md

e7f177e

Put a pointer to uniGradICON and multiGradICON into the ICON readme.

Update README.md

c926e9a

Correct link to uniGradICON and multiGradICON.

Fix MIND-SSC (#77)

2a6e26e

* Add squared lncc and mind-ssc losses
* fix cpu error and add indexing parameter for meshgrid in mind-ssc

---------

Co-authored-by: Basar Demir <bdemir@biag-gpu6.cs.unc.edu>

remove python 3.7 test (#82)

8289da1

WIP add support for loss function masking (#83)

bd1d563

* add support for loss function masking
* Change masking strategy, update naming conventions, and fix bugs

---------

Co-authored-by: Basar Demir <bdemir@biag-gpu6.cs.unc.edu>

pypi 1.1.6

b70537f

Work out why CI is failing for null pull requests (#90)

0a6586a

* Update cpu-test-action.yml
* Update cpu-test-action.yml

shard datasets second attempt- split individual datasets to keep batc…

c4fbc65

…hnorm stats consistent
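
The idea named in the PR title and in this commit message can be illustrated with a small sketch. This is not the code from this PR: the helpers `shard` and `shard_each_dataset` and the toy dataset names are hypothetical, and reading "split individual datasets" as "shard every dataset separately, so each rank still draws from all of them" is an assumption. Under that reading, the number of samples held in memory across all ranks equals the number of samples in the data, however many ranks there are.

```python
def shard(items, rank, world_size):
    """Return the strided slice of `items` owned by process `rank`."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    return items[rank::world_size]


def shard_each_dataset(datasets, rank, world_size):
    """Shard every dataset separately (assumed reading of the commit message).

    Each rank keeps a slice of every dataset instead of whole datasets being
    assigned to single ranks, so the per-rank data mix stays similar.
    """
    return {name: shard(items, rank, world_size) for name, items in datasets.items()}


# Hypothetical toy data: two datasets of different sizes.
datasets = {
    "knees": [f"knee_{i}" for i in range(8)],
    "brains": [f"brain_{i}" for i in range(4)],
}

world_size = 4
per_rank = [shard_each_dataset(datasets, r, world_size) for r in range(world_size)]

# Samples held across all ranks: the shards partition the data, so this equals
# the total number of samples rather than world_size times that number.
total_held = sum(len(items) for shards in per_rank for items in shards.values())
```

With unsharded loading, every rank would hold all 12 samples, so the total held in memory would grow linearly with `world_size`.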

HastingsGreer force-pushed the shard-datasets branch from 8a9df6e to c4fbc65 Compare October 26, 2025 12:06

HastingsGreer added 2 commits October 26, 2025 13:11

tweaks for actual data

12e3a25

switch to public split

5c84f94

HastingsGreer wants to merge 51 commits into HastingsGreer:master from uncbiag:shard-datasets

HastingsGreer commented Oct 26, 2025

5 participants
