Doc example standard analysis reproducibility #14

Merged
rhingo merged 4 commits into main from doc-example-standard-analysis-reproducibility on May 26, 2026

Conversation

@nehatk17

Contributor

No description provided.

@nehatk17 nehatk17 self-assigned this May 18, 2026

@NEStock NEStock left a comment

Member

Looks good and page works as expected!

@CodyCBakerPhD CodyCBakerPhD left a comment

Looks great!

@@ -0,0 +1,116 @@
# Example Data Standardization

Contributor

Would suggest making this a bit more descriptive. Maybe "Example Workflow for Data Standardization and Analysis Reproducibility"

@@ -0,0 +1,116 @@
# Example Data Standardization

Here we document an example of how we took processed data stored in a typical file format many researchers work with and converted that to the NWB file format (community standard accepted by EMBER).

Contributor

Might be nice to make this a bulleted list to highlight the goals being:

  1. Standardize the data in a community-accepted standard (NWB)
  2. Enable reproducibility of paper figures as a starting point for secondary analyses

That way the two goals are presented in parallel, rather than analysis reproducibility reading as a secondary effort.


In collaboration with Dr. Suthana and Dr. Seeber (lead author), we explored each of the data variables in the original .mat files and identified analogous containers within the NWB file structure.

Most of the data variables are relevant to multiple subjects at the same time, as is often the case for data representing group averages in paper figures.

Contributor

Maybe add a ("e.g., ...") after "...relevant to multiple subjects at the same time". That would give a more concrete example of what that means so that someone understands when ndx-multisubjects should be used (I think the "at the same time" is the key part here?)


As mentioned above, the first step towards enabling robust secondary analyses is to replicate publication figures or analyses produced with the original file format.

To continue towards this effort, the following next steps are outlined:

Contributor

Maybe we can motivate some of these next steps with one sentence here? Why do we need/want to do this with the raw data as well?

We also show how we are able to replicate figure results from a paper using the converted data.

In doing this exercise, we complete a pipeline that is key for any datasets that are uploaded to EMBER. It is important that not only is the data standardized for improved storage and metadata retrieval, but that the standardized data can also be used for secondary processing and analyses. Reproducing key figures is the first such verification step towards ensuring that open datasets can be repurposed for new scientific endeavours.

Contributor

Might be good to add a link to the data conversion code (the Python script that converts the .mat files to .nwb) and a link to the Jupyter notebook that reproduces the analysis. We can also mention that we plan to release the standardized version of the dataset in the future.

@rhingo

rhingo commented May 18, 2026

Contributor

Thanks for putting this together! I left a few minor comments throughout the file. Feel free to look through and address the ones you think are critical and then get this merged into main.

@nehatk17 nehatk17 requested a review from rhingo May 19, 2026 16:55
@rhingo rhingo merged commit 43717a0 into main May 26, 2026
2 checks passed
5 participants