
Pavlovia experiment cleanup: scan and delete stale repos, pause linked Prolific studies #49

@khahani

PRD: Pavlovia Experiment Cleanup

Problem Statement

Scientists accumulate test and scaffolding Pavlovia repositories over the course of developing
EasyEyes experiments. These repos — created during development iterations, aborted pilots, or
misconfigured builds — clutter the "Select a compiled experiment" list and consume storage on
gitlab.pavlovia.org. When a repo is linked to a Prolific study, that study also remains in a
potentially active or unpublished state, creating loose ends across two platforms.
There is currently no way for a scientist to identify and remove these stale repos without manually
inspecting each one through the Pavlovia and Prolific web interfaces.

Solution

Add a secondary "Clean Up" button to the "Select an Experiment" modal. When clicked, the compiler
scans all of the scientist's Pavlovia repos, identifies deletion candidates based on emptiness and
activity criteria, and presents a checklist. The scientist selects any combination of candidates,
confirms once, and the compiler deletes each Pavlovia repo and pauses any linked Prolific study in
a best-effort batch. A summary of results is shown at the end.

User Stories

  1. As a scientist, I want a "Clean Up" button in the "Select an Experiment" dialog, so that I can
    start the cleanup workflow without leaving the compiler.

  2. As a scientist, I want the cleanup button to be visually secondary to the "Close" button, so
    that it is clearly not a primary compiler action and cannot be triggered accidentally.

  3. As a scientist, I want the compiler to scan all of my Pavlovia repos when I start cleanup, so
    that I do not have to check each one manually.

  4. As a scientist, I want to see a progress indicator while the scan is running, so that I know
    the compiler is working and not frozen.

  5. As a scientist, I want an empty or scaffold-only repo (no real experiment code committed) to
    appear in the candidate list, so that repos created by mistake or abandoned before use can be
    removed.

  6. As a scientist, I want a repo whose data/ folder is empty and whose Pavlovia status is not
    "Running" to appear in the candidate list, so that repos that were deployed but never collected
    real participant data can be removed.

  7. As a scientist, I want repos currently in "Running" status on Pavlovia to be excluded from the
    candidate list, even if their data/ folder is empty, so that an active experiment is never
    accidentally deleted.

  8. As a scientist, I want repos whose linked Prolific study has any active, queued, or completed
    submissions to be excluded from the candidate list, so that experiments with real participant
    activity are never deleted.

  9. As a scientist, I want each candidate in the list to show the repo name, the reason it was
    flagged (e.g. "empty repo", "no data collected"), and the Prolific study status if a
    ProlificStudyId.txt file exists, so that I can make an informed decision before selecting
    it for deletion.

  10. As a scientist, I want a "Select All" checkbox at the top of the candidate list, so that I can
    select every candidate in one click.

  11. As a scientist, I want to be able to uncheck individual repos from a "Select All" selection,
    so that I can exclude specific repos I am unsure about.

  12. As a scientist, I want a single confirmation prompt that summarises how many repos will be
    deleted and how many Prolific studies will be paused before anything is removed, so that I can
    abort if the summary looks wrong.

  13. As a scientist, I want the compiler to delete each selected Pavlovia repo via the GitLab API,
    so that the repos are removed from my gitlab.pavlovia.org account.

  14. As a scientist, I want the compiler to pause the linked Prolific study for each deleted repo
    that has a ProlificStudyId.txt, so that no new participants can join a study whose experiment
    repo no longer exists.

  15. As a scientist, I want the cleanup to continue processing remaining repos if one deletion or
    pause fails, so that a single network error does not block the whole batch.

  16. As a scientist, I want a final summary dialog showing which repos were successfully deleted,
    which Prolific studies were paused, and which operations failed, so that I know the final
    state of my accounts.

  17. As a scientist with no deletable repos, I want the scan to tell me that nothing was found, so
    that I know the scan completed and my repos are all considered active.

  18. As a scientist, I want the "Select an Experiment" list to reflect the deletions immediately
    after cleanup completes, so that removed repos no longer appear in the dropdown.

Implementation Decisions

  • scanDeletablePavloviaRepos (new deep module in gitlabUtils): Accepts the authenticated
    User and prolificToken. Iterates all of the scientist's Pavlovia repos (using the existing
    paginated project-list infrastructure). For each repo it performs three checks in order:
    (1) repo emptiness — calls the GitLab repository tree API; if the tree is empty or contains
    only scaffold files committed by EasyEyes at creation time, the repo is flagged as "empty repo";
    (2) data folder check — calls the GitLab tree API for path=data; if the folder is absent or
    empty AND the Pavlovia experiment status is not RUNNING, the repo is flagged as "no data
    collected"; (3) Prolific gate — if ProlificStudyId.txt exists, fetches the Prolific study and
    its submissions; if any submission has status ACTIVE, AWAITING REVIEW, or APPROVED, the
    repo is excluded from candidates regardless of other flags.
    Returns an array of CleanupCandidate objects: { project, reason, prolificStudyId, prolificStatus }.

  • deleteRepo (new function in gitlabUtils): Calls the GitLab API DELETE /projects/:id for
    a single Pavlovia repo. Returns a boolean indicating success. Routes through the existing
    GitLabOAuthClient.apiRequest so token refresh and retry are handled consistently.

  • pauseProlificStudy (new function in prolificIntegration): Calls
    POST /.netlify/functions/prolific/studies/:id/transition/ with body { action: "PAUSE" }.
    Returns a boolean indicating success. Matches the existing fetch pattern in prolificIntegration
    (same headers, same error capture via captureError).

  • runCleanupBatch (new function, co-located with scan logic): Accepts the scientist's
    selected CleanupCandidate[], User, and prolificToken. For each candidate: calls deleteRepo
    then (if prolificStudyId is present) calls pauseProlificStudy. Collects successes and
    failures. Returns { deleted: string[], paused: string[], failed: Array<{ name, error }> }.
    Best-effort: a failure on one repo does not stop processing of the rest.

  • Dropdown component (modified): openModal gains an onCleanup callback prop. The SweetAlert2
    config gains showDenyButton: true with denyButtonText: "Clean Up" and a muted style to
    signal it is a secondary action. When the deny button is clicked (SweetAlert2 reports this as
    result.isDenied), the handler calls onCleanup. The cleanup flow opens a new SweetAlert2 dialog
    (replacing the experiment selector) that cycles through: (1) scanning spinner, (2) candidate
    checklist with "Select All" + per-row checkboxes and reason/Prolific columns, (3) confirmation
    prompt, (4) progress during batch, (5) results summary. On completion or cancellation, control
    returns to the refreshed experiment selector.

  • Prolific study status display: Reuses the existing fetchProlificStudy +
    fetchProlificStudySubmissions pattern from prolificIntegration to produce a short status
    string per candidate (e.g. "Unpublished. 0/20 in progress"). If no Prolific token is available,
    the Prolific column is hidden and the Prolific gate is skipped (the repo can still be deleted if
    Pavlovia criteria are met).

  • Scaffold file definition: A repo is considered scaffold-only if it contains only files that
    EasyEyes commits at creation time. The exact set of scaffold filenames is defined as a constant
    alongside scanDeletablePavloviaRepos so it can be updated in one place.

  • Post-cleanup list refresh: After runCleanupBatch completes, the Dropdown calls its
    existing onRefresh / fetchFreshList path so the experiment selector re-opens with an
    up-to-date list that no longer includes deleted repos.
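
The per-repo decision described for scanDeletablePavloviaRepos can be sketched as a pure function
that runs after the API responses have been collected. This is only an illustration under stated
assumptions: the RepoSnapshot shape, the classifyRepo name, and the contents of SCAFFOLD_FILES are
hypothetical, and Running repos are assumed to be excluded from both flags, following user story 7.

```typescript
// Hypothetical shapes for illustration; field names are not taken from the codebase.
type PavloviaStatus = "RUNNING" | "PILOTING" | "INACTIVE";
type Reason = "empty repo" | "no data collected";

interface RepoSnapshot {
  name: string;
  rootFiles: string[]; // file names returned by the repository tree API
  dataFiles: string[] | null; // entries under data/, or null if the folder is absent
  status: PavloviaStatus;
  // Submission statuses of the linked Prolific study; null when there is
  // no ProlificStudyId.txt or no Prolific token (the gate is then skipped).
  prolificSubmissionStatuses: string[] | null;
}

// Placeholder set: the real list is whatever EasyEyes commits at creation time.
const SCAFFOLD_FILES: ReadonlySet<string> = new Set(["README.md", ".gitignore"]);

// Submission statuses that count as real participant activity.
const BLOCKING_STATUSES: ReadonlySet<string> = new Set([
  "ACTIVE",
  "AWAITING REVIEW",
  "APPROVED",
]);

// Returns the reason a repo is a deletion candidate, or null if it must be kept.
function classifyRepo(repo: RepoSnapshot): Reason | null {
  // Prolific gate: participant activity excludes the repo regardless of other flags.
  if (repo.prolificSubmissionStatuses?.some((s) => BLOCKING_STATUSES.has(s))) {
    return null;
  }
  // Assumption: a Running experiment is never a candidate.
  if (repo.status === "RUNNING") return null;
  // Check 1: empty tree, or only scaffold files (every() is true for an empty list).
  if (repo.rootFiles.every((f) => SCAFFOLD_FILES.has(f))) return "empty repo";
  // Check 2: data/ folder absent or empty.
  if (repo.dataFiles === null || repo.dataFiles.length === 0) {
    return "no data collected";
  }
  return null;
}
```

Keeping the decision separate from the network calls means the cases listed under Testing
Decisions could be exercised with plain objects instead of mocked HTTP responses.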

Testing Decisions

A good test asserts externally observable return values given specific mock API responses. Do not
assert which internal functions were called or in what order — only assert on the data returned
and the side effects visible through the public interface.

Modules to test:

  • scanDeletablePavloviaRepos: given mock GitLab tree responses (empty repo, empty data/ with
    non-Running status, empty data/ with Running status, non-empty data/) and mock Prolific
    responses (study with zero submissions, study with one ACTIVE submission), assert the correct
    subset of repos appears as candidates with the expected reason values. Running repos and repos
    with Prolific activity must never appear as candidates.

  • deleteRepo: given a 204 response from the GitLab delete endpoint, returns true; given a 403
    or network error, returns false without throwing.

  • pauseProlificStudy: given a 200 response from the transition endpoint, returns true; given a
    4xx response, returns false without throwing.

  • runCleanupBatch: given two candidates where the first delete succeeds and the second fails,
    returns the correct deleted, paused, and failed arrays.
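
The runCleanupBatch case above can be sketched with injected fakes, so the test only inspects the
returned arrays. This is a minimal sketch, not the planned signature: the PRD passes User and
prolificToken, whereas here the two operations are injected through a hypothetical BatchDeps
object. Two further assumptions are labeled in the comments: a study is not paused when its repo
deletion fails, and the paused array holds Prolific study IDs.

```typescript
// Hypothetical, simplified candidate shape (the PRD's CleanupCandidate carries a full project).
interface CleanupCandidate {
  name: string;
  projectId: number;
  prolificStudyId: string | null;
}

interface BatchResult {
  deleted: string[];
  paused: string[];
  failed: Array<{ name: string; error: string }>;
}

// Injected operations; both are expected to resolve to a success flag.
interface BatchDeps {
  deleteRepo: (projectId: number) => Promise<boolean>;
  pauseProlificStudy: (studyId: string) => Promise<boolean>;
}

async function runCleanupBatch(
  candidates: CleanupCandidate[],
  deps: BatchDeps,
): Promise<BatchResult> {
  const result: BatchResult = { deleted: [], paused: [], failed: [] };
  for (const candidate of candidates) {
    try {
      const deleted = await deps.deleteRepo(candidate.projectId);
      if (!deleted) {
        result.failed.push({ name: candidate.name, error: "delete failed" });
        continue; // Assumption: leave the study untouched if its repo still exists.
      }
      result.deleted.push(candidate.name);
      if (candidate.prolificStudyId !== null) {
        const paused = await deps.pauseProlificStudy(candidate.prolificStudyId);
        if (paused) {
          result.paused.push(candidate.prolificStudyId); // Assumption: IDs, not repo names.
        } else {
          result.failed.push({ name: candidate.name, error: "pause failed" });
        }
      }
    } catch (err) {
      // Best-effort: record the failure and keep processing the rest of the batch.
      const message = err instanceof Error ? err.message : String(err);
      result.failed.push({ name: candidate.name, error: message });
    }
  }
  return result;
}
```

With a fake deleteRepo that succeeds only for the first project, the returned object should list
the first repo under deleted, its study under paused, and the second repo under failed.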

Prior art: source/__tests__/gitlabUtils.test.js for Pavlovia API mocking patterns, and
source/components/prolificIntegration.js for the existing Prolific fetch patterns to mirror in tests.

Out of Scope

  • Deleting Prolific studies: Only pausing is in scope. Permanent deletion of a Prolific study
    is irreversible and would remove submission records that scientists may need for participant
    payment audits.

  • Age-based deletion criteria: Last-commit date is deliberately excluded as a criterion.
    Scientists have legitimate past studies with no recent commits that must not be flagged.

  • Bulk cleanup across multiple scientist accounts by an admin: Each scientist runs cleanup on
    their own Pavlovia account using their own OAuth session. Cross-account admin tooling is out of
    scope.

  • Automated / scheduled cleanup: Cleanup is always scientist-initiated. No background job or
    cron-based cleanup is in scope.

  • Archiving repos instead of deleting: The action is permanent deletion on Pavlovia. GitLab
    archive (which keeps the repo read-only) is not offered.

  • Recovering deleted repos: No undo or recycle-bin mechanism is provided. The confirmation
    prompt is the only safety gate.

  • Filtering or sorting the candidate list: Candidates are shown in the order the scan returns
    them, with no sort or filter controls.

Further Notes

The Prolific API is accessed via the Netlify function proxy at /.netlify/functions/prolific/
rather than directly at api.prolific.com — consistent with all existing Prolific calls in
prolificIntegration.js. The pauseProlificStudy function must follow this same routing.
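
A minimal sketch of that routing, assuming a fetch-like function is injected so the example does
not depend on browser or Node typings. The authorization header format and the FetchLike and
ResponseLike types are assumptions for illustration; the real function should copy the headers
used by the existing calls in prolificIntegration.js and report errors through captureError.

```typescript
// Minimal fetch-like types (hypothetical) so the sketch compiles without DOM typings.
interface ResponseLike {
  ok: boolean;
  status: number;
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<ResponseLike>;

// Pauses a Prolific study through the Netlify proxy. Resolves to false instead of throwing.
async function pauseProlificStudy(
  studyId: string,
  prolificToken: string,
  fetchFn: FetchLike,
): Promise<boolean> {
  const url = `/.netlify/functions/prolific/studies/${encodeURIComponent(studyId)}/transition/`;
  try {
    const response = await fetchFn(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: token header format; mirror the existing Prolific calls instead.
        authorization: `Token ${prolificToken}`,
      },
      body: JSON.stringify({ action: "PAUSE" }),
    });
    return response.ok;
  } catch {
    // Network failure; the real implementation would also call captureError here.
    return false;
  }
}
```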

The Pavlovia experiment status (Running / Piloting / Inactive) is already fetched by
ServerManager.js when a session opens. For the cleanup scan, it is fetched directly via the
GitLab project metadata API (/projects/:id) rather than through the PsychoJS runtime, since the
scientist is not running an experiment session during cleanup.

The EasyEyesResources repo must always be excluded from the candidate list regardless of
emptiness, since it is a shared resource repo and not an experiment.
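
One way to enforce this is to drop protected repos before any per-repo API call is made. The
sketch below assumes an exact, case-sensitive name match; PROTECTED_REPO_NAMES and
excludeProtectedRepos are hypothetical names.

```typescript
// Repos that must never be offered for deletion (assumption: exact name match).
const PROTECTED_REPO_NAMES: ReadonlySet<string> = new Set(["EasyEyesResources"]);

// Filters the project list before scanning, so protected repos are never inspected or flagged.
function excludeProtectedRepos<T extends { name: string }>(projects: T[]): T[] {
  return projects.filter((project) => !PROTECTED_REPO_NAMES.has(project.name));
}
```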
