Skip to content

Add repository caching and enqueue clone on startup#721

Merged
Kobzol merged 8 commits intorust-lang:mainfrom
reddevilmidzy:local
Apr 3, 2026
Merged

Add repository caching and enqueue clone on startup#721
Kobzol merged 8 commits intorust-lang:mainfrom
reddevilmidzy:local

Conversation

@reddevilmidzy
Copy link
Copy Markdown
Member

@reddevilmidzy reddevilmidzy commented Mar 26, 2026

close: #693

Add gitops cache with CloneRepository. and reuse cached bare repos for squash commit transfer.

Copy link
Copy Markdown
Member Author

@reddevilmidzy reddevilmidzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we forgot to discuss testing—would now be a good time to start using a git mock for this?

View changes since this review

@reddevilmidzy reddevilmidzy marked this pull request as ready for review March 26, 2026 06:50
Copy link
Copy Markdown
Member

@Kobzol Kobzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! Left some comments. I don't think that testing is needed at the moment, that will be a more complex change.

View changes since this review

let log_repo = repo_name.clone();
match senders.gitops_queue().enqueue_clone_repository(repo_name) {
Ok(true) => {}
Ok(false) => {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, in practice this won't be an issue, as the current queue capacity is 3, and we only work with rust-lang/rust in production, but the capacity shouldn't be used for some commands. Let's not deal with that now.

Copy link
Copy Markdown
Member Author

@reddevilmidzy reddevilmidzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the review :)

View changes since this review

@Kobzol
Copy link
Copy Markdown
Member

Kobzol commented Apr 3, 2026

I did some experiments on a repository that has the full history of rust-lang/rust:

clone --bare + fetch => 100s clone, 1.6s push
clone + fetch => 100s clone, 1.6s push
clone --bare blob=none + fetch => 30s clone, 30.4s push
clone blob=none + fetch => 30s clone, 2.7s push
clone tree:0 + fetch => 6.5s clone, 2s push
clone tree:0 + fetch tree:0 => 6.5s clone, 2.7s push
clone --bare tree:0 + fetch => 6.5s clone, 4.7s push

It seems that there is some unfortunate git behavior when a bare clone is combined with a blobless clone. It is quite slow on the first push. The second push is fast again, but that's not enough.

It seems to me that the sweet spot is either doing a bare clone with all blobs or trees, or a treeless bare clone, which saves space on disk, has fast init, relatively fast first push, and fast followup pushes. That being said, the treeless clone could cause some performance edge-cases, and I'm kinda worried about their behavior with submodules. We already saw some issues with partial clones on rust-lang/rust. So the most reliable solution would likely be to just do a bare clone and that's it. And yet, I want to see how the treeless clone behaves in production 😆

I took the liberty of adding some logging, so that we can observe in logs how long does the init and squashes take. Let's try to use the treeless clone and see what happens in production.

std::fs::remove_dir_all(repo_path)
.context("Cannot reset repository cache directory")?;
}
if let Some(parent) = repo_path.parent() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this is needed, as git creates the target directory and its parents recursively when cloning.

Copy link
Copy Markdown
Member

@Kobzol Kobzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this is great! Let's try it on prod.

View changes since this review

@Kobzol Kobzol added this pull request to the merge queue Apr 3, 2026
Merged via the queue into rust-lang:main with commit c05c0dd Apr 3, 2026
3 checks passed
@Kobzol
Copy link
Copy Markdown
Member

Kobzol commented Apr 3, 2026

Aaaand ofc I already ran into an error. Let's just try with a full clone.. (#728).

@reddevilmidzy reddevilmidzy deleted the local branch April 4, 2026 05:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement local git operations for squashing

2 participants