refactor(network): use async for registry network operations#16745
refactor(network): use async for registry network operations#16745arlosi wants to merge 7 commits intorust-lang:masterfrom
Conversation
5072ba5 to
e09814c
Compare
This comment has been minimized.
This comment has been minimized.
4e38864 to
a462b06
Compare
This comment has been minimized.
This comment has been minimized.
|
@rustbot ready I've addressed a couple performance issues since the last update in the async http client. On my machine, performance is now as good or slightly better with this change applied. Pending investigation:
r? @weihanglo |
d31c707 to
b5ac488
Compare
This comment has been minimized.
This comment has been minimized.
src/cargo/util/network/http_async.rs
Outdated
|
|
||
| /// Configure a libcurl http handle with the defaults options for Cargo | ||
| /// Note: keep in sync with `http::configure_http_handle`. | ||
| fn configure_http_handle( |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
| return Ok(summaries.clone()); | ||
| } | ||
|
|
||
| // Check if this request has already started. If so, return a oneshot that hands out the same data. |
There was a problem hiding this comment.
I'm leaning towards using adding a dependency on quick_cache here. It should reduce this whole function to something like:
self.summaries_cache.get_or_insert_async(name, self.load_summaries_uncached(name, load)).await
This comment has been minimized.
This comment has been minimized.
|
Benchmarks show improvements or no change in all areas measured. Criterion benchmarks which test the performance of the resolver when there is no blockingHyperfine with local webserver and sample project at 100ms simulated latencyWith simulated 100ms delay on the following assorted dependencies: [dependencies]
anyhow = "1.0.102"
futures = "0.3.32"
futures-timer = "3.0.3"
serde = { version = "1", features = ["derive"] }
reqwest = "0.13"
hyper = "1.8.1"
http = "1.4.0"
cargo = "0.95.0"Hyperfine on `cargo` itself hitting live crates.io |
| [UPDATING] `alternative` index | ||
| [ERROR] failed to get `bar` as a dependency of package `foo v0.0.1 ([ROOT]/foo)` | ||
|
|
||
| Caused by: |
There was a problem hiding this comment.
Do we feel like these new context are good or redundant?
There was a problem hiding this comment.
The context for auth errors now better matches other errors. It does feel somewhat redundant though. Removing one level of the error chain in a follow up PR (see my other comment) would somewhat mitigate it.
There was a problem hiding this comment.
In the current Cargo, the additional context of "failed to load source" / "unable to update registry"
doesn't appear only for sparse registries.
The reason is that the context only gets added to errors that come from block_until_ready,
and sparse registry errors come through query.
So this change makes sparse registries behave like the other source types.
Example showing git vs sparse errors on existing cargo
Updating `sparse` index
error: failed to get `anyhow` as a dependency of package `sample v0.1.0 (/home/arlo/git/sample-baseline)`
Caused by:
failed to query replaced source registry `crates-io`
Caused by:
download of config.json failed
Caused by:
failed to download from `https://localhost:34433/config.json`
Caused by:
[7] Could not connect to server (Failed to connect to localhost port 34433 after 0 ms: Could not connect to server)
Updating `git` index
error: failed to get `anyhow` as a dependency of package `sample v0.1.0 (/home/arlo/git/sample-baseline)`
Caused by:
failed to load source for dependency `anyhow`
Caused by:
unable to update registry `crates-io`
Caused by:
failed to update replaced source registry `crates-io`
Caused by:
failed to fetch `https://localhost:34433/`
Caused by:
network failure seems to have happened
if a proxy or similar is necessary `net.git-fetch-with-cli` may help here
https://doc.rust-lang.org/cargo/reference/config.html#netgit-fetch-with-cli
Caused by:
[7] Could not connect to server (Failed to connect to localhost port 34433 after 0 ms: Could not connect to server); class=Net (12)
| .with_stderr_data(str![[r#" | ||
| [UPDATING] git repository `[ROOTURL]/bar` | ||
| [ERROR] unable to update [ROOTURL]/bar#0.1.2 | ||
| [ERROR] failed to get `bar` as a dependency of package `dep1 v0.5.0 ([ROOT]/foo/dep1)` |
There was a problem hiding this comment.
Not sure whether this is an improvement or regression. Personally feel like this is more an diagnostic improvement because now people know why git were cloning bar, but OTOH it is a bit too verbose.
There was a problem hiding this comment.
Overall I think the additional context is good, but things are getting too verbose. I'd like to make a follow-up change that removes one level of this chain everywhere. Doing that in this PR would cause a significant diff though.
Changing most
Caused by:
failed to load source for dependency `bar`
Caused by:
unable to update [ROOTURL]/bar#0.1.2
to
Caused by:
failed to update [ROOTURL]/bar#0.1.2 for dependency `bar`
This comment has been minimized.
This comment has been minimized.
|
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
This is only used by the resolver, which is the remaining non-async native component that accesses registries.
We need to be able to have multiple async futures in-flight, which means shared access is required for registry methods.
This is used to enable dynamic dispatch on traits that use async functions.
The method bodies are not changed. This is in preparation to remove the block_until_ready functions.
This error occurs when a registry returns "CacheValid", but we didn't provide a cache key. This is only a refactor, does not impact any functionality. This error path is only possible with http (sparse) registries currently.
Converts Poll into async. Some error messages are impacted due to slightly different code paths being taken. This is a major change to the sparse registry backend in http_remote.rs.
|
I've minimized the tests diffs further, and fixed a couple typos in the last push. This should be ready to go now. |
View all comments
This PR introduces
asyncinto cargo, specifically for handling registry network operations.Why use async?
Poll+block_until_readyto enable parallelism. This model is somewhat awkward to work with, especially when parallelism is desired. Usingasyncsimplifies creating and working with registries. When parallelism is not needed, the call can be wrapped inblock_onreqwestor other rust-native HTTP clients that use async.What executor is used?
This change uses
futures::executorrunning on the current thread. It does not use a thread pool. Local I/O (filesystem), still usesstdin blocking mode, even when in an async context. This choice was made to minimize dependencies and size of this change. In the future, we may decide to move to a larger executor liketokio.What does this PR change?
Convert registry methods taking
&mut selfto&selfIn order to execute multiple futures in parallel (using
FuturesUnorderedor similar), there must be multiple active borrows. Many of the registry querying methods used&mut self, which prevents this.Any mutations are now handled by using
Cell/RefCell. Since the executor is single-threaded, the operations aren't actually happening in parallel, but when passing an.awaitpoint, a different future could begin executing. We must be careful now not to hold mut borrows onRefCells across await points, as it will cause a panic.Convert methods returning
Poll<_>to returnFuture<_>(async).When we want to block (and we're not in an async function), the async call is wrapped with
crate::util::block_on. If this is attempted within an existing async executor, cargo will panic.Remove all
block_until_readymethods from allSource,RegistryDatatraitsFor most of the sources, the contents of the
block_until_readyfunction was moved into a a privateupdatefunction that's called once before making any queries.Add new HTTP async
http_asyncmoduleWraps cURL's
Multiinterface to provide anasyncinterface for making HTTP requests through cURL.Add
async-traitcrateThis is used to enable dynamic dispatch on traits that use
async fn.Error message output changes
Due to refactoring around removing
block_until_ready, some error messages are changed, usually by having additional context information. This change makes the error messages more consistent between various code paths. I attempted to minimize the changes in this PR for ease of review. We may want to simplify the error messages (by removing one level of context) in a follow up.Resolver
The resolver is the only portion still using the
Pollapproach. An adapter layerLocalPollAdapteris added that translatesPolls intoasync.Converting the resolver be
asyncnative is left as future work. In theory this conversion could provide performance improvements, since the resolution would not need to be restarted. It would also let the resolver start working immediately as new dependencies became available..cratedownloadingThe downloading of crate files using
PackageSetis not yet converted to async. This can be done separately.Progress reporting
The
http_remoteno longer has the ability to track how manyblock_until_readycalls have been made, so we can't do the trick where we assume we have a dependency tree depth of 10 to make a fake progress bar. Since the progress bar was fake anyway, this change removes it in favor of only displaying the information we know.The new behavior is:
Performance
This change is not expected to impact performance. My local measurements show neutral to possible 1% improvement. Release binary size increased about 1%.
How to review
Commit by commit. The final commit contains the majority of the change that was not straightforward to separate out.