feat(metro-file-map): Lazily stat files and populate symlinks for Node crawled file trees#1686
Open
kitten wants to merge 5 commits intofacebook:mainfrom
Open
Conversation
This is needed because we won't mark unvisited files as changed anymore. This causes them to not be read or processed. We could exclude symlinks from this, but the lazy symlinking makes sense in conjunction with the lazy lstat change. This is a trade-off, we skip the readlinks on startup that are async, and instead run them synchronously. However, they're skipped if the symlink is never accessed, which in workspaces with isolated dependency installations can be beneficial. In most cases, this is unlikely to impact performance as much as eagerly populating symlinks, as the previous code seems to mostly assume a low number of symlinks anyway (which isn't true for isolated installations)
ad0b094 to
3c2d74c
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Note
This is part of the same pick/patch-set from an experiment on the Expo repo,
like #1676, #1677, and #1687, for
metro-file-map(conflicts are expected, since changes were pulled out into separate PRs for clarity)After #1677 the highest impact on initial crawling performance with the Node crawler are the individual
lstatcalls andreadlinkcalls. The former happens for every file that's discovered and can hence get quite large for monorepos (or assets that aren't excluded with blocklists). The latter, readlinks, are expensive if we're in an isolated installation (pnpm & bun), which can contain a much larger amount of symlinks than was likely anticipated by the current code paths.This is the last (and most minimal iteration) of a few attempts to reduce the cost.
The idea is to skip the
lstatcall for new files that aren't in the previous FS snapshot (entry doesn't exist). To make this safe, these files are also excluded fromgetDifference, i.e. when both the previous and new entries have0 | nullmtime values. In both paths0 | nullindicates that the file is skipped forlstat.The
mtimeis later added via anfs.promises.lstatcall ingetOrComputeSha1i.e. when the file is actually accessed. If the file isn't accessed the mtime value remains at null.This cascades to require/allow us to lazily evaluate symlinks for mtime values of null. Their values aren't read until accessed in
#resolveSymlinkTargetToNormalPathinTreeFS. This does introduce an update on read, but lazily populates the symlink value. This trades off the asyncreadlinkcall on all symlinks for a syncreadlinkcall on a subset of symlinks.We've observed this to be an overall win, but this could be influenced further by this commit: expo/expo@af67435 (which could also be upstreamed but changes the
getContentsignature. This change can be avoided by awaiting first, but seems overall like a better change than awaitinggetContentunconditionally before plugins)Changelog: [Internal] Lazily stat Node crawled files and lazily populate symlink targets
Test plan