
feat(metro-file-map): Lazily stat files and populate symlinks for Node crawled file trees #1686

Open
kitten wants to merge 5 commits into facebook:main from kitten:@kitten/feat/metro-file-map/defer-stat-and-readlink


@kitten kitten commented Apr 11, 2026

Summary

Note

This is part of the same patch set, picked from an experiment on the Expo repo, as #1676, #1677, and #1687 for metro-file-map. Conflicts are expected, since the changes were pulled out into separate PRs for clarity.

After #1677, the biggest remaining costs in initial crawling with the Node crawler are the individual lstat and readlink calls. An lstat call happens for every file that's discovered, so the total can get quite large for monorepos (or for assets that aren't excluded with blocklists). The readlink calls are expensive in an isolated installation (pnpm and bun), which can contain far more symlinks than the current code paths likely anticipated.

This is the last (and most minimal) iteration of a few attempts to reduce that cost.

The idea is to skip the lstat call for new files that have no entry in the previous FS snapshot. To make this safe, these files are also excluded from getDifference, i.e. when both the previous and new entries have 0 | null mtime values. In both paths, an mtime of 0 | null indicates that lstat was skipped for the file.
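A minimal sketch of the comparison rule described above, not the actual getDifference implementation. The `Snapshot` type, `isUnstatted`, and `changedPaths` are hypothetical names, and treating a missing previous entry the same as an unstatted one is an assumption made for illustration.

```typescript
// Hypothetical snapshot shape: path -> mtime, where 0 | null means
// "lstat was skipped for this file".
type MTime = number | null;
type Snapshot = Map<string, MTime>;

// Assumption: a missing previous entry (undefined) is treated like an
// unstatted one, since new files are the ones that skip lstat.
const isUnstatted = (mtime: MTime | undefined): boolean =>
  mtime == null || mtime === 0;

// Returns paths whose mtime differs, ignoring pairs where neither side
// was ever stat'ed. Removed files are out of scope for this sketch.
function changedPaths(previous: Snapshot, next: Snapshot): string[] {
  const changed: string[] = [];
  for (const [path, nextMtime] of next) {
    const prevMtime = previous.get(path);
    if (isUnstatted(prevMtime) && isUnstatted(nextMtime)) {
      continue; // nothing to compare, so the file is not reported
    }
    if (prevMtime !== nextMtime) {
      changed.push(path);
    }
  }
  return changed;
}
```

With this rule, a newly discovered file that was never stat'ed is not reported as changed, which is why its mtime has to be filled in at a later point.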

The mtime is later added via an fs.promises.lstat call in getOrComputeSha1, i.e. when the file is actually accessed. If the file is never accessed, its mtime remains null.
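The deferred stat can be sketched roughly as follows. This is an illustration under assumptions, not the real getOrComputeSha1: the `FileEntry` shape, `getOrComputeSha1Sketch`, and `demoLazyStat` are hypothetical, and only standard Node.js APIs (`fs.promises.lstat`, `fs.promises.readFile`, `crypto.createHash`) are used.

```typescript
import { createHash } from 'node:crypto';
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical entry shape; metro-file-map's real metadata differs.
interface FileEntry {
  mtime: number | null; // null: lstat was skipped during the crawl
  sha1: string | null;
}

async function getOrComputeSha1Sketch(
  entry: FileEntry,
  absolutePath: string,
): Promise<string> {
  if (entry.mtime == null || entry.mtime === 0) {
    // Deferred stat: the lstat cost is paid only once the file is needed.
    const stats = await fs.promises.lstat(absolutePath);
    entry.mtime = stats.mtimeMs;
  }
  if (entry.sha1 == null) {
    const content = await fs.promises.readFile(absolutePath);
    entry.sha1 = createHash('sha1').update(content).digest('hex');
  }
  return entry.sha1;
}

// Usage: an entry the crawl recorded without stat'ing the file.
async function demoLazyStat(): Promise<FileEntry> {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lazy-stat-'));
  const file = path.join(dir, 'a.js');
  fs.writeFileSync(file, 'module.exports = 1;');
  const entry: FileEntry = { mtime: null, sha1: null };
  await getOrComputeSha1Sketch(entry, file);
  return entry; // mtime and sha1 are now populated
}
```

Entries that are never passed to such a function keep `mtime: null`, matching the behaviour described above.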

As a consequence, symlinks with an mtime of null both must and can be evaluated lazily. Their targets aren't read until they're accessed in #resolveSymlinkTargetToNormalPath in TreeFS. This does introduce an update on read, but it populates the symlink value lazily. It trades the async readlink call on all symlinks for a sync readlink call on the subset of symlinks that is actually accessed.
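The update-on-read behaviour can be illustrated with a small sketch. It does not reproduce #resolveSymlinkTargetToNormalPath; `SymlinkNode`, `resolveTargetLazily`, `readlinkCalls`, and `demoLazySymlink` are hypothetical names, and the only real API used for the lookup is `fs.readlinkSync`.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical node shape; TreeFS stores symlinks differently.
interface SymlinkNode {
  mtime: number | null;
  target: string | null; // null: readlink has not run yet
}

// Counter used only to show how often the filesystem is hit.
let readlinkCalls = 0;

function resolveTargetLazily(node: SymlinkNode, absolutePath: string): string {
  if (node.target == null) {
    // Update on read: the first access performs one sync readlink
    // and caches the result on the node.
    readlinkCalls += 1;
    node.target = fs.readlinkSync(absolutePath);
  }
  return node.target;
}

// Usage: the link is created, but readlink runs only on first access.
function demoLazySymlink(): { first: string; second: string; calls: number } {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lazy-link-'));
  const real = path.join(dir, 'real.js');
  const link = path.join(dir, 'link.js');
  fs.writeFileSync(real, '');
  fs.symlinkSync(real, link);

  const node: SymlinkNode = { mtime: null, target: null };
  const first = resolveTargetLazily(node, link);
  const second = resolveTargetLazily(node, link); // served from the node
  return { first, second, calls: readlinkCalls };
}
```

A symlink that is never resolved never triggers a readlink at all, which is where the saving for isolated installations would come from.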

We've observed this to be an overall win, but the result could be influenced further by this commit: expo/expo@af67435. That commit could also be upstreamed, but it changes the getContent signature. The signature change can be avoided by awaiting first, but it seems like a better change overall than awaiting getContent unconditionally before plugins.

Changelog: [Internal] Lazily stat Node crawled files and lazily populate symlink targets

Test plan

  • Unit tests were added to demonstrate this
  • This can further be tested E2E in the linked Expo PR, although not in isolation (if necessary I can create a new, isolated branch with a pnpm patch for these changes)

@meta-cla meta-cla bot added the CLA Signed label Apr 11, 2026
@kitten kitten changed the title from "@kitten/feat/metro file map/defer stat and readlink" to "feat(metro-file-map): Lazily stat and populate symlinks for Node crawled file trees" Apr 11, 2026
kitten added 4 commits April 11, 2026 01:20
This is needed because we won't mark unvisited files as changed anymore.
This causes them to not be read or processed. We could exclude symlinks
from this, but the lazy symlinking makes sense in conjunction with the
lazy lstat change.

This is a trade-off: we skip the async readlinks on startup and instead
run them synchronously on access. However, they're skipped entirely if
the symlink is never accessed, which can be beneficial in workspaces
with isolated dependency installations. In most cases, this is unlikely
to impact performance as much as eagerly populating symlinks, as the
previous code seems to mostly assume a low number of symlinks anyway
(which isn't true for isolated installations).
@kitten kitten force-pushed the @kitten/feat/metro-file-map/defer-stat-and-readlink branch from ad0b094 to 3c2d74c April 11, 2026 00:21
@facebook-github-tools facebook-github-tools bot added the Shared with Meta label Apr 11, 2026
@kitten kitten changed the title from "feat(metro-file-map): Lazily stat and populate symlinks for Node crawled file trees" to "feat(metro-file-map): Lazily stat files and populate symlinks for Node crawled file trees" Apr 11, 2026

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Shared with Meta Applied via automation to indicate that an Issue or Pull Request has been shared with the team.
