@ramblinghermit0403
Hey! I ran into EPERM: operation not permitted errors while running MemoryBench on Windows, specifically when the system saves the checkpoint file during a run (or when the run is stopped, which triggers a final save). It looks like fs.renameSync sometimes conflicts with file locks (likely antivirus software or the OS briefly holding onto file handles).

I fixed this by:

- Wrapping the atomic renameSync operation in a retry loop.
- Adding exponential backoff to handle transient EBUSY or EPERM locks, which are common on Windows.

This preserves the safety of atomic writes (vital for crash resilience) while making it robust against Windows file locking issues.
On Linux/macOS, the rename succeeds on the first try, so there is no performance impact.
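
For reviewers, here is a minimal sketch of the retry-with-backoff idea described above. It is not the exact code in this PR: the names `renameWithRetry`, `saveCheckpoint`, and `sleepSync`, the attempt limit of 5, the 50 ms base delay, and the `.tmp` suffix are all assumptions chosen for illustration. Only `fs.renameSync`, `fs.writeFileSync`, and `Atomics.wait` are real APIs.

```typescript
import * as fs from "node:fs";

// Error codes treated as transient file locks (per the description above).
const TRANSIENT_CODES = new Set(["EPERM", "EBUSY"]);

// Block the current thread for `ms` milliseconds without busy-waiting.
function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

type RenameFn = (from: string, to: string) => void;

// Hypothetical helper: retries the rename with exponential backoff and
// returns the number of attempts it took. The `rename` parameter exists
// only so the behavior can be exercised without a real file lock.
function renameWithRetry(
  from: string,
  to: string,
  maxAttempts = 5, // assumed limit
  baseDelayMs = 50, // assumed base delay
  rename: RenameFn = fs.renameSync,
): number {
  for (let attempt = 1; ; attempt++) {
    try {
      rename(from, to);
      return attempt;
    } catch (err) {
      const code = (err as { code?: string }).code;
      const transient = code !== undefined && TRANSIENT_CODES.has(code);
      // Rethrow anything that is not a transient lock, or the final failure.
      if (!transient || attempt >= maxAttempts) throw err;
      // Wait 50 ms, 100 ms, 200 ms, ... before the next attempt.
      sleepSync(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

// Hypothetical checkpoint save: write to a temp file, then rename it over
// the target so readers never observe a half-written checkpoint.
function saveCheckpoint(filePath: string, data: unknown): void {
  const tmpPath = `${filePath}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(data));
  renameWithRetry(tmpPath, filePath);
}
```

With these assumed defaults, a rename that keeps failing waits at most 50 + 100 + 200 + 400 = 750 ms in total before the error is rethrown. The sleep is synchronous to match renameSync, so it blocks the thread for that time. Errors other than EPERM and EBUSY are rethrown immediately.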
