-
Notifications
You must be signed in to change notification settings - Fork 860
feat: marimo edit local static notebooks
#7474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
@crivaronicolini this code looks good. the local vs remote was maybe a bad abstraction and makes this a bit difficult. i think really what local vs remote is: already-a-marimo-file-on-your-computer vs needs-to-be-downloaded-or-written-as-a-tmp-marimo-file so given that, I wonder if we can change the |
|
Thanks for taking a look! I also tried to let Let me know if I should try again with the |
|
@crivaronicolini - it looks like the change fails this test https://github.com/marimo-team/marimo/actions/runs/20143407972/job/57817253583?pr=7474, (the other errors are unrelated) would you mind taking another crack at the previous way? i think if the |
tests/_cli/test_file_path.py
Outdated
| temp_dir_obj.cleanup() | ||
|
|
||
| @patch("marimo._utils.requests.get") | ||
| def test_validate_remote_html_not_notebook(self, mock_get): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def test_validate_remote_html_not_notebook(self, mock_get): | |
| def test_validate_remote_html_is_notebook(self, mock_get): |
Thoughts? Looks good though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR enables editing of local HTML notebooks (static notebooks with embedded Python code) by extending the StaticNotebookReader to handle local .html files in addition to remote URLs. This allows users to edit presentation notebooks directly without needing the original Python file.
Key Changes
- Extended
StaticNotebookReaderto read local.htmlfiles with embedded marimo code - Updated regex patterns to handle whitespace between marimo-code tags and Python content
- Modified
LocalFileHandlerto process.htmlfiles usingStaticNotebookReaderbefore general validation
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| marimo/_cli/file_path.py | Added DEFAULT_FILENAME constant, updated regex patterns for whitespace handling, extended StaticNotebookReader to read local HTML files, added HTML file handling in LocalFileHandler, and refactored temp file creation |
| tests/_cli/test_file_path.py | Removed old test_static_notebook_reader, added comprehensive TestStaticNotebooks class with tests for local/remote HTML files and validation scenarios |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| CODE_TAG = r"marimo-code" | ||
| CODE_REGEX = re.compile(r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>(.*?)<") | ||
| CODE_REGEX = re.compile( | ||
| r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<" |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex pattern may not correctly match multiline Python code within the marimo-code tags. The pattern uses .*? which doesn't match newlines by default. Add the re.DOTALL flag to ensure the pattern matches code spanning multiple lines. This would be: re.compile(r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<", re.DOTALL)
| r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<" | |
| r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<", | |
| re.DOTALL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine since we generate the code- but always reminds me of this: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
No multiple lines in this cases (all encoded)
| @staticmethod | ||
| def _create_tmp_file_from_content( | ||
| content: str, name: str, temp_dir: TemporaryDirectory[str] | ||
| ) -> str: | ||
| LOGGER.info("Creating temporary file") | ||
| path_to_app = Path(temp_dir.name) / name | ||
| path_to_app.write_text(content, encoding="utf-8") | ||
| LOGGER.info("App saved to %s", path_to_app) | ||
| return str(path_to_app) |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _create_tmp_file_from_content method is duplicated between LocalFileHandler and RemoteFileHandler with identical implementations. Consider extracting this to a shared utility function or a common base class method to reduce code duplication and improve maintainability.
| # Read a local Static Notebook | ||
| if url.endswith(".html"): | ||
| file_contents = Path(url).read_text(encoding="utf-8") | ||
| return ( | ||
| StaticNotebookReader.CODE_TAG in file_contents, | ||
| file_contents, | ||
| ) |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding path traversal validation for local HTML file paths before reading them. While this is likely handled by the Path object's normal resolution, explicit validation using path normalization and checking for suspicious patterns (e.g., "..") would provide defense in depth against potential security issues.
| content, filename = reader.read(name) | ||
| path_to_app = self._create_tmp_file_from_content( | ||
| content, filename, temp_dir | ||
| ) | ||
| return path_to_app, temp_dir |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extracted filename is passed directly to create a path without validation. If the filename is an absolute path or contains path separators, the Path / operator may not behave as expected (e.g., Path(temp_dir) / "/absolute/path" returns "/absolute/path"). Consider using only the basename of the extracted filename with os.path.basename() or Path(filename).name to ensure the file is created within the temp directory.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
dmadisetti
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot seems rather concerned about path injection, there's a bit of hardening we can do around the general function, but the implementation, as is, is fine (and the code is wrong elsewhere too)
| CODE_TAG = r"marimo-code" | ||
| CODE_REGEX = re.compile(r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>(.*?)<") | ||
| CODE_REGEX = re.compile( | ||
| r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine since we generate the code- but always reminds me of this: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
No multiple lines in this cases (all encoded)
📝 Summary
This PR adds a way to edit local html notebooks as proposed in #7327. This allows for easier sharing of notebooks: with this feature the notebooks used for presentation (with cell outputs) can be edited without access to the original Python file.
🔍 Description of Changes
The changes involved mainly repurposing the existing
StaticNotebookReaderas a new file handler that was placed first on the chain of handlers that try to open a file. The new handler now handles every html file regardless of origin, and it raises a Click exception when it's unable to handle the html file. This was required to prevent these files from being read by theLocalFileHandlerand breaking the app. The alternative would have been to edit every handler and local reader to ignore html files.I also modified the regex used to find the actual code inside the
<marimo-code>tags so that it can handle whitespace between the tags and the Python code. The previous version was failing on some test files downloaded from Marimo.Finally, I had also modified the
StaticNotebookReader.readfunction to reuse the file downloaded in thecan_readfunction. This way would save a redundant download. In the end I decided to keep it as it was because it changed thereadfunction signature (namewas not needed). Let me know what you think about this and if I should make the change anyways.📋 Checklist