Skip to content

Conversation

@crivaronicolini
Copy link
Contributor

📝 Summary

This PR adds a way to edit local html notebooks as proposed in #7327. This allows for easier sharing of notebooks: with this feature the notebooks used for presentation (with cell outputs) can be edited without access to the original Python file.

🔍 Description of Changes

The changes involved mainly repurposing the existing StaticNotebookReader as a new file handler that was placed first on the chain of handlers that try to open a file. The new handler now handles every html file regardless of origin, and it raises a Click exception when it's unable to handle the html file. This was required to prevent these files from being read by the LocalFileHandler and breaking the app. The alternative would have been to edit every handler and local reader to ignore html files.

I also modified the regex used to find the actual code inside the <marimo-code> tags so that it can handle whitespace between the tags and the Python code. The previous version was failing on some test files downloaded from Marimo.

Finally, I had also modified the StaticNotebookReader.read function to reuse the file downloaded in the can_read function. This way would save a redundant download. In the end I decided to keep it as it was because it changed the read function signature (name was not needed). Let me know what you think about this and if I should make the change anyways.

📋 Checklist

  • I have read the contributor guidelines.
  • For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
  • I have added tests for the changes made.
  • I have run the code and verified that it works as expected.

@vercel
Copy link

vercel bot commented Dec 11, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
marimo-docs Ready Ready Preview, Comment Dec 15, 2025 11:05pm

@mscolnick
Copy link
Contributor

@crivaronicolini this code looks good. the local vs remote was maybe a bad abstraction and makes this a bit difficult.

i think really what local vs remote is: already-a-marimo-file-on-your-computer vs needs-to-be-downloaded-or-written-as-a-tmp-marimo-file

so given that, I wonder if we can change the RemoveFileHandler can_handle to be able to handle local html files. or we let LocalFileHandler also create new tempdirs/files

@crivaronicolini
Copy link
Contributor Author

Thanks for taking a look!
Yeah, that was the first thing I tried, but to handle local html files in RemoteFileHandle I had to bypass html files in LocalFileHandler and LocalFileReader and it seemed messy.

I also tried to let LocalFileHandler handle the html files but changed my mind when I realised it had to use StaticNotebookReader directly and then create temp copies. It was starting to look too similar to RemoteFileHandler.
In the end it seemed neater to handle every html file in a single handler and raise an exception if we can't handle the file.

Let me know if I should try again with the LocalFileHandler anyways.

@mscolnick
Copy link
Contributor

@crivaronicolini - it looks like the change fails this test https://github.com/marimo-team/marimo/actions/runs/20143407972/job/57817253583?pr=7474, (the other errors are unrelated)

would you mind taking another crack at the previous way? i think if the can_handle has more logic than is_url, then maybe it will work? not is_url(file) and not file.endswith("html").

temp_dir_obj.cleanup()

@patch("marimo._utils.requests.get")
def test_validate_remote_html_not_notebook(self, mock_get):
Copy link
Collaborator

@dmadisetti dmadisetti Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_validate_remote_html_not_notebook(self, mock_get):
def test_validate_remote_html_is_notebook(self, mock_get):

Thoughts? Looks good though

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enables editing of local HTML notebooks (static notebooks with embedded Python code) by extending the StaticNotebookReader to handle local .html files in addition to remote URLs. This allows users to edit presentation notebooks directly without needing the original Python file.

Key Changes

  • Extended StaticNotebookReader to read local .html files with embedded marimo code
  • Updated regex patterns to handle whitespace between marimo-code tags and Python content
  • Modified LocalFileHandler to process .html files using StaticNotebookReader before general validation

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File Description
marimo/_cli/file_path.py Added DEFAULT_FILENAME constant, updated regex patterns for whitespace handling, extended StaticNotebookReader to read local HTML files, added HTML file handling in LocalFileHandler, and refactored temp file creation
tests/_cli/test_file_path.py Removed old test_static_notebook_reader, added comprehensive TestStaticNotebooks class with tests for local/remote HTML files and validation scenarios

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

CODE_TAG = r"marimo-code"
CODE_REGEX = re.compile(r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>(.*?)<")
CODE_REGEX = re.compile(
r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<"
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern may not correctly match multiline Python code within the marimo-code tags. The pattern uses .*? which doesn't match newlines by default. Add the re.DOTALL flag to ensure the pattern matches code spanning multiple lines. This would be: re.compile(r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<", re.DOTALL)

Suggested change
r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<"
r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<",
re.DOTALL

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine since we generate the code- but always reminds me of this: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

No multiple lines in this cases (all encoded)

Comment on lines +363 to +371
@staticmethod
def _create_tmp_file_from_content(
content: str, name: str, temp_dir: TemporaryDirectory[str]
) -> str:
LOGGER.info("Creating temporary file")
path_to_app = Path(temp_dir.name) / name
path_to_app.write_text(content, encoding="utf-8")
LOGGER.info("App saved to %s", path_to_app)
return str(path_to_app)
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _create_tmp_file_from_content method is duplicated between LocalFileHandler and RemoteFileHandler with identical implementations. Consider extracting this to a shared utility function or a common base class method to reduce code duplication and improve maintainability.

Copilot uses AI. Check for mistakes.
Comment on lines +177 to +183
# Read a local Static Notebook
if url.endswith(".html"):
file_contents = Path(url).read_text(encoding="utf-8")
return (
StaticNotebookReader.CODE_TAG in file_contents,
file_contents,
)
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding path traversal validation for local HTML file paths before reading them. While this is likely handled by the Path object's normal resolution, explicit validation using path normalization and checking for suspicious patterns (e.g., "..") would provide defense in depth against potential security issues.

Copilot uses AI. Check for mistakes.
Comment on lines +336 to +340
content, filename = reader.read(name)
path_to_app = self._create_tmp_file_from_content(
content, filename, temp_dir
)
return path_to_app, temp_dir
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extracted filename is passed directly to create a path without validation. If the filename is an absolute path or contains path separators, the Path / operator may not behave as expected (e.g., Path(temp_dir) / "/absolute/path" returns "/absolute/path"). Consider using only the basename of the extracted filename with os.path.basename() or Path(filename).name to ensure the file is created within the temp directory.

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Collaborator

@dmadisetti dmadisetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot seems rather concerned about path injection, there's a bit of hardening we can do around the general function, but the implementation, as is, is fine (and the code is wrong elsewhere too)

CODE_TAG = r"marimo-code"
CODE_REGEX = re.compile(r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>(.*?)<")
CODE_REGEX = re.compile(
r"<marimo-code\s+hidden(?:=['\"]{2})?\s*>\s*(.*?)\s*<"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine since we generate the code- but always reminds me of this: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

No multiple lines in this cases (all encoded)

@mscolnick mscolnick merged commit 61c9076 into marimo-team:main Dec 16, 2025
44 of 60 checks passed
@crivaronicolini crivaronicolini deleted the edit-html branch December 21, 2025 01:02
@Light2Dark Light2Dark added the enhancement New feature or request label Jan 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants