-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
[Fix] [TRL] load_lora for multi line llm.chat/generate #3696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Summary of ChangesHello @Datta0, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request focuses on enhancing the integration and compatibility between Unsloth and the TRL (Transformer Reinforcement Learning) library. It refines the process of loading LoRA models by improving the regex used for injecting Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request improves the patching mechanism for TRL by updating a regular expression to handle multi-line llm.chat and llm.generate calls. It also introduces new patching capabilities for openenv and adds patches to remove reload_weights calls, which are not needed for Unsloth's LoRA use case.
My review found a critical issue in the construction of the replacement string for the llm.chat/generate patching, which could lead to syntax errors and runtime failures. The rest of the changes look good and are consistent with the project's patching style.
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove reload_weights rpc call from grpo trainer * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Use regex instead of static string * patch openenv reload_weights call * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Better handle sleep and wakeup * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reset indentation * Handle multi line self.llm.chat better * Use logger * re-indent * Stricter regex to replace wildcard --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Daniel Han <danielhanchen@gmail.com>
This change from trl happened to break our code. The PR aims to fix it