Conversation

@CodingOnStar
Contributor

Summary

feat: implement step two of dataset creation with comprehensive UI components and hooks

  • Added new components for general chunking options, parent-child options, preview panel, and step two footer.
  • Introduced hooks for document creation, indexing configuration, indexing estimation, preview state, and segmentation state.
  • Created types for step two props and integrated them into the components.
  • Implemented escape and unescape utility functions for handling special characters.
  • Established a structured approach for managing the dataset creation workflow, enhancing user experience and functionality.

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran make lint and make type-check (backend) and cd web && npx lint-staged (frontend) to appease the lint gods

@CodingOnStar CodingOnStar marked this pull request as ready for review January 7, 2026 09:08
Copilot AI review requested due to automatic review settings January 7, 2026 09:08
@dosubot dosubot bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) and 💪 enhancement (New feature or request) labels Jan 7, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the step-two component of the dataset creation workflow by extracting logic into custom hooks and smaller components, improving maintainability and testability.

Key Changes:

  • Extracted state management into 5 custom hooks (useSegmentationState, usePreviewState, useIndexingConfig, useIndexingEstimate, useDocumentCreation)
  • Split monolithic component into 5 focused sub-components (GeneralChunkingOptions, ParentChildOptions, IndexingModeSection, PreviewPanel, StepTwoFooter)
  • Added comprehensive test coverage with 2185 lines of tests
  • Created escape/unescape utility functions for handling special characters

Reviewed changes

Copilot reviewed 15 out of 19 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| index.tsx | Refactored main component to use extracted hooks and sub-components |
| types.ts | Defined StepTwoProps interface |
| hooks/* | Implemented 5 custom hooks for state management and business logic |
| components/* | Created 5 focused sub-components for UI sections |
| escape.ts/unescape.ts | Utility functions for character escaping |
| index.spec.tsx | Comprehensive test suite covering hooks and components |

- Updated test case to clarify handling of complex strings without backslashes.
- Added new test case to document behavior for strings containing existing backslashes, highlighting the non-symmetrical nature of escape/unescape functions.
- Improved code readability and understanding of string manipulation scenarios.
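
To illustrate what "non-symmetrical" can mean for an escape/unescape pair, here is a minimal sketch. It is not the implementation from this PR: the names `escapeText`, `unescapeText`, and `ESCAPES` are hypothetical, and the sketch assumes an escape step that rewrites control characters but leaves existing backslashes alone. Under that assumption, a string that already contains a backslash followed by `n` does not survive a round trip.

```typescript
// Hypothetical sketch, not the code from this PR.
// Assumption: escaping rewrites control characters only and does NOT
// double existing backslashes.
const ESCAPES: Array<[string, string]> = [
  ['\n', '\\n'],
  ['\t', '\\t'],
  ['\r', '\\r'],
]

// Replace each raw control character with its two-character escape sequence.
function escapeText(input: string): string {
  let out = input
  for (const [raw, escaped] of ESCAPES) out = out.split(raw).join(escaped)
  return out
}

// Replace each two-character escape sequence with its raw control character.
function unescapeText(input: string): string {
  let out = input
  for (const [raw, escaped] of ESCAPES) out = out.split(escaped).join(raw)
  return out
}

// No backslashes in the input: the round trip returns the original string.
const plain = 'line one\nline two'
const plainRoundTrip = unescapeText(escapeText(plain))

// The input already holds a literal backslash followed by "n".
// escapeText leaves it untouched, but unescapeText reads it as a newline.
const withBackslash = 'path\\name'
const backslashRoundTrip = unescapeText(escapeText(withBackslash))

console.log(plainRoundTrip === plain)
console.log(backslashRoundTrip === withBackslash)
```

One common way to make such a pair symmetric is to escape the backslash itself before any other character and to unescape in a single left-to-right pass. Whether that is appropriate here depends on how the separator input is meant to be interpreted.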
Member

@WTW0313 WTW0313 left a comment

LGTM

@dosubot dosubot bot added the lgtm (This PR has been approved by a maintainer) label Jan 8, 2026
@CodingOnStar CodingOnStar merged commit 9848823 into main Jan 9, 2026
14 checks passed
@CodingOnStar CodingOnStar deleted the refactor/step-two branch January 9, 2026 02:21