TOC page numbers reset, spike unexpectedly, and are non-continuous in some cases #55

@salam59

Description

While generating the Table of Contents (TOC), the following issues have been observed:

  • Page numbers are correct up to a certain point, but then unexpectedly reset and start again from page 1.
  • Page numbers can be non-continuous, with gaps or skipped values.
  • In some cases, there is a sudden spike in page numbers (for example, jumping from a low value to a much higher one without a valid transition).

These issues lead to incorrect TOC mappings and unreliable section page ranges.

Proposed Solution

Add validation at each step of the TOC generation pipeline to ensure page number consistency. Specifically:

  • Validate that page numbers never decrease from one TOC entry to the next (monotonically non-decreasing, since several sections can start on the same page).
  • Detect and flag unexpected resets to page 1.
  • Identify non-continuous sequences.
  • Detect sudden spikes in page numbers beyond a reasonable threshold.
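
As a rough illustration of the checks listed above, here is a minimal Python sketch. It assumes the TOC has already been reduced to a plain list of integer page numbers in reading order. The names `Anomaly`, `validate_toc_pages`, and the `max_jump` threshold are hypothetical and not part of the existing pipeline. Ordinary gaps between section start pages are treated as legitimate here, and only jumps larger than `max_jump` are flagged, which is itself an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Anomaly:
    index: int      # position of the offending entry in the TOC
    kind: str       # "reset", "decrease", or "spike"
    prev_page: int  # page number of the previous entry
    page: int       # page number of the offending entry


def validate_toc_pages(pages: Sequence[int], max_jump: int = 50) -> List[Anomaly]:
    """Flag resets, decreases, and spikes between consecutive TOC entries.

    max_jump is an assumed threshold; tune it for the documents at hand.
    """
    anomalies: List[Anomaly] = []
    for i in range(1, len(pages)):
        prev, cur = pages[i - 1], pages[i]
        if cur < prev:
            # A drop back to page 1 is reported separately from other decreases.
            kind = "reset" if cur == 1 else "decrease"
            anomalies.append(Anomaly(i, kind, prev, cur))
        elif cur - prev > max_jump:
            anomalies.append(Anomaly(i, "spike", prev, cur))
    return anomalies


if __name__ == "__main__":
    for a in validate_toc_pages([1, 5, 12, 1, 4, 300, 310]):
        print(a)
```

Returning a list of anomalies instead of raising on the first one lets each pipeline step decide for itself whether to correct, fall back, or report.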

Based on validation results, apply correction logic, fallback strategies, or error reporting as appropriate.
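
One possible fallback, sketched below under the same assumption of a plain list of integer page numbers, is to split the TOC into runs wherever the numbering goes backwards, so that each run is internally consistent and can be mapped to page ranges on its own. The function name `split_at_resets` is hypothetical, and this is only one of several strategies the pipeline could adopt.

```python
from typing import List, Sequence


def split_at_resets(pages: Sequence[int]) -> List[List[int]]:
    """Split page numbers into non-decreasing runs.

    A new run starts whenever a page number is lower than the one before it,
    for example when numbering restarts at page 1.
    """
    runs: List[List[int]] = []
    for page in pages:
        if runs and page >= runs[-1][-1]:
            runs[-1].append(page)   # continues the current run
        else:
            runs.append([page])     # first entry, or numbering went backwards
    return runs


if __name__ == "__main__":
    print(split_at_resets([1, 5, 12, 1, 4, 9]))
```

Each run could then be offset or reported separately, depending on whether the reset reflects a real restart in the document (such as front matter versus body) or an extraction error.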

Expected Outcome

  • TOC page numbers remain continuous and logically consistent.
  • Anomalies in page numbering are detected early and handled gracefully.
  • Improved reliability of TOC-based page range extraction.
