-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Replace ESRI geometry library with JTS in geospatial plugin #27881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
dain
wants to merge
15
commits into
master
Choose a base branch
from
user/dain/geo-jts
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+3,145
−3,211
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
b4e8c12 to
aad9700
Compare
fe6e6cd to
b3494aa
Compare
Fix test data that was accepted by ESRI but rejected by JTS which strictly enforces the OGC Simple Features Specification: - Close polygon rings (first point must equal last point) - Fix single-point LINESTRING to have two points (minimum required) - Fix MULTILINESTRING EMPTY syntax (remove extra parentheses) - Replace invalid MULTIPOLYGON with overlapping polygons using ST_Union - Replace degenerate polygons in GEOMETRYCOLLECTION with valid geometries
Adds assertSpatialEquals helper to TestGeoFunctions that uses stEquals for geometry comparison. Converts testSTGeometryType and testSTBuffer to use the new helper. testSTBuffer was updated to use property-based assertions (ST_Envelope and ST_Area with tolerance) instead of exact WKT coordinate matching. This makes the tests stable across CPU architectures (ARM vs x86) where trigonometric functions can produce slightly different floating-point results.
Migrate simple geometry functions to use JTS library. Test updates for behavior differences: - ST_Boundary returns LINESTRING instead of MULTILINESTRING for simple polygons - ST_Buffer with infinity returns POLYGON EMPTY instead of MULTIPOLYGON EMPTY - Minor floating-point precision differences in some calculations
Migrate ST_NumPoints and related accessor functions to JTS. Test updates for behavior differences: - ST_NumPoints now counts closing vertices in polygons per OGC standard - Ring vertex ordering may differ cosmetically (same geometry)
Add JTS-compatible overloads for geometry utility methods to support incremental migration from ESRI to JTS. The ESRI versions remain for existing callers until they are converted.
Rewrite stUnion to use JTS UnaryUnionOp instead of ESRI cursors. Behavior differences: - Point-on-line union does not insert vertices - Empty inputs return empty geometry collection instead of null
- Migrate spatial join operator to JTS for intersection and containment tests - Switch GeoFunctions envelope operations to use JTS Envelope (deserializeEnvelope, ST_XMin/XMax/YMin/YMax, ST_IsEmpty)
Use Extended Well-Known Binary (EWKB) format for geometry serialization. EWKB is the standard used by PostGIS and retains the SRID (Spatial Reference System Identifier) for coordinate system information.
Note: TestEsriTable's expected values file was converted from Trino's old internal binary format to WKT. This change cannot be separated into an earlier commit because the old format's deserializer was deleted in the EWKB commit, and circular Maven dependencies prevent adding geospatial as a test dependency to trino-hive.
With ESRI removed JTS objects no longer need fully qualified names
Change the internal representation of geometry values to use JTS Geometry objects directly, avoiding unnecessary serialization cycles between function calls.
b3494aa to
9bec6d6
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Migrate the geospatial plugin from ESRI geometry-api-java to JTS (Java Topology Suite) as the core geometry library. JTS is more widely used, better maintained, and provides the foundation for upcoming Iceberg geometry type support.
Key changes:
Geometryas the native stack type instead of serialized bytesAdditional context and related issues
JTS is the de facto standard geometry library in the Java ecosystem, used by GeoTools, PostGIS, and Apache Sedona. This change aligns Trino's geospatial implementation with the broader ecosystem and enables future improvements like Iceberg geometry support.
Reviewer Guide
Commit Organization
The migration is structured to minimize risk and make review manageable:
like unclosed rings that JTS correctly rejects) from Refactors (adding
assertSpatialEqualstohandle harmless differences like vertex ordering). Any test changes in subsequent commits indicate
actual behavior differences between ESRI and JTS.
the PostGIS standard that preserves SRID.
were needed while both libraries coexisted. Please do not comment on verbose imports in earlier commits.
Sliceto JTSGeometryas the native stack type,eliminating serialization cycles between function calls.
Behavior Differences
Test changes document intentional behavior differences:
default, whereas ESRI used 24. This results in buffer polygons with fewer vertices (e.g., 32 vs 96 for
a point buffer) while maintaining standard precision.
triangle has 4 points: A-B-C-A).
LINESTRINGinstead ofMULTILINESTRINGfor simple polygons (simpler, morestandard return type).
ST_Buffer(infinity)returnsPOLYGON EMPTY(instead ofMULTIPOLYGON EMPTY).GEOMETRYCOLLECTIONinstead ofnullfor empty inputs, improvingnull-safety in downstream SQL.
ST_CentroidandST_Areadue to differentunderlying math implementations (verified via tolerance checks).
ST_Points.Release notes
(x) Release notes are required, with the following suggested text: