Geo Search & Clustering

Geographical objects and shapes are supported in ES through these field types:
- GEO POINTS → latitude/longitude pairs or geo hashes. Primarily used for:
- finding locations within a bounding box, within a certain distance of a central point, within a polygon, or within a geo_shape query,
- aggregating locations geographically or by distance from a central point,
- integrating distance into a document's relevance score,
- and sorting documents by distance.
- GEO SHAPES → select GeoJSON and WKT entities. Used for:
- filtering geo_shape documents using the spatial relations INTERSECTS, DISJOINT, WITHIN, or CONTAINS
- filtering
- SHAPES → arbitrary x, y cartesian shapes.
The third dimension (z-value) in any of the above field types is also accepted but only the latitude and longitude values will be indexed — the third dimension is ignored!
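To make the three field types concrete, here is a minimal sketch of an index mapping and a distance-filtered, distance-sorted search body, written as plain Python dictionaries. The field names (location, boundary, floor_plan), the coordinates, and the 10 km radius are hypothetical and chosen only for illustration; the bodies would still need to be sent to a cluster with whichever client you use.

```python
import json

# Hypothetical field names; only the "type" values come from Elasticsearch itself.
mapping = {
    "mappings": {
        "properties": {
            "location": {"type": "geo_point"},   # latitude/longitude pair or geohash
            "boundary": {"type": "geo_shape"},   # GeoJSON or WKT geometry
            "floor_plan": {"type": "shape"},     # arbitrary x, y cartesian geometry
        }
    }
}

# Illustrative central point (assumed values, not from the text).
center = {"lat": 40.7128, "lon": -74.0060}

# Keep only documents within 10 km of the centre, then sort by distance.
search_body = {
    "query": {
        "bool": {
            "filter": {
                "geo_distance": {"distance": "10km", "location": center}
            }
        }
    },
    "sort": [
        {"_geo_distance": {"location": center, "order": "asc", "unit": "km"}}
    ],
}

print(json.dumps(mapping, indent=2))
print(json.dumps(search_body, indent=2))
```

Placing geo_distance inside a bool filter keeps it out of scoring, which is usually what you want for a plain "within X km" restriction.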
People often ask how many coordinate decimal places they should store.
Let's clarify that by borrowing from this GIS Stack Exchange thread and this answer by whuber:
- Accuracy is the tendency of your measurements to agree with the true values.
- Precision is the degree to which your measurements pin down an actual value.
I'm not going to get into the accuracy part and will rather focus on precision — here's a summary of what each digit in a decimal degree signifies:
- The sign tells us whether we are north or south, east or west on the globe (+ N/E, - S/W).
- A nonzero hundreds digit tells us we're using longitude, not latitude! See below — the x-axis represents the longitude, the y-axis the latitude:
          North (+90)
               |
               |
(-180) West ---+--- East (+180)
               |
               |
          South (-90)
- The tens digit gives a position to about 1,000 kilometers. It gives us useful information about what continent or ocean we are on.
- The units digit (one decimal degree) gives a position up to 111 kilometers (~69 miles). It can tell us roughly what large state or country we are in.
- The first decimal place is worth up to 11.1 km: it can distinguish the position of one large city from a neighboring large city.
- The second decimal place is worth up to 1.1 km: it can separate one village from the next.
- The third decimal place is worth up to 110 m: it can identify a large agricultural field or institutional campus.
- The fourth decimal place is worth up to 11 m: it can identify a parcel of land. It is comparable to the typical accuracy of an uncorrected GPS unit with no interference.
- The fifth decimal place is worth up to 1.1 m: it can distinguish trees from each other. Accuracy to this level with commercial GPS units can only be achieved with differential correction.
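The figures in the list above can be reproduced with a few lines of arithmetic. The sketch below assumes a round value of about 111.32 km per degree and a spherical Earth, so the output is approximate. The helper name decimal_place_resolution_m is made up for this example. It also shows why the list says "up to": a degree of longitude shrinks with the cosine of the latitude, while a degree of latitude stays roughly constant.

```python
import math

# Approximate length of one degree of latitude (and of longitude at the equator).
KM_PER_DEGREE = 111.32


def decimal_place_resolution_m(places, latitude_deg=0.0):
    """Return (north_south_m, east_west_m) covered by one unit in the given decimal place."""
    step_deg = 10.0 ** -places
    north_south = step_deg * KM_PER_DEGREE * 1000.0
    # East-west distance shrinks towards the poles.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for places in range(0, 6):
    ns, ew = decimal_place_resolution_m(places, latitude_deg=45.0)
    print(f"{places} decimal places: ~{ns:,.2f} m north-south, ~{ew:,.2f} m east-west at 45 degrees")
```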
I round my coordinates any chance I get, and rarely use more than 5 decimal places. Unnecessary decimal places quickly add up in bandwidth and transfer costs, especially when you're storing millions of geo entities and running strongly geo-focused products and services.
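As a rough illustration of the savings, the following sketch rounds a point to 5 decimal places before serialising it and compares the JSON payload sizes. The helper name round_point and the sample coordinates are assumptions made for this example.

```python
import json


def round_point(lat, lon, places=5):
    """Round a latitude/longitude pair to the given number of decimal places."""
    return {"lat": round(lat, places), "lon": round(lon, places)}


# Sample point with far more digits than any consumer GPS can justify.
raw = {"lat": 52.520008123456789, "lon": 13.404953987654321}
rounded = round_point(raw["lat"], raw["lon"])

raw_bytes = len(json.dumps(raw).encode("utf-8"))
rounded_bytes = len(json.dumps(rounded).encode("utf-8"))

print(f"raw:     {json.dumps(raw)} ({raw_bytes} bytes)")
print(f"rounded: {json.dumps(rounded)} ({rounded_bytes} bytes)")
```

The per-point difference is small, but it is multiplied by every document and every response that carries coordinates.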
As to Elasticsearch's own precision: there is some index-time coordinate rounding too, but the rounding errors translate to less than 1 cm even at the equator, which is perfectly fine for most use cases.
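A back-of-the-envelope check of that figure, under the assumption that each coordinate is quantised to a 32-bit integer (latitude over a 180 degree range, longitude over a 360 degree range). That encoding is an assumption of this sketch rather than something stated above, and the 111.32 km per degree constant is approximate.

```python
# Assumed quantisation: 2**32 discrete steps per coordinate axis.
LAT_STEP_DEG = 180.0 / 2**32
LON_STEP_DEG = 360.0 / 2**32

# Approximate metres per degree at the equator, where longitude error is largest.
METERS_PER_DEGREE = 111_320.0

lat_error_cm = LAT_STEP_DEG * METERS_PER_DEGREE * 100.0
lon_error_cm = LON_STEP_DEG * METERS_PER_DEGREE * 100.0

print(f"latitude step:  {LAT_STEP_DEG:.3e} degrees, about {lat_error_cm:.2f} cm")
print(f"longitude step: {LON_STEP_DEG:.3e} degrees, about {lon_error_cm:.2f} cm at the equator")
```

Under these assumptions both worst-case errors stay below one centimetre, which is consistent with the claim above.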
Let's visualise the conventional cardinal directions again: