Geohash Explained
Geohash is a public-domain location encoding invented by Gustavo Niemeyer in February 2008. It interleaves latitude and longitude bits and base32-encodes the result, producing strings like 'dr5ru6gh' that identify rectangular cells of decreasing size as length grows. The 32-character alphabet (0-9 b-z excluding a/i/l/o) is chosen to avoid confusion. Key property: lexicographic prefix containment — a shorter geohash contains all longer geohashes that start with it. Used as the geo-index primitive in Elasticsearch, MongoDB, Redis, PostGIS, Solr, and many spatial databases. The article covers the algorithm, precision table, the meridian/antimeridian edge cases, and the database-integration patterns.
By Steve K.. Published . Last updated .
Geohash is the developer-favored location encoding: less human-readable than Plus Codes or what3words, but extraordinarily useful as a database index. Most modern spatial databases include geohash as a built-in feature. This article opens with the algorithm and precision table, moves through the prefix property, covers the database-integration patterns, and notes the edge cases.
Companion to /learn/coordinate-formats-explained, /learn/plus-codes-explained, and /learn/what3words-explained.
What a geohash is
A geohash is a short alphanumeric string identifying a rectangular cell on Earth's surface. The cell size depends on the string length: longer strings = smaller cells = more precise location.
Examples:
dr5 Times Square area (~156 km × 156 km)
dr5r Manhattan (~39 km × 19 km)
dr5ru Midtown Manhattan (~4.9 km × 2.4 km)
dr5ru6 Midtown West (~1.2 km × 0.6 km)
dr5ru6g Around Penn Station (~153 m × 76 m)
dr5ru6gh A specific block (~38 m × 19 m)
dr5ru6gh1 A specific building (~4.8 m × 4.8 m)
The cells are rectangular, not square — but they approach square for longer strings because the latitude/longitude bits alternate.
Invention and release
Geohash was created by Gustavo Niemeyer in February 2008, while he was working on a location-based service. He released the algorithm into the public domain via the website geohash.org, which allowed lookups and encode/decode operations.
The original geohash.org site is no longer actively maintained but the algorithm has spread far beyond it. Niemeyer described the work as inspired by the challenge of having a short, URL-friendly location identifier — the design parallels are clear with URL-shortener algorithms.
The public-domain release is significant: unlike proprietary alternatives, anyone can implement geohash without licensing. Open-source libraries exist in every major language, and the algorithm has been embedded in dozens of databases.
The algorithm (high level)
The encoding procedure:
- Latitude bisection: divide the latitude range
[-90°, +90°] in half. If the target latitude is in
the upper half, output bit
1; lower half, output0. Recursively refine the chosen half. - Longitude bisection: divide the longitude range
[-180°, +180°] in half. If the target longitude is
in the right half, output bit
1; left half, output0. Recursively refine the chosen half. - Interleave: alternate between latitude and longitude bits, starting with longitude.
- Base32 encode: group the bit stream into groups of 5 bits, encode each group using the 32-character alphabet.
The alphabet:
0 1 2 3 4 5 6 7 8 9 b c d e f g h j k m n p q r s t u v w x y z
Notably excluding a, i, l, o to avoid confusion
with digits or each other. The alphabet is identical to
the Base32hex standard (RFC 4648 Section 7) but
with letters in different positions — they're not
interchangeable encodings.
The decoding procedure reverses the encoding: base32 to bits, deinterleave to lat/lon, recover the bisection ranges to determine the cell.
Precision table
| Length | Lat error | Lon error | Cell size | Use case | | ------ | --------- | --------- | --------- | -------- | | 1 | ±23° | ±23° | ~5,000 km × 5,000 km | Continental | | 2 | ±2.8° | ±5.6° | ~625 km × 1,250 km | Country | | 3 | ±0.70° | ±0.70° | ~156 km × 156 km | Region | | 4 | ±0.087° | ±0.18° | ~19.5 km × 39 km | Large city | | 5 | ±0.022° | ±0.022° | ~4.9 km × 4.9 km | City district | | 6 | ±0.0027° | ±0.0055° | ~610 m × 1,220 m | Neighborhood | | 7 | ±0.00068° | ±0.00068° | ~153 m × 153 m | Block | | 8 | ±0.0000086° | ±0.00017° | ~19 m × 38 m | Building | | 9 | ±0.000021° | ±0.000021° | ~4.8 m × 4.8 m | Address | | 10 | ±0.0000027° | ±0.0000054° | ~0.6 m × 1.2 m | Surveying | | 11 | ±0.00000067° | ±0.00000067° | ~15 cm × 15 cm | Centimeter | | 12 | ±0.000000084° | ±0.00000017° | ~1.9 cm × 3.7 cm | Sub-cm |
Each additional character adds 5 bits to the binary representation. Even-numbered character positions (2, 4, 6, ...) preserve aspect ratio; odd-numbered positions add one extra longitude bit (making cells more “tall” rectangular) or latitude bit (more “wide”) depending on the algorithm convention.
The prefix-containment property
The defining feature of geohash: a shorter geohash is a container of all longer geohashes that start with it.
dr5
├── dr50
├── dr51
├── dr52
├── ...
├── dr5r
│ ├── dr5r0
│ ├── dr5r1
│ ├── ...
│ ├── dr5ru
│ │ ├── dr5ru0
│ │ ├── ...
│ │ ├── dr5ru6
│ │ │ ├── dr5ru6g
│ │ │ ├── ...
│ │ │ ├── dr5ru6gh
│ │ │ ...
This nested-prefix property maps spatial containment to string-prefix matching:
- “Find all points in the cell ‘dr5’”
→ SQL:
WHERE geohash LIKE 'dr5%' - “Find all points in ‘dr5ru6gh’”
→
WHERE geohash LIKE 'dr5ru6gh%'
This is why geohash is the natural spatial-index primitive for any database with B-tree or trie-style indexes: prefix queries are fast. Specialized geospatial indexes (R-trees, quadtrees) exist for complex shape queries, but geohash gives you point-in-cell queries with a regular text index.
Worked example
Empire State Building (40.7484° N, 73.9857° W):
Encoding step-by-step (sketch):
- Latitude 40.7484° is in upper half of [-90°, 90°] → bit
1. - Longitude -73.9857° is in left half of [-180°, 180°] → bit
0. - Bit sequence so far:
10. - Continue refining and interleaving.
- After 35 bits (7 characters):
dr5ru6g≈ Penn Station area. - After 40 bits (8 characters):
dr5ru6gh≈ specific block. - After 50 bits (10 characters):
dr5ru6gh1z≈ ~0.6 m precision.
For the Empire State Building specifically, an 8-character
geohash like dr5ru6g7 (the actual value depends on the
precise lat/lon and grid alignment) identifies the
building.
Database integration
The dominant use of geohash is as a database index key. The integration pattern is similar across implementations:
Elasticsearch
PUT /locations/_mapping
{
"properties": {
"location": {
"type": "geo_point"
}
}
}
Internally, Elasticsearch indexes geo_point as a
geohash at multiple precision levels, enabling efficient
range queries. The geohash_grid aggregation buckets
documents by their geohash prefix:
GET /locations/_search
{
"aggs": {
"by_hash": {
"geohash_grid": {
"field": "location",
"precision": 5
}
}
}
}
MongoDB
db.locations.createIndex({ "location": "2dsphere" })
The 2dsphere index uses a custom geohash variant
internally. Standard MongoDB GeoJSON queries
($geoWithin, $near, etc.) leverage the index for
fast lookup.
Redis
GEOADD locations -73.9857 40.7484 "empire-state-building"
GEORADIUS locations -73.9857 40.7484 100 m
Redis GEO commands use a 52-bit integer geohash internally. The 52 bits give ~0.6 m precision globally. Redis exposes only a thin GEO API over this internal indexing; the geohash bits are not directly user-facing.
PostgreSQL with PostGIS
CREATE INDEX idx_geohash ON locations USING gist (ST_GeoHash(location, 9));
PostGIS provides ST_GeoHash and ST_Box2dFromGeoHash
functions. The first computes a geohash from a geometry;
the second creates a bounding box from a geohash string.
PostGIS also has dedicated spatial index types (GiST,
SP-GiST, BRIN) that don't require geohash, but
geohash remains useful for sharded or distributed
PostgreSQL deployments.
Cassandra and other distributed databases
In Cassandra and similar distributed databases, geohash is often used as a partition key: data with the same geohash prefix lives on the same shard, supporting locality-aware queries. The 5-character prefix is a typical sharding granularity.
CDN edge routing
Some CDNs use geohash for edge routing: requests are routed to the nearest edge server based on the client's geohash. This is a form of geolocation-based load balancing that maps directly onto the prefix-containment property.
Edge cases and limitations
The meridian problem
Two points 1 meter apart on opposite sides of the equator or the antimeridian may have geohashes that share no prefix at all:
Just north of equator: s000000... or w000000...
Just south of equator: k000000... or e000000...
The first character is completely different despite the points being 1 m apart. For queries that find “all points within distance X” from a target point, applications must compute the geohashes of all neighbor cells (typically 8 neighbors plus the cell itself) and union the results.
Geohash libraries provide neighbor-cell computation: given a geohash, return the 8 neighboring cells. The union of point-in-cell queries over these 9 cells covers the expected query region.
Arbitrary regions
Geohash cells are axis-aligned rectangles. Arbitrary shapes (circles, polygons) cannot be represented exactly. Approximation: cover the shape with a union of geohashes at varying precisions. Libraries provide algorithms for “covering” a shape with a minimum-cell-count union.
Coordinate system
Geohash uses WGS 84 latitude/longitude. Applications using other geodetic datums must transform to WGS 84 before encoding. See /learn/wgs84-explained.
Cell-edge sensitivity
A point exactly on a cell boundary is ambiguous — which cell does it belong to? The convention is that the lower/left boundary is inclusive, the upper/right exclusive. Different implementations may handle boundaries differently; production systems should normalize input to a known precision before encoding.
Geohash variants
Geohash-36
A variant by Pier Aron using a 36-character alphabet (adding letters that geohash excluded). Each character encodes 5.17 bits instead of 5 bits, producing slightly shorter strings for the same precision. Limited adoption.
Geohash-int (integer geohash)
Many databases (Redis, MongoDB) use integer representations of geohash internally rather than base32 strings. A 52-bit integer is the common form; Redis uses exactly 52 bits. The integer form is faster for sorting and arithmetic but less human-readable.
Niemeyer geohash vs other space-filling curves
Geohash uses a Z-order curve (Morton order) on the bit interleaving. Other space-filling curves are used in some databases: Hilbert curves (more locality-preserving but harder to compute), Peano curves, etc. Most production databases use Z-order (geohash-style) because of the simplicity of the encoding and decoding.
Comparison with Plus Codes and what3words
| Property | Geohash | Plus Codes | what3words | | -------- | ------- | ---------- | ---------- | | Type | Base32 string | Base20 string | Three words | | Year invented | 2008 | 2014 | 2013 | | License | Public domain | Apache 2.0 | Proprietary | | Memorability | Low | Moderate | High | | Verbal use | Difficult | Moderate | Easy | | Database use | Excellent (index primitive) | Moderate | Difficult | | Prefix containment | Yes (defining feature) | Partial | No | | Cell shape | Rectangular | Square | Square | | Adopted by databases | Yes (most major) | Limited | None | | Adopted by mapping services | Some (legacy OpenStreetMap, etc.) | Google Maps | Some | | Adopted by emergency services | No | Some | Yes (UK, Mongolia) |
The trade-offs are clear: geohash for database integration, Plus Codes for open-source location sharing, what3words for verbal emergency communication.
Implementation notes
Reference implementations exist in essentially every language:
- JavaScript: ngeohash, latlon-geohash, geohash-js
- Python: python-geohash, pygeohash
- Java: ch.hsr.geohash, geohash-java
- Go: github.com/mmcloughlin/geohash
- Rust: geohash crate
- C/C++: libgeohash and others
The algorithm is short (~50 lines per language); many developers reimplement it locally. The implementation is identical across libraries — geohash is fully specified by Niemeyer's public-domain definition.
Common misconceptions
“Geohash is proprietary.” It's public domain. Niemeyer released it without claiming ownership in 2008. Any implementation is permissible without licensing.
“Geohash and Plus Codes are interchangeable.” Both encode lat/lon to a short string, but the algorithms, alphabets, and prefix properties differ. A geohash isn't a Plus Code with different characters; they're structurally distinct encodings.
“Longer geohashes are always more accurate.” Longer geohashes have smaller cells, not necessarily “more accurate” positions. The encoding rounds the input lat/lon to the cell's center (or assigns to a containing cell). For high-precision input that's already accurate, geohash length is a precision-vs-storage trade-off, not an accuracy improvement.
“Adjacent points have similar geohashes.” Usually but not always. Points across the equator or antimeridian can have very different geohashes despite being close. This is the “meridian problem”; applications doing proximity queries must compute neighbor cells.
“Geohash is only for points.” Geohash encodes a point, but can be used to index polygons (by covering with a union of geohashes), lines, and other geometries. PostGIS and other spatial libraries support this via geohash-cover algorithms.
“Database geohash and string geohash are the same.” They're typically equivalent — the database stores an integer or bit-packed representation internally, but the lookup behavior matches the string-prefix model. Redis's 52-bit integer geohash, MongoDB's 2dsphere index, and Elasticsearch's geo_point all behave like string-prefix matches from the application perspective.
“Geohash works in all coordinate systems.” It assumes WGS 84 latitude/longitude. Other geodetic datums (NAD83, ETRS89, etc.) require transformation to WGS 84 before encoding. See /learn/wgs84-explained for the datum context.
“52 bits is the standard geohash length.” Database internal representations use 52 bits (Redis) or other fixed widths. User-facing representations use base32 strings of variable length depending on application needs (typically 6–10 characters).
“Geohash will be replaced.” Unlikely soon. Despite known limitations (meridian problem, rectangular cells, lack of locality at cell boundaries), the prefix-property advantage for database indexing keeps geohash dominant. Newer alternatives (S2 cells from Google, H3 cells from Uber) serve specialized use cases but haven't displaced geohash as the general-purpose primitive.
“The alphabet is arbitrary.” It's
specifically chosen: 0-9 b-z excluding a/i/l/o
to avoid visually confusable characters. The position
ordering within the alphabet matters — the alphabet
maps each character to a specific 5-bit value, and
different orderings produce different encodings.
Related
- Coordinate Formats Explained— The pillar — geohash is one of six common formats
- Plus Codes Explained— The newer open-source alternative — different design trade-offs
- what3words Explained— The proprietary word-based competitor
- What Is Geocoding?— Geohash as the geo-index primitive in geocoding databases
- Methodology— How content is sourced and verified
Frequently asked questions
What is a geohash?
A geohash is a short alphanumeric string that identifies a rectangular cell on Earth's surface, with cell size depending on string length. For example, 'dr5ru6gh' identifies a ~150 m × 150 m cell containing Times Square in New York. Geohash was invented by Gustavo Niemeyer in February 2008 and released into the public domain via geohash.org. The encoding interleaves bits of the latitude and longitude binary representations, then base32-encodes the resulting bit stream using a 32-character alphabet (0-9 b-z excluding a/i/l/o, chosen to avoid visually confusable characters).
How precise is each geohash length?
Precision halves for each additional character (since each character is 5 bits, and bits alternate between latitude and longitude). Approximate cell dimensions: 1 char (~5,000 km × 5,000 km, continental), 3 chars (~156 km × 156 km, regional), 5 chars (~4.9 km × 4.9 km, city), 6 chars (~1.2 km × 0.6 km, neighborhood), 7 chars (~153 m × 153 m, block — typical for 'address' precision), 8 chars (~38 m × 19 m), 9 chars (~4.8 m × 4.8 m), 12 chars (~3.7 cm × 1.9 cm, surveying precision). Note the cells are rectangular, not square: even-position characters add longitude resolution; odd-position characters add latitude.
What's the 'lexicographic prefix containment' property?
Geohashes have a useful nesting property: a shorter geohash is a 'container' of all longer geohashes that start with it. For example, 'dr5' contains 'dr5r', which contains 'dr5ru', which contains 'dr5ru6'. This means spatial containment maps directly to string-prefix matching — a database can find all points in 'dr5' by searching for strings beginning with 'dr5'. This is why geohash is the natural geo-index primitive for prefix-based databases (Elasticsearch, Solr) and B-tree indexes (PostgreSQL with geohash extensions).
Which databases use geohash?
Most major databases with geospatial features. Elasticsearch's geo_point type uses geohash internally and exposes the geohash_grid aggregation; MongoDB's 2dsphere index uses a geohash variant (BSON); Redis GEO commands use a 52-bit integer geohash; Apache Solr's geospatial features include geohash; PostgreSQL/PostGIS has ST_GeoHash and ST_Box2dFromGeoHash functions; Apache Cassandra and other distributed databases use geohash for sharding (data with similar geohash lives on the same shard, supporting locality-aware queries); Amazon DynamoDB's geo library uses geohash for partition keys. Geohash is also used in CDN edge routing for geo-locality.
What are geohash's known edge-case issues?
The most famous: adjacent cells can have very different prefixes when they cross meridian boundaries. Two points 1 meter apart on opposite sides of the equator or the antimeridian may have geohashes that share no prefix at all. This is the 'meridian problem'. Solutions involve querying multiple geohash cells (the cell of interest plus all neighbors) when looking for nearby points; libraries provide neighbor-cell computation. A related limitation: arbitrary-shape regions (circles, polygons) can't be represented exactly as a single geohash — they need a union of geohashes at varying precisions. Despite these limitations, geohash remains the dominant geo-index primitive because its simplicity and prefix property outweigh the edge cases for most use cases.
Sources
- Elastic — Elasticsearch — geo_point and geohash_grid aggregation documentation · https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/geo-point · Accessed .
- MongoDB — MongoDB — 2dsphere index documentation · https://www.mongodb.com/docs/manual/core/2dsphere/ · Accessed .
- Redis — Redis GEO commands documentation (uses 52-bit geohash internally) · https://redis.io/commands/?group=geo · Accessed .
- PostGIS — PostGIS — geohash functions documentation · https://postgis.net/docs/ST_GeoHash.html · Accessed .
Cite this article
APA format:
Steve K. (2026). Geohash Explained. Coordinately. https://coordinately.org/learn/geohash-explained
BibTeX:
@misc{coordinately_geohashexplained_2026,
author = {K., Steve},
title = {Geohash Explained},
year = {2026},
publisher = {Coordinately},
url = {https://coordinately.org/learn/geohash-explained},
note = {Accessed: 2026-06-05}
}