What we call an AdminCode is the common 2-3 character code (digits or letters) of an administrative area. For example, the code of the subarea "Saône-et-Loire", in France, is 71.
We often need to quickly get the AdminCode of an admin area when we only have its WOF (Who's On First), OSM (OpenStreetMap), or GeoNames id. Depending on the endpoints used to fetch the data, these ids are rarely returned together with the area's AdminCode.
This project aims to map, in JSON files, the worldwide administrative areas of different taxonomies (WOF, OSM, GN) to their common AdminCode.
To be compliant with ISO codes (ISO 3166-1 / ISO 3166-2), we prefix all admin codes with the country code. For example: FR-71.
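As an illustration of the prefixing rule, here is a minimal sketch. The helper name `toAdminCode` and its validation rules are assumptions made for this example, not part of the project's code.

```javascript
// Hypothetical helper: builds a country-prefixed admin code such as "FR-71".
function toAdminCode(countryCode, localCode) {
  const country = String(countryCode).trim().toUpperCase();
  const local = String(localCode).trim().toUpperCase();
  // Assumption: the prefix is always a 2-letter country code (ISO 3166-1 alpha-2).
  if (!/^[A-Z]{2}$/.test(country)) {
    throw new Error(`Expected a 2-letter country code, got "${countryCode}"`);
  }
  if (local === "") {
    throw new Error("Local code must not be empty");
  }
  return `${country}-${local}`;
}

console.log(toAdminCode("fr", "71")); // FR-71
```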
Available countries:
- France
- Belgium
- Switzerland
- Spain
- Germany
- Italy
The 2-level administrative divisions in ./admin-areas/<country>.json are initially created with ChatGPT. We use ChatGPT because it is a simple way to find the most common/probable 2-level division of a country (based on token frequency), without the complexity of merging and simplifying divisions from multiple sources (OSM or others).
The admin area names returned by ChatGPT can be fine-tuned to match the /autocomplete results of the Pelias, Nominatim or GeoNames APIs, which the next step relies on.
These data are then enriched through the APIs, using the area names, so that we can collect their IDs:
./data/enriched-admin-areas
{
  "name": "Auvergne-Rhône-Alpes",
  "code": "FR-ARA",
  "osm_id": "osm:relation:3792877",
  "gn_id": "geonames:macroregion:11071625",
  "wof_id": "whosonfirst:macroregion:1108826389",
  "children": [
    {
      "name": "Ain",
      "code": "FR-01",
      "osm_id": "osm:relation:7387",
      "wof_id": "whosonfirst:region:85683153",
      "gn_id": "geonames:region:3038422"
    },
    {
      "name": "Allier",
      "code": "FR-03",
      "osm_id": "osm:relation:1450201",
      "wof_id": "whosonfirst:region:85683293",
      "gn_id": "geonames:region:3038111"
    },
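To make the enrichment step more concrete, here is a hedged sketch of a name-based lookup against a self-hosted Nominatim server. The helper names (`buildNominatimSearchUrl`, `toOsmId`) and the example host are assumptions. The project's actual enrichment code may work differently. The sketch relies only on the documented `q`, `format`, `countrycodes` and `limit` parameters of Nominatim's /search endpoint and on the `osm_type` and `osm_id` fields of its results.

```javascript
// Hypothetical helper: builds a Nominatim /search URL for an area name.
// Assumption: the server exposes /search at the root of baseUrl.
function buildNominatimSearchUrl(baseUrl, areaName, countryCode) {
  const url = new URL("/search", baseUrl);
  url.searchParams.set("q", areaName);
  url.searchParams.set("format", "jsonv2");
  url.searchParams.set("countrycodes", countryCode.toLowerCase());
  url.searchParams.set("limit", "1");
  return url.toString();
}

// Hypothetical helper: formats a Nominatim result as an id like "osm:relation:7387",
// mirroring the id style shown in the enriched file above.
function toOsmId(result) {
  return `osm:${result.osm_type}:${result.osm_id}`;
}

// Example with a placeholder host (replace it with your own server).
console.log(buildNominatimSearchUrl("https://nominatim.example.org", "Ain", "FR"));
console.log(toOsmId({ osm_type: "relation", osm_id: 7387 })); // osm:relation:7387
```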
Finally, we get the outputs:
./output/osm-to-admin-codes/FR.json
{
  "osm:relation:3792877": {
    "code": "FR-ARA",
    "children": {
      "osm:relation:7387": { "code": "FR-01" },
      "osm:relation:1450201": { "code": "FR-03" },
      "osm:relation:7430": { "code": "FR-07" },
      "osm:relation:7381": { "code": "FR-15" },
      "osm:relation:7434": { "code": "FR-26" },
      "osm:relation:7452": { "code": "FR-43" },
      "osm:relation:7407": { "code": "FR-74" },
      "osm:relation:7437": { "code": "FR-38" },
      "osm:relation:7420": { "code": "FR-42" },
      "osm:relation:7406": { "code": "FR-63" },
      "osm:relation:7378": { "code": "FR-69" },
      "osm:relation:7425": { "code": "FR-73" }
    }
  },
  "osm:relation:3792878": {
    "code": "FR-BFC",
    "children": {
      "osm:relation:7424": { "code": "FR-21" },
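The transformation from enriched admin areas to this output shape can be sketched as follows. This is not the project's actual mapping script: the function name `mapIdsToCodes` is hypothetical, and the sketch assumes the enriched file is an array of areas, each with an optional `children` array.

```javascript
// Hypothetical sketch: turns enriched areas into an "id -> { code, children }" mapping.
// idKey selects the taxonomy: "osm_id", "gn_id" or "wof_id".
function mapIdsToCodes(areas, idKey) {
  const mapping = {};
  for (const area of areas) {
    const id = area[idKey];
    if (!id) continue; // skip areas that have no id in this taxonomy
    const entry = { code: area.code };
    if (Array.isArray(area.children) && area.children.length > 0) {
      entry.children = mapIdsToCodes(area.children, idKey);
    }
    mapping[id] = entry;
  }
  return mapping;
}

// Usage with a small excerpt of the enriched data shown earlier.
const enriched = [
  {
    name: "Auvergne-Rhône-Alpes",
    code: "FR-ARA",
    osm_id: "osm:relation:3792877",
    children: [
      { name: "Ain", code: "FR-01", osm_id: "osm:relation:7387" },
      { name: "Allier", code: "FR-03", osm_id: "osm:relation:1450201" },
    ],
  },
];
console.log(JSON.stringify(mapIdsToCodes(enriched, "osm_id"), null, 2));
```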
You need to set up the .env file with your ChatGPT token and the Nominatim server you want to use. DO NOT use the official Nominatim server: their usage policy does not allow it, and it is rate-limited anyway.
nvm use
npm ci
Create a ./prompts/<country>.txt prompt file for ChatGPT; you can then run the Usage commands below.
node main.js FR
This will create 2 datasets:
- (If it does not exist yet) Structured admin areas of the country, generated by ChatGPT, in ./data/admin-areas/FR.json. You can customize this output to fix the results of the next step.
- Enriched admin areas with, for each area and subarea, the matching WOF, OSM and GeoNames ids, in ./data/enriched-admin-areas/FR.json. If you find wrong results in these enriched admin areas, you must re-run step 1 after customizing ./data/admin-areas/FR.json.
Then:
node map-osm-to-codes.js FR
node map-gn-to-codes.js FR
node map-wof-to-codes.js FR
The final outputs are in ./output/*
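Once an output file is loaded, looking up the AdminCode of an id at either level could look like the sketch below. `findAdminCode` is a hypothetical helper written for this example. The inline sample stands in for the parsed content of a file such as ./output/osm-to-admin-codes/FR.json.

```javascript
// Hypothetical helper: searches the mapping and its nested children for an id.
// Returns the matching admin code, or undefined when the id is unknown.
function findAdminCode(mapping, id) {
  for (const [key, entry] of Object.entries(mapping)) {
    if (key === id) return entry.code;
    if (entry.children) {
      const found = findAdminCode(entry.children, id);
      if (found !== undefined) return found;
    }
  }
  return undefined;
}

// Sample data copied from the output excerpt shown earlier.
const osmToCodes = {
  "osm:relation:3792877": {
    code: "FR-ARA",
    children: {
      "osm:relation:7387": { code: "FR-01" },
      "osm:relation:1450201": { code: "FR-03" },
    },
  },
};

console.log(findAdminCode(osmToCodes, "osm:relation:7387")); // FR-01
console.log(findAdminCode(osmToCodes, "osm:relation:3792877")); // FR-ARA
```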