I found a hackish solution for this problem.
I want a solution additionally satisfying the following needs:
See a live Demo in Action: http://cdn.osterbruecken.de/ostermap (german).
First I fetched a dump of the geonames database. For this test case I only needed data for Germany, so I fetched this file
I extracted the file DE.txt out of it, parsed the tab-separated file (TSV) with tr and cut (or you can use awk if you prefer) and used grep to get all POIs, marked with ";P"
I reduced the charset and transformed the names to lowercase, allowing only the following characters:
a-zöäü ß .-
you can use the following pipeline to do that
tr "\t" ";" < DE.txt | cut -d";" -f2,5,6,7 | grep ";P" |\
tr -d "," | cut -d";" -f1,2,3 | tr ";" "," |\
tr -dc "0-9a-zA-ZöäüÖÄÜß\n ,.-" | tr "A-ZÖÄÜ" "a-zöäü" > cities.txt
this will export all lines to a new file, called cities.txt based on the following format:
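As noted above, the same extraction can also be done in a single awk pass. Here is a self-contained sketch; the field numbers follow the geonames readme (2 = name, 5 = lat, 6 = lon, 7 = feature class), the sample rows are made up, and the umlaut stripping assumes gawk in a UTF-8 locale:

```shell
# two sample rows in geonames format (tab separated); only the
# fields used below are filled in
printf '1\tBerlin\t\t\t52.52437\t13.41053\tP\n' > DE.txt
printf '2\tRhein\t\t\t50.0\t7.0\tH\n' >> DE.txt

# keep only feature class "P", lowercase the name, reduce the charset
awk -F'\t' '$7 == "P" {
    name = tolower($2)
    gsub(/[^0-9a-zöäüß .,-]/, "", name)
    print name "," $5 "," $6
}' DE.txt > cities.txt

cat cities.txt    # berlin,52.52437,13.41053
```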
UPDATE: The database from geonames.org was not very satisfying, so I switched to official OpenStreetMap database dumps from http://download.geofabrik.de, in this case germany-latest.osm.pbf (>2.4 GB compressed, around 40 GB uncompressed), and extracted all city and street names from it. Use osmconvert from the osmconvert toolset for extracting the data. (Hint: build a 64-bit executable and use a machine with a lot of RAM for this! Processing the dataset germany-latest.osm needs about 14 GB of RAM and takes some hours to finish.)
./osmconvert germany-latest.osm.pbf --max-objects=900000000 --all-to-nodes \
--csv="name @lat @lon" --csv-separator="," | grep -v -E "^," > cities.txt
process your cities.txt and sort out all duplicate names (it's a quick hack; perhaps I will rename those in an improved version later)
sort -k1 -t, cities.txt | uniq > uniq_cities.txt

All cities are stored in the file uniq_cities.txt now - line by line, with their coordinates like this:
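If you'd rather rename duplicates than drop them, a sketch of the "improved version" mentioned above could append a counter to repeated names. The suffix scheme is made up here, and the sample data is inlined so the snippet stands alone:

```shell
cat > cities.txt <<'EOF'
neustadt,49.35009,8.13886
neustadt,50.35557,7.46667
uhyst,51.36469,14.506
EOF

# number the second and later occurrence of every name
sort -t, -k1,1 cities.txt | awk -F, '{
    count[$1]++
    if (count[$1] > 1)
        $1 = $1 " (" count[$1] ")"   # e.g. "neustadt (2)"
    print $1 "," $2 "," $3
}' > uniq_cities.txt

cat uniq_cities.txt
```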
then I wrote a small script that reads those lines and creates lots of broken symlinks from them, putting them into a folder called "search".
#!/bin/bash
mkdir -p search
cat uniq_cities.txt | while read line
do
  url="http://osm.org/#map=/`echo $line | tr -dc "0-9.,-" | cut -d"," -f2,3 | tr "," "/"`"
  symlink="search/`echo $line | cut -d"," -f1 | tr -dc "a-zA-ZöäüÖÄÜß. -" | tr "A-ZÖÄÜ" "a-zöäü"`"
  ln -s "$url" "$symlink"
done
the name of the broken symlink is the name of the city, and the symlink points to a URL like this
with the given coordinates of the city.
sure, this is just an example; you can use your own tile server and your own map, like I did in the demo "Ostermap" mentioned before.
In this way you'll get a lot of broken symlinks like these:
...
lrwxrwxrwx 1 user group 23 Jun 23 23:00 ührde -> http://osm.org/#map=/51.70547/10.20814
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uhrendorf -> http://osm.org/#map=/53.86275/9.41756
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uhrsleben -> http://osm.org/#map=/52.20087/11.26443
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uhry -> http://osm.org/#map=52.29693/10.85758
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uhsmannsdorf -> http://osm.org/#map=51.33048/14.90316
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uhyst -> http://osm.org/#map=51.36469/14.506
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uhyst am taucher -> http://osm.org/#map=51.19249/14.21843
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uichteritz -> http://osm.org/#map=51.20652/11.92215
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uiffingen -> http://osm.org/#map=49.5024/9.59269
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uigenau -> http://osm.org/#map=49.31204/11.01731
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uigendorf -> http://osm.org/#map=48.18048/9.57969
lrwxrwxrwx 1 user group 23 Jun 23 23:00 uissigheim -> http://osm.org/#map=49.67984/9.57134
lrwxrwxrwx 1 user group 23 Jun 23 23:00 ulbargen -> http://osm.org/#map=53.37535/7.58291
lrwxrwxrwx 1 user group 23 Jun 23 23:00 ulbering -> http://osm.org/#map=48.35362/13.01465
lrwxrwxrwx 1 user group 23 Jun 23 23:00 ulberndorf -> http://osm.org/#map=50.87472/13.67231
...
Now, I can use those broken symlinks with gatling httpd, a tiny and really fast httpd server by Felix von Leitner.
I already used gatling for leaflet, my own maps and tiles I made with glosm.
You can find the project "Ostermap" right here. I just rendered the map for Saarland.
Gatling recognizes broken symlinks. If the link target contains "://" (like in "http://" or "https://"), it makes a valid HTTP redirect out of it and redirects your browser to the given URL. This is a really nice feature. Thanks, fefe. In this way it will redirect any given city name to leaflet or openstreetmap with the corresponding coordinates.
So I can start a locally listening gatling and enter the following URL in a browser
to get the geo-location of the city ulbargen and directly show it on the map.
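Even without a running gatling you can check what the redirect target would be: it is simply the content of the broken symlink. A small sketch (one link from the example listing is recreated here so the snippet stands alone):

```shell
mkdir -p search
ln -sf "http://osm.org/#map=/53.37535/7.58291" "search/ulbargen"

# gatling would answer a request for /search/ulbargen with a
# redirect to exactly this link target:
readlink "search/ulbargen"
```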
<html>
<input title="please enter the name of the point of interest" id="name" value="ulbargen">
<input type="button" value="suche" onclick="window.location+='search/'+document.getElementById('name').value.toLowerCase()">
</html>
If an invalid POI name is entered, gatling just responds with 404 Not Found. You can additionally write an AJAX script that catches the 404 response and shows something like "sorry, POI not found. please retry". Or specify your own custom 404 error page.
First: by storing the geo-information in a broken symlink I get a very compact storage for those coordinates, without the limitation that the underlying filesystem's block size imposes on a regular file.
When you try to store the coordinates in regular files, also named after the city, it is not very efficient: the whole "database" of this example would need over 240 MB in total. Even if a file just contains those few bytes for the coordinates, every regular file occupies at least one block, typically 4096 bytes, on your drive (see Wikipedia on block size if you want to know more about this). So lots of small regular files would waste a lot of storage capacity.
If you don't believe it, just have a look at such files and compare the sizes with the following commands
echo hello > regular_file
ls -slh1 regular_file
stat regular_file
du -hcs regular_file
ln -s "hello again" symlink_file
ls -slh1 symlink_file
stat symlink_file
du -hcs symlink_file
Sure, you can change the block size of your filesystem by reformatting the block device, or possibly do some tweaks with tune2fs. But even then the minimal block size of ext3 is 1024 bytes, such changes are no out-of-the-box solution, and they could have disadvantages for other services on your system.
you will find some more information about this topic here, here and here.
When using symlinks, the whole "database" is just about 1.8 MB in total, since each symlink just needs the 128 bytes of its inode in this case. It won't get "blown up" to 4k by the minimum block size of the underlying filesystem.
now you can also make a tarball out of the folder for distributing the "database". The xz tarball is about 1.1 MB then.
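A sketch of the packing step: tar archives symlinks as-is without following their targets, so the broken links survive distribution (the single example link here is made up so the snippet stands alone):

```shell
mkdir -p search
ln -sf "http://osm.org/#map=/51.36469/14.506" "search/uhyst"

# pack the "database" as an xz tarball
tar -cJf search.tar.xz search

# unpack on the receiving side and verify the link target survived
mkdir -p unpacked
tar -xJf search.tar.xz -C unpacked
readlink "unpacked/search/uhyst"
```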
you can simply add a new POI by doing this
ln -s "http://osm.org/#map=/49.49361/7.26694" "osterbrücken"
and remove it, just by deleting the symlink
you can also specify a zoom level for individual POIs, if you want to. just use:
ln -s "http://osm.org/#map=14/49.49361/7.26694" "osterbrücken"
you could also distribute street names this way, for example by making cities subfolders and putting street POIs as symlinks inside. This would work with gatling's built-in directory indexing.
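A sketch of that layout (the street names and coordinates here are made up for illustration):

```shell
# one folder per city, street POIs as symlinks inside
mkdir -p "search/osterbrücken"
ln -sf "http://osm.org/#map=18/49.49380/7.26700" "search/osterbrücken/hauptstraße"
ln -sf "http://osm.org/#map=18/49.49412/7.26655" "search/osterbrücken/kirchweg"

# gatling's built-in directory index would list the streets
# when you request /search/osterbrücken/
ls "search/osterbrücken"
```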
since city names are not unique, I should also consider generating folders for duplicate names and putting each symlink into such a folder.
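A sketch of that idea (the two coordinates and the disambiguating suffixes are illustrative, not real data):

```shell
# one folder per ambiguous name, one symlink per occurrence
mkdir -p "search/neustadt"
ln -sf "http://osm.org/#map=/49.35009/8.13886" "search/neustadt/weinstraße"
ln -sf "http://osm.org/#map=/50.35557/7.46667" "search/neustadt/wied"

# the directory index then lets the user pick the right one
ls "search/neustadt"
```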
Have a look at RFC 5870, which describes the URI scheme for geo coordinates: a Uniform Resource Identifier for geographic locations in WGS-84 (World Geodetic System). But then you have to evaluate this URI in your client application. Also, since gatling expects "://" and a geo URI is just "geo:74.4294,19.0245", you won't get a successful redirect; you would have to change the source code of gatling in http.c to parse this correctly.
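For illustration, storing such a geo URI as a link target works exactly the same way, but since the target has no "://", an unpatched gatling will not turn it into a redirect:

```shell
mkdir -p search
# RFC 5870 style target - stored fine, but not redirected by gatling
ln -sf "geo:49.49361,7.26694" "search/osterbrücken-geo"
readlink "search/osterbrücken-geo"
```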
You can fetch my pois.de.txt.xz (51 MB) with 4,919,091 entries in the format "name,lat,lon". The dataset is based on an extraction of the OpenStreetMap database dump (20150701), so it is licensed under the Open Data Commons Open Database License (ODbL) and copyright © OpenStreetMap contributors.