Aquarionics Wiki

geographic data

From the department of Geekery

Geographic Data

Aim

When I tag an entry, picture, or comment with a location in Lat/Long, I will be able to locate the nearest geographical point.

(Eventually, I want this to interface with a GPS unit and my laptop so I can pretend to be Randy from Cryptonomicon, and this is also a sub-point of my “I will travel more” life plan)

Bits

This will require a few things

  • Fucking around with Epistula
    • Epistula currently stores locations in H/M/S lat/long instead of Decimal, which makes it incompatible with almost every geoanything thing ever. I need to convert all the existing data to Decimal.
    • …At that point I expect to discover that Ep. doesn’t actually know where anything is and I’ll have to play with the data.
  • Geodata
    • Large database of geographic points with associated lat/long
  • Code
    • Unless I can get an API for the geodata, I’m going to end up writing something that takes the current lat/long and finds the closest thing in the database.

Work

2006-12-15

Finding the database is the first thing. Apparently Getty have a database of all geographic thingies, but it’s a very high value of non-free for the data, and the front end itself has no API. Annoying, because it does exactly what I need (You give it a point, it tells you what’s closest). According to links from a while ago, it also used to accept lat/long on the search, from which I could screen-scrape. Alas, now it doesn’t.

This AskMifi is exactly what I want, but the data is either a) unhelpful or b) US-only (The Tiger thing), but in looking at Tiger I found this:

The GNS GIS Datafiles are the US “National Geospacial Intelligence Agency”, which has a datafile on every country the US recognises and the recognised spelling (and lat/long) of each. Awesome.

The files are in a tab delimited text file, zipped. One file per country, one item per line. I catted them all together, and the resulting file was about 750mb. I don’t really want a MySQL database that big, and it won’t be written too often (or possibly ever again), so I’m going to look at sqllite to see if it is fast enough for this, then possibly DB2 before trying MySQL.

This is the database:

create table geodata(rc smallint, ufi integer, uni integer, lat real, long real, dms_lat integer, dms_long integer, utm varchar(64), jog varchar(64), fc varchar(64), dsg varchar(64), pc varchar(64), cc1 varchar(64), adm1 varchar(64), adm2 varchar(64), dim integer, cc2 varchar(64), nt varchar(64), lc varchar(64), short_form varchar(64), generic varchar(64), sort_name varchar(64), full_name, full_name_nd, modify_date varchar(64));

(Key to field names)

And this Python code turns the data file into SQL:

myFile = open(“all.uniq.txt”)\ while True:\ line = myFile.readline().strip().split(“t”)\ print ‘insert into geodata values (“%s”);’ % ‘“, ”’.join(line)

(I started trying to use the pysqlite bindings, then realised "python ./builddb.py | sqlite geodata.db" would be far quicker

That’s running now, I’m watching carefully to see how big the data set goes…