py-Projection 0.1
Tuesday, June 22nd, 2004
I have built a Windows installer for the Thuban Project’s Python wrapper for Proj.4. It is available for download here
The article provides some interesting insight into Microsoft’s reaction to people rapidly moving away from their development platforms.
And here’s the clincher: I noticed (and confirmed this with a recruiter friend) that Windows API programmers here in New York City who know C++ and COM programming earn about $130,000 a year, while typical Web programmers using managed code languages (Java, PHP, Perl, even ASP.NET) earn about $80,000 a year. That’s a huge difference, and when I talked to some friends from Microsoft Consulting Services about this they admitted that Microsoft had lost a whole generation of developers. The reason it takes $130,000 to hire someone with COM experience is because nobody bothered learning COM programming in the last eight years or so, so you have to find somebody really senior, usually they’re already in management, and convince them to take a job as a grunt programmer, dealing with (God help me) marshalling and monikers and apartment threading and aggregates and tearoffs and a million other things that, basically, only Don Box ever understood, and even Don Box can’t bear to look at them any more.
Much as I hate to say it, a huge chunk of developers have long since moved to the web and refuse to move back.
Carlos Segura, formerly of 37 Signals, proposes some new standard rates…
MapServer Closing Session
Much time was spent discussing the idea of project governance. One
of the things that is frequently said in the Python community is
the idea of “What if Guido gets hit by a bus?”. There is starting
to be some idea or questioning in the MapServer community about
where the project might go if Steve or DM Solutions were to go away.
Frank chimed in and said that he picked up maintenance of Proj.4
because the original developer retired, and he needed it. He expected
that the same thing would happen if Steve were hit by a bus.
On the other hand, there was a lot of support from independent developers
(those not affiliated with a larger company that has built its business around
MapServer) for something like a MapServer Foundation. The
thinking there is that small to medium amounts of money could be used
to fund the general development of the application, rather than the
bounty-based development that is currently going on. I think that
this is something that should be considered, because infrastructure
and design-type development isn’t something that is really developed
in a for-bounty system. I don’t know that this will be a huge issue,
but there are some things like thread safety that are not sexy enough
to easily get funding for.
The developers expressed some frustration that there sometimes isn’t enough
user-to-user support. It was brought up that the MapServer
website is stale in many instances and the only active or live parts
of the website are the wiki (which is chaotic by definition) and
the links to the email archives. Steve said that he would like to
see the website put into some sort of a CMS so that the users can keep
it up, rather than having everything filtered through one group or one
person as it currently is. It was also suggested that the website be put into CVS,
but in my opinion that is a bad idea.
Cartoline drawing was discussed. Evidently, MapServer now has support
for drawing lines as polygons, which makes for prettier output and
in some cases draws faster.
Conference Closing Session
The conference closing session was a very lively discussion.
ESRI was discussed quite a bit, and it was pointed out that
it gives the community a target to shoot at/for. The development
models of course are very different, with ESRI using the monolithic,
one solution for everything model, and the open source developers using
a bunch of small, functional pieces that do what they do well and
are then connected together. Time will tell which model wins out, but
I think many developers might start flocking to the unix-like model.
There was much “preaching to the converted” in the final plenary, but
the final conference wrap-up was a good discussion. Community was
the word, not things like open source, or specific software products.
The conference really gave the community a chance to become aware of
itself, who they were, and what they were doing. Not only will this
give the group momentum, I think that many new cool things will come
out of ideas discussed and mulled about while people were here.
On the bus ride back to the hotel, people were talking, but I really
couldn’t respond any more. My brain was full. It was a very intensive
conference with a lot of personal interaction. Presentations and
workshops were more of a formality so that we could call it a conference, as
most of the discussion was informal. All in all, it was exhilarating, exhausting,
and energizing all at the same time…
Presentation
The presentation went well, and there were a lot of questions for me at the end of it and at the end of the session. Adena asked how many people I thought were straddling the proprietary-open source boundary and using SDE with MapServer. I responded that I don’t have any idea, but I do get questions somewhat often about how to glue them together…
MapScript
I am in the “Big SWIG of MapScript” session right now and will be listening to Sean give us the skinny on it. Expect more information later…
A Twisted Python implementation of a WMS server from Sean …
import os
import sys
import getopt
import mapscript

# Twisted Python classes for web programming
from twisted.web import server, resource, static
from twisted.internet import reactor

CACHEDIR = './cache'


def usage():
    print("""Usage: twms.py -m mapfile -p port""")


def main():
    # Get options
    try:
        opts, args = getopt.getopt(sys.argv[1:], 'm:p:')
    except getopt.GetoptError:
        usage()
        sys.exit(2)

    mapfile = None
    port = None
    for o, a in opts:
        if o == '-m':
            mapfile = a
        if o == '-p':
            port = int(a)

    # Both the mapfile and the port are required
    if not (mapfile and port):
        usage()
        sys.exit(2)

    # Make sure the image cache directory exists before serving requests
    if not os.path.isdir(CACHEDIR):
        os.makedirs(CACHEDIR)

    # ======================================================================
    # Setup the web application

    # Create empty root resource
    root = resource.Resource()

    # Insert a Twisted WMS Resource at path 'twms'
    root.putChild('twms', TwistedWMSResource(mapfile))

    # Create a site instance, bind it to a port, and start up the
    # Twisted reactor loop
    site = server.Site(root)
    reactor.listenTCP(port, site)
    reactor.run()


# ==========================================================================
# WMS Server resource for Twisted

class TwistedWMSResource(resource.Resource):
    """Publishes a mapfile as a WMS resource"""

    # Classes deriving from resource.Resource need to implement children.
    # This class has no static children.
    children = {}

    def __init__(self, mapfile):
        """Object Initialization"""
        # A TwistedWMSResource has a map attribute
        self.map = mapscript.mapObj(mapfile)

        # Turn on all layers of the map
        for i in range(self.map.numlayers):
            self.map.getLayer(i).status = mapscript.MS_ON

    def getChild(self, path, req):
        """All WMS requests are treated like dynamic child resources"""
        # Clone the master map before executing the request
        tmp_map = self.map.clone()

        # The following two statements won't be needed in 4.2.1
        tmp_map.setFontSet('fonts.txt')
        tmp_map.setSymbolSet('symbols.txt')

        # Create an OWSRequest instance, like a single-value dictionary
        wms_request = mapscript.OWSRequest()

        # Push CGI parameters from req.args into the OWSRequest
        for param, value in req.args.items():
            wms_request.setParameter(param, value[0])

        # Map updates its state from OWSRequest
        tmp_map.loadOWSParameters('1.1.1', wms_request)

        # Draw
        image = tmp_map.draw()

        # Save
        cache_name = str(hash(req.uri))
        cache_path = os.path.join(CACHEDIR, cache_name)
        image.save(cache_path)

        # Serve up the map from the image cache location
        mimetype = image.format.mimetype
        return static.File(cache_path, mimetype)


if __name__ == '__main__':
    main()
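To poke at a server like this one, a client sends it a WMS GetMap request. Here is a rough sketch (written for Python 3, unlike the listing above) that only builds such a request URL and does not contact anything. The host, port, layer name, and bounding box are placeholders I made up for illustration, and it assumes that requests under the twms/ path end up in getChild. The parameter names are the standard WMS 1.1.1 GetMap ones.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layers, bbox, size, srs="EPSG:4326", fmt="image/png"):
    """Build a WMS 1.1.1 GetMap URL for the given layers and extent."""
    params = [
        ("SERVICE", "WMS"),
        ("VERSION", "1.1.1"),
        ("REQUEST", "GetMap"),
        ("LAYERS", ",".join(layers)),
        ("STYLES", ""),  # empty value asks for the default styles
        ("SRS", srs),
        ("BBOX", ",".join(str(v) for v in bbox)),  # minx,miny,maxx,maxy
        ("WIDTH", str(size[0])),
        ("HEIGHT", str(size[1])),
        ("FORMAT", fmt),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical server address, layer name, and extent
url = getmap_url(
    "http://localhost:8080/twms/",
    ["roads"],
    (-94.0, 41.5, -93.0, 42.5),
    (400, 400),
)
print(url)
```

Pasting the printed URL into a browser should be enough to see whether the server draws anything, provided the mapfile actually has a layer by that name.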
The Boat Cruise
After the afternoon sessions with Sean, I prepared for the boat
cruise by getting a jacket and catching the bus. The food line
was long and slow, but the beer line was short, and it
necessitated much technical discussion about many subjects.
I spent quite a bit of time hanging out with Frank Warmerdam,
listening to him pontificate on subjects ranging from SCO,
Sun, GeoJP2, the second system effect, developing proprietary
software and the mindset of it, and making it as an independent consultant.
Frank is a very animated and passionate speaker when it comes
to technology issues. He is very good at communicating
his views, and it was fun to listen to him.
I also spent some time talking to Norman Vine and Chris Hodgson
discussing using the GPU (video card) for vector math and
transformation operations. Norman says that we should be taking
advantage of all of the computing power we virtually get for
free to do the kinds of things we need to do in geography/GIS
applications. If you think about it, it really makes sense.
He gave me the names of some open source libraries that could
be used to do those kinds of things. For raster data, this
is one area that should really be pursued.
Last November, I purchased a PowerBook G4 for my fiancée to use
in her PhD program. When I got it, I was really excited, and I
still love to play with the thing, even though I don’t get
full time with it. After talking to Sean about his MapServer
development on the Mac, I am going to make my next machine
a G5. The only piece of software really holding me back is
ArcView 3.x, which I use for a lot of consulting development.
However, I already have a PC that I can do much of that with,
and I don’t want to limit the rest of my computing experience
just because of one application.
Overall it was a good day at MUM 2 at Carleton University
here in Ottawa, Ontario, Canada. I think that I’ve compressed
more acronyms into my speech in the last day than I’ve said
in the past year. The good thing about a meeting like this
is that it is a chance for everyone to really meet each other
in person for the first or second time. Also, it means that
everyone’s interpretations of the pronunciations of
library names and acronyms don’t quite line up.
So, here they are from the source (that is, from the developers themselves).
MapScript
Much time was spent with Sean Gillies discussing MapScript,
its current implementation, and where it might go in the future. We
talked about the upcoming refactoring of the API and some of the naming
of the new methods. We also talked about various other warts that we thought
could be candidates for refactoring as well.
Python
I got a chance to talk to Norman Vine over a cigarette or
two about Python and all of the various opportunities to use it with
GIS. We talked about the Python C API, and I expressed my frustration
with Python reference counting and my poor understanding of it; he
gave me some good tips. We also talked about the possibilities
for geometry algebra operations in Python. The PySDE wrapper is one
way of doing it, but he gave me some names of other libraries to look up (which,
of course, I already forgot).
SDE
On the bus ride home, I talked to Frank Warmerdam, who is
the master when it comes to data formats. He is the developer of
OGR and GDAL, which are becoming the foundation data input libraries in
MapServer. We discussed what might need to be done to put SDE vector and
raster support into OGR/GDAL. This is one area I would like to pursue if
I had the time.
GIS Monitor
I talked to Adena Schutzberg of GIS Monitor about my comments about the
swallowing of Mapping Science by LizardTech. I told her that I wrote the
article in a fit of frustration, because I had been championing GeoJP2
within my organization, and then LizardTech pulled the rug out when they
swiped the GeoJP2 specification. That whole thing still makes
me mad…
Zope
J.F. Doyon of Environment Canada gave a very cool presentation to
Sean and me about his fancy Zope site that serves up millions
of MapServer maps a month. All of the mapping happens outside
of the Zope process because of the thread-safety issues that
MapServer has, but it was very cool to see all of the neat
CMF stuff that he built for his organization.
I also found the watering hole that the MapServer crowd was hanging out at and had a good discussion with Paul Ramsey and Chris Hodgson of Refractions, Steve Lime of MapServer fame, and Martin Davis of JTS.
I will blog the plenary in the morning as it goes … No wireless internet (arrg!), so I won’t be real-time blogging, but I’ll probably post nightly updates.
MapBuilder is a client-side GUI that consumes WMS map services. It appears to allow you to cleanly separate the presentation and database operations (something that MapServer doesn’t do so well).
Geocoder.us provides a free XMLRPC gateway to the TIGER data. It does not appear to do any sort of reverse geocoding, however. Here is how you can easily use it in Python…
>>> import xmlrpclib
>>> p = xmlrpclib.ServerProxy('http://rpc.geocoder.us/service/xmlrpc')
>>> p.geocode('110 North Russell, Ames, Iowa')
[{'city': 'Ames', 'prefix': 'N', 'suffix': '', 'zip': 50010,
'number': 110, 'long': -93.628218000000004, 'state': 'IA', 'street':
'Russell', 'lat': 42.022807999999998, 'type': 'Ave'}]
>>>
I would like to bring to your attention an issue that has really bothered me
the last few days. It’s a story about image compression. In my industry (GIS,
or Geographic Information Systems), images are one type of data that is very
important. We use aerial photographs of the landscape to collect data and
for referencing other data. The problem with imagery is that it is big, as in too
large to send over the wire, and it must be compressed in some way to allow
software applications to use it.
A leader in the image compression market over the past few years has been a
company called LizardTech. They were
the first company to widely distribute and market a set of image compression
technologies called wavelet compression. Wavelet compression is a more advanced
compression technique than is typically used for compressing images.
Here is a quick primer on image compression. Imagine an image as a set of cells,
much like a sheet of bubble wrap. Each of the bubbles contains a value (your
data, actually), and uncompressed, each bubble of the bubble wrap is full of air (data).
A compression algorithm goes around and pops all of the bubbles that contain the
same (or similar) values, and then it records what the values were and where
they were. Now the entire sheet of bubble wrap is
much smaller than the original. You can then send it over the wire, save it on a disk, etc.,
and the bubble wrap takes up much less volume than it did before.
When you want to read it again, you ask your decompression algorithm
which cells in the bubble wrap were popped and what their values were when they
were popped. It then reconstructs the image in its original form.
Again, this is a very simplistic (and probably flawed) analogy of
what is going on, but it gets the point across.
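The bubble wrap picture is closest to run-length encoding, one of the simplest lossless schemes. Here is a small sketch of that idea only. It is not wavelet compression, and it is not how MrSID or JPEG 2000 work; the function names and the sample row are my own.

```python
def rle_encode(cells):
    """Collapse runs of equal values into [value, run_length] pairs."""
    runs = []
    for value in cells:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # same value as the previous cell: extend the run
        else:
            runs.append([value, 1])  # new value: start a new run
    return runs


def rle_decode(runs):
    """Rebuild the original row of cells from [value, run_length] pairs."""
    cells = []
    for value, count in runs:
        cells.extend([value] * count)
    return cells


row = [7, 7, 7, 7, 0, 0, 3, 3, 3]
packed = rle_encode(row)
print(packed)  # [[7, 4], [0, 2], [3, 3]]
assert rle_decode(packed) == row
```

Nine cells shrink to three pairs, and decoding gives back exactly the original row. Real imagery rarely has runs this tidy, which is part of why the fancier techniques exist.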
LizardTech’s technology is this wavelet compression stuff. If you go do a Google
search for it, you’ll find lots of academic articles about how it works, what
math is used, and examples of images that have been compressed differently. What
they did was not new or novel (like Microsoft copying Macintosh for Windows), but
they marketed it well and aggressively. Big GIS vendors like ESRI and Leica/ERDAS
started incorporating LizardTech’s technology into their software, and the pricing
worked much like a drug dealer who gives you free samples at the beginning. You could
do small images, but if you wanted more, you had to pay. A lot. To both read and
write the images.
So LizardTech went on like this for about three years. No one was in their market
space and they trounced along making gobs of money from private and government
organizations. They did show, however, that they learned a thing or two from ESRI.
ESRI cemented its dominant market position by understanding that it is governments
that can afford to generate all of the valuable GIS data. This is because it is
extremely difficult for an organization to recover the costs of actually collecting
the data. What ESRI did was basically give the software away to governments so that
the data they were making would end up being distributed in ESRI formats. Pushback
in the form of GRASS and others came too late once all of the momentum was moving
in ESRI’s direction. For the past ten years or so, almost all of the data from
government agencies has been in ESRI formats. Slowly, this is starting to move in
the direction of open formats, but the transition has been long and treacherous.
LizardTech has done the same thing, but a technology (and a standard) has come
along to disrupt their plans for global domination. That technology is called
JPEG 2000. On top of that, a couple of their former employees started their
own company called Mapping Science to market JPEG 2000 technologies for the GIS
market. LizardTech didn’t like that too much, and they (recently) sued them out of existence. They
did this because they want to own the geospatial imagery market. They are
following the lead of companies like Microsoft in that they are trying to own
the standard. Microsoft has owned the standard operating system for years,
has profited from it greatly, and has bullied anyone who tried to get into their space
with lawsuits, threats, and extortion. LizardTech is just following the leader here.
Even though Beta was a better format than VHS, VHS won because it had cheaper
hardware and the movie studios pushed out more product for VHS. If government
agencies like USDA and USGS continue to push out MrSID data files because they
got the technology on the cheap, they are dooming us to an inferior and closed
format. As it stands now, you have to pay $600 to LizardTech for software to even
read their data. In my opinion, this is unacceptable.
If you work for a government agency and you develop GIS data for public use, please
ensure that the data you provide to the public is in an open format. Even
if this format is a bit harder to use, the hidden cost that is passed on to your
users is much greater.