Archive for the ‘gdal’ Category

OGR ArcSDE Write Support

Thursday, April 3rd, 2008

In the midst of having a baby (well my wife, anyway :) ), I was diligently working on completing OGR ArcSDE write support for the upcoming GDAL 1.6 release. This post describes the state of that work and asks for testers to come in and start giving it a whirl.

MapServer has supported querying from versions for almost four years (I think this is something ArcIMS still doesn’t support, but ArcGIS server does), but OGR’s ArcSDE driver only supported querying from SDE.DEFAULT and basic query operations. I was contracted to flesh out the rest of the driver, including write support and participating in ArcSDE’s versioned editing machinery. In January, I received an excellent patch from Shawn Gervais of project10.net implementing geometry conversion and basic ArcSDE layer/attribute field creation that was a fantastic starting point for the work. With the hard bits done, I was able to jump into the miserable bits — implementing the versioned editing/query support.

To be an effective player with ArcSDE, client software really must participate in the versioned editing scheme. After the “base” tables are created and loaded in ArcSDE for a multiversioned layer (“registered with geodatabase” in Arc* parlance), ESRI clients typically edit layers by manipulating the adds/deletes/modifieds tables through ArcSDE’s edit state/versioning machinery. If you want your edits to an ArcSDE layer from OGR to show up in ArcMap, for example, your changes must happen within this machinery, or they will not be available until the adds/deletes/modifieds tables are “compressed” down into the base tables, an operation that requires locking out all clients from the entire database. A simplified description of these operations is:

  1. The version the client wishes to edit is locked, preventing other clients from moving its state
  2. A new edit state is created
  3. ArcSDE is told the edits within this session are happening on the newly created state
  4. Edits are made
  5. The transaction is committed and the version’s state is moved to the newly created state
  6. The lock on the version is removed
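
The six steps above can be sketched as a small Python context manager. This is only an illustrative in-memory model: `Version`, `edit_session`, and the state-id counter are hypothetical names, not part of OGR or the ArcSDE C API.

```python
from contextlib import contextmanager
from itertools import count

# Hypothetical source of fresh edit-state ids
_state_ids = count(100)


class Version:
    """Hypothetical in-memory stand-in for an ArcSDE version."""

    def __init__(self, name, state_id):
        self.name = name
        self.state_id = state_id
        self.locked = False


@contextmanager
def edit_session(version):
    if version.locked:
        raise RuntimeError("version %s is locked by another client" % version.name)
    version.locked = True                 # 1. lock the version
    try:
        new_state = next(_state_ids)      # 2. create a new edit state
        edits = []                        # 3. edits are recorded against new_state
        yield edits                       # 4. the caller makes its edits
        version.state_id = new_state      # 5. commit: move the version to the new state
    finally:
        version.locked = False            # 6. release the lock, even on failure


default = Version("SDE.DEFAULT", state_id=1)
with edit_session(default) as edits:
    edits.append(("insert", "feature 1"))
print(default.state_id, default.locked)
```

If the body of the `with` block raises, step 5 is skipped and the version stays on its old state, which is roughly the rollback behaviour a client would want.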

OGR doesn’t provide a clean mapping for a number of ArcSDE’s concepts, however. A first hurdle is transaction support: OGR has rather anemic transaction support that happens at the Layer level, while ArcSDE’s happens at the connection/DataSource level. To overcome this, OGR’s driver puts all of the operations within the opening/closing of the OGR DataSource. I’m not sure I like this compromise, but I can’t think of a cleaner way to do it without forcing the user to know what stage in the editing scheme a particular operation is at and how to roll back from it. The consequence of this choice is that the editing state machinery is switched on as soon as an ArcSDE DataSource is opened for update and switched off when the DataSource is Destroy()’d. Re-opening an ArcSDE DataSource connection is rather heavy (1-2 seconds), so it might be necessary to revisit this choice if people want high-throughput, multi-user operations.

Another challenge is ArcSDE’s coordinate spaces. ArcSDE has fixed coordinate spaces that are defined at layer creation time. This restriction has to do with the underlying implementation actually using integers with offsets to store coordinate info. Coming up with reasonable defaults for defining this space is challenging, especially if you don’t know the geographic extents to which someone might attempt to store data. This problem isn’t limited to just OGR, however, and many an Arc user has had trouble coming up with good X/Y domain values. For OGR, the values are set based on the coordinate system, and there currently isn’t a way to override that default, but if people have trouble with it, it shouldn’t be too difficult to allow a way to override it.
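
To see why the choice of coordinate space matters, here is a minimal sketch of integer-with-offset storage. The function names, the 32-bit signed limit, and the sample offset and scale values are assumptions for illustration only, not ArcSDE’s actual defaults.

```python
# Assumption: coordinates are stored as 32-bit signed integers
MAX_STORED = 2**31 - 1


def to_stored(value, offset, scale):
    """Shift by the offset, multiply by the scale, and round to an integer."""
    return int(round((value - offset) * scale))


def from_stored(stored, offset, scale):
    """Invert to_stored(); precision is limited to 1/scale."""
    return stored / scale + offset


def fits(value, offset, scale):
    """True if the coordinate lands inside the storable integer range."""
    return 0 <= to_stored(value, offset, scale) <= MAX_STORED


# Longitude in degrees: offset -400 and 1,000,000 units per degree
print(to_stored(-93.0, -400.0, 1000000))
# A UTM northing in metres at millimetre precision overflows the range...
print(fits(4649776.2, 0.0, 1000))
# ...but the same northing fits at centimetre precision
print(fits(4649776.2, 0.0, 100))
```

The trade-off is visible in the last two calls: a finer scale buys precision at the cost of the extent that can be stored, which is why a default has to be guessed from the coordinate system.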

The final hitch is that there are so many options with ArcSDE that don’t map to OGR concepts. Beyond the X/Y domain stuff, there’s whether or not to use multiversion tables, DBTUNE keywords, and so on. Many of these are available as either layer creation options or general OGR environment variables. More will likely have to be added as actual users start attempting to do real work with the code. Leaky Abstractions, indeed.

Come help test

A final note for folks who might be interested in this stuff. Post a comment here or email me if you wish to get a Windows build (against 9.1, 9.2, or 9.3 (beta program FTW)) to test things out and see if they behave properly for you. I have an EDN license, which gets me ArcSDE, but it doesn’t get me ArcMap, so I have no way to really test if things are behaving exactly as they should. Testing help would be much appreciated.

Importing Spatial References from URLs in GDAL 1.5

Thursday, December 13th, 2007

We’ve been busting hard to get things in shape for the GDAL 1.5 release that will be on December 20th. Besides its typical blizzard of new GDAL and OGR drivers, one of the more useful little features that I added for this release allows you to import a spatial reference definition from a URL. It’s expected that you might use http://spatialreference.org URLs, but any old URL will do.

For example, this command will reproject the world_borders shapefile to an Albers projection that a user contributed that focuses on the Northern Pacific:

ogr2ogr -t_srs http://spatialreference.org/ref/user/north-pacific-albers-conic-equal-area/ \
-s_srs EPSG:4326 world_borders_albers.shp world_borders.shp

It may seem rather innocuous, but this feature can help you out quite a bit when using OGR and GDAL command line utilities. It is a hard and painful problem to ensure that you are using the correct spatial reference, and demonstrating to others what you are using in a bug report can be even harder. By using a wiki-style spatial reference repository like spatialreference.org you can ensure that everyone is using the same one. I’ll admit it’s not very useful for well known spatial references like those in the EPSG list, but if you ever cook your own reference or your pet one was missed by the powers that be, sharing them and consuming them will be a little bit easier with spatialreference.org and GDAL 1.5.

Hopefully some other projects follow suit. Fetching your spatial references from a URL *every* single time through a million member loop might not be the best approach, but it can be really handy in other scenarios.
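
For the loop case, one option is to fetch the definition once and reuse it. The sketch below is an assumption about how you might do that in your own script, not something GDAL does for you; `fetch_definition` and `make_cached_fetcher` are hypothetical helper names, and the demo uses a fake fetcher so that it does not touch the network.

```python
from functools import lru_cache
from urllib.request import urlopen


def fetch_definition(url):
    """Download the raw spatial reference text served at `url`."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def make_cached_fetcher(fetch=fetch_definition):
    """Wrap `fetch` so each distinct URL is downloaded only once."""
    @lru_cache(maxsize=None)
    def cached(url):
        return fetch(url)
    return cached


# Demo with a fake fetcher that records how often it is called
calls = []

def fake_fetch(url):
    calls.append(url)
    return "placeholder definition text"

get_definition = make_cached_fetcher(fake_fetch)
url = "http://spatialreference.org/ref/user/north-pacific-albers-conic-equal-area/"
for _ in range(1000):
    definition = get_definition(url)
print(len(calls))  # prints 1
```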

Not quite valgrind for OS X

Sunday, November 18th, 2007

I do most of my development on OS X, but I do have Parallels instances for both Ubuntu and Windows. One thing I really miss on OS X and Windows is valgrind. It is the savior of dumb C/C++ programmers like myself everywhere. Apple ships a couple of tools that come close to the utility of valgrind.

  • Chris Hanson’s nice description of how to use the leaks command on OS X.
  • ‘man malloc’ on OS X describes a number of environment variables you can use for malloc tracing on OS X.

Still not as good as valgrind, but close enough for a lot of cases…

GDAL’s Python bindings aren’t dead yet (and neither is this weblog)

Friday, November 16th, 2007

Hello again. My Plone site got trashed by the spammers, and I’m so sick of Plone that I just grabbed an RSS feed of my posts and plopped them into this Wordpress site. Yes, I know this is PHP. Weblogging in Python sucks.

Everyone’s reimplementing Python bindings to GDAL these days. Sean’s got one or two attempts, and the GeoDjango folks wrote ctypes bindings for OGR and GEOS. Neither of these efforts is as complete as gdal.py/ogr.py in the standard GDAL distribution, but they are going off and writing new stuff anyway. Why?

GDAL’s Python bindings are really C/C++ in Python form

The API in GDAL’s Python bindings is not at all pleasant to program with if you are a true blue Python coder. There are some gotchas because resource management is foisted on the Python developer who is not used to it, TheNamesOfThingsAreHardToReadSometimes, and the APIs don’t map to other standard Python ways of doing things.

I have been reacting to this by adding more sugar to GDAL’s bindings. Specifically, you will be able to call feature.items() and feature.keys() instead of manually fetching field names and values in the upcoming GDAL 1.5 release. You can also do ‘for feature in layer’ and it will iterate for you. These minor improvements should make things a little better. If anyone has more suggestions for sugar that won’t impact backwards compatibility, let me know.
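
To make the difference concrete, here is a rough sketch of the manual, C-style calls that this kind of sugar saves you from writing. `feature_items` and `iter_features` are hypothetical helpers, and the `Fake*` classes are stand-ins that only mimic the OGR method names so the sketch runs without GDAL installed; the real bindings may implement the sugar differently.

```python
def feature_items(feature):
    """Build a {field name: value} dict using the C-style accessor calls."""
    items = {}
    for i in range(feature.GetFieldCount()):
        name = feature.GetFieldDefnRef(i).GetName()
        items[name] = feature.GetField(i)
    return items


def iter_features(layer):
    """Yield each feature in turn, roughly what 'for feature in layer' saves."""
    layer.ResetReading()
    feature = layer.GetNextFeature()
    while feature is not None:
        yield feature
        feature = layer.GetNextFeature()


# --- Hypothetical stand-ins that mimic the OGR method names ---
class FakeFieldDefn:
    def __init__(self, name):
        self._name = name

    def GetName(self):
        return self._name


class FakeFeature:
    def __init__(self, **fields):
        self._names = list(fields)
        self._values = list(fields.values())

    def GetFieldCount(self):
        return len(self._names)

    def GetFieldDefnRef(self, i):
        return FakeFieldDefn(self._names[i])

    def GetField(self, i):
        return self._values[i]


class FakeLayer:
    def __init__(self, features):
        self._features = features
        self._cursor = 0

    def ResetReading(self):
        self._cursor = 0

    def GetNextFeature(self):
        if self._cursor >= len(self._features):
            return None
        feature = self._features[self._cursor]
        self._cursor += 1
        return feature


layer = FakeLayer([FakeFeature(name="a", value=1), FakeFeature(name="b", value=2)])
for feature in iter_features(layer):
    print(feature_items(feature))
```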

GDAL’s Python bindings are hard to deploy

GDAL’s deployment story in the past was miserable. Most people depended on pre-packaged Python binaries because the bindings were so difficult to build, didn’t install in a standard (setup.py/distutils) way, depended on external data (the GDAL_DATA directory), and polluted your site-packages. For GDAL 1.5, this story will be improved significantly:

  • Setuptools/distutils is now used by default to build and deploy the bindings. I will be registering the GDAL bindings with the CheeseShop, and once available, you should be able to easily build your own bindings if you have the GDAL base library (and headers) installed.
  • I am expecting to include the scripts/utilities and GDAL_DATA files *with* the Python package for distributions such as the CheeseShop one and the download.osgeo.org Windows binaries.
  • The bindings are now in the ‘osgeo’ namespace, so you can do ‘from osgeo import ogr’ instead of just ‘import ogr’. Note that the old method will continue to work for a while as we deprecate things, but we’re moving in the right direction.
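
While both import styles are around, a script that has to run against old and new installs can try the new namespace first. This is just one common pattern, sketched here with a final fallback to `None` so that it also runs on a machine without GDAL; `HAVE_OGR` is a hypothetical flag name.

```python
try:
    from osgeo import ogr      # GDAL 1.5+ namespace
except ImportError:
    try:
        import ogr             # older top-level module
    except ImportError:
        ogr = None             # GDAL bindings are not installed at all

# Hypothetical flag the rest of a script could check before using OGR
HAVE_OGR = ogr is not None
print(HAVE_OGR)
```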

The reimplementers hate SWIG.

SWIG sucks. “Simplified” my ass. SWIG is a complex ball of stuff. If you only have one language to support, SWIG is the worst possible way you could choose to write bindings. If you have five or six languages to support, SWIG is the *only* way you can choose to write bindings and hope to not have to manually maintain thousands and thousands of lines of native interface code. SWIG does indeed suck, but without it GDAL would likely only have Python bindings… not Python, Java, C#, and Perl.

Fresh GDAL

Monday, April 9th, 2007

The GDAL project has historically been very branch-averse, and this has meant that all releases were a combination of new features and bug fixes. Another immediate benefit to the GDAL project of the move to Subversion is that branching and tagging is simple… it’s just another copy, and it’s allowed us to maintain a “stable” branch and selectively backport bug fixes. MapServer has followed this release pattern for quite a while, and I’m sure GDAL users will appreciate not having the ground shift underneath them just for a bug fix.

For more details on the release, read Frank’s release message http://lists.maptools.org/pipermail/gdal-dev/2007-April/012533.html

GDAL migrates to Trac and Subversion

Friday, March 30th, 2007

Trac is the best thing since sliced bread. Er, ok, maybe not *that* great, but if there’s any software product out there that embodies that gross word called synergy, Trac is it. It brings the source, bugs, and documentation (wiki) together into a lean, mean, software production machine (again, more hyperbole, but damn I like Trac).

GDAL previously existed on infrastructure spread across at least three different servers. We had CVS hosted at MapTools, Bugzilla hosted at remotesensing.org, and documentation hosted on Frank’s personal server. Having things spread around so much meant that once it was working, it was pretty much left alone, which was ok until stuff broke (thanks MySQL). Our project’s infrastructure made it very difficult for non-developer types to participate other than through mailing list interaction.

In January, after approval from the project steering committee, we migrated from our CVS server to Subversion at http://svn.osgeo.org/gdal. This first step allowed us to easily move stuff around in our repository and has given us tons of flexibility. Last weekend, I migrated our Bugzilla to Trac at http://trac.osgeo.org/gdal. The final step of our infrastructure migration gave us stuff we’ve always wanted like a wiki, usable milestones and development roadmaps, and a great source code browser. I’m excited and hopeful that Trac will help the GDAL project scale as it continues to attract more contributors, code, and bug reports ;)

Toss the dwarf or pick up the axe

Monday, March 26th, 2007

BTW, a member of the ANSI C committee once told me that the only thing rand is used for in C code is to decide whether to pick up the axe or throw the dwarf, and if that’s true I guess “the typical libc rand” is adequate for all but the most fanatic of gamers. (Tim Peters, 21 June 1997)

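from osgeo import ogr  # plain 'import ogr' on pre-1.5 installs
import random

# Open the shapefile for update (the second argument, 1, means read/write)
ds = ogr.Open(r'c:\hobu\shapefiles\data.shp', 1)
if ds is None:
    raise IOError('could not open the shapefile for update')

layer = ds.GetLayer(0)
count = layer.GetFeatureCount()
for i in range(count):
    # Shapefile feature ids run from 0 to count - 1
    feature = layer.GetFeature(i)
    # One random offset per feature, applied to every vertex
    fudge = random.randint(0, 10000)
    geometry = feature.GetGeometryRef()
    if geometry is None:
        continue
    gcount = geometry.GetGeometryCount()

    # Walk the sub-geometries (the rings of a polygon, for example)
    for j in range(gcount):
        g = geometry.GetGeometryRef(j)
        pcount = g.GetPointCount()

        for p in range(pcount):
            x, y = g.GetX(p), g.GetY(p)
            g.SetPoint(p, x + fudge, y + fudge, 0)

    # Write the shifted geometry back to the layer
    layer.SetFeature(feature)

# Close the datasource and flush the edits to disk
ds.Destroy()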

ArcSDE developments

Sunday, March 11th, 2007

The last two months have also been all-ArcSDE, all-the-time for me, as I’ve worked on three development efforts that have focused on integrating the ESRI database abstraction technology with Open Source technologies.

PySDE

In 2002-2003, I developed a SWIG’ified wrapper to the ESRI ArcSDE C API called PySDE that wrapped up the library and allowed its use via Python. I’ll be the first to admit that it was ugly as hell, but its development was my first introduction to the Open Source GIS community, and although there were some good ideas in it, to be truly useful and sustainable the thing needed to be rewritten.

I’ve been noticing more and more posts on the ArcSDE forums about folks wanting to script and program the ArcSDE SDK with C# rather than going through ArcObjects. I suppose they’re wanting to do this for more control and performance reasons. My experience with both MapServer and GDAL and their SWIG bindings has been very instructive, and I have started rewriting PySDE to follow GDAL’s approach and layout. This means that there is good potential to support more than just Python. I think that PySDE could still fill a useful niche for folks, and I’m looking for folks who wish to support its further development. Head to http://sde.hobu.biz if you’re interested in finding out more and looking at the current state of the code.

GDAL ArcSDE Driver

Project number two related to ArcSDE was the big one — development of a full GDAL raster driver. I had posted a little bit about this last fall, and I was able to secure some funding to make it happen. GDAL now has a driver checked into subversion that supports overviews, statistics, many different data types (1, 4, 8 bit and so on), coordinate systems, and colormaps. Head to http://www.gdal.org/frmt_sde.html to find out more information about the driver. If you’re interested in testing it out (I have been looking for more testers…), grab a copy of FWTools 1.2.2, check the box for the ArcSDE plugin, and after installing, head to http://hobu.stat.iastate.edu/mapserver/build_output/gdal_SDE.zip to get a new version of the plugin that has seen a number of improvements since the FWTools release. You’ll need the ArcSDE 9.1 SDK on your PATH to be able to use it.

Development of this driver was an excellent learning experience, and I would like to thank Frank Warmerdam for his guidance while I developed. I hope that the ArcSDE raster driver proves to be very useful, especially for government types hoping to wedge Open Source GIS’s foot in the door in their machine rooms. I think it (along with MapServer’s SDE support and OGR’s SDE vector support) has the potential to be a major glob of glue for organizations not looking to abandon their editing and analysis tools while still looking to things like MapServer and MapGuide to fulfill their web story.

MapServer ArcSDE Joins

The final ArcSDE item I worked on was small, but not so straightforward. I added the ability for MapServer’s ArcSDE driver to do one-to-one joins with another in-database table. Obviously this is very specific and only useful to a very small percentage of the userbase. One area where spatial databases like PostGIS and Oracle (and maybe MySQL if it ever supported geometric algebra and OGC Simple Features operations) have an edge is that the construction of a query defines what the “layer” is as far as MapServer is concerned. ArcSDE doesn’t have that luxury, and some fire-burning hoop-jumping is required even to do a simple join.

Hobu Pro

With the upcoming move to the new city, I will also be making Hobu, Inc. a full-time endeavor. It will be an exciting and challenging step to take, but I think that now is as good a time as any to try and build it into a sustainable business. The timeframe for going full-time is still in the air a bit, as we’ve not yet closed on our house in the old city, but once we know the house is sold (buy it here!), I will give my notice at Iowa State University and head off across the state to do GIS software development and consulting full-time. The hope is that we’re moved to the new city by the middle of April, but real estate markets as they are, it’s hard to know for sure.

I’m excited to bring it full-time, and your chance to hire Hobu Pro ™ to do your proprietary <-> open source GIS software bridging development dirty work is soon approaching :)

Geographic Projection Web Services redux

Tuesday, January 9th, 2007

A couple of years ago I was playing around with Twisted Python and developed a simple web service for reprojecting data (this is no longer live). It was merely a glorified wrapper around the spatial reference tools in GDAL, but a few folks found it useful and I still receive an email about it here and there.

Lately, I’ve been playing around with Django, mostly as an effort to find a less punishing framework than Zope/Plone. I’m not much of a web person, but I’ve been following some of the recent developments like OpenLayers, GeoJSON, and, of course, the Google. One of the more common and somewhat insane things that I see as part of my daily routine hanging out on #gdal on IRC is somebody showing up asking about projection math and how they can implement it in JavaScript. The Open Source GIS world already has a world-class projections library in Proj.4, but for some reason it makes sense to a few folks to try and re-invent this wheel. I’ve always thought it was kind of silly, and any attempt to do so would probably result in a miserable development effort that retreads the misery that Gerald and Frank endured making Proj.4 in the first place.

JSON

JSON, or JavaScript Object Notation, is rapidly becoming the transport of choice for all of the AJAXian wunderkinds out there. The greatest advantage of JSON is that the parsing overhead of it for JavaScript is zero, and it can practically be eval’d in directly to the client. With these two things in mind, I set about to play around with a JSON-emitting webservice that projects geographic data (points).

The projector lives at http://projection.hobu.biz/json . It takes in these parameters:

Parameter  Value       Description
x          -93.0       The x coordinate (longitude)
y          42.0        The y coordinate (latitude)
inref      EPSG:4326   The input spatial reference of the point
outref     EPSG:26915  The output spatial reference of the point
function   myfunction  The name of the JavaScript function to wrap the output in
id         3           An id to give the data

Here’s an example that projects a point in Iowa from EPSG:4326 to EPSG:26915 (decimal degrees WGS 84 to UTM Zone 15):

http://projection.hobu.biz/json/project/?y=43&x=-93&function=myfunction&outref=EPSG%3A26915&inref=EPSG%3A4326
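
If you would rather build that request URL in a script than by hand, a minimal sketch with Python’s standard library follows. `projection_url` is a hypothetical helper name, and the sketch only builds the URL; it does not contact the service.

```python
from urllib.parse import urlencode

BASE = "http://projection.hobu.biz/json/project/"


def projection_url(x, y, inref, outref, function):
    """Build the GET URL; urlencode escapes the ':' in the EPSG codes."""
    params = {
        "y": y,
        "x": x,
        "function": function,
        "outref": outref,
        "inref": inref,
    }
    return BASE + "?" + urlencode(params)


print(projection_url(-93, 43, "EPSG:4326", "EPSG:26915", "myfunction"))
```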

Using the service with EPSG codes would probably be the most common way to work, but because the service builds on the spatial referencing library in GDAL, there are many more options. For example, we could specify our input projection in Proj.4 format instead:

Proj.4 input example

Or even get really nuts and specify our output projection in OGC WKT:

OGC WKT output example

Finally, we can output a coordinate in a spatial reference that doesn’t have an EPSG code, like USGS Albers:

USGS Albers output example

Python usage

You aren’t limited to using GET requests either (although you could probably construct one easily with urllib). For example, with jsonrpclib for Python, you can request projection of points much like you would with XMLRPC:

>>> import jsonrpclib
>>> rpc = jsonrpclib.ServerProxy('http://projection.hobu.biz/json')
>>> rpc.project('-93.0', '42.0', 'EPSG:4326', 'EPSG:26915')
{u'id': 2, u'result': {u'features': {u'center': [500000.0, 4649776.2247029999, 0.0], u'title': u'title', u'spatialCoordinates': [[500000.0, 4649776.2247029999, 0.0]], u'srs': u'EPSG:26915', u'geometryType': u'point', u'id': u'1'}}}

If you find it useful, let me know.

GDAL 1.4.0 Released

Sunday, January 7th, 2007

Head to http://www.gdal.org to get a copy. Here’s the note from Frank and relevant news items that describe some of the changes:

The GDAL development team is pleased to announce the release of
GDAL/OGR 1.4.0.  This new release includes many new features
and bug fixes since the 1.3.2 release nine months ago.  These
are described at:

http://www.gdal.org/NEWS.html

The new release source may be downloaded from:

http://www.gdal.org/dl/gdal-1.4.0.tar.gz
http://www.gdal.org/dl/gdal140.zip

Binaries corresponding to the GDAL/OGR 1.4.0 release can
be found included in the FWTools 1.1.3 release for Windows
and Linux at:

http://www.gdal.org/dl/fwtools/FWTools113.exe (windows)
http://www.gdal.org/dl/fwtools/FWTools-linux-1.1.3.tar.gz (linux)

The GDAL project has also introduced the new gdal-announce
list, hosted by OSGeo.  Those interested in just occasional
notices of GDAL/OGR project progress are encouraged to join
this mailing list.

http://lists.osgeo.org/mailman/listinfo/gdal-announce

GDAL/OGR 1.4.0 - General Changes

Perl Bindings:
  • Added doxygen based documentation.
NG Python Bindings:
  • Implemented numpy support.
CSharp Bindings:
  • Now mostly operational.
WinCE Porting:
  • CPL
  • base OGR, OSR and mitab and shape drivers.
  • GDAL, including GeoTIFF, DTED, AAIGrid drivers
  • Added test suite (gdalautotest/cpp)
Mac OSX Port:
  • Added framework support (--with-macosx-framework)

GDAL 1.4.0 - Overview Of Changes

WCS Driver:
  • New
PDS (Planetary Data Set) Driver:
  • New
ISIS (Mars Qubes) Driver:
  • New
HFA (.img) Driver:
  • Support reading ProjectionX PE strings.
  • Support producing .aux files with statistics.
  • Fix serious bugs with u1, u2 and u4 compressed data.
NITF Driver:
  • Added BLOCKA reading support.
  • Added ICORDS='D'
  • Added jpeg compression support (readonly)
  • Support multiple images as subdatasets.
  • Support CGM data (as metadata)
AIGrid Driver:
  • Use VSI*L API (large files, in memory, etc)
  • Support upper case filenames.
  • Support .clr file above coverage.
HDF4 Driver:
  • Added support for access to geolocation arrays (see RFC 4).
  • External raw raster bands supported.
PCIDSK (.pix) Driver:
  • Support METER/FEET as LOCAL_CS.
  • Fix serious byte swapping error on creation.
BMP Driver:
  • Various fixes, including 16bit combinations, and non-intel byte swapping.
GeoTIFF Driver:
  • Fixed in place update for LZW and Deflated compressed images.
JP2KAK (JPEG2000) Driver:
  • Added support for reading and writing gmljp2 headers.
  • Read xml boxes as metadata.
  • Accelerate YCbCr handling.
JP2MrSID (JPEG2000) Driver:
  • Added support for reading gmljp2 headers.
EHDR (ESRI BIL) Driver:
  • Support 1-7 bit data.
  • Added statistics support.

OGR 1.4.0 - Overview of Changes

OGR SQL:
  • RFC 6: Added support for SQL/attribute filter access to geometry, and
    style strings.
OGRSpatialReference:
  • Support for OGC SRS URNs.
  • Support for +wktext/EXTENSION stuff for preserving PROJ.4 string in WKT.
  • Added Two Point Equidistant projection.
  • Added Krovak projection.
  • Updated support files to EPSG 6.11.
OGRCoordinateTransformation:
  • Support source and destination longitude wrapping control.
OGRFeatureStyle:
  • Various extensions and improvements.
INFORMIX Driver:
  • New
KML Driver:
  • New (write only)
E00 Driver:
  • New (read only)
  • Polygon (PAL) likely not working properly.
Postgres/PostGIS Driver:
  • Updated to support new EWKB results (PostGIS 1.1?)
  • Fixed serious bug with writing SRSes.
  • Added schema support.
GML Driver:
  • Strip namespaces off field names.
  • Handle very large geometries gracefully.
ODBC Driver:
  • Added support for spatial_ref_sys table.
SDE Driver:
  • Added logic to speed things up while actually detecting layer geometry types
PGeo Driver:
  • Added support for MDB Tools ODBC driver on linux/unix.
VRT Driver:
  • Added useSpatialSubquery support.