Archive for the ‘Open Source Sociology’ Category

OpenLayers Graduates OSGeo Incubation

Monday, November 19th, 2007

The OpenLayers developers and project are probably such rock stars that this fact has little consequence for them, but under my limited tutelage as their incubation mentor, OpenLayers recently graduated from the OSGeo incubation process, and its graduation was approved by the board.

Congrats guys!
Update: I see Slashgeo picked up the news.

Open Source software is not always openly developed

Thursday, October 19th, 2006

The terms “Open Source” and “Free Software” almost always describe the licensing state of a piece of software. For the software projects that they are attached to, they describe a crucial attribute: your rights as a software developer and user to read, modify, build on, and use the source code of a project for your own uses. “Open Source” and “Free Software” are also implicitly attached to the concept of an open development process, in which the software is worked on, improved, and distributed by a self-selected group of developers to the benefit of all, but this is not always the case. This article will describe and discuss the development processes of four projects that I am familiar with in the Open Source GIS domain and will describe the role that I think OSGeo should play in ensuring open development for its member projects.

Karl Fogel’s “Producing Open Source Software” is *the* book to read on this topic. It is the handbook of open development: it describes the attributes of a successful open source project that is openly developed, and it even covers sticky topics like how to work through money entering the project, forking, and governance. If you get a chance, I highly recommend you read through it, if only to become familiar with the attributes of open source projects with respect to development and with cues to help you evaluate them.

ERMapper’s ECW

During the summer of 2005, ERMapper released the software to their wavelet compression/decompression engine called ECW. While the source code is technically open, the licensing terms do not meet any of the requirements of any of the mainstream open/free software licenses, and by all accounts, the development process of ECW is anything but open.

My opinion is that ERMapper is in limbo with respect to opening ECW. They want outside contribution, especially with respect to building and deploying the software and fixing bugs, but they are concerned about their format and its possible divergence from the tons of data already out there in ECW format. The possible licenses you can assume and fall under reflect this ambiguity, and each variant of the license reflects what you as a *user* can do with the software.

As a developer, your only real option at this point is to not bother. By Fogel’s criteria, ECW has hardly any of the attributes of a successful open source project:

  • There is no common public source code repository.
  • There is no development mailing list.
  • There is no public bug tracking.
  • There is no public history of development — why the software is the way it is.

I think a project like ECW (or even MrSID) has the potential to be a highly impactful open source project that can reach beyond the geospatial domain. Wavelet compression is becoming an important topic, and participation in a project that utilizes this has the potential to catalyze and seed many other development efforts. Hopefully, ERMapper will see that their worries about format divergence will actually be mitigated by a truly open development effort. Time will tell on this one…

Refractions’ GEOS

GEOS has an interesting history. It started as a port of JTS to the C++ platform by Refractions. It is the geometry engine underneath PostGIS, and it is used by many other open source projects in the ‘C’ camp of open source GIS to provide topology and geometric algebra operations. Licensing-wise, it meets the criteria of open source software, and it is licensed under the LGPL (there is no explicit LICENSE.txt in the source code, but each source file is described as “licensed under the LGPL”).

Here are the attributes of Open Development that GEOS has:

  • A source code repository
  • A public bug tracker
  • Development list
  • Website

GEOS meets most of the criteria listed in Fogel’s book, but in my opinion it is missing a couple of key components to truly qualify as open development (disclaimer: I am a source code committer on the GEOS project). First, it really has no community-oriented governance model. The project leader is ostensibly a maintainer who is paid by Refractions to organize and push forward GEOS’ development. Major developments must be vetted by Refractions (and frequently funded through Refractions) before they can be undertaken. Releases are made to serve Refractions’ business needs (PostGIS or client-funded improvements). These attributes make GEOS risky from the perspective of an individual, independent developer because your investment in GEOS as a developer may be thwarted if it is not in line with the interests of the company that “owns” the project.

A quote from Fogel’s money chapter illustrates this point more elegantly than I can:

However, funding also brings a perception of control. If not handled carefully, money can divide a project into in-group and out-group developers. If the unpaid volunteers get the feeling that design decisions or feature additions are simply available to the highest bidder, they’ll head off to a project that seems more like a meritocracy and less like unpaid labor for someone else’s benefit. They may never complain overtly on the mailing lists. Instead, there will simply be less and less noise from external sources, as the volunteers gradually stop trying to be taken seriously. The buzz of small-scale activity will continue, in the form of bug reports and occasional small fixes. But there won’t be any large code contributions or outside participation in design discussions. People sense what’s expected of them, and live up (or down) to those expectations.

The second, and more important divergence in my opinion, is that the head of GEOS is disconnected from its body. Definition of *how* the software is supposed to work (design and architectural decisions) actually comes from JTS. Strict adherence to the “C++ port of a Java library” mindset, and GEOS’ insistence on following JTS to the letter instead of in spirit, means that GEOS can’t really take the technological advantages that C++ could provide. There’s no feedback loop through which possible improvements made in GEOS make their way back to JTS. GEOS is the way it is because JTS is that way, and JTS defines what GEOS is. As a developer, you can make only small improvements and changes, as long as the parallelism between GEOS and JTS is not broken and the changes are in line with Refractions’ interests. These restraints, in my opinion, cause internal forking efforts to happen and retard the development of GEOS into what it aspires to be: a fast, correct, C++-based open source library for geometry and topological operations.

Frank Warmerdam’s GDAL

GDAL meets most of Fogel’s criteria for an open development project (disclosure: I am a committer and PSC member on the GDAL project). It has a source code repository with history going back almost to the first day that Frank checked code into the project. It has a bug tracker with thousands of bugs (most of which are labeled as “fixed”). It has a very active user community, a documentation website, and an IRC channel.

GDAL has historically followed the Linux model, where most changes go through Frank, and lieutenants are left in charge of certain areas of the software. From an open development and governance standpoint, GDAL is currently in transition. While we all believe that Frank has a time-stopping machine hidden up in the deep Canadian woods with him that allows him to be so prolific, there are limits to what one person can do, and GDAL is increasingly approaching them.

The move to OSGeo for GDAL has brought about the emergence of a Project Steering Committee, and Frank has relinquished outright dictatorial control of the project to it. The release process is slowly moving to a more community-oriented affair, and members of the GDAL development community are stepping forward to take care of maintenance of larger areas of the software. Sweeping technological changes and addition of features now go through a proposal and committee process to ensure that everyone can be made aware of such developments. Explicit communication (and the implicit history this creates) about these things ensures that developers on the project are roughly moving in the same direction.

UMN’s MapServer

In my opinion, MapServer meets Fogel’s criteria for an open development project (disclosure: I am also a committer and PSC member on the MapServer project). Unlike the other three projects that I described, which are targeted almost exclusively toward other application developers, MapServer’s audience includes both user-oriented web developers and some standard GIS-type application developers. This diversity manifests itself throughout the project, most notably in the governance structure and developer community, both of which I have already described in a previous post. It also contributes to MapServer’s creeping featuritis that is both its blessing and its curse.

Frank’s MS RFC-1 has become one of the prominent models for project governance in OSGeo, and many of the member projects have copied or modified it to fit their culture and needs. MapServer has a ring (or two) of businesses that use it to their competitive advantage. Funding opportunities that arise from this ring feed back into the software in the form of new features, general maintenance, documentation, and maillist support. Many of the businesses in the ring are represented in the PSC of MapServer. The project soldiers on, despite individual developers coming and going, and it still sees significant growth release over release as measured by software downloads, maillist posts, and bug submissions.

I think that MapServer is a very functional, but slightly imperfect model of open development. There are things described in Fogel’s book we aren’t doing yet, or aren’t so good at, but we’re open to suggestions and highly motivated individuals can have a lot of impact on the project. Technologically, the MapServer project is fairly conservative, and it’s not prone to withstand or put up with large refactorings. Its governance body is not elected by the community in an effort to sidestep the sticky issue of determining *who* the community is and if they have a right to vote on such things. Its diverse audience and diverse developer base pull the project in many directions at once, and its focus, as defined by “MapServer is not a GIS!” leaves an awful lot of room to define what it actually is.

How OSGeo can play a role in ensuring Open Development

Hop on the bus, Gus

An important part of a project’s migration to OSGeo and its incubation is the codification of its governance culture. One goal of this governance is to increase what Sean calls the “bus number,” or the number of developers (or entities) a bus would need to run over to kill the project. I think it is important to make a fine distinction between the bus number of the project and the bus number of the software because they might not always be the same. OSGeo should aspire to ensure that the bus number of its member projects is much greater than one entity or person. In fact, I think this should be a requirement that must be met for incubation: if a project can’t satisfy it, it should be clear that there isn’t enough community support for the project to keep it viable. The problem, of course, is that actually measuring the bus number of a project is a rather messy endeavor :)
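To make the measurement problem concrete, here is a rough sketch of one possible proxy, not an established metric: count how many of the most active committers it takes to account for more than half of all commits. The function name, the 50% threshold, and the sample author names are all assumptions for illustration. It only looks at commit history, so at best it approximates the bus number of the software, not of the project as a whole.

```python
from collections import Counter


def bus_number(commit_authors, threshold=0.5):
    """Return the smallest number of top committers whose combined
    commits exceed `threshold` (a fraction) of all commits.

    `commit_authors` is a list with one author name per commit.
    This is a hypothetical proxy, not a standard definition.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0

    covered = 0
    # Walk committers from most to least active.
    for rank, (_author, n_commits) in enumerate(counts.most_common(), start=1):
        covered += n_commits
        if covered / total > threshold:
            return rank
    # Threshold was never exceeded (e.g. threshold >= 1.0).
    return len(counts)


# Made-up commit logs: one dominant committer vs. an even spread.
lopsided = ["dev_a"] * 80 + ["dev_b"] * 10 + ["dev_c"] * 10
balanced = ["dev_a", "dev_b", "dev_c", "dev_d"] * 25

print(bus_number(lopsided))
print(bus_number(balanced))
```

A commit count says nothing about who handles releases, answers the maillist, or pays for hosting, which is part of why the real number is so messy to pin down.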

Mo’ money, mo’ problems

Money coming into a project, a topic to which Fogel devotes an entire chapter, can create significant tension within a project. In my opinion, an overriding incentive for its initial member projects to form OSGeo was money: direct support for the projects through some kind of collective pass-through funding, money for visibility and marketing and the leverage that an organization like OSGeo can provide, and money saved by member projects sharing infrastructure that is common to all and commonly replicated. OSGeo must be clear, explicit, and careful about how money is distributed (if there is any to distribute). Perception is reality in this instance, and any shady stuff will probably be met with significant backlash.

Insight, foresight, more sight

Another role of OSGeo is to provide infrastructure to mitigate disputes and provide a neutral home for the project that is agnostic with respect to a specific entity. OSGeo could act as the grown-up in the case of intra-project disputes like Apache has done in the past. It aspires to act as the facilitator, in hope of congealing its member projects into a mass of non-overlapping, useful, and integrated technology. Finally, it can act as an educator, introducing the technology of its member projects and providing buoyancy in a way that a single project by itself could not do.

Conclusion

Some final thoughts for those of you who’ve actually read this far. Open source projects, even in the GIS domain, vary widely with respect to their development practices and how open or closed they are. As an open source developer, volunteer, and user, I’m attracted to projects that are openly developed. I will not invest much effort in something where my stake, in effort, is not recognized and respected. Finally, a significant measure of OSGeo’s success or non-success in my mind is how good of a job it does at fostering and ensuring open development of its member projects.

Update:

Please head to SlashGeo if you’re interested in commenting on this article. Still haven’t found a good solution to comment spam for my weblog yet.

Open Source Software Support

Tuesday, July 11th, 2006

Adena linked to an article this morning that highlighted a web mapping application that was developed by ZedX for the USDA to track Soybean Rust infestation. Soybean Rust is a fungal disease that has had a significant economic impact in South America, and worries of it disrupting US soybean production have prompted many monitoring activities and efforts. Web maps are an excellent way to communicate these monitoring efforts. The article stated that $2.5 million were provided to fund this specific project, and although I’m positive all of that funding didn’t go to ZedX for the development of the website, a significant portion surely did.

The article is typical press release-like fanfare except for the interesting bit about MapServer:


“The soybean rust system is an open-source, Linux system. Other than the
open-source code and the MapServer GIS mapping applications, ZedX has
written the code, Russo said.”

A search through the MapServer maillist archives (by the way, the MapServer project really needs to fix our archives, they’re atrocious) showed a few posts by ZedX folks asking normal technical questions about compilation or usage of software features. I couldn’t find or I can’t recall ZedX funding any specific MapServer enhancements or contributing developer/documenter time (if anyone knows of any activities that were supported by ZedX, please let me know and I’ll update this article).

I think it’s great that a company like ZedX can use MapServer to provide them with a significant competitive advantage. I also think that stating that MapServer is their secret sauce is great visibility, and articles like that contribute to the project in a roundabout way. I just find it disheartening that for a 2.5 million dollar project, nothing (no direct funding, contributed time, or contributed documentation) found its way back to the software that was an integral component in making it tick.

$5000 to certain software companies gives you a license to run the software and maybe a couple of phone calls to ask why it doesn’t work. $5000 to an open source project like MapServer gives you specific rendering improvements you might like, possibly a data driver to read specific data you need (that the other software tool can use either), streamlining of not-so-fun-to-develop-on components of the software, or even completely new features that the software didn’t have before. $5000 of contributed time can get you documentation of poorly documented features, fund the addition of your own specific features, or cover anything else you might need. These contributions not only benefit you the contributor, but also benefit anyone using the software and directly support the developer(s) who work on the project, perpetuating development and contributing to its vitality.

It is my hope that OSGeo will be able to provide a clearinghouse for contributions to its member projects and solve the issue of “great, I have some money/time/resources to contribute, who do I give it to?” Also, the ability to pool contributions together for larger efforts is something that is sorely needed. Those efforts are just getting off the ground though, and time will tell if that approach will be any more successful than individual-to-individual or individual-to-project contributions.

In a lot of areas, open source software is about leverage… leveraging collective knowledge, leveraging resources, and leveraging effort. Contributing to an open source project that is an integral component of your development strategy gives you and the project leverage — all for frequently less than the cost of a seat of some commercial tools.

OSGeo Communication Overload

Thursday, March 9th, 2006

After our initial Feb 4th meeting, things have been happening quickly. In fact, I think one of our biggest hurdles is that we are running up against Brooks’s Law, i.e., communication costs increase with the square of the number of participants. Many folks are talking about many things all at once. The amount of brain and network bandwidth required to follow everything has been pretty daunting. A daily inbox with 100+ messages related to OSGeo is a significant commitment just by itself.
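A quick back-of-the-envelope sketch of that quadratic growth, assuming the simplest possible model in which every participant may need to talk to every other participant, so that n people have n(n-1)/2 possible pairwise channels. The function name and group sizes are made up for illustration, and real communication on a maillist is of course not strictly pairwise.

```python
def pairwise_channels(participants: int) -> int:
    """Number of distinct two-person communication channels
    among `participants` people: n * (n - 1) / 2."""
    if participants < 0:
        raise ValueError("participants must be non-negative")
    return participants * (participants - 1) // 2


# Doubling the group roughly quadruples the number of channels.
for n in (5, 10, 20, 40):
    print(n, pairwise_channels(n))
```

Under this model, going from 10 to 40 participants takes the group from 45 possible channels to 780, which goes some way toward explaining the inbox.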

One concern that I have is whether OSGeo, organizationally, has the potential to wither under the weight of the communication load. A corporation has the nice advantage of being hierarchical, for better or worse. OSGeo is a “by consensus” organization, but democracy is expensive from a communication perspective. In the initial bootstrapping, lots of communication cost is placed on the board and many consensus decisions must be made by it.

Getting there (wherever we define “there” to be) is going to be challenging and a lot of work for the board. The board is working to delegate to various committees within OSGeo, but geometric growth of committees only increases the number of things clamoring for their attention. Add in many projects individually chirping for attention, and things can start to get out of hand.

It is my hope that after the mad rush, things start to settle down into a less hectic mode. Part of this rush is finding out what organizational structure works the best for us, as we are attempting something that hasn’t really been done before (take a bunch of open source software projects under different licenses and bring them under one tent). We’re learning though. Openly.

OSGeo comes into existence

Sunday, February 5th, 2006

Many are calling yesterday’s meeting (and subsequent decisions that were made) a watershed moment for the Open Source GIS software community. I for the most part agree with that sentiment. I am excited that the initial board members that were chosen represent a variety of software projects and a variety of personalities. I think we were able to hone down the issues that people had mostly discussed via email on the maillist to things we could all agree on.

I must say that Autodesk was a gracious and proactive host. Their involvement throughout the last few months (including the ill-fated MapServer foundation/naming announcement) has been admirable. They have listened to the community, acted appropriately in both their interest and the interest of the community, and shown that they are willing to work with what already exists in the Open Source GIS ecosystem. Missteps that the initial group made (mostly the fact that foundation stuff was discussed and proposed under an NDA that was only to cover the fact that Autodesk was releasing a bunch of software under an Open Source license) have not only been recoverable; they have been instructive. The formation of the Open Source Geospatial Foundation is a result of that learning process.

I think that most of the uproar that followed the initial announcement was related to naming and the idea that the MapServer project was exclusively involved in the formation of a foundation. Now that we have gone beyond that, there are many opportunities for more structured cooperation and collaboration between the various Open Source projects that elect to join the foundation. The common ground we articulated and the organization we have proposed will allow the projects under the OSGeo umbrella to deal with issues that a project by itself could only handle with much difficulty: outreach, common and low-overhead infrastructure, a potential entity to flow through aggregate funding to a project, etc.

Another thing that I find exciting is the possibility that by being in the foundation, software projects are structurally encouraged to collaborate. I’ve reinvented my fair share of wheels, and many of the projects that are in OSGeo have done so also. That the projects in the foundation will be overtly encouraged to collaborate is in my opinion a great thing. “Building the stack” as was said yesterday.

The board members have an unenviable amount of work to do to bootstrap OSGeo into existence, but we came away from the meeting with the principles that the board can implement. All of these developments will mean a brighter and more organized future for Open Source GIS.

MapServer Foundation outlook in 2006

Tuesday, December 27th, 2005

Sean posted some wishes he has for the foundation discussions in 2006. It is unfortunate that the re-engaging of the foundation discussions has been happening over the holiday season, so the opportunity for many folks to participate at this point isn’t there. But when people come back, the discussions will probably continue in full force.

Re Sean’s wishes:

  1. My head is pulled out. I made a point in my last post about the foundation that it was cooperative vs antagonistic competition. They will compete regardless.
  2. I agree that the name churn needs to be stopped. Multiple polls that look to tease things apart to see if there is wiggle room demonstrate that there isn’t any.
  3. Financial disclosure. Financial contacts or potential financial contacts between members of the Open Letter should be disclosed in my opinion. Other than being offered an airplane ticket and a hotel room by Autodesk to the bootstrapping meeting that hasn’t happened yet, I have no financial connections.
  4. Software focus. Sean makes the point that maybe a foundation should worry about the commonalities between Tux and MapServer, i.e., GDAL, GD, PROJ.4, GEOS, etc. Maybe, but those software tools do not have large communities around them - they are more developer-oriented and they aren’t so end-user oriented. If the foundation were to be solely software- and development-focused, that might make a lot of sense. The proposals and sentiment for the MapServer foundation seem to be marketing, branding, and community oriented, however.

The MapServer Foundation

Here are some things that I think the MapServer foundation should concern itself with:

  1. A common home for the software. This includes website, user documentation, download, and development support infrastructure (CVS, bugzilla, etc). This stuff is a sunk cost for whatever organization hosts them and the entire community benefits directly from it - the entire community should have an opportunity to step up and support it.
  2. Either the foundation is developer oriented and attempts to provide legal shielding for developers à la the Apache Foundation, or the foundation is marketing and branding oriented. Can a software foundation do both? Maybe, but what I need or want from a foundation as a developer is probably quite different than what someone who sells MapServer services needs.

The MapServer Project

Here are some things that I think the MapServer project needs to do to get its house in order:

  1. A project steering committee needs to be created, whether it is derived from parts of the MTSC with others being brought in or is a completely separate entity that is elected. As a member of the MTSC, I felt very uncomfortable representing anyone’s interests but my own. We were in non-technical territory, and the community never explicitly told me that I represented them. We need a decision-making body that can participate and represent MapServer in foundation bootstrapping.
  2. The project needs to articulate what it needs from a software foundation. What will it take for UMN to see that putting the software (and copyright) in a software foundation is the best thing to do? What do we need? What do we want? What are the deal breakers?
  3. By when? It isn’t fair to Autodesk for the MapServer project to flounder around. How do we make a decision to join and when does it need to be done? “Sometime in the future” is not an acceptable answer.

The value of open source isn’t the software

Saturday, November 19th, 2005

Ok, so maybe this isn’t quite true throughout the life span of an open source project. In the beginning, a blob of code that provides a unix-like kernel on an x86 architecture or glues together cartographic map rendering from a couple of other libraries can itself be pretty valuable. At some point, however, the thing that is most valuable about an open source project isn’t the software… it’s the community around the software. A quick browse of sourceforge or freshmeat will give you all kinds of examples of failed software projects. From a technical standpoint, they may be fantastic and maybe even widely used, but if you are measuring success by the size and vitality of the community around them, from a sociological and network-effect standpoint, they’ve failed. Why didn’t they take off?

There could be many reasons. Maybe the initial developer of the software isn’t friendly to newbies and the project can’t grow users very fast. Maybe the software is too specific and not widely applicable. Maybe the code is written in such a way that it is not extensible or just plain miserable to work with. Maybe the ego of the initial developer(s) gets in the way of allowing others to feel that they have as much stake in the code. This list could go on for a long time.

lusers and users

Users of open source software projects are what provide most of the value to a project. Users’ growth in technical skills and abilities is not static. A successful project fosters this growth. Familiarity with something takes a long time. A project that does not make a user feel that the investment to become familiar with the software is worthwhile will grow very slowly.

Users in a software project go through various larval stages on their way to contributing source code. A percentage of current users become power users, and start writing documentation or helping out on the maillist. Some of them start contributing bug fixes, and eventually graduate to writing code.

Lowly users still provide plenty of value to the project. How they deploy the tools, the money they save, and the kudos they get all contribute egoboo to the project. They bring more people to your project by making folks more aware of what it is capable of. Instead of hype, they contribute credibility by attaching their name to your project’s name in a wider world that is much larger than the frequently myopic problem domain that your open source software project lives in.

How to kill your open source project

1. Change your name

Identity is one of the most important things in an open source software project. The name need not describe what the software does, but it better not change because it disrupts the awareness of a project. Take care when initially naming your project so that it has a distinctive one.

For example, I know what NetBeans is, even though I’m not a Java developer. I have gained this awareness through being connected to the software development community through sites like Slashdot and reading software developer’s weblogs. If NetBeans were to change their name to MungBeans, that cycle would start all over again and it may take a couple of years before I know what MungBeans is about.

2. “Rewrite” your codebase

A rewrite is essentially a fork of your own software. It disrupts the familiarity and time investment stuff I was talking about above. You had better have a damn good reason to completely rewrite your code. Doing so gives your users and development community an opportunity to look elsewhere or even create a fork of the original software so as to continue the momentum that they had personally invested.

3. Fork your maillist

Separate users and developers lists for a software project are a good thing. One is typically high-traffic while the other often only contains software development issues pertinent to the project. But once the signal-to-noise ratio of the user list gets too low or even just the traffic gets high, there is often a temptation to fork the list. Don’t. Doing so creates confusion in the community, disrupts people’s awareness of happenings, and short-circuits the ability of people to gain tacit familiarity with problems and issues they might encounter.

A corollary to this is to bury your maillist archives behind a firewall. It increases the frequency of FAQs because it disrupts people’s ability to find out information for themselves. It also slows down the user growth of a project by eliminating the serendipitous finding of your project by people looking for topical information. Your maillist should be visible to the search engine that counts…for right now, this is Google.

4. Crassly attempt to monetize your community

So you’ve created all of this value right? Let’s cash in on it! Not so fast buddy. Extracting money out of your open source software project is a tricky thing. End up getting your users to resent you and you’ll have nothing to extract.

5. Mix business and pleasure

Many open source developers write code for the pure enjoyment of it. Scratching their itch and all that. Some are in various stages of turning that hobby into a business, either by getting commercial entities to support improvements to their codebase or by selling services based on the tools they’ve developed. If business interests start overtly pushing much of the direction and motivation for a project, your casual users will likely leave for something else. This could happen for a number of reasons, but the foremost one is that individuals doing something for fun and for free can resent their efforts putting coin in some business’s pocket.

Conclusion

So what does all this mean? Remember that the community around your open source project provides most of the value. The software is what brought people in, but the people make the project. Don’t screw it up by thinking you can control them or by thinking that what you say is the final word. You get what you give.