How I select Open Source projects


Earlier today somebody posted an image on Twitter. Its title was something like “how to choose Open Source projects”. It showed a flow chart. The first decision point was: is it an Apache project? If yes, the creator suggests, don’t use the project. I looked at this image and thought: wow, what complete and utter bullshit.

Yes, there was some discussion in the past. Is Apache harmful? Or is it not?

Some people seem to forget that GitHub is a tool and the Apache Software Foundation is a community. The tool and the community both have their benefits and drawbacks, but you cannot compare them 1:1. Have you ever tried to meet your GitHub fellows? The likelihood of meeting ASF people and having a beer together is really high. An Open Source project is more than its set of tools. But this post is not about which is better, GitHub or the ASF. Much has been said about that already (and too much of it bullshit). What really bugs me is that people seem to choose Open Source projects by the tools the projects use. Here is my personal list of criteria for choosing projects.

Is the project actively developed?

Look at the project’s source code browser. When did the last commit happen? If there was one recently, did it touch only the README, or did code change? Projects which have not seen an update for more than 6 months are either very, very stable (unlikely) or nobody is interested in them anymore.
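The check above can be sketched in a few lines of Python. This is only an illustration under assumptions: the commit history is already available as a list of (date, changed files) pairs, the names `touches_code` and `is_actively_developed` are made up for this post, and "6 months" is approximated as 183 days.

```python
from datetime import datetime, timedelta

# Assumption: these file names count as "README only" changes.
README_NAMES = ("README", "README.md", "README.txt")


def touches_code(changed_files):
    """Return True if at least one changed file is not a README."""
    return any(path.rsplit("/", 1)[-1] not in README_NAMES
               for path in changed_files)


def is_actively_developed(commits, now, max_age_days=183):
    """Return True if a recent commit changed something other than the README.

    commits: iterable of (datetime, list of changed file paths).
    max_age_days: roughly 6 months, the rule of thumb from the text.
    """
    cutoff = now - timedelta(days=max_age_days)
    return any(when >= cutoff and touches_code(files)
               for when, files in commits)
```

A project whose only recent commit fixed a typo in the README would come out as inactive here, which matches the rule of thumb. It says nothing about the rare project that really is that stable.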

Are there more people developing?

One-man shows might work sometimes, but if many people work on a project, it is unlikely that urgent bugs stay unfixed. With only one person behind a project, you have to wait for fixes until the maintainer returns from vacation.

Is there any support?

A Twitter account you can follow is not really support. You may send gists around, OK, but you cannot expect in-depth responses. Even Stack Overflow is not enough. Stack Overflow is good, but sometimes you need direct access to the developers. And hey, sometimes you have a stupid question when you are starting out, and on SO you can get downvoted for it.

Is there any IP clearance?

Where does the source code come from? From the author? Really? What if the author has stolen it? The Apache Software Foundation has mechanisms which protect users here. On GitHub it is not so clear. But even there, some companies might (or might not) care about this. For example, Twitter backs My guess is this software has a clean IP. I try to use software only if I know where it comes from. This includes much software from the ASF, some from GitHub or others, like

Are the people nice?

When I use Open Source, the creators become my teammates. I usually check out the source code of most projects I use. I want to look inside. Sometimes I have questions. I don’t want to work in a team full of idiots or egomaniacs. If the people are nice, the likelihood that I use the project increases.


How is the project developed and how is quality controlled? At Apache Commons, for example, there are many people around looking at every change to each component. Sometimes lengthy discussions happen about whether we should drop JDK 1.3 support or not. Should we change the interface now or not? It would break backwards compatibility. At Commons there are various tools in place to check binary compatibility, look for bugs and so on. Continuous Integration helps to keep quality alive. Finally, a vote is required to push a release, and each of the voters does his own checks. It is not easy to get a release out at the ASF; sometimes it requires 5 or more release candidates. I have seen projects on GitHub doing the same.

Are there releases?

Some projects on GitHub do not make releases. They ask you to check out the source code and update it whenever they change things. Wow, this is a real blocker. I want tested software, I want a release number. Agile development can easily become Agile chaos.

Are people speaking about it?

If I have never heard of a piece of software and nobody speaks about it, I am careful. It means nobody else can help you if you run into problems.

Are there docs?

Docs? Complete? Readable? With examples? If yes, then good. If no, go away. If developers don’t care about their docs, they have no real interest in their community. Good code documents itself. Right. But I cannot read everything when I am under time pressure.

Is there community?

Is the project only a “we stop by and drop a fix” group or is it a real community? Communities have the benefit of group effects. If something smells nasty, they might fork. If something is wrong in a “stop by” group, the group will die.

Is there one head?

If the project has 1000 forks with many changes which are not found in the trunk, which one should I choose? One version control system is better. I don’t care if it is Mercurial and Bitbucket, Git and GitHub or SVN at the Apache Software Foundation. I am just a user in most cases, f**k, the devs spent so much time, I leave it up to them how they want to develop! It is just important that I have -one- place where I get the real, cool, official sources.


Who is running the project? Is it one vendor? Are there many vendors? Or is it a collective of individuals? I am very careful when projects are backed by only one vendor. In the case of Bootstrap (Twitter) I don’t care. The project is great, but so small that I can replace it in a couple of days if something goes wrong. When it comes to JBoss (Red Hat), I am a bit more careful. The strategy of JBoss already earned some criticism back then. In this case I would prefer Apache Geronimo, another JEE container. At the ASF people are committing, not companies. Even when there are companies who pay committers to work full-time on the projects (like with, there are always many other people who can continue, just in case. The ASF likes having a good diversity of committers on its projects.


Is the license clear and understandable? Is there a LICENSE file in the source code? I prefer the AL 2.0 license (or similar, like MIT) because I can do whatever I want with it. I do lots of Open Source work, but honestly, sometimes I need to sell my software, and for that the AL and similar licenses work best. Projects which do not name a license, or have a complicated “double license” model, or use a license like the GPL are usually not usable for me.
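The "is there a LICENSE file" question is easy to automate for a local checkout. The sketch below is only an assumption about how one might do it: the function name `find_license_file` and the list of accepted file names are made up for this post, and it looks only at the top-level directory.

```python
from pathlib import Path

# Assumption: file names commonly used for a license at the top of a checkout.
LICENSE_NAMES = {"LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING"}


def find_license_file(checkout_dir):
    """Return the first license-like file in the top level of a checkout, or None."""
    for entry in sorted(Path(checkout_dir).iterdir()):
        if entry.is_file() and entry.name in LICENSE_NAMES:
            return entry
    return None
```

Finding a file only tells you that a license was declared. Whether it is clear, understandable and usable for you is still something you have to read for yourself.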


Thanks to Simone Tripodi and Maurizio Cucchiara for their valuable feedback on this post.

Tags: #Apache #Open Source #Software