Friday, April 19, 2013

Automated software testing is a major investment

A passing test suite attests to the quality of the code being tested. Just as importantly, it gives the developer confidence to evolve the code, i.e. to refactor or add new features, without fear that the changes will break existing functionality. Without the safety net provided by the test suite, evolving software can be a perilous and nerve-wracking activity which few developers survive intact.

In a nutshell, a good test suite is critical to the success of a project. You might think that since test code does not directly impact end-users, it should be cheaper to write and maintain. On the contrary, if one is serious about automated testing, then test suites do not come cheap.

For example, in the logback project, the size of test code roughly matches the size of the code being tested. Thus, one might estimate the cost of test code to be equivalent to that of the code being tested. In my experience, this is unfortunately not true. It turns out that reliable test code is surprisingly hard to write, especially for code with many external dependencies. So, in reality we spend more time on the test code. This is a price we are willing to pay in order to guarantee the quality of our product.
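To illustrate why code with external dependencies is hard to test reliably, here is a minimal sketch (not taken from logback; the `Clock` interface and `RateLimiter` class are made-up examples): the external dependency is hidden behind an interface so the test can substitute a deterministic fake instead of relying on the real system clock.

```java
// Hypothetical example: hiding an external dependency (the system clock)
// behind an interface so tests can control it deterministically.
interface Clock {
    long currentTimeMillis();
}

class RateLimiter {
    private final Clock clock;
    private long lastEventTime = -1_000_000L;

    RateLimiter(Clock clock) { this.clock = clock; }

    // Allow at most one event per second.
    boolean tryAcquire() {
        long now = clock.currentTimeMillis();
        if (now - lastEventTime >= 1000) {
            lastEventTime = now;
            return true;
        }
        return false;
    }
}

public class RateLimiterTest {
    public static void main(String[] args) {
        // A fake clock we advance by hand; the test never sleeps or flakes.
        final long[] fakeTime = {0L};
        Clock fakeClock = () -> fakeTime[0];

        RateLimiter limiter = new RateLimiter(fakeClock);
        assert limiter.tryAcquire();   // first event passes
        assert !limiter.tryAcquire();  // same instant: rejected
        fakeTime[0] = 1500;            // advance the fake clock
        assert limiter.tryAcquire();   // more than a second later: passes
        System.out.println("ok");
    }
}
```

Writing and maintaining such seams and fakes for every external dependency is precisely where much of the extra cost of test code comes from.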

Given the above, it seems that consultants insisting that their clients attain 100% code-coverage are either misleading their flock or possess test-writing skills that many developers, including the author, do not.

Tuesday, November 29, 2011

Confusing intent and outcome

Mikeal Rogers recently published a blog entry entitled "Apache considered harmful". He writes:
The open source section of my brain was seeded and curated by Ted Leung, long time ASF member, and it is this ethos, Community > Code, that I've dedicated a significant portion of my life to. It is this ethos that has led me to the hard conclusion that as the world has changed Apache has become a net negative for its projects.

Mikeal raises a fundamental question: do the costs of developing software at the Apache Software Foundation, or any other foundation for that matter, outweigh the benefits? And what exactly are those costs and benefits? Let's start with the costs.

The costs

The principal cost of developing software at Apache is loss of autonomy and freedom. The foundation owns your software. As far as your software development efforts are concerned, the foundation becomes your pointy-haired and clueless boss. In his very informative blog entry, Ben Collins-Sussman writes:
The ASF has a great set of cultural norms that it pushes on its communities via political means and lightweight processes
The reader should not be fooled by the seemingly innocuous reference to cultural norms, political means and lightweight processes. After all, the official name for North Korea is "The Democratic People's Republic of Korea" and not the more accurate "The Totalitarian Stalinist Dictatorship of North Korea". In the same vein, DDT was qualified as "harmless" in the 1950's, as are its considerably more toxic equivalents of today. In the world of IT, everybody describes their software product as lightweight and simple. Nobody advertises their product as an over-engineered piece of crap.

Alright, coming back to Apache: once you join, you will be expected to obey rules disseminated throughout the organization, many of which are open to interpretation. Rules are not necessarily bad; any cohesive group must have a set of rules guiding its actions. It just happens that many Apache rules are forged by discussion rather than by careful experimentation. Whether free discussion can conquer false opinion is a philosophical question. For a much deeper discussion of this topic, do yourself a favor and read Democracy is not a truth machine. Again, forging rules by discussion is not bad per se. It becomes problematic only when the group is formed by electing similarly minded individuals (co-opting). As time goes by, fewer and fewer opposing voices are heard, since dissenters leave quietly and those who remain tend to agree with the prevailing dogma. As a corollary, co-opting favors convergence of shared vision and stricter adherence to the group's rules over time. Whether such convergence is "desirable" depends on the "desirability" of the rules. I should emphasize that group convergence is not black-and-white but a matter of degree, ranging from loose counter-cultural groups at one end to cults at the other extreme. It would be a great exaggeration to qualify Apache as a cult.

More concretely, Apache projects cannot designate a formal project leader. Every committer has strictly equal rights independent of past or future contributions. This rule enjoys very strong support within the ASF. It is intended to foster consensus building and collaboration ensuring projects' long term sustainability. Worthy goals indeed! However, one should not confuse intent with outcome. Note that strict committer equality contradicts the notion of meritocracy, one of the core tenets of the ASF.

As I have argued in the past, rejection of project leadership in addition to lack of fair conflict resolution mechanisms constitute favorable terrain for endless arguments. In case of conflicts arising during the lifetime of an Apache project, participants often resort to filibustering. In an all-volunteer project, where most of the participants are only lightly invested in the project, project disruption is almost cost-free to the disruptor. In nature, one of the opponents in a struggle eventually backs off because he or she has something to lose, typically the risk of bodily harm. At Apache, the only cost incurred by the disruptor is the time taken by the disruption. In a highly-competitive industry such as IT, wasteful expenditure of time may imply software stagnation and eventually obsolescence.

When all else fails, a project can ask for the Apache Board's intervention to settle an argument. In the examples I've witnessed, lacking the proper context, the board will tend to intervene with the finesse of an elephant in a china-shop.

As another example, consider the Apache Board position on author tags. The board minutes from February 2004 state:
  - author tags are officially discouraged. these create difficulties 
   in establishing the proper ownership and the protection of our 
   committers. There are other social issues dealing with collaborative 
   development, but the Board is concerned about the legal ramifications
   around the use of author tags
I fail to see the legal ramifications the Board has in mind. However, given copyright law in continental Europe, and in particular in France, removing author tags infringes upon the author's moral rights, which are perpetual and inalienable. In short, I would argue that removing author tags is illegal in continental Europe. The board statement also mentions social issues in relation to collaborative development. The idea here is that developers should not mark code that they contribute as their own, to avoid territorial conflicts. Thus, the board indirectly discourages developers from taking pride in their code. Moreover, it removes one of the currencies in which open source developers are compensated: increased prestige.
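For concreteness, an author tag is nothing more than a Javadoc `@author` line naming who wrote a class; a trivial, made-up example:

```java
/**
 * Stand-alone illustration of an author tag. The @author Javadoc tag
 * records who wrote the class; this is the tag the Board discourages.
 * (The class and the name are made up for illustration.)
 *
 * @author Jane Doe
 */
public class NumberUtils {
    /** Returns true if n is even. */
    public static boolean isEven(int n) {
        return n % 2 == 0;
    }
}
```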

The benefits

As far as I can tell, Eclipse, Apache and the FSF all give unconvincing explanations about the benefits they provide. The main benefit of joining any organization is being part of said organization. Seriously, tautology aside, once you join a FOSS foundation, you automatically assume part of the aura of that foundation and have better access to its other members. Mike Milinkovich of Eclipse-fame lists the following benefits of FOSS foundations:
  1. IP management
  2. Predictability
  3. Branding and community
  4. Industry collaboration

Regarding IP management, it is not too difficult to determine whether the dependencies of a software project are compatible with your chosen licensing policy. Eclipse takes the additional step of contacting the authors of projects to ascertain ownership. Who cares?

Regarding branding, it is by now apparent that Apache, Eclipse and the FSF all have very good as well as utterly useless projects. The same is true for software projects outside any FOSS foundation. While software projects in a foundation are perceived to be on average of better quality, many of the top 1% of FOSS projects are outside the control of any foundation. I suspect that given the overhead involved, being part of a foundation ultimately hurts quality. In any case, FOSS foundations do not oversee product quality, they only ensure that projects are alive and follow IP-related guidelines. (Apache projects in addition have to prove diversity before graduating from the incubator.) Good software will stand on its own and does not need any additional branding. If your software responds to an unmet need, users will come in their numbers.

As for industry collaboration, depending on project size, belonging to a foundation may truly help collaboration among corporations. I also think that collaboration can be achieved outside any foundation for large as well as small organizations.

In the early days, joining Apache felt like joining a friendly and informal group of capable geeks. This aspect of Apache is not mentioned often enough.

Recommendations

Thus far, FOSS foundations have been quite successful in attracting software projects. In fact, the Apache Software Foundation has been so successful that it can now brag about not having to advertise to attract newcomers. The prevailing circumstances in 1999 allowed Apache to attract key Java software projects such as Ant, log4j, Struts and Tomcat. From then on, the immense aura of these early projects allowed Apache to be selective. However, by instituting rules largely tangential to software quality, the ASF is squandering the good-will capital it had accrued a decade ago.

Bertrand Delacretaz describes Apache as the turbocharged Diesel of Open Source and Matt Asay compares Apache to IBM. My comparison is less flattering. I consider Apache analogous to a carnivorous plant trapping and slowly digesting insects. As unsuspecting insects find out, it is easy to set foot inside the rolled leaves of a pitcher plant, but escaping unscathed is a different matter. Donating software to Apache is a one-way street: you can enter, but you can't leave without being subjected to the digestive juices of Apache dogma. I've been told that Eclipse is better managed and less dogmatic. I don't buy that. I believe that the disadvantages described here apply to software donations made to Eclipse as well as other FOSS foundations. You can enter a FOSS foundation but you cannot leave, much like an insect on the surface of a carnivorous plant.

For the reasons presented here, I urge you to not fall into the trap of complacency. Setting up and maintaining the infrastructure required for a software project takes time, but is becoming cheaper and easier by the year. If your software meets a need, then you don't need the brand-value associated with joining a FOSS foundation. It takes time and effort but by being consistently open, friendly and respectful you can attract long-term contributors to your open source project.

In short, if you are thinking of donating software to and joining a FOSS foundation but have not actually done so, don't; joining is not worth the trouble. If you are forced to join a FOSS foundation, well, by definition you don't have a choice.

If you have already joined a foundation but would like to get out, try convincing the other project members to leave. If they share your point of view, then get out and liberate your software. If a sizable majority of the project members oppose departure, then there are two cases to consider. The software project you are trying to liberate is 1) marginally to moderately popular or 2) highly popular. In the first case, you can leave and start over to create a better product. Your users, few in this case, will follow. In the second case, unless you are willing to put all your energy into the forked project and continue to do so for a very long time, I am afraid you are stuck with the foundation. It might come as a surprise, but your users have a radically different agenda than yours. They will consistently value stability over your purportedly innovative but unproven fork. Thus, you need to swallow your pride and learn to live with the foundation's rules, regardless of how inane you find them. Patience is your best friend. With enough patience, most arguments will eventually reach a reasonable settlement.

Now, if you have joined a FOSS foundation and are thrilled with the outcome, then this post does not apply to you - until it does. The risk of the management/board raining down its wrath upon your project always exists and is non-negligible. After all, it is the management/board's prerogative to shut down projects it disapproves of.

Of course, I am making a number of assumptions which may be totally incorrect in some relevant context. Studies indicate that former cult members tend to shade the truth and blow minor incidents out of proportion. My main point, however, that there is an important distinction between intent and outcome, is hopefully as valid as it is generally applicable. The organizational ethos of FOSS foundations does not correspond to reality. You don't want to obey rules imposed left and right; writing FOSS software should be first and foremost fun! Why would you want to put up with masters if you do not have to?

Thursday, November 24, 2011

On fear of insects and insecticides

On a recent flight back home from a tropical island, the captain announced that a stewardess was about to spray the cabin with insecticide. "French authorities require that the cabin be sprayed with insecticide on departure. The product is harmless to humans and was approved by the appropriate authorities." A few seconds after the announcement, a stewardess gleefully walked along the length of the airplane, holding two cans over her shoulder and dispensing a mist of particles.

No one seemed to be disturbed by this scene. By total coincidence, I had just begun reading chapter 7, "Needless Havoc", in Rachel Carson's Silent Spring. In this chapter, Carson describes the drastic and insanely aggressive steps taken by midwestern states (USA) to stem the westward spread of the Japanese beetle.

After a few moments of hesitation, I worked up the courage to talk to the stewardess about the product that was just sprayed around the cabin.

Me: "What is the product that you just sprayed?"
Stewardess: "it's insecticide, you know, to kill insects".
Me: "Right, but what type of insecticide?"
Stewardess: "Why? Are you allergic?"
Me: "No, I mean what kind of chemical compound?"
Stewardess: "Oh, I don't know. Let me ask the head steward".
Stewardess: "He said that I should show you the can. Is that OK?"
Me: "Yes, that is fine. I am seated in 35B".


A few moments later she returned and handed me the can. Its back label listed at least twenty chemical compounds by reference number, meaningless and impossible to memorize. However, the back label warned against mixing the product with water and against oral absorption. It further stated that the product would "kill all insects within at most 6 minutes". How can a product be potent enough to kill insects so quickly and yet be harmless to humans? Is arthropod physiology so different from our own? With heightened interest, I proceeded to study the front label. It had an illustration of an airplane under which the words "AMS 1450A - Product of France" were written. The stewardess returned and picked up the aerosol can. She left holding a plate of food in one hand (presumably to be served to a customer momentarily) and the can in the other. A little googling in the comfort of my home several hours later yielded the exact image of the front label. It is reproduced on the left side.

The term "AMS 1450A" designates an SAE specification in the Aerospace Material Specifications category. There are several AMS 1450A compliant products. These seem to typically contain 2% permethrin and 2% d-phenothrin.

Both substances belong to the family of synthetic chemicals called pyrethroids: neurotoxins which disrupt sodium transport in nerve cells. These toxins hold the sodium channels in their open state, so that the nerves cannot de-excite, leaving the organism paralyzed.

We humans and other mammals are less vulnerable than invertebrates to pyrethroids because of our higher body temperatures, our less porous skin (as compared to exoskeletons) and, most importantly, the detoxifying capabilities of our livers. Both permethrin and d-phenothrin are lethal to insects and to aquatic lifeforms. The lethal concentration for the rainbow trout is 1.4 parts per billion. If that was not alarming enough, pyrethroids bio-concentrate in fish, and the bio-concentration factor increases dramatically (by at least 2800) when combined with other chemicals used in insecticides, e.g. piperonyl butoxide.

Here is an excerpt from Silent Spring explaining the meaning of bio-concentration, also known as bioaccumulation.

Can we suppose that poisons we introduce into water will not also enter into these cycles of nature?

The answer is to be found in the amazing history of Clear Lake, California. Clear Lake lies in mountainous country some 90 miles north of San Francisco and has long been popular with anglers. The name is inappropriate, for actually it is a rather turbid lake because of the soft black ooze that covers its shallow bottom. Unfortunately for the fishermen and the resort dwellers on its shores, its waters have provided an ideal habitat for a small gnat, Chaoborus astictopus. Although closely related to mosquitoes, the gnat is not a bloodsucker and probably does not feed at all as an adult. However, human beings who shared its habitat found it annoying because of its sheer numbers. Efforts were made to control it but they were largely fruitless until, in the late 1940s, the chlorinated hydrocarbon insecticides offered new weapons. The chemical chosen for a fresh attack was DDD, a close relative of DDT but apparently offering fewer threats to fish life. The new control measures undertaken in 1949 were carefully planned and few people would have supposed any harm could result. The lake was surveyed, its volume determined, and the insecticide applied in such great dilution that for every part of chemical there would be 70 million parts of water. Control of the gnats was at first good, but by 1954 the treatment had to be repeated, this time at the rate of 1 part of insecticide in 50 million parts of water. The destruction of the gnats was thought to be virtually complete.

The following winter months brought the first intimation that other life was affected: the western grebes on the lake began to die, and soon more than a hundred of them were reported dead. At Clear Lake the western grebe is a breeding bird and also a winter visitant, attracted by the abundant fish of the lake. It is a bird of spectacular appearance and beguiling habits, building its floating nests in shallow lakes of western United States and Canada. It is called the ‘swan grebe’ with reason, for it glides with scarcely a ripple across the lake surface, the body riding low, white neck and shining black head held high. The newly hatched chick is clothed in soft gray down; in only a few hours it takes to the water and rides on the back of the father or mother, nestled under the parental wing coverts.

Following a third assault on the ever-resilient gnat population, in 1957, more grebes died. As had been true in 1954, no evidence of infectious disease could be discovered on examination of the dead birds. But when someone thought to analyze the fatty tissues of the grebes, they were found to be loaded with DDD in the extraordinary concentration of 1600 parts per million. The maximum concentration applied to the water was 1/50 part per million. How could the chemical have built up to such prodigious levels in the grebes? These birds, of course, are fish eaters. When the fish of Clear Lake also were analyzed the picture began to take form—the poison being picked up by the smallest organisms, concentrated and passed on to the larger predators. Plankton organisms were found to contain about 5 parts per million of the insecticide (about 25 times the maximum concentration ever reached in the water itself); plant-eating fishes had built up accumulations ranging from 40 to 300 parts per million; carnivorous species had stored the most of all. One, a brown bullhead, had the astounding concentration of 2500 parts per million. It was a house-that-Jack-built sequence, in which the large carnivores had eaten the smaller carnivores, that had eaten the herbivores, that had eaten the plankton, that had absorbed the poison from the water.

That is bio-concentration, the tale of the house that Jack built.
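Using the concentrations quoted above, the overall bio-concentration factors can be worked out with back-of-the-envelope arithmetic; this is just a sketch using the figures from the excerpt, with everything expressed in parts per million (ppm):

```java
// Back-of-the-envelope bio-concentration factors at Clear Lake,
// computed from the concentrations quoted in Silent Spring (in ppm).
public class BioConcentration {
    public static void main(String[] args) {
        double water = 1.0 / 50.0;  // maximum DDD concentration applied to the water: 0.02 ppm
        double grebes = 1600.0;     // fatty tissue of the dead grebes
        double bullhead = 2500.0;   // a brown bullhead, the extreme case

        // Concentration factor relative to the water itself.
        System.out.printf("grebes:   %.0fx%n", grebes / water);   // about 80,000-fold
        System.out.printf("bullhead: %.0fx%n", bullhead / water); // about 125,000-fold
    }
}
```

A roughly 80,000-fold build-up from water to bird tissue is what "the house that Jack built" amounts to numerically.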

D-phenothrin is also extremely toxic to bees, with 2 micrograms being sufficient to kill a bee. Curiously enough, cats are also very susceptible to pyrethroid insecticides. After coming into contact with pyrethroids, cats at best show severe clinical signs such as convulsions, vomiting or diarrhea, and at worst die. Why are these "harmless" insecticides deadly to cats? Presumably because cats meticulously groom their coats and lick their paws, thus ingesting the toxins dispersed on their outer body. (Both permethrin and d-phenothrin are relatively persistent pyrethroids.) In short, even in the minutest of concentrations these neurotoxins are deadly to insects and to aquatic life, and toxic to mammals in higher concentrations.

Returning to the aircraft cabin: flight attendants are particularly exposed to aircraft disinsection. Given the nonchalant attitude of the stewardess I witnessed, flight attendants appear to be oblivious to the health risks involved in the frequent application of these neurotoxins.

How about the long-term effects of these chemicals? Studies indicate that pyrethroids can cause anemia, liver failure, kidney failure, hormonal imbalances, miscarriages, hydrocephaly and/or brain atrophy in offspring, and may be carcinogenic in the long term. We should also keep in mind that mixtures of low concentrations of pesticides can become toxic. For example, it has been known for a long time that malathion, a pesticide that is widely used in agriculture and well-tolerated by humans, becomes deadly when mixed with other undisclosed but readily available chemicals. And you were worried about second-hand smoking!

Worried about our own health, it is all too easy to forget the havoc insecticides and herbicides are wreaking on the environment. According to government surveys, "almost every time and place that you observe a stream or river in a populated area you are looking at water that contains pesticides, inhabited by fish that contain pesticides." Pest control seems like a trivial problem compared to irreversible poisoning of our soils and drinking water. Are the overseeing authorities sleeping on the job? The Center for Biological Diversity alleges just that in an incriminating report entitled "Silent Spring Revisited". On the other side of the spectrum, a medical officer from the World Health Organization claims that pyrethroids are not toxic to humans and that the real health risk lies in not disinsecting aircraft.

It appears that public awareness of the dangers posed by insecticides is practically non-existent. For example, the policy document of the Swiss green party mentions support for banning genetically modified organisms (GMOs) but does not once mention insecticides or herbicides. The policy document of the Canadian green party mentions insecticides only once, in relation to banning their use on school premises. Like that is a big help! We collectively seem to fear insects more than we fear insecticides. The latter, although unseen and intangible, are far more dangerous to our environment and, by ricochet, to us.

The phrase "No more bees, no more pollination, no more plants, no more animals, no more man" has been attributed to Einstein. However, the same can be said of earthworms as well as many other lifeforms. Earthworms, by their burrowing and digestive actions, considerably improve soil fertility. They are quite sensitive to the application of fertilizers and pesticides, in particular arsenical ones. Coincidentally, cigarettes have become increasingly toxic over the years because the soils of tobacco plantations are now thoroughly impregnated with residues of a heavy and relatively insoluble poison, arsenate of lead.

Thought experiment

Dragonflies are the natural predators of mosquitoes and are known to be effective in controlling mosquito populations. Given the much longer life-cycle of dragonflies and their smaller numbers compared to mosquitoes, Darwinian selection suggests that dragonflies will acquire resistance to pyrethroids later than mosquitoes. Consequently, we can reasonably assume that for a period of time, dragonflies will be more susceptible to pyrethroids than mosquitoes. In this state, the predator-prey relationship is inverted: pyrethroid-resistant mosquitoes will carry enough toxin to poison the dragonflies feeding on them. It follows that the application of a wide-spectrum insecticide such as the pyrethroids in a given region could actually cause mosquito populations to increase in that region, due to the elimination of dragonflies, the mosquitoes' natural predator. The astute farmer, noticing the increase in the mosquito population, will be tempted to spray insecticide on his farmland in even higher dosages, further polluting the environment. The need for higher dosages will in turn pressure supervising bodies such as the EPA to modify the authorized toxicity limits. Unfortunately, this scenario is not merely fictional. On November 9th, 2011, the EPA issued a risk assessment for the pyrethroid class of insecticides and decided to reduce the safety factor from 10x to 1x for adults and children over 6 years of age. WARNING: The moment you understand what this means, your head may explode instantly.

With the exception of India, today all nations ban the production and use of DDT. However, DDT is orders of magnitude less toxic than the toxins in use today. The only difference is in chemical persistence. DDT lasts 30 years whereas today's toxins, e.g. pyrethroids, last typically only a few months. Considering that insecticides are applied repeatedly and everywhere, we can conclude that not much has changed since 1962 when Rachel Carson published the "Silent Spring". The names of the toxins have changed but not the fundamental approach to pest control: "Kill them all, for the Lord will recognize His own." Instead of targeting all-living things, we now target all-living things except mammals. How light-handed of us!

There is plenty of evidence suggesting that the pyrethroid class of insecticides pose a risk to humans. Even if pyrethroids were perfectly safe for humans and all other mammals (which they clearly are not), targeting such a large class of living creatures, the insects, is indiscriminate and ultimately irresponsible.

PS. Come to think of it, an adult mosquito flown in from the southern hemisphere could not survive the dead of winter in Paris. Thus, not only is aircraft disinsection environmentally dangerous, it also attempts to fix a non-problem, at least in winter.

Friday, August 26, 2011

Is Scala worthy of your trust?

The Scala language offers significant improvements over the Java language with traits, higher order functions and type inference among other powerful features. At the same time, Scala still allows for seamless import and use of existing classes written in Java. You can migrate to Scala piecemeal, for example in your test classes at first and then migrate larger and larger chunks of code.

However, there is one aspect of the Scala language which I find deeply annoying: Scala keeps breaking binary compatibility with every new release. In spite of previous promises, compatibility was broken in release 2.7, broken again in release 2.8 and broken yet again in 2.9. As I understand it, Scala language designers are forced to break compatibility whenever Scala library traits change in an incompatible way.

When a binary breakage occurs at the language level, the whole ecosystem for the language has to align itself with the new release. This is an extremely painful process affecting all users of the language. Even if a user does not want to upgrade to the latest and greatest Scala release, as long as a single tool, say T, in the tool-chain of the user upgrades and the user upgrades to the new version of T, then kaboom! All other project dependencies need to be upgraded as well.

If you decide to upgrade to the newest version of Scala in your project, you will also need to update every single dependency in your project (written in Scala). If you are lucky and every single dependency has made a release for the latest version of Scala, your project will build fine after the update. Otherwise, if a single dependency has not made the required release, you are left with two relatively unpleasant choices. You can either revert to the previous version of Scala or remove the non-compliant dependency.

If the Scala update was triggered by an IDE update, reverting to the older version of Scala may be particularly painful. If removing the non-compliant dependency is impossible, you will be hung out to dry.

As noted earlier, Scala language designers break compatibility for good technical reasons related to traits. The language is improved and cleaned up with every version, unlike Java which accumulates cruft. In other words, there is a good side to breaking compatibility. Preserving compatibility is an immensely intricate problem with a wide range of consequences. However, it is ultimately a political decision balancing between stability and change.

Once you have tasted the expressive power of Scala, it is hard to go back to program in Java. Once you have tasted the stability of Java, it is hard to put up with the brittleness of Scala. It's a non-ideal world out there.

Tooling proposed by Typesafe detects breakages and helps ensure compatibility across minor versions. This tool is similar to clirr, which has been around for a long time. Typesafe's response to the binary compatibility issue confirms my suspicions that the issue is still largely misunderstood by Typesafe. A Typesafe subscription, the Migration Manager, or Typesafe taking over a larger set of core libraries does not ensure that upgrading a project to the next version of Scala will go smoothly.

Assuming Scala continues to break compatibility in the foreseeable future, then I'll go out on a limb and make the following predictions:

The current situation limits the Scala user-base to a relatively small niche of enthusiasts. The small user-base hinders the development of a large Scala eco-system, which further limits growth of the user-base, creating a vicious cycle.

Apparently, only a few people complain about Scala's existing compatibility policy. Presumably, the Scala community has entered a comfort zone where existing users have grown accustomed to the current situation. For example, SBT makes it easy for authors of Scala libraries to generate artifacts for multiple versions of Scala. However, SBT is not suitable for projects which offer Scala-based extensions but are otherwise centered around Java. Thus, Scala's current compatibility policy makes it hard for Java projects to offer Scala-based extensions. I, for one, would love to offer a Scala-based configurator for logback (in addition to the XML and Groovy based configurators) but have no intention of migrating our build to SBT.
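For readers unfamiliar with SBT's cross-building, it boils down to listing the Scala versions to compile against; a minimal build.sbt sketch (assuming the SBT 0.1x setting syntax, with a made-up project name) looks like this:

```scala
// Minimal cross-building sketch for an SBT 0.1x build.
// Each published artifact gets a Scala-version suffix, e.g. mylib_2.9.1,
// precisely because binaries from different Scala versions cannot be mixed.
name := "mylib"

crossScalaVersions := Seq("2.8.1", "2.9.0-1", "2.9.1")

// Running "sbt +publish" then builds and publishes one artifact
// per Scala version listed above.
```

This works well enough for pure Scala libraries, but it presupposes that your whole build lives in SBT, which is exactly the sticking point for Java-centric projects.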

One should also not forget that the vast majority of developers will vote with their feet. They will simply walk away instead of engaging the Scala community over the preservation of binary compatibility. Thus, Scala will probably continue to be attractive for projects where occasional compatibility breakages are acceptable. Of course, the set of projects where breakages are unacceptable is... non-negligible.

The upcoming Java 8, with support for closures, will be a big leap forward for the Java platform. Competing languages will eventually close the gap, and Scala will stop being cool.

Saturday, October 30, 2010

Using Groovy and Scala, my first year

On Groovy

After programming almost exclusively in Java for the last 14 years, I picked up Groovy during the spring of 2010. If my memory serves me correctly, it took me about a week. Groovy is a powerful and worthy language. Many things which are hard to do in Java are easy in Groovy. The ExpandoMetaClass makes it very easy to write amazingly powerful DSLs in a jiffy. To give you a concrete example, I could replicate logback's XML-based configuration sub-system of about 10'000 lines of rather intricate code with about 500 lines of Groovy. The Groovy-based configuration DSL for logback is better than its XML counterpart in every respect. Not only is the Groovy syntax much shorter than XML, it is also internal. This means that the logback configuration DSL preserves all the power of Groovy, a fully-fledged programming language. In short, Groovy is fantastic for writing internal DSLs.

Now the bad news. Unlike Java, Groovy has a dynamic type system, which is another way of saying that Groovy has dynamic dispatch. Dynamic dispatch allows for a lot of flexibility, as offered by ExpandoMetaClass and mixins, but at the cost of a very significant performance hit. Last but not least, the Groovy compiler skips type checking, which leaves you to discover errors at runtime.

A basic for-loop can be 1000 times slower to execute in Groovy than in Java. I don't think it is possible to build performance-critical systems using Groovy. But I might be wrong, and my groovy-is-slow gripe may turn out to be as irrelevant as C++ folks complaining about the "slowness" of Java byte code.

By the way, my current IDE of choice is IntelliJ IDEA which has pretty decent support for Groovy as well as Scala. I switched away from Eclipse because of its shoddy support for Groovy and even worse support for Scala.

On Scala

Compared to Groovy, Scala has been harder for me to learn. While Groovy can be learned piecemeal, one has to know a lot of Scala before being able to program in it. I think this is true for any new language, Groovy being a rare exception since it is essentially a superset of Java.

In exchange for a steeper learning curve, Scala rewards the apprentice with what seems like a fresh perspective on programming: the functional view. Like Java, Scala is statically typed, which means that code generated by Scala can be as efficient as code generated by Java.

As a statically typed language, Scala catches many programming errors early in the development cycle. However, thanks to its amazingly powerful type-inference system, Scala dispenses with much of the boilerplate found in other statically typed languages. Scala also provides strong support for writing internal DSLs. I have written a configuration DSL for logback in Scala. The still unpublished result is as flexible and powerful as its Groovy counterpart and only slightly more verbose. On the other hand, the Scala version is strongly typed, which means that one can write logback configuration files with code-completion provided by the IDE. Is that cool or what?

Scala has a long series of very neat features. I like Scala to the point of animating the Scala Enthusiasts Group in Lausanne, Switzerland. Yes, Scala requires some effort to learn. Yes, the Scala compiler is slower than most. Yet, those disadvantages are compensated by the consistency and the sheer expressive power of the language.

On the downside, Scala has a consistent history of breaking compatibility between versions. For example, code compiled with Scala version 2.7 will not work with Scala version 2.8. Starting with Scala version 2.9, the Migration Manager will purportedly address this binary-format incompatibility problem.

However, from what I understand, Scala suffers from a much hairier compatibility problem. Indeed, Scala traits are copied/folded into the code importing the trait. This means that client code compiled with Scala version 2.x.y revision A might not be compatible with Scala version 2.x.y revision B (with x and y unchanged) if some trait changed between revisions.

If this assertion is true, and I pray to the God of Evolution that it is false, then Scala suffers from a fatal flaw. There is no way an ecosystem of libraries written in Scala can prosper when one has to worry about the micro release version of Scala used to compile a given library. Applications using just a couple of libraries written in Scala would be unmanageable. It is pretty ironic that Scala forfeits one of Java's crucial advantages, namely cross-platform compatibility, in exchange for the multiple inheritance provided by traits.

Again, I may be completely misunderstanding the implications of Scala traits. If you know better about how Scala traits work, you are hereby invited to set the record straight.

Update

According to new information I received, it appears that traits are imported via stubs. Let A be a class importing a trait T. The compiler will create a stub for T in A, and invocations of T's methods in A will be delegated by the stub to an encapsulated instance of T. The code in T is not copied into A; only T's method signatures are directly visible in A.
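The described mechanism is easy to picture as hand-written Java. The sketch below merely models the stub idea, it is not actual Scala compiler output: class A exposes T's method but delegates the call to an encapsulated implementation of T, so that A depends only on T's method signatures.

```java
// Java model of the stub mechanism described above (illustration only).
interface T {
    String greet(String name);
}

// A concrete implementation of T; in Scala this would be the trait's body.
class TImpl implements T {
    public String greet(String name) {
        return "hello, " + name;
    }
}

// A "imports" T: the stub method has T's signature and merely delegates
// to the encapsulated instance, so A never contains T's code.
class A {
    private final T t = new TImpl();

    public String greet(String name) {
        return t.greet(name);
    }
}
```

As long as T.greet keeps its signature, A's compiled form does not need to change when the implementation behind it does.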

It follows that A compiled with one version of Scala and T will remain compatible with a different version of Scala as long as T's interface, i.e. its method signatures, remains the same (or changes in a compatible way) and the Scala language versions are binary compatible.

In other words, traits are not a source of new incompatibility issues as long as the method signatures of a trait are not changed in incompatible ways, much like Java interfaces can only change in certain limited ways in order to preserve compatibility.


QOS.ch, main sponsor of cal10n, logback and slf4j open source projects, is looking to hire talented software developers. If interested, please email your resume to hr@qos.ch.

Tuesday, May 25, 2010

Committocracy as an alternative for conflict resolution in OSS projects

In my previous post I presented an unsparing criticism of Apache's voting procedures. In this post, I describe a new voting procedure where each committer's voting rights derive from the number of commits made by the committer in question. To avoid rewarding micro-commits, only a single commit-point is awarded per day for one or more commits. No points are awarded to a committer who makes no commits on a given day.

Non-controversial questions are settled by consensus. However, whenever a decision cannot be reached by unanimous agreement, a vote is called for. The commit-points for and against a motion are summed, with the total accumulated commit-points determining the outcome.
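As a sketch of the arithmetic, weighted voting can be expressed in a few lines. The Java code below is a hypothetical illustration; the names and the strict-majority rule are my own assumptions, not part of any existing system.

```java
import java.util.Map;

class Motion {
    // Each committer's vote is weighted by his or her accumulated
    // commit-points; the motion carries if the points cast in favor
    // strictly exceed the points cast against.
    static boolean passes(Map<String, Integer> commitPoints,
                          Map<String, Boolean> votes) {
        int inFavor = 0;
        int against = 0;
        for (Map.Entry<String, Boolean> vote : votes.entrySet()) {
            int points = commitPoints.getOrDefault(vote.getKey(), 0);
            if (vote.getValue()) {
                inFavor += points;
            } else {
                against += points;
            }
        }
        return inFavor > against;
    }
}
```

Under such a rule, a long-time contributor with hundreds of commit-points cannot be overruled by a freshly elected committer holding only a handful of them.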

Such a system enables a collaborative software project, open source or not, to reach agreement in a timely and orderly fashion. It is much less vulnerable to disruption by unscrupulous participants than Apache's current voting procedures. I designate an organization where voting power derives from the number of commits made by an individual, typically a software developer, as a committocracy.

Feasibility

Given any reasonably-open version control system, it should be fairly easy to write software which assigns commit points to each committer. During a vote it is a matter of simple arithmetic to tally the commit-points expressed for or against a motion.

In the case of git, the following command can be used to compute the commit-points accumulated by Alice. Note the use of sort -u rather than uniq: uniq only collapses adjacent duplicate lines, and Alice's same-day commits are not necessarily adjacent in the log when other committers' commits are interleaved.
git log --format='%ad %an' --date=short | sort -u | grep Alice | wc -l
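The same computation is easy to express in code for all committers at once. The following Java sketch, a hypothetical illustration, parses lines in the '%ad %an' format produced by the git command above and awards one commit-point per author per day with at least one commit:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CommitPoints {
    // Each input line looks like "2010-05-25 Alice", i.e. the output of
    // git log --format='%ad %an' --date=short
    static Map<String, Integer> tally(List<String> logLines) {
        Map<String, Set<String>> daysByAuthor = new HashMap<>();
        for (String line : logLines) {
            int space = line.indexOf(' ');
            String date = line.substring(0, space);
            String author = line.substring(space + 1);
            // the Set collapses multiple same-day commits into one entry
            daysByAuthor.computeIfAbsent(author, a -> new HashSet<>()).add(date);
        }
        // one commit-point per distinct day with at least one commit
        Map<String, Integer> points = new HashMap<>();
        daysByAuthor.forEach((author, days) -> points.put(author, days.size()));
        return points;
    }
}
```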

Worst of both worlds or best of both worlds?

A committocracy is less efficient than the BDFL model for decision making, and compared to the Apache-way, it grants less power to newcomers. However, a committocracy is a fair system in the sense that the same rules apply to all. Today's committer with the most commit-points may be different from tomorrow's. Moreover, compared to the Apache-way, a committocracy drastically reduces the risk of a project going haywire after admitting a new member. As a corollary, a project can safely reduce the wait-and-see period preceding the admission of new committers. Thus, newcomers may be granted committership status more quickly.

Psychological effects

Immediately after electing a new committer, accepting him or her as an equal member of the community has powerful positive effects on the psyche of the electee. In a minority of cases it results in the electee becoming a major contributor to the project. However, this positive effect is often short-lived. After a few months of modest activity, it can even transform itself into a sense of entitlement, which can be devastating in the hands of a less-than-scrupulous committer wielding veto powers. Fortunately, most people are fairly decent and do not abuse their rights in any blatant way.

In a committocracy, voting power accrues with each day involving a commit. As the entitlement of the committer grows with each contribution, the positive psychological effects of belonging to the community may be longer lasting, in particular because the accrual of voting rights is deserved, i.e. based on "merit". However, since I am not aware of any committocracy in existence, I can only speculate on the longer duration of said positive effects.

Is a committocracy a meritocracy?

No, not exactly. Granting one voting point per day rewards the participants most committed to the project over a lengthy period. It does not directly take into account the value of each commit. In a very general sense, granting veto rights to all committers can also be considered a meritocracy, because it rewards the participants with the most endurance in arguments. It is just a different, and in my opinion less pertinent, measure of merit.

Does it matter?

What is the point of a committocracy, since it obviously does not perfectly capture the notion of merit? Moreover, why change the current voting system if only a few people complain about it?

While a committocracy is not a perfect or even a very good measure of merit, it is still a much better measure than a 0-or-1 switch correlated with veto powers. Since OSS contributors surrender so much of their independence to the OSS organization, in this case the ASF, the organization must ensure that its rules are as just as possible.

Friday, May 21, 2010

The forces and vulnerabilities of the Apache model

The initial title for this article was "Why the Apache model sucks". It would have been a catchier title but would have tainted my arguments with triviality. Still, it was the first title that came to my mind, and you should be aware of that.

I have written about Apache in the past, and the present post is a rehash with a slightly different emphasis. Before laying further criticism at the altar of the "Apache Way", it must be mentioned that Apache is one of the most open and transparent organizations I know of. Transparency comes at a price. If one is allowed or even encouraged to voice criticism, the voices of the critics may drown out the successes of the organization, so that the state of affairs may seem bleaker than it really is.

On the other hand, an organization can be wildly successful even if some of its governing rules prove to be counter-productive in the long run. As a very extreme example, in the Ottoman Empire, one of the most successful and relatively recent empires in history, whenever a new sultan acceded to power he would proceed to kill all his brothers. Not only was fratricide authorized; in the absence of clear succession rules other than the survival of the fittest, the people actually expected the new ruler to kill all his brothers so that the strongest ruler could emerge. The rulers who chose to spare the lives of their brethren were accused of not getting the Ottoman way. Suleiman the Magnificent went as far as killing his own son Mustafa after bogusly charging him with treason.

One of the core tenets of Apache is meritocracy. If Apache is a meritocracy, it is a lousy one, and this has consequences. If some developer, say David, shows interest in some Apache project, say Prokayotae, and contributes by submitting patches or by responding to users on the mailing list for a period of time, usually three months or more, then he will be awarded committership. As a Prokayotae committer, David can veto decisions of the project – yes, that is veto with a 'v'. After a few years of good behavior, David could even be co-opted to become a member of the Apache Software Foundation. Being a member is a nice honorary title but does not carry further entitlements at the project level. As a member, David can indeed vote for the Apache Board during elections, but that is mostly inconsequential at the project level.

Coming back to Prokayotae, the newly elected committer David wields the same voting power as Carol, who has been actively contributing to Prokayotae for several years. In most cases David will be a reasonable individual who plays nice and will naturally defer to Carol's and the other committers' opinions by virtue of common decency and their past involvement in Prokayotae. However, while most new developers are reasonable individuals and make every effort to play nice, some people are less reasonable. Even a modestly successful project will have over a dozen committers elected over its lifetime. Thus, the probability of electing an "unreasonable" individual increases over time and, I'd venture to say, approaches certainty.

Let Ursula be such an unreasonable person elected as a Prokayotae committer. Ursula may be an otherwise nice person and a valuable member of the community she lives in. But, as a fervent believer in Thor, one night Ursula has a dream in which Thor ordains her to oppose the pernicious influence of Carol over the Prokayotae project. Of course, the motivations for Ursula's opposition to Carol may have origins other than Thor appearing to her in a dream. The potential reasons are innumerable. It may be as simple as a design principle lambda that Ursula reveres but which Carol reveres less.

Invested with this new mission in life, Ursula begins to challenge Carol by vetoing her commits, or merely intimating that she will veto them, based on the lofty lambda design principle. Were Ursula's objections patently silly, she would be revealed as a fool. For example, invoking Thor's appearance in her dreams as a justification for her opposition would expose her to ridicule. However, any objection based on some apparently reasonable pretext, often adherence to a lofty principle, provides sufficient cover for long-term disruption. Every developer knows that the IT industry can claim more than its fair share of high and often contradictory principles. Ursula will make convoluted arguments, misrepresent facts and argue endlessly. Given that each new commit drags on through endless arguments, further development in Prokayotae will be severely disrupted and may even cease.

Carol, who has invested heavily in Prokayotae and got along well with the other developers, may be shocked and ill-prepared to deal with Ursula's disruptive interventions. Carol may be further shocked to see other committers sitting on the fence with respect to Ursula's arguments cloaked under an important design principle such as lambda. At this stage, Carol must tread with utmost care because she formally has the same voting rights as any other committer, including Ursula. If Carol dares to claim that her arguments should carry more weight in light of the volume of her past contributions, she will immediately lose her credibility.

The Apache culture aggressively enforces its egalitarian model. Carol, even if she effectively led the project for several years, is not allowed the title of project leader or project manager. Author tags in source code are also frowned upon with the Apache board having formulated a statement discouraging their use. Hints at unequal merit are met with condescension and/or alarm.

As a result, if Carol tries to explain the injustices of the Apache way she will be branded as clueless and subsequently ignored. Being ignored is Apache's equivalent to being ostracized in the pre-Internet age.

Like most people who face injustice and have a choice, Carol will leave Prokayotae, as I left log4j to start the SLF4J and logback projects. Leaving log4j was one of the most traumatic experiences of my life. Trustin Lee, of Netty and Mina fame, had an analogous experience. He apparently could not bear to see his work being vetoed over a triviality.

Those favoring the current voting system, while recognizing its lopsidedness, argue that letting everyone have an equal voice fosters communication between developers. Admittedly, the current setup is conducive to collaboration, as progress can only be achieved after reaching consensus. As a corollary to this line of reasoning, letting all arguments be heard and all objections be raised at every step of the development process must surely lead to the best possible software.

Unfortunately, in the absence of a fair conflict-resolution mechanism, having lots of ideas floating around, emanating from different people, does not lead to the emergence of the best software design, insofar as it promotes bickering and political paralysis. In such a system, where development can be easily disrupted, it is not the best ideas that win but the ideas advocated by those possessing the most stamina.

Instead of trying to learn from past failures, which open discussion is supposed to encourage, Apache forges on along the path of egalitarianism. As time passes, I see attempts at institutionalizing egalitarianism instead of recognizing its inherent injustice. For the sake of a bogus ideal, the Apache-way expects selflessness on the part of the developer in the same way that the Catholic Church expects celibacy from its priests. If egalitarianism is really at the core of the Apache way as an absolute value, then the Apache way sucks. Yay!

While the one-person-one-vote principle applies to a democracy in order to run a country for the benefit of all, the one-committer-one-veto principle is ill-suited to a purported meritocracy such as Apache. If it must be "one committer, one veto", then the word meritocracy cannot honestly be ascribed to Apache. It would be most appropriate for the ASF to stop misrepresenting itself as a meritocracy, or at least to clearly define the meaning of merit.

The Apache culture fails to recognize that project participants may have drastically different investments in a project. Allowing a person with near-zero involvement in a project to carry the same weight as a person with 10'000 hours of investment is to date an essential part of the Apache-way. According to the prevailing Apache culture, if you don't completely agree with this premise, body and soul, then you just don't get it.

I hope that this blog entry will incite OSS organizations, including the ASF, to adopt fairer decision-making procedures. If that does not happen, then before joining such an organization, developers should at least be aware of the inconveniences associated with the lack of fair and orderly decision-making mechanisms at the project level. If, knowing these inconveniences, a developer decides to join anyway, then that will be an informed decision.