Monday, August 1, 2011

Should Static Type-Checked Compilation Come to Groovy?

Please note: As always, I speak for myself alone and not as a representative for any project or company. These are my opinions and reading of an ever-changing situation.


Update: This was a pretty stupid thing to post, because everything is in flux and these are just some thoughts.

Why Should Groovy have a Static Mode?

There were some internal discussions about "Grumpy mode" among some of the Groovy committers. I had a lot of reservations about introducing static-typing or static-mode into Groovy. When I thought about the idea I come up with many unanswerable questions. Is it a good idea or not? Only time will tell. Here are some of the questions that I wish I had answers to.

What exactly do users want?

Do people want a static language with a dynamic backdoor, such as the way C#/.NET has a dynamic keyword or the way you can use Groovy++ by default with the .gpp file extension and then sprinkle in @Dynamic annotations? This is one approach, where the core of your language is statically typed but you have an option to escape into dynamic-land. Or do people want a dynamic language with a static backdoor, which Groovy will be (as far as I know) one of the few examples of. This use case is the same as adding an @Typed keyword to your Groovy code to enable static compilation, or adding an @Grumpy keyword. Or do people just want better performance from Groovy, end of story?

If you want a statically typed language without the verbosity of Java then use Scala. Yes, Scala is complex. But Groovy is just as complex. Does anyone really know Groovy without reading a few books? Yes you can get things done, but understanding metaprogramming, understanding method dispatch, and understanding all the closure options (delegates, delegate-first, trampoline) is just as complex. I am excited about Kotlin and think it looks like a great statically-typed language, but at this point it is nothing more than a slidedeck and a wiki for anyone that isn't on the IntelliJ IDEA development team. It's not available to anyone.

I honestly don't think a Grumpy mode will boost my productivity much. Will it aide in language adoption? I say probably not. In my opinion there are three large barriers to Groovy adoption in the mainstream: performance, tooling, and bugs. Not being statically typed is a connoisseurs criticism, and connoisseurs generally chase after things that are interesting rather than good (it's been said about me as well). Groovy is a language that appeals to people that just want to get something done; it's about productivity not about programming language theory. Adding static typing might please some high-end connoisseurs if it is done in a good way, but I have trouble believing it will please many real users. Search the web for "Groovy sucks". You won't find pages about dynamic type systems, you'll find performance benchmarks. In my personal experience, people I talk to about Groovy++ are asking about performance. Sure, traits are clever. Sure, the type-checking is an interesting idea. But the interest is in performance. Alex's blog posts are mostly about x-thousand transactions per second, or beating Scala performance. I think the number 1 priority of Groovy++ is performance, and type checking is a 2nd nice to have benefit.

And what about people that want a static language with a dynamic backdoor (for builder objects perhaps)? Other languages have this. C# has a "dynamic" variable, right? Boo has a dynamic variable as well, and even lowly Flex has dynamic XML support. This is useful and valuable. I would love to see an @Groovy annotation in Java, where you escape from Java into Groovy, rather than an @Grumpy annotation for Groovy. The benefit of @Groovy for Java is that tool support is the best in the world (without a doubt) and the language has a specification, is well known, and is well tested. Of course, an @Groovy annotation would only ever work with core Java language grammar changes or very ugly hacks.

Lastly, do users just want static compile-time typing for tooling support to catch silly errors as early as possible? Well yes they want that. Who wouldn't? But who is writing Groovy code without testing it? And we already do a great job of recommending suggestions when methods fail ("hashode not found. Did you mean hashcode()"). And the tools exist to do this. IntelliJ IDEA is very reliable in this regard and GroovyLint does the same thing on a limited scale. These features exist today but since they aren't in Eclipse by default then I suspect they are underused. Catching silly errors is entirely possible without the burden of static typing. In fact, some might say that the majority of CodeNarc errors fall into the silly category!

Can Groovy Afford the Complexity?

The Java world talks about a "complexity budget" where a language can only absorb so much complexity before developers leave for something simpler. The term "complexity budget" is poorly defined and often used as a scare tactic in hyperbolic arguments, but I often fear that Groovy could become the next C++ and overrun this arbitrary "complexity budget". C is a good and simple language. Java is a good and simple language. Neither C nor Java are good for all tasks, but they are both successful and capable of having system written in them. C++ started building on C and had every feature known to man added. Nowadays the criticism of C++ is that only 20% is safe to use, but no two C++ programmers can agree on which 20%. Groovy is a multi-paradigm language. You can script, you can OO, and you can FP. Already I am advising teams to find their standard. If you like FP then embrace curry, trampoline, and functions. If you don't then steer clear of those features. Adding a static mode is going to make Groovy more complex. The same code will most likely behave differently between Grumpy mode and not Gumpy. This is one more thing to think about when programming, even if we work to minimize the differences.

Looking at Groovy++, they have done some very cool things. But sometimes the basic features of Groovy++ and Groovy differ, and this cannot be allowed to happen because it is too complex. Already Groovy developers need to know Java and the JVM. I don't know how you could us Groovy outside of Grails without knowing Java. Introducing a new mode that is subtly different could be too much complexity.

Time and Budget Trade-offs

Adding static-mode seems to me like a big undertaking in terms of developer hours. Groovy++ is a good proof of concept for the idea, but the implementation won't be merged into core-groovy. The approach used by Groovy++ is to subclass ASTNodes with AST Transformations and then write out different bytecode for the node, fighting with groovyc to make sure the correct node is invoked. It is better to implement this feature as a part of the core compiler than as an AST transformation, and that is the route the team are currently taking.

It is wonderful that VMWare supports Groovy financially by giving several people jobs. But remember there are finite resources for the project. Any time spent on static-mode means there is less time spent on other features. There aren't enough hours in the day to do everything. So developing a high-performing and well-behaving @Grumpy mode means there are less resources to solve the performance problems of plain old Groovy. I already asserted that performance was a big barrier to wider adoption. Optimized ints are good for benchmarks and came out in Groovy 1.8, but I reckon most method calls are on user defined objects. How many ints are used in a Grails app? Maybe there is a way to do everything at once, but I'm a little sad to see performance improvements not come faster to the language.

My Wishlist

Here is my wishlist of what @Grumpy would do for me and how the feature should roll out.

Some social wishes...

* GEP - The GEP process for Groovy enhancements is supposed to help us formalize features and provide feedback, but it's underused. Please, let's start a GEP as soon as possible and move this big change into the public debate.

* Dev-List Discussion - The dev-list exists to give developers a forum to communicate with each other. There will always be internal conference calls and IM Chats as well, but please lets move more discussion back into the public domain.

* Research - Are there any other dynamic languages with Static backdoor that we learn from them? I don't know of one, but would love to see more examples of what works and what doesn't

Some type inference and static typing wishes...

* Closure Return Type Type-Inference - Closures should have their return type inferred. This is non trivial, and you can have closures that return the values of closure.apply(). At some point type inference breaks. Where will Groovy's fail? How far will the type inferrer go? And how will you specify a return type on a closure when it does fail?

* Closure Arity - The type checker should check the number of parameters a closure has. You should not be able to pass a one method closure to something like Collection.inject() that requires two parameters. All closures need to specify at their type declaration how many parameters it takes.

* Closure Parameter Types - The parameter types of a closure should be type checked. You shouldn't be able to pass a closure that takes a String argument to something that requires a closure that takes an Event argument.

* Closure casting to interfaces - When you cast a Closure into a Single-Method-Interface then the arity, return type, and parameter types should all be checked and verified.

* Interfaces Implemented as a Map - When you implement an interface using the Map notation [ methodName: { /* body */ } ] then the closure arity, parameter types, and return type should be checked, and the compiler should make sure all interface methods are implemented.

* curry/ncurry type issues- Functions like curry, lCurry, rCurry, and nCurry should work in a correct, statically-typed way. For example, if you curry of a 2-arg closure then you're returned a 1-arg closure. Curry a 4-arg closure and you're returned a 3-arg closure. This is a fun edge case that can't be implemented with just overloaded methods!

* Calling groovy from Java - The semantics for calling Grumpy Groovy from Java must remain sane. I'm interested to see some examples of what the APIs look like once all the new types are added. For instance, what is the Closure class hierarchy? Will there be Closure0, Closure1, Closure2 types to model the arity of closures?

* A Grumpy GDK - All of the libraries that ship with Groovy should be as Grumpy as possible. That way a Grumpy developer never has a problem just because a Groovy GDK method was not Grumpy as well. For instance, you should be able to easily make a Grumpy subclass of GroovyTestCase, so any dynamic method on GroovyTestCase should be updated to be as static as possible. Only really use dynamic features when they are necessary.

* Generic by Default - The type inference mechanism should create the correct generics for methods, similar to how F#/Ocaml creates generic methods by default. For instance, the closure: def wrap = { println "logging..."; it() } should have a generic type signature that returns the same type that the 'it' parameter returns.

* AST Transforms Can Read Types - Transforms should be easily able to read the type information out of the AST, and it should be there as early as possible in the compilation process. So the type inference should happen early. The types don't need to be checked early, but they need to be inferred early. Also, when the type of an object is unknown then an AST transformation should be able to read this. Currently, an unknown type is Object to Groovy, but there should be a difference between Object and Unknown.

* AST Transforms Can Write Types - Transforms should be able to write type information. Transforms should be able to participate fully in this process if they are to remain powerful and flexible.

* Type Verification Happens at the End - Type checking and verification should occur at the end of all the compiler phases. This enables AST transforms to actively participate in the type inferring and type specifying process. If type verification runs and fails too early then transforms that could fix the problem won't have a chance to run

So that is my wishlist, perhaps it is a connoisseur's wishlist. What is it that you want to see in Grumpy mode? What do you not want to see?

You can leave feedback here are on the Groovy dev mailing list. We'll see where this goes!

No comments: