Software Distribution Ranking in Different Languages
Created On: 2016-09-22
Software Distribution Ranking in Different Languages or Platforms
Some software is considerably easier to distribute than other software. Below I list the languages I use, from worst to best, in terms of distribution. In this post the target system is Linux; I will not evaluate the situation on Windows or macOS.
Node.js
file size | easy to create distribution | easy to install |
---|---|---|
large | ? | no |
Node.js and npm are not nice at all. The Node distribution doesn't do any sensible versioning for itself. There is no gradual upgrade support. It's not obvious how to keep multiple Node installations side by side. npm neither knows nor cares about artifact repositories, library versioning, or library sharing. There is a lot of duplication (re-downloading, unpacking) when installing packages with npm. Packages on npm often break compatibility without notice. In general, the code quality of packages is often worse than in other communities. It's good to have a large number of libraries, and that's one reason why people still work with Node.js. But the quality issues of the whole platform do bother me.
By the way, npm is the only package tool I have used that doesn't work in an Emacs shell buffer. The way it draws the download animation with "/" "-" "\" fills the Emacs shell buffer with escape characters. I should really fix this and send a patch upstream sometime if I have to work with Node.js on a daily basis.
Ruby
file size | easy to create distribution | easy to install |
---|---|---|
medium | no | no |
Ruby is often not backward compatible between versions. To create a distribution, you usually need to package Ruby itself and all the libraries you use. Even after that, it may not always work. Gems are often not backward compatible either. Some popular gems contain C code and require a C compiler at installation time. Tooling for creating proper distributions is lacking.
Examples of Ruby distributions: the Sensu monitoring platform, GitLab Community Edition.
Python
file size | easy to create distribution | easy to install |
---|---|---|
medium | no | no |
Python is in a slightly better situation than Ruby. The language is more stable. Virtualenv has been around for a long time and is used by many people. Wheel packages on PyPI make installing packages that contain C code much easier and faster. It is quite comfortable to develop in the language now. But in general, creating and installing a software distribution is still murky territory. You can make it work, but there is no single obvious way.
I have a project template that can create a deb/rpm distribution for Linux that includes the project code and all dependencies. But it only works on Linux, and it does require a project template. So for a beginner, it's not easy at all.
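To illustrate the "no single obvious way" point, here is a minimal sketch of just one of the available options: packing a project into a single runnable archive with the standard library's `zipapp` module. This is not the deb/rpm template mentioned above. The project name `hello_app` and the helper functions are hypothetical, and the approach assumes pure-Python code and a Python interpreter already installed on the target machine, so it does not solve the dependency problem by itself.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_zipapp(workdir: Path) -> Path:
    """Create a tiny project and pack it into a single runnable .pyz file."""
    src = workdir / "hello_app"  # hypothetical project directory
    src.mkdir()
    # __main__.py is the entry point the interpreter runs from the archive.
    (src / "__main__.py").write_text('print("hello from a zipapp")\n')

    target = workdir / "hello_app.pyz"
    zipapp.create_archive(src, target=target)
    return target


def run_zipapp(archive: Path) -> str:
    """Run the archive with the current interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, str(archive)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = build_zipapp(Path(tmp))
        print(run_zipapp(archive))
```

The resulting `.pyz` file is easy to copy around, but third-party libraries (especially ones with C code) still have to be vendored or installed separately, which is exactly where the non-obvious choices begin.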
Java (also Clojure and Scala) [1]
file size | easy to create distribution | easy to install |
---|---|---|
good | yes | yes |
Java has good tools for packaging and distribution. Maven and its repositories help a lot. Java jars and wars are easy to create and small in size. Maven manages downloaded packages well. The Java runtime (JVM) is mostly backward compatible across releases. Distribution is easy.
Examples of Java distributions: Tomcat, Elasticsearch.
Common Lisp (with quicklisp, roswell)
file size | easy to create distribution | easy to install |
---|---|---|
medium | yes | yes |
Common Lisp is quite easy to distribute because you just dump a new image and make that executable. Because the image contains the whole Common Lisp runtime and all loaded libraries, the file size is somewhat larger than for the other languages here that produce almost statically linked binaries. Other than that, distribution is quite pleasant and easy. Roswell supports SBCL and CCL when building a binary.
Haskell (with stack or the earlier halcyon)
file size | easy to create distribution | easy to install |
---|---|---|
good | yes | yes |
Haskell has good versioning for the language, the base library, and the community libraries on Hackage. Some language extensions only work with GHC, but that doesn't matter much, since GHC is the state-of-the-art compiler for Haskell. Haskell can build statically linked binaries or almost statically linked binaries [2]. Stack and Stackage make both development and distribution easier. You no longer need to worry about library versions when developing or distributing.
Go
file size | easy to create distribution | easy to install |
---|---|---|
good | yes | yes |
Go is the king of distribution. I have always liked statically linked binaries or almost statically linked binaries [2]. Go's fast compilation is a further plus. Do you have bad stories about distributing or deploying Go programs? Please share them in the comments.
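To make "almost statically linked" concrete, here is a small sketch that inspects the output of `ldd` and reports which shared libraries go beyond the very basic ones. The set of "basic" libraries, the helper names, and the sample `ldd` output (including its paths and addresses) are all assumptions made for illustration, not measurements of any real binary.

```python
import subprocess
from typing import List

# Assumption: which libraries count as "very basic" is a judgment call.
BASIC_LIBS = {
    "linux-vdso", "linux-gate", "libc", "libm",
    "libpthread", "libdl", "librt", "libgmp",
}

# Illustrative ldd output for a hypothetical binary (not real data).
SAMPLE_LDD_OUTPUT = """
    linux-vdso.so.1 (0x00007ffc8a5f2000)
    libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f1c2a400000)
    libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f1c2a100000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c29e00000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1c2a800000)
"""


def ldd_output(path: str) -> str:
    """Run ldd on a binary. ldd exits non-zero for static binaries, so no check."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return result.stdout


def parse_ldd(output: str) -> List[str]:
    """Extract shared library file names from ldd output."""
    libs = []
    for line in output.splitlines():
        line = line.strip()
        if not line or "not a dynamic executable" in line or "statically linked" in line:
            continue
        # The first token is either a library name or the loader's full path.
        name = line.split()[0]
        libs.append(name.rsplit("/", 1)[-1])
    return libs


def is_basic(lib: str) -> bool:
    """True for libc-level libraries and the dynamic loader itself."""
    stem = lib.split(".so")[0]  # "libc.so.6" -> "libc"
    return stem in BASIC_LIBS or stem.startswith("ld-")


def non_basic(libs: List[str]) -> List[str]:
    """Libraries that keep a binary from being 'almost statically linked'."""
    return [lib for lib in libs if not is_basic(lib)]


if __name__ == "__main__":
    print(non_basic(parse_ldd(SAMPLE_LDD_OUTPUT)))
```

On a Linux machine, `non_basic(parse_ldd(ldd_output("/path/to/binary")))` would return an empty list for a binary that is fully or almost statically linked under these assumptions.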
Summary
My ranks for the distribution situation of the various languages. Ranks go from 1 to 10; higher is better.
Language | Situation Rank |
---|---|
node.js | 5 |
ruby | 5 |
python | 6 |
common lisp | 8 |
java | 8 |
haskell | 9 |
go | 10 |
Afterword
Other Languages
There are other languages that I no longer use nowadays. Based on previous experience, I will list my rank for them briefly.
Language | Situation Rank |
---|---|
C/C++ | 9 |
PHP | 10 |
C/C++ is quite easy to compile to a static binary. Creating an initial distribution may take some time for larger projects, but the build can be tweaked to make recompilation very fast.
PHP is close to the easiest language for creating or deploying a distribution. It's usually just a zip file containing the source code, even simpler than Java's jar because there is no precompilation. The PHP runtime is usually maintained outside the project. At the time I used this language, there were not many libraries around, and most things were included in the PHP standard library. Nowadays, it seems PHP has its own package manager, so I am not sure whether that changes things. Based on my experience deploying ownCloud, I think it's still easy to install.
There are also languages that I haven't used professionally or don't have enough experience with to rank. I will list them here. If you have experience distributing applications written in them, please leave a comment.
Language | Situation Rank |
---|---|
C# | ? |
D | ? |
Rust | ? |
Erlang | ? |
Elixir | ? |
Swift | ? |
OCaml | ? |
Smalltalk | ? |
Matlab | ? |
R | ? |
About Docker
Since I am talking about distribution, I should say something about Docker and how it may change the distribution story. When I first saw the Docker video from PyCon back in 2013, I thought it was a great concept that might bring fundamental changes to software development and deployment. I got excited. Docker gives you an isolated environment that is tuned to fit your application's requirements. It has very low CPU/RAM/disk overhead. If you use the same stack (or the same few stacks) for your applications, disk space usage becomes irrelevant because of the use of an overlay filesystem like aufs or, later, OverlayFS. Many of the current distribution problems would not be problems if Docker were mature.
But up until the end of 2015, Docker was still like a toy and required a lot of effort to make things work flawlessly. Storage and networking were the biggest issues. The fast pace of change in Docker, libcontainer, runc, their own website's layout, etcd, and the Linux kernel doesn't help. Docker's short maintenance lifecycle doesn't help either. There are also other issues. All in all, Docker hasn't changed how software is developed yet. It has been used in some companies where scalability and/or a micro-service architecture is important. That doesn't change the fact that it's still immature and difficult to use.
By the way, the idea behind Docker is easy to implement, but making it production ready and easy to use is not a trivial task.
About Software Distribution
Is it easy to do software distribution? Why do I give the ranking above? If you are interested in software distribution, you may also want to read How to Do Proper Software Distribution.
Footnotes:
[1] The size rank is valid if you don't consider the JVM itself part of your application. If you do, adjust file size to large. A modern Oracle JVM is over 300 MiB. The other points still hold. In the Java ecosystem, the JVM is usually handled outside the application.
[2] By almost statically linked binaries, I mean that only very basic libraries like libc, libgmp, etc. are dynamically linked. The language runtime and other third-party libraries are compiled and included in the binary.