How to Do Proper Software Distribution
Created On: 2016-09-22
I have written a post ranking software distribution in different languages, but I didn't write about what a proper software distribution is or how to create one.
A proper software distribution
- Is reproducible.
Creating a distribution should not rely on a specific person, server, or file. For example, if you depend on a private patch to some publicly available software, you should include that patch in your source code repository and apply it when building the software. The steps to create the distribution should be documented and should always produce the same artifact.
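One common reason the same steps produce different artifacts is archive metadata such as file order and modification times. Here is a minimal sketch in Python, assuming the artifact is a plain tar archive of a source directory. The function name `build_archive` is hypothetical, and a real build has more inputs to pin down (compiler versions, dependency versions, and so on).

```python
import io
import tarfile
from pathlib import Path


def build_archive(src_dir: str) -> bytes:
    """Pack every file under src_dir into an uncompressed tar archive.

    Entries are added in sorted order with fixed metadata, so the same
    file contents always yield the same bytes.
    """
    root = Path(src_dir)
    names = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in names:
            path = root / name
            info = tarfile.TarInfo(name=name)
            info.size = path.stat().st_size
            info.mtime = 0          # ignore when the file was last touched
            info.mode = 0o644       # ignore local permission bits
            info.uid = info.gid = 0  # ignore who ran the build
            info.uname = info.gname = ""
            with path.open("rb") as f:
                tar.addfile(info, f)
    return buf.getvalue()
```

Building twice from the same contents should give byte-identical output, which makes it possible to compare artifacts by checksum.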
- Is reliable to install.
Installation should not rely on the network or a third-party service to fetch dependencies. An in-house configuration server or artifact repository is okay.
If you distribute an end-user product, uploading it to PyPI is not the end of the story. Events such as PyPI being temporarily down or the left-pad package being removed from npm should not affect a proper distribution.
- External dependencies should be documented.
That includes the operating system, language runtime, database, filesystem, network requirements, compilers, locale and timezone settings, fonts, and so on. Explicitly list which environments are supported and which software and versions are required.
- Internal dependencies should be included.
Try not to ask users to install internal dependencies themselves. If you have to, document the dependencies and their supported versions. If possible, give concrete package names and version numbers. Do not use ">=3", "latest", etc. to describe the dependencies. You cannot assume that a dependency will never have backward-incompatible changes.
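A check like this can be automated. The sketch below assumes dependencies are listed as pip-style requirement lines (`name==version`), which the post does not specify. The helper `unpinned` is a hypothetical name, and the pattern is deliberately strict: it only accepts purely numeric versions, so pre-release tags such as `1.0rc1` would need a looser rule.

```python
import re

# Accepts only "name==1.2.3" style pins. Ranges, wildcards, "latest",
# and bare package names are all treated as unpinned.
_PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==\d+(\.\d+)*$")


def unpinned(requirements: list[str]) -> list[str]:
    """Return the requirement lines that do not name one exact version."""
    result = []
    for line in requirements:
        spec = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if spec and not _PINNED.match(spec):
            result.append(spec)
    return result
```

A build script could refuse to produce a release while `unpinned(...)` returns anything.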
- Has a brief description of what it is.
Not every project has a good, easily recognizable name, especially tools used internally in organizations and companies. Always include a README file or similar to document what the project is about.
- Has a proper version number, release date, and checksums.
A version number is the way to identify a release. A release date gives an impression of how up to date the software is. A checksum can be used to verify that the downloaded software has not been forged. Usually this is an md5sum, a sha1sum, or a GPG signature.
It's astonishing that even today there are still many distributions that do not list this basic information.
Since software is getting larger and Internet- and CDN-based distribution is popular, I can't say enough how important an HTTPS website and file checksums are. Verifying that you got what you expected is the first thing to do when downloading software from the Internet. This cannot be done if the distributor doesn't provide any checksum.
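On the user's side, the verification step can be as small as the sketch below. It uses SHA-256 from Python's standard `hashlib` instead of the md5sum or sha1sum mentioned above, since MD5 and SHA-1 both have known collision attacks. The function names are my own, and the expected value is assumed to be the hex digest published by the distributor.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the hex digest the distributor published."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Note that a checksum fetched from the same place as the file only protects against corruption. Protecting against forgery also requires getting the checksum over a trusted channel, such as the HTTPS website, or checking a signature.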
- Has an installation document and user guides to get users started.
The overall steps for installation and usage should be documented. If installation or initialization takes a long time, that should be documented as well.
Optional but highly recommended additions
- Can be generated and distributed automatically.
Usually this is done via a continuous integration server or pipeline. It's often as easy as turning those reproducible steps into a build script.
- Good separation of code and configuration.
Otherwise, your software may be difficult to use for people whose environments differ from yours.
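One way to get this separation, sketched here under my own assumptions, is to read every environment-specific value from environment variables with documented defaults. The `MYAPP_*` variable names, the `Settings` fields, and the default values are all hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    db_url: str
    listen_port: int
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from the environment; the code itself holds no site-specific values."""
    return Settings(
        db_url=env.get("MYAPP_DB_URL", "sqlite:///myapp.db"),
        listen_port=int(env.get("MYAPP_PORT", "8080")),
        debug=env.get("MYAPP_DEBUG", "0") == "1",
    )
```

Passing the mapping as a parameter keeps the function easy to test and lets the same build run unchanged on a laptop, a staging server, and production.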
- Has a proper changelog.
A changelog file can show how the project has evolved, how often it is released, and when an important feature or bug fix was introduced.
Note that git log is not enough as a changelog. git log has no concept of a release or distribution, and some commits may not be useful to end users. Write a proper changelog if you want to keep one. Those who are interested in the git log can run it themselves.
- Has a sanity check or health check script/user interface.
These checks can be used to detect whether the installation is correct and can be used in system monitoring as well.
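As a rough illustration, a health check can be a short script that runs a few probes and reports a pass or fail for each. The specific checks below (minimum Python version, importable modules, a writable data directory) and all names are assumptions of mine. A real product would check whatever its own installation needs.

```python
import importlib.util
import os
import sys
import tempfile


def run_checks(min_python=(3, 8), required_modules=("json", "sqlite3"), data_dir=None):
    """Return a list of (check name, passed) pairs."""
    data_dir = data_dir or tempfile.gettempdir()
    results = [("python >= %d.%d" % min_python, sys.version_info >= min_python)]
    for name in required_modules:
        # find_spec returns None for a missing top-level module.
        found = importlib.util.find_spec(name) is not None
        results.append(("module " + name, found))
    results.append(("writable " + data_dir, os.access(data_dir, os.W_OK)))
    return results


def exit_code(results) -> int:
    """Map check results to a process exit status: 0 if all passed, 1 otherwise."""
    return 0 if all(ok for _, ok in results) else 1
```

A command-line wrapper could print each result and call `sys.exit(exit_code(results))`, so that monitoring systems can rely on the exit status.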
- Has support information or bug report links.
If it is a side project, you can list your support email there. If it is an internal company tool, you can list a department or support team contact. If it is a public library or technology project, you should probably have a public bug tracker for feedback.
All of this is to provide a way to contact the authors or support team when something goes wrong. As a software developer, you should expect people to have suggestions, questions, or problems when using your software.