So, a new R package is ready to submit to CRAN, but...
I never know which license to use and why. Please feel free to argue.
#rstats
Academic musings about licences
@HydrePrever@mathstodon.xyz To be honest, there is no single right answer to your question.
What is your aim? In the end, the decision reflects a balance between philosophical open source ideals, practicalities regarding package dependencies, and your personal or institutional goals for code sharing.
Here is my academic insight.
These licenses are legal instruments. Choosing a license for an R package depends on how much control and openness you want for the code, compatibility with dependencies, and your philosophical stance on software freedom vs. permissiveness. You are right, the most common licenses on CRAN are MIT, BSD-3, GPL-2/3, Apache 2.0, and MPL 2.0.
For example, the GPL-3 is a strong copyleft license, meaning any derivative work that is distributed must also be licensed under the GPL. It guarantees users the freedom to share, modify, and distribute modified versions, but all such derivatives must be open-sourced under the same terms. If the package links (statically or dynamically) to other GPL-licensed software, it almost always needs to be GPL, or at least GPL-compatible, as well. The viral nature of the GPL means it enforces openness, which some see as a benefit for building a truly open ecosystem, while others view it as restrictive if they want their code used in closed or commercial contexts. If maximizing openness and ensuring that all derivatives remain open source is important, GPL-3 is a natural fit. By comparison, the MPL 2.0 is a weak copyleft license: only the modified MPL files must remain under the MPL, while the rest of a larger work can stay proprietary.
In contrast, the MIT license (like BSD-3) is simple, very common, and highly permissive. It lets anyone use, copy, modify, merge, publish, distribute, sublicense, and sell copies of the software, as long as the original copyright and license notice remain. That makes it a go-to for developers who want the broadest adoption with the least friction. Importantly, MIT-licensed code can be re-used in both open and closed source projects. For maximally broad reuse, minimal restrictions, and commercial friendliness, MIT is generally preferred. IIRC, Apache 2.0 is also permissive but adds an explicit patent grant from contributors, plus a clause that ends that grant for anyone who starts patent litigation over the software.
CC0 ("No Rights Reserved") is basically a public domain dedication (rare on CRAN for code), waiving as many rights as legally possible. It is ideal for pure data packages or for code that the author wants to be maximally reused without even attribution. However, it is less common for code and may cause confusion, as many open source projects and organizations are more familiar with MIT or GPL. As a related example, ImageJ is not CC0 but is in the public domain, and it became the de facto standard for many people doing image analysis.
Other licenses include Apache 2.0, BSD, MPL, or even more restrictive proprietary licenses. Each comes with its own nuances, such as patent clauses (Apache) or specific attribution requirements (BSD). CRAN policies generally expect open source licenses, so proprietary licenses are not accepted for packages published there.
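In practice, the choice ends up as a single `License:` field in the package's DESCRIPTION file. A minimal sketch for the MIT case, following the CRAN convention as far as I know it (the year and the copyright holder below are placeholders, not taken from any real package):

```
License: MIT + file LICENSE
```

with a two-line LICENSE file next to it:

```
YEAR: 2025
COPYRIGHT HOLDER: Jane Doe
```

For the copyleft route, the field would instead read `License: GPL-3` (or `GPL (>= 3)`), and no extra LICENSE file is needed. `R CMD check` complains if the field is not in a form it recognises.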
I have MIT and GPL licensed packages on CRAN, and I am fine with both.
@pangolin thank you for this nice summary.
I hope it helps a bit. Do you have a link to your package?