Linux Distributions for Quantitative Finance

Choosing a Linux distribution

If you’re new to Linux you’ll face a choice between some unfamiliar distributions. In this article we try to de-mystify those choices.

So many options

When you’re creating a server instance you have to choose which Linux distribution you want to run. Fortunately the distributions all share a lot of common functionality. They mostly differ in presentation and focus.

Primary considerations

When you look over this list of distributions bear in mind what kind of server you want to build. A production server needs to be stable and reliable, while a server you’re just running for fun gives you more leeway to play with newer software and features.

Any of the distributions will run most software you need. They can all run web servers, database servers, and application servers, including the standard “LAMP stack” (Linux, Apache, MySQL, and PHP). They’re all Linux and they all have access to software package repositories containing thousands of programs put together with that distribution in mind.

Finally, consider your level of Linux administration expertise. The distributions near the beginning of this list tend to be more friendly to new admins than those later in the list. Not coincidentally this mirrors the general popularity of each distribution as a server OS.

Quick rundown

First let’s look at a brief overview of each distribution we offer to help you narrow down the field.

Ubuntu focuses on being user-friendly and offering newer software versions.

CentOS emphasizes stability and enterprise software compatibility above cutting-edge features.

Debian is similarly conservative with a focus on tested and stable software, but with easier access to a repository of newer but potentially less stable packages.

Red Hat Enterprise Linux is the best choice when you absolutely need the maximum level of enterprise software compatibility, but it carries an extra license fee.

Fedora is laid out similarly to CentOS but offers a newer and broader variety of software packages.

Gentoo gives you obsessive control over every aspect of the system and how the software it runs is compiled, making it good for people learning to program for Linux.

Arch is targeted at people who are comfortable running a Linux server and want more control over the server’s inner workings.

With that, let’s look at each distribution in more detail.

The distributions

Some of these distributions are based on others. Many share a package format and the tools that go with it: CentOS, Red Hat, and Fedora use RPM packages, while Ubuntu and Debian use .deb packages managed with APT. There are, in short, a lot of interrelations in the Linux world.
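As a rough illustration of these family relationships, the sketch below reads the ID and ID_LIKE fields that most distributions publish in /etc/os-release and guesses which package family a system belongs to. The function names and the mapping table are assumptions made for this example, not part of any distribution's tooling, and the family labels for Gentoo and Arch are informal.

```python
# Illustrative mapping from os-release ID values to a package family.
# The table is an assumption for this sketch and is not exhaustive.
PACKAGE_FAMILY = {
    "debian": "deb",
    "ubuntu": "deb",
    "rhel": "rpm",
    "centos": "rpm",
    "fedora": "rpm",
    "gentoo": "portage",
    "arch": "pacman",
}


def parse_os_release(text):
    """Parse KEY=value lines from os-release style text into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be wrapped in quotes, e.g. ID_LIKE="rhel fedora".
        fields[key.strip()] = raw.strip().strip("\"'")
    return fields


def package_family(fields):
    """Return the package family for a parsed os-release, or None if unknown."""
    # Check the distribution's own ID first, then the parents it lists.
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for name in candidates:
        if name in PACKAGE_FAMILY:
            return PACKAGE_FAMILY[name]
    return None


if __name__ == "__main__":
    sample = 'NAME="Ubuntu"\nID=ubuntu\nID_LIKE=debian\n'
    print(package_family(parse_os_release(sample)))
```

On a real server you would pass in the contents of /etc/os-release instead of the sample string. Falling back to ID_LIKE is what lets a derivative distribution be recognized through its parent.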

These similarities mean that you usually can’t go wrong. At worst some tasks will take a little more work than others, so don’t stress too much over your choice of distribution. Whatever you pick you should be fine.

Ubuntu

Ubuntu has a reputation for ease of use, which helps explain its popularity on desktops and servers. Ubuntu also helps users keep up with the latest software versions by publishing a new release on a regular schedule, every six months.

The drawback of frequent updates is that it’s harder to keep bugs from slipping into the mix. To address this, every two years Ubuntu designates one release as an LTS (“Long-Term Support”) version. The LTS version favors stable package versions over cutting-edge ones and is supported for longer, making it more suitable for use on a production server than the interim Ubuntu releases.
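Ubuntu version numbers encode the release year and month, so an LTS release can usually be recognized from the number alone. The sketch below assumes the pattern Ubuntu has followed since 8.04, where LTS releases ship in April of even-numbered years; the function name is hypothetical, and Canonical's release list remains the authoritative source.

```python
def is_lts(version):
    """Return True if an Ubuntu version string such as "22.04" looks like an LTS.

    Assumption: since 8.04, LTS releases are the .04 releases of
    even-numbered years. The first LTS, 6.06, is handled as a special case.
    """
    parts = version.split(".")
    if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
        raise ValueError(f"not an Ubuntu version string: {version!r}")
    year, month = int(parts[0]), int(parts[1])
    if (year, month) == (6, 6):
        return True
    return year >= 8 and year % 2 == 0 and month == 4


if __name__ == "__main__":
    for candidate in ("22.04", "22.10", "23.04", "24.04"):
        print(candidate, is_lts(candidate))
```

Point releases such as 22.04.3 are treated like their base version, since only the first two components are inspected.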

If you’re completely lost as to which distribution to run, Ubuntu LTS is a safe place to start. Its widespread adoption means there are many forums and sites on the Internet that provide help resources for Ubuntu users.

Ubuntu uses apt as its package manager.

CentOS

CentOS is a distribution that emphasizes reliability. It replicates Red Hat Enterprise Linux as much as possible, omitting only the non-free components of that distribution. That means CentOS is a very stable distribution and is well-suited to production environments. It also tends to be compatible with enterprise software, though it’s not always officially supported by software vendors.

The price of stability is that the software versions included with CentOS are rarely the latest and greatest. The packages included with CentOS have been tuned over time to work out as many bugs and security flaws as possible.

CentOS uses RPM packages, managed with the yum package manager (dnf on newer releases).

Debian

Debian focuses on stability and security in its official releases. In that respect it can be similar to CentOS, using older packages with proven track records. The reliability of Debian is such that several other distributions (such as Ubuntu) build on top of Debian releases.

Debian provides an “unstable” repository for ambitious server admins looking to incorporate newer software releases into a Debian installation without sacrificing the stability of the rest of the system.

Debian uses apt as its package manager.

RHEL

Red Hat Enterprise Linux (RHEL) is aimed at enterprise-level servers. That means it’s stable and handles heavy loads well.

The price for reliability is, in this case, a literal one. RHEL requires paying Red Hat an additional license fee for access to their non-free software components and updates.

The main reason to use RHEL would be if you’re running a software package that has RHEL in its list of supported operating systems. This usually means enterprise software – heavy-duty stuff aimed at larger businesses. If you’re spending that much on your software you’ll want to make sure you run it on an OS that lets you get support from the software vendor.

If you aren’t running software that requires RHEL but want to take advantage of its reliability you’ll usually be fine running CentOS instead. RHEL is worth the extra cost when it gets you vendor support or if you want to be able to take advantage of support from Red Hat itself.

If you don’t know if you use enterprise software then, well, you probably don’t. Use another distribution for now and you can switch later if you decide to migrate to software that requires RHEL.

RHEL uses RPM packages, managed with the yum package manager (dnf on newer releases).

Fedora

Fedora was originally the free version of Red Hat’s Linux distribution. Red Hat still sponsors the distribution but while Red Hat’s current distribution is very conservative in its package choices Fedora focuses on including cutting-edge software. The release cycle for Fedora is a short one as they continually update to newer software packages.

Fedora is a good choice if you want to have easy access to new software versions soon after release. Fedora is popular as a desktop distribution and for hobbyists learning Linux but it’s still a strong server distribution.

Fedora uses RPM packages, managed with the dnf package manager.

Gentoo

Gentoo is an unusual distribution in that its default behavior is to compile installed software itself instead of grabbing precompiled packages. This means that Gentoo can be intimidating for new system administrators and can take a while to set up (compiling takes time).

If you know what kind of compiler options are best for your environment then Gentoo can allow a level of system optimization that’s difficult to achieve in other distributions. You can configure system default compiler options as well as set them up on a per-package basis so they’ll be used when the package manager updates and recompiles software.

Gentoo is a great choice if you want an environment that forces you to learn more about Linux programming, or if you’re a very knowledgeable system administrator who wants fine-grained control of every aspect of the system. Otherwise you’re probably safer trying a different distribution.

Gentoo uses the Portage package manager, which is driven by the emerge command.

Arch

Arch is a system administrator’s distribution with its own design philosophy, the “Arch way”. It’s an approach that makes sense once you start learning how they’ve laid out the system but it can be a bit daunting if you’re new to Linux administration.

If you’re an experienced system administrator and want some good low-level control over how programs run on your server, but don’t want to get into the level of detail and complexity Gentoo offers, then Arch can be worth trying. If you’re new to system administration you may want to try another distribution for now. You can always take a look at Arch later when you’re more comfortable.

Arch uses the pacman package manager.
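To make the day-to-day difference between these package managers concrete, the sketch below builds (but does not run) the command line that would install a package on each distribution covered here. It assumes yum as the front end for RPM packages on CentOS and RHEL and dnf on Fedora, and it uses each tool's usual non-interactive flag. The table and function name are illustrative assumptions, and package names themselves often differ between distributions.

```python
# Base install commands per distribution (an illustrative table).
# apt-get/yum/dnf take -y to skip prompts; pacman uses --noconfirm.
INSTALL_COMMANDS = {
    "ubuntu": ["apt-get", "install", "-y"],
    "debian": ["apt-get", "install", "-y"],
    "centos": ["yum", "install", "-y"],
    "rhel": ["yum", "install", "-y"],
    "fedora": ["dnf", "install", "-y"],
    "gentoo": ["emerge"],
    "arch": ["pacman", "-S", "--noconfirm"],
}


def install_command(distro, packages):
    """Return the argument list that would install the given packages."""
    try:
        base = INSTALL_COMMANDS[distro.lower()]
    except KeyError:
        raise ValueError(f"no install command known for {distro!r}") from None
    # Copy the base list so the shared table is never mutated.
    return list(base) + list(packages)


if __name__ == "__main__":
    for name in INSTALL_COMMANDS:
        print(name, " ".join(install_command(name, ["nginx"])))
```

The result is a plain argument list, so it can be reviewed or logged before being handed to something like subprocess.run with root privileges.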
