I recently posted a review of a book about learning Docker. I immediately got feedback that I had not done a great job of explaining exactly what Docker is. This post is a brief explanation of what Docker is and, in particular, how it compares to virtual machines.
Before talking about Docker, let's talk a little about virtual machines (VMs). Think of a VM as an emulator: you have your host OS, which is what normally runs on your computer, and inside it a virtual machine program (VMware, Hyper-V, VirtualBox, etc.). This program allows you to create and boot a virtual machine, in effect a computer inside a computer. The VM thinks it is running directly on the hardware and is unaware of the host OS.
VM Use Cases
Why would you want to do this? Previously, I wrote a series of articles about VMs, which illuminates some of the reasons. The short answer is that there are 2 main reasons. The first is that a VM lets you run a different OS than your host OS. When you create a VM, it is a blank machine, so you can install any OS you want; you can do things like run Linux or macOS inside Windows. The second main reason is isolation. The guest (i.e., the OS inside the VM) is unaware of the host, so any drivers or other software you install inside the VM can only affect the guest, not the host. Each VM is also isolated from all other VMs. This is very useful for isolating various development and testing environments and for dealing with driver conflicts.
One other nice feature of VMs is snapshots, which are exactly what they sound like: you take a snapshot of your VM, and at any later point you can simply restore the VM to that snapshot. This can be very useful for testing. For example, you can keep a virgin Windows image, test out your installer, and when you are done simply restore the snapshot of the virgin image and you are right back where you started.
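As a concrete sketch, VirtualBox exposes snapshots from the command line. The VM name `WinTest` and the snapshot name below are made up for illustration:

```shell
# Hypothetical VirtualBox example: snapshot a VM, then roll back to it.
VBoxManage snapshot "WinTest" take "clean-install"    # save the current state
# ... install and test your software inside the VM ...
VBoxManage controlvm "WinTest" poweroff               # stop the VM
VBoxManage snapshot "WinTest" restore "clean-install" # back to the clean state
```

These commands require VirtualBox and an existing VM, so treat them as a command sketch rather than a copy-paste recipe.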
So what is Docker? In some ways you can think of Docker as a lightweight VM. VMs are heavyweight in that they take up a lot of disk space, because each one contains an entire OS. They are also heavy on memory: as long as the VM is running, the whole guest OS is running and taking up RAM. One of the main advantages of Docker is that you don't duplicate the OS. Docker gives you an isolated environment similar to a VM, but using your main host OS.
Another way to think about Docker is as a jail or chroot. If you are not familiar with those concepts, they come from the Unix world (chroot is an old Unix feature; jails come from FreeBSD). Here is a good video that explains it. Basically, you run a process inside your OS but heavily restrict what it has access to: you restrict the file system and processes it sees, and its access to hardware and to the network. It runs in its own little isolated environment.
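To make the chroot idea concrete, here is a minimal sketch (the `/srv/jail` path is made up, and this needs root). The shell started at the end can only see files under `/srv/jail`, not the rest of the host:

```shell
# Minimal chroot sketch (requires root).
mkdir -p /srv/jail/bin /srv/jail/lib /srv/jail/lib64
cp /bin/sh /srv/jail/bin/      # on most systems its shared libraries
                               # would also need to be copied in
chroot /srv/jail /bin/sh       # "/" inside this shell is now /srv/jail
```

Docker does the same kind of thing, but also restricts processes, networking, and resources, not just the file system.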
When talking about Docker there are 2 important concepts: images and containers. An image is a file that represents the file system your process will see. Any utilities or libraries the process needs must be included in the image, because the file system in the image is all it will be able to see. A container is the actual running process. So when you start a container, you specify which image to load and which process (CLI command) you want to run. One thing to note is that Docker containers don't have a GUI; you interact with them through the CLI.
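A hypothetical end-to-end sketch of those two concepts (the image name `my-hello` and the script are made up): the Dockerfile describes the image's file system and default command, and each `docker run` starts a container, i.e. a running process, from that image.

```shell
# Hypothetical example: build a tiny image, then run containers from it.
cat > Dockerfile <<'EOF'
FROM alpine:3.19
COPY hello.sh /hello.sh
CMD ["/bin/sh", "/hello.sh"]
EOF
echo 'echo "hello from inside a container"' > hello.sh

docker build -t my-hello .      # bake the files into an image
docker run --rm my-hello        # a container: the image + a running process
docker run --rm my-hello ls /   # same image, different CLI command
```

Note that everything `ls /` shows comes from the Alpine image, not from your host's file system.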
Docker Use Cases
I mentioned that VMs have 2 use cases: running a different OS and isolation. Docker by its nature doesn't really solve the first problem (although if you do some research you will find it is possible to run Linux containers on Windows). Docker's main use case is isolation. You can isolate a process and all of its dependencies into a single container. One nice feature of Docker is that it has a distribution mechanism built in. It is very easy to create a program, build a Docker image around it and all its dependencies, and then make the image available to your users. They can simply pull down a single image, run the container, and use your program immediately.
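The built-in distribution mechanism is a registry. A hypothetical sketch (the account name `myuser` and image name `my-hello` are placeholders):

```shell
# Hypothetical example: publish an image, then consume it elsewhere.
docker tag my-hello myuser/my-hello:1.0   # name the image for a registry
docker push myuser/my-hello:1.0           # publish (Docker Hub by default)

# A user on a completely different machine needs only:
docker pull myuser/my-hello:1.0
docker run --rm myuser/my-hello:1.0
```

The pulled image carries the program and all of its dependencies, which is why the user needs nothing else installed besides Docker itself.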
Docker is also widely used for web services. It is very easy to spin up another copy of a web service to meet demand. Docker (via its Swarm mode) also allows you to cluster several computers together and will do automatic load balancing across them. When you hear people talk about microservices, they are typically implemented with Docker. This whole process of managing and scaling containers is often referred to as orchestration.
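A hypothetical sketch of that scaling story using Docker's built-in Swarm mode (the service name `web` is made up; `nginx` stands in for your web service):

```shell
# Hypothetical Swarm-mode sketch: cluster, scale, load-balance.
docker swarm init                            # make this machine a manager
docker service create --name web -p 80:80 --replicas 2 nginx
docker service scale web=5                   # spin up more copies on demand
# Swarm spreads the replicas across the clustered machines and
# load-balances requests arriving on port 80 across them.
```

Other orchestrators (Kubernetes being the most popular) do the same job at a larger scale.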
My main use case for Docker is Continuous Integration (CI). In my review of *Docker in Action*, I explain why I think Docker is better than VMs for CI. The main thing I want to be able to do (still working out all the kinks) is to use the GitLab Docker executor to spin up containers on demand for various CI tasks. I also want to be able to run those same containers locally, so I can run isolated tests and builds before I push my changes to the server.
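A minimal sketch of what that GitLab setup can look like (the image and commands are placeholders, not my actual config). With the Docker executor, each job runs inside a container created from the listed image, and the same image can be run locally with `docker run` to reproduce the job's environment:

```yaml
# Hypothetical .gitlab-ci.yml fragment for the Docker executor.
test:
  image: python:3.12          # the container the job runs inside
  script:
    - pip install -r requirements.txt
    - pytest
```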
| Feature | Docker | VM |
|---|---|---|
| Run a different OS | no\* | yes |
| Disk space usage | low | high |
| Built-in distribution mechanism | yes | no |

\* Though as noted above, it is possible to run Linux containers on Windows.
Hopefully, that answers any questions you might have about Docker. I am not an expert (I literally just read a book, did a few exercises, and have played around with it a little bit). I tried to be as accurate as I could, but in an effort to make things easier to understand, I simplified some things and left out a few details.