{"id":1590,"date":"2019-03-22T05:57:28","date_gmt":"2019-03-22T05:57:28","guid":{"rendered":"http:\/\/kusuaks7\/?p=1195"},"modified":"2023-07-13T10:30:25","modified_gmt":"2023-07-13T10:30:25","slug":"learn-enough-docker-to-be-useful-part-3-a-dozen-dandy-dockerfile-instructions","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/learn-enough-docker-to-be-useful-part-3-a-dozen-dandy-dockerfile-instructions\/","title":{"rendered":"Learn Enough Docker to be Useful &#8211; Part 3: A Dozen Dandy Dockerfile Instructions"},"content":{"rendered":"<p>This article is all about Dockerfiles. It\u2019s the third installment in a six-part series on Docker. If you haven\u2019t read\u00a0<a href=\"https:\/\/www.experfy.com\/blog\/learn-enough-docker-to-be-useful-part1-the-conceptual-landscape\">Part 1<\/a>, read it first and see Docker container concepts in a whole new light. <a href=\"https:\/\/www.experfy.com\/blog\/learn-enough-docker-to-be-useful-part-2-a-delicious-dozen-docker-terms-you-need-to-know\">Part 2<\/a>\u00a0is a quick run-through of the Docker ecosystem. In\u00a0future articles, I\u2019ll look at slimming down Docker images, Docker CLI commands, and using data with Docker.<\/p>\n<section>\n<p id=\"cb6c\">Let\u2019s jump into the dozen Dockerfile instructions to know!<\/p>\n<figure id=\"c63b\"><canvas width=\"75\" height=\"48\"><\/canvas><img decoding=\"async\" style=\"width: 640px; height: 426px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*Sh6i23HzQcvVy_DUHnmrIw.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*Sh6i23HzQcvVy_DUHnmrIw.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">Jump in. True picture\u00a0\ud83d\ude09<\/p>\n<h3 id=\"ec01\">Docker Images<\/h3>\n<p id=\"1713\">Recall that a Docker container is a Docker image brought to life. 
It\u2019s a self-contained, minimal operating system with application code.<\/p>\n<p id=\"e8a2\">The Docker image is created at build time and the Docker container is created at run time.<\/p>\n<p id=\"c755\">The Dockerfile is at the heart of Docker. The Dockerfile tells Docker how to build the image that will be used to make containers.<\/p>\n<p id=\"4f5e\">Each Docker image is built from a file that is conventionally named\u00a0<em>Dockerfile\u00a0<\/em>with no extension. The Dockerfile is assumed to be in the current working directory when\u00a0<code>docker build<\/code>\u00a0is called to create an image. A different location can be specified with the file flag (<code>-f<\/code>).<\/p>\n<p id=\"0d7c\">Recall that a container is built from a series of layers. Each layer is read-only, except the final container layer that sits on top of the others. The Dockerfile tells Docker which layers to add and in which order to add them.<\/p>\n<p id=\"0e55\">Each layer is really just a file with the changes since the previous layer. In Unix, pretty much everything is a\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Everything_is_a_file\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Everything_is_a_file\" data->file<\/a>.<\/p>\n<p id=\"dc35\">The base image provides the initial layer(s). A base image is also called a parent image.<\/p>\n<p id=\"dadf\">When an image is pulled from a remote repository to a local machine, only layers that are not already on the local machine are downloaded. 
Docker is all about saving space and time by reusing existing layers.<\/p>\n<figure id=\"f2de\"><canvas width=\"75\" height=\"56\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*AqNC_3Nefyo-enzgRjTD4w.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*AqNC_3Nefyo-enzgRjTD4w.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">A base (jumping) image<\/p>\n<p id=\"428b\">A Dockerfile instruction is a capitalized word at the start of a line followed by its arguments. Each line in a Dockerfile can contain an instruction. Instructions are processed from top to bottom when an image is built. Instructions look like this:<\/p>\n<pre id=\"5ddc\"><code>FROM ubuntu:18.04\r\nCOPY . \/app<\/code><\/pre>\n<p id=\"4748\">Only the instructions FROM, RUN, COPY, and ADD create layers in the final image. Other instructions configure things, add metadata, or tell Docker to do something at run time, such as expose a port or run a command.<\/p>\n<p id=\"7c84\">In this article, I\u2019m assuming you are using a Unix-based Docker image. You can also use Windows-based images, but that\u2019s a slower, less-pleasant, less-common process. So use Unix if you can.<\/p>\n<p id=\"c4a2\">Let\u2019s do a quick once-over of the dozen Dockerfile instructions we\u2019ll explore.<\/p>\n<h3 id=\"1253\">A Dozen Dockerfile Instructions<\/h3>\n<p id=\"bfc2\"><code>FROM<\/code>\u200a\u2014\u200aspecifies the base (parent) image.<br \/>\n<code>LABEL<\/code>\u200a\u2014\u200aprovides metadata. Good place to include maintainer info.<br \/>\n<code>ENV<\/code>\u200a\u2014\u200asets a persistent environment variable.<br \/>\n<code>RUN<\/code>\u200a\u2014\u200aruns a command and creates an image layer. Used to install packages into containers.<br \/>\n<code>COPY<\/code>\u200a\u2014\u200acopies files and directories to the container.<br \/>\n<code>ADD<\/code>\u200a\u2014\u200acopies files and directories to the container. 
Can unpack local\u00a0.tar files.<br \/>\n<code>CMD<\/code>\u200a\u2014\u200aprovides a command and arguments for an executing container. Parameters can be overridden. There can be only one CMD.<br \/>\n<code>WORKDIR<\/code>\u200a\u2014\u200asets the working directory for the instructions that follow.<br \/>\n<code>ARG<\/code>\u200a\u2014\u200adefines a variable to pass to Docker at build-time.<br \/>\n<code>ENTRYPOINT<\/code>\u200a\u2014\u200aprovides command and arguments for an executing container. Arguments persist.<br \/>\n<code>EXPOSE<\/code>\u200a\u2014\u200aexposes a port.<br \/>\n<code>VOLUME<\/code>\u200a\u2014\u200acreates a directory mount point to access and store persistent data.<\/p>\n<p id=\"f105\">Let\u2019s get to it!<\/p>\n<h3 id=\"dbb4\">Instructions and\u00a0Examples<\/h3>\n<p id=\"4888\">A Dockerfile can be as simple as this single line:<\/p>\n<pre id=\"8e58\">FROM ubuntu:18.04<\/pre>\n<h4 id=\"69a2\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#from\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#from\" data-><strong>FROM<\/strong><\/a><\/h4>\n<p id=\"0b08\">A Dockerfile must start with a FROM instruction or an ARG instruction followed by a FROM instruction.<\/p>\n<p id=\"06ea\">The FROM<em>\u00a0<\/em>keyword tells Docker to use a base image that matches the provided repository and tag. A base image is also called a\u00a0<a href=\"https:\/\/docs.docker.com\/develop\/develop-images\/baseimages\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/develop\/develop-images\/baseimages\/\" data->parent image<\/a>.<\/p>\n<p id=\"d0f9\">In this example,\u00a0<em>ubuntu\u00a0<\/em>is the image repository. 
Ubuntu is the name of an\u00a0<a href=\"https:\/\/hub.docker.com\/_\/ubuntu\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/hub.docker.com\/_\/ubuntu\" data->official Docker repository<\/a>\u00a0that provides a basic version of the popular Ubuntu version of the Linux operating system.<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*OiaeQ0JXfdeluoJq-liKYg.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*OiaeQ0JXfdeluoJq-liKYg.png\" \/><\/p>\n<p style=\"text-align: center;\"><span style=\"text-align: center;\">Linux mascot\u00a0Tux<\/span><\/p>\n<p id=\"ae42\">Notice that this Dockerfile includes a tag for the base image:\u00a0<em>18.04<\/em>\u00a0. This tag tells Docker which version of the image in the\u00a0<em>ubuntu<\/em>\u00a0repository to pull. If no tag is included, then Docker assumes the\u00a0<em>latest\u00a0<\/em>tag<em>,\u00a0<\/em>by default. To make your intent clear, it\u2019s good practice to specify a base image tag.<\/p>\n<p id=\"ddcc\">When the Dockerfile above is used to build an image locally for the first time, Docker downloads the layers specified in the\u00a0<em>ubuntu<\/em>\u00a0image. The layers can be thought of as stacked upon each other. 
Each layer is a file with the set of differences from the layer before it.<\/p>\n<p id=\"0580\">When you create a container, you add a writable layer on top of the read-only layers.<\/p>\n<figure id=\"c2a6\"><canvas width=\"75\" height=\"51\"><\/canvas><img decoding=\"async\" style=\"width: 675px; height: 469px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*l8YiwkfvUQsG_uGv9OyJDw.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*l8YiwkfvUQsG_uGv9OyJDw.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">From the\u00a0<a href=\"https:\/\/docs.docker.com\/v17.09\/engine\/userguide\/storagedriver\/imagesandcontainers\/#images-and-layers\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/v17.09\/engine\/userguide\/storagedriver\/imagesandcontainers\/#images-and-layers\" data->Docker\u00a0Docs<\/a><\/p>\n<p id=\"25be\">Docker uses a copy-on-write strategy for efficiency. If a layer exists at a previous level within an image, and another layer needs read access to it, Docker uses the existing file. Nothing needs to be downloaded.<\/p>\n<p id=\"e0ea\">When a container is running, if a file in a lower layer needs to be modified, then that file is copied into the top, writable layer. Check out the Docker docs\u00a0<a href=\"https:\/\/docs.docker.com\/v17.09\/engine\/userguide\/storagedriver\/imagesandcontainers\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/v17.09\/engine\/userguide\/storagedriver\/imagesandcontainers\/\" data->here<\/a>\u00a0to learn more about copy-on-write.<\/p>\n<\/section>\n<section>\n<hr \/>\n<h4 id=\"dc58\">A More Substantive Dockerfile<\/h4>\n<p id=\"34fb\">Although our one-line image is concise, it\u2019s also slow, provides little information, and does nothing at container run time. 
Let\u2019s look at a longer Dockerfile that builds a much smaller size image and executes a script at container run time.<\/p>\n<p id=\"0c54\">Whoa, what\u2019s going on here? Let\u2019s step through it and demystify.<\/p>\n<p id=\"7d5c\">The base image is an official Python image with the tag\u00a0<em>3.7.2-alpine3.8<\/em>. As you can see from its\u00a0<a href=\"https:\/\/github.com\/docker-library\/python\/blob\/ab8b829cfefdb460ebc17e570332f0479039e918\/3.7\/alpine3.8\/Dockerfile\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/docker-library\/python\/blob\/ab8b829cfefdb460ebc17e570332f0479039e918\/3.7\/alpine3.8\/Dockerfile\" data->source code<\/a>, the image includes Linux, Python and not much else. Alpine images are popular because they are small, fast, and secure. However, Alpine images don\u2019t come with many operating system niceties. You must install such packages yourself, should you need them.<\/p>\n<h4 id=\"31e2\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#label\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#label\" data-><strong>LABEL<\/strong><\/a><\/h4>\n<p id=\"ba6e\">The next instruction is LABEL. LABEL adds metadata to the image. In this case, it provides the image maintainer\u2019s contact info. Labels don\u2019t slow down builds or take up space and they do provide useful information about the Docker image, so definitely use them. 
More about LABEL metadata can be found\u00a0<a href=\"https:\/\/docs.docker.com\/config\/labels-custom-metadata\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/config\/labels-custom-metadata\/\" data->here<\/a>.<\/p>\n<figure id=\"08c6\"><canvas width=\"75\" height=\"41\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*BxNuK7OE4yohf4ASHPemdQ.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*BxNuK7OE4yohf4ASHPemdQ.jpeg\" \/><\/figure>\n<h4 id=\"3c13\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#env\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#env\" data->ENV<\/a><\/h4>\n<p id=\"9592\">ENV sets a persistent environment variable that is available at container run time. In the example above, you could use the ADMIN variable when your Docker container is created.<\/p>\n<p id=\"9547\">ENV is nice for setting constants. If you use a constant in several places in your Dockerfile and want to change its value at a later time, you can do so in one location.<\/p>\n<figure id=\"4217\" data-scroll=\"native\"><canvas width=\"75\" height=\"48\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*RqSnnTIeR5dcUeUMfVP4jw.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*RqSnnTIeR5dcUeUMfVP4jw.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">Environment<\/p>\n<p id=\"49a5\">With Dockerfiles there are often multiple ways to accomplish the same thing. The best method for your case is a matter of balancing Docker conventions, transparency, and speed. 
For example, RUN, CMD, and ENTRYPOINT serve different purposes, and can all be used to execute commands.<\/p>\n<h4 id=\"10ea\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#run\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#run\" data->RUN<\/a><\/h4>\n<p id=\"f979\">RUN creates a layer at build-time. Docker commits the state of the image after each RUN.<\/p>\n<p id=\"f804\">RUN is often used to install packages into an image<em>.\u00a0<\/em>In the example above,\u00a0<code>RUN apk update &amp;&amp; apk upgrade<\/code>\u00a0tells Docker to update the packages from the base image<em>.\u00a0<\/em><code>&amp;&amp; apk add bash<\/code>\u00a0tells Docker to install\u00a0<em>bash<\/em>\u00a0into the image.<\/p>\n<p id=\"1775\"><em>apk\u00a0<\/em>is the\u00a0<a href=\"https:\/\/www.cyberciti.biz\/faq\/10-alpine-linux-apk-command-examples\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.cyberciti.biz\/faq\/10-alpine-linux-apk-command-examples\/\" data->Alpine Linux package manager<\/a>. If you\u2019re using a Debian- or Ubuntu-based image instead of Alpine, then you\u2019d install packages with RUN\u00a0<em>apt-get<\/em>\u00a0instead of\u00a0<em>apk<\/em>.\u00a0<em>apt<\/em>\u00a0stands for\u00a0<em>advanced package tool<\/em>. I\u2019ll discuss other ways to install packages in a later example.<\/p>\n<figure id=\"bf79\" data-scroll=\"native\"><canvas width=\"75\" height=\"52\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*ovB1vggQ-wDamyzj4YFaqQ.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*ovB1vggQ-wDamyzj4YFaqQ.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">RUN<\/p>\n<p id=\"b4b3\">RUN\u200a\u2014\u200aand its cousins, CMD and ENTRYPOINT\u200a\u2014\u200acan be used in exec form or shell form. 
Exec form uses JSON array syntax like so:\u00a0<code>RUN [\"my_executable\", \"my_first_param1\", \"my_second_param2\"]<\/code>.<\/p>\n<p id=\"0761\">In the example above, we used shell form in the format\u00a0<code>RUN apk update &amp;&amp; apk upgrade &amp;&amp; apk add bash<\/code>.<\/p>\n<p id=\"d452\">Later in our Dockerfile we used the preferred exec form with\u00a0<code>RUN [\"mkdir\", \"\/a_directory\"]<\/code>\u00a0to create a directory. Don\u2019t forget to use double quotes for strings with JSON syntax for exec form!<\/p>\n<h4 id=\"2368\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#copy\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#copy\" data->COPY<\/a><\/h4>\n<p id=\"08c7\">The\u00a0<code>COPY\u00a0.\u00a0.\/app<\/code><em>\u00a0<\/em>instruction tells Docker to take the files and folders in your local build context and add them to the\u00a0<em>app<\/em>\u00a0directory inside the Docker image\u2019s current working directory. COPY will create the target directory if it doesn\u2019t exist.<\/p>\n<figure id=\"0703\" data-scroll=\"native\"><canvas width=\"75\" height=\"48\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*xHw_3tj8JooiRp-yWZCIyQ.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*xHw_3tj8JooiRp-yWZCIyQ.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">COPY<\/p>\n<h4 id=\"849c\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#add\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#add\" data-><strong>ADD<\/strong><\/a><\/h4>\n<p id=\"af6e\">ADD does the same thing as COPY, but has two more use cases.\u00a0ADD can be used to move files from a remote URL to a container and ADD can extract local TAR files.<\/p>\n<p id=\"dd52\">I used ADD in the example above to copy a file from a remote URL into the container\u2019s\u00a0<em>my_app_directory<\/em>. 
The\u00a0<a href=\"https:\/\/docs.docker.com\/develop\/develop-images\/dockerfile_best-practices\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/develop\/develop-images\/dockerfile_best-practices\/\" data->Docker docs<\/a>\u00a0don\u2019t recommend using remote URLs in this manner because you can\u2019t delete the downloaded files afterward within the same layer. Extra files increase the final image size.<\/p>\n<p id=\"cccb\">The\u00a0<a href=\"https:\/\/docs.docker.com\/develop\/develop-images\/dockerfile_best-practices\/#add-or-copy\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/develop\/develop-images\/dockerfile_best-practices\/#add-or-copy\" data->Docker docs<\/a>\u00a0also suggest using COPY instead of ADD whenever possible for improved clarity. It\u2019s too bad that Docker doesn\u2019t combine ADD and COPY into a single command to reduce the number of Dockerfile instructions to keep straight.<\/p>\n<p id=\"94b5\">Note that the ADD instruction contains the\u00a0<code>\\<\/code>\u00a0line continuation character. Use it to improve readability by breaking up a long instruction over several lines.<\/p>\n<h4 id=\"5035\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#cmd\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#cmd\" data-><strong>CMD<\/strong><\/a><\/h4>\n<p id=\"1aab\">CMD provides Docker a command to run when a container is started. It does not commit the result of the command to the image at build time. 
In the example above, CMD will have the Docker container run the\u00a0<em>my_script.py<\/em>\u00a0file at run time.<\/p>\n<figure id=\"ef57\" data-scroll=\"native\"><canvas width=\"75\" height=\"47\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*534Bheu2kcpBWl8J4vX4rA.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*534Bheu2kcpBWl8J4vX4rA.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">That\u2019s a\u00a0CMD!<\/p>\n<p id=\"149a\">A few other things to know about CMD:<\/p>\n<ul>\n<li id=\"cadf\">Only one CMD instruction takes effect per Dockerfile. If there are several, all but the final one are ignored.<\/li>\n<li id=\"0347\">CMD can include an executable. If CMD is present without an executable, then an ENTRYPOINT instruction must exist. In that case, both CMD and ENTRYPOINT instructions should be in JSON format.<\/li>\n<li id=\"c571\">Command line arguments to\u00a0<code>docker run<\/code>\u00a0override arguments provided to CMD in the Dockerfile.<\/li>\n<\/ul>\n<\/section>\n<section>\n<hr \/>\n<h4 id=\"4a53\">Ready for\u00a0more?<\/h4>\n<p id=\"38af\">Let\u2019s introduce a few more instructions in another example Dockerfile.<\/p>\n<p>Note that you can use comments in Dockerfiles. Comments start with\u00a0<code>#<\/code>.<\/p>\n<p id=\"8ea2\">Package installation is a primary job of Dockerfiles. As touched on earlier, there are several ways to install packages with RUN.<\/p>\n<p id=\"614d\">You can install a package in an Alpine Docker image with\u00a0<em>apk<\/em>.\u00a0<em>apk\u00a0<\/em>is like\u00a0<em>apt-get\u00a0<\/em>in Debian and Ubuntu images. 
For example, packages in a Dockerfile with a base Ubuntu image can be updated and installed like this:\u00a0<code>RUN apt-get update &amp;&amp; apt-get install -y my_package<\/code>.<\/p>\n<p id=\"dcee\">In addition to\u00a0<em>apk<\/em>\u00a0and\u00a0<em>apt-get<\/em>, Python packages can be installed through\u00a0<a href=\"https:\/\/pypi.org\/project\/pip\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/pypi.org\/project\/pip\/\" data-><em>pip<\/em><\/a>,\u00a0<a href=\"https:\/\/pythonwheels.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/pythonwheels.com\/\" data-><em>wheel<\/em><\/a>, and\u00a0<a href=\"https:\/\/medium.com\/@chadlagore\/conda-environments-with-docker-82cdc9d25754\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/@chadlagore\/conda-environments-with-docker-82cdc9d25754\" data-><em>conda<\/em><\/a>. Other languages can use various installers.<\/p>\n<p id=\"376f\">The underlying layers need to provide the install layer with the relevant package manager. If you\u2019re having an issue with package installation, make sure the package managers are installed before you try to use them.<\/p>\n<p id=\"e427\">You can use RUN with pip and list the packages you want installed directly in your Dockerfile. If you do this, concatenate your package installs into a single instruction and break it up with line continuation characters (\\). This method provides clarity and fewer layers than multiple RUN instructions.<\/p>\n<p id=\"5e52\">Alternatively, you can list your package requirements in a file and RUN a package manager on that file. Folks usually name the file\u00a0<em>requirements.txt<\/em>. 
I\u2019ll share a recommended pattern to take advantage of build-time caching with\u00a0<em>requirements.txt\u00a0<\/em>in the next article.<\/p>\n<h4 id=\"7400\"><a href=\"https:\/\/docs.docker.com\/v17.09\/engine\/reference\/builder\/#workdir\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/v17.09\/engine\/reference\/builder\/#workdir\" data->WORKDIR<\/a><\/h4>\n<p id=\"aba3\">WORKDIR changes the working directory in the container for the COPY, ADD, RUN, CMD, and ENTRYPOINT instructions that follow it. A few notes:<\/p>\n<ul>\n<li id=\"a363\">It\u2019s preferable to set an absolute path with WORKDIR rather than navigate through the file system with\u00a0<code>cd<\/code>\u00a0commands in the Dockerfile.<\/li>\n<li id=\"a5d9\">WORKDIR creates the directory automatically if it doesn\u2019t exist.<\/li>\n<li id=\"9d08\">You can use multiple WORKDIR instructions. If a relative path is provided, it is resolved relative to the path set by the previous WORKDIR instruction.<\/li>\n<\/ul>\n<figure id=\"784f\" data-scroll=\"native\"><canvas width=\"75\" height=\"56\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*IriKIqsuPjl-A9zqU4NUyw.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/2400\/1*IriKIqsuPjl-A9zqU4NUyw.jpeg\" \/><\/figure>\n<h4 style=\"text-align: center;\">WORKDIRs of some\u00a0sort<\/h4>\n<h4 id=\"8e85\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#arg\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#arg\" data->ARG<\/a><\/h4>\n<p id=\"9ab8\">ARG defines a variable to pass from the command line to the image at build-time with\u00a0<code>docker build --build-arg<\/code>. A default value can be supplied for ARG in the Dockerfile, as it is in the example:\u00a0<code>ARG my_var=my_default_value<\/code>.<\/p>\n<p id=\"070a\">Unlike ENV variables, ARG variables are not available to running containers. 
However, you can use ARG values to set a default value for an ENV variable from the command line when you build the image. Then, the ENV variable persists through container run time. Learn more about this technique\u00a0<a href=\"https:\/\/vsupalov.com\/docker-build-time-env-values\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/vsupalov.com\/docker-build-time-env-values\/\" data->here<\/a>.<\/p>\n<h4 id=\"b944\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#entrypoint\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#entrypoint\" data->ENTRYPOINT<\/a><\/h4>\n<p id=\"e14b\">The ENTRYPOINT instruction also allows you to provide a default command and arguments when a container starts. It looks similar to CMD, but ENTRYPOINT parameters are not overwritten if a container is run with command line parameters.<\/p>\n<p id=\"187a\">Instead, command line arguments passed to\u00a0<code>docker run my_image_name<\/code>\u00a0are appended to the ENTRYPOINT instruction\u2019s arguments. 
For example,\u00a0<code>docker run my_image bash<\/code>\u00a0adds the argument\u00a0<em>bash<\/em>\u00a0to the end of the ENTRYPOINT instruction\u2019s existing arguments.<\/p>\n<figure id=\"9104\"><canvas width=\"75\" height=\"41\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*8kfhJgqmW34UYMhDnHUcFA.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*8kfhJgqmW34UYMhDnHUcFA.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">ENTRYPOINT to somewhere<\/p>\n<p id=\"66e2\">A Dockerfile should have at least one CMD or ENTRYPOINT instruction.<\/p>\n<p id=\"29e9\">The\u00a0<a href=\"https:\/\/docs.docker.com\/v17.09\/engine\/reference\/builder\/#understand-how-cmd-and-entrypoint-interact\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/v17.09\/engine\/reference\/builder\/#understand-how-cmd-and-entrypoint-interact\" data->Docker docs<\/a>\u00a0have a few suggestions for choosing between CMD and ENTRYPOINT for your initial container command:<\/p>\n<ul>\n<li id=\"9ca2\">Favor ENTRYPOINT when you need to run the same command every time.<\/li>\n<li id=\"72f8\">Favor ENTRYPOINT when a container will be used as an executable program.<\/li>\n<li id=\"5033\">Favor CMD when you need to provide extra default arguments that could be overwritten from the command line.<\/li>\n<\/ul>\n<p id=\"9ebb\">In the example above,\u00a0<code>ENTRYPOINT [\"python\", \"my_script.py\", \"my_var\"]<\/code>\u00a0has the container run the Python script\u00a0<em>my_script.py\u00a0<\/em>with the argument\u00a0<em>my_var\u00a0<\/em>when the container starts running<em>. my_var\u00a0<\/em>could then be used by\u00a0<em>my_script<\/em>\u00a0via\u00a0<a href=\"https:\/\/docs.python.org\/3\/library\/argparse.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.python.org\/3\/library\/argparse.html\" data->argparse<\/a>. 
Note that in exec form the string\u00a0<em>my_var<\/em>\u00a0is passed to the script literally. Docker does not substitute the value of the ARG variable defined earlier in the Dockerfile, and ARG values are not available at container run time.<\/p>\n<p id=\"1507\">Docker recommends you generally use the exec form of ENTRYPOINT:\u00a0<code>ENTRYPOINT [\"executable\", \"param1\", \"param2\"]<\/code>. This form is the one with JSON array syntax.<\/p>\n<h4 id=\"478c\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#expose\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#expose\" data->EXPOSE<\/a><\/h4>\n<p id=\"69b8\">The EXPOSE instruction shows which port is intended to be published to provide access to the running container. EXPOSE does not actually publish the port. Rather, it acts as documentation between the person who builds the image and the person who runs the container.<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*S2QyTx1gOWtsIoXPzyhCPQ.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*S2QyTx1gOWtsIoXPzyhCPQ.jpeg\" \/><\/p>\n<p style=\"text-align: center;\"><span style=\"text-align: center;\">EXPOSEd<\/span><\/p>\n<p id=\"b4a7\">Use\u00a0<code>docker run<\/code>\u00a0with the\u00a0<code>-p<\/code>\u00a0flag to publish and map one or more ports at run time. The uppercase\u00a0<code>-P<\/code>\u00a0flag will publish all exposed ports.<\/p>\n<h4 id=\"abbb\"><a href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#volume\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/engine\/reference\/builder\/#volume\" data-><strong>VOLUME<\/strong><\/a><\/h4>\n<p id=\"05df\">VOLUME specifies where your container will store and\/or access persistent data. 
Volumes are the topic of a forthcoming article in this series, so we\u2019ll investigate them then.<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*xZT5mao1DZlKu0Zsl9Zc7w.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*xZT5mao1DZlKu0Zsl9Zc7w.jpeg\" \/><\/p>\n<p style=\"text-align: center;\"><span style=\"text-align: center;\">VOLUME<\/span><\/p>\n<p id=\"c32e\">Let\u2019s review the dozen Dockerfile instructions we\u2019ve explored.<\/p>\n<h3 id=\"c915\">Important Dockerfile Instructions<\/h3>\n<p id=\"2225\"><code>FROM<\/code>\u200a\u2014\u200aspecifies the base (parent) image.<br \/>\n<code>LABEL<\/code>\u200a\u2014\u200aprovides metadata. Good place to include maintainer info.<br \/>\n<code>ENV<\/code>\u200a\u2014\u200asets a persistent environment variable.<br \/>\n<code>RUN<\/code>\u200a\u2014\u200aruns a command and creates an image layer. Used to install packages into containers.<br \/>\n<code>COPY<\/code>\u200a\u2014\u200acopies files and directories to the container.<br \/>\n<code>ADD<\/code>\u200a\u2014\u200acopies files and directories to the container. Can unpack local\u00a0.tar files.<br \/>\n<code>CMD<\/code>\u200a\u2014\u200aprovides a command and arguments for an executing container. Parameters can be overridden. There can be only one CMD.<br \/>\n<code>WORKDIR<\/code>\u200a\u2014\u200asets the working directory for the instructions that follow.<br \/>\n<code>ARG<\/code>\u200a\u2014\u200adefines a variable to pass to Docker at build-time.<br \/>\n<code>ENTRYPOINT<\/code>\u200a\u2014\u200aprovides command and arguments for an executing container. Arguments persist.<br \/>\n<code>EXPOSE<\/code>\u200a\u2014\u200aexposes a port.<br \/>\n<code>VOLUME<\/code>\u200a\u2014\u200acreates a directory mount point to access and store persistent data.<\/p>\n<p id=\"5caf\">Now you know a dozen Dockerfile instructions to make yourself useful! 
Here\u2019s a bonus bagel: a\u00a0<a href=\"https:\/\/kapeli.com\/cheat_sheets\/Dockerfile.docset\/Contents\/Resources\/Documents\/index\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/kapeli.com\/cheat_sheets\/Dockerfile.docset\/Contents\/Resources\/Documents\/index\" data->cheat sheet<\/a>\u00a0with all the Dockerfile instructions. The five commands we didn\u2019t cover are USER, ONBUILD, STOPSIGNAL, SHELL, and HEALTHCHECK. Now you\u2019ve seen their names if you come across them.<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*wO-_QeIh-YFXRZeEMrczRw.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*wO-_QeIh-YFXRZeEMrczRw.jpeg\" \/><\/p>\n<p style=\"text-align: center;\"><span style=\"text-align: center;\">Bonus bagel<\/span><\/p>\n<h3 id=\"76df\">Wrap<\/h3>\n<p id=\"504d\">Dockerfiles are perhaps the key component of Docker to master. I hope this article helped you gain confidence with them. We\u2019ll revisit them in the\u00a0next article in this series on slimming down images.\u00a0Follow me to make sure you don\u2019t miss it!<\/p>\n<\/section>\n","protected":false},"excerpt":{"rendered":"<p>A Dockerfile instruction is a capitalized word at the start of a line followed by its arguments. Each line in a Dockerfile can contain an instruction. Instructions are processed from top to bottom when an image is built. In this article, I&rsquo;m assuming you are using a Unix-based Docker image. You can also use Windows-based images, but that&rsquo;s a slower, less-pleasant, less-common process. So use Unix if you can. 
Let&rsquo;s do a quick once-over of the dozen Dockerfile instructions we&rsquo;ll explore.<\/p>\n","protected":false},"author":369,"featured_media":4211,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[187],"tags":[94],"ppma_author":[2134],"class_list":["post-1590","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-data-science"],"authors":[{"term_id":2134,"user_id":369,"is_guest":0,"slug":"jeff-hale","display_name":"Jeff Hale","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Hale","first_name":"Jeff","job_title":"","description":"Jeff Hale is a co-founder of Rebel Desk, where he oversees technology, finance, and operations for this company. He&nbsp;is an experienced entrepreneur who has managed technology, operations, and finances for several companies.&nbsp;"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1590","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/369"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1590"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1590\/revisions"}],"predecessor-version":[{"id":29185,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1590\/revisions\/29185"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/4211"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1590"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:
\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1590"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1590"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1590"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}