Hey guys, it's Adwaith here, and I'm excited to share some of my personal thoughts and tips on reducing the size of your Docker builds for Python.
Before we begin, let's look at a bad example and slowly work our way up, fixing each of the mistakes.
As an example, I will be building a static API with Python and FastAPI, and I will be using the python:3.9.7-alpine image as the base for the final image.
The bad way of doing things.
This is what the file structure looks like after test running the app for the first time.
├── .git
├── app.py
├── Dockerfile
├── env
├── __pycache__
├── requirements.txt
└── routers
    ├── __init__.py
    └── users.py
And these are the contents of the Dockerfile:
FROM python:3.9.7-alpine
WORKDIR /app
EXPOSE 8000
COPY . .
RUN pip3 install -r requirements.txt
CMD ["uvicorn", "app:app", "--host", "0.0.0.0"]
We can now build the image by running the following command
docker build -t blog-py .
Now if you run
docker images
you can see the size of the image it generated
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
blog-py      latest   788a1391adbc   7 seconds ago   114MB
It's 114MB, and we can do much better than this. Now let's make the build more efficient and less bulky.
The good way of doing things.
Let's have the same file structure as we had before.
The first obvious fix is to move the installation of the required packages into its own layer. During development you will be constantly rebuilding images, and you don't want to reinstall the required packages every time you build one (Docker's layer cache reuses the layer with the installed packages as long as requirements.txt and the instructions before it are unchanged).
More on caching and layers will be posted soon, so follow along.
The next one is to prevent pip from caching the downloaded wheel (.whl) files, as that cache is no good to us inside a container.
Implementing these two fixes, our Dockerfile will look something like this:
FROM python:3.9.7-alpine
WORKDIR /app
EXPOSE 8000
COPY requirements.txt .
RUN pip3 install -r requirements.txt --no-cache-dir
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0"]
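To see why the order of the instructions matters, here is a toy model of the layer cache in Python. It is only a sketch under simplifying assumptions: every step gets a key built from the previous step's key, the instruction text, and the content of the files it copies. The names layer_keys and rebuilt_steps are made up for illustration, the instruction strings are abbreviated labels, and the real Docker cache is more involved than this.

```python
import hashlib


def layer_keys(steps, files):
    """Return one cache key per step; each key depends on all earlier steps."""
    keys, parent = [], ""
    for instruction, copied in steps:
        # Assumption: a COPY step is keyed on the content of the files it copies.
        content = "".join(f"{name}={files[name]}" for name in sorted(copied))
        parent = hashlib.sha256((parent + instruction + content).encode()).hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(steps, before, after):
    """List the instructions whose cache key changes between two builds."""
    old, new = layer_keys(steps, before), layer_keys(steps, after)
    return [instr for (instr, _), a, b in zip(steps, old, new) if a != b]


# Bad ordering: everything is copied before the packages are installed.
BAD = [
    ("COPY . .", ["app.py", "requirements.txt"]),
    ("RUN pip3 install", []),
]
# Good ordering: requirements.txt is copied and installed first.
GOOD = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip3 install", []),
    ("COPY . .", ["app.py", "requirements.txt"]),
]

before = {"app.py": "v1", "requirements.txt": "fastapi\nuvicorn"}
after = dict(before, **{"app.py": "v2"})  # only the application code changed

bad_rebuilt = rebuilt_steps(BAD, before, after)
good_rebuilt = rebuilt_steps(GOOD, before, after)
print(bad_rebuilt)   # the install step is rebuilt too
print(good_rebuilt)  # only the final COPY is rebuilt
```

In this model, editing app.py invalidates the install step in the bad ordering, while the good ordering only redoes the final COPY.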
If you spin up a container from the first image (the bad example) and look into the WORKDIR, you will see something like this:
$ docker exec 008efcc88355 ls -a /app
.
..
.git
Dockerfile
__pycache__
app.py
env
requirements.txt
routers
As you can see, the .git, env and __pycache__ folders, and even the Dockerfile, have been copied into the image. These folders and files are no good to us in the container.
If you for some reason need a .git folder inside a container then you are doing something wrong.
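If you are curious how much each of these entries adds to the build context, a small script can total them up. This is just a sketch: top_level_sizes and report are hypothetical helper names, and the script counts raw file sizes on disk, which will not match the final image size exactly.

```python
import os


def top_level_sizes(root):
    """Map each top-level entry under root to its total size in bytes."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            # Walk the folder and add up every regular file inside it.
            for dirpath, _dirs, files in os.walk(entry.path):
                for name in files:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
            sizes[entry.name] = total
        else:
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


def report(root="."):
    """Print the entries from largest to smallest."""
    for name, size in sorted(top_level_sizes(root).items(), key=lambda kv: -kv[1]):
        print(f"{size / 1_000_000:8.2f}MB  {name}")
```

Calling report() from the project root prints one line per entry, largest first, which makes it easy to spot folders like env or .git that dominate the context.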
So we need to prevent these folders from getting in. Let's add a .dockerignore file.
.git
env
__pycache__
*.pyc
Dockerfile
# dockception (comments must sit on their own line, starting with #)
.dockerignore
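Like this, add all the file and folder names that you think are unnecessary inside an image to the .dockerignore file. Keep in mind that patterns are matched relative to the root of the build context, so a pattern like **/__pycache__ is needed to also exclude the cache folders inside subfolders such as routers.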
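To get a feel for what the ignore file filters out, here is a rough Python sketch that applies simple name patterns to the top-level entries of the project. It is a simplification and not Docker's actual matcher: it only looks at top-level names, it does not handle ** or ! exception rules, and kept_entries is a made-up helper name.

```python
from fnmatch import fnmatchcase

# Simple name patterns, one per line of the ignore file (comments left out).
IGNORE_PATTERNS = [".git", "env", "__pycache__", "*.pyc", "Dockerfile", ".dockerignore"]

# Top-level entries of the example project, plus the ignore file itself.
CONTEXT = [
    ".git", "app.py", "Dockerfile", "env", "__pycache__",
    "requirements.txt", "routers", ".dockerignore",
]


def kept_entries(entries, patterns):
    """Return the entries that match none of the ignore patterns."""
    return [e for e in entries if not any(fnmatchcase(e, p) for p in patterns)]


kept = kept_entries(CONTEXT, IGNORE_PATTERNS)
print(kept)  # ['app.py', 'requirements.txt', 'routers']
```

Only the application code and requirements.txt are left, which is all the container needs.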
Now, with the changes that we have made, let's look at the size of the resulting image.
- build the image
docker build -t blog-py2 .
- check the size
REPOSITORY   TAG      IMAGE ID       CREATED          SIZE
blog-py2     latest   55134d18777e   36 seconds ago   54.1MB
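The two sizes reported by docker images can be compared with a few lines of arithmetic. The variable names here are just for illustration; the numbers are the ones from the two builds in this post.

```python
before_mb = 114.0  # size of blog-py
after_mb = 54.1    # size of blog-py2

saved_mb = before_mb - after_mb
reduction_pct = saved_mb / before_mb * 100

print(f"saved {saved_mb:.1f}MB ({reduction_pct:.1f}% smaller)")
# saved 59.9MB (52.5% smaller)
```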
As you can see, with these small changes we cut the size of the image by more than half, from 114MB to 54.1MB. If you have any more suggestions or tips, comment below.
That's all for today.
Thanks for reading.