Python Dependency Management with Fewer Headaches
One of the most painful parts of developing neural networks is dependency management in Python. It seems like Python has reinvented multiple wheels that other languages like Java have been merrily rolling along on for years. Ironically, Python packages are actually called wheels. Oh well.
In this short post we want to show you our solution for Deep Learning (DL) projects, where the problem is particularly nasty because you also need to juggle multiple CUDA versions. Note that there are multiple ways to deal with this - this just happens to be the one we like most. Maybe you will too?
What is so hard about this?
In a professional setting you will work on multiple projects in parallel, and you will soon feel the pain of managing their dependencies. AI-assisted coding only amplifies this problem: you now build MVPs much faster and try out ideas that were previously too costly to evaluate. This is of course a good thing, but it forces you to handle even more environments in parallel.
However, your projects will need to handle:
- Different Python versions
- Different versions of the same libraries
- Multiple CUDA versions in parallel
So a single Python install with one set of pip-installed libraries is out. One global CUDA install will not work either. Virtual environments are a widely used solution, but using plain pip inside them is time-consuming, error-prone, and requires a lot of discipline to pin down all library versions. And then you would still have to manually install CUDA in each environment, which is hard to automate. In the end people use all kinds of tool combinations to somehow manage this, every workflow with its own pros & cons.
Enter Conda
Of course one of the most popular ways to solve this is Anaconda. But you will quickly need the full paid version, and for a smaller company this quickly adds up in additional licensing cost. Another (not exactly cheap) subscription just to get working dependency management and stable builds - something that should be a basic feature of any language.
Apart from the additional cost there is another problem: everybody who is supposed to reproduce your builds or work on the project needs a license as well. That makes it a bad choice for open source projects, where you want a large audience to be able to chip in.
Note however that Anaconda is definitely a good solution, so if you or your employer can get you a business license, you are mostly set and can stop reading here.
Our Solution
But there is also a way to achieve Python dependency management bliss with multiple CUDA versions using only open-source components. Here is how.
The quick outline:
- Figure out which PyTorch, Python and CUDA versions you need
- Use miniconda to manage virtual environments and install Python
- Use pip-tools inside the respective virtual environment to conveniently manage library versions
- And the most important trick: Use the PyTorch CUDA dependencies to automatically install the precise CUDA version you need
With these steps you can maintain as many combinations of Python/PyTorch/CUDA as you want on your machine and each project is 100% reproducible by anyone with access to the code.
Step 0: Figuring out what you want
This is how we determine the versions we want to use:
First, determine your target PyTorch version. If it is a greenfield project, just use the latest stable one (check https://pytorch.org/ and scroll down). In some cases you might need to run / train / modify a network from another project or repo that needs an older / deprecated API; in those cases make sure you know the required PyTorch version by reading the (hopefully existing) documentation of that project. Let’s assume for now you decide on PyTorch 2.8.0.
Next you need to determine the matching Python version. You can find the compatibility matrix here: https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix. We have burnt our fingers many times with newer Python versions, as other libraries you might need later often do not support them yet. Hence we recommend sticking to a version a bit older than the most current one. For PyTorch 2.8.0 we would go for Python 3.11.
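For illustration, at the time of writing the matrix row for this choice reads roughly as follows (quoted from memory - always check the linked matrix for the authoritative, current values):
PyTorch 2.8.0 -> Python >=3.9, <=3.13 -> CUDA 12.6 / 12.8 / 12.9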
Finally you need to know which CUDA version you want. For that, you need to know the maximum CUDA version you can run. To find this out, just run nvidia-smi in the console, which will give you output like this:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.169 Driver Version: 570.169 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4070 ... Off | 00000000:01:00.0 On | N/A |
| N/A 48C P4 9W / 115W | 1465MiB / 8188MiB | 33% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1892 G /usr/lib/xorg/Xorg 946MiB |
| 0 N/A N/A 2785 G cinnamon 28MiB |
| 0 N/A N/A 4266 G /usr/lib/thunderbird/thunderbird 7MiB |
| 0 N/A N/A 4477 G ...led --variations-seed-version 300MiB |
| 0 N/A N/A 4870 G /usr/lib/firefox/firefox 20MiB |
| 0 N/A N/A 10430 G ...bkit2gtk-4.0/WebKitWebProcess 44MiB |
| 0 N/A N/A 11682 G ...ess --variations-seed-version 9MiB |
+-----------------------------------------------------------------------------------------+
Now at the top right you see CUDA Version: 12.8. You might think this means the system has CUDA 12.8 installed, but that is wrong - a major source of confusion stemming from a tiny misnomer in the output of nvidia-smi: it should really say max CUDA Version, because this is the maximum CUDA version the driver supports. (The output above is actually from a system without a CUDA installation.) Now you just pick the highest stable CUDA version from the compatibility matrix that your driver supports, in our case 12.8, and you are done.
As the drivers' CUDA support is backward compatible, it makes sense to keep the driver as current as possible so you can freely pick which CUDA version to use.
Now we are equipped with all the version numbers we need to create a working Deep Learning project - but how do we manage all those versions?
Step 1: Miniconda
We start with miniconda, the smaller open-source version of the full Anaconda (the downloads are on the right side of this page: https://www.anaconda.com/download/success). It doesn’t come with all the useful repos you need for CUDA and lots of other libraries, but it allows you to quickly set up conveniently switchable virtual environments and install the exact Python version you need. To make this easy to reproduce, it is best to create a YAML config file. Let’s call it cool_project_name.yml. It will always look like this (except for different Python versions):
name: cool_project_name
channels:
  - defaults
dependencies:
  - python=3.11  # or any other python version you like
  - pip
  - pip:
      - pip-tools
Note how little we put into the miniconda environment: just the Python version and pip-tools, which we will use in the next step to actually install the dependencies we need. This file will likely never change again over the lifetime of the project.
To create the environment, do:
conda env create -f cool_project_name.yml
conda activate cool_project_name
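Switching between projects is now a single command. For example, with a second environment created the same way (other_project is a hypothetical name here):
conda activate other_project
# and back again:
conda activate cool_project_name
# list everything you have set up:
conda env list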
Step 2: Pip Tools
Now to install dependencies into this environment you could simply use pip, but pip-tools makes this much cleaner and easier to reproduce: it figures out library versions for you and handles changes to the dependencies much more gracefully than plain pip.
To add packages you don’t just install them; instead you add them to a file named requirements.in (note the suffix: it is .in, not .txt). E.g.:
# PyTorch with CUDA support
--extra-index-url https://download.pytorch.org/whl/cu128
torch==2.8.0+cu128
mlflow
transformers
# ...
# add any other libs you might need here
# ...
Ignore the weird --extra-index-url line for now - it is the secret sauce for our last step. The trick here is that you pin the versions only of the packages where the project really needs a specific version, and leave the others unspecified. Normally this is a recipe for disaster when multiple people work on a project: each person installs packages at different times, in a slightly different order, without specifying versions, which sooner or later introduces nasty errors due to version inconsistencies. I have often seen projects that run perfectly fine on one developer’s machine and crash on another’s. However, this is where pip-tools will help you.
You now run pip-compile requirements.in - pip-tools will figure out good fits for all libraries without pinned versions and create a requirements.txt with every version exactly specified for you. You could do this by hand, but I have yet to meet a Python developer disciplined enough to actually do it.
Using this requirements.txt you can now run pip-sync requirements.txt to update your environment. This is really useful when something changes, as it only installs missing packages or packages whose version number changed.
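To give you an idea, the generated requirements.txt looks roughly like this (abridged, with illustrative pins - the exact versions depend on when you run pip-compile):
#
# This file is autogenerated by pip-compile with Python 3.11
# by the following command:
#
#    pip-compile requirements.in
#
--extra-index-url https://download.pytorch.org/whl/cu128

mlflow==3.1.0
    # via -r requirements.in
torch==2.8.0+cu128
    # via -r requirements.in
transformers==4.55.0
    # via -r requirements.in
# ... plus every transitive dependency (numpy, nvidia-cudnn-cu12, ...), all pinned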
To allow others to reproduce the build, you add cool_project_name.yml, requirements.in and requirements.txt to your repository.
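Anyone checking out the repository (or your CI) can then recreate the exact same environment using only the commands we already saw above:
conda env create -f cool_project_name.yml
conda activate cool_project_name
pip-sync requirements.txt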
To add new packages later:
- add them to requirements.in
- call pip-compile requirements.in
- call pip-sync requirements.txt
If someone else added new packages or changed versions, all you need to do is run pip-sync requirements.txt.
Step 3: CUDA dependency
Apart from shape mismatches, nothing causes as many headaches in Deep Learning as finding and managing the correct CUDA / cuDNN packages. This is normally where the Anaconda Business version shines, but luckily you can solve it with pip alone. The trick is to add the official PyTorch repo, which is done with the following line:
--extra-index-url https://download.pytorch.org/whl/cu128
Note the suffix /cu128 - this is the repo for CUDA 12.8. Similarly, you would use /cu126 for CUDA 12.6, /cu118 for CUDA 11.8, etc. Now you just need to pin the torch and CUDA versions via the torch version specifier:
# 2.8.0 -> the torch version, +cu128 -> the cuda version of the CUDA dependencies
torch==2.8.0+cu128
This triggers pip-tools to fetch the correct CUDA and cuDNN packages for PyTorch. If you want to check which versions exist, you can view all torch/CUDA version combinations here (though normally the compatibility matrix from the PyTorch git repo is enough): https://download.pytorch.org/whl/torch/
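For example, if your driver only supported CUDA 12.6, the two relevant lines in requirements.in would become (assuming a +cu126 build of your torch version exists - the wheel index above tells you):
--extra-index-url https://download.pytorch.org/whl/cu126
torch==2.8.0+cu126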
Important: This means you should never install CUDA manually. In fact, it is better if you don’t have a CUDA install on your OS, as weird as this sounds. Otherwise the global install might interfere with the install in your virtual environment. All you need to install on your OS is an up-to-date GPU driver.
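Once everything is synced you can run a quick sanity check from inside the activated environment (a minimal check, nothing project-specific):
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
This should print your pinned torch version, 12.8 and True - even though, as in our nvidia-smi example above, no system-wide CUDA is installed.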
Summary
By just using miniconda, pip-tools and two config files you can now easily manage multiple PyTorch/Python/CUDA combinations and conveniently switch between them. All members of a project team can recreate a working config with just a few commands. Since we started following this setup pattern for every project, we haven’t had any reproducibility headaches, and people can be onboarded to new projects quickly. We hope you find this information useful and that it helps you as much as it did us.
P.S.: But what if I don’t use PyTorch?
Well, if push comes to shove you can still install the PyTorch CUDA packages without PyTorch itself - they work just fine with other frameworks, as long as those do not bring their own CUDA pip packages. Which packages these are we leave as an exercise for the reader - but admittedly, this method is best suited for PyTorch development.