In about 15 minutes, we dive into conda environments and how they can help you lead a happier, healthier scientific programming life.
Discussion
As always, you can discuss this lecture or any related issues on the members-only vxuni forum.
Overview
- What is a conda / virtual environment and why do we need them?
- Creating a conda environment manually.
- Doing it the clever way: Specifying the environment.yml for fun and profit.
Pre-requisites
You only need a working miniconda3 installation, as explained in the vxuni mini-lecture Up and Running with Miniconda and PyCharm on MacOS.
What are environments and why do we need them? 00:35
- Scientific programming / data science profits greatly from the amazing Python library ecosystem.
- The small price that we pay is to manage a deep tree of dependencies. Virtual environments, of which conda environments are one implementation, make this tractable.
- (Conda) environments enable the easy definition of a complete software environment to run a certain project or application.
- In practice, we use them to keep our analysis pipelines and other software applications reproducible across platforms and between different engineering and client teams.
Creating a new environment “manually” 01:45
Create
We’ll reuse the same environment for many of these mini lectures. Let’s create it:
conda create --name vxuni python=3.6 numpy
List environments
Double check that conda knows about it:
conda info --envs
Activate and work within an environment 03:32
Let’s activate the environment:
conda activate vxuni
List all packages in this environment:
conda list
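To double check from inside Python which environment you are actually running in, something like the sketch below can help. It relies on the CONDA_DEFAULT_ENV variable that conda sets on activation and on sys.prefix; the function name describe_active_environment is made up for this illustration.

```python
import os
import sys
from pathlib import Path


def describe_active_environment(environ=None, prefix=None):
    """Return the name and location of the environment this interpreter runs in.

    Hypothetical helper: `environ` and `prefix` default to the real process
    values and are only parameters so the function is easy to try out.
    """
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    return {
        # conda sets CONDA_DEFAULT_ENV when an environment is activated
        "name": environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)"),
        # sys.prefix points at the directory the running interpreter lives in
        "prefix": str(Path(prefix)),
    }


if __name__ == "__main__":
    info = describe_active_environment()
    print(f"environment: {info['name']}")
    print(f"interpreter prefix: {info['prefix']}")
```

Run with the vxuni environment active, the prefix should point somewhere inside that environment's own directory rather than at the system Python.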
Install conda packages 04:10 or pip packages 04:40 in an active environment
With the environment active, it becomes even easier to install with conda or pip:
pip install requests
conda install ipython
Now I can do this all of a sudden:
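import requests
# query the GitHub user search API; the dict is sent as URL query parameters
r = requests.get('https://api.github.com/search/users', params={'q': 'botha', 'per_page': 5})
r.json()
r.json().get('total_count')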
When I’m done, I can deactivate to go back to system defaults:
conda deactivate
What does this look like? 07:04
Let’s go take a look in $HOME/miniconda3/envs.
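Each environment is simply a sub-directory of that envs folder. As a rough sketch, assuming miniconda3 was installed to its default location in your home directory, you could list them from Python like this (list_environment_dirs is a hypothetical helper name):

```python
from pathlib import Path


def list_environment_dirs(envs_dir):
    """Return the sorted names of the sub-directories of envs_dir.

    Each sub-directory corresponds to one environment. A missing
    directory yields an empty list instead of an error.
    """
    envs_dir = Path(envs_dir)
    if not envs_dir.is_dir():
        return []
    return sorted(p.name for p in envs_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Assumed default location; adjust if your miniconda3 lives elsewhere.
    for name in list_environment_dirs(Path.home() / "miniconda3" / "envs"):
        print(name)
```

This only looks at the directory layout; conda info --envs remains the authoritative list, because conda can also know about environments stored elsewhere.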
The environment.yml specification 08:47
There is a way to specify the name of an environment and all of its dependencies in such a way that you can always easily recreate that environment, or just make sure that the existing one you have satisfies all requirements.
Ask conda for the environment specification:
conda env export
You could share this with a colleague so that they can recreate the exact same environment.
In practice we handcraft them 10:02
Whilst developing, though, it’s better to have an environment.yml which only specifies the top-level dependencies, and to let the conda solver take care of the rest.
Let’s try:
name: vxuni
dependencies:
  - python=3.6
  - cython>=0.28
  - ipython
  - pip:
    - gunicorn
    - uvicorn
Pro-tip: use conda env update to create OR update 11:15
One of my favourite conda nuggets is that conda env update will now either create the environment and install all packages, or just update the environment if it already exists. This is what I use most of the time.
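If you want to trigger this from a script, for example as a project setup step, a minimal sketch could look as follows. It assumes conda is on your PATH and that the specification lives in environment.yml in the current directory; build_update_command and update_environment are hypothetical helper names.

```python
import shutil
import subprocess


def build_update_command(env_file="environment.yml", conda="conda"):
    """Return the argument list for `conda env update --file <env_file>`."""
    return [conda, "env", "update", "--file", env_file]


def update_environment(env_file="environment.yml"):
    """Run conda env update for env_file and raise if anything fails."""
    conda = shutil.which("conda")  # full path to the conda executable, if any
    if conda is None:
        raise RuntimeError("conda was not found on PATH")
    # check=True turns a non-zero exit code into a CalledProcessError
    subprocess.run(build_update_command(env_file, conda), check=True)
```

Calling update_environment() is then equivalent to typing the command by hand, with the added benefit that a failed solve stops the script.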
How and when to specify package versions 13:40
- While developing: Pin versions for as few packages as possible.
- Closer to production: Pin down more of the top-level dependencies.
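To make the difference concrete: conda env export writes fully pinned entries of the form name=version=build, whereas a handcrafted file usually keeps only the name, or the name and version. The sketch below, with the hypothetical helper loosen_spec, shows one way to relax such an exported entry. It deliberately leaves anything that is not in the plain name=version[=build] form untouched.

```python
import re

# Matches export-style specs such as "numpy=1.19.2=py36h54aff64_0" or "python=3.6".
_EXPORT_SPEC = re.compile(
    r"^(?P<name>[A-Za-z0-9_.\-]+)=(?P<version>[^=<>!]+)(=(?P<build>[^=]+))?$"
)


def loosen_spec(spec, keep_version=True):
    """Drop the build string, and optionally the version, from an exported spec."""
    match = _EXPORT_SPEC.match(spec)
    if match is None:
        # e.g. "ipython" or "cython>=0.28": already loose, leave as-is
        return spec
    if keep_version:
        return f"{match['name']}={match['version']}"
    return match["name"]


if __name__ == "__main__":
    exported = "numpy=1.19.2=py36h54aff64_0"  # made-up example entry
    print(loosen_spec(exported))                      # closer to production
    print(loosen_spec(exported, keep_version=False))  # while developing
```

The build string in the example is made up for illustration; real exports will show whatever builds the solver picked on your platform.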
Cheatsheet
Get a list of all environments
conda info --envs
Export current environment in yaml format
conda env export
This prints a detailed environment specification with all packages and versions, which you can redirect into an environment.yml.
Removing a conda environment
conda env remove --name the_env_name