Configuring JupyterHub (revision 1)

Introduction

This page describes how Alexis Huxley installed and configured JupyterHub, running on a dedicated server with a remote front-end Apache reverse proxy.

On Debian 12 there are several big problems:

  1. BTS#1038110 means the native package’s configuration files are not read, making use of a remote reverse proxy impossible
  2. the prerequisite component npm is very out of date w.r.t. the official documentation, making it overly complicated to install
  3. overly complicated configuration of logging, making it difficult to diagnose problems
  4. the native packages are simply broken

Accordingly, this procedure describes how to install the latest packages.

Special hardware requirements

  1. 4GB RAM
  2. 1 x 10GB disk for OS (standard 5GB is too small for JupyterHub and its log files, which are stored in OS directories)
  3. 1 x 20GB disk for /home

Prologue

This section does not need to be completed by anyone except Alexis, and Alexis only needs to do it once!

  1. Ensure that debfoster reports no packages (this is so that I can use debfoster to restore the system to the state it was in before running this prologue).
  2. Install the distro-native jupyterhub by running:
    apt-get -y install jupyterhub
    
  3. Copy jupyterhub’s service startup script to a safe location:
    cp /lib/systemd/system/jupyterhub.service ~/
    
  4. Run debfoster to remove all just-added packages:
    debfoster -f -o RemoveCmd="apt-get -y --purge remove"
    rm -fr /etc/jupyterhub
    

    (Option -o ... is to make debfoster’s call to apt-get non-interactive.)

  5. Update the procedure below, based on the contents of the saved .service file.
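
For the comparison in step 5, it can help to pull individual settings out of the saved .service file rather than reading it by eye. The following is a minimal sketch; the function name unit_value is hypothetical, and it assumes each setting sits on one line in KEY=value form.

```shell
# Print the value(s) of one setting from a systemd unit file.
# Assumes plain "KEY=value" lines and a key made of letters only
# (the key is used unescaped inside the sed pattern).
unit_value() {
    key="$1"
    file="$2"
    sed -n "s/^${key}=//p" "$file"
}
```

For example, `unit_value ExecStart ~/jupyterhub.service` prints the command line that the distro package used to start JupyterHub.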

Procedure

  1. Since JupyterHub can start browser-based terminals, from which a user could run su, lock the root account:
    passwd -l root
  2. Make sure that the OS can resolve the credentials of the JupyterHub users. Their homes don’t need to be accessible yet. For myself this means just running:
    id alexis
    id suzie
  3. Install some packages that pip will require:
    apt-get -y install gcc g++
  4. Install miniforge3 as follows:
    wget -q https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
    bash Miniforge3-Linux-x86_64.sh -b -p /usr/local/opt/miniforge3
    ln -sr /usr/local/opt/miniforge3/bin/conda /usr/local/bin/conda
  5. Alexis should install jupyterhub-wrapper as follows:
    mkdir -p ~/opt
    svn co https://svn.pasta.freemyip.com/main/wrappers/trunk ~/opt/wrappers
    ln -sr ~/opt/wrappers/bin/jupyterhub-wrapper /usr/local/bin/jupyterhub-wrapper

    Everybody else should install it as follows:

    svn cat https://svn.pasta.freemyip.com/main/wrappers/trunk/bin/jupyterhub-wrapper > /usr/local/bin/jupyterhub-wrapper
    chmod 755 /usr/local/bin/jupyterhub-wrapper

    (The script jupyterhub-wrapper calls conda to download and set up jupyterhub, and also keeps root’s home clean by changing the value of HOME.)

  6. Run the wrapper for the first time as follows:
    timeout 60 /usr/local/bin/jupyterhub-wrapper

    and wait the 60 seconds until that exits.

  7. Restrict access by editing /etc/jupyterhub_config.py and setting:
    ...
    c.Authenticator.admin_users = set(['alexis'])
    ...
    c.Authenticator.allowed_users = set(['alexis'])
    ...
    
  8. Set up a service config file, start the service and then check it is running by running:
    cat > /etc/systemd/system/jupyterhub.service <<EOF
    [Unit]
    
    [Service]
    User=root
    Restart=always
    WorkingDirectory=/var/local/jupyterhub
    PrivateTmp=yes
    PrivateDevices=yes
    ProtectKernelTunables=yes
    ProtectKernelModules=yes
    ExecStart=/usr/local/bin/jupyterhub-wrapper
    
    [Install]
    WantedBy=multi-user.target
    EOF
    systemctl daemon-reload
    systemctl enable jupyterhub
    systemctl start jupyterhub
    sleep 5
    ss -tlp
    
  9. Although this document said that JupyterHub was installed on a dedicated server, it is possible that you installed it on your local machine. If you installed it locally then test as follows:
    1. Access JupyterHub at http://localhost:8000/.

    Otherwise test as follows:

    1. On your local machine run:
      ssh -NL 8000:127.0.0.1:8000 <jupyterhub-server>
    2. Access JupyterHub at http://localhost:8000/.
    3. Kill the ssh process.
  10. If JupyterHub was installed on a dedicated server and the service is then to be accessed via a reverse proxy (typically front-ending many different web services) then:
    1. Proxy the backend website as described at Configuring web services (revision 2.1).
    2. Access JupyterHub at https://<frontend-vhost>/.
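
The visual check of the ss output in step 8 can also be scripted. The sketch below assumes JupyterHub listens on port 8000 (the port used in the test URLs above) and that the listing comes from `ss -tln`, where the local address and port are in the fourth column; the function name has_listener is hypothetical.

```shell
# Succeed if the ss-style listing on stdin shows a TCP listener on the given port.
# Expects "ss -tln" output: state in column 1, local address:port in column 4.
has_listener() {
    port="$1"
    awk -v port="$port" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

On the JupyterHub server this could be used as `ss -tln | has_listener 8000 && echo "JupyterHub is listening"`.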

Adding users

  1. Ensure the new user’s ID and home are resolvable. For my own setup, where LDAP is used to resolve users but homes are local to the JupyterHub server, this means running:
    LOGIN=suzie
    id $LOGIN > /dev/null || echo "$LOGIN: can't resolve user; fix this now"
    mkdir /home/$LOGIN
    echo 'PATH=$PATH:.' >> /home/$LOGIN/.profile
    echo '[ ! -f ~/.bashrc ] || . ~/.bashrc' >> /home/$LOGIN/.profile
    chown -R $LOGIN:$LOGIN /home/$LOGIN
  2. ~/.cache and ~/.conda get pretty big pretty quickly and contain only stuff one should be able to easily regenerate. Therefore also run:
    mkdir -p /srv/scratch/$LOGIN/.{conda,cache}
    ln -sr /srv/scratch/$LOGIN/.{conda,cache} /home/$LOGIN
    chown -R $LOGIN:$LOGIN /srv/scratch/$LOGIN
  3. Edit /etc/jupyterhub_config.py and set:
    c.Authenticator.allowed_users = set(['alexis','suzie'])
  4. Restart jupyterhub by running:
    systemctl restart jupyterhub
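
As the list of users grows, the allowed_users line in step 3 becomes fiddly to type by hand. The following is a small sketch of a helper that prints the line for a given list of logins, ready to be pasted into /etc/jupyterhub_config.py; the function name allowed_users_line is hypothetical, and it does not edit the config file itself.

```shell
# Print the allowed_users setting for the logins given as arguments.
allowed_users_line() {
    list=""
    for login in "$@"; do
        # Append the quoted login, preceded by a comma for all but the first.
        list="${list:+$list,}'$login'"
    done
    printf "c.Authenticator.allowed_users = set([%s])\n" "$list"
}
```

For example, `allowed_users_line alexis suzie` prints the same line as shown in step 3.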
    

      FAQ

      How to create conda environment suitable for Jupyter notebooks with a custom python version

      If the python environment provided by the python command that runs JupyterHub (which is the python command provided by the OS) is not suitable for running your notebook, then you need to install a new version of python and register it with JupyterHub, so that it is offered to you in the JupyterHub web interface.

      1. From JupyterHub open a terminal.
      2. Enable use of the conda tool by running:
        conda init
        #  It is not sufficient to close and reopen the terminal to load the modified environment, due to
        #  how Jupyter Notebook invokes the shell. The following needs to be done first.
        echo '[ ! -f ~/.bashrc ] || . ~/.bashrc' >> ~/.profile
      3. Close the terminal tab and open a new terminal. The prompt should be:
        (base) <username>@pansoti:~$
      4. Set some environment variables that will ease understanding the next step:
        CPV=<custom-python-version>      #  e.g. CPV=3.11.2
      5. Run:
        #  Create a conda environment which, to be usable as a Jupyter Notebook
        #  python kernel, *must* contain the ipykernel module.
        conda create -y -n "python-$CPV" python=$CPV ipykernel
        #  Activate the environment, after which invoking 'python' will
        #  invoke the python command in the environment.
        conda activate "python-$CPV"
        #  Register the just-created (and active) environment with Jupyter
        #  Notebook. The parameters do not tell Jupyter Notebook in which
        #  directory the environment is (they only set the names used
        #  internally and in the web interface, respectively). Rather, it is
        #  *which* python command runs this command that tells Jupyter
        #  Notebook in which directory the environment is.
        python -m ipykernel install --user --name my-env-$CPV --display-name "Python $CPV"
        
      6. Switch to the launcher tab (not the browser tab, but the Jupyter tab within the current browser tab).
      7. Reload the browser tab (Firefox & Chrome: CTRL-R). If the launcher tab closes itself then open a new one.
      8. Verify that the new custom python version is listed.
      9. Note that this is based on the procedure here.
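
The registration can also be checked from the terminal, without the web interface. The sketch below parses the output of `jupyter kernelspec list`, which prints one kernel name and its directory per line; it assumes the jupyter command is available in the active environment, and the function name kernel_registered is hypothetical.

```shell
# Succeed if the "jupyter kernelspec list" output on stdin contains the named kernel.
# The kernel name is the first whitespace-separated field on each line.
kernel_registered() {
    awk -v name="$1" '
        $1 == name { found = 1 }
        END { exit !found }
    '
}
```

With the names used in step 5, the check would be `jupyter kernelspec list | kernel_registered "my-env-$CPV" && echo registered`.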

      How to run ScopeSim

      ScopeSim will not run with the OS-provided python.

      1. Create a conda environment suitable for Jupyter notebooks with custom python version 3.11.2 (see earlier FAQ question).
      2. Download one of the example notebooks from the ScopeSim site. For example, on your local machine run:
        wget https://github.com/AstarVienna/ScopeSim/raw/refs/heads/main/docs/source/examples/1_scopesim_intro.ipynb
      3. Upload that into Jupyter notebook (use the ‘Upload files’ icon at the top left of the workspace manager).
      4. Edit that notebook by double-clicking on its name (in the panel on the left).
      5. Change the python kernel that will execute the notebook to the one with python 3.11.2 (use the ‘Switch kernel’ icon at the top right of the notebook editor).
      6. Add a new first cell (click the existing first cell and then click the ‘Insert a cell above’ icon) and add the following code:
        %pip install scopesim scopesim_templates matplotlib ipywidgets
      7. Run that one cell (if the cell is still open then press SHIFT-ENTER). This will produce a lot of output, necessitating scrolling down. It will take a few moments to run and will pause at certain points, perhaps leading you to think it has finished when it hasn’t. The final output should be:
        Note: you may need to restart the kernel to use updated packages.
      8. As per its suggestion, from the ‘Kernel’ menu, select ‘Restart Kernel…’.
      9. From the ‘Run’ menu, select ‘Run All Cells’.
      10. It should display a pretty picture 🙂

      See also