A Comprehensive Guide to Installing Pairtools


Mar 28, 2025 - 19:43

Pairtools is a powerful Python package for processing paired-end sequencing data, widely used in bioinformatics for tasks like Hi-C analysis. However, installing it can be tricky due to dependency linking issues, particularly with pysam. In this article, I'll walk you through the step-by-step process I used to install pairtools successfully, avoiding common pitfalls like the dreaded ImportError caused by broken library links. By the end, you'll have a working setup, verified by running pairtools' test suite.
This guide is based on real-world troubleshooting in a GitHub Codespace environment as of March 28, 2025, using pairtools 1.1.3 and pysam 0.23.0.

Setup

Pairtools works best on Unix-based systems because it depends on pysam, which in turn uses htslib.
This guide therefore targets Linux, although the steps should also work on macOS.
On Windows, installation requires a workaround using tools like gcc.
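
To see which of these cases applies to your machine, you can check the kernel name reported by `uname`. This is only an illustrative sketch: `classify_platform` and its labels are hypothetical names made up for this example, not part of pairtools.

```shell
#!/usr/bin/env bash
# Sketch: map a kernel name (as printed by `uname -s`) to the platform
# categories described in this guide. Labels are this example's own.
classify_platform() {
    case "$1" in
        Linux)  echo "linux: primary target of this guide" ;;
        Darwin) echo "macos: steps should carry over" ;;
        *)      echo "other: needs a workaround" ;;
    esac
}

# Report the category for the machine running this script.
classify_platform "$(uname -s)"
```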

Why Installation Can Be Challenging

Pairtools relies on pysam, a Python wrapper for the htslib C library, and installing it involves compiling Cython extensions (e.g., parse_pysam.so). During installation, these extensions must link to pysam's shared libraries (e.g., libchtslib.so). A common issue arises when pip's build isolation creates a temporary environment, causing pairtools to link against paths like /tmp/pip-build-env-…/pysam/libchtslib.so. These paths vanish once installation finishes, leading to runtime errors like:

ImportError: /tmp/pip-build-env-k1zoq23s/overlay/lib/python3.12/site-packages/pysam/libchtslib.cpython-312-x86_64-linux-gnu.so: cannot open shared object file: No such file or directory

Our goal is to ensure pairtools links to a persistent pysam installation, avoiding temporary paths. GitHub Codespaces, with its Ubuntu-based environment, provides a clean slate to achieve this, if set up correctly.
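
When you hit this error, it helps to confirm that the missing library really sits in a temporary build directory. The sketch below pulls the path out of the ImportError line and checks its prefix. The helper names are hypothetical, and the `/tmp/pip-build-env-` prefix is taken from the message above; it may differ if your temporary directory is elsewhere.

```shell
#!/usr/bin/env bash
# Sketch: extract the library path from an ImportError line and report
# whether it points into a temporary pip build environment.
# Helper names are hypothetical, not part of pairtools or pip.

# Read an error message on stdin; print the path that precedes
# ": cannot open shared object file", if the line has that shape.
missing_lib_from_error() {
    sed -n 's/^ImportError: \(.*\): cannot open shared object file.*$/\1/p'
}

# Succeed if the given path lies inside a pip build-isolation directory
# (assumes the /tmp/pip-build-env- prefix seen in the error above).
is_pip_build_env_path() {
    case "$1" in
        /tmp/pip-build-env-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example using the message quoted above.
msg='ImportError: /tmp/pip-build-env-k1zoq23s/overlay/lib/python3.12/site-packages/pysam/libchtslib.cpython-312-x86_64-linux-gnu.so: cannot open shared object file: No such file or directory'
lib="$(printf '%s\n' "$msg" | missing_lib_from_error)"
if is_pip_build_env_path "$lib"; then
    echo "stale build-isolation link: $lib"
fi
```

If the path is reported as stale, the extension was linked during an isolated build, which is the situation the steps below are designed to avoid.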

Prerequisites

  • GitHub Codespace: Access to a Codespace for your pairtools repository (e.g., forked from open2c/pairtools).
  • Basic Terminal Skills: Familiarity with running commands in a Linux terminal.
  • Repository Cloned: Your Codespace should have the pairtools repo checked out (e.g., at /workspaces/pairtools).

## Step-by-Step Installation Guide

### Step 1: Start with a Fresh Codespace

A clean environment is critical to avoid cached build artifacts that can cause linking errors.

  • Action: Delete your existing Codespace and create a new one.
 - Go to the GitHub Codespaces dashboard (https://github.com/codespaces).
 - Find your current Codespace for the pairtools repo, delete it, and click "New Codespace" to start fresh.
  • Why: This ensures no stale dependencies or mislinked libraries interfere.

### Step 2: Install System Dependencies

Pysam and pairtools require system libraries and development tools for compilation.

  • Command:

```bash
sudo apt update && sudo apt install -y build-essential libhts-dev python3-dev
```

  • Details:
 - build-essential: Provides gcc, make, and other compilation tools.
 - libhts-dev: Headers for htslib, used by pysam.
 - python3-dev: Python development headers for building extensions.
  • Why: These ensure successful compilation of pysam and pairtools from source.

### Step 3: Uninstall Existing Packages

Remove any prior installations to prevent conflicts.

  • Command:

```bash
pip uninstall pysam pairtools -y
```

  • Why: A clean slate avoids version mismatches or leftover files.

### Step 4: Install pysam from Source

Build pysam explicitly to place its shared libraries in a persistent location.

  • Command:

```bash
pip install "pysam==0.23.0" --no-binary pysam --no-cache-dir --verbose
```

  • Details:
 - --no-binary pysam: Forces a source build, avoiding precompiled wheels that might mislink.
 - --no-cache-dir: Prevents reusing cached files that could cause issues.
 - --verbose: Shows build output for debugging if needed.
  • Verification:
 - Check installed .so files:

```bash
ls /usr/local/python/3.12.1/lib/python3.12/site-packages/pysam/*.so
```

 Expect files like libchtslib.cpython-312-x86_64-linux-gnu.so.
 - Verify linking:

```bash
ldd /usr/local/python/3.12.1/lib/python3.12/site-packages/pysam/libchtslib.cpython-312-x86_64-linux-gnu.so
```

 Ensure no /tmp/ paths appear; expect system libraries like /lib/x86_64-linux-gnu/libc.so.6.

  • Why: A source-built pysam ensures its libraries are correctly placed and linked.

Note: The exact path (/usr/local/python/3.12.1/) may vary depending on your Codespace's Python setup. Replace it with /home/codespace/.python/current/ if that's your environment.

### Step 5: Install pairtools Without Build Isolation

Install pairtools in editable mode, linking it to the installed pysam.

  • Commands:

```bash
cd /workspaces/pairtools
rm -rf build dist *.egg-info pairtools/lib/*.so  # Clean old build artifacts
pip install -e ".[all]" --no-build-isolation --no-cache-dir --verbose
```

  • Details:
 - -e: Editable mode, useful for development.
 - [all]: Installs optional dependencies (e.g., bioframe).
 - --no-build-isolation: Uses the current environment's pysam, avoiding temporary build paths. With isolation disabled, pip does not install build requirements for you, so they must already be present in the environment.
  • Verification:
 - Check linking:

```bash
ldd /workspaces/pairtools/pairtools/lib/parse_pysam.cpython-312-x86_64-linux-gnu.so
```

 Look for:

```
/usr/local/python/3.12.1/lib/python3.12/site-packages/pysam/libchtslib.cpython-312-x86_64-linux-gnu.so (0x…)
```

 No /tmp/ paths should appear.

  • Why: --no-build-isolation ensures pairtools' extensions link to the installed pysam.

### Step 6: Run Tests to Confirm

Test the installation to ensure everything works.

  • Command:

```bash
/usr/local/python/3.12.1/bin/python -m pytest -v
```

  • Expected Output: All 17 tests should collect and run, showing pass/fail results. No ImportError for libchtslib.so.
  • Why: This verifies that pairtools and its dependencies are fully functional.

### Step 7: Use pairtools

Try it out!

  • Command:

```bash
/usr/local/python/3.12.1/bin/pairtools --help
```

  • Expected Output: Displays the command-line help, confirming the CLI is operational.

## Troubleshooting Common Issues

### Linking Still Points to /tmp/

  • Symptom: ldd shows /tmp/pip-build-env-…/pysam/libchtslib.so => not found.
  • Fix:
 - Clean and rebuild manually:

```bash
cd /workspaces/pairtools
rm -rf build dist *.egg-info pairtools/lib/*.so
python setup.py build_ext --inplace --verbose
pip install -e ".[all]" --no-build-isolation --no-cache-dir --verbose
```

 - Recheck with ldd.
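
The recheck can be scripted so that every compiled extension is inspected at once. The following is a minimal sketch, assuming `ldd` is available (as it is on Linux) and that stale links contain the `pip-build-env-` fragment seen in the symptom above. `tmp_links` and `check_extensions` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: flag shared-library references that point into a temporary pip
# build environment. Helper names are hypothetical.

# Filter ldd-style output on stdin, printing only lines that mention a
# pip build-isolation directory. Exit status 0 means such a line was found.
tmp_links() {
    grep -F 'pip-build-env-'
}

# Run ldd on every .so file directly under a directory and report stale links.
# Returns 1 if any extension references a build-isolation path, else 0.
check_extensions() {
    local dir="$1" so status=0
    for so in "$dir"/*.so; do
        [ -e "$so" ] || continue      # the glob matched nothing
        if ldd "$so" | tmp_links; then
            echo "stale link in: $so" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage (path from this guide; adjust to your checkout):
#   check_extensions /workspaces/pairtools/pairtools/lib
```

To run the same check on pysam itself without hard-coding the site-packages path, the directory can be looked up with `python -c 'import pysam, os; print(os.path.dirname(pysam.__file__))'` and passed to `check_extensions`.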

### NumPy Version Warnings

  • Symptom: Warnings like "numpy 1.26.4 is incompatible with bioframe 0.7.2".
  • Fix:
 - Downgrade NumPy to a pre-2.0 release:

```bash
pip install "numpy<2"
```

 - Rerun tests.
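
Before and after applying the pin, you may want to confirm which NumPy is actually installed. The sketch below checks only the major version number; `version_major` and `numpy_below_2` are hypothetical helpers, and the usage line assumes `python` is the interpreter pairtools was installed into.

```shell
#!/usr/bin/env bash
# Sketch: test whether a dotted version string satisfies the "numpy<2" pin.
# Helper names are hypothetical.

# Print the major component of a dotted version string such as "1.26.4".
version_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed if the version's major component is below 2.
numpy_below_2() {
    [ "$(version_major "$1")" -lt 2 ]
}

# Usage (assumes `python` is the interpreter used for pairtools):
#   if numpy_below_2 "$(python -c 'import numpy; print(numpy.__version__)')"; then
#       echo "NumPy pin satisfied"
#   else
#       echo "NumPy is 2.x or newer; apply the pin above"
#   fi
```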

### Missing System Dependencies

  • Symptom: Build fails with errors about missing htslib or gcc.
  • Fix: Reinstall system dependencies (Step 2).

## Why This Works

  • Fresh Codespace: Eliminates cached artifacts that cause linking errors.
  • Source-Built pysam: Ensures shared libraries are in a stable location.
  • No Build Isolation: Forces pairtools to use the installed pysam, avoiding temporary paths.
  • Ubuntu Environment: Native Linux in Codespaces provides reliable library linking compared to WSL quirks.

## Conclusion

Installing pairtools in GitHub Codespaces doesn't have to be a headache. By starting fresh, building pysam from source, and installing pairtools without build isolation, you can sidestep linking issues and get to work on your paired-end sequencing projects. This method resolved the ImportError for libchtslib.so in my environment as of March 2025. Now that pairtools is installed, explore its capabilities: parse Hi-C data, run stats, or dive into its test suite for inspiration. Happy coding!
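
For reference, the commands from Steps 2 through 6 can be collected into a single function. This is a sketch rather than a tested installer: `REPO_DIR`, `DRY_RUN`, `run`, and `install_pairtools` are names invented for this example, and it assumes `pip` and `python` on your PATH belong to the interpreter you want to install into (the guide above uses /usr/local/python/3.12.1/bin/python).

```shell
#!/usr/bin/env bash
# Sketch: Steps 2-6 of this guide as one function. Set DRY_RUN=1 to print
# the commands instead of running them. All helper and variable names here
# are this example's own; the commands are the ones from the steps above.

REPO_DIR="${REPO_DIR:-/workspaces/pairtools}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_pairtools() {
    # Step 2: system dependencies
    run sudo apt update || return
    run sudo apt install -y build-essential libhts-dev python3-dev || return
    # Step 3: remove earlier installs
    run pip uninstall pysam pairtools -y
    # Step 4: build pysam from source
    run pip install "pysam==0.23.0" --no-binary pysam --no-cache-dir --verbose || return
    # Step 5: build pairtools against the installed pysam
    run cd "$REPO_DIR" || return
    run rm -rf build dist ./*.egg-info pairtools/lib/*.so
    run pip install -e ".[all]" --no-build-isolation --no-cache-dir --verbose || return
    # Step 6: run the test suite
    run python -m pytest -v
}

# Preview the commands without executing anything:
#   DRY_RUN=1 install_pairtools
```

Running with `DRY_RUN=1` first is a cheap way to review exactly what will be executed before letting the function touch the environment.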