A Comprehensive Guide to Installing Pairtools

Pairtools is a powerful Python package for processing paired-end sequencing data, widely used in bioinformatics for tasks like Hi-C analysis. However, installing it can be tricky because of dependency linking issues, particularly with `pysam`. In this article, I'll walk you through the step-by-step process I used to install `pairtools` successfully, avoiding common pitfalls like the dreaded `ImportError` caused by broken library links. By the end, you'll have a working setup, verified by running the `pairtools` test suite.

This guide is based on real-world troubleshooting in a GitHub Codespace environment as of March 28, 2025, using `pairtools 1.1.3` and `pysam 0.23.0`.
## Setup

Pairtools works best on Unix-based systems because it depends on `pysam`, which in turn relies on htslib. This guide is therefore aimed primarily at Linux, although the steps should also be reproducible on macOS. On Windows, there is a workaround that involves installing build tools such as gcc.
## Why Installation Can Be Challenging

Pairtools relies on `pysam`, a Python wrapper for the `htslib` C library, and installing it involves compiling Cython extensions (e.g., `parse_pysam.so`). During installation, these extensions must link to `pysam`'s shared libraries (e.g., `libchtslib.so`). A common issue arises when `pip`'s build isolation creates temporary environments, causing `pairtools` to link against paths like `/tmp/pip-build-env-…/pysam/libchtslib.so`. These paths vanish after installation, leading to runtime errors like:

    ImportError: /tmp/pip-build-env-k1zoq23s/overlay/lib/python3.12/site-packages/pysam/libchtslib.cpython-312-x86_64-linux-gnu.so: cannot open shared object file: No such file or directory

Our goal is to ensure `pairtools` links to a persistent `pysam` installation, avoiding temporary paths. GitHub Codespaces, with its Ubuntu-based environment, provides a clean slate to achieve this, if set up correctly.
## Prerequisites

- GitHub Codespace: Access to a Codespace for your `pairtools` repository (e.g., forked from open2c/pairtools).
- Basic Terminal Skills: Familiarity with running commands in a Linux terminal.
- Repository Cloned: Your Codespace should have the `pairtools` repo checked out (e.g., at `/workspaces/pairtools`).

## Step-by-Step Installation Guide

### Step 1: Start with a Fresh Codespace

A clean environment is critical to avoid cached build artifacts that can cause linking errors.

- Action: Delete your existing Codespace and create a new one.
  - Go to the GitHub Codespaces dashboard (https://github.com/codespaces).
  - Find your current Codespace for the `pairtools` repo, delete it, and click "New Codespace" to start fresh.
- Why: This ensures no stale dependencies or mislinked libraries interfere.
### Step 2: Install System Dependencies

`pysam` and `pairtools` require system libraries and development tools for compilation.

- Command:

      sudo apt update && sudo apt install -y build-essential libhts-dev python3-dev

- Details:
  - `build-essential`: Provides `gcc`, `make`, and other compilation tools.
  - `libhts-dev`: Headers for `htslib`, used by `pysam`.
  - `python3-dev`: Python development headers for building extensions.
- Why: These ensure successful compilation of `pysam` and `pairtools` from source.

### Step 3: Uninstall Existing Packages

Remove any prior installations to prevent conflicts.

- Command:

      pip uninstall pysam pairtools -y

- Why: A clean slate avoids version mismatches or leftover files.
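
Before moving on to the source builds, it can be worth confirming that the compiler toolchain from Step 2 is actually visible. The following is a minimal Python sketch, not part of pairtools: the helper name `missing_tools` and the list of tools are assumptions chosen for illustration, and it only checks that executables are on `PATH` (it does not verify that the htslib or Python headers are present).

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # gcc and make are the tools Step 2 attributes to build-essential.
    missing = missing_tools(["gcc", "make"])
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("Build tools found.")
```

If anything is reported missing, rerunning the `apt install` command from Step 2 is the first thing to try.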
### Step 4: Install `pysam` from Source

Build `pysam` explicitly to place its shared libraries in a persistent location.

- Command:

      pip install "pysam==0.23.0" --no-binary pysam --no-cache-dir --verbose

- Details:
  - `--no-binary pysam`: Forces a source build, avoiding precompiled wheels that might mislink.
  - `--no-cache-dir`: Prevents reusing cached files that could cause issues.
  - `--verbose`: Shows build output for debugging if needed.
- Verification:
  - Check installed `.so` files:

        ls /usr/local/python/3.12.1/lib/python3.12/site-packages/pysam/*.so

    Expect files like `libchtslib.cpython-312-x86_64-linux-gnu.so`.
  - Verify linking:

        ldd /usr/local/python/3.12.1/lib/python3.12/site-packages/pysam/libchtslib.cpython-312-x86_64-linux-gnu.so

    Ensure no `/tmp/` paths appear; expect system libraries like `/lib/x86_64-linux-gnu/libc.so.6`.
- Why: A source-built `pysam` ensures its libraries are correctly placed and linked.

Note: The exact path (`/usr/local/python/3.12.1/`) may vary slightly depending on your Codespace's Python setup. Replace it with `/home/codespace/.python/current/` if that's your environment.

### Step 5: Install `pairtools` Without Build Isolation

Install `pairtools` in editable mode, linking it to the installed `pysam`.

- Commands:

      cd /workspaces/pairtools
      rm -rf build dist *.egg-info pairtools/lib/*.so  # Clean old build artifacts
      pip install -e .[all] --no-build-isolation --no-cache-dir --verbose

- Details:
  - `-e`: Editable mode, useful for development.
  - `[all]`: Installs optional dependencies (e.g., `bioframe`).
  - `--no-build-isolation`: Uses the current environment's `pysam`, avoiding temporary build paths.
- Verification:
  - Check linking:

        ldd /workspaces/pairtools/pairtools/lib/parse_pysam.cpython-312-x86_64-linux-gnu.so

    Look for:

        /usr/local/python/3.12.1/lib/python3.12/site-packages/pysam/libchtslib.cpython-312-x86_64-linux-gnu.so (0x…)

    No `/tmp/` paths should appear.
- Why: `--no-build-isolation` ensures the `pairtools` extensions link to the installed `pysam`.

### Step 6: Run Tests to Confirm

Test the installation to ensure everything works.

- Command:

      /usr/local/python/3.12.1/bin/python -m pytest -v

- Expected Output: All 17 tests should collect and run, showing pass/fail results. No `ImportError` for `libchtslib.so`.
- Why: This verifies that `pairtools` and its dependencies are fully functional.

### Step 7: Use `pairtools`

Try it out!

- Command:

      /usr/local/python/3.12.1/bin/pairtools --help

- Expected Output: Displays the command-line help, confirming the CLI is operational.
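
The commands in the steps above hard-code `/usr/local/python/3.12.1/`, which may differ between environments. As a rough sketch (the helper name `package_dir` is hypothetical), the standard-library `sysconfig` module can report where the active interpreter keeps compiled packages, which is where a pip-installed `pysam` would normally live. It assumes a regular, non-`--user` install into the interpreter you run the script with.

```python
import sysconfig
from pathlib import Path


def package_dir(name):
    """Return the expected install directory of `name` under the active
    interpreter's platform-specific site-packages ("platlib")."""
    return Path(sysconfig.get_paths()["platlib"]) / name


if __name__ == "__main__":
    pysam_dir = package_dir("pysam")
    print("Expected pysam location:", pysam_dir)
    # List the compiled extensions only if the package is really there.
    if pysam_dir.is_dir():
        for so_file in sorted(pysam_dir.glob("*.so")):
            print("  ", so_file.name)
    else:
        print("  (directory not found; is pysam installed here?)")
```

The printed directory can then be substituted into the `ls` and `ldd` commands in place of the hard-coded path.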
## Troubleshooting Common Issues

### Linking Still Points to `/tmp/`

- Symptom: `ldd` shows `/tmp/pip-build-env-…/pysam/libchtslib.so => not found`.
- Fix:
  - Clean and rebuild manually:

        cd /workspaces/pairtools
        rm -rf build dist *.egg-info pairtools/lib/*.so
        python setup.py build_ext --inplace --verbose
        pip install -e .[all] --no-build-isolation --no-cache-dir --verbose

  - Recheck with `ldd`.
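
The `ldd` recheck can also be scripted. The sketch below is an illustration rather than anything shipped with pairtools: `temp_build_links` and `check_library` are hypothetical helpers that scan `ldd` output for entries mentioning pip's temporary build directories (`pip-build-env-`), and the sample text is made up for the example, loosely following the symptom line above.

```python
import subprocess


def temp_build_links(ldd_output, marker="pip-build-env-"):
    """Return the lines of `ldd` output that mention `marker`."""
    return [line.strip() for line in ldd_output.splitlines() if marker in line]


def check_library(so_path):
    """Run `ldd` on `so_path` and return any temp-build entries.

    Assumes a Linux system with `ldd` available on PATH.
    """
    result = subprocess.run(
        ["ldd", so_path], capture_output=True, text=True, check=True
    )
    return temp_build_links(result.stdout)


# Illustrative sample only; real output lists many more libraries.
SAMPLE = """\
    linux-vdso.so.1 (0x…)
    /tmp/pip-build-env-…/pysam/libchtslib.so => not found
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x…)
"""

print(temp_build_links(SAMPLE))
```

An empty list from `check_library(...)` on the `parse_pysam` extension would suggest the rebuild picked up the persistent `pysam`; any returned line points at a leftover temporary path.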
### NumPy Version Warnings

- Symptom: Warnings like `numpy 1.26.4 is incompatible with bioframe 0.7.2`.
- Fix:
  - Downgrade NumPy:

        pip install "numpy<2"

  - Rerun tests.
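
To see which versions are actually installed before and after changing NumPy, the standard-library `importlib.metadata` module can be queried. This is a small sketch under stated assumptions: the helper names are hypothetical, and the major-version parse assumes a conventional `X.Y.Z` version string.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Return the leading integer of a version string like '1.26.4'."""
    return int(version_string.split(".")[0])


if __name__ == "__main__":
    # Packages mentioned in this guide; None means "not installed".
    for name in ("numpy", "bioframe", "pysam", "pairtools"):
        print(name, installed_version(name))
```

Comparing `major_version(...)` of the reported NumPy version against 2 shows whether the `"numpy<2"` constraint is already satisfied.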
### Missing System Dependencies

- Symptom: Build fails with errors about missing `htslib` or `gcc`.
- Fix: Reinstall system dependencies (Step 2).

## Why This Works
- Fresh Codespace: Eliminates cached artifacts that cause linking errors.
- Source-Built `pysam`: Ensures shared libraries are in a stable location.
- No Build Isolation: Forces `pairtools` to use the installed `pysam`, avoiding temporary paths.
- Ubuntu Environment: Native Linux in Codespaces provides more reliable library linking than WSL, with its quirks.
## Conclusion

Installing `pairtools` in GitHub Codespaces doesn't have to be a headache. By starting fresh, building `pysam` from source, and installing `pairtools` without build isolation, you can sidestep linking issues and get to work on your paired-end sequencing projects. This method has been battle-tested to resolve the `ImportError` for `libchtslib.so`, giving a robust setup as of March 2025. Now that `pairtools` is installed, explore its capabilities: parse Hi-C data, run stats, or dive into its test suite for inspiration. Happy coding!