_c.py_mount: Fatal Python error: Segmentation fault

Hi everyone,

I’m encountering a segmentation fault when using nnsight in a multiprocessing setup. The crash happens deep inside the C extension _c.py_mount and terminates with:

_c.py_mount: Fatal Python error: Segmentation fault

The error occurs while evaluating a model with joblib.Parallel (loky backend) across multiple GPUs. Each worker calls an evaluate_checkpoint function that internally uses nnsight for tracing/interventions.
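Roughly, each worker looks like this (a simplified stdlib-only sketch, not the real evaluate_checkpoint — the nnsight work is elided, and the round-robin GPU mapping and the two-GPU count are assumptions for illustration):

```python
import os
from multiprocessing import get_context

N_GPUS = 2  # assumption: two visible GPUs


def assign_gpu(job_index, n_gpus=N_GPUS):
    # Round-robin mapping from job index to GPU id.
    return job_index % n_gpus


def evaluate_checkpoint(job_index):
    # Pin this worker to a single GPU *before* CUDA is initialized,
    # so each process only ever touches its own device.
    gpu = assign_gpu(job_index)
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu)
    # ... load the checkpoint and run the nnsight trace here ...
    return job_index, gpu


if __name__ == "__main__":
    # spawn gives each worker a fresh interpreter, with no CUDA state
    # inherited from the parent via fork.
    with get_context("spawn").Pool(N_GPUS) as pool:
        print(pool.map(evaluate_checkpoint, range(4)))
```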

Environment:

  • Python 3.12.11
  • PyTorch (GPU build)
  • nnsight (latest release from PyPI)
  • CUDA 12.x

What I’ve tried:

  • Reinstalling nnsight and PyTorch
  • Setting multiprocessing start method to spawn
  • Ensuring CUDA drivers and PyTorch versions match

The segmentation fault still occurs randomly during execution.
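For now, one way to at least contain the crash might be to run each evaluation in a throwaway subprocess, so a segfault kills only that child and the parent can detect and retry it. A minimal stdlib-only sketch (not nnsight-specific; in practice the snippet would load a checkpoint instead of printing):

```python
import subprocess
import sys


def run_isolated(snippet):
    # Run one evaluation in a fresh interpreter: if it segfaults,
    # only the child dies and the parent sees a nonzero return code.
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()


# A healthy run exits 0 and hands back its stdout.
rc, out = run_isolated("print(2 + 2)")
print(rc, out)  # 0 4

# A crashing run (deliberate null-pointer read) yields a negative
# return code (e.g. -11 for SIGSEGV on Linux) instead of killing
# the parent process.
rc_bad, _ = run_isolated("import ctypes; ctypes.string_at(0)")
print(rc_bad != 0)
```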

Has anyone seen similar behavior with _c.py_mount or nnsight under Python 3.12 or in multiprocessing contexts?
Any guidance or workarounds would be appreciated!

Hey @jonas. I tried to reproduce this and was unable to do so using this script:

import nnsight
from joblib import Parallel, delayed


def run_model(i):
    print(f"Running model {i}")
    model = nnsight.LanguageModel("openai-community/gpt2", device_map="auto")

    # Repeatedly trace and save the final logits.
    for ii in range(30):
        print(f"Running model {i}: {ii}")
        with model.trace("hello world"):
            output = model.lm_head.output.save()

    return output


Parallel(n_jobs=2, backend="loky")(delayed(run_model)(i) for i in range(2))

Does this capture what you’re trying to do?

  • Python 3.12.11
  • NNsight 0.5.8
  • PyTorch 2.8.0

Yes, pretty much. It first happens after about 300 checkpoints on olmo-1b. I’ll try to create a minimal example that reproduces it.