Silent error when doing logit lens before model dispatch

I noticed a subtle issue: applying logit lens (model.lm_head) to a tensor before the model was dispatched returns all 0s. This then produced the same random tokens every time, which was really confusing until I tracked down the cause. It would be nice to get an error when you try to apply modules that haven't actually been loaded yet.

(this is with 0.4.1)

Python 3.11.11 (main, Dec 11 2024, 16:28:39) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from nnsight import LanguageModel
>>> model = LanguageModel('meta-llama/Llama-2-7b-hf')
>>> import torch
>>> t = torch.randn((4096,))
>>> bad = model.lm_head(t.cuda())
>>> bad
tensor([0., 0., 0.,  ..., 0., 0., 0.], device='cuda:0',
       grad_fn=<SqueezeBackward4>)
>>> (bad == 0).all()
tensor(True, device='cuda:0')
>>> with model.trace('actually load model now'):
...     pass
... 
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████| 2/2 [00:03<00:00,  1.89s/it]
>>> good = model.lm_head(t)
>>> good
tensor([-0.4224, -1.6249,  0.1027,  ...,  0.8179, -1.4246,  0.0459],
       grad_fn=<SqueezeBackward4>)

I think the weirdness comes from using lm_head outside a trace, but I agree this should raise an error rather than failing silently.
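Until such an error exists, a user-side guard is possible. The sketch below is hypothetical (nnsight provides no `check_module_loaded` helper); it just illustrates the kind of check the issue asks for, using an all-zero `nn.Linear` to stand in for an undispatched lm_head:

```python
import torch
import torch.nn as nn

def check_module_loaded(module: nn.Module) -> None:
    """Raise if a module's parameters look unloaded.

    Hypothetical guard, not part of nnsight: flags parameters that are
    still on the meta device or are entirely zero, both of which suggest
    the real weights have not been dispatched yet.
    """
    for name, param in module.named_parameters():
        if param.device.type == "meta":
            raise RuntimeError(f"{name} is on the meta device; model not dispatched")
        if param.numel() > 0 and param.detach().abs().sum().item() == 0:
            raise RuntimeError(f"{name} is all zeros; weights may not be loaded")

# An all-zero linear layer mimics the undispatched lm_head from the transcript.
empty = nn.Linear(8, 8, bias=False)
with torch.no_grad():
    empty.weight.zero_()

try:
    check_module_loaded(empty)
    raised = False
except RuntimeError:
    raised = True
```

A check like this could run before the `model.lm_head(t.cuda())` call above, turning the silent all-zeros output into an immediate, explicit failure.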