    `torch.empty` can create issues; use `torch.zeros` · 24892520
    brkirch authored
    On MPS, passing a tensor created with `torch.empty()` as the input to `torch.baddbmm()` can cause NaNs to appear in the returned tensor, even though `beta=0`. Because the tensor has shape [1,1,1], the performance difference between `torch.empty()` and `torch.zeros()` should be negligible, so it's better to just use `torch.zeros()` here and avoid creating issues unnecessarily.
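A minimal sketch of the pattern described above, not the actual code from `sub_quadratic_attention.py`. The function name `scaled_scores` and its parameters are hypothetical and used only for illustration. The PyTorch documentation states that with `beta=0` the input tensor is ignored and NaN/inf values in it are not propagated, so the zero-filled placeholder is a defensive choice for backends where that may not hold.

```python
import torch


def scaled_scores(query: torch.Tensor, key_t: torch.Tensor, scale: float) -> torch.Tensor:
    """Compute scale * (query @ key_t) per batch with torch.baddbmm.

    Hypothetical helper: query is [batch, n, d], key_t is [batch, d, m].
    """
    # Placeholder input for baddbmm. With beta=0 its values should not matter,
    # but torch.empty() leaves memory uninitialized, so zeros are used instead.
    bias = torch.zeros(1, 1, 1, device=query.device, dtype=query.dtype)
    # Result is beta * bias + alpha * (query @ key_t); beta=0 drops the bias term.
    return torch.baddbmm(bias, query, key_t, alpha=scale, beta=0)
```

The `[1, 1, 1]` placeholder broadcasts against the `[batch, n, m]` result, so zero-filling it costs a single element write.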
sub_quadratic_attention.py 7.14 KB