FP8 runs ~100 TFLOPS faster when the kernel name has “cutlass” in it
triton-lang / triton: [Gluon][Tutorial] Persistent…