- FP16 GEMM: Uses 16-bit floating point (FP16) inputs for A and B matrices
- FP8 GEMM: Uses 8-bit floating point (FP8) inputs for A and B matrices, with scaling factors to maintain numerical stability
- M: variable
- N, K: constant
- A: [M, K]
- B: [N, K]
- Scaling factors for FP8 GEMM:
  - A_scale: [M]
  - B_scale: [N]
- C: [M, N]
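
The shapes above imply that C is computed as A times the transpose of B, since both operands share K as their last dimension. The following is a minimal pure-Python reference sketch of that contract, not an optimized kernel. It assumes the FP8 scaling factors are per-row multipliers applied to the accumulated result (`C[m][n] = A_scale[m] * B_scale[n] * sum_k A[m][k] * B[n][k]`); the source does not state how the scales are applied, so treat that convention and the function name `scaled_gemm_reference` as hypothetical.

```python
def scaled_gemm_reference(a, b, a_scale=None, b_scale=None):
    """Reference C = A @ B^T with optional per-row scaling factors.

    a:       M rows of K values  (shape [M, K])
    b:       N rows of K values  (shape [N, K])
    a_scale: M values, b_scale: N values (FP8 case only).
    ASSUMPTION: scales are dequantization multipliers applied to the
    accumulated dot product; omit them for the FP16 case.
    """
    m_dim, n_dim = len(a), len(b)
    k_dim = len(a[0]) if a else 0
    # Both operands must share the same K dimension.
    assert all(len(row) == k_dim for row in a), "A must be [M, K]"
    assert all(len(row) == k_dim for row in b), "B must be [N, K]"

    # No scales means plain GEMM (scale of 1.0 everywhere).
    if a_scale is None:
        a_scale = [1.0] * m_dim
    if b_scale is None:
        b_scale = [1.0] * n_dim
    assert len(a_scale) == m_dim, "A_scale must be [M]"
    assert len(b_scale) == n_dim, "B_scale must be [N]"

    c = [[0.0] * n_dim for _ in range(m_dim)]
    for m in range(m_dim):
        for n in range(n_dim):
            # Dot product of row m of A with row n of B (i.e. A @ B^T).
            acc = sum(a[m][k] * b[n][k] for k in range(k_dim))
            c[m][n] = acc * a_scale[m] * b_scale[n]
    return c


# Example with M=3, N=2, K=2.
a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
b = [[1.0, 0.0], [0.0, 1.0]]
c = scaled_gemm_reference(a, b, a_scale=[0.5, 1.0, 2.0], b_scale=[1.0, 2.0])
```

Because M is variable while N and K are constant, a reference like this can be called with any number of rows in `a` against a fixed `b`, which is useful for checking an optimized kernel's output across several values of M.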

