CANN/catlass: Basic Matrix Multiplication
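Basic Matmul

catlass: this project is CANN's operator template library. It provides template samples for high-performance matrix multiplication and related fused operators on NPU. Project address: https://gitcode.com/cann/catlass

Creation statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.

Code location

In the catlass repository (project address above).

Function description

Basic matrix multiplication: a cube operator with no AIV computation and a non-TLA implementation.

Class template overview

Template parameters

    class BlockMmad_         blockMmad class, the matrix-multiplication component
    class BlockEpilogue_     blockEpilogue class, the post-processing component (not actually used)
    class BlockScheduler_    blockScheduler class; only Gemm::Block::GemmIdentityBlockSwizzle is supported

Params

    struct Params {
        GemmCoord problemShape;  // problem shape of the test case
        GM_ADDR ptrA;            // GM start address of input matA
        LayoutA layoutA;         // layout of input matA
        GM_ADDR ptrB;            // GM start address of input matB
        LayoutB layoutB;         // layout of input matB
        GM_ADDR ptrC;            // GM start address of output matC
        LayoutC layoutC;         // layout of output matC
        ...
    }

Arguments

    struct Arguments {
        GemmCoord problemShape;  // problem shape of the test case
        GM_ADDR ptrA;            // GM start address of input matA
        GM_ADDR ptrB;            // GM start address of input matB
        GM_ADDR ptrC;            // GM start address of output matC
    };

Invocation example

Kernel assembly

    using BlockMmad = Gemm::Block::BlockMmad<DispatchPolicy, L1TileShape, L0TileShape, AType, BType, CType>;
    using BlockEpilogue = void;
    using BlockScheduler = typename Gemm::Block::GemmIdentityBlockSwizzle<3, 0>;
    // kernel level
    using MatmulKernel = Gemm::Kernel::BasicMatmul<BlockMmad, BlockEpilogue, BlockScheduler>;

Constraints

This kernel calls blockMmad inside the AIC kernel function void operator()<AscendC::AIC> in a way that involves neither asynchronous execution nor preload, so it supports only simple blockMmad components such as block_mmad_pingpong.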
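To make the three-parameter composition more concrete, the following is a minimal host-side C++ sketch of the same shape: a kernel template that takes a block-level multiply component, an unused (void) epilogue, and a block scheduler, and that converts Arguments into Params before running. It is not the catlass implementation. Every type here (GemmCoord, RowMajor, IdentityScheduler, NaiveBlockMmad, BasicMatmulSketch, ToParams, RunExample) is a hypothetical stand-in that runs on the CPU, the scheduler walks tiles in plain row-major order instead of applying the real swizzle, and the packed row-major layouts derived from the problem shape are an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins; none of these are the real catlass types.
struct GemmCoord { uint32_t m, n, k; };
struct RowMajor { uint32_t stride; };  // element (r, c) lives at r * stride + c

// Stand-in scheduler: enumerates output tiles in plain row-major order.
// The real GemmIdentityBlockSwizzle<3, 0> swizzle is NOT modeled here.
template <uint32_t TileM, uint32_t TileN>
struct IdentityScheduler {
    static constexpr uint32_t kTileM = TileM, kTileN = TileN;
    static uint32_t TilesN(const GemmCoord &s) { return (s.n + TileN - 1) / TileN; }
    static uint32_t Count(const GemmCoord &s) { return ((s.m + TileM - 1) / TileM) * TilesN(s); }
};

// Stand-in BlockMmad: synchronous multiply-accumulate over one output tile.
struct NaiveBlockMmad {
    void operator()(const float *a, RowMajor la, const float *b, RowMajor lb,
                    float *c, RowMajor lc, uint32_t m0, uint32_t m1,
                    uint32_t n0, uint32_t n1, uint32_t k) const {
        for (uint32_t i = m0; i < m1; ++i) {
            for (uint32_t j = n0; j < n1; ++j) {
                float acc = 0.0f;
                for (uint32_t p = 0; p < k; ++p) {
                    acc += a[i * la.stride + p] * b[p * lb.stride + j];
                }
                c[i * lc.stride + j] = acc;
            }
        }
    }
};

template <class BlockMmad_, class BlockEpilogue_, class BlockScheduler_>
struct BasicMatmulSketch {
    // The epilogue slot exists in the template signature but is never used.
    static_assert(std::is_void<BlockEpilogue_>::value, "epilogue is unused in this sketch");

    struct Arguments {
        GemmCoord problemShape;
        const float *ptrA;
        const float *ptrB;
        float *ptrC;
    };
    struct Params {
        GemmCoord problemShape;
        const float *ptrA; RowMajor layoutA;
        const float *ptrB; RowMajor layoutB;
        float *ptrC; RowMajor layoutC;
    };

    // Assumption: layouts are packed row-major and derived from the shape.
    static Params ToParams(const Arguments &args) {
        const GemmCoord &s = args.problemShape;
        return {s, args.ptrA, {s.k}, args.ptrB, {s.n}, args.ptrC, {s.n}};
    }

    // One tile after another, with no asynchronous execution and no preload.
    void operator()(const Params &p) const {
        BlockMmad_ blockMmad;
        const GemmCoord &s = p.problemShape;
        const uint32_t tilesN = BlockScheduler_::TilesN(s);
        for (uint32_t t = 0; t < BlockScheduler_::Count(s); ++t) {
            const uint32_t m0 = (t / tilesN) * BlockScheduler_::kTileM;
            const uint32_t n0 = (t % tilesN) * BlockScheduler_::kTileN;
            const uint32_t m1 = std::min(m0 + BlockScheduler_::kTileM, s.m);
            const uint32_t n1 = std::min(n0 + BlockScheduler_::kTileN, s.n);
            blockMmad(p.ptrA, p.layoutA, p.ptrB, p.layoutB, p.ptrC, p.layoutC,
                      m0, m1, n0, n1, s.k);
        }
    }
};

// Multiplies a 3x2 matrix by a 2x3 matrix using 2x2 output tiles.
std::vector<float> RunExample() {
    std::vector<float> a = {1, 2, 3, 4, 5, 6};  // A, 3x2
    std::vector<float> b = {1, 0, 2, 0, 1, 3};  // B, 2x3
    std::vector<float> c(9, 0.0f);              // C, 3x3
    using Kernel = BasicMatmulSketch<NaiveBlockMmad, void, IdentityScheduler<2, 2>>;
    Kernel::Arguments args{{3, 3, 2}, a.data(), b.data(), c.data()};
    Kernel{}(Kernel::ToParams(args));
    return c;
}
```

Calling RunExample() fills C tile by tile, including the partial tiles at the right and bottom edges. The strictly sequential loop in operator() reflects the constraint stated above: because the kernel invokes the block-level component synchronously, a component that relies on asynchronous execution or preload would not fit this calling pattern.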