Course Outline
Introduction to Custom Operator Development
- Rationale for building custom operators: use cases and constraints
- CANN runtime architecture and points for operator integration
- Overview of TBE, TIK, and TVM within the Huawei AI ecosystem
Using TIK for Low-Level Operator Programming
- Understanding the TIK programming model and its supported APIs
- Memory management and tiling strategies in TIK
- Creating, compiling, and registering a custom op with CANN
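
The tiling idea in this module can be previewed with a minimal sketch in plain Python. It does not use the TIK API. The helper names `plan_tiles` and `tiled_add` are hypothetical, and the list slices only stand in for the copy-in, compute, and copy-out steps that a real kernel would perform against a small on-chip buffer.

```python
def plan_tiles(total_elems, tile_elems):
    """Return (offset, length) pairs that cover total_elems in chunks of at most tile_elems."""
    if tile_elems <= 0:
        raise ValueError("tile_elems must be positive")
    return [
        (offset, min(tile_elems, total_elems - offset))
        for offset in range(0, total_elems, tile_elems)
    ]


def tiled_add(a, b, tile_elems):
    """Element-wise add, processed one tile at a time.

    Assumption: tile_elems models the capacity of a small local buffer;
    the slices below imitate staging data in and out of that buffer.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0] * len(a)
    for offset, length in plan_tiles(len(a), tile_elems):
        a_tile = a[offset:offset + length]   # "copy in" first operand
        b_tile = b[offset:offset + length]   # "copy in" second operand
        c_tile = [x + y for x, y in zip(a_tile, b_tile)]  # compute on the tile
        out[offset:offset + length] = c_tile  # "copy out" the result
    return out


if __name__ == "__main__":
    print(plan_tiles(10, 4))
    print(tiled_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], tile_elems=2))
```

The last tile is shorter than the others whenever the input length is not a multiple of the tile size, which is the tail case a real tiling strategy also has to handle.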
Testing and Validating Custom Ops
- Unit testing and integration testing of ops within the graph
- Debugging kernel-level performance bottlenecks
- Visualizing op execution and buffer behavior
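
A common pattern for unit-testing an operator is to compare its output with a trusted reference within a tolerance. The sketch below illustrates that pattern in plain Python under stated assumptions: `custom_relu` is a hypothetical stand-in for a call into a device kernel, and `reference_relu`, `allclose`, and `check_op` are illustrative helpers, not part of CANN or any framework.

```python
import math


def reference_relu(xs):
    """Trusted reference implementation used as the ground truth."""
    return [max(0.0, x) for x in xs]


def custom_relu(xs):
    """Hypothetical stand-in for the custom op under test."""
    return [x if x > 0 else 0.0 for x in xs]


def allclose(actual, expected, rel_tol=1e-3, abs_tol=1e-5):
    """True if both sequences have equal length and match element-wise within tolerance."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )


def check_op(op, reference, cases):
    """Return the indices of test cases where op disagrees with reference."""
    return [
        index
        for index, case in enumerate(cases)
        if not allclose(op(case), reference(case))
    ]


if __name__ == "__main__":
    cases = [[-1.0, 0.0, 2.5], [], [1e-6, -1e-6]]
    failures = check_op(custom_relu, reference_relu, cases)
    print("failing case indices:", failures)
```

Tolerances are a design choice here: reduced-precision kernels typically need looser bounds than a float32 reference, so the defaults above are placeholders rather than recommended values.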
TVM-Based Scheduling and Optimization
- Overview of TVM as a compiler for tensor operations
- Writing a schedule for a custom op in TVM
- TVM tuning, benchmarking, and code generation for Ascend
Integration with Frameworks and Models
- Registering custom ops for MindSpore and ONNX
- Verifying model integrity and fallback behavior
- Supporting multi-operator graphs with mixed precision
Case Studies and Specialized Optimizations
- Case study: high-efficiency convolution for small input shapes
- Case study: memory-aware attention operator optimization
- Best practices in custom op deployment across devices
Summary and Next Steps
Requirements
- Strong understanding of AI model internals and operator-level computations
- Practical experience with Python and Linux development environments
- Knowledge of neural network compilers or graph-level optimization techniques
Target Audience
- Compiler engineers engaged in AI toolchain development
- Systems developers specializing in low-level AI optimization
- Developers creating custom operators or addressing novel AI workloads
Duration
- 14 hours