Course Outline
Overview of CANN Optimisation Capabilities
- How inference performance is managed within CANN.
- Optimisation objectives for edge and embedded AI systems.
- Understanding AI Core utilisation and memory allocation.
Leveraging the Graph Engine for Analysis
- Introduction to the Graph Engine and its execution pipeline.
- Visualising operator graphs and runtime metrics.
- Modifying computational graphs to drive optimisation.
Profiling Tools and Performance Metrics
- Utilising the CANN Profiling Tool (profiler) for workload analysis.
- Analysing kernel execution time and identifying bottlenecks.
- Memory access profiling and tiling strategies.
Custom Operator Development with TIK
- Overview of TIK and the operator programming model.
- Implementing a custom operator using TIK DSL.
- Testing and benchmarking operator performance.
Advanced Operator Optimisation with TVM
- Introduction to TVM integration with CANN.
- Auto-tuning strategies for computational graphs.
- Guidelines on when and how to switch between TVM and TIK.
Memory Optimisation Techniques
- Managing memory layout and buffer placement.
- Techniques to minimise on-chip memory consumption.
- Best practices for asynchronous execution and resource reuse.
Real-World Deployment and Case Studies
- Case study: performance tuning for smart city camera pipelines.
- Case study: optimising the inference stack for autonomous vehicles.
- Guidelines for iterative profiling and continuous improvement.
Summary and Next Steps
Requirements
- Strong understanding of deep learning model architectures and training workflows.
- Practical experience in model deployment using CANN, TensorFlow, or PyTorch.
- Familiarity with Linux CLI, shell scripting, and Python programming.
Target Audience
- AI performance engineers.
- Inference optimisation specialists.
- Developers working on edge AI or real-time systems.
Duration: 14 hours