Hardware-Efficient Machine Learning
- type: Lecture (V)
- chair: ITEC Henkel
- semester: WS 25/26
- lecturer:
- SWS: 2
- ECTS: 3
- lv-no.: 2400232
Content
Deep learning models are ubiquitous in modern applications, from computer vision to natural language processing. However, deploying these models efficiently on hardware platforms remains a significant challenge: naive deployments suffer from increased latency, energy consumption, memory usage, and cost. As deep learning models grow larger and more complex, optimization techniques that improve their performance on hardware become more critical than ever. This course discusses the principles and techniques for optimizing deep learning models for hardware deployment, focusing on both cloud and embedded systems.
The course covers various optimization techniques to improve model efficiency, reduce latency, and minimize memory usage while maintaining high task performance. Specifically, the covered topics include (see the sketch after this list):
- Knowledge distillation
- Neural architecture search (NAS)
- Pruning
- Quantization
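To give a flavor of the hands-on side, here is a minimal PyTorch sketch of two of the listed techniques: magnitude pruning followed by post-training dynamic quantization. The toy model, the 50% sparsity level, and the int8 data type are illustrative assumptions, not values taken from the course.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model; the layer sizes are arbitrary placeholders for illustration.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Pruning: zero out the 50% smallest-magnitude weights of the first layer
# (unstructured L1 pruning), then bake the sparsity into the weight tensor.
prune.l1_unstructured(model[0], name="weight", amount=0.5)
prune.remove(model[0], "weight")

# Post-training dynamic quantization: weights of all Linear layers are
# stored as int8; activations are quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 784)
print(quantized(x).shape)  # torch.Size([1, 10])
```

Dynamic quantization is shown here only because it requires no calibration data; static quantization and quantization-aware training trade extra effort for lower inference latency.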
Recommendations
A basic understanding of deep learning concepts and neural networks is expected; familiarity with hardware architectures
and embedded systems is helpful but not required.
Competence Goal
Upon successful completion of the course, participants will be able to:
1. Understand and analyze the challenges of efficiently deploying deep learning models on hardware platforms.
2. Apply optimization techniques to improve the hardware efficiency of deep learning models, specifically to reduce latency, energy consumption, and memory usage.
3. Explain and implement the following methods in practical scenarios (a minimal distillation sketch follows this list):
- Knowledge Distillation
- Neural Architecture Search (NAS)
- Pruning
- Quantization
4. Evaluate and adapt the use of these techniques for both cloud-based and embedded systems.
5. Perform model optimizations while maintaining high task performance.
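As a concrete illustration of objective 3, the following is a minimal sketch of a Hinton-style knowledge distillation loss in PyTorch. The temperature T and mixing weight alpha are illustrative defaults chosen for this example, not values prescribed by the course.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hinton-style knowledge distillation loss.

    Mixes a soft-target term (KL divergence to the teacher's
    temperature-softened distribution) with the usual hard-label
    cross-entropy. T and alpha are tunable hyperparameters.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage: logits come from a large (frozen) teacher and a small student.
teacher_logits = torch.randn(8, 10)
student_logits = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

During training, the student minimizes this loss while the teacher's parameters stay fixed.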