Course Outline

Foundations of Safe and Fair AI

  • Core concepts: safety, bias, fairness, and transparency.
  • Types of bias: dataset, representation, and algorithmic biases.
  • Overview of regulatory frameworks, including the EU AI Act and GDPR.

Bias in Fine-Tuned Models

  • Understanding how fine-tuning can introduce or amplify bias.
  • Case studies and real-world failure examples.
  • Methods for identifying bias in datasets and model predictions.

Techniques for Bias Mitigation

  • Data-level strategies, such as rebalancing and augmentation.
  • In-training strategies, including regularization and adversarial debiasing.
  • Post-processing strategies, such as output filtering and calibration.

Model Safety and Robustness

  • Detecting unsafe or harmful model outputs.
  • Handling adversarial inputs effectively.
  • Conducting red teaming and stress testing on fine-tuned models.

Auditing and Monitoring AI Systems

  • Metrics for evaluating bias and fairness, such as demographic parity.
  • Explainability tools and transparency frameworks.
  • Best practices for ongoing monitoring and governance.
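
As an illustration of the demographic parity metric named above, here is a minimal plain-Python sketch. It assumes binary predictions (1 = positive outcome) and one group label per example. The function names and the toy data are hypothetical, chosen for this example only. They are not taken from the course materials or from any library's API.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the share of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rate.

    A value of 0.0 means every group receives positive predictions
    at the same rate; larger values indicate a larger disparity.
    """
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: model predictions and a sensitive attribute.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(preds, groups)
gap = demographic_parity_difference(preds, groups)
print(rates, gap)
```

In this toy data, group "a" receives positive predictions three times out of four and group "b" once out of four, so the gap is 0.5. In practice a threshold for an acceptable gap depends on the application and the applicable regulation.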

Toolkits and Hands-On Practice

  • Using open-source libraries such as Fairlearn, Transformers, and CheckList.
  • Practical session: Detecting and mitigating bias in a fine-tuned model.
  • Generating safe outputs through effective prompt design and constraints.

Enterprise Use Cases and Compliance Readiness

  • Best practices for integrating safety measures into LLM workflows.
  • Documentation and model cards for compliance purposes.
  • Preparing for audits and external reviews.

Summary and Next Steps

Requirements

  • Knowledge of machine learning models and their training procedures.
  • Practical experience with fine-tuning techniques and Large Language Models (LLMs).
  • Familiarity with Python programming and Natural Language Processing (NLP) concepts.

Target Audience

  • AI compliance teams.
  • Machine learning engineers.

Duration

  • 14 hours.