Jupyter for Data Science Teams Training Course
Jupyter is an open-source, web-based interactive IDE and computing environment.
This instructor-led, live training (online or onsite) introduces the concept of collaborative development in data science and demonstrates how to use Jupyter to track and participate as a team in the "life cycle of a computational idea". It guides participants through the creation of a sample data science project built on the Jupyter ecosystem.
By the end of this training, participants will be able to:
- Install and configure Jupyter, including the creation and integration of a team repository on Git.
- Utilise Jupyter features such as extensions, interactive widgets, multiuser mode, and more to facilitate project collaboration.
- Create, share, and organise Jupyter Notebooks with team members.
- Choose from Scala, Python, or R to write and execute code against big data systems such as Apache Spark, all through the Jupyter interface.
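As a rough illustration of the Git integration mentioned above: notebooks are stored as JSON, so teams often clear cell outputs before committing to keep diffs readable. The sketch below is not taken from the course material. It assumes the nbformat 4 layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`), and the function name `strip_outputs` and the sample notebook are hypothetical.

```python
import json


def strip_outputs(notebook_json: str) -> str:
    """Return notebook JSON with code-cell outputs and execution counts cleared."""
    nb = json.loads(notebook_json)
    for cell in nb.get("cells", []):
        # Only code cells carry outputs; markdown cells are left untouched.
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    # Sorted keys and a fixed indent give stable, diff-friendly output.
    return json.dumps(nb, indent=1, sort_keys=True) + "\n"


# A tiny hand-written notebook, used only for illustration.
example = json.dumps({
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["4\n"]}
            ],
            "source": ["print(2 + 2)"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

cleaned = json.loads(strip_outputs(example))
print(cleaned["cells"][0]["outputs"])
```

In practice a team would more likely rely on an existing tool such as nbstripout, wired into a Git filter or pre-commit hook, than on a hand-rolled script like this one.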
Format of the Course
- Interactive lecture and discussion.
- Plenty of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- Jupyter Notebook supports over 40 programming languages, including R, Python, Scala, and Julia. To customize this course for your language(s) of choice, please contact us to arrange.
Course Outline
Introduction to Jupyter
- Overview of Jupyter and its ecosystem
- Installation and setup
- Configuring Jupyter for team collaboration
Collaborative Features
- Using Git for version control
- Extensions and interactive widgets
- Multiuser mode
Creating and Managing Notebooks
- Notebook structure and functionality
- Sharing and organising notebooks
- Best practices for collaboration
Programming with Jupyter
- Choosing and using programming languages (Python, R, Scala)
- Writing and executing code
- Integrating with big data systems (Apache Spark)
Advanced Jupyter Features
- Customizing the Jupyter environment
- Automating workflows with Jupyter
- Exploring advanced use cases
Practical Sessions
- Hands-on labs
- Real-world data science projects
- Group exercises and peer reviews
Summary and Next Steps
Requirements
- Programming experience in a language such as Python, R, or Scala
- A background in data science
Audience
- Data science teams
Open Training Courses require 5+ participants.
Testimonials (1)
It is great to have the course custom made to the key areas that I have highlighted in the pre-course questionnaire. This really helps to address the questions that I have with the subject matter and to align with my learning goals.
Winnie Chan - Statistics Canada
Course - Jupyter for Data Science Teams
Related Courses
Introduction to Data Science and AI using Python
35 Hours
This five-day programme offers an introductory overview of Data Science and Artificial Intelligence (AI).
The training is conducted through practical examples and exercises utilising Python.
Apache Airflow for Data Science: Automating Machine Learning Pipelines
21 Hours
This instructor-led, live training in Malaysia (available online or onsite) targets intermediate-level participants who wish to automate and manage machine learning workflows, including model training, validation, and deployment using Apache Airflow.
Upon completing this training, participants will be equipped to:
- Configure Apache Airflow for orchestrating machine learning workflows.
- Automate tasks related to data preprocessing, model training, and validation.
- Seamlessly integrate Airflow with various machine learning frameworks and tools.
- Deploy machine learning models through automated pipelines.
- Monitor and optimize machine learning workflows within production environments.
Anaconda Ecosystem for Data Scientists
14 Hours
This instructor-led live training in Malaysia (online or onsite) is aimed at data scientists who wish to use the Anaconda ecosystem to capture, manage, and deploy packages and data analysis workflows in a single platform.
By the end of this training, participants will be able to:
- Install and configure Anaconda components and libraries.
- Understand the core concepts, features, and benefits of Anaconda.
- Manage packages, environments, and channels using Anaconda Navigator.
- Use Conda, R, and Python packages for data science and machine learning.
- Get to know some practical use cases and techniques for managing multiple data environments.
AWS Cloud9 for Data Science
28 Hours
This instructor-led, live training in Malaysia (online or onsite) is aimed at intermediate-level data scientists and analysts who wish to use AWS Cloud9 for streamlined data science workflows.
By the end of this training, participants will be able to:
- Set up a data science environment in AWS Cloud9.
- Perform data analysis using Python, R, and Jupyter Notebook in Cloud9.
- Integrate AWS Cloud9 with AWS data services like S3, RDS, and Redshift.
- Utilize AWS Cloud9 for machine learning model development and deployment.
- Optimize cloud-based workflows for data analysis and processing.
Introduction to Google Colab for Data Science
14 Hours
This instructor-led, live training in Malaysia (online or onsite) is designed for beginner-level data scientists and IT professionals seeking to grasp the fundamentals of data science via Google Colab.
Upon completing this training, participants will be able to:
- Configure and navigate Google Colab.
- Compose and run fundamental Python code.
- Import and manage datasets.
- Produce visualizations leveraging Python libraries.
A Practical Introduction to Data Science
35 Hours
Attendees who finish this training will acquire a practical, real-world grasp of Data Science, along with its associated technologies, methodologies, and tools.
The course provides ample opportunity for participants to apply their knowledge through practical exercises. Interactive group work and direct feedback from the instructor are key elements of the learning experience.
The programme begins by covering fundamental Data Science concepts before advancing to the specific tools and methodologies employed in the field.
Target Audience
- Software Developers
- Technical Analysts
- IT Consultants
Course Format
- A blend of lectures, discussions, exercises, and extensive hands-on practice
Note
- For those interested in a customized training session for this course, please get in touch with us to arrange it.
Data Science for Big Data Analytics
35 Hours
Big data refers to data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy.
Data Science essential for Marketing/Sales professionals
21 Hours
This course is designed for Marketing and Sales Professionals who wish to deepen their understanding of how to apply data science within these fields. It provides comprehensive coverage of various data science techniques utilized for 'upselling', 'cross-selling', market segmentation, branding, and Customer Lifetime Value (CLV).
Understanding the Difference Between Marketing and Sales - How do sales and marketing differ?
In simple terms, sales is a process that focuses on targeting individuals or small groups. Marketing, on the other hand, targets a broader audience or the general public. Marketing involves research (identifying customer needs), product development (creating innovative offerings), and promotion (through advertisements to create awareness among consumers). Essentially, marketing is about generating leads or prospects. Once a product is launched, the salesperson's role is to persuade customers to make a purchase. Sales focuses on converting these leads or prospects into purchases and orders. Marketing is geared toward long-term goals, while sales addresses shorter-term objectives.
Introduction to Data Science
35 Hours
This instructor-led, live training (online or onsite) is aimed at professionals who wish to start a career in Data Science.
By the end of this training, participants will be able to:
- Install and configure Python and MySQL.
- Understand what Data Science is and how it can add value to virtually any business.
- Learn the fundamentals of coding in Python.
- Learn supervised and unsupervised Machine Learning techniques, and how to implement them and interpret the results.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Kaggle
14 Hours
This instructor-led live training, conducted in Malaysia (online or onsite), targets data scientists and developers aiming to learn and develop their careers in Data Science using Kaggle.
By the conclusion of this training, participants will be able to:
- Understand data science and machine learning.
- Explore data analytics.
- Learn about Kaggle and how it works.
Data Science with KNIME Analytics Platform
21 Hours
The KNIME Analytics Platform stands as a premier open-source solution for driving data-led innovation. It empowers you to uncover hidden value within your data, extract novel insights, or forecast future trends. Equipped with over 1,000 modules, hundreds of pre-built examples, a broad suite of integrated tools, and an extensive selection of advanced algorithms, KNIME Analytics Platform serves as the ideal toolkit for data scientists and business analysts alike.
This course offers an excellent pathway for beginners, advanced users, and seasoned KNIME experts to familiarize themselves with the platform, enhance their proficiency, and learn how to generate clear, detailed reports using KNIME workflows.
Delivered by an instructor, this live training (available online or on-site) is designed for data professionals aiming to leverage KNIME to address complex business requirements.
The programme is specifically tailored for individuals who may not have programming expertise but wish to utilise cutting-edge tools to execute analytics scenarios.
Upon completion of this training, participants will be capable of:
- Installing and configuring KNIME.
- Developing Data Science scenarios.
- Training, testing, and validating models.
- Implementing the end-to-end data science model value chain.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practical application.
- Practical implementation within a live-lab environment.
Customisation Options
- For personalised training or further details on this programme, please contact us to make arrangements.
MATLAB Fundamentals, Data Science & Report Generation
35 Hours
This training is divided into three key segments. The first segment introduces the core fundamentals of MATLAB, exploring its role as both a programming language and a comprehensive platform. Topics covered include an introduction to MATLAB syntax, arrays and matrices, data visualization techniques, script development, and the principles of object-oriented programming.
The second segment demonstrates how to leverage MATLAB for data mining, machine learning, and predictive analytics. To help participants clearly appreciate MATLAB's capabilities and efficiency, we draw comparisons between using MATLAB and other common tools such as spreadsheets, C, C++, and Visual Basic.
In the third segment, participants will learn how to enhance their workflow efficiency by automating data processing and report generation tasks.
Throughout the course, participants will apply learned concepts through practical exercises in a lab setting. By the conclusion of the training, attendees will possess a solid understanding of MATLAB’s functionalities and will be equipped to address real-world data science challenges while streamlining their work via automation.
Progress assessments will be conducted throughout the course to monitor development.
Course Format
- The course comprises both theoretical instruction and practical exercises, including case studies, sample code analysis, and hands-on implementation.
Note
- Practice sessions rely on pre-arranged sample data and report templates. If you have specific requirements, please contact us to make necessary arrangements.
Machine Learning for Data Science with Python
21 Hours
This instructor-led, live training in Malaysia (online or on-site) targets intermediate-level data analysts, developers, and aspiring data scientists who aim to utilise machine learning techniques in Python to extract insights, make predictions, and automate data-driven decisions.
By the end of this course, participants will be able to:
- Comprehend and distinguish key machine learning paradigms.
- Explore data preprocessing techniques and model evaluation metrics.
- Apply machine learning algorithms to resolve real-world data problems.
- Utilise Python libraries and Jupyter notebooks for practical development.
- Construct models for prediction, classification, recommendation, and clustering.
Accelerating Python Pandas Workflows with Modin
14 Hours
This instructor-led, live training in Malaysia (online or onsite) is aimed at data scientists and developers who wish to use Modin to build and implement parallel computations with Pandas for faster data analysis.
By the end of this training, participants will be able to:
- Set up the necessary environment to start developing Pandas workflows at scale with Modin.
- Understand the features, architecture, and advantages of Modin.
- Know the differences between Modin, Dask, and Ray.
- Perform Pandas operations faster with Modin.
- Implement the entire Pandas API and functions.
GPU Data Science with NVIDIA RAPIDS
14 Hours
This instructor-led live training in Malaysia (online or onsite) targets data scientists and developers who want to use RAPIDS to develop GPU-accelerated data pipelines, workflows, and visualizations, applying machine learning algorithms like XGBoost and cuML.
By the end of this training, participants will be able to:
- Set up the necessary development environment to build data models with NVIDIA RAPIDS.
- Understand the features, components, and advantages of RAPIDS.
- Leverage GPUs to accelerate end-to-end data and analytics pipelines.
- Implement GPU-accelerated data preparation and ETL with cuDF and Apache Arrow.
- Learn how to perform machine learning tasks with XGBoost and cuML algorithms.
- Build data visualizations and execute graph analysis with cuXfilter and cuGraph.