Date | Format | Fee
07 Jul - 11 Jul 2025 | Virtual/Live | $2,950
18 Aug - 22 Aug 2025 | Virtual/Live | $2,950
24 Nov - 28 Nov 2025 | Virtual/Live | $2,950
About the Course

Data Science sits at the intersection of statistics, analytics, programming, optimisation, visualisation and decision making.

Data Analysis is the process of gathering, inspecting, cleansing, transforming, and modelling data to discover useful information, suggest conclusions, and support decision-making. With the development of software tools and data collection techniques, data analytics has become central to how companies and industries do business. However, while both demand and salaries for data scientists are rising, there is a significant shortage of trained professionals. Applying statistical methods without following their governing principles can lead to misleading results. In this Big Data Capstone virtual training course, delegates will assume the role of a Data Scientist working for a startup and follow the Data Science methodology: data collection, data wrangling, exploratory data analysis, data visualisation, model development, model evaluation, and reporting results to stakeholders.

The objective of this training course is to prepare delegates, who will work on an actual project, to identify all aspects of the Big Data lifecycle, obtain the relevant data sources, and cleanse the data of errors, biases and outliers that could skew or derail the project as a whole. The example project within the virtual training course will be chosen according to the delegates' work environment and industry.

Core Objectives

Delegates will achieve the following objectives:

  • Identify the sources of data related to the Data Science project
  • Know how to collect, import, transform, visualise, and model data with Python and R
  • Acquire the knowledge needed to implement Data Science projects
  • Learn the Data Analytics models and lifecycle
  • Adopt the use of Python and R for clustering, association and regression analysis
  • Use Python and R for text analytics
Training Approach

This training course combines lectures, video presentations, trainer-facilitated workshop exercises, and case study analysis, delivered through a Virtual Learning Platform that can be accessed anytime and anywhere.

The Attendees

This virtual training course is designed for everyone involved in decision making and analysis, as well as researchers and consultants working in data management, analytics, optimisation, business analysis, IT, project management and process optimisation.

This virtual training course will be valuable to professionals, including (but not limited to) the following:

  • Researchers and Practitioners in Data Management and Analytics
  • Statistical and Research Analysts
  • Key Application Development and Data Research Personnel
  • Technology Engineers, CTOs and CIOs
  • Strategic Development Personnel
  • Project Managers
Daily Discussion

DAY ONE: DATA SCIENCE IMPORTANCE AND DEVELOPMENT

  • History and Development of Data Science and Data Analytics
  • Current Practices and Trends in Data Analytics
  • Key Drivers for Data Analytics and Data Science Implementation
  • Success and Failure: Data Analytics and Data Science Applications
  • Identification of areas within the organisation for Data Science Projects Implementation

DAY TWO: DATA ANALYTICS IMPLEMENTATION

  • Data Analytics Lifecycle
  • Data Gathering, Inspection and Modelling
  • Describing Datasets using Statistics
  • Measures of Central Tendency, Dispersion, Symmetry and Kurtosis of Data
  • Advanced Methods of Clustering
  • Advanced Theory and Methods of Association Rules
  • Advanced Theory and Methods of Regression
  • Tabletop exercise:
    • Use of K-means for Clustering
    • Use of Apriori Algorithm for Market Basket Analysis
    • Use of Linear and Logistic Regression
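
As an illustration of the descriptive measures listed for Day Two (central tendency, dispersion, symmetry and kurtosis), a minimal Python sketch follows. It is not taken from the course material: the function name `describe` and the sample values are hypothetical, and the skewness and kurtosis shown are the simple population (moment-based) versions rather than any particular textbook's corrected estimators.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Population standard deviation (divides by n, not n - 1).
    sd = statistics.pstdev(values)
    # Standardised third and fourth central moments.
    skew = sum((x - mean) ** 3 for x in values) / (n * sd ** 3)
    kurt = sum((x - mean) ** 4 for x in values) / (n * sd ** 4)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std_dev": sd,
        "skewness": skew,                # > 0 means a longer right tail
        "excess_kurtosis": kurt - 3.0,   # 0 for a normal distribution
    }

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, for illustration only
summary = describe(sample)
print(summary)
```

A positive skewness for this sample reflects the few larger values pulling the right tail out, which is the kind of asymmetry these measures are meant to reveal before modelling.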

DAY THREE: USE OF PYTHON FOR DATA SCIENCE

  • Overview of Python Framework
  • Basic Operations with Python
  • Variables and Basic Calculation
  • Python Lists
  • Python Functions and Packages
  • Basic Calculations with Python
  • Data Extraction and Transformation with Python
  • K-means Clustering Calculation with Python
  • Data Visualisation
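
To give a sense of the K-means calculation listed for Day Three, here is a minimal sketch in plain Python (3.8 or later). It is an assumption-laden illustration rather than the course's own exercise: the `kmeans` function, the six sample points and the choice of initial centroids are all hypothetical, and real projects would more likely use an established library.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-means on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids

# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The initial centroids are passed in explicitly so the result is reproducible; random initialisation is common in practice but can give different clusters from run to run.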

DAY FOUR: R FOR DATA SCIENCE

  • Overview of R Framework
  • Basic R Syntax
  • Installing and using R packages
  • Data Collection from Online Sources
  • Logistic Regression with R
  • Decision Tree Analysis with R
  • Apriori Algorithm with R

DAY FIVE: MANAGEMENT OF DATA SCIENCE PROJECTS WITHIN THE ORGANISATION

  • Comparison between Data Science Projects and other Projects
  • Project Management for Data Projects
  • Identification of Stakeholders
  • Data Science Projects Deliverables
  • Key Risks: Time, Place, Competition, Circumstances and Opportunities
  • Projects for Data Science
  • Methods and Models for selected Data Science Projects
  • Risks and expected outcomes for selected projects