prospen.co.za

ProspenAfrica | Training and Consulting Services Provider

Big Data Analytics with Python Training

Dates: Available on request
Locations: Johannesburg, South Africa
Platform: Available In-Class

Price: Available on request

Course Introduction

Python stands out as a highly adaptable and robust open-source language, known for its ease of learning and powerful libraries for data manipulation and analysis. It has been used extensively in scientific computing and in quantitative fields such as physics, finance, oil and gas, and signal processing.

The Big Data Analytics with Python course offers a comprehensive overview of data analysis techniques using Python. Data Scientist is one of the most sought-after professions today, and Python is a core skill for the role. This course equips you with the essential tools and knowledge required for Data Analytics with Python, helping you to become proficient in Python programming concepts.

Participants will gain expertise in Python programming, focusing on data analytics and Machine Learning techniques. This training course will provide the practical experience and skills needed for predictive modelling, Machine Learning, and other advanced analytics tasks using Python.

Course Objectives

Upon successful completion, participants will be able to:

  • Programmatically download and analyse data

  • Manage various types of data, such as ordinal and categorical, and apply suitable encoding

  • Perform data visualization

  • Execute step-by-step data analysis

  • Understand the roles of a Machine Learning Engineer

  • Describe Machine Learning

  • Work with real-time data

  • Use tools and techniques for predictive modelling

  • Discuss Machine Learning algorithms and their implementation

  • Validate Machine Learning algorithms

  • Explain Time Series and related concepts

  • Perform Text Mining and Sentiment analysis

Organizational Benefits

Companies sending employees to this training can benefit in the following ways:

  • Save on operational costs, as Python is free to use under its OSI-approved open-source license

  • Equip employees with knowledge in data analysis, Machine Learning, data visualization, web scraping, and Natural Language Processing to improve organizational functions

  • Offer flexible and cost-effective professional development opportunities

  • Analyse case studies and apply successful techniques within the organization

  • Understand the principles and practices of Big Data Analytics

Who should attend?

This Big Data Analytics with Python Training course is suitable for:

  • Analytics Team Managers

  • Business Analysts interested in Machine Learning concepts

  • Information Architects seeking proficiency in Predictive Analytics

  • Programmers, Developers, Technical Leads, Architects

  • Individuals aspiring to be Machine Learning Engineers

  • Professionals aiming to develop automatic predictive models using data

Training Methodology

Our diverse instructional approaches ensure effective learning:

– Lectures & Presentations: Engage with expert-driven, stimulating content.
– Course Material: Access well-crafted supporting resources.
– Group Work: Collaborate on discussions and case studies for practical insights.
– Workshops & Role-Play: Participate in immersive, scenario-based activities.
– Practical Application: Focus on applying theoretical knowledge in real situations.
– Post-Training Support: Receive extensive support after training for skill implementation.

Training Outline

MODULE 1: DATA SCIENCE OVERVIEW

  • Introduction to Data Science

  • Different Sectors Using Data Science

  • Purpose and Components of Python

MODULE 2: DATA ANALYTICS OVERVIEW

  • Data Analytics Process

  • Knowledge Check

  • Exploratory Data Analysis (EDA)

    • EDA – Quantitative Technique

    • EDA – Graphical Technique

  • Data Analytics Conclusion or Predictions

  • Data Analytics Communication

  • Data Types for Plotting

MODULE 3: STATISTICAL ANALYSIS AND BUSINESS APPLICATIONS

  • Introduction to Statistics

  • Statistical and Non-statistical Analysis

  • Major Categories of Statistics

  • Statistical Analysis Considerations

  • Population and Sample

  • Statistical Analysis Process

  • Data Distribution

  • Dispersion

  • Histogram

  • Correlation and Inferential Statistics

MODULE 4: PYTHON ENVIRONMENT SETUP AND ESSENTIALS

  • Anaconda

  • Installation of Anaconda Python Distribution

  • Data Types with Python

  • Basic Operators and Functions

MODULE 5: MATHEMATICAL COMPUTING WITH PYTHON (NUMPY)

  • Introduction to NumPy

  • Activity: Sequence it Right

  • Creating and Printing an ndarray

  • Class and Attributes of ndarray

  • Basic Operations

  • Copy and Views

  • Mathematical Functions of NumPy

  • Evaluate datasets containing GDPs of different countries

  • Evaluate datasets of Summer Olympics 2012
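
As an illustration of the "Copy and Views" topic in this module, the following is a minimal sketch rather than official course material. The array name `gdp` and its values are hypothetical and chosen only for demonstration. It shows that slicing an ndarray gives a view that shares memory with the original, while `.copy()` produces independent data.

```python
import numpy as np

# Hypothetical GDP figures (illustrative values only, not real data)
gdp = np.array([1.5, 2.0, 3.5, 4.0])

view = gdp[1:3]          # basic slicing returns a view sharing memory with gdp
copy = gdp[1:3].copy()   # .copy() allocates independent storage

view[0] = 99.0           # writes through to gdp[1]
copy[1] = -1.0           # leaves gdp[2] untouched

shares_memory_view = np.shares_memory(gdp, view)
shares_memory_copy = np.shares_memory(gdp, copy)

print(gdp)
print(shares_memory_view, shares_memory_copy)
```

Checking `np.shares_memory` is a simple way to confirm whether an operation returned a view or a copy before modifying the result.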

MODULE 6: SCIENTIFIC COMPUTING WITH PYTHON (SCIPY)

  • Introduction to SciPy

  • SciPy Sub Packages – Integration and Optimization

  • Demo: Calculate Eigenvalues and Eigenvectors

  • Use SciPy to solve a linear algebra problem

  • Use SciPy to define 20 random variables for random values
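
The eigenvalue demo and the linear algebra exercise listed in this module could look roughly like the sketch below. It is an assumption about the kind of exercise involved, not the course's own solution; the matrix `A` and vector `b` are made-up inputs.

```python
import numpy as np
from scipy import linalg

# Hypothetical 2x2 symmetric matrix and right-hand side (illustrative only)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
b = np.array([3.0, 3.0])

# Eigen-decomposition: each column of `eigenvectors` pairs with one eigenvalue
eigenvalues, eigenvectors = linalg.eig(A)

# Solve the linear system A @ x = b
x = linalg.solve(A, b)

print(eigenvalues.real)
print(x)
```

Note that `linalg.eig` returns eigenvalues as complex numbers even when they are real, which is why the sketch prints the `.real` part.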

MODULE 7: DATA MANIPULATION WITH PANDAS

  • Introduction to Pandas

  • Understanding DataFrame

  • View and Select Data Demo

  • Handling Missing Values

  • Data Operations

  • File Read and Write Support

  • Pandas SQL Operation

  • Analyse the Federal Aviation Administration (FAA) dataset using Pandas

  • Analyse a given CSV dataset for the fire department

MODULE 8: MACHINE LEARNING WITH SCIKIT-LEARN

  • Machine Learning Approach

  • Understand datasets and extract features

  • Identifying problem type and learning model

  • Train, test, and optimize the model

  • Supervised Learning Models

    • Linear Regression

    • Logistic Regression

  • Unsupervised Learning Models

  • Pipeline

  • Model Persistence and Evaluation

  • Analyse a dataset to identify features and response labels
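
To make the "Pipeline" and "Train, test, and optimize the model" topics concrete, here is a minimal sketch using scikit-learn. The synthetic dataset, the step names `scale` and `clf`, and the split ratio are all assumptions made for illustration; the course may use different data and models.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real dataset (features X, response labels y)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out a quarter of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Chain preprocessing and the estimator so both are fitted on training data only
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])
model.fit(X_train, y_train)

# Mean accuracy on the held-out test set
accuracy = model.score(X_test, y_test)
print(accuracy)
```

Wrapping the scaler and classifier in one pipeline keeps the test data out of the scaling step, which avoids a common source of overly optimistic evaluation.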

MODULE 9: NATURAL LANGUAGE PROCESSING WITH SCIKIT-LEARN

  • NLP Overview

  • NLP Applications

  • NLP Libraries – Scikit-learn

  • Extraction Considerations

  • Scikit-learn – Model Training and Grid Search

  • Analyse a given spam collection dataset

  • Analyse a sentiment dataset using NLP

MODULE 10: DATA VISUALIZATION IN PYTHON USING MATPLOTLIB

  • Introduction to Data Visualization

  • Line Properties

  • (x, y) Plot and Subplots

  • Types of Plots

  • Analyse the “auto mpg data” and draw a pair plot

  • Draw a pie chart to visualize a dataset

MODULE 11: WEB SCRAPING WITH BEAUTIFUL SOUP

  • Web Scraping and Parsing

  • Knowledge Check

  • Understanding and Searching the Tree

  • Navigating options

  • Demo: Navigating a Tree

  • Modifying the Tree

  • Parsing and Printing the Document

  • Scrape a Simplilearn web page to perform tasks

MODULE 12: INTEGRATION WITH HADOOP MAP-REDUCE AND SPARK

  • Why Big Data Solutions are Provided for Python

  • Big Data and Hadoop

  • Hadoop Core Components

  • Python Integration with HDFS using Hadoop Streaming

  • Using Hadoop Streaming for Calculating Word Count

  • Python Integration with Spark using PySpark

  • Using PySpark to Determine Word Count

  • Determine the word count for an Amazon dataset
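
The word count exercises in this module follow the map-reduce pattern, which can be sketched in plain Python without a cluster. The example below is only an approximation of how a Hadoop Streaming job is structured: the function names `mapper`, `reducer`, and `word_count` and the sample lines are hypothetical, and a real job would read from standard input and run the two stages as separate scripts.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated 'word<TAB>1' record per word, as a streaming mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts of adjacent records that share the same word."""
    pairs = (record.split("\t") for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(int(count) for _, count in group)


def word_count(lines):
    # sorted() stands in for the shuffle-and-sort phase between map and reduce
    return dict(reducer(sorted(mapper(lines))))


# Made-up sample input (not the dataset used in the course)
sample = ["big data with python", "python for big data analytics"]
counts = word_count(sample)
print(counts)
```

The same logic carries over to PySpark, where the map and reduce stages are typically expressed as transformations on a distributed collection instead of generator functions.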

Related Courses

Developing and Implementing Electronic Document and Records Management Systems
Dates: 19 - 21 Jul | 02 - 04 Sep | 21 - 23 Oct | 26 - 28 Nov 2024

EDRMS Configuration & Implementation: Using SharePoint as A Solution
Dates: 19 - 23 Aug | 28 Oct - 01 Nov 2024
5 Day Training

Intermediate Archives and Records Management
Dates: 07 - 11 Sep | 04 - 08 Nov 2024
5 Day Training

Power BI: Visualization and Dashboards
Dates: 14 - 16 Aug | 25 - 27 Sep | 29 - 31 Oct 2024
3 Day Training
