Course Outline

Week 1 Big Data concepts

  • The four Vs (Velocity, Volume, Variety, Veracity): definitions
  • Limits to traditional data processing capacity
  • Distributed Processing
  • Statistical Analysis
  • Machine Learning Analysis Types
  • Data Visualization
  • Distributed Processing (e.g. map-reduce)
  • Introduction to the languages used
  • R language crash course
  • Python crash course

Weeks 2&3 Performing Data Analysis

  • Statistical Analysis
  • Descriptive Statistics in Big Data sets (e.g. calculating mean)
  • Inferential Statistics (estimating)
  • Forecasting with Correlation and Regression models
  • Time Series analysis
  • Basics of Machine Learning
  • Supervised vs unsupervised learning
  • Classification and clustering
  • Estimating the cost of specific methods
  • Filter

Week 4 Natural Language Processing

  • Processing text
  • Understanding the meaning of text
  • Automatic text generation
  • Sentiment/Topic Analysis
  • Computer Vision

Weeks 5&6 Tooling concepts

  • Data storage solutions (SQL, NoSQL, hierarchical, object-oriented, document-oriented)
  • (MySQL, Cassandra, MongoDB, Elasticsearch, HDFS, etc.)
  • Choosing the right solution for the problem
  • Distributed Processing
  • Spark
  • Machine Learning with Spark (MLlib)
  • Spark SQL
  • Scalability
  • Public cloud (AWS, Google, etc.)
  • Private cloud (OpenStack, Cloud Foundry)
  • Autoscaling

Week 7 Soft Skills

  • Advisory & Leadership Skills
  • Making an impact: data-driven storytelling
  • Understanding your audience
  • Effective data presentation - getting your message across
  • Influencing effectively and leading change
  • Handling difficult situations

Exam

  • End-of-programme graduation exam

Requirements

Participants should have a good grounding in maths, to at least high-school level.

Programming skills are not required, but any programming experience will be useful.

Participants will be assessed and interviewed prior to participation in this training programme.

245 hours
