Course Objectives
By the end of the course, participants will be able to:
Understand and design data for efficient analysis
Compare solutions related to Data Analysis vs. Machine Learning
Differentiate between predictive models and pattern-finding models
Decide between “proprietary” and “open source” technologies
Outline the modern data flow from sources to reports
Manage Data Science projects with project management best practices
Target Audience
This course is for specialists who want to become familiar with the components of data science and with how they can be applied together to solve data, business and research problems. It is particularly suited to managers and professionals working in marketing, CRM, research, manufacturing and quality control, as well as app developers and IT analysts, in almost any sector, such as banking, insurance, retail, government, manufacturing, healthcare, telecom, transport and distribution.
Target Competencies
Business data analysis
Data analytic validity
Judging AI algorithms
Evaluating IoT platforms
Comparing big data results
Course Outline
Data Analysis and Visualization
  Types of data and data visualization
  Evaluating the representative quality of data
  Using descriptive statistics to summarize data
  Profiling two or more groups with statistical tests
  Visualizing multiple analytics with powerful smart charts
  Simple Linear Regression
  Simple Logistic Regression
  Managing and removing outliers
Machine Learning – Supervised
  Multiple linear regressions
  Multiple logistic regressions
  Discriminant analysis: Functions and probabilistic models
  Decision trees: CART – CHAID and Random Forests
  Support vector machines
  K-nearest neighbors
  Naïve Bayes
  Neural networks, deep learning and AI possibilities
Business Intelligence – Forecasting – R vs. Python
  Business Intelligence
    Databases: collection and sources
    ETL
    Storage: Data warehouses, data marts and data lakes
    Analytics: BI Tools, OLAP, Dashboards, etc.
  Forecasting
    Trends
    Exponential smoothing: Additive and multiplicative methods
    Time Series: Additive and multiplicative methods
    ARIMA models
  R vs. Python
    Statistical Tests
    Machine Learning algorithms
Machine Learning: Unsupervised
  Principal Component Analysis
  Clustering: Hierarchical and K-Means
  Simple correspondence analysis
  Multi-dimensional scaling
  Quadrant analysis
PMP for Data Scientists
  PMP
    Integration, Cost, Scope
    Time, Cost, Quality, Communication
    Risk, Procurement and Stakeholders
IoT and Big Data Ecosystem
  IoT essentials - M2M and Embedded Systems
  Basic IoT protocols
  Big Data: “where” and “when”
  Big Data distributed files with HDFS
  MapReduce vs. Spark Data Sharing
  Big Data Ecosystem bird's eye view: Spark, MongoDB, Cassandra, Flume, Cloudera, Oozie, Mahout