Data Science and Big Data Analytics

Data science is a “concept to unify statistics, data analysis and their related methods” to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, in particular from the subdomains of machine learning, classification, cluster analysis, data mining, databases, and visualization.

This course is for those new to data science and interested in understanding why the Big Data Era has come to be. It is for those who want to become conversant with the terminology and the core concepts behind big data problems, applications, and systems. It is for those who want to start thinking about how Big Data might be useful in their business or career.

It provides an introduction to Hadoop, one of the most common frameworks, which has made big data analysis easier and more accessible and has increased the potential for data to transform our world.


1. Introduction to Big Data Analytics

* Big Data Overview
* State of the Practice in Analytics
* The Data Scientist
* Big Data Analytics in Industry Verticals

2. Data Analytics Lifecycle

* Discovery
* Data Preparation
* Model Planning
* Model Building
* Communicating Results
* Operationalizing

3. Review of Basic Data Analytic Methods Using R

* Using R to Look at Data: Introduction to R
* Analyzing and Exploring the Data
* Statistics for Model Building and Evaluation

4. Advanced Analytics – Theory and Methods

* K-Means Clustering
* Association Rules
* Linear Regression
* Logistic Regression
* Naïve Bayesian Classifier
* Decision Trees
* Time Series Analysis
* Text Analysis

5. Advanced Analytics – Technologies and Tools

* Analytics for Unstructured Data – MapReduce and Hadoop
* The Hadoop Ecosystem

* In-database Analytics – SQL Essentials
* Advanced SQL and MADlib for In-database Analytics

6. The Endgame, or Putting It All Together

* Operationalizing an Analytics Project
* Creating the Final Deliverables
* Data Visualization Techniques
* Final Lab Exercise on Big Data Analytics




