Tentative schedule.  THIS MAY CHANGE.

 

Assignments in red (or marked with four asterisks - ****) are required and due on that day.

 

Date

Class Topic and Activity

Assignment(s) Due

Week 1

Wednesday, January 16, 2019

Intro to class

Intro slides

 

Introduction to Big Data Analytics

 

Data Analytics Lifecycle

 

Intro to lab environment

 

2015 Big Data “must have skills”

 

Ongoing Big Data Opportunities

 

Read Participant Guide:

Module 1

·         Lesson 1: Big Data and its characteristics (corresponding with slides 11-42)

·         Lesson 2: Business value from Big Data (43-58)

·         Lesson 3: Data scientist (59-69)

 

Get a free Google account (or use the one you already have).

 

For your web site, you MUST name your site NIU<znumber>.  For example, my site would be named NIUz035098.  You MUST also choose “Classic Sites”, not “new” (UNLESS you have a different web page, in which case please tell me here).

·         Here is Web page Video I to help you with your Web page

·         Web Page Help Video II

 

Begin Lab 1

 

Week 2

Wednesday, January 23, 2019

CLASS ONLINE ONLY DUE TO WEATHER – *** 6 items needed for success in 1-23-19's online class! ***

 

[Continue]

Read Participant Guide:

Module 2

·         Lessons: Data analytics lifecycle overview (72-127)

 

**** Register email, code, group, background. ****

(10 points off Participation for each week not completed)

 

**** Google Web Site.  10 points off Participation if not completed ****

 

**** Lab 1 ****

 

Week 3

Wednesday, January 30, 2019

CLASS ONLINE ONLY DUE TO WEATHER – *** 4 items needed for success in 1-30-19's online class! ***

 

Basic Data Analytics Methods Using R

 

**** Quiz 1:  Big Data Analytics and Data Analytics Lifecycle (covering the 4 Module 1 and Module 2 Participant Guide Lessons listed on schedule, Lab 1, and class discussions)

 

Read Participant Guide:

Module 3:

·         Lesson 1: Introduction to the R programming language (130-148)

 

**** Lab 2 ****

 

Week 4

Wednesday, February 6, 2019

No physical classroom meeting tonight.  Class on 2/6/19 is ONLINE ONLY.

*** 5 items needed for success in 2-6-19's online class! ***

 

Optional (short!) Readings:

Analytics trends for 2019

 

Looking back:  What experts said in 2017:

·         Top Ten Big Data Trends for 2017

 

What they said in 2016:

·         Top Eight Big Data Trends of 2016

 

Read Participant Guide:

Module 3:

·         Lesson 2: Analyzing and exploring data (149-165)

·         Lesson 3: Statistics for model building and evaluation (166-185)

 

**** Lab 3 ****

 

Week 5

Wednesday, February 13, 2019

Advanced analytics—theory and methods

**** Quiz 2:  Data Analytics Methods Using R (covering 3 Module 3 Participant Guide Lessons listed on schedule since Quiz 1, Labs 2 & 3, and class discussions)

 

Read Participant Guide:

Module 4

·         Lesson 1:  K-Means Clustering (192-204)

 

**** Lab 4 ****

 

Week 6

Wednesday, February 20, 2019

BH 300

 

Guest Speaker – Stephen Stoyan, Director of Business Analytics and Strategy at Abbott:  “How data and advanced analytical techniques are used to drive business value in marketing science and the supply chain”.

 

*** Attendance required ***

 

Class today is only from 7:00 to 8:30 p.m.  Please bring a web-enabled device.

 

Week 7

Wednesday, February 27, 2019

[Continue]

 

Read Participant Guide:

Module 4

·         Lesson 2:  Association Rules (205-225)

·         Lesson 3:  Linear Regression (226-243)

 

**** Lab 5 ****

 

**** Lab 6 ****

 

Week 8

Wednesday, March 6, 2019

[Continue]

 

Great Odds Ratio explanation

 

Handy Log Odds calculator

 

Best ROC curve explanation I’ve found (watch the video)

 

Here is my ROC Curve example in Excel

 

 

Read Participant Guide:

Module 5

·         Lesson 1:  Logistic Regression (244-259)

 

**** Lab 7 ****

 

 

(end) **** Quiz 3:  Clustering, Association Rules, Linear & Logistic Regression (covering 4 Participant Guide Lessons listed on schedule since Quiz 2, Labs 4-6, and class discussions)

(nothing specific on Lab 7, but otherwise Logistic Regression is eligible)

 

Wednesday, March 13, 2019

*Spring Break – No class*

 

 

Week 9

Wednesday, March 20, 2019

Advanced analytics—theory and methods

Read Participant Guide:

Module 6

·         Lesson 1:  Naïve Bayes (277-295)

·         Lesson 2:  Decision Trees (296-313)

 

**** Lab 8 ****

 

Week 10

Wednesday, March 27, 2019

[Continue]

 

InfoGain calculations for the Golf example in Lab 9

 

Check out this site for helpful, iterative regular expression building / learning.

 

Regex examples

Read Participant Guide:

Module 6

·         Lesson 3:  Time Series (314-335)

Module 7

·         Lesson 1:  Text Analysis (260-276)

 

**** Lab 9 ****

 

**** Lab 10 ****

 

Week 11

Wednesday, April 3, 2019

Advanced analytics—technology and tools

 

Hadoop from the source

 

Google Cloud’s use of MapReduce

 

Mahout from the source

 

Some Mahout clarification

 

Optional:  Want more?  Take a Lynda Hadoop course (FREE for NIU students… choose “Sign In” and then enter your NIU login)

 

**** Project decision deadline:  Groups must have declared project choice by this date (penalty for not doing so:  -5 points on Final Project)

 

 

**** Quiz 4:  Decision Trees, Naïve Bayes, Time Series Analysis, and Text Analytics (covering 4 Participant Guide Lessons listed on schedule since Quiz 3, Labs 8-10, and class discussions)

 

Read Participant Guide:

Module 8

·         Lesson 1: Introduction to advanced analytics—technology and tools (338-367)

·         Lesson 2:  Hadoop Ecosystem (368-386)

 

**** Lab 11 ****

 

Week 12

Wednesday, April 10, 2019

[Continue]

 

Nice SQL tutorial

 

Regular Expression info

 

Check out this site for helpful, iterative regular expression building / learning.

 

Regex examples

 

Good regex explanation with examples of escaping with a backslash

 

Read Participant Guide:

Module 8

·         Lesson 3:  In-database Analytics SQL Essentials (387-408)

·         Lesson 4: Advanced SQL and MADlib (409-433)

 

**** Lab 12 ****

 

Week 13

Wednesday, April 17, 2019

Project preparation.  Final project tips.

 

Note:  Your final projects will be the “Final Lab” in your Lab Guide.

**** Quiz 5:  MapReduce, Hadoop, and In-Database Analytics (covering 4 Participant Guide Lessons listed on schedule since Quiz 4, Labs 11 & 12, and class discussions)

 

Read Participant Guide:

Module 9

·         Lesson 1:  Putting it all together, project presentation tips, data visualization (437-491)

 

Week 14

Wednesday, April 24, 2019

Project Presentations

 

Presentation Guidelines

 

ROC guidelines

 

*** Please sign up here BY MAY 8 if you would like a voucher to take the full ($75) EMC Education Services DEA-7TT2 Data Science and Big Data Analytics Exam (this is NOT what you are taking for your Final Exam… that one is free) ***

 

Presentation Order:

Group 10: KY

Group 8:  CA

Group 6:  IL

Group 4:  IN

Break

Group 9:  MT

Group 3:  NY

Group 2:  WY

Group 5:  WI

Week 15

Wednesday, May 1, 2019

****Final Exam, Wednesday, May 1, 2019

 

The final will be a combination of the free 35-question EMC certification exam and 15 repeat questions from our quizzes this semester.  If you pass the EMC exam, you earn EMC Academic Associate recognition.  (NOTE:  The passing score was originally 70%, but THIS WAS LOWERED… ANY PASSING SCORE RECEIVES CERTIFICATION AND THEREFORE THE 5 POINT BONUS.)  If you receive this certification, you will also get a 5 point bonus added to your Final Exam score.  For example, if you score 25/35 on the EMC portion and 15/15 on the quiz repeat questions, you would have 40/50 = 80% on the Final Exam.  EXCEPT, since your 25/35 is good for certification, you would get a bump to 45/50 = 90% for the Final Exam.

 

Take a practice exam here.  Your organization is “Dell EMC Partner/Customer” and your Company Name is “Northern Illinois University”.

 

Thinking of taking the full certification exam, EMC Education Services DEA-7TT2 Data Science and Big Data Analytics Exam?

Sign up here (you'll need to create an account and get a voucher from me).

 

****Rate your group members!!  (If you don’t do this by Wednesday May 1 at midnight, you will lose 5 participation points).

 

**** Check the accuracy of all of your posted grades and report any errors!! 

Midnight Wednesday May 1 is closing time for any participation or grade changes for the course!!  NO changes will be made after this date.

 

 

Sorry, no one may take the Final Exam early or late.

 

If you cannot make the date due to a valid, documented reason (NOT “I’m not ready” / “I need to study more”, etc.), you may take an incomplete in the class, and then take the Final Exam when you return to school next semester, at which point your grade will also be changed.

 

You must tell me prior to the Final Exam if you cannot make the date.  If you simply do not show up to the Final Exam, you will receive a “0” for the Final Exam.

 

Week 16

Wednesday, May 8, 2019

Optional Final Exam Opportunity #2, Wednesday, May 8, 2019

 

You can improve your grade with a second certification exam attempt.  To calculate your Final Exam grade:  look up your number correct on the 15 repeat questions in the Grades section; add your number correct from the certification exam, which is your % × 35 (+5 bonus correct points if you PASSED); then multiply the total by 2 to get your score out of 100.  Enter it in the online grade calculator (Final Exam box) to calculate your Final Course Grade.  Don’t like it?  Come tonight and take the certification exam again (I will use the higher of your two attempts to calculate your final grade).
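
If you would rather check the arithmetic with a few lines of code, here is a minimal Python sketch of the calculation described above.  It is only an illustration, not the official grade calculator:  the function names are made up for this example, and rounding your % × 35 to a whole number of questions is an assumption.  No cap at 100 is applied, since none is stated above.

```python
def final_exam_score(quiz_correct: int, emc_percent: float, passed_emc: bool) -> int:
    """Return a Final Exam score out of 100, following the steps above.

    quiz_correct: number correct on the 15 repeat quiz questions
    emc_percent:  reported certification exam result, e.g. 71.43 for 71.43%
    passed_emc:   True if the certification exam was passed
    """
    # Assumption: the certification exam has 35 questions, so percent * 35
    # recovers the number correct; rounding to a whole question is a guess.
    emc_correct = round(emc_percent / 100 * 35)
    # 5 bonus "correct" points are added only for a passing certification score.
    bonus = 5 if passed_emc else 0
    # The exam is out of 50 points, so doubling gives a score out of 100.
    return (quiz_correct + emc_correct + bonus) * 2


def best_final_exam_score(first_attempt: int, second_attempt: int) -> int:
    """Return the higher of two attempts, as used for the final grade."""
    return max(first_attempt, second_attempt)


if __name__ == "__main__":
    # The Week 15 example: 25/35 (about 71.43%) on the EMC portion, 15/15 on repeats.
    print(final_exam_score(15, 71.43, passed_emc=True))
```

With the Week 15 example numbers (25/35 on the EMC portion, 15/15 on the repeat questions, certification passed), this gives 90, matching the 45/50 = 90% worked out there.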