This course introduces programming techniques for data analytics. Topics include programming languages, relevant software packages, good programming practices, linear algebra for data analytics, numerical computing, and four to five machine learning algorithms that serve as running problems. After completing the course, students will be able to implement a data analytics pipeline (data collection, data retrieval, data analysis, data visualization) and several "handy" machine learning algorithms.
We will use Piazza for discussion (e.g., homework, project). Post your questions there, and the teaching staff and your fellow classmates will be able to help answer them quickly. You can also use Piazza to find project teammates.
T-Square will be used only for submitting assignments and projects.
Role | Name | Office hours |
---|---|---|
Instructor | Da Kuang | Thu, 2-3pm, Klaus 1315 |
Instructor | Polo Chau | Thu, 4-5pm, Klaus 1315 |
TA | Lianxiao (Shawn) Qiu | Mon, 1-2pm, Klaus 2108 |
Date | Topic | Materials (Wed, Fri) | Events |
---|---|---|---|
Aug 20, 22 | * Course introduction * Course survey * Introduction to Python and its data structures | Slides, Slides | |
Aug 27, 29 | * Python exercises Q&A * Data collection | Slides | |
Sep 3, 5 | * Data collection (cont'd) | Slides, Slides | HW1 out (Wed) |
Sep 10, 12 | * Charting/Visualization | R resources (Link 1) (Link 2) (Link 3), Slides | |
Sep 17, 19 | * Data storage and retrieval in sqlite * Basic linear algebra overview | Slides, Notes | HW1 due (Fri) |
Sep 24, 26 | * Dense and sparse matrices (including Numpy) * Good programming practices | Slides | |
Oct 1, 3 | * Basic linear algebra overview | Notes, Notes | HW2 out (Mon) |
Oct 8, 10 | * Linear regression | Notes, Scripts | HW2 due (Fri), HW3 out (Sat) |
Oct 15, 17 | * Logistic regression | Slides, Scripts | |
Oct 22, 24 | * Computer architecture overview * Vectorization in Numpy and R | Slides, Scripts | HW3 due (Fri/Sat) |
Oct 29, 31 | * K-means clustering * Project proposal presentations | Slides | Project proposal due (Thu) |
Nov 5, 7 | * K-means clustering: Case studies * Efficient implementation of K-means | Scripts (README) | |
Nov 12, 14 | * Numerical software stacks * Singular value decomposition (SVD) | Slides, Notes | |
Nov 19, 21 | * SVD, eigenvalue decomposition (EVD), and PCA * Computing SVD, EVD and PCA | Notes | Progress report due (Wed) |
Nov 26, 28 | (Thanksgiving holiday) | | |
Dec 3, 5 | * Latest research; popular topics * Final project presentations | Slides | Final report due (Thu) |
Prof. Le Song - Introduction to Computational Data Analysis - Spring 2014