DTS002TC Essentials of Big Data
Chinese version: DTS002TC 大数据基础 (Essentials of Big Data)
This course introduces the essentials of big data, covering Python programming fundamentals, numerical computing with NumPy, data manipulation with Pandas, data visualization with Matplotlib, and introductory machine learning with Scikit-Learn.
Course Overview
DTS002TC is an introductory course on big data essentials at Xi’an Jiaotong-Liverpool University. It covers both the theoretical foundations of big data and practical programming skills using Python and its data science ecosystem.
Lectures
| # | Topic | Materials |
|---|---|---|
| 1 | General Introduction to Big Data | Lecture 1 |
| 2 | Technical Aspects of Big Data | Lecture 2 |
| 2+ | Intro to GPU | GPU Intro |
| 3a | Data Detectives - CIKW | CIKW |
| 3b | Storage and Treatment of Big Data | Lecture 3 |
| 4 | Analysis of Big Data | Lecture 4 |
| 5 | Computer Vision and Big Data Analysis | Lecture 5 |
| 5+ | Coursework 1 (SJL) | CW1 |
| Day 2 | Introduction to Python | Python Intro |
| Day 2 | Matplotlib and Machine Learning | Matplotlib & ML |
Labs
Lab 1: Python Fundamentals
- Introduction to Python
- How to Run Python Code
- Basic Python Syntax
- Semantics & Variables
- Semantics & Operators
- Jupyter Lab
- Week 1 Guide
Lab 2: Python Core Concepts
- Built-in Scalar Types
- Built-in Data Structures
- Control Flow Statements
- Defining Functions
- Errors and Exceptions
- Practice 1 | Solution
- Practice 2 | Solution
Lab 3: NumPy Basics
Lab 4: NumPy Advanced Arrays
- More on NumPy Arrays
- Fancy Indexing
- Structured Data in NumPy
- Practice 2 | Solution
- Practice 3 Solution
Lab 5: NumPy Computation
Lab 6: Boolean Arrays & Pandas
Lab 7: Data Visualization
- Introduction to Matplotlib
- Simple Line Plots
- Simple Scatter Plots
- Histograms and Binnings
- Customizing Legends
- Data Analysis & Visualization Practice | Solution
- Pandas CSV Practice | Solution
Lab 8: Machine Learning with Scikit-Learn
- Introducing Scikit-Learn
- Naive Bayes
- Linear Regression
- Data Handling Practice Solution
- Prediction Practice Solution
Lab 9-10: Final Practice
Review
Sources
All materials are sourced from the raw course files in raw/DTS002/.