Final Practice 2

Chinese version: Final Practice 2

Task 2

Weather Forecast: Linear Regression

🎯 Task Description

You are provided with a synthetic weather forecast dataset containing the following features:

Temperature (°C)
Humidity (%)
WindSpeed (km/h)
Pressure (hPa)
RainToday (Yes/No)
RainTomorrow (0 = No, 1 = Yes): this is the target variable.

Your objectives are:

Preprocess the data:

Convert categorical variables to numeric.
Split the dataset into training and testing sets.
Apply feature scaling to numerical columns.

Train one predictive model:

Linear Regression

Evaluate the predictions:

Make predictions on the test set.
Calculate percentage error.

Try to predict ‘Temperature’ from ‘Humidity’, ‘WindSpeed’, and ‘Pressure’.

1. Preprocess the data

a. use pd.read_csv() to save the data in df

b. use df.sample() and df.drop() to randomly split the dataset into a training set (70%) and a testing set (30%)

c. define features (X) and target variable (y) for both training and test sets

e. standardize the features (fit the scaler on the training set, then apply it to the test set)

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Load the dataset
df = pd.read_csv("weather_data.csv")

# Shuffle and split manually using sample()
df_train = df.sample(frac=0.7, random_state=42)
df_test = df.drop(df_train.index)

# Feature and target separation
X_train = df_train[['Humidity', 'WindSpeed', 'Pressure']]
y_train = df_train['Temperature']
X_test = df_test[['Humidity', 'WindSpeed', 'Pressure']]
y_test = df_test['Temperature']

# Feature scaling
scaler = StandardScaler()
# Use fit_transform() on the training set to learn and apply the scaling.
X_train_scaled = scaler.fit_transform(X_train)
# Use transform() only on the test set to avoid data leakage.
X_test_scaled = scaler.transform(X_test)
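The task description also asks for categorical variables to be converted to numeric, although the three features used here are already numeric. If ‘RainToday’ were needed as a feature, one possible approach is a simple label mapping. The sketch below uses a small hypothetical DataFrame (the rows are made up for illustration, not taken from weather_data.csv).

```python
import pandas as pd

# Hypothetical rows that reuse the column names from the task description
df_demo = pd.DataFrame({
    "Humidity": [80, 45, 90, 60],
    "RainToday": ["Yes", "No", "Yes", "No"],
})

# Map the two labels to integers; any unexpected label would become NaN
df_demo["RainToday"] = df_demo["RainToday"].map({"No": 0, "Yes": 1})

print(df_demo)
```

An explicit mapping keeps the meaning of 0 and 1 visible in the code, which matches how ‘RainTomorrow’ is already encoded in the dataset.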

2. Train Linear Regression

from sklearn.linear_model import LinearRegression
 
lr = LinearRegression()
lr.fit(X_train_scaled, y_train)
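After fitting, it can help to look at what the model learned. Because the features are standardized, the coefficients are on a comparable scale, and the intercept equals the mean of the training target. The sketch below illustrates this on synthetic data; the generated values and the assumed relationship between the variables are hypothetical stand-ins for weather_data.csv.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data standing in for the real dataset
rng = np.random.default_rng(0)
n = 200
humidity = rng.uniform(20, 100, n)
wind_speed = rng.uniform(0, 40, n)
pressure = rng.uniform(980, 1040, n)
# Assumed relationship, chosen only for illustration
temperature = 35 - 0.2 * humidity + 0.05 * wind_speed + rng.normal(0, 1, n)

X_demo = np.column_stack([humidity, wind_speed, pressure])
X_demo_scaled = StandardScaler().fit_transform(X_demo)

model = LinearRegression().fit(X_demo_scaled, temperature)

# One coefficient per feature, in the same order as the columns
feature_names = ["Humidity", "WindSpeed", "Pressure"]
for name, coef in zip(feature_names, model.coef_):
    print(f"{name}: {coef:.3f}")
print(f"Intercept: {model.intercept_:.3f}")
```

The same two attributes, coef_ and intercept_, are available on the lr object fitted above.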

3. Predict the ‘Temperature’ for the test dataset

y_pred_lr = lr.predict(X_test_scaled)
print(y_pred_lr)

4. Evaluate the predictions, by calculating the percentage error between the actual values and the predicted values.

import numpy as np
 
epsilon = 1e-8  # Small constant to avoid division by zero when an actual value is 0
percentage_error = np.abs(y_test - y_pred_lr) / (np.abs(y_test) + epsilon) * 100
print("Percentage error for each prediction (%): \n", percentage_error)
 
mean_percentage_error = np.mean(percentage_error)
print("Mean percentage error (%): ", mean_percentage_error)
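Percentage error is measured relative to the actual value, so it can become very large when a temperature is close to 0 °C. As a complement, absolute metrics such as MAE, RMSE, and R² can be computed with sklearn.metrics. The sketch below uses three made-up values rather than the real test set.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical actual and predicted temperatures (°C), for illustration only
y_true_demo = np.array([20.0, 25.0, 30.0])
y_pred_demo = np.array([22.0, 24.0, 27.0])

mae = mean_absolute_error(y_true_demo, y_pred_demo)
# RMSE is the square root of the mean squared error
rmse = np.sqrt(mean_squared_error(y_true_demo, y_pred_demo))
r2 = r2_score(y_true_demo, y_pred_demo)

print(f"MAE: {mae:.2f}")    # 2.00
print(f"RMSE: {rmse:.2f}")  # 2.16
print(f"R^2: {r2:.2f}")     # 0.72
```

To apply this to the exercise, y_test and y_pred_lr would take the place of the two demo arrays.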
