Prediction Practice: Reference Solution
Weather Forecast: Linear Regression and Naive Bayes
🎯 Task Description
You are provided with a synthetic weather forecast dataset containing the following features:
- Temperature (°C)
- Humidity (%)
- WindSpeed (km/h)
- Pressure (hPa)
- RainToday (Yes/No)
- RainTomorrow (0 = No, 1 = Yes) — This is the target variable.
Your objectives are:
- Preprocess the data:
- Convert categorical variables to numeric.
- Split the dataset into training and testing sets.
- Apply feature scaling to numerical columns.
- Train two predictive models:
- Linear Regression
- Naive Bayes Classifier
- Evaluate the models:
- Make predictions on the test set.
- Compute evaluation metrics (MAE for the regression, accuracy for the classifier).
- Compare the performance.
Try to predict ‘Temperature’ from ‘Humidity’, ‘WindSpeed’ and ‘Pressure’
1. Preprocess the data
a. use pd.read_csv() to load the data into df
b. use df.sample() and df.drop() to randomly split the dataset into a training set (70%) and a testing set (30%)
c. define features (X) and target variable (y) for both training and test sets
d. standardization and normalization
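Standardization rescales each numeric column to zero mean and unit variance, which is exactly the arithmetic StandardScaler performs (note it uses the population standard deviation, NumPy's default). A minimal sketch on a toy array:

```python
import numpy as np

# Toy feature column: standardization subtracts the mean and
# divides by the standard deviation.
x = np.array([10.0, 20.0, 30.0, 40.0])
z = (x - x.mean()) / x.std()

print(z.mean())  # ~0.0
print(z.std())   # ~1.0
```

After scaling, every feature contributes on a comparable scale, which matters for models that are sensitive to feature magnitudes.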
import pandas as pd
from sklearn.preprocessing import StandardScaler, LabelEncoder
# Load the data and encode the categorical feature
df = pd.read_csv("weather_data.csv")
# RainToday is a Yes/No string column; encode it as 0/1 so it can be
# used as a numeric feature later (LabelEncoder maps No -> 0, Yes -> 1)
df['RainToday'] = LabelEncoder().fit_transform(df['RainToday'])

# Shuffle and split manually using sample()
df_train = df.sample(frac=0.7, random_state=42)
df_test = df.drop(df_train.index)

# Feature and target separation
X_train = df_train[['Humidity', 'WindSpeed', 'Pressure']]
y_train = df_train['Temperature']
X_test = df_test[['Humidity', 'WindSpeed', 'Pressure']]
y_test = df_test['Temperature']

# Feature scaling
scaler = StandardScaler()
# Use fit_transform() on the training set to learn and apply scaling.
X_train_scaled = scaler.fit_transform(X_train)
# Use transform() (only) on the test set to avoid data leakage.
X_test_scaled = scaler.transform(X_test)

2. Train Linear Regression
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X_train_scaled, y_train)

3. Predict the ‘Temperature’ for the test dataset
y_pred_lr = lr.predict(X_test_scaled)
print(y_pred_lr)

4. Evaluate the model by checking the distance between the predictions and the ground truth
from sklearn.metrics import mean_absolute_error
mae = mean_absolute_error(y_test, y_pred_lr)
print("MAE:", mae)

Try to predict whether it will rain (‘RainTomorrow’) from ‘Temperature’, ‘Humidity’, ‘WindSpeed’, ‘Pressure’ and ‘RainToday’
- 0 ⇒ No
- 1 ⇒ Yes
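‘RainToday’ is a Yes/No string column, so it has to be converted to numbers before it can be fed to the classifier. A minimal sketch of what LabelEncoder does on a toy list (it assigns codes in alphabetical order of the classes, so ‘No’ → 0 and ‘Yes’ → 1):

```python
from sklearn.preprocessing import LabelEncoder

# LabelEncoder assigns integer codes in alphabetical order of the classes
le = LabelEncoder()
codes = le.fit_transform(['Yes', 'No', 'Yes', 'No'])

print(list(le.classes_))  # ['No', 'Yes']
print(list(codes))        # [1, 0, 1, 0]
```

The same 0/1 convention used for the ‘RainTomorrow’ target therefore applies to the encoded ‘RainToday’ feature.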
1. Preprocessing
Define features (X) and target variable (y) for both training and test sets
Standardization and normalization
# Feature and target separation
X_train = df_train[['Temperature', 'Humidity', 'WindSpeed', 'Pressure', 'RainToday']]
y_train = df_train['RainTomorrow']
X_test = df_test[['Temperature', 'Humidity', 'WindSpeed', 'Pressure', 'RainToday']]
y_test = df_test['RainTomorrow']
# Feature scaling
scaler = StandardScaler()
# Use fit_transform() on the training set to learn and apply scaling.
X_train_scaled = scaler.fit_transform(X_train)
# Use transform() (only) on the test set to avoid data leakage.
X_test_scaled = scaler.transform(X_test)
print(y_test)

2. Train Naive Bayes
from sklearn.naive_bayes import GaussianNB
nb = GaussianNB()
nb.fit(X_train, y_train)  # Naive Bayes works well with raw features (no feature scaling needed)

3. Predict
y_pred_nb = nb.predict(X_test)
print(y_pred_nb)

4. Evaluate the model
from sklearn.metrics import accuracy_score, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
acc_nb = accuracy_score(y_test, y_pred_nb)
print("Naive Bayes Accuracy:", acc_nb)

5* Visualize the results using a confusion matrix (additional)
cm_nb = confusion_matrix(y_test, y_pred_nb)
sns.heatmap(cm_nb, annot=True, fmt="d", cmap="Greens")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
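The four cells of the heatmap can also be read off directly: rows are the actual class and columns the predicted class, so `ravel()` on the 2×2 matrix yields the counts in the order TN, FP, FN, TP. A small sketch using toy 0/1 labels in place of the real test set:

```python
from sklearn.metrics import confusion_matrix

# Toy labels standing in for y_test / y_pred_nb (0 = no rain, 1 = rain)
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# ravel() flattens the 2x2 matrix in row-major order: TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 1 1 1 2
```

From these four counts you can recover accuracy ((TN + TP) / total) as well as precision and recall, which are often more informative than accuracy when one class is rare.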