AITC Wiki

pandas

Chinese version: Pandas 简介 (Introduction to Pandas)

Pandas is a Python library for data processing and analysis. It is conventionally imported as pd:

import pandas as pd

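The two most important data structures are Series and DataFrame.

A Series is a one-dimensional labeled array. A DataFrame is a two-dimensional table, similar to a database table or an Excel spreadsheet.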
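
To make the relationship between the two structures concrete, here is a minimal sketch showing that a single column taken from a DataFrame is itself a Series. The variable names `ages`, `people`, and `age_column` are made up for this illustration.

```python
import pandas as pd

# A Series is a one-dimensional labeled array (illustrative values)
ages = pd.Series([25, 30, 35], name='Age')

# A DataFrame is a two-dimensional table; each of its columns is a Series
people = pd.DataFrame({'Name': ['Google', 'Runoob', 'Taobao'], 'Age': ages})

# Selecting a single column gives back a Series
age_column = people['Age']
print(type(age_column).__name__)

# shape is (number of rows, number of columns)
print(people.shape)
```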

To create a DataFrame from a dictionary of columns:

data = {'Name': ['Google', 'Runoob', 'Taobao'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
 
print(df)

# The same constructor also accepts the dictionary inline
myvar = pd.DataFrame({
    'sites': ["Google", "Runoob", "Wiki"],
    'number': [1, 2, 3]
})
print(myvar)

Alternatively, create individual Series and combine them into a DataFrame.

# Create two Series
series_apples = pd.Series([1, 3, 7, 4])
series_bananas = pd.Series([2, 6, 3, 5])

# Combine the two Series into a DataFrame and name the columns
df = pd.DataFrame({'Apples': series_apples, 'Bananas': series_bananas})
 
print(df)
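
When Series are combined this way, pandas matches their values by index label rather than by position. The sketch below illustrates this with two hypothetical Series whose labels only partly overlap. The names `apples`, `bananas`, and `fruit` and the labels are assumptions chosen for the example.

```python
import pandas as pd

# Two Series with partly different index labels (illustrative data)
apples = pd.Series([1, 3], index=['mon', 'tue'])
bananas = pd.Series([2, 6], index=['tue', 'wed'])

# The resulting DataFrame has one row per label that appears in either Series;
# a label missing from one Series produces NaN in that column
fruit = pd.DataFrame({'Apples': apples, 'Bananas': bananas})

print(fruit)
```

The Series in the example above this one all use the default index 0 to 3, so their rows line up one to one.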

Create a DataFrame from a list of lists

data = [['Google', 10], ['Runoob', 12], ['Wiki', 13]]
 
# create DataFrame
df = pd.DataFrame(data, columns=['Site', 'Age'])
 
# use astype to set the data type of each column
df['Site'] = df['Site'].astype(str)
df['Age'] = df['Age'].astype(float)
 
print(df)

Create a DataFrame from a NumPy array

import numpy as np

# create an array
ndarray_data = np.array([
 ['Google', 10],
 ['Runoob', 12],
 ['Wiki', 13]
])
 
# turn the array into a DataFrame (both columns hold strings, since a NumPy array has a single dtype)
df = pd.DataFrame(ndarray_data, columns=['Site', 'Age'])
 
print(df)
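
Because a NumPy array stores a single dtype, mixing text and numbers as above turns every element into a string, so the 'Age' column cannot be used for arithmetic until it is converted. A minimal sketch of one way to convert it, using astype:

```python
import numpy as np
import pandas as pd

# Mixing text and numbers makes NumPy store every element as a string
ndarray_data = np.array([
    ['Google', 10],
    ['Runoob', 12],
    ['Wiki', 13]
])
print(ndarray_data.dtype.kind)  # 'U' means a Unicode string dtype

df = pd.DataFrame(ndarray_data, columns=['Site', 'Age'])

# 'Age' holds the strings '10', '12', '13'; convert them to integers
df['Age'] = df['Age'].astype(int)

print(df.dtypes)
print(df['Age'].sum())
```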

Read data from a CSV file as a DataFrame

import pandas as pd
df = pd.read_csv('president_heights.csv')
print(df)
# to check the shape of a DataFrame
df.shape
# to get the index of a DataFrame
df.index
# Display information about the data
df.info()
# Display basic statistical information
df.describe()
# Select a single column (returns a Series)
df['name']
# Select several columns (returns a DataFrame)
print(df[['height(cm)', 'order']])
# Display the first five rows of data
df.head()
# Display the last five rows of data
df.tail()
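# Select the first two rows by position (the end of the slice is excluded)
print(df.iloc[0:2])
# Select the first two rows and the first two columns by position
print(df.iloc[0:2, 0:2])
# Select rows labeled 0 to 1 (both ends included) and two columns by name
print(df.loc[0:1, ['name', 'order']])
# Keep only the rows where the height is greater than 180 cm
print(df[df['height(cm)'] > 180])
# Same filter, but return only the name and order columns
print(df.loc[df['height(cm)'] > 180, ['name', 'order']])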
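
The selection calls above can be tried without the CSV file. The sketch below reads a small in-memory table instead. The rows are invented placeholders, not the contents of president_heights.csv, and only the column names are taken from the code above.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file: invented rows, column names as used above
csv_text = """order,name,height(cm)
1,Person A,189
2,Person B,170
3,Person C,183
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# iloc slices by position and excludes the end: rows 0 and 1
first_two_by_position = df.iloc[0:2]

# loc slices by label and includes the end: labels 0 and 1
first_two_by_label = df.loc[0:1, ['name', 'order']]

# Boolean mask: keep the rows whose height exceeds 180 cm
tall = df.loc[df['height(cm)'] > 180, ['name', 'order']]

print(df.shape)
print(tall)
```

With the default integer index, both slices return the same two rows, although iloc[0:2] stops before position 2 while loc[0:1] includes label 1.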