pandas
Chinese version: Introduction to Pandas
Pandas: For data processing and analysis
import pandas as pd
Most important: Series & DataFrame
A DataFrame is like a table in a database, or an Excel spreadsheet
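Series is named above but not shown on its own, so here is a minimal sketch of how it relates to a DataFrame. The variable names (`ages`, `column`) and the choice of site names as index labels are assumptions made for illustration only.

```python
import pandas as pd

# A Series is a one-dimensional array of values with an index of labels
ages = pd.Series([25, 30, 35], index=['Google', 'Runoob', 'Taobao'], name='Age')

print(ages['Runoob'])         # look up one value by its label
print(ages.index.tolist())    # the labels themselves

# Selecting a single column of a DataFrame gives back a Series
df = pd.DataFrame({'Age': ages})
column = df['Age']
print(type(column).__name__)
```

In this sense a DataFrame can be read as a set of Series that share one index, one Series per column.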
Create a DataFrame from a dictionary
data = {'Name': ['Google', 'Runoob', 'Taobao'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
print(df)
myvar = pd.DataFrame({
    'sites': ["Google", "Runoob", "Wiki"],
    'number': [1, 2, 3]
})
print(myvar)
Or you may create the Series and combine them as a DataFrame.
# Create two series
series_apples = pd.Series([1, 3, 7, 4])
series_bananas = pd.Series([2, 6, 3, 5])
# Put the two Series together to get a DataFrame, naming the columns
df = pd.DataFrame({ 'Apples': series_apples, 'Bananas': series_bananas })
print(df)
Create a DataFrame from a list
data = [['Google', 10], ['Runoob', 12], ['Wiki', 13]]
# create DataFrame
df = pd.DataFrame(data, columns=['Site', 'Age'])
# use astype to set the data type of each column
df['Site'] = df['Site'].astype(str)
df['Age'] = df['Age'].astype(float)
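print(df)
import numpy as np
# create an array; a NumPy array holds a single dtype,
# so the mixed values here are all stored as strings
ndarray_data = np.array([
    ['Google', 10],
    ['Runoob', 12],
    ['Wiki', 13]
])
# turn the array into a DataFrame
# (the 'Age' column holds strings; use astype to convert it if numbers are needed)
df = pd.DataFrame(ndarray_data, columns=['Site', 'Age'])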
print(df)
Read data from a CSV file as a DataFrame
import pandas as pd
df = pd.read_csv('president_heights.csv')
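print(df)
# Check the shape of a DataFrame as (rows, columns)
df.shape
# Get the index of a DataFrame
df.index
# Display information about the data
df.info()
# Display basic statistical information
df.describe()
# Select a single column (returns a Series)
df['name']
# Select several columns (returns a DataFrame)
print(df[['height(cm)', 'order']])
# Display the first five rows of data
df.head()
# Display the last five rows of data
df.tail()
# Select the first two rows by position
print(df.iloc[0:2])
# Select the first two rows and first two columns by position
print(df.iloc[0:2, 0:2])
# Select rows labeled 0 through 1 (inclusive) and the named columns
print(df.loc[0:1, ['name', 'order']])
# Keep only the rows where the height is above 180 cm
print(df[df['height(cm)'] > 180])
# Same filter, keeping only the name and order columns
print(df.loc[df['height(cm)'] > 180, ['name', 'order']])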
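The selection calls above differ in one way that is easy to miss: `iloc` slices by position and excludes the end of the slice, while `loc` slices by label and includes it. The sketch below shows this on a small stand-in table so it can be run without `president_heights.csv`. The column names are taken from the examples above, but the rows (`Person A` and so on) are made up for illustration and are not the contents of the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for president_heights.csv: same column names,
# but the rows are invented for this sketch
csv_text = """order,name,height(cm)
1,Person A,189
2,Person B,170
3,Person C,183
4,Person D,163
"""

# read_csv accepts a file path or a file-like object such as StringIO
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)

# iloc slices by position: the end of the slice is excluded
print(len(df.iloc[0:1]))   # one row

# loc slices by label: the end of the slice is included
print(len(df.loc[0:1]))    # two rows

by_label = df.loc[0:1, ['name', 'order']]

# A comparison produces a boolean Series, which loc uses as a row mask
mask = df['height(cm)'] > 180
tall = df.loc[mask, ['name', 'order']]
print(tall)
```

With the default integer index the labels and positions happen to coincide, which is why `df.iloc[0:2]` and `df.loc[0:1]` return the same rows in the examples above.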