[TOC]
# Constructing a DataFrame
~~~
import pandas as pd
data = {'country': ['aaa', 'bbb', 'ccc'],
'population': [10, 12, 14]}
frame = pd.DataFrame(data)
print(frame)
~~~
Output:
~~~
  country  population
0     aaa          10
1     bbb          12
2     ccc          14
~~~
A DataFrame built this way also supports `info()` and similar summary methods.
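As a minimal sketch of that point, the snippet below calls `info()` on the same small frame. Capturing the text through the `buf` argument is only an illustrative choice here; calling `frame.info()` with no arguments simply prints the summary.

```python
import io
import pandas as pd

data = {'country': ['aaa', 'bbb', 'ccc'],
        'population': [10, 12, 14]}
frame = pd.DataFrame(data)

# info() writes to stdout by default; passing buf captures the text instead
buffer = io.StringIO()
frame.info(buf=buffer)
summary = buffer.getvalue()
print(summary)

# shape and dtypes expose some of the same facts as attributes
print(frame.shape)
print(frame.dtypes)
```

The summary lists each column with its non-null count and dtype, which is a quick way to spot columns that contain missing values.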
# Selecting specific data
The source data is the CSV file `titanic.csv`.

We extract the `Age` column from it:
~~~
import pandas as pd
csv = pd.read_csv('./titanic.csv')
print(csv['Age'])
~~~
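In the output, the 0, 1, 2, 3 on the left are the index.

Some rows have missing values, which pandas displays as `NaN`.
To take the first 5 values: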
`print(csv['Age'][:5])`
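A small sketch of the same idea without the CSV file: the Series below is a stand-in for `csv['Age']`, built from a few ages that appear in the outputs in these notes plus one missing value. It shows that `head(5)` returns the same rows as the `[:5]` slice, and how the missing values could be counted.

```python
import pandas as pd

# Stand-in for csv['Age']; an assumption for illustration, not the full dataset
age = pd.Series([22.0, 38.0, 26.0, 35.0, 35.0, float('nan')], name='Age')

first_five = age[:5]      # positional slice, as in the text
same_rows = age.head(5)   # head(5) selects the same rows

print(first_five.equals(same_rows))

# isna() marks missing entries; summing the booleans counts them
missing_count = age.isna().sum()
print(missing_count)
```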
# Specifying the index
~~~
import pandas as pd
csv = pd.read_csv('./titanic.csv')
# Use the Name column as the row index (the labels shown on the left)
index = csv.set_index('Name')
print(index['Age'][:5])
~~~
Output:
~~~
Name
Braund, Mr. Owen Harris                                22.0
Cumings, Mrs. John Bradley (Florence Briggs Thayer)    38.0
Heikkinen, Miss. Laina                                 26.0
Futrelle, Mrs. Jacques Heath (Lily May Peel)           35.0
Allen, Mr. William Henry                               35.0
Name: Age, dtype: float64
~~~
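One detail worth seeing in isolation: `set_index` returns a new DataFrame and leaves the original untouched. The sketch below uses three rows copied from the output above as a stand-in for `titanic.csv`; the variable names `passengers` and `indexed` are just illustrative.

```python
import pandas as pd

# Three rows taken from the output above; a stand-in for the real CSV
passengers = pd.DataFrame({
    'Name': ['Braund, Mr. Owen Harris',
             'Heikkinen, Miss. Laina',
             'Allen, Mr. William Henry'],
    'Age': [22.0, 26.0, 35.0],
})

indexed = passengers.set_index('Name')

# The original still has Name as a regular column and a 0..n-1 index
print(list(passengers.columns))
# In the new frame, Name has moved out of the columns and into the index
print(list(indexed.columns))
print(indexed.index.name)
```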
# Looking up a value by index
~~~
import pandas as pd
csv = pd.read_csv('./titanic.csv')
# Use the Name column as the row index
index = csv.set_index('Name')
# Look up a passenger's age by name
# Age values, labeled by name
age = index['Age']
print(age['Braund, Mr. Owen Harris'])
~~~
Output:
~~~
22.0
~~~
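The same lookup can be written with `.loc`, which makes it explicit that the key is a label rather than a position. This is a minimal sketch on a two-row stand-in Series; the name `'No Such Passenger'` is made up to show what happens when a label is absent.

```python
import pandas as pd

# Stand-in for index['Age'], using two rows from the output shown earlier
names = pd.Index(['Braund, Mr. Owen Harris', 'Heikkinen, Miss. Laina'],
                 name='Name')
age = pd.Series([22.0, 26.0], index=names, name='Age')

# .loc looks up strictly by label
braund_age = age.loc['Braund, Mr. Owen Harris']
print(braund_age)

# get() returns None for a missing label instead of raising KeyError
missing = age.get('No Such Passenger')
print(missing)
```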
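# Arithmetic
~~~
import pandas as pd
csv = pd.read_csv('./titanic.csv')
# Use the Name column as the row index
index = csv.set_index('Name')
# Age values, labeled by name
age = index['Age']
# Multiply every age by 10; the result is a new Series
age1 = age * 10
print(age1)
~~~
Output: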
~~~
Name
Braund, Mr. Owen Harris                                220.0
Cumings, Mrs. John Bradley (Florence Briggs Thayer)    380.0
Heikkinen, Miss. Laina                                 260.0
Futrelle, Mrs. Jacques Heath (Lily May Peel)           350.0
Allen, Mr. William Henry                               350.0
Moran, Mr. James                                         NaN
~~~
The arithmetic does not modify the original values; the result has to be assigned to a new variable.
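A quick way to confirm this is to print both Series after the multiplication. The sketch below uses a three-value stand-in for the `Age` column, including one missing value.

```python
import pandas as pd

# Stand-in for the Age column, with one missing value
age = pd.Series([22.0, 38.0, float('nan')], name='Age')

# Element-wise multiplication builds a new Series
age1 = age * 10

print(age1.tolist())  # scaled values; NaN stays NaN
print(age.tolist())   # the original is unchanged
```

To overwrite the original instead, the result can be assigned back to the same name, for example `age = age * 10`.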
# Minimum, maximum, and mean
~~~
import pandas as pd
csv = pd.read_csv('./titanic.csv')
# Use the Name column as the row index
index = csv.set_index('Name')
# Age values, labeled by name
age = index['Age']
# Minimum
age_min = age.min()
print(age_min)
# Maximum
age_max = age.max()
print(age_max)
# Mean
mean = age.mean()
print(mean)
~~~
Output:
~~~
0.42
80.0
29.6991176471
~~~
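Because the `Age` column contains missing values, it helps to know that `min()`, `max()`, and `mean()` skip `NaN` by default. A minimal sketch on a stand-in Series (three ages from the earlier output plus one `NaN`):

```python
import pandas as pd

# Stand-in for the Age column: three known ages and one missing value
age = pd.Series([22.0, 38.0, 26.0, float('nan')], name='Age')

# By default NaN is ignored, so the mean is taken over three values
default_mean = age.mean()
print(default_mean)

# With skipna=False a single NaN makes the result NaN
strict_mean = age.mean(skipna=False)
print(strict_mean)

# count() reports how many non-missing values went into the statistics
print(age.count())
```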
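# Inspecting basic summary statistics
~~~
import pandas as pd
csv = pd.read_csv('./titanic.csv')
# Summary statistics for the numeric columns
describe = csv.describe()
print(describe)
~~~
The output lists the count, mean, standard deviation, minimum, 25%/50%/75% quantiles, and maximum of each numeric column.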

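For a version that runs without the CSV file, here is a sketch that applies `describe()` to the small country/population frame from the first section. With default arguments only the numeric column is summarized.

```python
import pandas as pd

frame = pd.DataFrame({'country': ['aaa', 'bbb', 'ccc'],
                      'population': [10, 12, 14]})

# describe() with default arguments summarizes numeric columns only
stats = frame.describe()
print(stats)

# Individual statistics can be read back with .loc[row_label, column]
print(stats.loc['mean', 'population'])
print(stats.loc['std', 'population'])
```

The `country` column is left out because it holds strings; passing `include='all'` to `describe()` would add non-numeric columns as well.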