
# 4.8. Transforming the prediction target (`y`)

Proofread by: [@FontTian](https://github.com/FontTian) [@numpy](https://github.com/apachecn/scikit-learn-doc-zh)
Translated by: [@程威](https://github.com/apachecn/scikit-learn-doc-zh)

## 4.8.1. Label binarization

[`LabelBinarizer`](generated/sklearn.preprocessing.LabelBinarizer.html#sklearn.preprocessing.LabelBinarizer "sklearn.preprocessing.LabelBinarizer") is a utility class that creates a label indicator matrix from a list of multiclass labels:

```
>>> from sklearn import preprocessing
>>> lb = preprocessing.LabelBinarizer()
>>> lb.fit([1, 2, 6, 4, 2])
LabelBinarizer(neg_label=0, pos_label=1, sparse_output=False)
>>> lb.classes_
array([1, 2, 4, 6])
>>> lb.transform([1, 6])
array([[1, 0, 0, 0],
       [0, 0, 0, 1]])
```

When each instance can carry multiple labels, use [`MultiLabelBinarizer`](generated/sklearn.preprocessing.MultiLabelBinarizer.html#sklearn.preprocessing.MultiLabelBinarizer "sklearn.preprocessing.MultiLabelBinarizer"):

```
>>> lb = preprocessing.MultiLabelBinarizer()
>>> lb.fit_transform([(1, 2), (3,)])
array([[1, 1, 0],
       [0, 0, 1]])
>>> lb.classes_
array([1, 2, 3])
```

## 4.8.2. Label encoding

[`LabelEncoder`](generated/sklearn.preprocessing.LabelEncoder.html#sklearn.preprocessing.LabelEncoder "sklearn.preprocessing.LabelEncoder") is a utility class that normalizes labels so that they contain only values in the range \[0, n\_classes-1\]. This can be useful when writing efficient Cython routines. [`LabelEncoder`](generated/sklearn.preprocessing.LabelEncoder.html#sklearn.preprocessing.LabelEncoder "sklearn.preprocessing.LabelEncoder") can be used as follows:

```
>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6])
array([0, 0, 1, 2])
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])
```

It can also be used to transform non-numerical labels into numerical labels, as long as they are hashable and comparable:

```
>>> le = preprocessing.LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"])
array([2, 2, 1])
>>> list(le.inverse_transform([2, 2, 1]))
['tokyo', 'tokyo', 'paris']
```
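To make the two mappings above more concrete, the following is a minimal pure-Python sketch of the idea behind label encoding and label binarization. It is not scikit-learn's implementation. The helper names `fit_classes`, `encode`, `decode`, and `binarize` are hypothetical and exist only for illustration, and the sketch assumes every label passed to `encode` or `binarize` was seen when the classes were collected.

```python
def fit_classes(labels):
    """Collect the sorted unique labels (a rough analogue of `classes_`)."""
    return sorted(set(labels))


def encode(labels, classes):
    """Map each label to its position in `classes`, giving values in [0, n_classes-1]."""
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels]


def decode(codes, classes):
    """Invert `encode` by looking each integer code up in `classes`."""
    return [classes[code] for code in codes]


def binarize(labels, classes):
    """Build one indicator row per label, with a single 1 in the label's column."""
    index = {label: i for i, label in enumerate(classes)}
    rows = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1
        rows.append(row)
    return rows


city_classes = fit_classes(["paris", "paris", "tokyo", "amsterdam"])
city_codes = encode(["tokyo", "tokyo", "paris"], city_classes)
print(city_classes)                      # ['amsterdam', 'paris', 'tokyo']
print(city_codes)                        # [2, 2, 1]
print(decode(city_codes, city_classes))  # ['tokyo', 'tokyo', 'paris']

number_classes = fit_classes([1, 2, 6, 4, 2])
print(binarize([1, 6], number_classes))  # [[1, 0, 0, 0], [0, 0, 0, 1]]
```

Sorting the unique labels is what makes the requirement "hashable and comparable" visible: `set` needs hashable labels and `sorted` needs comparable ones. The sketch always emits one column per class; it makes no attempt to reproduce how `LabelBinarizer` handles special cases such as two-class problems or sparse output.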