# Parameters Tuning

This page contains parameter tuning guides for different scenarios in LightGBM.

**List of other helpful links**

* [Parameters](./Parameters.rst)
* [Python API](./Python-API.rst)

## Tune Parameters for the Leaf-wise (Best-first) Tree

LightGBM uses the [leaf-wise](./Features.rst#leaf-wise-best-first-tree-growth) tree growth algorithm, while many other popular tools use depth-wise tree growth. Compared with depth-wise growth, the leaf-wise algorithm can converge much faster. However, leaf-wise growth may over-fit if it is not used with appropriate parameters.

To get good results with a leaf-wise tree, these are some important parameters:

1. `num_leaves`. This is the main parameter for controlling the complexity of the tree model. In theory, we can set `num_leaves = 2^(max_depth)` to obtain the same number of leaves as a depth-wise tree. However, this simple conversion does not work well in practice. The reason is that, for the same number of leaves, a leaf-wise tree is typically much deeper than a depth-wise tree, which may lead to over-fitting. Thus, when tuning `num_leaves`, we should keep it smaller than `2^(max_depth)`. For example, when a depth-wise tree with `max_depth=7` gets good accuracy, setting `num_leaves` to `127` may cause over-fitting, while setting it to `70` or `80` may give better accuracy than the depth-wise tree. In fact, the concept of `depth` matters little in a leaf-wise tree, since there is no sensible mapping from `leaves` to `depth`.

2. `min_data_in_leaf`. This is a very important parameter for dealing with over-fitting in a leaf-wise tree. Its value depends on the number of training samples and on `num_leaves`. Setting it to a large value can prevent the tree from growing too deep, but may cause under-fitting. In practice, setting it to hundreds or thousands is enough for a large dataset.

3. `max_depth`. You can also use `max_depth` to limit the tree depth explicitly.

## For Faster Speed

* Use bagging by setting `bagging_fraction` and `bagging_freq`
* Use feature sub-sampling by setting `feature_fraction`
* Use a small `max_bin`
* Use `save_binary` to speed up data loading in future learning
* Use parallel learning; refer to the [Parallel Learning Guide](./Parallel-Learning-Guide.rst)

## For Better Accuracy

* Use a large `max_bin` (may be slower)
* Use a small `learning_rate` with a large `num_iterations`
* Use a large `num_leaves` (may cause over-fitting)
* Use more training data
* Try `dart`

## Deal with Over-fitting

* Use a small `max_bin`
* Use a small `num_leaves`
* Use `min_data_in_leaf` and `min_sum_hessian_in_leaf`
* Use bagging by setting `bagging_fraction` and `bagging_freq`
* Use feature sub-sampling by setting `feature_fraction`
* Use more training data
* Try `lambda_l1`, `lambda_l2` and `min_gain_to_split` for regularization
* Try `max_depth` to avoid growing a deep tree
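
As a rough illustration of how the guidance on this page might be combined, the sketch below builds a parameter dictionary that keeps `num_leaves` below `2^(max_depth)` and switches on the over-fitting controls listed above. The helper name `leaf_wise_params` and every numeric value are assumptions chosen for illustration, not recommended defaults; only the parameter names come from this page.

```python
def leaf_wise_params(max_depth=7, num_leaves=70, min_data_in_leaf=100):
    """Build an illustrative LightGBM-style parameter dict (hypothetical helper)."""
    # A depth-wise tree of this depth has at most 2^max_depth leaves.
    depth_wise_leaves = 2 ** max_depth
    # The page advises keeping num_leaves below that bound to limit over-fitting.
    if num_leaves >= depth_wise_leaves:
        raise ValueError(
            f"num_leaves={num_leaves} should be smaller than "
            f"2^max_depth={depth_wise_leaves}"
        )
    return {
        # Tree complexity controls for leaf-wise growth.
        "num_leaves": num_leaves,
        "max_depth": max_depth,
        "min_data_in_leaf": min_data_in_leaf,
        # Row and feature sub-sampling (illustrative values).
        "bagging_fraction": 0.8,
        "bagging_freq": 5,
        "feature_fraction": 0.8,
        # Regularization terms (illustrative values).
        "lambda_l1": 0.0,
        "lambda_l2": 1.0,
        "min_gain_to_split": 0.0,
    }


params = leaf_wise_params()
print(params["num_leaves"], "<", 2 ** params["max_depth"])
```

A dictionary like this could then be passed as the `params` argument of `lightgbm.train` in the Python API; suitable values depend on the dataset and are best chosen by validation.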