3.7 softmax回歸的簡潔實現 · 動手學深度學習--pytorch

# 3.7 softmax回歸的簡潔實現我們在3.3節（線性回歸的簡潔實現）中已經了解了使用Pytorch實現模型的便利。下面，讓我們再次使用Pytorch來實現一個softmax回歸模型。首先導入所需的包或模塊。 ``` python import torch from torch import nn from torch.nn import init import numpy as np import sys sys.path.append("..") import d2lzh_pytorch as d2l ``` ## 3.7.1 獲取和讀取數據我們仍然使用Fashion-MNIST數據集和上一節中設置的批量大小。 ``` python batch_size = 256 train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size) ``` ## 3.7.2 定義和初始化模型在3.4節（softmax回歸）中提到，softmax回歸的輸出層是一個全連接層，所以我們用一個線性模塊就可以了。因為前面我們數據返回的每個batch樣本`x`的形狀為(batch_size, 1, 28, 28), 所以我們要先用`view()`將`x`的形狀轉換成(batch_size, 784)才送入全連接層。 ``` python num_inputs = 784 num_outputs = 10 class LinearNet(nn.Module): def __init__(self, num_inputs, num_outputs): super(LinearNet, self).__init__() self.linear = nn.Linear(num_inputs, num_outputs) def forward(self, x): # x shape: (batch, 1, 28, 28) y = self.linear(x.view(x.shape[0], -1)) return y net = LinearNet(num_inputs, num_outputs) ``` 我們將對`x`的形狀轉換的這個功能自定義一個`FlattenLayer`并記錄在`d2lzh_pytorch`中方便后面使用。 ``` python # 本函數已保存在d2lzh_pytorch包中方便以后使用 class FlattenLayer(nn.Module): def __init__(self): super(FlattenLayer, self).__init__() def forward(self, x): # x shape: (batch, *, *, ...) return x.view(x.shape[0], -1) ``` 這樣我們就可以更方便地定義我們的模型： ``` python from collections import OrderedDict net = nn.Sequential( # FlattenLayer(), # nn.Linear(num_inputs, num_outputs) OrderedDict([ ('flatten', FlattenLayer()), ('linear', nn.Linear(num_inputs, num_outputs))]) ) ``` 然后，我們使用均值為0、標準差為0.01的正態分布隨機初始化模型的權重參數。 ``` python init.normal_(net.linear.weight, mean=0, std=0.01) init.constant_(net.linear.bias, val=0) ``` ## 3.7.3 softmax和交叉熵損失函數如果做了上一節的練習，那么你可能意識到了分開定義softmax運算和交叉熵損失函數可能會造成數值不穩定。因此，PyTorch提供了一個包括softmax運算和交叉熵損失計算的函數。它的數值穩定性更好。 ``` python loss = nn.CrossEntropyLoss() ``` ## 3.7.4 定義優化算法我們使用學習率為0.1的小批量隨機梯度下降作為優化算法。 ``` python optimizer = torch.optim.SGD(net.parameters(), lr=0.1) ``` ## 3.7.5 訓練模型接下來，我們使用上一節中定義的訓練函數來訓練模型。 ``` python num_epochs = 5 d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer) ``` 輸出： ``` epoch 1, loss 0.0031, train acc 0.745, test acc 0.790 epoch 2, loss 0.0022, train acc 0.812, test acc 0.807 epoch 3, loss 0.0021, train acc 0.825, test acc 0.806 epoch 4, loss 0.0020, train acc 0.832, test acc 0.810 epoch 5, loss 0.0019, train acc 0.838, test acc 0.823 ``` ## 小結 * PyTorch提供的函數往往具有更好的數值穩定性。 * 可以使用PyTorch更簡潔地實現softmax回歸。 ----------- > 注：本節除了代碼之外與原書基本相同，[原書傳送門](https://zh.d2l.ai/chapter_deep-learning-basics/softmax-regression-gluon.html)