# torchvision.datasets
`torchvision.datasets` includes the following datasets:
- MNIST
- COCO (for image captioning and object detection)
- LSUN Classification
- ImageFolder
- Imagenet-12
- CIFAR10 and CIFAR100
- STL10
All of these `Datasets` provide the same API:
`__getitem__` and `__len__`
Since all of the above `Datasets` are subclasses of `torch.utils.data.Dataset`, they can be passed to a `torch.utils.data.DataLoader`, which loads samples in parallel with multiple workers (Python multiprocessing).
For example: `torch.utils.data.DataLoader(coco_cap, batch_size=args.batchSize, shuffle=True, num_workers=args.nThreads)`
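`DataLoader` relies only on a dataset implementing `__getitem__` and `__len__`. A minimal plain-Python sketch of that protocol (the `ToyDataset` class and its contents are illustrative, not part of torchvision):

```python
# Minimal sketch of the Dataset protocol that DataLoader consumes.
# ToyDataset and its sample data are illustrative, not part of torchvision.
class ToyDataset:
    def __init__(self, samples, targets):
        self.samples = samples
        self.targets = targets

    def __getitem__(self, index):
        # Return one (sample, target) pair, as the torchvision datasets do.
        return self.samples[index], self.targets[index]

    def __len__(self):
        return len(self.samples)

ds = ToyDataset([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], [0, 1, 0])
print(len(ds))  # 3
print(ds[1])    # ([0.3, 0.4], 1)
```

Any object with these two methods can be wrapped by `DataLoader` for batching and shuffling.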
In the constructor, each dataset has a slightly different API as needed, but they all take the keyword arguments:
- `transform` - a function that takes the original image as input and returns a transformed version (see the section on `torchvision.transforms` below for details)
- `target_transform` - a function that takes the `target` as input and returns a transformed version. For example, the input might be a caption `string` for an image and the output a list of `word` indices.
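Both keywords are just callables; as an illustration, the vocabulary and the two functions below are made up for this sketch:

```python
# Hedged sketch: transform/target_transform are plain callables.
# The vocabulary and both functions are hypothetical examples.
VOCAB = {"a": 0, "plane": 1, "flying": 2}

def transform(img):
    # e.g. scale pixel values from [0, 255] down to [0.0, 1.0]
    return [p / 255.0 for p in img]

def target_transform(caption):
    # map a caption string to a list of word indices
    return [VOCAB[w] for w in caption.lower().split()]

print(target_transform("A plane flying"))  # [0, 1, 2]
```

In practice, `transform` is usually built from `torchvision.transforms` (e.g. `transforms.ToTensor()`), as in the COCO example below.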
## MNIST
```
dset.MNIST(root, train=True, transform=None, target_transform=None, download=False)
```
Parameters:
- root - root directory where `processed/training.pt` and `processed/test.pt` live
- train - `True` = training set, `False` = test set
- download - `True` = downloads the dataset from the internet and puts it under `root`. If the dataset has already been downloaded, the processed data (produced by helper functions in `mnist.py`) is placed under the `processed` folder.
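The `download` flag behaves roughly as sketched below; the paths follow the `processed/training.pt` layout named above, but the check itself is an illustrative reconstruction, not the actual `mnist.py` code:

```python
import os

def needs_download(root):
    # Illustrative reconstruction: the data is considered present when
    # both processed files already exist under root/processed.
    processed = os.path.join(root, "processed")
    return not (os.path.exists(os.path.join(processed, "training.pt"))
                and os.path.exists(os.path.join(processed, "test.pt")))
```

With `download=True`, the dataset is fetched only when such a check fails; with `download=False` and missing files, construction raises an error.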
## COCO
Requires the [COCO API](https://github.com/pdollar/coco/tree/master/PythonAPI) to be installed.
### Image captioning:
```
dset.CocoCaptions(root="dir where images are", annFile="json annotation file", [transform, target_transform])
```
Example:
```
import torchvision.datasets as dset
import torchvision.transforms as transforms
cap = dset.CocoCaptions(root='dir where images are',
                        annFile='json annotation file',
                        transform=transforms.ToTensor())
print('Number of samples: ', len(cap))
img, target = cap[3] # load 4th sample
print("Image Size: ", img.size())
print(target)
```
Output:
```
Number of samples: 82783
Image Size: (3L, 427L, 640L)
[u'A plane emitting smoke stream flying over a mountain.',
u'A plane darts across a bright blue sky behind a mountain covered in snow',
u'A plane leaves a contrail above the snowy mountain top.',
u'A mountain that has a plane flying overheard in the distance.',
u'A mountain view with a plume of smoke in the background']
```
### Detection:
```
dset.CocoDetection(root="dir where images are", annFile="json annotation file", [transform, target_transform])
```
## LSUN
```
dset.LSUN(db_path, classes='train', [transform, target_transform])
```
Parameters:
- db_path - root directory of the dataset files
- classes - 'train' (all categories, training set), 'val' (all categories, validation set), 'test' (all categories, test set), or a list of categories to load, e.g. ['bedroom_train', 'church_train', ...]

## ImageFolder
A generic data loader where the images are arranged in this way:
```
root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png
root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png
```
```python
dset.ImageFolder(root="root folder path", [transform, target_transform])
```
It has the following member variables:
- self.classes - a list holding the class names
- self.class_to_idx - a dict mapping each class name to its index
- self.imgs - a list of (img-path, class) tuples
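These member variables can be reproduced with a short sketch. The scanning logic below is an illustrative reconstruction of the bookkeeping `ImageFolder` performs, not its actual source:

```python
import os

def scan_image_folder(root):
    # Illustrative reconstruction of ImageFolder's bookkeeping:
    # one class per sub-directory, samples as (img-path, class-index) tuples.
    classes = sorted(d for d in os.listdir(root)
                     if os.path.isdir(os.path.join(root, d)))
    class_to_idx = {cls: i for i, cls in enumerate(classes)}
    imgs = []
    for cls in classes:
        for fname in sorted(os.listdir(os.path.join(root, cls))):
            imgs.append((os.path.join(root, cls, fname), class_to_idx[cls]))
    return classes, class_to_idx, imgs
```

For the `root/dog`, `root/cat` layout shown above this yields `classes == ['cat', 'dog']` and six `(path, index)` entries in `imgs`.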
## Imagenet-12
This is simply implemented with an `ImageFolder` dataset.
The data is preprocessed [as described here](https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset).
[Here is an example](https://github.com/pytorch/examples/blob/27e2a46c1d1505324032b1d94fc6ce24d5b67e97/imagenet/main.py#L48-L62).
## CIFAR
```
dset.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)
dset.CIFAR100(root, train=True, transform=None, target_transform=None, download=False)
```
Parameters:
- root - root directory of `cifar-10-batches-py`
- train - `True` = training set, `False` = test set
- download - `True` = downloads the dataset from the internet and puts it under `root`. If the dataset is already downloaded, does nothing.

## STL10
```
dset.STL10(root, split='train', transform=None, target_transform=None, download=False)
```
Parameters:
- root - root directory of `stl10_binary`
- split - 'train' = training set, 'test' = test set, 'unlabeled' = unlabeled set, 'train+unlabeled' = training + unlabeled sets (unlabeled samples get the label -1)
- download - `True` = downloads the dataset from the internet and puts it under `root`. If the dataset is already downloaded, does nothing.
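Since the 'train+unlabeled' split marks unlabeled samples with the label -1, they can be separated after loading. The helper below is an illustrative sketch built on that -1 convention, not part of torchvision:

```python
def split_labeled(samples):
    # Separate (data, label) pairs using the -1 convention for unlabeled data.
    labeled = [(x, y) for x, y in samples if y != -1]
    unlabeled = [x for x, y in samples if y == -1]
    return labeled, unlabeled

pairs = [("img0", 3), ("img1", -1), ("img2", 7)]
print(split_labeled(pairs))  # ([('img0', 3), ('img2', 7)], ['img1'])
```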