  1. Use RMarkdown's child option to stitch documents together.
  2. Notes assembled this way are easier to review.
  3. Submit related issues on GitHub.

Elezi (2019) covers three main points:

  1. Learn Facebook's PyTorch framework
  2. Understand convolution and padding
  3. Understand transfer learning

See https://pytorch.org/get-started/locally/

1 Understanding PyTorch tensors

tensor([[0.0249, 0.3875, 0.9637],
        [0.6190, 0.8936, 0.4906],
        [0.6484, 0.1036, 0.5873]])
torch.Size([3, 3])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

These can be used to verify some linear-algebra identities.
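A sketch (my own reconstruction, not the original cell) of how the outputs above could be produced: random, all-ones, and identity tensors, plus a check that multiplying by the identity leaves a matrix unchanged.

```python
import torch

x = torch.rand(3, 3)        # random 3x3 tensor
print(x)
print(x.shape)              # torch.Size([3, 3])

ones = torch.ones(3, 3)     # all-ones matrix
eye = torch.eye(3)          # 3x3 identity

print(ones)
print(eye)

# multiplying by the identity returns the original matrix,
# and the identity times itself is again the identity
print(torch.matmul(ones, eye))   # equals ones
print(torch.matmul(eye, eye))    # equals eye
```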

torch.Size([1000, 1000]) torch.Size([1000, 1000]) torch.Size([1000, 1000])
torch.Size([1000, 1000])
tensor(125.0998)
torch.Tensor

torch.mean reduces a 2-D tensor to a single scalar.
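A sketch, assuming the outputs above came from three random 1000×1000 matrices; the exact operations are a guess that reproduces the shapes and a mean around 125.

```python
import torch

a = torch.rand(1000, 1000)
b = torch.rand(1000, 1000)
c = torch.rand(1000, 1000)
print(a.shape, b.shape, c.shape)

d = torch.matmul(a, b) * c   # matrix product, then element-wise product (assumed)
print(d.shape)               # torch.Size([1000, 1000])

m = torch.mean(d)            # collapses the 2-D tensor to a 0-dim torch.Tensor
print(m)                     # roughly tensor(125.)
```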

… how to compute derivatives! While PyTorch computes derivatives for you, mastering them will make you a much better deep learning practitioner and that knowledge will guide you in training neural networks better.

Understanding how derivatives are computed in a neural network helps us practice better.
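A minimal autograd sketch (my own toy example, not the course's exact cell): tensors created with requires_grad=True accumulate gradients once backward() is called on a scalar result.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

z = x * y + y ** 2     # z = xy + y^2
z.backward()           # compute dz/dx and dz/dy

print(x.grad)          # dz/dx = y      -> tensor(3.)
print(y.grad)          # dz/dy = x + 2y -> tensor(8.)
```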

This gradient-descent implementation works quite nicely!

Gradient descent
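The update the figure refers to can be sketched like this; the loss function and learning rate are illustrative assumptions.

```python
import torch

w = torch.tensor(4.0, requires_grad=True)   # parameter to optimize
lr = 0.1                                    # assumed learning rate

for step in range(20):
    loss = (w - 2.0) ** 2       # toy loss with its minimum at w = 2
    loss.backward()             # compute d(loss)/dw
    with torch.no_grad():
        w -= lr * w.grad        # gradient-descent update
    w.grad.zero_()              # reset the gradient for the next step

print(w)                        # close to tensor(2.)
```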

tensor([[18938.6172, 18050.3359, 20434.3965, 19943.2285, 20502.7695, 19317.7422,
         19452.3828, 19648.2441, 18432.9512, 19240.4980]])

This is the MLP computed directly by explicit matrix multiplication.

The weights were prepared in advance.

For the most part, neural networks are just matrix (tensor) multiplication. This is the reason why we have put so much emphasis on matrices and tensors!

Most of the time a neural network is just matrix multiplication; that is the voice of experience. Learning to organize functions inside a class also helps when writing packages later.

Seen this way, PyTorch's forward is actually quite clear. Later on you will find that the more sequentially things are defined in __init__, the simpler forward becomes.
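A sketch of this pattern, assuming a small fully connected net (the layer sizes are my own): layers are declared in __init__ and chained in forward.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # layers are declared once in __init__ ...
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 4)

    def forward(self, x):
        # ... and simply chained in forward
        x = F.relu(self.fc1(x))
        return self.fc2(x)

net = Net()
print(net(torch.rand(1, 10)).shape)   # torch.Size([1, 4])
```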

2 Building an MLP


How matrix multiplication plays out concretely in an MLP.

2.1 Implementing an MLP

tensor([[4.1747, 2.9866, 4.6681, 5.9294]])
tensor([[4.1747, 2.9866, 4.6681, 5.9294]])

This shows exactly why stacking linear transformations alone achieves nothing; nonlinear functions such as ReLU are essential.

\[\begin{array}{c}{X W_{1}=H_{1}} \\ {H_{1} W_{2}=H_{2}} \\ {H_{2} W_{3}=O}\end{array}\]

Therefore

\[XW_1W_2W_3 = O\]

tensor([[4.1747, 2.9866, 4.6681, 5.9294]])
tensor([[9.7038, 9.1091, 8.5771, 4.8548]])

Applying ReLU at each step produces a different result.
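A sketch of the comparison above (dimensions and random weights are assumptions): chaining pure linear maps equals a single multiplication by W1 W2 W3, whereas inserting ReLU between the layers changes the output.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1, 4)
w1, w2, w3 = torch.randn(4, 4), torch.randn(4, 4), torch.randn(4, 4)

# purely linear: applying W1, W2, W3 in sequence equals one multiplication by W1 W2 W3
h1 = torch.matmul(x, w1)
h2 = torch.matmul(h1, w2)
out_linear = torch.matmul(h2, w3)
out_collapsed = torch.matmul(x, torch.matmul(w1, torch.matmul(w2, w3)))
print(out_linear)
print(out_collapsed)            # identical up to floating-point error

# inserting ReLU after each layer breaks the collapse and changes the output
out_relu = torch.matmul(torch.relu(torch.matmul(torch.relu(torch.matmul(x, w1)), w2)), w3)
print(out_relu)
```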

2.2 Computing the loss function

tensor([[-1.2000,  0.1200,  4.8000]])
torch.Size([1])

class = [0,1,2]

tensor(0.0117)

Being proficient in understanding and calculating loss functions is a very important skill in deep learning.

Understanding how the loss function is defined helps us understand deep learning.

tensor(7.1459)
6.907755278982137

The score is close to -ln(1/1000) = 6.9. This is not surprising, considering that scores were random and close to each other, so the probability for each class was approximately the same (1/1000) = 0.001.

If a class is predicted at random, its probability is about 1/1000.
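A sketch reproducing both computations in this subsection with nn.CrossEntropyLoss (the random target index is an arbitrary choice of mine):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# small example: three classes, and the true class (index 2) has by far the highest score
scores = torch.tensor([[-1.2, 0.12, 4.8]])
ground_truth = torch.tensor([2])
print(criterion(scores, ground_truth))     # small loss, roughly 0.01

# 1000 random, roughly equal scores: the loss sits near -ln(1/1000) = 6.9
random_scores = torch.rand(1, 1000)
random_target = torch.tensor([123])        # arbitrary ground-truth class index
print(criterion(random_scores, random_target))
```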

Build the MLP on MNIST.

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to mnist\MNIST\raw\train-images-idx3-ubyte.gz
Extracting mnist\MNIST\raw\train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to mnist\MNIST\raw\train-labels-idx1-ubyte.gz
Extracting mnist\MNIST\raw\train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to mnist\MNIST\raw\t10k-images-idx3-ubyte.gz
Extracting mnist\MNIST\raw\t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to mnist\MNIST\raw\t10k-labels-idx1-ubyte.gz
Extracting mnist\MNIST\raw\t10k-labels-idx1-ubyte.gz
Processing...
Done!

The downloaded data are normalized automatically.

trainset is the variable name of the dataset.

torch.utils.data.dataloader.DataLoader
torch.Size([60000, 28, 28]) torch.Size([10000, 28, 28])
32 32


D:\install\miniconda\lib\site-packages\torchvision\datasets\mnist.py:53: UserWarning: train_data has been renamed data
  warnings.warn("train_data has been renamed data")
D:\install\miniconda\lib\site-packages\torchvision\datasets\mnist.py:58: UserWarning: test_data has been renamed data
  warnings.warn("test_data has been renamed data")

28x28 is the typical shape of this kind of image data.

This idea of loading data in batches is worth internalizing.
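A sketch, assuming the standard torchvision pipeline, of how trainset/trainloader and testset/testloader above were presumably built; the mean/std values are the usual MNIST ones and are my assumption.

```python
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),   # note the one-element tuples
])

trainset = torchvision.datasets.MNIST("mnist", train=True, download=True, transform=transform)
testset = torchvision.datasets.MNIST("mnist", train=False, download=True, transform=transform)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False)

print(type(trainloader))                        # torch.utils.data.dataloader.DataLoader
print(trainset.data.shape, testset.data.shape)  # 60000 and 10000 images of 28x28
```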

Define the class called Net which inherits from nn.Module

Defining classes like this is common practice when using PyTorch.
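A sketch of such a class for MNIST (the hidden size of 200 is my assumption): each image is flattened to 784 features and mapped to 10 class scores.

```python
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 200)   # 784 input pixels -> 200 hidden units
        self.fc2 = nn.Linear(200, 10)        # 200 hidden units -> 10 digit classes

    def forward(self, x):
        x = x.view(-1, 28 * 28)              # flatten each image
        x = F.relu(self.fc1(x))
        return self.fc2(x)
```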

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)

<ipython-input-65-ee644993ab91> in <module>
      3 criterion = nn.CrossEntropyLoss()
      4 
----> 5 for batch_idx, data_target in enumerate(trainloader):
      6     data = data_target[0]
      7     target = data_target[1]

D:\install\miniconda\lib\site-packages\torchvision\transforms\functional.py in normalize(tensor, mean, std, inplace)
    206     mean = torch.as_tensor(mean, dtype=torch.float32, device=tensor.device)
    207     std = torch.as_tensor(std, dtype=torch.float32, device=tensor.device)
--> 208     tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
    209     return tensor

IndexError: too many indices for tensor of dimension 0

The evaluation loop over testloader fails with the same IndexError inside the same Normalize transform.
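In this torchvision version the error above typically arises when mean and std are passed to transforms.Normalize as plain scalars rather than one-element tuples, so mean[:, None, None] indexes a 0-dim tensor; that is my reading of the traceback, not something the notes state. A sketch of the intended training loop, assuming the Normalize call is fixed as in the data-loading sketch above and that Net and trainloader are defined as earlier (the optimizer and learning rate are assumptions):

```python
import torch.nn as nn
import torch.optim as optim

model = Net()                                       # the Net class sketched above
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=3e-4) # assumed optimizer and learning rate

model.train()
for batch_idx, (data, target) in enumerate(trainloader):
    optimizer.zero_grad()
    output = model(data)               # forward pass
    loss = criterion(output, target)   # cross-entropy loss
    loss.backward()                    # backpropagation
    optimizer.step()                   # parameter update
```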

3 CNN

Why convolution is necessary

  1. The variables are not independent of one another
  2. There are too many parameters to train easily, so the dimensionality must be reduced
  3. With that many parameters it is easy to overfit

3.1 Convolution operator

Elezi (2019) explains the convolution operator fairly clearly; it is worth studying carefully. There are two programming styles:

  1. OOP
  2. Functional

Definition of a filter

This explains how a filter operates and how channels are defined.

How a filter's weights are implemented

A filter is really just a weight matrix; it acts as a way of reducing dimensionality.

Multiple filters

Of course there can be several filters; that is what out_channels defines.

torch.rand(10, 1, 28, 28) creates ten single-channel images, so in_channels=1; the filter size is (3, 3), so kernel_size=3; six filters are created, so out_channels=6.

This is the functional-programming style.

Define the filters and then train.
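A sketch combining the two styles described above: ten single-channel 28×28 images and six 3×3 filters, once with nn.Conv2d (OOP) and once with F.conv2d (functional); the filter weights here are random placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

images = torch.rand(10, 1, 28, 28)       # 10 grayscale images -> in_channels=1

# OOP style: the layer owns its weights
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=3, padding=1)
print(conv(images).shape)                # torch.Size([10, 6, 28, 28])

# functional style: the weights are passed in explicitly
filters = torch.rand(6, 1, 3, 3)         # 6 filters of size 3x3 over 1 input channel
print(F.conv2d(images, filters, padding=1).shape)   # torch.Size([10, 6, 28, 28])
```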

3.3 AlexNet

AlexNet is built around pooling operators

AlexNet, and CNNs in general, are quite complex and use many dimensionality-reduction tricks.
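A sketch of the pooling operator AlexNet relies on, again in both styles (the sizes are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

feature_map = torch.rand(10, 6, 28, 28)

# OOP style
pool = nn.MaxPool2d(kernel_size=2)
print(pool(feature_map).shape)               # torch.Size([10, 6, 14, 14])

# functional style
print(F.max_pool2d(feature_map, 2).shape)    # torch.Size([10, 6, 14, 14])

# average pooling works the same way
print(F.avg_pool2d(feature_map, 2).shape)    # torch.Size([10, 6, 14, 14])
```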

Things gradually get more complicated from here.

A more complete network

This part is the code from the slides, which I think is the more useful reference.

Training a CNN

4 Training modules

4.1 The sequential module

The sequential module is similar to Keras, but everything still has to be defined inside __init__.

nn.Sequential is very well designed.
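A sketch of the pattern: the convolutional part and the classifier are each wrapped in nn.Sequential inside __init__, which makes forward almost trivial (the exact architecture is my assumption, not the course's).

```python
import torch.nn as nn

class SequentialNet(nn.Module):
    def __init__(self):
        super(SequentialNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        x = self.features(x)          # convolutional feature extractor
        x = x.view(x.size(0), -1)     # flatten
        return self.classifier(x)     # fully connected classifier
```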

4.2 Regularization techniques

overfitting

  1. Training set: train the model
  2. Validation set: select the model
  3. Testing set: test the model

This way of splitting the data is a good one.

L2-regularization

\[C=-\frac{1}{n} \sum_{x j}\left[y_{j} \ln a_{j}^{L}+\left(1-y_{j}\right) \ln \left(1-a_{j}^{L}\right)\right]+\frac{\lambda}{2 n} \sum_{w} w^{2}\]
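In PyTorch, the L2 penalty (the λ/2n Σ w² term above) is usually added through the optimizer's weight_decay argument; a minimal sketch, assuming model is one of the networks defined above and the penalty strength is an illustrative value:

```python
import torch.optim as optim

# weight_decay adds an L2 penalty on the weights during each update
optimizer = optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```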

Dropout

See Srivastava et al. (2014)

How dropout is implemented

Batch-normalization

See Ioffe and Szegedy (2015)

Pseudocode for batch normalization
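A sketch of how both techniques appear in PyTorch code; the placements, sizes, and dropout rate are illustrative assumptions.

```python
import torch.nn as nn

regularized = nn.Sequential(
    nn.Linear(784, 200),
    nn.BatchNorm1d(200),   # batch normalization over the 200 hidden units
    nn.ReLU(),
    nn.Dropout(p=0.5),     # randomly zero half of the activations during training
    nn.Linear(200, 10),
)
# model.train() enables dropout and batch statistics; model.eval() switches them off
```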

4.3 Transfer learning

Transfer learning: my feeling at the moment is that it is simply too much; there are so many layers to define, and my computer cannot handle the training anyway!

An idealized neural-network architecture

Bring in a new, small dataset and reuse a model trained on a large one

Instead of training from scratch, you retrain the model on a small dataset; because the data are small, training is fast.

The PyTorch code that implements this

You only need to adjust out_channels and the input structure.

One good suggestion from the author: when you are not yet comfortable with CNNs, take a pre-trained one and apply transfer learning.

you’ll create a new model using this training set, but the accuracy will be poor. Next, you’ll perform the same training, but you’ll start with the parameters from your digit classifying model. Even though digits and letters are two different classification problems, you’ll see that using information from your previous model will dramatically improve this one.

One task is letters and the other digits; in effect, gradient descent starts from the pre-trained model's parameters.

As expected, the pre-trained model has already been carefully refined, so it performs better and reaches higher accuracy.

Excellent! By incorporating information from the previously trained model, we are able to get a good model for handwritten digits, even with a small training set!

How should we understand this? A neural network needs a lot of data to train, but here we only have a tiny dataset, so no matter how long we train, the model will not be good. We therefore start from a model pre-trained on earlier data, which amounts to exploiting that large dataset; retraining it on the small dataset then works well.
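A sketch of starting from previously trained parameters instead of random initialization; the checkpoint filename is hypothetical, and Net is assumed to be the same architecture as before.

```python
import torch

model = Net()                                     # same architecture as the earlier model
# hypothetical checkpoint filename; load the previously trained parameters
model.load_state_dict(torch.load("letters_model.pth"))
# ...then continue training on the small digits dataset as usual (finetuning)
```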

You already finetuned a net you had pretrained. In practice though, it is very common to finetune CNNs that someone else (typically the library’s developers) have pretrained in ImageNet. Big networks still take a lot of time to be trained on large datasets, and maybe you cannot afford to train a large network on a dataset of 1.2 million images on your laptop.

This is a good example.

Instead, you can simply download the network and finetune it on your dataset. That’s what you will do right now. You are going to assume that you have a personal dataset, containing the images from all your last 7 holidays. You want to build a neural network that can classify each image depending on the holiday it comes from. However, since the dataset is so small, you need to use the finetuning technique.

This is a scenario the author assumes.
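A sketch of that workflow with a torchvision model (ResNet-18 is my choice, not necessarily the course's): download a network pretrained on ImageNet, freeze its feature extractor, and replace the final layer for the 7 holiday classes.

```python
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(pretrained=True)        # weights pretrained on ImageNet

for param in model.parameters():                # freeze the feature extractor
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 7)   # new head: 7 holiday classes
# only this new final layer is updated during finetuning
```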

Appendix

4.4 Certificate of completion

References

Elezi, Ismail. 2019. “Deep Learning with Pytorch.” DataCamp. 2019. https://www.datacamp.com/courses/deep-learning-with-pytorch.

Ioffe, Sergey, and Christian Szegedy. 2015. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.”

Srivastava, Nitish, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. “Dropout: A Simple Way to Prevent Neural Networks from Overfitting.” Journal of Machine Learning Research 15 (1): 1929–58.