  1. Use the RMarkdown child option to stitch the documents together.
  2. Notes stitched together this way are easier to review.
  3. Submit related issues on GitHub.

Elezi (2019) has three highlights:

  1. Learning Facebook's PyTorch framework
  2. Understanding convolution and padding
  3. Understanding transfer learning

See https://pytorch.org/get-started/locally/

1 Understanding PyTorch tensors

tensor([[0.0249, 0.3875, 0.9637],
        [0.6190, 0.8936, 0.4906],
        [0.6484, 0.1036, 0.5873]])
torch.Size([3, 3])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
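
The code that produced the output above is not shown in these notes. A minimal sketch that would print tensors of the same shapes, assuming the standard torch.rand, torch.ones and torch.eye constructors were used (the variable names are illustrative):

```python
import torch

# Random 3x3 tensor with entries drawn uniformly from [0, 1)
first_tensor = torch.rand(3, 3)
print(first_tensor)
print(first_tensor.shape)

# A tensor of ones and an identity matrix
ones = torch.ones(3, 3)
identity = torch.eye(3)
print(ones)
print(identity)

# Matrix product with the identity leaves `ones` unchanged
matrix_product = torch.matmul(ones, identity)

# Element-wise product keeps only the diagonal, which gives the identity back
elementwise_product = ones * identity
print(matrix_product)
print(elementwise_product)
```

The last two prints illustrate the difference between torch.matmul and the element-wise * operator.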


torch.Size([1000, 1000]) torch.Size([1000, 1000]) torch.Size([1000, 1000])
torch.Size([1000, 1000])

torch.mean reduces a two-dimensional tensor to a single scalar.
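
A sketch of a forward pass that ends in torch.mean, assuming three random 1000x1000 inputs as the printed shapes suggest. The names x, y, z, q and f are illustrative and not taken from the original code.

```python
import torch

x = torch.rand(1000, 1000)
y = torch.rand(1000, 1000)
z = torch.rand(1000, 1000)

q = x * y                 # element-wise product, shape (1000, 1000)
f = torch.matmul(z, q)    # matrix product, shape (1000, 1000)
mean_f = torch.mean(f)    # zero-dimensional tensor, i.e. a scalar

print(x.shape, y.shape, z.shape)
print(f.shape)
print(mean_f.shape, mean_f.item())
```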

… how to compute derivatives! While PyTorch computes derivatives for you, mastering them will make you a much better deep learning practitioner and that knowledge will guide you in training neural networks better.
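
To make the quote concrete, here is a small autograd sketch. The function f = (x + y) * z and the input values are assumptions chosen for illustration; the derivatives can be checked by hand.

```python
import torch

# Leaf tensors; requires_grad=True asks autograd to track them
x = torch.tensor(-3.0, requires_grad=True)
y = torch.tensor(5.0, requires_grad=True)
z = torch.tensor(-2.0, requires_grad=True)

q = x + y      # q = 2
f = q * z      # f = -4

f.backward()   # fills in .grad on x, y and z

# By hand: df/dx = z = -2, df/dy = z = -2, df/dz = q = 2
print(x.grad, y.grad, z.grad)
```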





tensor([[18938.6172, 18050.3359, 20434.3965, 19943.2285, 20502.7695, 19317.7422,
         19452.3828, 19648.2441, 18432.9512, 19240.4980]])

This is a plain MLP computation.
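
A sketch of how a 1x10 output like the one above can come from two matrix multiplications. The layer sizes 784 and 200 are assumptions; only the output shape (1, 10) is taken from the printed tensor.

```python
import torch

input_layer = torch.rand(1, 784)   # one flattened 28x28 image (assumed size)
weight_1 = torch.rand(784, 200)    # input to hidden layer (200 units assumed)
weight_2 = torch.rand(200, 10)     # hidden layer to 10 output scores

hidden_1 = torch.matmul(input_layer, weight_1)    # shape (1, 200)
output_layer = torch.matmul(hidden_1, weight_2)   # shape (1, 10)
print(output_layer)
```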


For the most part, neural networks are just matrix (tensor) multiplication. This is the reason why we have put so much emphasis on matrices and tensors!

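Most of the time a neural network amounts to matrix multiplication, which is a useful rule of thumb. It is also worth learning how to define models as classes, since that helps when writing packages later.

Seen this way, the forward method in PyTorch is quite clear. Later on it becomes apparent that the more sequentially the layers are defined in __init__, the simpler forward becomes.

2 Building an MLP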


How matrix multiplication shows up concretely in an MLP.

2.1 Implementing an MLP

tensor([[4.1747, 2.9866, 4.6681, 5.9294]])
tensor([[4.1747, 2.9866, 4.6681, 5.9294]])

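The two identical outputs show that stacking purely linear transformations adds nothing: the product of the weight matrices is itself a single linear map. A nonlinear function such as ReLU is therefore required between the layers.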

\[\begin{array}{c}{X W_{1}=H_{1}} \\ {H_{1} W_{2}=H_{2}} \\ {H_{2} W_{3}=O}\end{array}\]


\[XW_1W_2W_3 = O\]

tensor([[4.1747, 2.9866, 4.6681, 5.9294]])
tensor([[9.7038, 9.1091, 8.5771, 4.8548]])

Applying ReLU at each step produces a different result.
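
A small sketch of both points, using two layers instead of three for brevity. The input and weight values are made up so that the arithmetic can be checked by hand.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1.0, 2.0]])
w1 = torch.tensor([[1.0, -1.0], [-1.0, 1.0]])
w2 = torch.tensor([[1.0, 0.0], [2.0, 1.0]])

# Layer by layer without an activation
layered = torch.matmul(torch.matmul(x, w1), w2)

# A single precomposed weight matrix gives the same result
composed = torch.matmul(x, torch.matmul(w1, w2))

# ReLU zeroes the negative hidden unit, so the output changes
activated = torch.matmul(F.relu(torch.matmul(x, w1)), w2)

print(layered)     # tensor([[1., 1.]])
print(composed)    # tensor([[1., 1.]])
print(activated)   # tensor([[2., 1.]])
```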

2.2 Computing the loss function

tensor([[-1.2000,  0.1200,  4.8000]])

class = [0,1,2]


Being proficient in understanding and calculating loss functions is a very important skill in deep learning.
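
A sketch of the cross-entropy loss for the scores printed above. The true class is assumed to be 2 (the notes do not say which class is correct), and the manual line shows what nn.CrossEntropyLoss computes.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[-1.2, 0.12, 4.8]])   # scores for classes 0, 1, 2
ground_truth = torch.tensor([2])             # assumed true class

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, ground_truth)

# The same value by hand: -log(softmax(logits)[true class])
manual = -torch.log_softmax(logits, dim=1)[0, 2]

print(loss.item(), manual.item())
```

Under this assumption the loss is small (about 0.012), because the score of class 2 is far larger than the other two.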



The score is close to -ln(1/1000) = 6.9. This is not surprising, considering that scores were random and close to each other, so the probability for each class was approximately the same (1/1000) = 0.001.
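
The value 6.9 can be checked with the standard library alone. The sketch uses identical scores, the limiting case of scores that are "close to each other"; the class index 111 is arbitrary.

```python
import math

num_classes = 1000
scores = [0.0] * num_classes   # every class gets the same score
true_class = 111               # arbitrary index, for illustration only

# Cross-entropy = log(sum_j exp(s_j)) - s_true
log_sum_exp = math.log(sum(math.exp(s) for s in scores))
loss = log_sum_exp - scores[true_class]

print(round(loss, 4))          # 6.9078, i.e. -ln(1/1000)
```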


Building an MLP on MNIST.


trainset is the variable name of the dataset.

torch.Size([60000, 28, 28]) torch.Size([10000, 28, 28])
32 32


Each sample is 28x28, a typical shape for image data.


Define the class called Net which inherits from nn.Module

Defining a class is common practice when using PyTorch.
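
A minimal sketch of such a class together with one training step. The hidden size, optimizer and learning rate are assumptions, and a synthetic batch stands in for MNIST so that the example runs without downloading anything.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers are declared once here
        self.fc1 = nn.Linear(28 * 28, 200)   # hidden size 200 is an assumption
        self.fc2 = nn.Linear(200, 10)        # 10 digit classes

    def forward(self, x):
        # and applied in order here
        x = x.view(-1, 28 * 28)              # flatten each image
        x = F.relu(self.fc1(x))
        return self.fc2(x)


model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# One training step on a synthetic batch of 32 images
data = torch.rand(32, 1, 28, 28)
target = torch.randint(0, 10, (32,))

optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
print(output.shape, loss.item())
```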


IndexError                                Traceback (most recent call last)

<ipython-input-65-ee644993ab91> in <module>
      3 criterion = nn.CrossEntropyLoss()
----> 5 for batch_idx, data_target in enumerate(trainloader):
      6     data = data_target[0]
      7     target = data_target[1]

D:\install\miniconda\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
    558         if self.num_workers == 0:  # same-process loading
    559             indices = next(self.sample_iter)  # may raise StopIteration
--> 560             batch = self.collate_fn([self.dataset[i] for i in indices])
    561             if self.pin_memory:
    562                 batch = _utils.pin_memory.pin_memory_batch(batch)

D:\install\miniconda\lib\site-packages\torch\utils\data\dataloader.py in <listcomp>(.0)
    558         if self.num_workers == 0:  # same-process loading
    559             indices = next(self.sample_iter)  # may raise StopIteration
--> 560             batch = self.collate_fn([self.dataset[i] for i in indices])
    561             if self.pin_memory:
    562                 batch = _utils.pin_memory.pin_memory_batch(batch)

D:\install\miniconda\lib\site-packages\torchvision\datasets\mnist.py in __getitem__(self, index)
     94         if self.transform is not None:
---> 95             img = self.transform(img)
     97         if self.target_transform is not None:

D:\install\miniconda\lib\site-packages\torchvision\transforms\transforms.py in __call__(self, img)
     59     def __call__(self, img):
     60         for t in self.transforms:
---> 61             img = t(img)
     62         return img

D:\install\miniconda\lib\site-packages\torchvision\transforms\transforms.py in __call__(self, tensor)
    162             Tensor: Normalized Tensor image.
    163         """
--> 164         return F.normalize(tensor, self.mean, self.std, self.inplace)
    166     def __repr__(self):

D:\install\miniconda\lib\site-packages\torchvision\transforms\functional.py in normalize(tensor, mean, std, inplace)
    206     mean = torch.as_tensor(mean, dtype=torch.float32, device=tensor.device)
    207     std = torch.as_tensor(std, dtype=torch.float32, device=tensor.device)
--> 208     tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
    209     return tensor

IndexError: too many indices for tensor of dimension 0

IndexError                                Traceback (most recent call last)

<ipython-input-68-4abe7f84ec52> in <module>
      2 model.eval()
----> 4 for i, data in enumerate(testloader, 0):
      5     inputs, labels = data

(The remaining frames are identical to those in the traceback above, ending at line 208 of torchvision\transforms\functional.py.)

IndexError: too many indices for tensor of dimension 0
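
The failing line indexes mean with mean[:, None, None], which needs a one-dimensional tensor. torch.as_tensor returns a zero-dimensional tensor for a bare number, so a likely cause (an assumption, since the original transform is not shown) is that Normalize received (0.1307) instead of the one-element tuple (0.1307,). The sketch reproduces the indexing step with plain torch; 0.1307 and 0.3081 are the commonly quoted MNIST statistics and are used here only for illustration.

```python
import torch

image = torch.rand(1, 28, 28)   # stand-in for one MNIST image tensor

# (0.1307) is just a float, so this tensor has zero dimensions
scalar_mean = torch.as_tensor((0.1307), dtype=torch.float32)
try:
    scalar_mean[:, None, None]
    failed = False
except IndexError as err:
    failed = True
    print("IndexError:", err)

# (0.1307,) is a one-element tuple, so the tensor has shape (1,)
mean = torch.as_tensor((0.1307,), dtype=torch.float32)
std = torch.as_tensor((0.3081,), dtype=torch.float32)
normalized = (image - mean[:, None, None]) / std[:, None, None]
print(normalized.shape)
```

With torchvision this would correspond to writing transforms.Normalize((0.1307,), (0.3081,)).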


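Figure: Why convolution is needed

  1. The input variables are not independent of one another.
  2. With too many parameters the model is hard to train, so the dimensionality has to be reduced.
  3. Many parameters also make overfitting likely.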

3.1 Convolution operator

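Elezi (2019) explains the convolution operator fairly clearly, so it is worth studying carefully. There are two programming styles:

  1. OOP
  2. Functional

Figure: Definition of a filter

This explains how a filter operates and how channels are defined.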

Figure: How the weights inside a filter work

A filter is simply a weight matrix, and it serves as a means of reducing dimensionality.

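Figure: Multiple filters

There can of course be more than one filter; that is what out_channels defines.

torch.rand(10, 1, 28, 28) creates ten single-channel (grayscale) images, so in_channels=1. The filter size is (3, 3), so kernel_size=3. Six filters are created, so out_channels=6.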
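
A sketch of the two styles with these settings. The padding=1 variant is an assumption added to show how padding preserves the 28x28 size; the filter weights in the functional version are random placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

images = torch.rand(10, 1, 28, 28)   # 10 grayscale 28x28 images

# OOP style: the layer object owns its weights
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=3)
out_oop = conv(images)
print(out_oop.shape)                 # torch.Size([10, 6, 26, 26])

# padding=1 (an assumption here) keeps the 28x28 spatial size
conv_padded = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=3, padding=1)
out_padded = conv_padded(images)
print(out_padded.shape)              # torch.Size([10, 6, 28, 28])

# Functional style: the filters are passed in explicitly
filters = torch.rand(6, 1, 3, 3)     # (out_channels, in_channels, height, width)
out_functional = F.conv2d(images, filters, padding=1)
print(out_functional.shape)          # torch.Size([10, 6, 28, 28])
```

Without padding, a 3x3 filter shrinks each side by 2 pixels (28 - 3 + 1 = 26).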


Once the filters are defined, training can proceed.

3.3 AlexNet

Figure: AlexNet is implemented with pooling operators

AlexNet, a CNN, is quite complex and uses many techniques to reduce dimensionality.
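
Pooling is one such dimensionality-reduction technique. A minimal sketch on a made-up 4x4 image, using the standard nn.MaxPool2d and nn.AvgPool2d layers:

```python
import torch
import torch.nn as nn

# Values 0..15 laid out as (batch, channels, height, width)
image = torch.arange(16.0).reshape(1, 1, 4, 4)

max_pool = nn.MaxPool2d(kernel_size=2)
avg_pool = nn.AvgPool2d(kernel_size=2)

max_out = max_pool(image)   # values [[5, 7], [13, 15]]
avg_out = avg_pool(image)   # values [[2.5, 4.5], [10.5, 12.5]]

print(max_out)
print(avg_out)
```

Each 2x2 block collapses to one number, so a 4x4 image becomes 2x2.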



This part is the code from the slides, which I find more useful as a reference.

Training a CNN

4 Training modules

4.1 The sequential module

The sequential module resembles Keras, but the layers still have to be fully defined inside __init__.

nn.Sequential is very well designed.
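
A sketch of a small CNN written with nn.Sequential. The channel counts and layer sizes are assumptions for 28x28 single-channel inputs, not the course's exact architecture.

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # The whole feature extractor is declared as one ordered block
        self.features = nn.Sequential(
            nn.Conv2d(1, 5, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),          # 28x28 -> 14x14
            nn.Conv2d(5, 10, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),          # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(7 * 7 * 10, 10)

    def forward(self, x):
        # forward stays short because the ordering lives in __init__
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)


model = Net()
logits = model(torch.rand(4, 1, 28, 28))
print(logits.shape)
```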

4.2 Regularization techniques


  1. Training set: train the model
  2. Validation set: select the model
  3. Testing set: test the model



\[C=-\frac{1}{n} \sum_{x j}\left[y_{j} \ln a_{j}^{L}+\left(1-y_{j}\right) \ln \left(1-a_{j}^{L}\right)\right]+\frac{\lambda}{2 n} \sum_{w} w^{2}\]
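
A sketch of the two terms of this cost for a tiny model with one output. The model, the data and the value of lambda are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Linear(4, 1)
inputs = torch.rand(8, 4)
targets = torch.randint(0, 2, (8, 1)).float()

lam = 0.01                # regularization strength lambda (illustrative value)
n = inputs.shape[0]       # number of training examples

# Data term: binary cross-entropy averaged over the n examples
data_loss = F.binary_cross_entropy_with_logits(model(inputs), targets)

# Penalty term: lambda / (2n) * sum of squared weights (biases excluded)
l2_penalty = lam / (2 * n) * model.weight.pow(2).sum()

cost = data_loss + l2_penalty
print(data_loss.item(), l2_penalty.item(), cost.item())
```

In practice the penalty is usually not written by hand: PyTorch optimizers accept a weight_decay argument, for example torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-3), which applies an L2-style shrinkage to all parameters passed to the optimizer during the update.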


See Srivastava et al. (2014).

Figure: How dropout works
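
A sketch of how nn.Dropout behaves in training and evaluation mode. The drop probability p=0.5 and the all-ones input are chosen only to make the effect easy to see.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dropout = nn.Dropout(p=0.5)
x = torch.ones(1, 1000)

dropout.train()                 # training mode: units are dropped at random
train_out = dropout(x)
# Surviving units are scaled by 1 / (1 - p) = 2 so the expected value is unchanged
print(train_out.unique())

dropout.eval()                  # evaluation mode: dropout does nothing
eval_out = dropout(x)
print(torch.equal(eval_out, x))
```

Calling model.eval() before testing matters for exactly this reason.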


See Ioffe and Szegedy (2015).

Figure: Pseudocode for batch normalization
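
The figure itself is not reproduced in these notes. As a sketch of the normalization step described by Ioffe and Szegedy (2015), the code below normalizes each feature by hand and compares the result with nn.BatchNorm1d in training mode. The batch size and feature count are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 4) * 3 + 5           # batch of 32 samples, 4 features

# Manual version: normalize each feature with the batch mean and variance
eps = 1e-5
mean = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)        # biased (population) variance of the batch
manual = (x - mean) / torch.sqrt(var + eps)

# Library version: the scale starts at 1 and the shift at 0,
# so a freshly created layer should give the same values
bn = nn.BatchNorm1d(num_features=4, eps=eps)
bn.train()
library = bn(x)

matches = torch.allclose(manual, library, atol=1e-4)
print(matches)
```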

4.3 Transfer learning

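As for transfer learning, my impression so far is that these networks are simply too large: there are so many layers to define, and my computer cannot handle training them.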





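Instead of training from scratch, the pre-trained model is retrained on a small dataset. Because the dataset is small, training is fast.

Figure: PyTorch code for the implementation

Only out_channels and the input structure need to be changed.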
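
A minimal sketch of the idea: freeze the pre-trained layers and replace only the last one. DigitNet is a hypothetical stand-in for a model pre-trained on digits (no real weights are loaded here), and the 26 output classes assume a letter-classification task.

```python
import torch
import torch.nn as nn


class DigitNet(nn.Module):
    """Small CNN standing in for a model pre-trained on digits (hypothetical)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),             # 28x28 -> 14x14
        )
        self.fc = nn.Linear(8 * 14 * 14, num_classes)

    def forward(self, x):
        return self.fc(torch.flatten(self.features(x), 1))


model = DigitNet()                       # in practice, load pre-trained weights here

# Freeze every pre-trained parameter
for param in model.parameters():
    param.requires_grad = False

# Replace the last layer: 26 letter classes instead of 10 digits
model.fc = nn.Linear(8 * 14 * 14, 26)

# Only the new layer is trainable, so only it goes to the optimizer
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

output = model(torch.rand(4, 1, 28, 28))
print(trainable, output.shape)
```

Freezing is optional: leaving requires_grad on for all layers fine-tunes the whole network, starting from the pre-trained weights.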

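One piece of the author's advice is sound: when you are not yet familiar with CNNs, find a network that has already been trained and apply transfer learning to it.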

you’ll create a new model using this training set, but the accuracy will be poor. Next, you’ll perform the same training, but you’ll start with the parameters from your digit classifying model. Even though digits and letters are two different classification problems, you’ll see that using information from your previous model will dramatically improve this one.

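One task is letters and the other is digits; in effect, gradient descent starts from the parameters of the pre-trained model.

As expected, the pre-trained model has already been carefully tuned, so the results are better and the accuracy is higher.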

Excellent! By incorporating information from the previously trained model, we are able to get a good model for handwritten digits, even with a small training set!


You already finetuned a net you had pretrained. In practice though, it is very common to finetune CNNs that someone else (typically the library’s developers) have pretrained in ImageNet. Big networks still take a lot of time to be trained on large datasets, and maybe you cannot afford to train a large network on a dataset of 1.2 million images on your laptop.


Instead, you can simply download the network and finetune it on your dataset. That’s what you will do right now. You are going to assume that you have a personal dataset, containing the images from all your last 7 holidays. You want to build a neural network that can classify each image depending on the holiday it comes from. However, since the dataset is so small, you need to use the finetuning technique.



4.4 Certificate of completion


Elezi, Ismail. 2019. “Deep Learning with Pytorch.” DataCamp. 2019. https://www.datacamp.com/courses/deep-learning-with-pytorch.

Ioffe, Sergey, and Christian Szegedy. 2015. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.”

Srivastava, Nitish, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. “Dropout: A Simple Way to Prevent Neural Networks from Overfitting.” Journal of Machine Learning Research 15 (1): 1929–58.