How convolution and pooling output sizes are computed in PyTorch

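In TensorFlow, padding has only two options: valid and same.

In PyTorch, padding is not limited to these two options. It is a number such as 0, 1, 2 or 3, and the default is 0. (Newer PyTorch releases also accept the strings 'valid' and 'same' for convolution layers.)

As a result, the output H and W are computed slightly differently. In TensorFlow the output size is fixed by the padding mode (with same it is the input size divided by the stride, rounded up), so it cannot be chosen freely. In torch.nn the output size can be changed through padding.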
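To make the TensorFlow side concrete, here is a minimal sketch of the output-size rules for the two padding modes along one spatial dimension. The helper name `tf_output_size` is made up for this illustration, and dilation is assumed to be 1.

```python
import math

def tf_output_size(in_size, kernel_size, stride, padding):
    """Output length along one dimension for TensorFlow-style padding.

    Hypothetical helper for illustration; assumes dilation == 1.
    """
    if padding == "same":
        # Pads as needed, so only the input size and the stride matter.
        return math.ceil(in_size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((in_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")

for stride in (1, 2, 3):
    print(stride,
          tf_output_size(28, 3, stride, "same"),
          tf_output_size(28, 3, stride, "valid"))
```

With same, a 28-wide input gives 28, 14 and 10 for strides 1, 2 and 3, whatever the kernel size is.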

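Convolution and pooling layers in nn use the same formula for the H and W dimensions:

H_out = floor((H_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)

W_out is computed the same way from W_in.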
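The shared size rule can be sketched in plain Python. The helper name `torch_output_size` is an assumption made for this example and is not part of PyTorch. It follows the formula given in the PyTorch documentation, floor((in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1), with ceil replacing floor when ceil_mode is set.

```python
def torch_output_size(in_size, kernel_size, stride=1, padding=0,
                      dilation=1, ceil_mode=False):
    """Output length along one dimension for torch.nn conv/pool layers.

    Hypothetical helper for illustration. For pooling layers, pass
    stride=kernel_size to mirror their default. With ceil_mode=True,
    PyTorch additionally drops a final window that would start outside
    the padded input; that corner case is not modelled here.
    """
    span = in_size + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        return -(-span // stride) + 1  # integer ceiling division
    return span // stride + 1          # integer floor division

print(torch_output_size(28, 3))                        # 3x3 conv, no padding
print(torch_output_size(28, 3, padding=1))             # 3x3 conv, padding 1
print(torch_output_size(28, 2, stride=2))              # 2x2 pooling
print(torch_output_size(7, 2, stride=2, ceil_mode=True))
```

Changing only padding from 0 to 1 moves the first result from 26 to 28, which is the flexibility described above.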

class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False):
"""
Parameters:	

    kernel_size – the size of the window to take a max over
    stride – the stride of the window. Default value is kernel_size
    padding – implicit negative infinity padding to be added on both sides. Default: 0
    dilation – a parameter that controls the stride of elements in the window. Default: 1
    return_indices – if True, will return the max indices along with the outputs. Useful when Unpooling later
    ceil_mode – when True, will use ceil instead of floor to compute the output shape (round up instead of round down). Default: False, i.e. round down
"""

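The two layers differ as follows. First, the default stride: for MaxPool2d above it defaults to kernel_size, while for Conv2d below it defaults to 1. Second, the output channels: MaxPool2d keeps the number of output channels equal to the number of input channels, so the number of feature maps is unchanged, while Conv2d changes the number of feature maps to out_channels.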
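Both differences can be seen by running the two layers on the same tensor. This is a minimal sketch that assumes PyTorch is installed; the input shape and layer sizes are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)

# stride is not given, so MaxPool2d falls back to kernel_size (2).
pool = nn.MaxPool2d(kernel_size=2)
# stride is not given, so Conv2d uses its default of 1.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

pooled = pool(x)
convolved = conv(x)

print(pool.stride, tuple(pooled.shape))     # channels stay at 3
print(conv.stride, tuple(convolved.shape))  # channels become out_channels
```

Pooling halves H and W and leaves the 3 channels alone. The convolution keeps H and W (kernel 3, padding 1, stride 1) and produces 16 channels.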

class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True):
        pass
"""
Parameters:	

    in_channels (int) – Number of channels in the input image
    out_channels (int) – Number of channels produced by the convolution
    kernel_size (int or tuple) – Size of the convolving kernel
    stride (int or tuple, optional) – Stride of the convolution. Default: 1
    padding (int or tuple, optional) – Zero-padding added to both sides of the input. Default: 0
    dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1
    groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1
    bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
"""

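The third difference is that convolution has a groups argument: the input feature maps are split among separate convolutions, and the results are then concatenated. Xception makes use of this.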
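The effect of groups can be checked on a small example. This is a minimal sketch that assumes PyTorch is installed; the channel counts (4 in, 8 out) are arbitrary. It prints the weight shape for three settings and rebuilds the groups=2 result by hand from two half-sized convolutions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 4, 8, 8)

dense = nn.Conv2d(4, 8, kernel_size=3, padding=1, groups=1, bias=False)
halves = nn.Conv2d(4, 8, kernel_size=3, padding=1, groups=2, bias=False)
depthwise = nn.Conv2d(4, 8, kernel_size=3, padding=1, groups=4, bias=False)

# weight shape is (out_channels, in_channels // groups, kH, kW)
for name, layer in [("groups=1", dense), ("groups=2", halves),
                    ("groups=4", depthwise)]:
    print(name, tuple(layer.weight.shape), tuple(layer(x).shape))

# groups=2 by hand: each half of the input channels is convolved with
# half of the filters, and the two outputs are concatenated.
left = F.conv2d(x[:, :2], halves.weight[:4], padding=1)
right = F.conv2d(x[:, 2:], halves.weight[4:], padding=1)
manual = torch.cat([left, right], dim=1)
print(torch.allclose(manual, halves(x), atol=1e-5))
```

The output shape is the same in all three cases. Only the number of input channels each filter sees changes, and with it the parameter count.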

"""
At groups=1, all inputs are convolved to all outputs.
At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.
At groups=in_channels, each input channel is convolved with its own set of filters (of size ⌊out_channels / in_channels⌋).
"""