visual_backbone

class Conv2d(*args, **kwargs)[source]#

Bases: Conv2D

forward(x)[source]#

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters:

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments

class CNNBlockBase(in_channels, out_channels, stride)[source]#

Bases: Layer

ResNetBlockBase#

Alias of CNNBlockBase.

class ShapeSpec(channels=None, height=None, width=None, stride=None)[source]#

Bases: _ShapeSpec
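
A minimal sketch of how a record with these four optional fields might be used, assuming it behaves like a named tuple. ShapeSpecSketch is a hypothetical stand-in written for illustration, not the class from this module.

```python
from typing import NamedTuple, Optional


class ShapeSpecSketch(NamedTuple):
    """Hypothetical stand-in mirroring the four documented fields."""

    channels: Optional[int] = None
    height: Optional[int] = None
    width: Optional[int] = None
    stride: Optional[int] = None


# Describe a 3-channel input image whose spatial size is not fixed.
image_spec = ShapeSpecSketch(channels=3)

# Describe a feature map that is 16x smaller than the input.
res4_spec = ShapeSpecSketch(channels=1024, stride=16)
```

Leaving height and width as None keeps the description valid for inputs of any size.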

get_norm(norm, out_channels)[source]#

Parameters:

norm (str or callable) – either one of BN, SyncBN, FrozenBN, GN; or a callable that takes a channel number and returns the normalization layer as a nn.Layer.
out_channels (int) – number of channels passed to the normalization layer

Returns:

the normalization layer

Return type:

nn.Layer or None
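
The dispatch described above can be sketched without any framework dependency. This is an illustration only: get_norm_sketch, make_placeholder and NORM_FACTORIES are hypothetical names, the factories return text labels rather than real layers, and the rule that None or an empty string yields None is an assumption (it is consistent with FPN's default of norm='').

```python
from typing import Callable, Optional, Union


def make_placeholder(kind: str) -> Callable[[int], str]:
    """Return a factory that yields a text label instead of a real layer."""
    return lambda out_channels: f"{kind}({out_channels})"


# Hypothetical registry: the real function would map these names to layer classes.
NORM_FACTORIES = {
    name: make_placeholder(name) for name in ("BN", "SyncBN", "FrozenBN", "GN")
}


def get_norm_sketch(
    norm: Union[str, Callable[[int], object], None], out_channels: int
) -> Optional[object]:
    # Assumption: None or an empty string means "no normalization layer".
    if norm is None or norm == "":
        return None
    if isinstance(norm, str):
        # Unknown names raise KeyError in this sketch.
        return NORM_FACTORIES[norm](out_channels)
    # Otherwise norm is a callable that takes the channel count.
    return norm(out_channels)
```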

class FrozenBatchNorm(num_channels)[source]#

Bases: BatchNorm
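
The page does not describe what this class computes. In general, a frozen batch norm treats its statistics and affine parameters as constants, so it reduces to one scale and one shift per channel. The sketch below shows that arithmetic in plain Python under that assumption; the function names and the eps default are hypothetical.

```python
import math


def frozen_bn_scale_shift(weight, bias, running_mean, running_var, eps=1e-5):
    """Fold fixed batch-norm statistics into one (scale, shift) pair per channel."""
    scale = [w / math.sqrt(v + eps) for w, v in zip(weight, running_var)]
    shift = [b - m * s for b, m, s in zip(bias, running_mean, scale)]
    return scale, shift


def apply_frozen_bn(values, scale, shift):
    """Apply y = x * scale + shift to one value per channel."""
    return [x * s + t for x, s, t in zip(values, scale, shift)]


# Two channels with hand-picked statistics (eps=0 keeps the numbers exact).
scale, shift = frozen_bn_scale_shift(
    weight=[1.0, 2.0],
    bias=[0.0, 1.0],
    running_mean=[0.5, 0.0],
    running_var=[1.0, 4.0],
    eps=0.0,
)
normalized = apply_frozen_bn([1.5, 2.0], scale, shift)
```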

class Backbone[source]#

Bases: Layer

abstract forward(*args)[source]#

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters:

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments

class BasicBlock(in_channels, out_channels, *, stride=1, norm='BN')[source]#

Bases: CNNBlockBase

The basic residual block for ResNet-18 and ResNet-34 defined in the ResNet paper, with two 3x3 conv layers and a projection shortcut if needed.

class BottleneckBlock(in_channels, out_channels, *, bottleneck_channels, stride=1, num_groups=1, norm='BN', stride_in_1x1=False, dilation=1)[source]#

Bases: CNNBlockBase

The standard bottleneck residual block used by ResNet-50, 101 and 152 defined in the ResNet paper. It contains 3 conv layers with kernels 1x1, 3x3, 1x1, and a projection shortcut if needed.
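
The layer layout described above can be written down as plain data. This is a sketch, not the block's implementation: bottleneck_layout is a hypothetical helper, and two details are assumptions, namely that stride_in_1x1 moves the stride from the 3x3 conv to the first 1x1 conv, and that the projection shortcut is needed whenever the channel count changes.

```python
def bottleneck_layout(in_channels, out_channels, *, bottleneck_channels,
                      stride=1, num_groups=1, stride_in_1x1=False, dilation=1):
    """Describe the conv layers of one bottleneck block as plain dicts."""
    # Assumption: the stride sits in the first 1x1 conv if stride_in_1x1,
    # otherwise in the 3x3 conv.
    stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)
    layers = [
        {"name": "conv1", "kernel": 1, "in": in_channels,
         "out": bottleneck_channels, "stride": stride_1x1},
        {"name": "conv2", "kernel": 3, "in": bottleneck_channels,
         "out": bottleneck_channels, "stride": stride_3x3,
         "groups": num_groups, "dilation": dilation},
        {"name": "conv3", "kernel": 1, "in": bottleneck_channels,
         "out": out_channels, "stride": 1},
    ]
    # Assumption: a 1x1 projection shortcut is used when channels differ.
    shortcut = None
    if in_channels != out_channels:
        shortcut = {"name": "shortcut", "kernel": 1, "in": in_channels,
                    "out": out_channels, "stride": stride}
    return layers, shortcut


layers, shortcut = bottleneck_layout(64, 256, bottleneck_channels=64)
```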

forward(x)[source]#

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters:

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments

class DeformBottleneckBlock(in_channels, out_channels, *, bottleneck_channels, stride=1, num_groups=1, norm='BN', stride_in_1x1=False, dilation=1, deform_modulated=False, deform_num_groups=1)[source]#

Bases: CNNBlockBase

Similar to BottleneckBlock, but with deformable conv (see the deformable convolution paper) in the 3x3 convolution.

class BasicStem(in_channels=3, out_channels=64, norm='BN')[source]#

Bases: CNNBlockBase

The standard ResNet stem (layers before the first residual block), with a conv, relu and max_pool.
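
To see how much the stem shrinks its input, the output size can be worked out with the usual window formula. The hyperparameters below (a 7x7 conv with stride 2 and padding 3, then a 3x3 max pool with stride 2 and padding 1) are the ones from the ResNet paper and are assumed here, since this page does not list them; both function names are hypothetical.

```python
def conv_output_size(size, kernel, stride, padding):
    """Standard output-size formula for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_output_size(size):
    # Assumed: 7x7 conv, stride 2, padding 3 (ReLU does not change the size).
    after_conv = conv_output_size(size, kernel=7, stride=2, padding=3)
    # Assumed: 3x3 max pool, stride 2, padding 1.
    return conv_output_size(after_conv, kernel=3, stride=2, padding=1)


side_224 = stem_output_size(224)
side_800 = stem_output_size(800)
```

Under these assumptions the stem reduces each spatial side by a factor of 4.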

forward(x)[source]#

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters:

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments

class ResNet(stem, stages, num_classes=None, out_features=None, freeze_at=0)[source]#

Bases: Backbone

forward(x)[source]#

Parameters:

x – Tensor of shape (N, C, H, W). H and W must be multiples of self.size_divisibility.

Returns:

names and the corresponding features

Return type:

dict[str->Tensor]
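
One common way to meet the divisibility requirement is to pad the image up to the next multiple before calling forward. The helper below only computes the target size; pad_to_multiple is a hypothetical name, and padding (rather than resizing or cropping) is an assumption about how a caller might handle it.

```python
def pad_to_multiple(height, width, size_divisibility):
    """Return the smallest (H, W) >= the input that is a multiple of size_divisibility."""
    if size_divisibility <= 1:
        return height, width

    def round_up(value):
        # Ceiling division, then scale back up to a multiple.
        return -(-value // size_divisibility) * size_divisibility

    return round_up(height), round_up(width)


padded = pad_to_multiple(750, 1333, 32)
```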

static make_stage(block_class, num_blocks, *, in_channels, out_channels, **kwargs)[source]#

Create a list of blocks of the same type that forms one ResNet stage.

Parameters:

block_class (type) – a subclass of CNNBlockBase that’s used to create all blocks in this stage. A module of this type must not change spatial resolution of inputs unless its stride != 1.
num_blocks (int) – number of blocks in this stage
in_channels (int) – input channels of the entire stage.
out_channels (int) – output channels of every block in the stage.
kwargs – other arguments passed to the constructor of block_class. If the argument name is “xx_per_block”, the argument is a list of values to be passed to each block in the stage. Otherwise, the same argument is passed to every block in the stage.

Returns:

a list of block modules.

Return type:

list[CNNBlockBase]

Examples:

# "dilation_per_block" is split into one "dilation" value per block.
stage = ResNet.make_stage(
    BottleneckBlock, 3, in_channels=16, out_channels=64,
    bottleneck_channels=16, num_groups=1,
    stride_per_block=[2, 1, 1],
    dilation_per_block=[1, 1, 2]
)

Usually, layers that produce the same feature map spatial size are defined as one “stage” (in the FPN paper). Under such a definition, stride_per_block[1:] should all be 1.
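
The “xx_per_block” rule from the parameter list above can be sketched in plain Python. This is an illustration under stated assumptions, not the method's source: DummyBlock and make_stage_sketch are hypothetical names, and the sketch assumes every block after the first receives out_channels as its input width.

```python
class DummyBlock:
    """Hypothetical stand-in for a CNNBlockBase subclass; it only records its arguments."""

    def __init__(self, in_channels, out_channels, **kwargs):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kwargs = kwargs


def make_stage_sketch(block_class, num_blocks, *, in_channels, out_channels, **kwargs):
    suffix = "_per_block"
    blocks = []
    for i in range(num_blocks):
        block_kwargs = {}
        for name, value in kwargs.items():
            if name.endswith(suffix):
                # One value per block: strip the suffix and pick the i-th entry.
                if len(value) != num_blocks:
                    raise ValueError(f"{name} needs {num_blocks} values, got {len(value)}")
                block_kwargs[name[: -len(suffix)]] = value[i]
            else:
                # Shared value: passed unchanged to every block.
                block_kwargs[name] = value
        blocks.append(
            block_class(in_channels=in_channels, out_channels=out_channels, **block_kwargs)
        )
        # Assumption: later blocks take the previous block's output as input.
        in_channels = out_channels
    return blocks


stage = make_stage_sketch(
    DummyBlock, 3, in_channels=16, out_channels=64,
    bottleneck_channels=16, stride_per_block=[2, 1, 1],
)
```

Here each block receives a single stride value (2, 1 and 1 in turn), while bottleneck_channels=16 is shared by all three.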

static make_default_stages(depth, block_class=None, **kwargs)[source]#

Create a list of ResNet stages from a pre-defined depth (one of 18, 34, 50, 101, 152). If it doesn’t create the ResNet variant you need, please use make_stage() instead for fine-grained customization.

Parameters:

depth (int) – depth of ResNet
block_class (type) – the CNN block class. Has to accept the bottleneck_channels argument for depth >= 50. By default it is BasicBlock or BottleneckBlock, based on the depth.
kwargs – other arguments to pass to make_stage. Should not contain stride and channels, as they are predefined for each depth.

Returns:

modules in all stages; see the arguments of ResNet.__init__.

Return type:

list[list[CNNBlockBase]]
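
For orientation, the per-depth layout that the ResNet paper defines can be tabulated as below. That this method uses exactly these block counts, channel widths and strides is an assumption, and BLOCKS_PER_STAGE and default_stage_plan are hypothetical names used only in this sketch.

```python
# Blocks per stage for each depth, as listed in the ResNet paper.
BLOCKS_PER_STAGE = {
    18: [2, 2, 2, 2],
    34: [3, 4, 6, 3],
    50: [3, 4, 6, 3],
    101: [3, 4, 23, 3],
    152: [3, 8, 36, 3],
}


def default_stage_plan(depth):
    """Return (num_blocks, in_channels, out_channels, first_stride) for each stage."""
    num_blocks = BLOCKS_PER_STAGE[depth]
    if depth in (18, 34):
        # Basic blocks keep the channel width within a stage.
        in_channels = [64, 64, 128, 256]
        out_channels = [64, 128, 256, 512]
    else:
        # Bottleneck blocks expand the output width by 4x.
        in_channels = [64, 256, 512, 1024]
        out_channels = [256, 512, 1024, 2048]
    # Only the first block of stages 2 to 4 downsamples.
    first_strides = [1, 2, 2, 2]
    return list(zip(num_blocks, in_channels, out_channels, first_strides))


plan_50 = default_stage_plan(50)
```

Because stride and channels are fixed by the depth in this way, they are the arguments that kwargs should not contain.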

class LastLevelMaxPool[source]#

Bases: Layer

This module is used in the original FPN to generate a downsampled P6 feature from P5.
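
A sketch of the downsampling step on a small grid, assuming the pooling uses a 1x1 window with stride 2 (in which case max pooling reduces to keeping every second row and column). downsample_p5 is a hypothetical helper operating on nested lists rather than tensors.

```python
def downsample_p5(feature_map):
    """Keep every second row and column of a 2D grid (list of lists)."""
    # Assumption: 1x1 window, stride 2, no padding.
    return [row[::2] for row in feature_map[::2]]


# A 4x4 grid numbered row by row.
p5 = [[r * 4 + c for c in range(4)] for r in range(4)]
p6 = downsample_p5(p5)
```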

forward(x)[source]#

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters:

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments

class FPN(bottom_up, in_features, out_channels, norm='', top_block=None, fuse_type='sum')[source]#

Bases: Backbone

forward(x)[source]#

Parameters:

x (dict[str->Tensor]) – mapping feature map name (e.g., “res5”) to feature map tensor for each feature level in high to low resolution order.

Returns:

mapping from feature map name to FPN feature map tensor in high to low resolution order. Returned feature names follow the FPN paper convention: “p<stage>”, where stage has stride = 2 ** stage, e.g., [“p2”, “p3”, …, “p6”].

Return type:

dict[str->Tensor]
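
The naming convention above (stride = 2 ** stage) can be checked with a few lines of Python. fpn_output_names is a hypothetical helper, and the handling of the extra level is an assumption: it supposes that a top block such as LastLevelMaxPool adds exactly one coarser level.

```python
import math


def fpn_output_names(bottom_up_strides, has_top_block=False):
    """Map bottom-up feature strides to FPN output names such as "p2"."""
    stages = []
    for stride in bottom_up_strides:
        stage = int(math.log2(stride))
        if 2 ** stage != stride:
            raise ValueError(f"stride {stride} is not a power of two")
        stages.append(stage)
    # Assumption: the top block contributes one extra level at twice the stride.
    if has_top_block:
        stages.append(stages[-1] + 1)
    return [f"p{stage}" for stage in stages]


names = fpn_output_names([4, 8, 16, 32], has_top_block=True)
```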

make_stage(*args, **kwargs)[source]#

Deprecated alias for backward compatibility.

build_resnet_backbone(cfg, input_shape=None)[source]#

Create a ResNet instance from config.

Returns:

a ResNet instance.

Return type:

ResNet

class VisualBackbone(config)[source]#

Bases: Layer

forward(images)[source]#

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters:

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments
