GridSample

Description

Given an input X and a flow-field grid, computes the output Y using X values and pixel locations from the grid.

 

 

For spatial input X with shape (N, C, H, W), the grid will have shape (N, H_out, W_out, 2) and the output Y will have shape (N, C, H_out, W_out). For volumetric input X with shape (N, C, D, H, W), the grid will have shape (N, D_out, H_out, W_out, 3) and the output Y will have shape (N, C, D_out, H_out, W_out). More generally, for an input X of rank r+2 with shape (N, C, d1, d2, …, dr), the grid will have shape (N, D1_out, D2_out, …, Dr_out, r) and the output Y will have shape (N, C, D1_out, D2_out, …, Dr_out).
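The shape rule above can be sketched as a small helper. This is only an illustration: the function name grid_sample_output_shape is hypothetical and is not part of the library.

```python
def grid_sample_output_shape(x_shape, grid_shape):
    """Return the shape of Y given the shapes of X and grid.

    Hypothetical helper that only mirrors the shape rule in the text.
    """
    r = len(x_shape) - 2  # number of spatial dimensions
    if r < 1:
        raise ValueError("X must have rank r+2 with r >= 1")
    if len(grid_shape) != r + 2:
        raise ValueError("grid must have the same rank as X")
    if grid_shape[0] != x_shape[0]:
        raise ValueError("X and grid must share the batch size N")
    if grid_shape[-1] != r:
        raise ValueError("last grid dimension must equal r")
    # Y keeps N and C from X and takes its spatial dims from the grid
    return (x_shape[0], x_shape[1]) + tuple(grid_shape[1:-1])


# Spatial case: X is (N, C, H, W), grid is (N, H_out, W_out, 2)
spatial = grid_sample_output_shape((2, 3, 8, 8), (2, 4, 5, 2))
# Volumetric case: X is (N, C, D, H, W), grid is (N, D_out, H_out, W_out, 3)
volumetric = grid_sample_output_shape((1, 4, 6, 7, 8), (1, 2, 3, 4, 3))
```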

The tensor X contains values at the centers of square pixels (voxels, etc.) at locations such as (n, c, d1_in, d2_in, …, dr_in). The (n, d1_out, d2_out, …, dr_out, :) values from the tensor grid are the normalized positions used to interpolate the values at the (n, c, d1_out, d2_out, …, dr_out) locations of the output tensor Y, using a specified interpolation method (the mode) and a padding mode (for grid positions falling outside the input image or volume).

For example, the values in grid[n, h_out, w_out, :] are size-2 vectors specifying normalized positions in the 2-dimensional space of X. They are used to interpolate output values of Y[n, c, h_out, w_out].

The GridSample operator is often used to implement the grid generator and sampler in Spatial Transformer Networks. See also torch.nn.functional.grid_sample.
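To make the description concrete, here is a minimal NumPy sketch of the 2-dimensional case with linear (bilinear) interpolation and zeros padding. It is not the library's implementation: the function name grid_sample_2d_bilinear is hypothetical, the loops are written for readability rather than speed, and the coordinate mapping assumes the align_corners convention documented for this operator.

```python
import numpy as np


def grid_sample_2d_bilinear(X, grid, align_corners=False):
    """Bilinear sampling of X (N, C, H, W) at grid (N, H_out, W_out, 2).

    Illustrative sketch only; out-of-bounds reads use zeros padding.
    """
    N, C, H, W = X.shape
    _, H_out, W_out, _ = grid.shape
    Y = np.zeros((N, C, H_out, W_out), dtype=np.float64)

    def denorm(coord, size):
        # Map a normalized coordinate in [-1, 1] to a fractional pixel index
        if align_corners:
            return (coord + 1.0) / 2.0 * (size - 1)
        return ((coord + 1.0) * size - 1.0) / 2.0

    def pixel(n, c, iy, ix):
        # zeros padding: anything outside the image contributes 0
        if 0 <= iy < H and 0 <= ix < W:
            return X[n, c, iy, ix]
        return 0.0

    for n in range(N):
        for ho in range(H_out):
            for wo in range(W_out):
                x = denorm(grid[n, ho, wo, 0], W)  # first component: width
                y = denorm(grid[n, ho, wo, 1], H)  # second component: height
                x0, y0 = int(np.floor(x)), int(np.floor(y))
                fx, fy = x - x0, y - y0  # fractional offsets in [0, 1)
                for c in range(C):
                    Y[n, c, ho, wo] = (
                        pixel(n, c, y0, x0) * (1 - fx) * (1 - fy)
                        + pixel(n, c, y0, x0 + 1) * fx * (1 - fy)
                        + pixel(n, c, y0 + 1, x0) * (1 - fx) * fy
                        + pixel(n, c, y0 + 1, x0 + 1) * fx * fy
                    )
    return Y
```

Sampling a 2x2 image at the normalized position (0, 0) with align_corners=False lands exactly between the four pixels, so the result is their average.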

 

Input parameters

 

specified_outputs_name : array, this parameter lets you manually assign custom names to the output tensors of a node.

Graphs in : cluster, ONNX model architecture.

X (heterogeneous) – T1 : object, input tensor of rank r+2 that has shape (N, C, D1, D2, …, Dr), where N is the batch size, C is the number of channels, D1, D2, …, Dr are the spatial dimensions.
grid (heterogeneous) – T2 : object, input offset of shape (N, D1_out, D2_out, …, Dr_out, r), where D1_out, D2_out, …, Dr_out are the spatial dimensions of the grid and output, and r is the number of spatial dimensions. Grid specifies the sampling locations normalized by the input spatial dimensions. Therefore, it should have most values in the range of [-1, 1]. If the grid has values outside the range of [-1, 1], the corresponding outputs will be handled as defined by padding_mode. Following computer vision convention, the coordinates in the length-r location vector are listed from the innermost tensor dimension to the outermost, the opposite of regular tensor indexing.
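The coordinate ordering of grid can be illustrated with a small NumPy sketch that builds a regular 2-dimensional grid. The helper name regular_grid_2d is hypothetical. The point to note is that the last axis stores (x, y), that is, width first and height second, which is the reverse of the (H, W) order of the tensor axes.

```python
import numpy as np


def regular_grid_2d(H_out, W_out):
    """Build a (1, H_out, W_out, 2) grid of evenly spaced positions in [-1, 1].

    Hypothetical helper for illustration; grid[..., 0] is x (width),
    grid[..., 1] is y (height).
    """
    ys = np.linspace(-1.0, 1.0, H_out)
    xs = np.linspace(-1.0, 1.0, W_out)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")  # both have shape (H_out, W_out)
    # Innermost tensor dimension (W) comes first in the location vector
    return np.stack([xx, yy], axis=-1)[None]


grid = regular_grid_2d(2, 3)
top_right = grid[0, 0, 2]    # x = 1 (rightmost column), y = -1 (top row)
bottom_left = grid[0, 1, 0]  # x = -1 (leftmost column), y = 1 (bottom row)
```

By the same rule, a volumetric grid would store (x, y, z) in its last axis, corresponding to the (W, H, D) dimensions of X.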

Parameters : cluster

align_corners : boolean, if align_corners=true, the extrema (-1 and 1) are considered as referring to the center points of the input’s corner pixels (voxels, etc.). If align_corners=false, they are instead considered as referring to the corner points of the input’s corner pixels (voxels, etc.), making the sampling more resolution agnostic.
Default value “False”.
mode : enum, three interpolation modes: linear (default), nearest and cubic. The “linear” mode includes linear and N-linear interpolation modes depending on the number of spatial dimensions of the input tensor (i.e. linear for 1 spatial dimension, bilinear for 2 spatial dimensions, etc.). The “cubic” mode also includes N-cubic interpolation modes following the same rules. The “nearest” mode rounds to the nearest even index when the sampling point falls halfway between two indices.
Default value “linear”.
padding_mode : enum, supported padding modes for grid values outside the input: zeros (default), border, reflection. zeros: use 0 for out-of-bound grid locations. border: use border values for out-of-bound grid locations. reflection: use values at locations reflected by the border for out-of-bound grid locations. If index 0 represents the margin pixel, the reflected value at index -1 will be the same as the value at index 1. A location far away from the border keeps being reflected until it is in bounds. For example, pixel location x = -3.5 is reflected by border -1 and becomes x’ = 1.5, which is then reflected by border 1 and becomes x’’ = 0.5.
Default value “zeros”.
training? : boolean, whether the layer is in training mode (can store data for backward).
Default value “True”.
lda coeff : float, defines the coefficient by which the loss derivative will be multiplied before being sent to the previous layer (since during the backward run we go backwards).
Default value “1”.

name (optional) : string, name of the node.
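The align_corners, mode and padding_mode parameters described above can be illustrated with three small Python helpers. These are sketches under stated assumptions, not the library's code: the names denormalize, nearest_index and reflect are hypothetical, and the borders passed to reflect are supplied by the caller.

```python
import numpy as np


def denormalize(coord, size, align_corners=False):
    """Map a normalized coordinate in [-1, 1] to a fractional pixel index."""
    if align_corners:
        # -1 maps to the center of pixel 0, +1 to the center of pixel size-1
        return (coord + 1.0) / 2.0 * (size - 1)
    # -1 maps to the outer edge of pixel 0 (index -0.5),
    # +1 maps to the outer edge of the last pixel (index size - 0.5)
    return ((coord + 1.0) * size - 1.0) / 2.0


def nearest_index(x):
    """Round to the nearest index, with halfway cases going to the even index."""
    return int(np.rint(x))  # np.rint rounds half to even


def reflect(x, lo, hi):
    """Reflect x off the borders lo and hi until it lies in [lo, hi]."""
    if hi <= lo:
        return lo  # degenerate range: nothing to bounce between
    while x < lo or x > hi:
        x = 2 * lo - x if x < lo else 2 * hi - x
    return x


# Example from the padding_mode description: -3.5 -> 1.5 -> 0.5
reflected = reflect(-3.5, -1, 1)
```

With size = 4, denormalize(-1, 4, True) gives index 0 while denormalize(-1, 4, False) gives -0.5, which shows the difference between the two align_corners conventions.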

Output parameters

 

Y (heterogeneous) – T1 : object, output tensor of rank r+2 that has shape (N, C, D1_out, D2_out, …, Dr_out) of the sampled values. For integer input types, intermediate values are computed as floating point and cast to integer at the end.

Type Constraints

T1 in (tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8)) : Constrain input X and output Y types to all tensor types.

T2 in (tensor(double), tensor(float), tensor(float16)) : Constrain grid types to float tensors.

Example

All these examples are PNG snippets; you can drop a snippet onto the block diagram to add the depicted code to your VI (do not forget to install the Deep Learning library to run it).