Fig. 1 | Heritage Science

Fig. 1

From: Exploring spatiotemporal changes in cities and villages through remote sensing using multibranch networks

Network structure of the proposed perception framework. \({T^{(1)}:2002}\) and \({T^{(2)}:2018}\) indicate the two time phases. \({x^{(1)}}\) and \({x^{(2)}}\) indicate the input remote sensing images. \({f^{Spatial}}\) and \({f^{Temporal}}\) indicate the spatial and temporal information produced by the feature extraction module, which mainly consists of densely connected convolutional networks (DenseNet-121). H, W and C indicate the height, width and number of channels, respectively. \({\gamma ,\varepsilon }\) indicate the subspaces of the temporal and spatial feature maps, with \({(\frac{C}{\gamma })\prime =\frac{C}{8\gamma }}\) and \({(\frac{C}{\varepsilon })\prime = \frac{C}{8\varepsilon }}\). \({y^{(1)}}\) and \({y^{(2)}}\) indicate the output features of the STPM, where STPM denotes the spatiotemporal perception layers. \({\tau _{Total}}\) indicates the total loss of the framework. \({C-CHA}\) indicates the cross-channel attention component, and GNPA indicates the Group-Norm position attention component. \({Conv_{7\times 7} (\cdot )}\) indicates a convolution with a kernel size of \({7\times 7}\), \({MP_{3 \times 3}(\cdot )}\) indicates max pooling with a kernel size of \({3\times 3}\), and \({Conv_{1\times 1}(\cdot )}\) indicates a convolution with a kernel size of \({1\times 1}\). GN indicates the Group-Norm operation. \({\times }\) indicates the element-wise product operation.
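
The shape bookkeeping implied by the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the caption gives the kernel sizes (7×7 convolution, 3×3 max pooling) but not the strides or padding, so the values below are borrowed from the standard DenseNet-121 stem (stride 2 with padding 3, then stride 2 with padding 1). The function names, the 224×224 input, and the example values C = 1024 and γ = 2 are hypothetical and are not taken from the article.

```python
def window_out(size, kernel, stride=1, padding=0):
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_shape(height, width):
    """Spatial size after Conv 7x7 followed by MaxPool 3x3.

    Strides and padding are ASSUMED (standard DenseNet-121 stem);
    the caption only specifies the kernel sizes.
    """
    h = window_out(height, kernel=7, stride=2, padding=3)
    w = window_out(width, kernel=7, stride=2, padding=3)
    h = window_out(h, kernel=3, stride=2, padding=1)
    w = window_out(w, kernel=3, stride=2, padding=1)
    return h, w


def reduced_channels(channels, subspace):
    """Channel count from the caption's relation (C/gamma)' = C / (8 * gamma)."""
    if channels % (8 * subspace) != 0:
        raise ValueError("channels must be divisible by 8 * subspace")
    return channels // (8 * subspace)


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    print(stem_shape(224, 224))
    print(reduced_channels(1024, 2))
```

The same `reduced_channels` relation applies to the spatial subspace \({\varepsilon }\), since the caption defines both reductions with the same factor of 8.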
