-
Vittrup Sanders posted an update 6 days, 5 hours ago
On these premises, we introduce an extension that outperforms conventional convolution on benchmark data. Quantitative experiments are provided on synthetic and benchmark data, showing that the direct encoding hit-or-miss transform provides better interpretability on learned shapes consistent with objects, whereas our morphologically inspired generalized convolution yields higher classification accuracy. Finally, qualitative hit and miss filter visualizations are provided relative to a single morphological layer.

We consider the problem of minimizing the sum of an average of a large number of smooth convex component functions and a possibly nonsmooth convex function that admits a simple proximal mapping. This class of problems arises frequently in machine learning, where it is known as regularized empirical risk minimization (ERM). In this article, we propose mSRGTR-BB, a minibatch proximal stochastic recursive gradient algorithm, which employs a trust-region-like scheme to select stepsizes that are automatically computed by the Barzilai-Borwein method. We prove that mSRGTR-BB converges linearly in expectation for strongly and nonstrongly convex objective functions. With proper parameters, mSRGTR-BB enjoys a faster convergence rate than the state-of-the-art minibatch proximal variant of the semistochastic gradient method (mS2GD). Numerical experiments on standard data sets show that the performance of mSRGTR-BB is comparable to and sometimes even better than mS2GD with best-tuned stepsizes and is superior to some modern proximal stochastic gradient methods.

Snake-like robots move flexibly in complex environments due to their multiple degrees of freedom and various gaits. However, their existing 3-D models are not accurate enough, and most gaits are applicable to special environments only. This work investigates a 3-D model and designs hybrid 3-D gaits. In the proposed 3-D model, a robot is considered as a continuous beam system.
Its normal reaction forces are computed based on the mechanics of materials. To improve the applicability of such robots to different terrains or tasks, this work designs hybrid 3-D gaits by mixing basic gaits in different parts of their bodies. The performance of the hybrid gaits is analyzed based on extensive simulations. These gaits are compared with traditional gaits including lateral undulation, rectilinear, and sidewinding ones. Results of simulations and physical experiments are presented to demonstrate the performance of the proposed model and hybrid gaits of snake-like robots.

The problem of sparse Blind Source Separation (BSS) has been extensively studied when the noise is additive and Gaussian. This is, however, not the case when the measurements follow Poisson or shot noise statistics, which is customary with counting-based measurements. To that end, we introduce a novel sparse BSS algorithm coined pGMCA (poisson-Generalized Morphological Component Analysis) that specifically tackles the blind separation of sparse sources from measurements following Poisson statistics. The proposed algorithm builds upon Nesterov’s smoothing technique to define a smooth approximation of sparse BSS, with a data fidelity term derived from the Poisson likelihood. This allows us to design a block coordinate descent-based minimization procedure with a simple choice of the regularization parameter. Numerical experiments have been carried out that illustrate the robustness of the proposed method with respect to Poisson noise. The pGMCA algorithm has been further evaluated in a realistic astrophysical X-ray imaging setting.

Most existing work that grounds natural language phrases in images starts with the assumption that the phrase in question is relevant to the image. In this paper we address a more realistic version of the natural language grounding task where we must both identify whether the phrase is relevant to an image and localize the phrase.
This can also be viewed as a generalization of object detection to an open-ended vocabulary, introducing elements of few- and zero-shot detection. We propose an approach for this task that extends Faster R-CNN to relate image regions and phrases. By carefully initializing the classification layers of our network using canonical correlation analysis (CCA), we encourage a solution that is more discerning when reasoning between similar phrases, resulting in over double the performance compared to a naive adaptation on three popular phrase grounding datasets, Flickr30K Entities, ReferIt Game, and Visual Genome, with test-time phrase vocabulary sizes of 5K, 32K, and 159K, respectively.

Deep models are commonly treated as black boxes and lack interpretability. Here, we propose a novel approach to interpret deep image classifiers by generating discrete masks. Our method follows the generative adversarial network formalism. The deep model to be interpreted is the discriminator, while we train a generator to explain it. The generator is trained to capture discriminative image regions that should convey the same or similar meaning as the original image from the model’s perspective. It produces a probability map from which a discrete mask can be sampled. Then the discriminator is used to measure the quality of the sampled mask and provide feedback for updating. Due to the sampling operations, the generator cannot be trained directly by back-propagation. We propose to update it using policy gradient. Furthermore, we propose to incorporate gradients as auxiliary information to reduce the search space and facilitate training. We conduct both quantitative and qualitative experiments on the ILSVRC dataset. Experimental results indicate that our method can provide reasonable explanations for predictions and outperforms existing approaches.
In addition, our method can pass the model randomization test, indicating that it is reasoning about the attribution of network predictions.

Non-rigid motion-corrected reconstruction has been proposed to account for the complex motion of the heart in free-breathing 3D coronary magnetic resonance angiography (CMRA). This reconstruction framework requires efficient and accurate estimation of non-rigid motion fields from undersampled images at different respiratory positions (or bins). However, state-of-the-art registration methods can be time-consuming. This article presents a novel unsupervised deep learning-based strategy for fast estimation of inter-bin 3D non-rigid respiratory motion fields for motion-corrected free-breathing CMRA. The proposed 3D respiratory motion estimation network (RespME-net) is trained as a deep encoder-decoder network, taking pairs of 3D image patches extracted from CMRA volumes as input and outputting the motion field between image patches. Using image warping by the estimated motion field, a loss function that imposes image similarity and motion smoothness is adopted to enable training without ground-truth motion fields.
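
To make the last idea concrete, here is a minimal sketch of a warping-based loss of the kind described above. It is not the RespME-net implementation: it works on 2D arrays instead of 3D patches, and the mean-squared-error similarity term, the squared-gradient smoothness term, the names `warp_bilinear` and `unsupervised_loss`, and the `smooth_weight` parameter are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def warp_bilinear(image, flow):
    """Warp a 2D image by a displacement field `flow` of shape (2, H, W).

    flow[0] holds row displacements and flow[1] column displacements.
    Sample positions are clamped to the image border (an assumption).
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    y = np.clip(ys + flow[0], 0, h - 1)
    x = np.clip(xs + flow[1], 0, w - 1)
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = y - y0  # fractional offsets used as interpolation weights
    wx = x - x0
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom


def unsupervised_loss(fixed, moving, flow, smooth_weight=0.1):
    """Image similarity plus motion smoothness; no ground-truth flow needed."""
    warped = warp_bilinear(moving, flow)
    # Similarity term: mean squared error (hypothetical choice of metric).
    similarity = np.mean((warped - fixed) ** 2)
    # Smoothness term: penalize finite differences of the motion field.
    d_rows = np.diff(flow, axis=1)
    d_cols = np.diff(flow, axis=2)
    smoothness = np.mean(d_rows ** 2) + np.mean(d_cols ** 2)
    return similarity + smooth_weight * smoothness
```

In a training setup the flow would be the network output and the loss would be written in a framework with automatic differentiation; the NumPy version only shows how the two terms are put together.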