Mastodawn

Machine Learning Journal May 3, 2023

"On the incompatibility of accuracy and equal opportunity" by Carlos Pinzón, Catuscia Palamidessi, Pablo Piantanida & Frank Valencia (https://rdcu.be/dbe9N)

Machine Learning Journal May 3, 2023

May's here and so are new #MLJ online-first papers: "Faster Riemannian Newton-type optimization by subsampling and cubic regularization" by Yian Deng & Tingting Mu (https://link.springer.com/article/10.1007/s10994-023-06321-0) (OA)

This work is on constrained large-scale non-convex optimization where the constraint set implies a manifold structure. Solving such problems is important in a multitude of fundamental machine learning tasks. Recent advances on Riemannian optimization have enabled the convenient recovery of solutions by adapting unconstrained optimization algorithms over manifolds. However, it remains challenging to scale up and meanwhile maintain stable convergence rates and handle saddle points. We propose a new second-order Riemannian optimization algorithm, aiming at improving convergence rate and reducing computational cost. It enhances the Riemannian trust-region algorithm that explores curvature information to escape saddle points through a mixture of subsampling and cubic regularization techniques. We conduct rigorous analysis to study the convergence behavior of the proposed algorithm. We also perform extensive experiments to evaluate it based on two general machine learning tasks using multiple datasets. The proposed algorithm exhibits improved computational speed, e.g., a speed improvement from $12\%$ to $227\%$, and improved convergence behavior, e.g., an iteration number reduction from $\mathcal{O}\left(\max\left(\epsilon_g^{-2}\epsilon_H^{-1},\epsilon_H^{-3}\right)\right)$ to $\mathcal{O}\left(\max\left(\epsilon_g^{-2},\epsilon_H^{-3}\right)\right)$, compared to a large set of state-of-the-art Riemannian optimization algorithms.

Machine Learning Journal May 2, 2023

An #MLJ online-first #NewPaper on a new data set: "ROAD-R: the autonomous driving dataset with logical requirements" by Eleonora Giunchiglia, Mihaela Cătălina Stoian, Salman Khan, Fabio Cuzzolin & Thomas Lukasiewicz (https://link.springer.com/article/10.1007/s10994-023-06322-z)

Neural networks have proven to be very powerful at computer vision tasks. However, they often exhibit unexpected behaviors, acting against background knowledge about the problem at hand. This calls for models (i) able to learn from requirements expressing such background knowledge, and (ii) guaranteed to be compliant with the requirements themselves. Unfortunately, the development of such models is hampered by the lack of real-world datasets equipped with formally specified requirements. In this paper, we introduce the ROad event Awareness Dataset with logical Requirements (ROAD-R), the first publicly available dataset for autonomous driving with requirements expressed as logical constraints. Given ROAD-R, we show that current state-of-the-art models often violate its logical constraints, and that it is possible to exploit them to create models that (i) have a better performance, and (ii) are guaranteed to be compliant with the requirements themselves.
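
To make the idea of a model "violating logical constraints" concrete, here is a minimal sketch, not the ROAD-R authors' code. It assumes requirements are written as clauses (disjunctions of positive or negated labels) and that a model outputs one score per label, thresholded at 0.5. The label names `RedTL` and `GreenTL`, the helper `violated_clauses`, and the example requirement are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A literal pairs a label with the truth value it must take:
# ("RedTL", False) reads "not RedTL". Label names here are hypothetical.
Literal = Tuple[str, bool]
Clause = List[Literal]  # a clause is a disjunction (OR) of literals


def violated_clauses(
    clauses: List[Clause], scores: Dict[str, float], threshold: float = 0.5
) -> List[Clause]:
    """Return the clauses that the thresholded predictions fail to satisfy."""
    predicted = {label: score > threshold for label, score in scores.items()}
    return [
        clause
        for clause in clauses
        if not any(predicted[label] == wanted for label, wanted in clause)
    ]


# Assumed requirement: a traffic light cannot be red and green at once,
# written as the clause (not RedTL) OR (not GreenTL).
requirements: List[Clause] = [[("RedTL", False), ("GreenTL", False)]]

bad = violated_clauses(requirements, {"RedTL": 0.9, "GreenTL": 0.8})
ok = violated_clauses(requirements, {"RedTL": 0.9, "GreenTL": 0.2})
print(len(bad), len(ok))  # prints "1 0"
```

A check like this only detects violations after the fact. Guaranteeing compliance, as the abstract describes, would require building the constraints into training or into the output layer.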

Machine Learning Journal Apr 25, 2023

#MLJ online-first #NewPaper: "Data driven discovery of systems of ordinary differential equations using nonconvex multitask learning" by Clément Lejeune, Josiane Mothe, Adil Soubki & Olivier Teste (https://rdcu.be/daInh)

Machine Learning Journal Apr 12, 2023

Hey, scientifically-inclined followers! We're not the only interesting ML/DM journal out there.
If you haven't yet, you should go and follow @DAMI_Social so you won't miss the interesting work published over there.

Machine Learning Journal Apr 12, 2023

"Interpreting machine-learning models in transformed feature space with an application to remote-sensing classification" by @geobrenning (https://link.springer.com/article/10.1007/s10994-023-06327-8) (OA)

Model-agnostic tools for the post-hoc interpretation of machine-learning models struggle to summarize the joint effects of strongly dependent features in high-dimensional feature spaces, which play an important role in semantic image classification, for example in remote sensing of landcover. This contribution proposes a novel approach that interprets machine-learning models through the lens of feature-space transformations. It can be used to enhance unconditional as well as conditional post-hoc diagnostic tools including partial-dependence plots, accumulated local effects (ALE) plots, permutation feature importance, or Shapley additive explanations (SHAP). While the approach can also be applied to nonlinear transformations, linear ones are particularly appealing, especially principal component analysis (PCA) and a proposed partial orthogonalization technique. Moreover, structured PCA and model diagnostics along user-defined synthetic features offer opportunities for representing domain knowledge. The new approach is implemented in the R package wiml, which can be combined with existing explainable machine-learning packages. A case study on remote-sensing landcover classification with 46 features is used to demonstrate the potential of the proposed approach for model interpretation by domain experts. It is most useful in situations where groups of features are linearly dependent and PCA can provide meaningful multivariate data summaries.

Machine Learning Journal Apr 12, 2023

Two #MLJ online-first #NewPaper|s on understanding models for image classification today: "Understanding CNN fragility when learning with imbalanced data" by Damien Dablain, Kristen N. Jacobson, Colin Bellinger, Mark Roberts & Nitesh V. Chawla (https://link.springer.com/article/10.1007/s10994-023-06326-9) (OA)

Convolutional neural networks (CNNs) have achieved impressive results on imbalanced image data, but they still have difficulty generalizing to minority classes and their decisions are difficult to interpret. These problems are related because the method by which CNNs generalize to minority classes, which requires improvement, is wrapped in a black box. To demystify CNN decisions on imbalanced data, we focus on their latent features. Although CNNs embed the pattern knowledge learned from a training set in model parameters, the effect of this knowledge is contained in feature and classification embeddings (FE and CE). These embeddings can be extracted from a trained model and their global, class properties (e.g., frequency, magnitude and identity) can be analyzed. We find that important information regarding the ability of a neural network to generalize to minority classes resides in the class top-K CE and FE. We show that a CNN learns a limited number of class top-K CE per category, and that their magnitudes vary based on whether the same class is balanced or imbalanced. We hypothesize that latent class diversity is as important as the number of class examples, which has important implications for re-sampling and cost-sensitive methods. These methods generally focus on rebalancing model weights, class numbers and margins, instead of diversifying class latent features. We also demonstrate that a CNN has difficulty generalizing to test data if the magnitude of its top-K latent features does not match the training set. We use three popular image datasets and two cost-sensitive algorithms commonly employed in imbalanced learning for our experiments.

Machine Learning Journal Apr 9, 2023

Good Friday #MLJ online-first paper: "An accelerated proximal algorithm for regularized nonconvex and nonsmooth bi-level optimization" by Ziyi Chen, Bhavya Kailkhura & Yi Zhou (https://rdcu.be/c9tDx)

Machine Learning Journal Apr 7, 2023

New #MLJ online-first paper: "Robust matrix estimations meet Frank–Wolfe algorithm" by Naimin Jing, Ethan X. Fang & Cheng Yong Tang (https://rdcu.be/c9mY5)

Machine Learning Journal Apr 4, 2023

Another #MLJ online-first #NewPaper dropped yesterday: "Domain adversarial neural networks for domain generalization: when it works and how to improve" by Anthony Sicilia, Xingchen Zhao & Seong Jae Hwang (https://link.springer.com/article/10.1007/s10994-023-06324-x) (OA)

Theoretically, domain adaptation is a well-researched problem. Further, this theory has been well-used in practice. In particular, we note the bound on target error given by Ben-David et al. (Mach Learn 79(1–2):151–175, 2010) and the well-known domain-aligning algorithm based on this work using Domain Adversarial Neural Networks (DANN) presented by Ganin and Lempitsky (in International conference on machine learning, pp 1180–1189). Recently, multiple variants of DANN have been proposed for the related problem of domain generalization, but without much discussion of the original motivating bound. In this paper, we investigate the validity of DANN in domain generalization from this perspective. We investigate conditions under which application of DANN makes sense and further consider DANN as a dynamic process during training. Our investigation suggests that the application of DANN to domain generalization may not be as straightforward as it seems. To address this, we design an algorithmic extension to DANN in the domain generalization case. Our experimentation validates both theory and algorithm.
