Parametrization: PyTest vs RobotFramework

This article offers a hands-on comparison of two popular test-automation tools: PyTest and RobotFramework. There is already a good general comparison of these frameworks on Habr; here I focus on how easy it is to parametrize tests in each.
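
To make the comparison concrete, here is what data-driven parametrization looks like in pytest (a minimal sketch; the `add` function and its test cases are made up for illustration, and Robot Framework's closest equivalent is the `[Template]` setting):

```python
import pytest

def add(a, b):
    # Hypothetical function under test, used only for illustration.
    return a + b

# One decorator line turns a single test into three test cases.
@pytest.mark.parametrize("a, b, expected", [
    (1, 2, 3),
    (0, 0, 0),
    (-1, 1, 0),
])
def test_add(a, b, expected):
    assert add(a, b) == expected
```

Running `pytest -v` on this file reports each tuple as its own test case.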

https://habr.com/ru/companies/beget/articles/902456/

#robotframework #pytest #testautomation #testing #parametrization

Parametrization of surfaces

Applying matrix diagonalisation in the classroom with #GeoGebra: parametrising the intersection of a sphere and plane

In collaboration with Bradley Welch

https://www.tandfonline.com/doi/full/10.1080/0020739X.2023.2233513
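
Not the paper's GeoGebra construction, but the underlying math can be sketched numerically: the intersection of the sphere |x| = r with the plane n·x = d is a circle, which can be parametrized using an orthonormal basis of the plane (all concrete values below are arbitrary illustration choices):

```python
import numpy as np

r, n, d = 2.0, np.array([1.0, 1.0, 1.0]), 1.5   # sphere radius, plane n.x = d

n_hat = n / np.linalg.norm(n)
center = (d / np.linalg.norm(n)) * n_hat          # foot of the perpendicular
rho = np.sqrt(r**2 - (d / np.linalg.norm(n))**2)  # radius of the circle

# Orthonormal basis (u, v) of the plane direction (assumes n is not
# parallel to the x-axis, so the cross product is nonzero).
u = np.cross(n_hat, [1.0, 0.0, 0.0])
u = u / np.linalg.norm(u)
v = np.cross(n_hat, u)

def point(t):
    # Parametrization: every t gives a point on both the sphere and the plane.
    return center + rho * (np.cos(t) * u + np.sin(t) * v)

for t in np.linspace(0.0, 2 * np.pi, 5):
    p = point(t)
    print(np.linalg.norm(p), n @ p)  # stays at r and d for every t
```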

#dynamic #geometric #systems #linear #algebra #vectors #parametrization

Pruning neural networks using Bayesian inference
https://arxiv.org/abs/2308.02451

* neural network (NN) pruning is highly effective at reducing the computational & memory demands of large NNs
* novel approach using Bayesian inference that integrates seamlessly into the training procedure
* leverages the posterior probabilities of the NN before and after pruning, enabling the calculation of Bayes factors
* achieves the desired sparsity while maintaining competitive accuracy

#ML #MachineLearning #parametrization #pruning #BayesianPruning #NeuralNetworks

Pruning a neural network using Bayesian inference

Neural network pruning is a highly effective technique aimed at reducing the computational and memory demands of large neural networks. In this research paper, we present a novel approach to pruning neural networks utilizing Bayesian inference, which can seamlessly integrate into the training procedure. Our proposed method leverages the posterior probabilities of the neural network prior to and following pruning, enabling the calculation of Bayes factors. The calculated Bayes factors guide the iterative pruning. Through comprehensive evaluations conducted on multiple benchmarks, we demonstrate that our method achieves desired levels of sparsity while maintaining competitive accuracy.
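
A drastically simplified sketch of the idea, not the paper's actual procedure: for a toy linear model with a Gaussian prior, compare an approximate log-posterior before and after zeroing each weight, and prune when the log Bayes factor says the pruned model is nearly as plausible (the threshold of -3 is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.5, 0.0, -2.0])          # middle weight is irrelevant
y = X @ true_w + 0.1 * rng.normal(size=100)

def log_posterior(w):
    # Gaussian likelihood (noise var 0.01) + Gaussian prior, up to a constant.
    log_lik = -0.5 * np.sum((y - X @ w) ** 2) / 0.01
    log_prior = -0.5 * np.sum(w ** 2)
    return log_lik + log_prior

w = np.linalg.lstsq(X, y, rcond=None)[0]     # "trained" weights
for i in range(len(w)):
    w_pruned = w.copy()
    w_pruned[i] = 0.0
    # Crude point-estimate log Bayes factor: pruned vs. full model.
    log_bf = log_posterior(w_pruned) - log_posterior(w)
    if log_bf > -3.0:        # pruning barely changes the posterior
        w[i] = 0.0           # keep it pruned

print(w)  # the near-zero middle weight is the pruning candidate
```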

arXiv.org

On the curvature of the loss landscape
https://arxiv.org/abs/2307.04719

A main challenge in modern deep learning is to understand why such over-parameterized models perform so well when trained on finite data ... we consider the loss landscape as an embedded Riemannian manifold ... we focus on the scalar curvature, which can be computed analytically for our manifold ...

Manifolds: https://en.wikipedia.org/wiki/Manifold
...

#MachineLearning #NeuralNetworks #manifolds #parametrization #DeepLearning #LossFunctions

On the curvature of the loss landscape

One of the main challenges in modern deep learning is to understand why such over-parameterized models perform so well when trained on finite data. A way to analyze this generalization concept is through the properties of the associated loss landscape. In this work, we consider the loss landscape as an embedded Riemannian manifold and show that the differential geometric properties of the manifold can be used when analyzing the generalization abilities of a deep net. In particular, we focus on the scalar curvature, which can be computed analytically for our manifold, and show connections to several settings that potentially imply generalization.
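
As a toy illustration of the geometry involved (not the paper's construction): for a two-parameter loss L(w1, w2) viewed as a surface embedded in R^3, the scalar curvature is twice the Gaussian curvature, which has a closed form in the first and second partial derivatives of L:

```python
import numpy as np

def loss(w1, w2):
    # Toy quadratic loss, used purely for illustration.
    return w1 ** 2 + 0.5 * w2 ** 2

def gaussian_curvature(f, w1, w2, h=1e-4):
    # Central finite differences for first and second derivatives,
    # then K = (f_xx f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2.
    fx = (f(w1 + h, w2) - f(w1 - h, w2)) / (2 * h)
    fy = (f(w1, w2 + h) - f(w1, w2 - h)) / (2 * h)
    fxx = (f(w1 + h, w2) - 2 * f(w1, w2) + f(w1 - h, w2)) / h ** 2
    fyy = (f(w1, w2 + h) - 2 * f(w1, w2) + f(w1, w2 - h)) / h ** 2
    fxy = (f(w1 + h, w2 + h) - f(w1 + h, w2 - h)
           - f(w1 - h, w2 + h) + f(w1 - h, w2 - h)) / (4 * h ** 2)
    return (fxx * fyy - fxy ** 2) / (1 + fx ** 2 + fy ** 2) ** 2

K = gaussian_curvature(loss, 0.0, 0.0)
print(K)  # at the minimum: f_xx=2, f_yy=1, f_xy=0, zero gradient -> K = 2
```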

Extending the Forward Forward Algorithm
https://arxiv.org/abs/2307.04205

The Forward Forward algorithm (Geoffrey Hinton, 2022-11) is an alternative to backpropagation for training neural networks (NN)

Backpropagation - the most successful and widely used optimization algorithm for training NNs - has 3 important limitations ...

Hinton's paper: https://www.cs.toronto.edu/~hinton/FFA13.pdf
Discussion: https://bdtechtalks.com/2022/12/19/forward-forward-algorithm-geoffrey-hinton
...

#GeoffHinton #ForwardPropagation #NeuralNetworks #parametrization #Backpropagation #LossFunction

Extending the Forward Forward Algorithm

The Forward Forward algorithm, proposed by Geoffrey Hinton in November 2022, is a novel method for training neural networks as an alternative to backpropagation. In this project, we replicate Hinton's experiments on the MNIST dataset, and subsequently extend the scope of the method with two significant contributions. First, we establish a baseline performance for the Forward Forward network on the IMDb movie reviews dataset. As far as we know, our results on this sentiment analysis task mark the first instance of the algorithm's extension beyond computer vision. Second, we introduce a novel pyramidal optimization strategy for the loss threshold - a hyperparameter specific to the Forward Forward method. Our pyramidal approach shows that a good thresholding strategy causes a difference of up to 8% in test error. Lastly, we perform visualizations of the trained parameters and derive several significant insights, such as a notably larger (10-20x) mean and variance in the weights acquired by the Forward Forward network.
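
The local, layer-wise update at the heart of Forward Forward can be sketched roughly as follows: each layer is trained to produce high "goodness" (sum of squared activations) on positive data and low goodness on negative data, with no backward pass through other layers. This is a toy single-layer numpy illustration, not the paper's exact setup; the data vectors and hyperparameters are stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(4, 8))   # one layer: 4 inputs -> 8 units
theta, lr = 2.0, 0.05                     # goodness threshold, step size

def forward(x):
    return np.maximum(0.0, x @ W)         # ReLU activations

def goodness(x):
    return np.sum(forward(x) ** 2)

def layer_step(x, positive):
    # Local gradient ascent on log p (positive) or log(1 - p) (negative),
    # where p = sigmoid(goodness - theta). No backpropagation involved.
    global W
    h = forward(x)
    p = 1.0 / (1.0 + np.exp(theta - np.sum(h ** 2)))
    sign = 1.0 if positive else -1.0
    grad = sign * ((1.0 - p) if positive else p) * 2.0 * np.outer(x, h)
    W += lr * grad

x_pos = np.ones(4)                        # stand-ins for real/corrupted data
x_neg = -np.ones(4)
g0 = goodness(x_pos)
for _ in range(100):
    layer_step(x_pos, True)
    layer_step(x_neg, False)
print(goodness(x_pos) > g0)  # expect True: positive goodness increases
```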

Loss Functions and Metrics in Deep Learning. A Review
https://arxiv.org/abs/2307.02694

One of the essential components of deep learning is the choice of the loss function and performance metrics used to train and evaluate models.

This paper reviews the most prevalent loss functions and performance measurements in deep learning.
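
The loss-vs-metric distinction the review is organized around fits in a few lines: the loss (here cross-entropy) is the differentiable quantity being optimized, while the metric (here accuracy) is what gets reported; the logits and labels below are made up:

```python
import numpy as np

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.2, 0.3]])
labels = np.array([0, 1])

def softmax(z):
    # Numerically stable softmax over the class axis.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

probs = softmax(logits)
# Loss: differentiable, used for training.
cross_entropy = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
# Metric: non-differentiable, used for evaluation.
accuracy = np.mean(probs.argmax(axis=1) == labels)
print(cross_entropy, accuracy)  # accuracy is already 1.0; the loss is still > 0
```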

#MachineLearning #parametrization #LossFunctions #CostFunctions #NeuralNetworks #optimization #MathematicalOptimization

Loss Functions and Metrics in Deep Learning

This paper presents a comprehensive review of loss functions and performance metrics in deep learning, highlighting key developments and practical insights across diverse application areas. We begin by outlining fundamental considerations in classic tasks such as regression and classification, then extend our analysis to specialized domains like computer vision and natural language processing including retrieval-augmented generation. In each setting, we systematically examine how different loss functions and evaluation metrics can be paired to address task-specific challenges such as class imbalance, outliers, and sequence-level optimization. Key contributions of this work include: (1) a unified framework for understanding how losses and metrics align with different learning objectives, (2) an in-depth discussion of multi-loss setups that balance competing goals, and (3) new insights into specialized metrics used to evaluate modern applications like retrieval-augmented generation, where faithfulness and context relevance are pivotal. Along the way, we highlight best practices for selecting or combining losses and metrics based on empirical behaviors and domain constraints. Finally, we identify open problems and promising directions, including the automation of loss-function search and the development of robust, interpretable evaluation measures for increasingly complex deep learning tasks. Our review aims to equip researchers and practitioners with clearer guidance in designing effective training pipelines and reliable model assessments for a wide spectrum of real-world applications.

Pruning vs Quantization: Which is Better?
https://arxiv.org/abs/2307.02973

* Pruning removes weights, reducing the memory footprint
* Quantization (4-bit or 8-bit matrix multiplication, ...) reduces the bit-width used for both the weights and the computation in neural networks, leading to predictable memory savings and reductions in the necessary compute

In most cases quantization outperforms pruning.
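
A toy numerical illustration of the trade-off (random Gaussian weights, magnitude pruning to 25% density vs. 4-bit uniform quantization; not the paper's experimental setup):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=1000)

def prune(w, keep_ratio):
    # Magnitude pruning: keep the largest |w|, zero out the rest.
    k = int(len(w) * keep_ratio)
    thresh = np.sort(np.abs(w))[-k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

def quantize(w, bits):
    # Uniform symmetric quantization to 2^(bits-1)-1 signed levels.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

err_prune = np.mean((w - prune(w, 0.25)) ** 2)   # keep 1 in 4 weights
err_quant = np.mean((w - quantize(w, 4)) ** 2)   # INT4-style rounding
print(err_prune, err_quant)  # quantization error is far smaller here
```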

#MachineLearning #parametrization #weights #pruning #quantization #NeuralNetworks #MathematicalPrecision #matrices

Pruning vs Quantization: Which is Better?

Neural network pruning and quantization techniques are almost as old as neural networks themselves. However, to date only ad-hoc comparisons between the two have been published. In this paper, we set out to answer the question on which is better: neural network quantization or pruning? By answering this question, we hope to inform design decisions made on neural network hardware going forward. We provide an extensive comparison between the two techniques for compressing deep neural networks. First, we give an analytical comparison of expected quantization and pruning error for general data distributions. Then, we provide lower bounds for the per-layer pruning and quantization error in trained networks, and compare these to empirical error after optimization. Finally, we provide an extensive experimental comparison for training 8 large-scale models on 3 tasks. Our results show that in most cases quantization outperforms pruning. Only in some scenarios with very high compression ratio, pruning might be beneficial from an accuracy standpoint.

The bigger-is-better approach to AI is running out of road

If AI is to keep getting better, it will have to do more with less

The Economist

Training Transformers with 4-bit Integers
https://arxiv.org/abs/2306.11987

... we propose a training method for transformers with matrix multiplications implemented with the INT4 arithmetic. Training with an ultra-low INT4 precision is challenging ... we carefully analyze the specific structures of activation & gradients in transformers to propose dedicated quantizers for them. For forward propagation, we identify ...
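
A bare-bones sketch of what an INT4 matrix multiplication means (plain uniform quantization only; the paper's actual method adds Hadamard quantizers for outliers and bit splitting for gradients, which are not shown here):

```python
import numpy as np

def quant_int4(x):
    # Symmetric per-tensor quantization to signed 4-bit range [-7, 7].
    scale = np.abs(x).max() / 7.0
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int32)
    return q, scale

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 16))
B = rng.normal(size=(16, 4))

qa, sa = quant_int4(A)
qb, sb = quant_int4(B)
y_int4 = (qa @ qb).astype(np.float64) * (sa * sb)  # integer matmul + rescale
y_fp = A @ B

rel_err = np.linalg.norm(y_int4 - y_fp) / np.linalg.norm(y_fp)
print(rel_err)  # coarse, but the result tracks the FP product
```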

#ML #MachineLearning #parametrization #INT4 #NeuralNetworks #transformers #matrices

Training Transformers with 4-bit Integers

Quantizing the activation, weight, and gradient to 4-bit is promising to accelerate neural network training. However, existing 4-bit training methods require custom numerical formats which are not supported by contemporary hardware. In this work, we propose a training method for transformers with all matrix multiplications implemented with the INT4 arithmetic. Training with an ultra-low INT4 precision is challenging. To achieve this, we carefully analyze the specific structures of activation and gradients in transformers to propose dedicated quantizers for them. For forward propagation, we identify the challenge of outliers and propose a Hadamard quantizer to suppress the outliers. For backpropagation, we leverage the structural sparsity of gradients by proposing bit splitting and leverage score sampling techniques to quantize gradients accurately. Our algorithm achieves competitive accuracy on a wide range of tasks including natural language understanding, machine translation, and image classification. Unlike previous 4-bit training methods, our algorithm can be implemented on the current generation of GPUs. Our prototypical linear operator implementation is up to 2.2 times faster than the FP16 counterparts and speeds up the training by up to 35.1%.
