Parametrization: PyTest vs RobotFramework
In this article you will find a side-by-side comparison of two popular test-automation tools: PyTest and RobotFramework. Habr already has a good article comparing these frameworks in general; here I focus on how simple test parametrization is in each.
https://habr.com/ru/companies/beget/articles/902456/
#robotframework #pytest #testautomation #testing #parametrization
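For readers who have not used it, PyTest's parametrization fits in a single decorator. A minimal sketch (the `add` function is a made-up example, not from the linked article):

```python
import pytest

# Hypothetical function under test
def add(a, b):
    return a + b

# One decorator generates three independent test cases
@pytest.mark.parametrize(
    "a, b, expected",
    [
        (1, 2, 3),
        (-1, 1, 0),
        (0, 0, 0),
    ],
)
def test_add(a, b, expected):
    assert add(a, b) == expected
```

RobotFramework's closest equivalent is its Test Template mechanism for data-driven tests; the linked article compares the two approaches in detail.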

Parametrization of surfaces
Applying matrix diagonalisation in the classroom with #GeoGebra: parametrising the intersection of a sphere and plane
In collaboration with Bradley Welch
https://www.tandfonline.com/doi/full/10.1080/0020739X.2023.2233513
#dynamic #geometric #systems #linear #algebra #vectors #parametrization
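The construction behind the linked paper can be checked numerically: the intersection of the sphere |x| = R with the plane n·x = d is a circle, which can be parametrised explicitly. A toy NumPy sketch (not the paper's GeoGebra workflow):

```python
import numpy as np

def circle_of_intersection(R, n, d, ts):
    """Parametrise the circle where the sphere |x| = R meets the plane n.x = d."""
    n = np.asarray(n, dtype=float)
    n_hat = n / np.linalg.norm(n)
    d_hat = d / np.linalg.norm(n)          # signed distance of the plane from the origin
    c = d_hat * n_hat                      # centre of the circle
    r = np.sqrt(R**2 - d_hat**2)           # radius of the circle (requires |d_hat| < R)
    # Build an orthonormal basis (u, v) of the plane direction, perpendicular to n_hat
    a = np.array([1.0, 0.0, 0.0])
    if abs(n_hat[0]) > 0.9:                # avoid a nearly parallel helper vector
        a = np.array([0.0, 1.0, 0.0])
    u = np.cross(n_hat, a); u /= np.linalg.norm(u)
    v = np.cross(n_hat, u)
    ts = np.asarray(ts, dtype=float)
    return c + r * (np.cos(ts)[:, None] * u + np.sin(ts)[:, None] * v)

pts = circle_of_intersection(R=2.0, n=[1, 1, 1], d=1.5, ts=np.linspace(0, 2 * np.pi, 8))
# Every sampled point lies on both the sphere and the plane
print(np.allclose(np.linalg.norm(pts, axis=1), 2.0))        # True
print(np.allclose(pts @ np.array([1.0, 1.0, 1.0]), 1.5))    # True
```

Since c is parallel to the normal and u, v are perpendicular to it, |c + r(cos t · u + sin t · v)|² = d_hat² + r² = R², which is exactly the diagonalisation-style decomposition the paper exploits.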
Pruning neural networks using Bayesian inference
https://arxiv.org/abs/2308.02451
* neural network (NN) pruning is highly effective at reducing the computational and memory demands of large NNs
* a novel approach using Bayesian inference that integrates seamlessly into the training procedure
* leverages the posterior probabilities of the NN before and after pruning, enabling the calculation of Bayes factors
* achieves the desired sparsity while maintaining competitive accuracy
#ML #MachineLearning #parametrization #pruning #BayesianPruning #NeuralNetworks

Pruning a neural network using Bayesian inference
Neural network pruning is a highly effective technique aimed at reducing the
computational and memory demands of large neural networks. In this research
paper, we present a novel approach to pruning neural networks utilizing
Bayesian inference, which can seamlessly integrate into the training procedure.
Our proposed method leverages the posterior probabilities of the neural network
prior to and following pruning, enabling the calculation of Bayes factors. The
calculated Bayes factors guide the iterative pruning. Through comprehensive
evaluations conducted on multiple benchmarks, we demonstrate that our method
achieves desired levels of sparsity while maintaining competitive accuracy.
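The core idea - compare the evidence for "this weight is zero" against "this weight is nonzero" via a Bayes factor - can be illustrated on a one-weight conjugate model, where the Savage-Dickey ratio gives the Bayes factor in closed form. This is a toy with noise-free data for determinism, not the paper's algorithm:

```python
import numpy as np

def gauss_pdf(x, mean, var):
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def bayes_factor_for_zero(x, y, sigma2=1.0, tau2=1.0):
    """Savage-Dickey Bayes factor BF01 for H0: w = 0 in the model y = w*x + noise.

    Gaussian prior w ~ N(0, tau2), Gaussian noise with known variance sigma2.
    BF01 = posterior density at 0 / prior density at 0;
    BF01 > 1 means the data favour pruning the weight to zero.
    """
    post_var = 1.0 / (1.0 / tau2 + np.sum(x * x) / sigma2)
    post_mean = post_var * np.sum(x * y) / sigma2
    return gauss_pdf(0.0, post_mean, post_var) / gauss_pdf(0.0, 0.0, tau2)

x = np.linspace(-1.0, 1.0, 200)
y_signal = 0.8 * x            # weight genuinely nonzero -> keep
y_noise = np.zeros_like(x)    # weight genuinely zero    -> prune
print(bayes_factor_for_zero(x, y_signal) < 1.0)  # True: evidence against pruning
print(bayes_factor_for_zero(x, y_noise) > 1.0)   # True: evidence for pruning
```

The paper applies this kind of evidence comparison at the scale of whole networks, folding the Bayes-factor computation into the training loop to drive iterative pruning.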
On the curvature of the loss landscape
https://arxiv.org/abs/2307.04719
A main challenge in modern deep learning is to understand why such over-parameterized models perform so well when trained on finite data ... we consider the loss landscape as an embedded Riemannian manifold ... we focus on the scalar curvature, which can be computed analytically for our manifold ...
Manifolds: https://en.wikipedia.org/wiki/Manifold
...
#MachineLearning #NeuralNetworks #manifolds #parametrization #DeepLearning #LossFunctions

On the curvature of the loss landscape
One of the main challenges in modern deep learning is to understand why such
over-parameterized models perform so well when trained on finite data. A way to
analyze this generalization concept is through the properties of the associated
loss landscape. In this work, we consider the loss landscape as an embedded
Riemannian manifold and show that the differential geometric properties of the
manifold can be used when analyzing the generalization abilities of a deep net.
In particular, we focus on the scalar curvature, which can be computed
analytically for our manifold, and show connections to several settings that
potentially imply generalization.
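For a two-parameter model the abstract's object is easy to make concrete: the loss surface z = L(w1, w2) is a 2-manifold embedded in R³, and its scalar curvature is twice the Gaussian curvature K = (L₁₁L₂₂ − L₁₂²)/(1 + L₁² + L₂²)². A finite-difference sketch (a toy illustration; the paper computes curvature analytically on the full parameter manifold):

```python
import numpy as np

def gaussian_curvature(L, w, h=1e-5):
    """Gaussian curvature of the surface z = L(w1, w2) at point w,
    via K = (L11*L22 - L12^2) / (1 + L1^2 + L2^2)^2, using central differences."""
    w1, w2 = w
    f = lambda a, b: L(a, b)
    L1  = (f(w1 + h, w2) - f(w1 - h, w2)) / (2 * h)
    L2  = (f(w1, w2 + h) - f(w1, w2 - h)) / (2 * h)
    L11 = (f(w1 + h, w2) - 2 * f(w1, w2) + f(w1 - h, w2)) / h**2
    L22 = (f(w1, w2 + h) - 2 * f(w1, w2) + f(w1, w2 - h)) / h**2
    L12 = (f(w1 + h, w2 + h) - f(w1 + h, w2 - h)
           - f(w1 - h, w2 + h) + f(w1 - h, w2 - h)) / (4 * h**2)
    return (L11 * L22 - L12**2) / (1 + L1**2 + L2**2) ** 2

loss = lambda w1, w2: w1**2 + w2**2        # convex quadratic "loss"
K = gaussian_curvature(loss, (0.0, 0.0))
print(round(K, 3))  # 4.0: the minimum of a bowl-shaped loss is positively curved
```

For a 2-manifold the scalar curvature is 2K, so this minimum has scalar curvature 8; flatter minima, often associated with better generalization, have curvature closer to zero.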
Extending the Forward Forward Algorithm
https://arxiv.org/abs/2307.04205
The Forward Forward algorithm (Geoffrey Hinton, November 2022) is an alternative to backpropagation for training neural networks (NNs)
Backpropagation - the most successful and widely used optimization algorithm for training NNs - has three important limitations ...
Hinton's paper: https://www.cs.toronto.edu/~hinton/FFA13.pdf
Discussion: https://bdtechtalks.com/2022/12/19/forward-forward-algorithm-geoffrey-hinton
...
#GeoffHinton #ForwardPropagation #NeuralNetworks #parametrization #Backpropagation #LossFunction

Extending the Forward Forward Algorithm
The Forward Forward algorithm, proposed by Geoffrey Hinton in November 2022,
is a novel method for training neural networks as an alternative to
backpropagation. In this project, we replicate Hinton's experiments on the
MNIST dataset, and subsequently extend the scope of the method with two
significant contributions. First, we establish a baseline performance for the
Forward Forward network on the IMDb movie reviews dataset. As far as we know,
our results on this sentiment analysis task mark the first instance of the
algorithm's extension beyond computer vision. Second, we introduce a novel
pyramidal optimization strategy for the loss threshold - a hyperparameter
specific to the Forward Forward method. Our pyramidal approach shows that a
good thresholding strategy causes a difference of up to 8% in test error.
Lastly, we perform visualizations of the trained parameters and derived several
significant insights, such as a notably larger (10-20x) mean and variance in
the weights acquired by the Forward Forward network.
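The mechanism behind the method - and the loss threshold θ that the paper's pyramidal strategy tunes - can be sketched for a single layer: each layer is trained locally to push its "goodness" (sum of squared activations) above θ for positive data and below θ for negative data. A simplified NumPy illustration, not the authors' code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class FFLayer:
    """One Forward-Forward layer trained with a local goodness objective."""
    def __init__(self, n_in, n_out, theta=2.0, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(n_out, n_in))
        self.theta, self.lr = theta, lr

    def goodness(self, x):
        h = np.maximum(self.W @ x.T, 0.0)       # ReLU activations, (n_out, batch)
        return np.sum(h * h, axis=0)            # per-sample goodness

    def train_step(self, x, positive):
        h = np.maximum(self.W @ x.T, 0.0)
        g = np.sum(h * h, axis=0)
        p = sigmoid(g - self.theta)             # P(sample is "positive")
        # d(-log p)/dg = p - 1 for positive data; d(-log(1-p))/dg = p for negative
        dg = (p - 1.0) if positive else p
        # Chain rule: dL/dW = dg * 2h * x (the ReLU mask is implicit, h = 0 where inactive)
        dW = (2.0 * h * dg) @ x
        self.W -= self.lr * dW / x.shape[0]

rng = np.random.default_rng(1)
layer = FFLayer(n_in=10, n_out=16)
x_pos = rng.normal(loc=1.0, size=(64, 10))      # toy "positive" data
x_neg = rng.normal(loc=0.0, size=(64, 10))      # toy "negative" data
for _ in range(200):
    layer.train_step(x_pos, positive=True)
    layer.train_step(x_neg, positive=False)
print(layer.goodness(x_pos).mean() > layer.goodness(x_neg).mean())  # True
```

Because the objective is local to each layer, no backward pass through the network is needed - the property that makes the algorithm an alternative to backpropagation.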
Loss Functions and Metrics in Deep Learning. A Review
https://arxiv.org/abs/2307.02694
One of the essential components of deep learning is the choice of the loss function and performance metrics used to train and evaluate models.
This paper reviews the most prevalent loss functions and performance measurements in deep learning.
#MachineLearning #parametrization #LossFunctions #CostFunctions #NeuralNetworks #optimization #MathematicalOptimization

Loss Functions and Metrics in Deep Learning
This paper presents a comprehensive review of loss functions and performance metrics in deep learning, highlighting key developments and practical insights across diverse application areas. We begin by outlining fundamental considerations in classic tasks such as regression and classification, then extend our analysis to specialized domains like computer vision and natural language processing including retrieval-augmented generation. In each setting, we systematically examine how different loss functions and evaluation metrics can be paired to address task-specific challenges such as class imbalance, outliers, and sequence-level optimization. Key contributions of this work include: (1) a unified framework for understanding how losses and metrics align with different learning objectives, (2) an in-depth discussion of multi-loss setups that balance competing goals, and (3) new insights into specialized metrics used to evaluate modern applications like retrieval-augmented generation, where faithfulness and context relevance are pivotal. Along the way, we highlight best practices for selecting or combining losses and metrics based on empirical behaviors and domain constraints. Finally, we identify open problems and promising directions, including the automation of loss-function search and the development of robust, interpretable evaluation measures for increasingly complex deep learning tasks. Our review aims to equip researchers and practitioners with clearer guidance in designing effective training pipelines and reliable model assessments for a wide spectrum of real-world applications.
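One recurring pairing from the review - class imbalance handled by a weighted loss and evaluated with F1 rather than accuracy - can be made concrete. A small sketch with made-up numbers (not from the paper):

```python
import numpy as np

def weighted_bce(y, p, w_pos=9.0, w_neg=1.0, eps=1e-12):
    """Binary cross-entropy with a higher weight on the rare positive class."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(w_pos * y * np.log(p) + w_neg * (1 - y) * np.log(1 - p))

def f1(y, y_hat):
    tp = np.sum((y == 1) & (y_hat == 1))
    fp = np.sum((y == 0) & (y_hat == 1))
    fn = np.sum((y == 1) & (y_hat == 0))
    return 2 * tp / (2 * tp + fp + fn)

# 10% positives: a classifier that always predicts "negative" looks good on
# accuracy but useless on F1
y = np.array([1] * 10 + [0] * 90)
always_neg = np.zeros(100, dtype=int)
print(np.mean(y == always_neg))   # 0.9
print(f1(y, always_neg))          # 0.0
# The weighted loss also penalizes confident misses on the rare class more heavily
print(weighted_bce(y, np.full(100, 0.1)) > weighted_bce(y, np.full(100, 0.1), w_pos=1.0))  # True
```

This is exactly the loss/metric alignment question the review systematizes: the metric must reflect the task's cost structure, and the loss should be chosen (or weighted) to optimize for it.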
Pruning vs Quantization: Which is Better?
https://arxiv.org/abs/2307.02973
* Pruning removes weights, reducing the memory footprint
* Quantization (4-bit or 8-bit matrix multiplication, ...) reduces the bit-width used for both the weights and the computation in neural networks, leading to predictable memory savings and reductions in the necessary compute
In most cases quantization outperforms pruning.
#MachineLearning #parametrization #weights #pruning #quantization #NeuralNetworks #MathematicalPrecision #matrices

Pruning vs Quantization: Which is Better?
Neural network pruning and quantization techniques are almost as old as neural networks themselves. However, to date only ad-hoc comparisons between the two have been published. In this paper, we set out to answer the question on which is better: neural network quantization or pruning? By answering this question, we hope to inform design decisions made on neural network hardware going forward. We provide an extensive comparison between the two techniques for compressing deep neural networks. First, we give an analytical comparison of expected quantization and pruning error for general data distributions. Then, we provide lower bounds for the per-layer pruning and quantization error in trained networks, and compare these to empirical error after optimization. Finally, we provide an extensive experimental comparison for training 8 large-scale models on 3 tasks. Our results show that in most cases quantization outperforms pruning. Only in some scenarios with very high compression ratio, pruning might be beneficial from an accuracy standpoint.
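The comparison can be made concrete on a toy weight tensor: magnitude pruning vs symmetric uniform quantization, measured by MSE. A sketch under a rough matched-compression assumption (keeping 25% of 16-bit weights costs about the same as 4 bits for every weight, ignoring index overhead); this is an illustration, not the paper's analysis:

```python
import numpy as np

def prune_error(w, keep_frac):
    """MSE from zeroing all but the largest-magnitude keep_frac of weights."""
    k = int(len(w) * keep_frac)
    thresh = np.sort(np.abs(w))[-k] if k > 0 else np.inf
    w_pruned = np.where(np.abs(w) >= thresh, w, 0.0)
    return np.mean((w - w_pruned) ** 2)

def quant_error(w, bits):
    """MSE from symmetric uniform quantization to the given bit-width."""
    levels = 2 ** (bits - 1) - 1            # e.g. 7 for symmetric INT4
    scale = np.max(np.abs(w)) / levels
    w_q = np.round(w / scale) * scale
    return np.mean((w - w_q) ** 2)

rng = np.random.default_rng(0)
w = rng.normal(size=100_000)                # Gaussian weights, a common modeling assumption
print(f"pruning MSE (keep 25%): {prune_error(w, 0.25):.4f}")
print(f"4-bit quant MSE:        {quant_error(w, 4):.4f}")
```

The paper's contribution is doing this comparison rigorously: analytical error bounds per layer for general data distributions, then large-scale experiments, with quantization winning in most regimes.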
The bigger-is-better approach to AI is running out of road
If AI is to keep getting better, it will have to do more with less
The Economist
Training Transformers with 4-bit Integers
https://arxiv.org/abs/2306.11987
... we propose a training method for transformers with matrix multiplications implemented with the INT4 arithmetic. Training with an ultra-low INT4 precision is challenging ... we carefully analyze the specific structures of activation & gradients in transformers to propose dedicated quantizers for them. For forward propagation, we identify ...
#ML #MachineLearning #parametrization #INT4 #NeuralNetworks #transformers #matrices

Training Transformers with 4-bit Integers
Quantizing the activation, weight, and gradient to 4-bit is promising to
accelerate neural network training. However, existing 4-bit training methods
require custom numerical formats which are not supported by contemporary
hardware. In this work, we propose a training method for transformers with all
matrix multiplications implemented with the INT4 arithmetic. Training with an
ultra-low INT4 precision is challenging. To achieve this, we carefully analyze
the specific structures of activation and gradients in transformers to propose
dedicated quantizers for them. For forward propagation, we identify the
challenge of outliers and propose a Hadamard quantizer to suppress the
outliers. For backpropagation, we leverage the structural sparsity of gradients
by proposing bit splitting and leverage score sampling techniques to quantize
gradients accurately. Our algorithm achieves competitive accuracy on a wide
range of tasks including natural language understanding, machine translation,
and image classification. Unlike previous 4-bit training methods, our algorithm
can be implemented on the current generation of GPUs. Our prototypical linear
operator implementation is up to 2.2 times faster than the FP16 counterparts
and speeds up the training by up to 35.1%.
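The forward-pass trick the abstract describes - rotate activations with a Hadamard transform so a single outlier is spread across all coordinates before INT4 rounding - can be sketched as follows (an illustration of the general idea, not the paper's exact quantizer):

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)                   # orthonormal: H @ H.T = I

def int4_quant(x):
    """Symmetric INT4 quantization: codes in [-7, 7] plus a float scale."""
    scale = np.max(np.abs(x)) / 7.0
    codes = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return codes, scale

n = 64
rng = np.random.default_rng(0)
x = rng.normal(scale=0.1, size=n)
x[3] = 10.0                                 # one large outlier, typical of transformer activations

H = hadamard(n)
codes, s = int4_quant(H @ x)                # quantize in the rotated basis
x_hat = H.T @ (codes * s)                   # dequantize, rotate back
codes_d, s_d = int4_quant(x)                # direct quantization for comparison
mse_direct = np.mean((x - codes_d * s_d) ** 2)
mse_had = np.mean((x - x_hat) ** 2)
print(f"direct   MSE: {mse_direct:.5f}")
print(f"hadamard MSE: {mse_had:.5f}")
```

Direct quantization must stretch its 15 levels over the outlier, crushing every small activation to zero; after the orthonormal rotation the outlier's energy is spread over all coordinates, so the quantization step is much finer, and because the rotation preserves norms the error measured back in the original basis shrinks too.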