fly51fly (@fly51fly)
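New research that applies Wasserstein distributionally robust (regret) optimization to reinforcement learning from human feedback. It proposes a training method that is more robust to uncertainty and distribution shift in RLHF.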
https://x.com/fly51fly/status/2051417235110187109
#reinforcementlearning #rlhf #robustoptimization #wasserstein #research
Excited to attend my first PhD defence at TU Delft: Daniël Vos will be defending his thesis "Decision Tree Learning: Algorithms for Robust Prediction and Policy Optimization", containing work he did under the supervision of Prof. Dr. Ir. R.L. Lagendijk and Dr. Ir. Sicco Verwer.
#AcademicMastodon #AcademicChatter #PhDLife #Delft #TUDelft #DelftUniversityOfTechnology #Dissertation #Defence #PhD #PhDDefence #DecisionTrees #Optimization #Optimisation #Algorithms #ExplainableAI #RobustOptimization #RobustAI #Defense #PhDDefense
“Thanks again to @Francy_Maggioni for the rich talk about bounds for Multistage #StochasticOptimization and Distributionally #RobustOptimization! A lot of people from Italy, Brazil, and more attended the first webinar co-organized with @log_ufpb. Stay tuned for the next ones😉”
Our preprint "Finding Regions of Counterfactual Explanations via Robust Optimization" (written together with Donato Maragno, Tabea Röber, Rob Goedhart, Ilker Birbil and Dick den Hertog) is available online now.
Paper: https://lnkd.in/embHrTvM
Code & Slides: https://lnkd.in/eV2vaM2D
#robustoptimization #machinelearning #counterfactualexplanations #explainableAI
Counterfactual explanations play an important role in detecting bias and improving the explainability of data-driven classification models. A counterfactual explanation (CE) is a minimally perturbed data point for which the decision of the model changes. Most existing methods provide only a single CE, which may not be achievable for the user. In this work we derive an iterative method to calculate robust CEs, i.e. CEs that remain valid even after the features are slightly perturbed. Our method thus provides a whole region of CEs, allowing the user to choose a suitable recourse to obtain a desired outcome. We use algorithmic ideas from robust optimization and prove convergence results for the most common machine learning methods, including logistic regression, decision trees, random forests, and neural networks. Our experiments show that our method can efficiently generate globally optimal robust CEs for a variety of common data sets and classification models.
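To make the idea of a robust CE concrete, here is a minimal sketch for the simplest case only: a linear classifier that accepts a point when `w·x + b >= 0`, with perturbations bounded in the l2 norm by a radius `rho`. In that setting a robust CE has a closed form, so no iteration is needed. This is not the paper's iterative method; the function names, the l2 uncertainty set, and the example numbers are all assumptions chosen for illustration.

```python
import math


def worst_case_score(x, w, b, rho):
    """Lowest score w.x + b over all perturbations of x with l2 norm <= rho."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return score - rho * norm_w


def robust_counterfactual(x0, w, b, rho):
    """Closest point (in l2 distance) to x0 that the linear classifier
    still accepts after any perturbation of l2 norm at most rho.

    Assumption: the desired class is the one with w.x + b >= 0.
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    if norm_w == 0:
        raise ValueError("w must be non-zero")
    # A point is robustly accepted iff its worst-case score is >= 0.
    gap = -worst_case_score(x0, w, b, rho)
    if gap <= 0:
        return list(x0)  # x0 is already a robust member of the desired class
    # Project x0 onto the shifted half-space w.x + b >= rho * ||w||.
    step = gap / (norm_w ** 2)
    return [xi + step * wi for xi, wi in zip(x0, w)]


if __name__ == "__main__":
    x0, w, b, rho = [0.0, 0.0], [3.0, 4.0], -5.0, 1.0
    ce = robust_counterfactual(x0, w, b, rho)
    print(ce, worst_case_score(ce, w, b, rho))
```

Every point within distance `rho` of the returned CE is still accepted, so the ball around it is a small "region of CEs" in the sense of the abstract. For trees, forests, and neural networks no such closed form exists, which is where an iterative scheme becomes necessary.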
We completely revised our paper "Data-driven Prediction of Relevant Scenarios for Robust Combinatorial Optimization" (written together with Marc Goerigk) and replaced the old method with a completely new data-driven algorithm that generalizes well to problem instances of larger dimension than the training instances.
We study iterative methods for (two-stage) robust combinatorial optimization problems with discrete uncertainty. We propose a machine-learning-based heuristic to determine starting scenarios that provide strong lower bounds. To this end, we design dimension-independent features and train a Random Forest Classifier on small-dimensional instances. Experiments show that our method improves the solution process for instances larger than those in the training set and also provides a feature importance score, which gives insights into the role of scenario properties.
I'm happy to share that our paper "Data-driven robust optimization using deep neural networks" (written together with Marc Goerigk) is now published in Computers & OR.
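The role of a starting scenario in such an iterative method can be sketched on a toy single-stage min-max problem: pick `p` of `n` items so that the worst-case cost over a finite scenario list is minimal. The loop below solves a master problem on a growing subset of scenarios (a lower bound), asks which scenario hurts the current solution most (an upper bound), and stops when the bounds meet. The brute-force master, the selection problem, and the `start` argument standing in for a predicted scenario are assumptions made for illustration, not the algorithm from the paper.

```python
from itertools import combinations


def solve_master(scenarios, active, n, p):
    """Brute-force the best p-subset against the active scenarios only.
    Its value is a lower bound on the true robust optimum."""
    best_val, best_sol = None, None
    for sol in combinations(range(n), p):
        val = max(sum(scenarios[s][i] for i in sol) for s in active)
        if best_val is None or val < best_val:
            best_val, best_sol = val, sol
    return best_val, best_sol


def worst_scenario(scenarios, sol):
    """Evaluate sol under every scenario; return (worst cost, its index)."""
    costs = [sum(c[i] for i in sol) for c in scenarios]
    worst = max(range(len(scenarios)), key=costs.__getitem__)
    return costs[worst], worst


def scenario_generation(scenarios, p, start):
    """Iterative scenario generation, seeded with the scenario index `start`.
    Returns (solution, robust value, number of master solves)."""
    n = len(scenarios[0])
    active = [start]
    iterations = 0
    while True:
        iterations += 1
        lower, sol = solve_master(scenarios, active, n, p)
        upper, worst = worst_scenario(scenarios, sol)
        if upper <= lower:  # bounds meet: sol is robust-optimal
            return sol, upper, iterations
        active.append(worst)  # worst is not yet active, so the loop ends


if __name__ == "__main__":
    scenarios = [[1, 5, 3], [4, 1, 3], [2, 2, 2]]
    for start in range(len(scenarios)):
        print(start, scenario_generation(scenarios, p=2, start=start))
```

On this tiny instance every seed reaches the same robust solution, but the number of master solves depends on the seed. That dependence is what a learned choice of starting scenarios aims to exploit on larger instances.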
We study #robustoptimization problems where observations of the uncertain parameters are given by historical data. On this data we train one-class deep #neuralnetworks to detect outliers and extract the hidden data structures from the observations.
https://doi.org/10.1016/j.cor.2022.106087