====== Meta-learning in a Portfolio of Machine Learning Workflows ======
//By Mireille Blay-Fornarino and Frédéric Precioso//
  
Recent advances in Machine Learning (ML) have brought new solutions for the problems of prediction, decision, and identification. ML is impacting almost all domains of science and industry, but determining the right ML workflow for a given problem remains a key question. So that not only experts in the field can benefit from the potential of ML, recent years have seen an increasing effort from the big data companies (Amazon AWS, Microsoft Azure, Google AutoML...) to provide any user with simple platforms for designing their own ML workflows. However, none of these solutions considers the design of ML workflows as a generic process intended to capture processing patterns common across workflows (even across workflows targeting different application contexts). These platforms either propose a set of dedicated solutions for given classes of problems (e.g. AutoML Vision, AutoML Natural Language, AutoML Translation...) or propose a recipe to build your own ML workflow from scratch (e.g. MS Azure Machine Learning Studio, RapidMiner).\\
  
Is it then possible to envision the meta-learning process of designing an ML workflow as a systematic approach that analyzes past experiences to identify, explain, and predict the right choices?
This PhD thesis will address this issue by correlating research on software architectures (including product lines) and meta-learning, to bring ML workflow design to the next level by producing explanations of algorithm choices and by cutting the complexity of portfolio exploration through the identification of patterns common to several workflows.
  
===== Context =====
Advances in Machine Learning (ML) have brought new solutions for the problems of prediction, decision, and identification. To determine the right ML workflow for a given problem, numerous parameters have to be taken into account: the kind of data, the expected predictions (error, accuracy, time, memory space), the choice of the algorithms, and their judicious composition [14,11]. \\
To help with this task, Microsoft Azure Machine Learning, Amazon AWS, and RapidMiner Auto Model [12] provide ML component assembly tools. However, faced with the complexity of choosing the "right" assembly, meta-learning offers an attractive solution: learning from the problems of the past. The algorithm selection problem is one of its applications [5]: given a dataset, identify which learning algorithm (and which hyperparameter setting) performs best on it.\\
Algorithm Portfolios generalize the problem and automate the construction of selection models [8]. The immediate goal is the same: to predict the results of the algorithms on a given problem without executing them. Even if, in the portfolio, some selection models are built by meta-learning [1], the purpose is different. The portfolio is based on the systematic acquisition of knowledge about the algorithms it contains. The research then focuses on the quality and restitution of that knowledge, the acquisition process itself, and the construction of selection models over time.
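To make the algorithm selection idea concrete, here is a minimal, stdlib-only sketch of meta-learning for selection: past experiments record dataset meta-features together with the best-observed algorithm, and a nearest-neighbour meta-learner predicts an algorithm for a new dataset without executing any candidate. All meta-features, dataset figures, and algorithm names are illustrative placeholders, not taken from OpenML or from the thesis.

```python
# Minimal meta-learning sketch for algorithm selection (1-nearest-neighbour).
# The meta-knowledge base maps dataset meta-features to the algorithm that
# performed best on that dataset. All values below are illustrative.
import math

# (n_instances, n_features, class_entropy) -> best algorithm observed
PAST_EXPERIMENTS = [
    ((1_000, 10, 0.9), "random_forest"),
    ((1_200, 12, 0.8), "random_forest"),
    ((50_000, 300, 0.5), "linear_svm"),
    ((40_000, 250, 0.4), "linear_svm"),
    ((200, 5, 1.0), "knn"),
]

def _distance(a, b):
    # Log-scale the counts so dataset size does not dominate the distance.
    sa = (math.log(a[0]), math.log(a[1]), a[2])
    sb = (math.log(b[0]), math.log(b[1]), b[2])
    return math.dist(sa, sb)

def recommend(meta_features):
    """Return the best algorithm of the closest past dataset (1-NN)."""
    _, best = min(PAST_EXPERIMENTS,
                  key=lambda exp: _distance(exp[0], meta_features))
    return best

print(recommend((45_000, 280, 0.45)))  # closest to the linear_svm datasets
```

A real portfolio replaces the lookup with learned selection models and keeps acquiring new experiments, but the principle — predicting from meta-features instead of executing — is the same.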
  
  
===== Objectives =====
The construction of a portfolio requires covering a space of experiments broad enough to "cover" all the problems that may be submitted to it. \\
However, (1) the space of problems and solutions presents a very great ((The variability concerns preprocessing, algorithms, datasets, evaluation criteria, and experimental results, and each of these subjects has several variants. For example, in OpenML, 61 meta-features are calculated for each downloaded dataset [17], and there are more than a hundred classification algorithms [5].)) "diversity" [16], even within a single class of problems like classification [9].\\
(2) The resources required for ML experiments are massive (time, memory, energy)((The number of theoretical experiments needed to study p preprocessing steps, n algorithms, and d datasets is 2^p*n*d. For 10 preprocessing algorithms, 100 classification algorithms, and 100 datasets, assuming each experiment lasts only one minute, it would take more than 7000 days of execution time.)). \\
(3) As the ML domain is particularly productive, the portfolio must be able to evolve to integrate new algorithms.\\
(4) To cope with the mass of data, the transformation of experimental results into knowledge requires the implementation of automatic analysis procedures.

The objective of this thesis is, therefore, to propose different paradigms for constructing a portfolio of machine-learning workflows that meet these requirements: relevance of the results while taming the space of experiments, explanation of the knowledge derived from meta-learning (meta-learning must not be a black box), and automation of the selection processes. \\
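The cost estimate in the footnote of requirement (2) can be checked with a few lines of arithmetic: 2^p preprocessing subsets, n algorithms, and d datasets give 2^p*n*d experiments.

```python
# Back-of-the-envelope check of the footnote's cost estimate: exhaustively
# evaluating every subset of p preprocessing steps with n algorithms on
# d datasets requires 2^p * n * d experiments.
def n_experiments(p, n, d):
    return (2 ** p) * n * d

total = n_experiments(p=10, n=100, d=100)
days = total / (60 * 24)  # at one minute per experiment

print(total)        # 10240000 experiments
print(round(days))  # 7111 -> indeed more than 7000 days of compute
```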
The PhD work will be organized to provide contributions in the following directions: \\
1- A representation of experiments in the form of graphs [10] and the exploitation of these structures by adapted learning algorithms [3,15]; in particular, this approach should be exploited to reduce the search space and explain the choices made [4];\\
3- A systematic exploitation of this structure to reduce the number of executions, to drive the workflow compositions, to manage the feedback loop, and to justify choices.\\
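One way direction 3 can reduce the number of executions is sketched below: if experiments are stored as step sequences (a linearised view of the workflow graph), two workflows that share a prefix of steps can reuse the cached intermediate result instead of re-executing it. This is an illustrative sketch, not the thesis's actual mechanism; step names are placeholders.

```python
# Sharing intermediate results between workflows that have a common prefix.
# Each workflow is a sequence of steps; a prefix already executed once is
# never re-executed, which cuts the total number of step executions.
executed = {}  # prefix (tuple of steps) -> cached (simulated) result

def run(workflow):
    """Execute a workflow, reusing any already-executed prefix.

    Returns the number of steps actually executed (not served from cache).
    """
    fresh_steps = 0
    for i in range(1, len(workflow) + 1):
        prefix = tuple(workflow[:i])
        if prefix not in executed:
            executed[prefix] = f"result-of-{'/'.join(prefix)}"  # simulate work
            fresh_steps += 1
    return fresh_steps

w1 = ["impute", "scale", "pca", "svm"]
w2 = ["impute", "scale", "pca", "random_forest"]  # shares a 3-step prefix

print(run(w1))  # 4 steps executed
print(run(w2))  # 1 step executed: the 3-step prefix is reused
```

On the graph representation of direction 1, such shared prefixes correspond to shared subpaths, which is precisely what makes the structure worth exploiting.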
  
  
===== References =====
16. Pohl, K., Böckle, G. & van der Linden, F. J. Software Product Line Engineering: Foundations, Principles and Techniques. (Springer-Verlag, 2005).
  
17. Bilalli, B., Abelló, A. & Aluja-Banet, T. On the predictive power of meta-features in OpenML. Int. J. Appl. Math. Comput. Sci. 27, (2017).
  
  
  
students/phd_2019.1557497470.txt.gz · Last modified: 2019/05/10 16:11 by blay