students:phd_mlws
Differences
This shows you the differences between two versions of the page.
students:phd_mlws [2017/05/28 17:37] – [Bibliographie] blay | students:phd_mlws [2017/05/28 18:03] (current) – [Context] blay | ||
---|---|---|---|
Line 11: | Line 11: | ||
* The structural characteristics (size, quality, and nature) of the collected data | * The structural characteristics (size, quality, and nature) of the collected data | ||
* How the results will be used. | * How the results will be used. | ||
- | This task is highly complex because of the increasing number of available algorithms, the difficulty in choosing the correct preprocessing techniques together with the right algorithms as well as the correct tuning of their parameters. To decide which algorithm to choose, data scientists often consider families of algorithms in which they are experts, and can leave aside algorithms that are more “exotic” to them, but could perform better for the problem they are trying to solve. | + | This task is highly complex because of the increasing number of available algorithms, the difficulty in choosing the correct preprocessing techniques together with the right algorithms as well as the correct tuning of their parameters |
ROCKFlows | ROCKFlows | ||
Line 25: | Line 25: | ||
The thesis must address two challenges: the relevance and quality of predictions, and scalability to manage the huge mass of ML workflows. | The thesis must address two challenges: the relevance and quality of predictions, and scalability to manage the huge mass of ML workflows. | ||
To meet these challenges, attention should be paid to the following aspects: | To meet these challenges, attention should be paid to the following aspects: | ||
- | * //Handling Variabilities: | + | * //Handling Variabilities: |
* //Architecture of the portfolio: //automatically manage (1) running experiments, (2) collecting experiment results, (3) analysis of results, and (4) evolution of the algorithm base. It must support the management of execution errors, incremental analyses, and identification of the context of experiments. | * //Architecture of the portfolio: //automatically manage (1) running experiments, (2) collecting experiment results, (3) analysis of results, and (4) evolution of the algorithm base. It must support the management of execution errors, incremental analyses, and identification of the context of experiments. | ||
* //Handling Scalability of the Portfolio: //selecting discriminating data sets; detecting “deprecated” algorithms and workflows from experiments and literature reviews; dealing with information from the scientific literature without deteriorating the knowledge computed by the portfolio. | * //Handling Scalability of the Portfolio: //selecting discriminating data sets; detecting “deprecated” algorithms and workflows from experiments and literature reviews; dealing with information from the scientific literature without deteriorating the knowledge computed by the portfolio. | ||
Line 68: | Line 68: | ||
Martin Salvador M, Budka M, Gabrys B (2016) Towards automatic composition of multicomponent predictive systems. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics). doi: 10.1007/ | Martin Salvador M, Budka M, Gabrys B (2016) Towards automatic composition of multicomponent predictive systems. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics). doi: 10.1007/ | ||
+ | |||
+ | Serban F, Vanschoren J, Kietz J-U, Bernstein A (2013) A survey of intelligent assistants for data analysis. ACM Comput Surv. doi: 10.1145/ | ||
Wolpert D (1996) The lack of a priori distinctions between learning algorithms. Neural Computation 8(7): | Wolpert D (1996) The lack of a priori distinctions between learning algorithms. Neural Computation 8(7): | ||
students/phd_mlws.1495993067.txt.gz · Last modified: 2017/05/28 17:37 by blay