Abstract
Numerical value range partitioning is an inherent part of inductive learning. In classification problems, a common partition ranking method is to use an attribute evaluation function to assign a goodness score to each candidate. Optimal cut point selection constitutes a potential efficiency bottleneck, which is often circumvented by using heuristic methods.
This paper aims at improving the efficiency of optimal multisplitting. We analyze convex and cumulative evaluation functions, which account for the majority of commonly used goodness criteria. We derive an analytical bound that lets us filter out, when searching for the optimal multisplit, all partitions containing a specific subpartition as their prefix. Thus, the search space of the algorithm can be restricted without losing optimality.
We compare the partition candidate pruning algorithm with the best existing optimization algorithms for multisplitting. On average, it evaluates only approximately 25% and 50% of the partition candidates evaluated by the comparison methods. In terms of time, this amounts to up to 50% less evaluation time per attribute.
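To make the notion of a cumulative evaluation function concrete, the following is a minimal sketch of the unpruned search that such a pruning bound would speed up. It is not the algorithm of the paper: the abstract does not state the bound, so none is implemented here. The sketch assumes size-weighted class entropy as the cumulative criterion (the score of a partition is the sum of its interval scores) and uses plain dynamic programming over candidate cut points. The names `interval_cost` and `optimal_multisplit` are hypothetical.

```python
from collections import Counter
from math import log2


def interval_cost(labels):
    """Size-weighted class entropy (in bits) of one interval.

    Assumed example of a cumulative criterion: a partition's score is
    the sum of these per-interval costs. Lower is better.
    """
    n = len(labels)
    return -sum(c * log2(c / n) for c in Counter(labels).values())


def optimal_multisplit(values, labels, k):
    """Best partition of a numerical attribute into at most k intervals.

    Returns (cost, thresholds). Baseline dynamic programming without
    any pruning; every prefix subpartition is evaluated.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    ys = [labels[i] for i in order]
    n = len(xs)
    # Candidate cut points lie between adjacent distinct values.
    bounds = [0] + [i for i in range(1, n) if xs[i] != xs[i - 1]] + [n]
    m = len(bounds) - 1  # number of indivisible blocks
    inf = float("inf")
    # best[j][b]: minimum cost of covering the first b blocks with j intervals.
    best = [[inf] * (m + 1) for _ in range(k + 1)]
    back = [[0] * (m + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for b in range(j, m + 1):
            for a in range(j - 1, b):
                if best[j - 1][a] == inf:
                    continue
                cost = best[j - 1][a] + interval_cost(ys[bounds[a]:bounds[b]])
                if cost < best[j][b]:
                    best[j][b] = cost
                    back[j][b] = a
    # Pick the arity with the lowest total cost (smallest arity on ties).
    j_best = min(range(1, min(k, m) + 1), key=lambda j: best[j][m])
    # Walk the back-pointers to recover the cut points.
    cuts, b = [], m
    for j in range(j_best, 0, -1):
        b = back[j][b]
        if b > 0:
            cuts.append(bounds[b])
    cuts.reverse()
    thresholds = [(xs[i - 1] + xs[i]) / 2 for i in cuts]
    return best[j_best][m], thresholds
```

A bound of the kind described in the abstract would let the innermost loop skip any prefix subpartition that provably cannot be extended to an optimal multisplit, which is where the reduction in evaluated candidates would come from.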
| Original language | English |
|---|---|
| Title of host publication | Principles of Data Mining and Knowledge Discovery |
| Subtitle of host publication | Third European Conference, PKDD’99 |
| Editors | Jan M. Żytkow, Jan Rauch |
| Place of Publication | Berlin |
| Publisher | Springer |
| Pages | 89-97 |
| ISBN (Electronic) | 978-3-540-48247-5 |
| ISBN (Print) | 978-3-540-66490-1 |
| Publication status | Published - 1999 |
| MoE publication type | A4 Article in a conference publication |
| Event | 3rd European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'99) - Praha, Czech Republic. Duration: 15 Sept 1999 → 18 Sept 1999 |
Publication series
| Series | Lecture Notes in Computer Science |
|---|---|
| Volume | 1704 |
| ISSN | 0302-9743 |
Conference
| Conference | 3rd European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'99) |
|---|---|
| Country/Territory | Czech Republic |
| City | Praha |
| Period | 15/09/99 → 18/09/99 |
Fingerprint
Dive into the research topics of 'Speeding Up the Search for Optimal Partitions'. Together they form a unique fingerprint.