On the Complexity of Optimal Multisplitting

Tapio Elomaa, Juho Rousu

Research output: Chapter in Book/Report/Conference proceeding › Chapter or book article › Professional

Abstract

Dynamic programming has been studied extensively, e.g., in computational geometry and string matching. It has recently found a new application in the optimal multisplitting of numerical attribute value domains. We relate earlier results to this problem and study whether they shed new light on the inherent complexity of this time-critical subtask of machine learning and data mining programs. The concept of monotonicity has come up in earlier research; it helps to explain the different asymptotic time requirements of optimal multisplitting with respect to different attribute evaluation functions. As case studies we examine the Training Set Error and Average Class Entropy functions. The former has a linear-time optimization algorithm, while the latter, like most well-known attribute evaluation functions, takes quadratic time to optimize. We show that neither of them fulfills the strict monotonicity condition, but that computing optimal Training Set Error values can be decomposed into monotone subproblems.
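The dynamic-programming formulation the abstract refers to can be illustrated with a small sketch. The code below is not the paper's linear-time algorithm (and its function names are hypothetical); it is the generic quadratic baseline the paper contrasts against: an O(k·n²) DP that partitions a class-label sequence, ordered by attribute value, into k intervals minimizing Training Set Error (misclassifications when each interval predicts its majority class).

```python
# Illustrative sketch: generic DP multisplitting minimizing Training Set Error.
# Assumes labels are already sorted by the numeric attribute's value.
from collections import Counter

def interval_error(labels, i, j):
    """Misclassifications in labels[i:j] when predicting its majority class."""
    counts = Counter(labels[i:j])
    return (j - i) - max(counts.values())

def optimal_multisplit(labels, k):
    """Return (minimal Training Set Error, sorted cut indices) over k intervals."""
    n = len(labels)
    INF = float("inf")
    # best[m][j]: minimal error splitting the first j examples into m intervals
    best = [[INF] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # last interval is labels[i:j]
                cand = best[m - 1][i] + interval_error(labels, i, j)
                if cand < best[m][j]:
                    best[m][j] = cand
                    cut[m][j] = i
    # Walk back through the stored boundaries to recover the cut points.
    cuts, j = [], n
    for m in range(k, 1, -1):
        j = cut[m][j]
        cuts.append(j)
    return best[k][n], sorted(cuts)

labels = ["a", "a", "b", "b", "b", "a", "c", "c"]
err, cuts = optimal_multisplit(labels, 3)  # one example is unavoidably misclassified
```

The paper's point is that this cubic-looking recurrence (quadratic with cumulative class counts) is the generic cost; only evaluation functions with suitable monotone structure, such as Training Set Error, admit faster optimization.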
Original language: English
Title of host publication: Foundations of Intelligent Systems
Subtitle of host publication: 12th International Symposium, ISMIS 2000, Charlotte, NC, USA, October 11–14, 2000, Proceedings
Editors: Zbigniew W. Ras, Setsuo Ohsuga
Place of publication: Berlin - Heidelberg
Publisher: Springer
Pages: 552-561
ISBN (Electronic): 978-3-540-39963-6
ISBN (Print): 978-3-540-41094-2
DOIs: 10.1007/3-540-39963-1_58
Publication status: Published - 2000
MoE publication type: D2 Article in professional manuals or guides or professional information systems or text book material
Event: 12th Int. Symp. ISMIS 2000, Charlotte, NC, Oct. 2000
Duration: 1 Jan 2000 → …

Publication series

Series: Lecture Notes in Computer Science
Volume: 1932
ISSN: 0302-9743

Conference

Conference: 12th Int. Symp. ISMIS 2000, Charlotte, NC, Oct. 2000
Period: 1/01/00 → …

Cite this

Elomaa, T., & Rousu, J. (2000). On the Complexity of Optimal Multisplitting. In Z. W. Ras & S. Ohsuga (Eds.), Foundations of Intelligent Systems: 12th International Symposium, ISMIS 2000, Charlotte, NC, USA, October 11–14, 2000, Proceedings (pp. 552-561). Berlin - Heidelberg: Springer. (Lecture Notes in Computer Science, Vol. 1932). https://doi.org/10.1007/3-540-39963-1_58