Boosting algorithm techniques and analyses, 1996

Item — Call Number: MU Thesis Cat

Identifier: b2089046

Scope and Contents

From the Collection:

The collection consists of theses written by students enrolled in the Monmouth College and Monmouth University graduate Electronic Engineering programs. The holdings are bound print documents that were submitted in partial fulfillment of requirements for the Master of Science degree.

Dates

Creation: 1996

Creator

Catlin, Gary M. (Author, Person)
Drucker, Harris (1943-2024) (Thesis advisor, Person)

Conditions Governing Access

All analog collection holdings are limited to library use only.

Researchers seeking to photocopy collection materials must complete an Application to Photocopy Form.

In some cases, photocopying of collection materials may be performed by the Monmouth University Library staff.

The Monmouth University Library reserves the right to limit or refuse duplication requests subject to the condition of collection materials and/or restrictions imposed by the collection creators or by the United States Copyright Act.

Permission to examine, or copy, collection materials does not imply permission to publish or quote. It is the responsibility of the researcher to obtain such permissions from both the copyright holder and Monmouth University.

Full Extent

1 Item (print book) : 78 pages ; 8.5 x 11.0 inches (28 cm).

Language of Materials

English

Additional Description

Abstract

The enclosed project contains the results of the research, analysis and testing of a Modified Boosting Algorithm to be used by a configuration of learning machines in order to perform regression. The algorithm is used to construct a set of "weak learners", whose individual performance is only slightly better than 50%, but whose ensemble performance is better than that of any one regression machine. The intent is to develop the algorithm and to demonstrate that it performs as well as, if not better than, currently used regression techniques.

The developed algorithm builds the ensemble of "weak learners" from points obtained through a sampling of a training data set, whose sampling values are weighted by previous passes. A threshold error point, selected as the next value higher than the median error value from the first pass of the algorithm, is used as an indicator for whether points are considered to be valid or error values by the algorithm. The valid points receive smaller sampling weights, ensuring that subsequent samplings will have higher concentrations of points which are considered to be in error. The high levels of "bad" data also ensure that the performance of individual machines meets the "weak learner" criteria.
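The weighted-resampling loop described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the weak learner, the weight-reduction factor, or how machine outputs are combined, so the straight-line fit, the `shrink` factor of 0.5, the strict "error below threshold means valid" test, and the averaging of predictions are all assumptions made for illustration, as are all function and parameter names.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept).

    Assumption: a straight-line fit stands in for the "weak learner".
    """
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate sample (all x equal): fall back to a constant
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def boost(xs, ys, n_machines=5, shrink=0.5, seed=0):
    """Build an ensemble by repeatedly resampling with updated weights."""
    rng = random.Random(seed)
    n = len(xs)
    weights = [1.0 / n] * n          # start with uniform sampling weights
    machines, threshold = [], None
    for _ in range(n_machines):
        # Draw a training sample according to the current weights.
        idx = rng.choices(range(n), weights=weights, k=n)
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        # Absolute error of this machine on every training point.
        errs = [abs(slope * x + intercept - y) for x, y in zip(xs, ys)]
        if threshold is None:
            # First pass only: next error value higher than the median error.
            med = statistics.median(errs)
            above = [e for e in errs if e > med]
            threshold = min(above) if above else med
        # "Valid" points (error below threshold) get smaller weights
        # (assumed factor: shrink); points in error keep theirs.
        weights = [w * shrink if e < threshold else w
                   for w, e in zip(weights, errs)]
        total = sum(weights)
        weights = [w / total for w in weights]   # renormalize to sum to 1
        machines.append((slope, intercept))
    return machines, weights, threshold


def ensemble_predict(machines, x):
    """Assumption: the ensemble output is the mean of the machine outputs."""
    return statistics.fmean(m * x + b for m, b in machines)
```

Because valid points lose weight on every pass, later samples concentrate on the points the earlier machines fit poorly, which is the behavior the abstract describes.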

Simulations were performed on the developed algorithm, using variations of the "weak learner" construction method, machine validation method, and data type and distribution. The results of simulations performed using this boosting algorithm demonstrate that the ensemble mean square error (MSE) values show a performance improvement over ordinary least squares (OLS) MSE values under certain conditions. In all cases, the ensemble boosting algorithm performed no worse than the OLS method. Also, a performance improvement in test ensemble MSE was observed when the machine ensemble was constructed from a training data set containing outlier points.