Pattern Aided Regression Modeling: New Regression Model Type, New Regression Algorithm, and New Insights on Regression Problems

      Department of Systems Engineering and Engineering Management
                             The Chinese University of Hong Kong
Speaker: Prof. Guozhu Dong, Wright State University, Ohio, USA
Title: Pattern Aided Regression Modeling: New Regression Model Type, New Regression Algorithm, and New Insights on Regression Problems 
Abstract: Constructing accurate numerical prediction models is a fundamental task in a wide range of prediction/forecasting applications, and prediction modeling is a key part of data science. In this talk I will introduce a new type of regression model, namely the pattern aided regression (PXR) model. Our research was motivated by two observations: (1) Regression modeling problems often involve complex, diverse predictor-response relationships, which occur when the optimal regression models fitting distinct data subgroups of the application are highly different. (2) State-of-the-art regression methods are often unable to accurately model such highly diverse predictor-response relationships, even when they involve ensembles with hundreds of member prediction models. To meet the challenges of diverse predictor-response relationships, a PXR model uses several pairs of a pattern and a local regression model to define a prediction model, with each pair serving as the logical and behavioral characterization of a distinct predictor-response relationship. I will also discuss a contrast pattern aided regression (CPXR) method for building accurate PXR models. CPXR was developed out of our extensive work on contrast data mining, which is related to frequent pattern mining. In experiments, the PXR models built by CPXR are very accurate in general, often outperforming state-of-the-art regression methods by wide margins. Because they use only several simple patterns and (piecewise) linear local regression models, those PXR models are easy to interpret. CPXR is especially effective for high-dimensional data. The CPXR methodology can also be used for analyzing prediction models and correcting their prediction errors. Finally, I will discuss how to use CPXR for classification, including results on medical risk prediction for traumatic brain injury and heart failure.
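To make the idea of pattern and local regression model pairs concrete, here is a minimal Python sketch of how a PXR-style model might produce a prediction. It is an illustration under stated assumptions, not the speaker's implementation: the names PatternModelPair and pxr_predict, the per-pair weight, the weighted-average rule for combining matching pairs, the fallback default model, and the toy "age"/"dose" data are all hypothetical and are not taken from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]  # one data instance: predictor name -> value


@dataclass
class PatternModelPair:
    """One (pattern, local regression model) pair. All names are hypothetical."""
    pattern: Callable[[Row], bool]  # logical condition describing a data subgroup
    model: Callable[[Row], float]   # local regression model fitted to that subgroup
    weight: float = 1.0             # assumed weight used when several patterns match


def pxr_predict(pairs: List[PatternModelPair],
                default_model: Callable[[Row], float],
                x: Row) -> float:
    """Predict with the local models whose patterns match x.

    Assumption: matching local models are combined by a weighted average,
    and a default model is used when no pattern matches.
    """
    matched = [p for p in pairs if p.pattern(x)]
    if not matched:
        return default_model(x)
    total_weight = sum(p.weight for p in matched)
    return sum(p.weight * p.model(x) for p in matched) / total_weight


# Toy data: two subgroups whose predictor-response relationships differ sharply.
pairs = [
    PatternModelPair(lambda r: r["age"] < 30, lambda r: 2.0 * r["dose"] + 1.0),
    PatternModelPair(lambda r: r["age"] >= 60, lambda r: -1.0 * r["dose"] + 10.0),
]


def default_model(r: Row) -> float:
    """Fallback linear model for instances that match no pattern."""
    return 0.5 * r["dose"] + 3.0


young = pxr_predict(pairs, default_model, {"age": 25.0, "dose": 2.0})
older = pxr_predict(pairs, default_model, {"age": 70.0, "dose": 2.0})
other = pxr_predict(pairs, default_model, {"age": 45.0, "dose": 2.0})
```

In this toy setup the same dose leads to different predictions depending on which pattern an instance matches, which is the kind of diverse predictor-response relationship the abstract describes. How CPXR actually mines the patterns and fits the local models is not shown here.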
This talk is based on the following recent paper: Guozhu Dong and Vahid Taslimitehrani. Pattern-Aided Regression Modeling and Prediction Model Analysis. IEEE Transactions on Knowledge and Data Engineering, 27(9):2452-2465, 2015.
Bio: Guozhu Dong is a full professor at Wright State University. His main research interests are data science, data mining and machine learning, bioinformatics, and databases. He has published over 150 articles and two books on data mining, and he holds four US patents. He is widely known for his pioneering and extensive work on contrast/emerging pattern mining and its applications, and for his work on first-order maintenance of recursive and transitive closure queries/views.
This seminar is hosted by Prof. Hong Cheng.
Venue: Room 1009,
      William M.W. Mong Engineering Building (ERB),
      (Engineering Building Complex Phase 2)
      The Chinese University of Hong Kong.
Friday, April 22, 2016 - 08:30 to 09:30