Full Name
Lee K. Jones
Job Title
Professor Emeritus of Mathematics and Statistics
Company
UMass Lowell
Speaker Bio
Dr. Lee K. Jones is professor emeritus of mathematics and statistics at the University of Massachusetts Lowell. He held additional responsibilities as a coordinator of the Computational Mathematics Doctoral Program and served as a consultant to the Submillimeter Wave Technology Laboratory. Prior to his academic tenure, Dr. Jones worked as an engineer and principal scientist at Lockheed, MIT Lincoln Laboratory, and the Naval Research Laboratory. He currently consults for I.P.A. (Individual Prediction Analysis).

His research interests are diverse, spanning areas such as transportation, mathematics of data mining, radar image processing, pattern recognition, and simulation. Dr. Jones has made significant contributions to his field, as evidenced by his extensive publication record, which includes works on robust controls for traffic networks, the development of algorithms for genetic data analysis, and advancements in numerical methods and statistical theories.

With I.P.A., he focuses exclusively on local statistical learning, in particular on truly local determination of bandwidth without (global) cross-validation methods; on finding the variables that matter for interpreting individual queries, since these variables may change from query to query when the target function is nonlinear; and on finding lower-dimensional structure that provides user visualization and accurate prediction for the given query.
Abstract
We first review some optimal finite sample accuracy bounds in Loader [1999] and J. [2009] for weighted k-nearest neighbor rules for estimating a function f(x) at a point xo under ‘smoothness’ assumptions and conditions of approximate linearity in the spherical neighborhood about xo containing the k nearest neighbors. Such bounds depend only on the design points and not on the responses. We extend these results here to include neighborhoods that consist of the convex hull of xo and any sub-collection of the neighbors. These new bounds can be computed using greedy convex optimization and Karmarkar’s linear programming algorithm with a quadratic constraint. We indicate how to implement the new results for binary classification problems, where f(x) = Pr(Success|x), by determining appropriate neighborhoods from the bounds and by using information criteria that measure the degree of approximate linearity of f(x) in the neighborhood based on the design and response sample. We search (greedily) over lower-dimensional subspaces S of the design domain for accurate estimates at xo of fs(x) = E(f(x)|x) where xo and x are projections of the design data onto the subspace S. Since the bounds do not depend on the observed responses, overfitting is avoided and computation is the only cost. On the other hand, thresholding an information criterion to determine approximate linearity will have Type II error. Additional information measures, with or without response dependence, can be combined with the latter two procedures in a strategy to pick the projection and neighborhood.
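As a rough illustration of a weighted k-nearest neighbor rule whose weights depend only on the design points, the sketch below picks the k nearest neighbors of xo and computes minimum-norm weights that sum to one and reproduce linear functions exactly at xo. This is a generic local-linear construction assumed for illustration, not the bounds or the optimization procedure from the talk; the function names `knn_linear_weights` and `estimate` are hypothetical.

```python
import numpy as np

def knn_linear_weights(X, x0, k):
    """Indices of the k nearest neighbors of x0 and weights on them.

    The weights are the minimum-norm solution of:
        sum_i w_i = 1  and  sum_i w_i * (x_i - x0) = 0,
    so they use the design points only, never the responses.
    Assumes k >= d + 1 and neighbors in general position.
    """
    X = np.asarray(X, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    dist = np.linalg.norm(X - x0, axis=1)
    idx = np.argsort(dist)[:k]
    # Constraint matrix: a column of ones plus the centered neighbors.
    A = np.hstack([np.ones((k, 1)), X[idx] - x0])
    e = np.zeros(A.shape[1])
    e[0] = 1.0
    # Minimum-norm w satisfying A.T @ w = e.
    w = np.linalg.pinv(A.T) @ e
    return idx, w

def estimate(X, y, x0, k):
    """Weighted k-NN estimate of f(x0) from responses y."""
    idx, w = knn_linear_weights(X, x0, k)
    return float(w @ np.asarray(y, dtype=float)[idx])
```

Because the weights are fixed before any response is seen, a quantity such as `np.sum(w**2)` (the variance inflation factor under equal-variance, uncorrelated noise) can be compared across candidate neighborhoods without using the responses.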
Lee Jones