Newell-Simon Hall 3124
208 S. Millvale Ave.
Pittsburgh, PA 15224
Carnegie Mellon University
Pittsburgh, PA 15213
11 Sep 2005
I am currently studying the applications of logistic regression, Dynamic AD-trees, and other statistical classifiers to high-
dimensional data mining problems for discriminative and modeling applications. Though I find algorithms and data structures
fascinating, my computing interests extend to the hardware and software systems that enable massive computation and
automated data analysis. My interests also include the technological, economic, and social environments that facilitate and
constrain scientific exploration and problem solving. I prefer collaborative research and development with direct application
to real-world problems. I am project-oriented, persistent, and comfortable in environments where interdisciplinary teams solve
impossible problems on a tight budget in record time.
(Algorithms, Combinatorics, and Optimization) Carnegie Mellon University, advised by Andrew Moore, May 2004.
(Algorithms, Combinatorics, and Optimization) Carnegie Mellon University, advised by Andrew Moore, May 1998.
(Mathematics) Western Washington University, Magna Cum Laude, Graduation with Distinction in Mathematics, June
· NASA Space Grant Scholarship Pennsylvania Space Grant Consortium, December 2001
· Outstanding Participant CMU Center for Nonlinear Analysis Summer Undergraduate Applied Mathematics Institute,
· Outstanding Mathematics Graduate Western Washington University, June 1997
· Seafirst Excellence Award, 4 year merit scholarship from Seafirst Bank, June 1992
· Washington Scholars Award, 4 year merit scholarship from Washington State, June 1992
Education Related Activities and Employment
Carnegie Mellon University, Robotics Institute, Auton Lab, May 2004 through present
Pfizer Collaboration Lead
Carnegie Mellon University, Robotics Institute, Auton Lab, January 2003 through present,
coordinated development of Auton software products for the Auton-Pfizer collaboration
, January 2002 to April 2002, researched wireless communications, developed
an elevator interface protocol and accompanying software, provided some systems support,
and delivered finished functional prototypes for these systems (Aethon develops a mobile
robot for hospital use)
Carnegie Mellon University, Robotics Institute, Auton Lab, 2000 through present,
including planning, acquisition, deployment, maintenance and security of computing resources
and servers; also helped hire, train, and supervise three systems administrators.
Carnegie Mellon University, Department of Mathematical Sciences, August 1997
through May 2004, except when teaching (see below)
Western Washington University, Department of Mathematics, September 1996 to June
1997, developed Maple-based two- and three-dimensional tomographic reconstruction
software for convex polytopes
Western Washington University, Department of Mathematics' Math Center, September
1994 to June 1997, upper-division mathematics tutor
Deployed Hardware and Software Systems
May 2005- First source release for my Logistic Regression with Truncated Regularized
Iteratively Re-weighted Least Squares software. Licensed under the GNU General Public
License (GPL), available at
AFC Active Learning
September 2004- Active learning software for scheduling roboticized pharmaceutical
experiments (see AFC, below). I am responsible for the design, implementation, and
maintenance of this software. This software has been delivered to the sponsor, and will be
maintained and distributed contingent on future contracts.
Auton Fast Classifiers (AFC)
April 2002- Fast classification software for high-dimensional datasets. I provided new
algorithms and eventually took over the entire software system, including the user interface,
learner and dataset framework, performance evaluation, and documentation. This
software is still in use by the sponsor, is maintained regularly, and has been widely distributed.
Aethon Elevator Controller
January 2002- Software for managing a single-board computer and serial interface board
connected to an elevator's control system. I developed a protocol and daemons for
bidirectional communication between a mobile robot and a passenger elevator. Aethon's current
elevator controller is a derivative of my prototype.
Aethon Wireless Relays
January 2002- Stand-alone devices for ad-hoc relaying of communications between mobile
robots and an elevator controller (above). I selected the embedded hardware, created a
small GNU/Linux operating system and installation utilities adapted to compact flash, and
wrote the message relay software. Aethon's current version of this device is a derivative of
Auton Build System
-July 2001- I maintain the makefile and scripts used for building all Auton software on
various compilers, microprocessors, and POSIX-ish environments.
Auton Compute Infrastructure
-January 2000- Servers, compute machines, storage and services used by Auton lab
members for research. My responsibilities include
· hardware and software selection, procurement, and deployment
· maximizing performance for niche scientific needs on a limited capital budget in a
· maintaining vendor relationships and negotiating affordable prices
· understanding the current high-performance and consumer computing markets, both
for our needs and for occasional advisement of clients and other academics.
· maintaining software, security, and services
This collection of user and server systems is used daily and maintained constantly. Some
responsibilities have been shared with additional admins since Spring 2002.
Fall 1996 Software package for reconstructing a density function over a convex polytope
using only information from (n-1)-dimensional integrals ("x-rays"). I developed this
software for a math professor to use as part of his Geometric Tomography classes.
Publications and Talks
· Paul Komarek and Andrew Moore, Making Logistic Regression A Core Data Mining Tool with TR-IRLS, International
Conference on Data Mining, 2005 (ICDM 2005)
· Paul Komarek, Logistic regression for fast, accurate, and parameter free data mining, Invited talk at Google Inc., July
· Paul Komarek and Andrew Moore, Making Logistic Regression A Core Data Mining Tool: A Practical Investigation of
Accuracy, Speed, and Simplicity, Technical Report TR-05-27 at the Robotics Institute, Carnegie Mellon University,
· Paul Komarek, Autonomous Fast Classifiers for Pharmaceutical Data Sets, Invited talk at Applied Biosystems Inc., July
· Paul Komarek, Autonomous Fast Classifiers for Pharmaceutical Data Sets, Invited talk at the Midwest
Biopharmaceutical Statistics Workshop, 2004 (MBSW 2004)
· Paul Komarek, Logistic Regression for Data Mining and High-Dimensional Classification, Doctoral Thesis, 2004
· Alex Gray, Paul Komarek, Ting Liu, and Andrew Moore, High-Dimensional Probabilistic Classification for Drug
Discovery, Computational Statistics, 2004 (CompStat 2004)
· Anya Goldenberg, Paul Komarek, Jeremy Kubica, Andrew Moore, and Jeff Schneider, A Comparison of Statistical and
Machine Learning Algorithms on the Task of Link Completion, Knowledge Discovery in Databases, 2003 (KDD 2003)
· Paul Komarek and Andrew Moore, Fast Logistic Regression for Large Sparse Datasets with Binary Outputs, Artificial
Intelligence and Statistics, 2003 (AISTATS 2003)
· Paul Komarek and Andrew Moore, A Dynamic Adaptation of AD-trees for Efficient Machine Learning on Large Data
Sets, International Conference on Machine Learning, 2000 (ICML 2000)
· Paul Komarek, Canonical Ramsey Numbers: A New Lower Bound for Off-Diagonal Ramsey Numbers, Joint
Mathematics Meetings (AMS/MAA)
Other Professional Activities
· Mentoring Auton graduate students, Jan 2002 to Dec 2002 and Jan 2005 to present.
· Advising an undergraduate intern in the Auton lab, June 2005 to present.
· Supervising part-time system administration employees, March 2002 to June 2005.
· Research advisement of graduate and undergraduate students in the Auton Lab, at the Robotics Institute, Carnegie Mellon
University, May 2004 to present.
· Refereeing submissions to journals and conferences, including the Information Systems journal, the IEEE Transactions on
Knowledge and Data Engineering (TKDE), Uncertainty in Artificial Intelligence (UAI), Knowledge Discovery and Data
Mining (KDD), and Neural Information Processing Systems (NIPS).
· Teaching Assistant for the Department of Mathematical Sciences, Carnegie Mellon University, August 1998 to December
· Participant in the Center for Nonlinear Analysis' Summer Undergraduate Applied Mathematics Institute at Carnegie
Mellon University, Summer 1996.
When outdoors I enjoy soccer, hiking, travel and photography. Indoors, I dabble in electronics and use my embedded computing
experience for entertainment. I like to combine software, hardware, woodworking and metalworking to complete "essential"
upgrades to our home, including a small home theater.
Available on request.