David Kozak
Quantitative Research Scientist

The focus of my research is theoretical machine learning and uncertainty quantification, but theory is only useful insofar as it drives practical innovation. For eight years I have applied this philosophy to the financial power markets, coupling mathematical knowledge with the fundamental insight of experienced market professionals to develop fully automated trading algorithms that adapt to the dynamics of the most volatile markets in the world.

Yes Energy
Boulder, Colorado
Consulting Statistician
08/2018 to Present

I conduct interpretable analytics to provide insights to the management team at Yes Energy and to develop aspects of the quick signals product. My work has had two primary focuses. First, I created algorithms that fuse data from the live power sensors into a single index in order to predict LMPs at Western Hub for ICE trading. More recently, I have been developing short-term (24-hour) load forecasts at five-minute resolution, along with estimates of the forecast uncertainty.

eXion Energy
Boulder, Colorado
Consulting Statistician
01/2016 to 02/2018

I applied cutting-edge machine learning techniques to assist in understanding every aspect of the virtuals trading process for MISO, SPP, and PJM. Projects I oversaw included:

- Forecasting day-ahead and real-time energy prices.

- Portfolio optimization according to pre-specified objective functions.

- Risk analysis at nodal and portfolio level.

- Fundamentally driven statistical models of congestion prices.

Using these tools, I constructed automated trading strategies for MISO, PJM, and SPP that were profitable in every quarter of operation.

Endurance Energy
Boulder, Colorado
09/2011 to 05/2015

Using econometric and machine learning techniques, I traded virtual day-ahead futures on MISO and SPP, FTRs in MISO, and BalDay/BalWeek contracts on ICE. In addition to my trading responsibilities, I conducted several research-oriented projects, including:

- Predicting ISO-level and regional load curves, the congestion component of real-time LMPs, the volatility of real-time LMPs, day-ahead LMP values, and wind power generation.

- Developing statistically sound risk models to minimize company-wide drawdowns.

- Developing performance metrics to characterize strengths and weaknesses of individual traders and strategies.

Colorado School of Mines · Poate Fellowship

Awarded annually to one outstanding first-year graduate student in the Applied Mathematics and Statistics department.

American Statistical Association · Maurice Davies Award

Awarded annually to one outstanding statistics PhD student in the Front Range of Colorado and Wyoming.

Summer Schools
Gene Golub Summer School · Inverse Problems: Systematic Integration of Data with Models under Uncertainty

Internationally competitive summer school taught by world-class researchers to 50 PhD students from around the world.

Regularization in Machine Learning

Internationally competitive summer school taught by world-class researchers to 40 PhD students from around the world.

Colorado School of Mines
09/2015 to Present
PhD Statistics
Colorado School of Mines
09/2015 to 05/2018
MSc Statistics 2018
University of Colorado
09/2006 to 05/2011
BSc Applied Mathematics 2011
A Nonstationary Designer Space-Time Kernel

Presented at the 2018 Neural Information Processing Systems (NeurIPS) conference, the largest machine learning and AI conference in the world. In this work we present a new method for modeling time series data that deals with uncertainty in a natural way.

Sampled Tikhonov Regularization for Large Linear Inverse Problems

Under revision at IOP Inverse Problems. In this work we present a method that uses an adaptive parameter to prevent overfitting in the big data domain.

Intraday Load Forecasts with Uncertainty

In preparation. In this work we describe a method for producing accurate short-term load forecasts with rigorous estimates of the uncertainty.

Stochastic Subspace Descent

In preparation. In this work we develop a method for optimizing extremely large-scale problems for which access to the gradient is not feasible.

T-SQL (SQL Server)
PL/SQL (Oracle)
Learning Theory
Probability Theory
Inverse Problems