Dartmouth College
Sept. 2011 to June 2015
BA History 2015
BA Economics 2015
Nov. 2017 to June 2018
Data Science Career Track

I take a liberal arts approach to data science, making insights accessible to every organizational level. I turn the theoretical into the practical, illuminating the 'why' and the 'what's next.'

The Baseball Futures Market
Winter 2018

  • Constructed a data set of over a million rows of play-by-play data from an online database, manipulated it using pandas and NumPy, and connected it to historical odds data from an acquired data set
  • Created visualizations using matplotlib and seaborn, along with preliminary exploratory data analysis and inferential statistics
  • Predicted four different betting outcomes for each game with linear and logistic regression and tracked my account
  • Tested on a single season of games, retraining my model every year
  • When wagering $100-500 per bet, the account won around $350,000 over eight seasons
  • Wrote a blog post (available on my website) describing the project, including an interactive graph built using Dash by Plotly

Computational Linguistics & Lyrics
Spring 2018

  • Used the Genius API to build a data set of lyrics for thirty-three albums from eleven artists to perform natural-language processing
  • Performed regular expression operations to prepare data for analysis
  • Created visualizations showing the distribution of common words
  • Built a pipeline to vectorize and prioritize words with a tf-idf transformer in preparation for classification using a support vector machine
  • Ran 100 different random combinations of training and test data, correctly predicting the artist 79% of the time
  • Clustered to find predicted groupings and cross-genre confusion

Dallas, TX
Data Scientist
Aug. 2018 to Current

  • Working with a team of developers, designers, and strategists to create bespoke and technology-agnostic products for companies in any industry
  • Experience in Commercial Real Estate and Oil & Gas with deliverables aimed at implementing ML & AI into everyday operations
  • General ideation on how best to implement Machine Learning for a company's given state of technology
  • Dedicated time for exploration of the newest industry techniques

BNY Mellon
Dallas, TX
Business Analyst
May 2017 to Feb. 2018

  • Signed a short-term contract to aid in various data efforts, tracking millions of mortgage loans with the preexisting Alteryx infrastructure
  • Used the R tool within Alteryx to drastically increase the flexibility of data processing, in a medium that any analyst could use
  • Consulted in database construction meetings with a focus on translating business requirements to technical definitions and vice versa, while working within a Waterfall data model structure
  • Maintained various R scripts for categories of data requests, resulting in outputs for various levels of the organization
  • Spearheaded a new approach to processing incoming data and correctly updating it within the system of record by coordinating between managers and workers in three independent departments

NorthStar Anesthesia
Irving, TX
Operations Analyst
Mar. 2016 to Mar. 2017

  • Manipulated billing data to explain Operating Room Utilization and charge hospitals for excess coverage
  • By June, product rollout began; R-generated metrics were used consistently for forty facilities and resulted in $300,000 in billing
  • Power BI and DOMO, combined with the flexibility of R, produced refreshable and filterable graphics on demand, observing the behavior of 2,000 providers for 180 facilities in twenty-one states
  • SQL queries within R pulled data from one of six locations related to: billing, compensation and benefits, time stamp data, financial performance, contract and budget data, and scheduling
  • R fed graphs into LaTeX to present a critical analysis of the problems specific to each facility and the improvements that could be made if the financial, operational, and political environment allowed it

Programming: Python, R, pandas, NumPy, tidy, RStudio, Jupyter Notebook, LaTeX
Machine Learning: scikit-learn, nltk, Linear Regression, Logistic Regression, Support Vector Machines, Clustering, Pipeline, Principal Component Analysis, tf-idf
Data Visualization: matplotlib, seaborn, Bokeh, Dash by Plotly, ggplot, Power BI, DOMO