Garrett Hoffman | Machine Learning Infrastructure & Engineering
Experience
Staff Software Engineer, ML Infrastructure
Reddit · Remote · Apr. 2020 to Current

  • Architect, design and build a company-wide centralized machine learning platform, including an online and offline feature store, automated model training and evaluation pipelines, model serving and model monitoring.
  • Develop and own backend microservices that support the ranking and serving of feeds, notifications, email digests and recommendation units, handling over 10,000 requests/second.
  • Manage data infrastructure for the machine learning platform and microservices, including Kafka and Flink consumers and Cassandra, Redis and Memcached clusters.
Freelance Machine Learning Engineer
Self-Employed · Remote · Aug. 2017 to Current

  • Built a prototype question-answering system for a linguistic AI startup, using an end-to-end memory network to extract answers to free-form questions from tabular data.
  • Developed and deployed a clustering model for a health tech startup using representation learning, fine-tuning and BERT embeddings to summarize eligibility criteria for clinical trials.
  • Analyzed how employees' engagement with web portals and call centers impacts health benefit election decisions for an HR consulting company, utilizing gradient boosting and feature importance.

Director of Data Science & Machine Learning
Stocktwits · New York, NY · Aug. 2016 to Mar. 2020

  • Led data science and machine learning activities, including building and managing data infrastructure, developing and deploying models, and constructing ETL pipelines.
  • Researched, built and deployed data products including sentiment models, moderation tools, image classifiers, recommendation systems, algorithmic feed sorting, credibility scoring, and trend finding.
  • Developed metrics, heuristics, and indices to provide users with a layer of structured analytics on top of a high-volume, unstructured data set.
  • Managed a team of two data scientists/analysts and served as product manager for data products by developing, prioritizing and grooming the long-term data science roadmap.
  • Evangelized data science, machine learning and artificial intelligence at Stocktwits through technical blog posts, presentations and conference talks.

Instructor
Metis · Remote · Mar. 2018 to Feb. 2020

  • Taught part-time professional development courses in Beginner Python and Math for Data Science and Introduction to Data Science.
  • Delivered lectures on computer science topics including data structures, operations, control flow, functions, and the Python packages NumPy, pandas, Matplotlib and scikit-learn.
  • Delivered lectures on math and machine learning topics including linear algebra, calculus, probability, statistics, supervised/unsupervised learning, feature selection, and model evaluation.
Actuarial Consultant, Planning and Risk Management
Xerox · Secaucus, NJ · July 2011 to July 2016

  • Led the Advanced Data Analytics team, tasked with product development for new data science products and services.
  • Worked closely with clients to identify issues and opportunities surrounding employee benefit programs and deliver financial/strategic recommendations.
  • Performed valuations of liabilities in excess of $10 billion by engineering life contingency models, and identified actionable insights.
  • Forecasted future funding and reporting requirements to develop budget recommendations and risk management strategies.
  • Communicated quantitative results to non-technical supervisors and clients through presentations and written reports to inform the decision-making process.

Education
Georgia Institute of Technology
Dec. 2019
Master of Science, Computer Science
Specialization in Machine Learning and Interactive Intelligence
The College of New Jersey
May 2011
Bachelor of Arts, Mathematics
Minors in Finance and Statistics
Skills
Math, Stats and Machine Learning: Probability Theory, Machine Learning, Deep Learning, Statistics, Simulation, Optimization, Natural Language Processing, Computer Vision
Computer Science: Databases, APIs, Containerization, Architecture Design, Data Structures, Algorithms, Testing
Tools/Technology: Python, SQL, TensorFlow, AWS, Docker, Kubernetes, Spark, Kafka, NoSQL, Git/GitHub
Self-Directed Learning
Data Engineering Nanodegree
Sept. 2019
Udacity
Deep Learning Specialization
Feb. 2018
Coursera / deeplearning.ai
Deep Learning Nanodegree
May 2017
Udacity
Side Projects
Sagecli · Dec. 2019 to Current

  • Open-source command-line interface for AWS SageMaker written in Go.
  • Enables Data Scientists and Machine Learning Engineers to create and manage Jupyter Notebook Instances, Model Training Jobs and Model Endpoints from the command line.
  • Easily define Model Training Jobs and Model Endpoints with YAML configuration files.

Tramline · Dec. 2019 to Current

  • B2B SaaS product for monitoring Machine Learning models in production.
  • Log model predictions, track distribution shift of model features and labels, and send alerts when models need to be retrained.
  • Web-based interface for viewing logs and dashboards and configuring alerts to be sent via Slack, email or webhook.

Talks & Blogs
Evolving Reddit’s ML Model Deployment and Serving Architecture
Oct. 2021
RedditEng
Deep Learning Methods for NLP
Sept. 2019
O'Reilly Strata 2019 / O'Reilly AIConf 2019 / ODSC East 2018
Deploying Data Science Applications
May 2019
ODSC East 2019 / ODSC Accelerate AI 2019
How Neural Networks Learn Distributed Representations
Feb. 2018
O'Reilly Media
Convolutional Neural Networks for Language Tasks
Feb. 2018
O'Reilly Media
Introduction to LSTMs with TensorFlow
Jan. 2018
O'Reilly Media