Garrett Hoffman | Machine Learning Infrastructure & Engineering
Experience
Staff Software Engineer, ML Infrastructure | Reddit · Remote · Apr. 2020 to Current
- Architect, design and build a company-wide centralized machine learning platform, including an online and offline feature store, automated model training and evaluation pipelines, model serving and model monitoring.
- Develop and own backend microservices that support the ranking and serving of feeds, notifications, email digests and recommendation units, handling over 10,000 requests/second.
- Manage data infrastructure related to the machine learning platform and microservices, including Kafka and Flink consumers and Cassandra, Redis and Memcached clusters.
Freelance Machine Learning Engineer | Self-Employed · Remote · Aug. 2017 to Current
- Built a prototype question answering system for a linguistic AI startup, using an end-to-end memory network to extract answers to free-form questions from tabular data.
- Developed and deployed a clustering model for a health tech startup, using representation learning, fine-tuning and BERT embeddings to summarize eligibility criteria for clinical trials.
- Analyzed how employees' engagement with web portals and call centers impacts health benefit election decisions for an HR consulting company, utilizing gradient boosting and feature importance.
Director of Data Science & Machine Learning | Stocktwits · New York, NY · Aug. 2016 to Mar. 2020
- Led data science and machine learning activities, including building and managing data infrastructure, developing and deploying models and constructing ETL pipelines.
- Researched, built and deployed data products including sentiment models, moderation tools, image classifiers, recommendation systems, algorithmic feed sorting, credibility scoring and trend finding.
- Developed metrics, heuristics and indices to provide users with a layer of structured analytics on top of a high-volume, unstructured data set.
- Managed a team of two data scientists/analysts and served as product manager for data products by developing, prioritizing and grooming a long-term data science roadmap.
- Evangelized data science, machine learning and artificial intelligence at Stocktwits through technical blog posts, presentations and conference talks.
Instructor | Metis · Remote · Mar. 2018 to Feb. 2020
- Taught part-time professional development courses in Beginner Python and Math for Data Science and Introduction to Data Science.
- Delivered lectures on computer science topics including data structures, operations, control flow, functions and the Python packages NumPy, pandas, Matplotlib and scikit-learn.
- Delivered lectures on math and machine learning topics including linear algebra, calculus, probability, statistics, supervised/unsupervised learning, feature selection and model evaluation.
Actuarial Consultant, Planning and Risk Management | Xerox · Secaucus, NJ · July 2011 to July 2016
- Led the Advanced Data Analytics team tasked with product development for new data science products and services.
- Worked closely with clients to identify issues and opportunities surrounding employee benefit programs and deliver financial/strategic recommendations.
- Performed valuations of liabilities in excess of $10 billion by engineering life contingency models, and identified actionable insights.
- Forecasted future funding and reporting requirements to develop budget recommendations and risk management strategies.
- Communicated quantitative results to non-technical supervisors and clients through presentations and written reports to inform the decision-making process.
Education
Georgia Institute of Technology | Dec. 2019
Master of Science in Computer Science
Specialization in Machine Learning and Interactive Intelligence
The College of New Jersey | May 2011
Bachelor of Arts in Mathematics
Minors in Finance and Statistics
Skills
Math, Stats and Machine Learning: Probability Theory, Machine Learning, Deep Learning, Statistics, Simulation, Optimization, Natural Language Processing, Computer Vision
Computer Science: Databases, APIs, Containerization, Architecture Design, Data Structures, Algorithms, Testing
Tools/Technology: Python, SQL, TensorFlow, AWS, Docker, Kubernetes, Spark, Kafka, NoSQL, Git/GitHub
Self-Directed Learning
Data Engineering Nanodegree | Sept. 2019 | Udacity
Deep Learning Specialization | Feb. 2018 | Coursera / deeplearning.ai
Deep Learning Nanodegree | May 2017 | Udacity
Side Projects
Sagecli · Dec. 2019 to Current
- Open-source command-line interface for AWS SageMaker written in Go.
- Enables Data Scientists and Machine Learning Engineers to create and manage Jupyter Notebook Instances, Model Training Jobs and Model Endpoints from the command line.
- Easily define Model Training Jobs and Model Endpoints with YAML configuration files.
Tramline · Dec. 2019 to Current
- B2B SaaS product for monitoring Machine Learning models in production.
- Log model predictions, track distribution shift of model features and labels, and send alerts when models need to be retrained.
- Web-based application for viewing logs and dashboards and for configuring alerts to be sent via Slack, email or webhook.
Talks & Blogs
Evolving Reddit’s ML Model Deployment and Serving Architecture | Oct. 2021 | RedditEng
Deep Learning Methods for NLP | Sept. 2019 | O'Reilly Strata 2019 / O'Reilly AIConf 2019 / ODSC East 2018
Deploying Data Science Applications | May 2019 | ODSC East 2019 / ODSC Accelerate AI 2019
How Neural Networks Learn Distributed Representations | Feb. 2018 | O'Reilly Media
Convolutional Neural Networks for Language Tasks | Feb. 2018 | O'Reilly Media
Introduction to LSTMs with TensorFlow | Jan. 2018 | O'Reilly Media