William C. Huguenin
Databases & Cloud
SQL (MySQL, Postgres)
AWS (EC2, Redshift)
Python Libraries
Data Science & Analytics
Predictive Analytics
Regression Analysis
Clustering & Segmentation
Database ETL & Reporting
Web Scraping & APIs
Data Visualization
D3.js (basic)
Periscope Data
Metis Data Science Bootcamp
Data Science Certificate 2016
Tufts University
BA Quantitative Economics 2011
Magna Cum Laude
University College London
Study Abroad: Economics 2010
New York, NY
Data Scientist
10/2018 - Current

Research and advisory firm benchmarking the digital performance of consumer brands.

  • Automated the classification of Instagram images by content type using Google Vision API data, non-negative matrix factorization, and k-means clustering, enabling deeper social media marketing research company-wide (ongoing project).

  • Developed improved methodology for benchmarking influencer marketing performance for beauty and haircare brands. Onboarded new data partner and scaled data ingestion and QA processes, expanding analysis coverage from 50 to 1,500 influencers.

Junior Data Scientist
02/2017 - 09/2018

  • Built pipelines for storage, enrichment, and reporting of scraped Chinese search engine, e-commerce, and social media data, reducing analyst burden in accessing and analyzing data.

Abt Associates
Bethesda, MD
Programmer Analyst - Social and Economic Policy
11/2014 - 03/2016

Mission-driven, global leader in public policy research and program implementation.

  • Analyzed college enrollment and retention outcomes for low-income students participating in federal college access programs. Proposed results-driven improvements to deployment of pilot college advising program, leading to $300,000 contract extension from client agency.

  • Promoted twice, six months ahead of schedule. Received six spot bonus awards recognizing outstanding quality of work in project management and data analysis.

Research Assistant - Social and Economic Policy
08/2011 - 10/2014

  • Constructed postsecondary academic and employment outcome measures (e.g., degrees obtained, time to degree) for students participating in National Science Foundation scholarship programs. Used metrics to monitor success of current NSF programming and drive recommendations for future improvements.

  • Standardized and cleaned large (millions of records) property transaction and foreclosure dataset for analysis of HUD community redevelopment programs.

Side Projects

Built Python library allowing users to join pandas DataFrames on approximate rather than exact key matches.

Yelp User Segmentation

Segmented Yelp users by topics most often discussed in restaurant reviews in order to customize users' online experience.

Airline Sentiment Analysis

Compared performance of various sentiment analyzers on tweets sent to U.S. airlines. Project featured in Data Science Weekly Issue 140.