Huafeng (Hua) Zhang

A critical thinker who wants to connect the dots using data-driven solutions

Coding: R, Python, SQL, GitHub
Statistics: Hypothesis Testing, Clustering & Regression Modeling, Natural Language Processing, Sampling, Experimental Design
Montana State University
B.S. Mathematics - Statistics 2017
Minor in Economics (GPA: 3.76)
Major GPA: 3.80, Overall GPA: 3.80
Pi Mu Epsilon (Math Honor Society)
Graduated with Highest Honors
Additional Info
• Completed graduate-level statistics courses on sampling, experimental design, and probability theory
• Completed data science courses on DataCamp and Coursera
• Fluent in Chinese and English
Data Science Consultant · The Refugee Center Online · 
Portland, OR
11/'17 - 08/'18
Research Analyst · Google LLC · 
Seattle, WA
07/'18 - Present

  • Lead and manage multiple analysis tasks for risk assessment across several products at Google
  • Develop workflows and dashboards that let teams monitor unexpected behavior daily, such as fraud and data policy violations
  • Interact with several Google core systems to create metrics that support multiple audits
  • Build advanced analytical tools that let auditors perform statistical sampling or visualize text data with a single click
  • Host data trainings that enable auditors to work with data independently and understand common statistical learning methods
  • Use statistics and research to support diversity, equity, and inclusion efforts at Google

Data Scientist · CityBldr · 
Seattle, WA
10/'17 - 06/'18

  • Led a project translating unstructured information into interpretable features for CityBldr's machine learning pipeline using NLP techniques such as Naive Bayes and topic modeling
  • Developed a comprehensive model of each city’s unique development infrastructure by combining data from various sources (the only member of the data science team to build and implement the project pipeline)
  • Scraped and stored real estate and GIS data using web scraping, SQL, and QGIS
  • Drew insights from multivariate data analysis to inform company strategy and public communication messages, and presented results to engineering, project management, and sales teams

Statistical Consultant · Montana State University · 
Bozeman, MT
01/'17 - 08/'17

  • Helped researchers identify patterns in antibody profiles by performing hierarchical cluster analysis, and applied a chi-squared test to compare these patterns to the antibody profiles 
  • Presented a Monte Carlo simulation of African lion population sampling to an ecology researcher using R Shiny
  • Summarized project status and provided feedback to consultants and clients

Improving a Predictive Model of Student Progress by Adding Learned Features from Unstructured Text Data

Applying a mixed-effects logistic regression approach and investigating NLP methods to derive additional features from unstructured text responses.

Financial Health of United States Banks in 2016

Used k‐means cluster analysis in R to study patterns of financial health; found failed banks were all in one cluster.

Study and Analysis of Household Wi‐Fi Speed

Designed balanced 3‐factor factorial experiment; collected data; used three‐way ANOVA and orthogonal contrast test in SAS.

Algorithms for Climate Data Sonification

Built generalized additive model for temperature data; created R algorithm to see if sonification can enhance understanding.

Analysis of Website Traffic Data

Used Google Analytics API for traffic data; applied ANOVA F‐Test and Tukey‐Kramer test to find optimal maintenance days.

Conditional Probability and Information Retrieval

Explained how to use conditional probability to solve a particular problem in the field of information retrieval.