Data science projects

Insights From “Helpful” Reviews

Fictitious review for a cat food product.

I used NLP and neural networks to analyze product reviews from Amazon and predict whether readers would rate them “helpful” (or not). You can read about how I preprocessed and modeled this funky data in this blog post.

A Random Forest for the Trees

bar chart showing accuracy scores of four different machine learning models

I developed a machine learning model that can predict the type of tree growing on a tract of land using only cartographic data like elevation, aspect, and distance to water. You can read about how I evaluated the model in this blog post.

Best Bets in Oregon Real Estate Investment

bar plot showing current and projected future median home prices in five ZIP codes

Using data on median home price per county for the last 22 years, I identified the best places to invest in real estate and performed time series modeling to project return on investment in the near future. See my blog post about how I used Facebook Prophet to do this.

Evaluating the Discount Strategy at Northwind Traders

bar plot showing average increase in quantity of items in an order at various discount levels

After extracting data from a SQLite database, I used hypothesis testing to evaluate the effects of discounts on orders placed at the (fictitious) Northwind Traders company. See my blog post on this project.
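
For a rough sense of what this kind of hypothesis test looks like, here is a minimal sketch. It is not the project's code: the summary above does not say which test was used, so this sketch assumes a one-sided permutation test, and the order quantities are made up for illustration rather than taken from the Northwind database. The function name `permutation_test` is also hypothetical.

```python
import random


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """One-sided permutation test: is the mean of group_b larger than group_a's?"""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_b) / n_b - sum(group_a) / n_a

    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = sum(pooled[n_a:]) / n_b - sum(pooled[:n_a]) / n_a
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up order quantities, for illustration only (not Northwind data).
no_discount = [10, 12, 15, 8, 20, 14, 9, 11, 13, 16]
discounted = [18, 25, 22, 30, 17, 24, 28, 21, 19, 26]

observed_diff, p_value = permutation_test(no_discount, discounted)
print(f"difference in mean quantity: {observed_diff:.1f}")
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value here would suggest that the discounted orders tend to be larger than chance alone would explain. A permutation test keeps the sketch free of distributional assumptions and third-party libraries.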

Pricing Midrange Homes in King County, WA

plot showing locations of homes in King County, color-coded by sale price

Focusing on homes within reach of a middle-income family, I used multiple linear regression to determine which factors most affect home prices (and what families can do about that). You can read about how I prepared the data for modeling in this blog post.
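
To illustrate the idea behind multiple linear regression, here is a minimal sketch under stated assumptions. It is not the model from the project: the helper `fit_ols` is hypothetical, the two features (living area and bedroom count) are chosen only as an example, and the six homes below are made-up, noise-free numbers rather than King County sales data.

```python
def fit_ols(rows, targets):
    """Fit y = b0 + b1*x1 + ... + bk*xk by solving the normal equations."""
    # Prepend a constant 1.0 to each row so that b0 acts as the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])

    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up homes: (living area in thousands of sq ft, bedrooms).
features = [(1.0, 2), (1.2, 2), (1.5, 3), (2.0, 3), (1.8, 4), (2.4, 4)]
# Made-up prices in thousands of dollars, for illustration only.
prices = [270, 310, 380, 480, 450, 570]

intercept, per_ksqft, per_bedroom = fit_ols(features, prices)
print(f"intercept: {intercept:.1f}")
print(f"per 1,000 sq ft: {per_ksqft:.1f}")
print(f"per bedroom: {per_bedroom:.1f}")
```

Each fitted coefficient estimates how much the price changes when one feature increases by one unit while the others stay fixed, which is what makes the method useful for asking which factors matter most.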