Simpsons Dialogue Search Engine with spaCy and Scikit-learn

After learning the basics of spaCy, we decided that a Simpsons quote search engine would be a perfectly cromulent project. Luckily, Kaggle has a Simpsons dataset with dialogue from the first 27 seasons. We were not particularly concerned with the lack of dialogue from seasons 28-31.

Predicting Box Office Revenue

I built a model from data on more than 7,300 movies from TMDb and used it to predict box office revenue for a Kaggle competition. An interactive version is deployed on Heroku, and the notebook is on my GitHub.

Did Iron Man Kill the Romantic Dramedy?

“Romantic dramedies” thrived from the early ’90s until 2009, when their box office revenue suddenly plummeted. Was the rise of superheroes and space operas to blame? I set out to find an explanation: Full article.