Black Patients Miss Out On Promising Cancer Drugs

Wrapped up my summer fellowship at ProPublica last week when our investigative piece was published! Give it a read here:

Black Patients Miss Out On Promising Cancer Drugs

A ProPublica analysis found that black people and Native Americans are under-represented in clinical trials of new drugs, even when the treatment is aimed at a type of cancer that disproportionately affects them.

The accompanying data methodology is here: How We Compared Clinical Trial and Cancer Incidence Data

This story was co-published with STAT and can also be found on Mother Jones.


For this story, I pitched the idea and did a ton of research, data analysis, reporting, and interviews, plus all of the data visualization. A huge thank you to my wonderful co-author Caroline Chen and amazing editor Sisi Wei!

The story was on the front page the day it published and seemed to be received well. I’ve learned so much from this fellowship and have been super grateful for this opportunity from ProPublica and the Google News Lab.


Update—Statement of impact since our story was published:

Our story was featured on Information is the Best Medicine, a black-owned talk radio station in Pennsylvania, as well as Axios, Vice, Mother Jones and The Atlantic’s People v. Cancer forum. It was reprinted in the Boston Globe and Indianz, a Native American publication. Nonprofit BIO Ventures for Global Health also wrote an op-ed in response to our story, noting that “clinical trials are perpetuating existing health care disparities across the globe.”

In the course of interviewing these patients, we realized that many people don’t understand how trials work, which prompted us to create the Cancer Patient’s Guide to Clinical Trials. The guide has been shared by the Leukemia and Lymphoma Society.

In order to report this story, we made our own database of trial demographics, drawn from FDA websites. Our story focused on cancer, but we made our database — which covers all drugs approved since 2015 — public so that other reporters don’t have to replicate our manual labor.

Predicting Readmission Risk after Orthopedic Surgery

My colleagues and I from the Clinical Research Informatics Core at Penn Medicine gave poster presentations at the Public Health session of the Symposium on Data Science and Statistics last week.

Here's the abstract:

Our project examined hospital readmissions after knee and hip replacement surgeries that took place within the University of Pennsylvania health system. We used a variety of information available within patient electronic health records and an assortment of machine learning tools to predict the risk of readmission for any given patient at the time of discharge after a primary joint replacement surgery. We faced challenges related to missing data. We used a number of different machine learning models, such as logistic regression, random forest, and gradient boosted trees. We also used an automated machine learning pipeline tool, TPOT, that uses a genetic algorithm to search through the machine learning model/parameter space to automatically suggest successful machine learning pipelines. We trained multiple models that predicted readmissions better than the existing clinical methods, with statistically significant increases in AUC over the clinical baseline. Finally, our models suggested a number of features useful for readmission prediction that are not used at all in the existing clinician model. We hope our new models can be used in practice to help target patients at high risk of readmission after joint replacement surgery, and to help inform which interventions may be most useful.
SDSS Poster Presentation
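
For readers unfamiliar with the metric, here is a minimal sketch of how an AUC comparison against a baseline score might look. It is not the project's code: the function names are hypothetical, the scores would come from whatever models are being compared, and the bootstrap interval is just one common way to check whether an AUC gain is distinguishable from zero (the abstract does not say which significance test was used).

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Equals the probability that a randomly chosen positive case (readmitted)
    scores higher than a randomly chosen negative case, counting ties as 1/2.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_gain(labels, baseline, model, n_boot=1000, seed=0):
    """Approximate 95% percentile interval for AUC(model) - AUC(baseline).

    Assumption: patients are resampled with replacement, and both scores are
    evaluated on the same resample so the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:  # skip resamples that contain only one class
            continue
        gain = auc(y, [model[i] for i in idx]) - auc(y, [baseline[i] for i in idx])
        diffs.append(gain)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]
```

If the lower end of the interval sits above zero, the new model's AUC gain over the baseline is unlikely to be a resampling fluke.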

Machine Learning for Healthcare

Yesterday I gave a dev talk at Philly Tech Week on machine learning for healthcare, slides embedded below.

Description: "How are machine learning and data science being adopted in healthcare? From diagnostics, risk predictions, and more, this session will provide an overview of machine learning applications using electronic health records, walk through the process of how a model might be trained and used, and discuss methods for improving interpretability to augment medical decision-making."

Here's a link to the slides if you want to see my notes.

I think the talk went pretty well. In fact, I think I am actually a pretty good speaker, although I'm not sure how much I get out of speaking personally. The talk was pretty well attended, and I did receive a lot of positive feedback, so hopefully I inspired some people in healthcare or machine learning in some way or another.

Music and Mood: Assessing the Predictive Value of Audio Features on Lyrical Sentiment


aka - what's the relationship between the audio features of a song and how positive or negative its lyrics are? 

aka - data analysis of my spotify music data + sentiment analysis + supervised machine learning

aka - my senior thesis

the full jupyter notebook used to conduct this data analysis can be found on my github here: Spotify Data Analysis

(pg. 32 and onward is just the full python jupyter notebook in the appendix.)
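
To give a rough feel for the question, here is a minimal sketch, not the notebook's actual code. It scores lyrics with a tiny made-up word list and checks how that score correlates with one audio feature. The word lists, the songs, and the choice of "valence" as the feature are all assumptions for illustration; a real analysis would use a proper sentiment tool and real track data.

```python
import math

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"love", "happy", "sunshine", "dance"}
NEGATIVE = {"cry", "lonely", "broken", "dark"}


def lyric_sentiment(lyrics):
    """Score in [-1, 1]: (positive - negative) / matched words, 0.0 if none match."""
    words = lyrics.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up songs: (audio "valence" feature, lyrics)
songs = [
    (0.9, "dance in the sunshine love"),
    (0.7, "happy love but lonely"),
    (0.3, "broken and lonely in the dark"),
    (0.2, "cry cry dark love"),
]

valence = [v for v, _ in songs]
sentiment = [lyric_sentiment(text) for _, text in songs]
r = pearson(valence, sentiment)
print(f"correlation between valence and lyric sentiment: {r:.2f}")
```

A correlation like this only describes a linear relationship on the sample at hand; the supervised models in the thesis ask the harder question of whether audio features can predict sentiment on songs they have not seen.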


Algorithmic Bias

I recently wrote a final paper for my Digital Culture course, titled "Algorithmic Bias and the Myth of Big Data Neutrality" - a really interesting and important topic to consider as we move forward in our increasingly technological society.

penn play promotional profile pictures

for short, ppppp

adobe photoshop, lightroom  

fnar 247: environmental animation master post

I'm currently taking FNAR247: Environmental Animation. I am really enjoying the course so far, so I figured I'd create a post documenting my progress. We have 1-week mini projects that are relatively open-ended but loosely based on some theme. These are mainly modeled and scripted in 3ds Max using MAXScript, with post-processing work done in Adobe After Effects.

Here are gif-ified (aka a bit too compressed for my liking but also gifs are cool) versions of a few of the animations:


adobe photoshop, digital painting.


nice to get back in the art groove for the summer.


rotary telephone

maya 2014, rendered w mental ray

ambient occlusion pass

made in 3d computer modeling w scott white

it was our first project, but i went back the past few days and tightened it up, added the cord, played around with lighting. it's crazy to see how much i've improved over the semester!

definitely taught me a lot of patience. i'm pretty sure i've attempted to make every single piece on this thing at least twice. i have at least 8 copies of the body, and i know i made the rotary face at least just as many times. i'd like to think all of the hard work paid off, though!

next project: maya dartboard...

project dump

some misc work from the year

morton salt girl - work in progress
3d modeling, sp '14
made in maya 2014 and mudbox

art design and digital culture, sp '14
made in blender and after effects

figure drawing, fall '13
