
Data science ethics
Some common ways people misrepresent data, either intentionally or unintentionally, include:
Claiming causality where it’s not in the scope of inference of the underlying study
Distorting axes and scales to make the data tell a different story
Visualizing spatial areas instead of human density for issues that depend on and affect humans
Omitting uncertainty in reporting
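As a rough illustration of the axis distortion point above, the sketch below compares how tall two bars are drawn relative to each other when the axis starts at zero and when it starts just below the smaller value. The function name `bar_height_ratio` and the numbers are hypothetical and are not taken from any figure in these slides.

```python
def bar_height_ratio(small, large, baseline=0.0):
    """Ratio of the drawn bar heights when the value axis starts at `baseline`."""
    if baseline >= small:
        raise ValueError("baseline must be below the smaller value")
    return (large - baseline) / (small - baseline)


# Hypothetical values chosen only for illustration.
small, large = 50.0, 55.0

honest = bar_height_ratio(small, large)               # axis starts at 0
truncated = bar_height_ratio(small, large, 45.0)      # axis starts at 45

print(f"zero baseline:  taller bar is drawn {honest:.2f}x the shorter one")
print(f"truncated axis: taller bar is drawn {truncated:.2f}x the shorter one")
```

The underlying values differ by 10%, but with the truncated axis the taller bar is drawn at twice the height of the shorter one.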
How plausible is the statement in the title of this article?

What does “research shows” mean?

Moore, Steven C., et al. “Association of leisure-time physical activity with risk of 26 types of cancer in 1.44 million adults.” JAMA Internal Medicine 176.6 (2016): 816-825.
What is the difference between these two pictures? Which presents a better way to represent these data?

What is wrong with this picture? How would you correct it?


What is wrong with this picture? How would you correct it?


What is wrong with this picture? How would you correct it?


Do you recognize this map? What does it show?





On December 19, 2014, the front page of Spanish national newspaper El País read “Catalan public opinion swings toward ‘no’ for independence, says survey”.
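A headline like this is easier to judge when the survey's uncertainty is reported alongside it. The sketch below uses the usual normal approximation for the margin of error of a proportion. The sample size and the two shares are hypothetical placeholders, not the figures from the El País survey, and the helper names are illustrative only.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p with n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


def interval(p, n, z=1.96):
    """Approximate confidence interval (low, high) for a sample proportion."""
    moe = margin_of_error(p, n, z)
    return (p - moe, p + moe)


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical survey: these are NOT the published El País numbers.
n = 1000
yes_share, no_share = 0.45, 0.46

yes_ci = interval(yes_share, n)
no_ci = interval(no_share, n)

print(f"yes: {yes_share:.1%} +/- {margin_of_error(yes_share, n):.1%}")
print(f"no:  {no_share:.1%} +/- {margin_of_error(no_share, n):.1%}")
print("intervals overlap:", intervals_overlap(yes_ci, no_ci))
```

Because both shares come from the same sample, the overlap check is only a rough heuristic, not a formal test. It still shows why a one-point gap in a survey of this size may not support a claim that opinion has swung.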



Calling Bullshit: The Art of Skepticism in a Data-Driven World
by Carl Bergstrom and Jevin West
