Review of Exploratory Data Analysis best practices

Summary of best notebooks and blogs

Data Science
Author

Jaume Amores

Published

May 31, 2024

Note: this post is just a draft in progress. As of now, it consists of a collection of random notes.

Introduction

This post aims to distill lessons, best practices, and guidelines from the available blogs and notebooks on EDA. It is a work in progress and, for now, contains only loose notes and pointers.

Beyond EDA

EDA can be regarded as providing visual explanations based on statistical analysis of data. These two axes, visual explanation and statistical analysis, are in some sense orthogonal to each other. In this section, we explore each of them individually.
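To make the two axes concrete, here is a minimal sketch that treats them separately on the same data: a numeric summary for the statistical axis and a text histogram standing in for a plot on the visual axis. The function names `summarize` and `text_histogram` and the sample ages are hypothetical, made up for illustration. Only the Python standard library is used, so a real analysis would likely swap in a plotting library.

```python
import statistics
from collections import Counter


def summarize(values):
    """Statistical axis: five-number summary plus the mean."""
    # quantiles(n=4) returns the three quartile cut points (default method).
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.fmean(values),
    }


def text_histogram(values, bin_width):
    """Visual axis: one row of '#' marks per non-empty bin (integer data)."""
    # Map each value to the lower edge of its bin, then count per bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in sorted(counts):
        label = f"{start:>4}-{start + bin_width - 1:<4}"
        lines.append(f"{label} {'#' * counts[start]}")
    return "\n".join(lines)


# Hypothetical sample: ages of ten passengers (invented for this sketch).
ages = [22, 38, 26, 35, 35, 54, 2, 27, 14, 58]

summary = summarize(ages)
print(summary)
print(text_histogram(ages, bin_width=10))
```

The summary compresses the data into a handful of numbers, while the histogram shows the shape those numbers cannot convey, such as the empty bin between the 30s and the 50s. Neither view replaces the other, which is one way to read the orthogonality claim above.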

Bibliography

Papers

https://theta.edu.pl/wp-content/uploads/2012/10/exploratorydataanalysis_tukey.pdf

https://www.researchgate.net/profile/Dr-Subhendu-Pani/publication/337146539_IJITEE/links/5dc70b124585151435fb427e/IJITEE.pdf

https://books.google.ie/books?hl=en&lr=&id=LbLrbQp6FoMC&oi=fnd&pg=PA33&dq=review+of+exploratory+data+analysis&ots=-KBhcbcCwk&sig=4lyu8HHwwKMduML5aTdzHGMkXCk&redir_esc=y#v=onepage&q=review%20of%20exploratory%20data%20analysis&f=false

https://www.stat.berkeley.edu/~brill/Stat153/EDASage.pdf

https://vyomaonline.com/studymaterial/uploads/pdf/2020/12/06_5abebacc31a558ecd6ba2d836f1db10f.pdf#page=60

https://air.unimi.it/bitstream/2434/815200/2/csur21main.pdf

https://arxiv.org/pdf/1911.00568

https://web.archive.org/web/20191128121756id_/https://onlinelibrary.wiley.com/doi/pdfdirect/10.1002/9781118422007.fmatter

https://drive.google.com/file/d/15Z83ztCYTqHSpDZdxzm_CTy-FzZXlicP/view

https://friendly.github.io/6135/papers/Buja-etal-2009.pdf

https://www.redalyc.org/pdf/2990/299023509014.pdf

Data Visualization in Exploratory Data Analysis: An Overview of Methods and Technologies

Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference

Exploratory data analysis (EDA) machine learning approaches for ocean world analog mass spectrometry

https://www.academia.edu/download/100103374/00401706.2019.167953520230320-1-e0odmm.pdf

https://www.sciencedirect.com/science/article/pii/S2772662223000528

Exploratory Data Analysis for Building Energy Meters Using Machine Learning https://journal.ittelkom-pwt.ac.id/index.php/jtece/article/download/934/348

Prediction of Automobiles Prices Using Exploratory Data Analysis Based on Improved Machine Learning Techniques

https://www.jatit.org/volumes/Vol101No11/7Vol101No11.pdf

https://library.acadlore.com/ATAIML/2022/1/1/ATAIML_01.01_03.pdf

Notebooks

https://www.kaggle.com/competitions/titanic/discussion/176690

https://www.kaggle.com/code/neomatrix369/chaieda-sessions-titanic

https://www.kaggle.com/code/allohvk/titanic-advanced-eda?scriptVersionId=77739368

https://www.kaggle.com/discussions/general/431668#2390311

https://cran.r-project.org/web/packages/dlookr/vignettes/EDA.html

https://r4ds.had.co.nz/exploratory-data-analysis.html

https://r4ds.hadley.nz/

GitHub

https://github.com/xiaodaigh/awesome-eda

https://github.com/qinwf/awesome-R