A nice and elegant book on data science

## Some Key Ideas

### The Epicycle

1. Question and question refining
2. EDA
3. Modeling
4. Interpretation
5. Communication

The example of asthma in the US is a clear, easy illustration of how these activities fit together.

### Types of Questions

Leek, J. T., & Peng, R. D. (2015). What is the question? Science, 347(6228), 1314–1315.

1. Descriptive
2. Exploratory
3. Inferential
4. Predictive
5. Causal
6. Mechanistic

### A Good Question

1. of interest to your audience
2. not already answered
3. plausible: the question should fit your knowledge framework; the relationship it asks about should be one that domain knowledge already suggests could exist.
4. answerable: the question should be answerable with current technology, datasets, or theory.
5. specific: quantify the measures, population, and sampling as much as possible.

### Bias

1. recall bias: respondents in the sample misremember or selectively remember past events, so their responses are systematically distorted

### When you are asked to do something

1. communicate with others to make sure that you can agree on a question to be answered
2. make sure the question is a good question
3. determine what type of question it is
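
The three steps above can be kept as a small checklist per request. This is only a minimal sketch: `QuestionReview`, `open_items`, and the example question are hypothetical names made up for this note, not something from the book.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class QuestionType(Enum):
    """The six question types listed in 'Types of Questions'."""
    DESCRIPTIVE = "descriptive"
    EXPLORATORY = "exploratory"
    INFERENTIAL = "inferential"
    PREDICTIVE = "predictive"
    CAUSAL = "causal"
    MECHANISTIC = "mechanistic"


@dataclass
class QuestionReview:
    """Hypothetical record of the three steps for one incoming request."""
    text: str
    agreed: bool = False                                      # step 1
    criteria: Dict[str, bool] = field(default_factory=dict)   # step 2
    question_type: Optional[QuestionType] = None              # step 3

    def open_items(self) -> List[str]:
        """Return what still has to be done before starting the analysis."""
        items: List[str] = []
        if not self.agreed:
            items.append("agree on the question")
        items += [
            f"criterion not met: {name}"
            for name, ok in self.criteria.items()
            if not ok
        ]
        if self.question_type is None:
            items.append("determine the question type")
        return items


# Hypothetical request: agreed on, but not yet specific and not yet typed.
review = QuestionReview(
    text="Does feature X increase weekly retention?",
    agreed=True,
    criteria={
        "of interest": True,
        "plausible": True,
        "answerable": True,
        "specific": False,
    },
)
remaining = review.open_items()
```

Here `remaining` lists the failed criterion and the missing question type, which is a reminder to refine the question before touching the data.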

## MISC

Some random thoughts.

### We need a knowledge database for the company

Going through the data analysis process, I found that it is often important to connect the work to existing knowledge. For example, checking against what is already known is the key step in making sure the question has not already been answered.

For academic research, this is usually done by searching the literature. When the objective or question concerns internal data or an internal product, there is generally no public database to search.

So we need an internal database of data analysis questions and objectives. As the business develops, we accumulate a lot of analyses and questions. If some questions are related to others, it is generally a good idea to link them.
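
One way to picture such a database is a small in-memory store of questions with links between related ones. This is a rough sketch under my own assumptions: `QuestionRecord`, `QuestionBase`, the keyword search, and the example questions are all hypothetical, and a real system would need persistence and a better search.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class QuestionRecord:
    """One analysis question and the ids of the questions it is linked to."""
    question_id: str
    text: str
    answered: bool = False
    related: Set[str] = field(default_factory=set)


class QuestionBase:
    """Hypothetical in-memory store of past analysis questions."""

    def __init__(self) -> None:
        self._records: Dict[str, QuestionRecord] = {}

    def add(self, record: QuestionRecord) -> None:
        self._records[record.question_id] = record

    def link(self, a: str, b: str) -> None:
        """Connect two related questions in both directions."""
        self._records[a].related.add(b)
        self._records[b].related.add(a)

    def search(self, keyword: str) -> List[QuestionRecord]:
        """Naive case-insensitive keyword match on the question text."""
        kw = keyword.lower()
        return [r for r in self._records.values() if kw in r.text.lower()]

    def related_to(self, question_id: str) -> List[QuestionRecord]:
        ids = sorted(self._records[question_id].related)
        return [self._records[i] for i in ids]


# Made-up example questions.
kb = QuestionBase()
kb.add(QuestionRecord("q1", "What drives churn among trial users?", answered=True))
kb.add(QuestionRecord("q2", "Does onboarding email timing affect churn?"))
kb.link("q1", "q2")

hits = kb.search("churn")
already_answered = [r.question_id for r in hits if r.answered]
```

Before starting a new analysis about churn, `already_answered` shows which matching questions have been answered, and `kb.related_to("q1")` follows the links to neighbouring questions.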

L Ma (2019). 'The Art of Data Science', Datumorphism, 04 April. Available at: https://datumorphism.leima.is/reading/art-of-data-science/.