8.1. Introduction
Probability is the study of random events.
Statistics is the discipline of using data samples to support claims about populations.
Computation is a tool that is well suited to quantitative analysis.
8.1.1. Anecdotal evidence
Evidence based on data that is unpublished and usually personal, such as opinion.
Small number of observations
Selection bias
People who join a discussion might be more inclined towards a specific conclusion.
Confirmation bias
People who believe the claim might be more likely to contribute examples that confirm it. People who doubt the claim are more likely to cite counterexamples.
Inaccuracy
Personal stories are often misremembered, misrepresented, or repeated inaccurately.
8.1.2. Statistical approach
Data collection
Descriptive statistics
Exploratory data analysis
Hypothesis testing
Estimation
Population
A group we are interested in studying, such as a group of people, animals, or minerals.
Cross sectional study
A study that collects data about a population at a particular point in time.
Longitudinal study
A study that follows a population over time, collecting data from the same group repeatedly.
Respondent
A person who responds to a survey.
Cohort
A group of respondents.
Sample
The subset of a population used to collect data.
Representative
A sample is representative if every member of the population has the same chance of being in the sample.
Oversampling
The technique of increasing the representation of a sub-population in order to avoid errors due to small sample sizes.
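The sampling terms above can be illustrated with a minimal Python sketch. The population, the group labels, and the sample sizes are hypothetical and chosen only for illustration. The weighting step is one common way to correct for oversampling and is an assumption, not something prescribed by the definitions.

```python
import random

# Hypothetical population of 1,000 members; the first 50 form a small sub-population.
population = [
    {"id": i, "group": "minority" if i < 50 else "majority"}
    for i in range(1000)
]

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Simple random sample: every member has the same chance of being selected.
simple_sample = rng.sample(population, 100)

# Oversampling: sample each sub-population separately so that the small
# group contributes more observations than its share of the population.
minority = [p for p in population if p["group"] == "minority"]
majority = [p for p in population if p["group"] == "majority"]
oversample = rng.sample(minority, 30) + rng.sample(majority, 70)

# Each sampled member stands for (group size / number sampled) members of
# the population; these weights undo the oversampling when estimating
# population-level quantities.
weights = {
    "minority": len(minority) / 30,
    "majority": len(majority) / 70,
}
```

With only 50 minority members in 1,000, a simple random sample of 100 would contain about 5 of them on average, which is why the oversample draws 30 directly.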
Record
A collection of information about a single person or other object of study.
Field
One of the named variables that makes up a record.
Table
A collection of records.
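As a rough illustration of how records, fields, and tables relate, the following Python sketch models a record as a dataclass. The field names and values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Record:
    # Each attribute is a field: one named variable of the record.
    respondent_id: int
    age: int
    height_cm: float


# A table is a collection of records (hypothetical values).
table = [
    Record(respondent_id=1, age=34, height_cm=170.2),
    Record(respondent_id=2, age=27, height_cm=158.0),
]

# The field names are shared by every record in the table.
field_names = [f.name for f in fields(Record)]
print(field_names)  # ['respondent_id', 'age', 'height_cm']
```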
Raw data
Values collected and recorded with little or no checking, calculation or interpretation.
Recode
A value that is generated by calculation and other logic applied to raw data.
Summary statistic
The result of a computation that reduces a dataset to a single number (or a small set of numbers) that captures some characteristic of the data.
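The difference between raw data, a recode, and a summary statistic can be sketched in Python as follows. The values are hypothetical, and the mean and median are only two of many possible summary statistics.

```python
import statistics

# Hypothetical raw data: weights recorded as (pounds, ounces) pairs.
raw = [(7, 4), (6, 12), (8, 1), (7, 0)]

# Recode: a value computed from the raw fields (total weight in pounds).
total_lb = [lb + oz / 16 for lb, oz in raw]

# Summary statistics: each reduces the dataset to a single number.
mean_lb = statistics.mean(total_lb)
median_lb = statistics.median(total_lb)
```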
Apparent effect
A measurement or summary statistic that suggests that something interesting is happening.
Statistically significant
An apparent effect is statistically significant if it is unlikely to have occurred by chance.
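One way to estimate how likely an apparent effect is to arise by chance is a permutation test. The Python sketch below is a minimal version under stated assumptions: the function name `permutation_p_value`, the difference in means as the test statistic, and the sample values are all hypothetical choices for illustration, not a method given in this text.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, iterations=1000, seed=0):
    """Estimate how often shuffled group labels produce a difference in
    means at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(iterations):
        # Shuffling breaks any real association between label and value.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    return count / iterations


# Hypothetical measurements for two groups.
group_a = [38.5, 39.1, 40.2, 39.8, 38.9, 40.0]
group_b = [38.4, 39.0, 39.5, 38.7, 39.2, 38.8]
p = permutation_p_value(group_a, group_b)
```

A small value of `p` suggests that the observed difference would rarely appear by chance alone. It does not rule out the possibility that the effect is an artifact of bias or measurement error.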
Artifact
An apparent effect that is caused by bias, measurement error, or some other kind of error.
Change log
- Last Modified
$Id: intro.rst 249 2012-08-05 06:17:57Z shailesh $