Variable (Statistics)

From Rice Wiki

Latest revision as of 06:25, 19 March 2024

In statistics, a variable is a characteristic of a subject that can take different values from one subject to another.

Overview and Related Definitions

At the top level of statistics, we investigate a population: a set of units that we are interested in studying.

Populations are almost always impossible to study in full because of their size and other practical constraints. Therefore, we take a sample: a subset of the population.

A subject is a unit that we study in a population or a sample, and a variable is a particular characteristic of the subject that we are interested in studying.
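
As a minimal sketch of these definitions, assuming a made-up population of five students (every name and value below is hypothetical), the following Python snippet draws a sample and reads one variable off each subject:

```python
import random

# Hypothetical population: each dict is one subject (unit).
population = [
    {"id": 101, "height_cm": 162.0},
    {"id": 102, "height_cm": 175.5},
    {"id": 103, "height_cm": 168.2},
    {"id": 104, "height_cm": 181.0},
    {"id": 105, "height_cm": 158.7},
]

rng = random.Random(0)  # fixed seed so the draw is repeatable
sample = rng.sample(population, k=3)  # a sample is a subset of the population

# The variable of interest is one characteristic of each subject.
heights = [subject["height_cm"] for subject in sample]
print(heights)
```

Here the population is small enough to list in full; in practice only the sample would be available.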

Types of Variables

There are two types of variables, each with two sub-categories that have useful properties:

  • Quantitative/Numerical variables are measured with numbers. Their sum has meaning.
    • Continuous numerical variables can theoretically take on any number within an interval (e.g. height), whereas
    • Discrete numerical variables have natural gaps between their possible values (e.g. a count of siblings)
  • Qualitative/Categorical variables are measured as labels. Their sum does not have meaning.
    • Ordinal categorical variables have a natural ordering (e.g. letter grades), whereas
    • Nominal categorical variables do not (e.g. eye color)

It is possible for a categorical variable to be denoted with numbers; a common example is an ID number. The key difference between a categorical variable denoted with numbers and a numerical variable is that the sum or mean of the former has no meaning, whereas that of a numerical variable does.
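
To illustrate the point, here is a small sketch with hypothetical data for three subjects. Both means can be computed, but only one of them describes the subjects:

```python
from statistics import mean

# Hypothetical data for three subjects.
student_ids = [1001, 1002, 1003]    # categorical (nominal), written as numbers
heights_cm = [160.0, 170.0, 180.0]  # numerical (continuous)

mean_height = mean(heights_cm)  # meaningful: the average height in cm
mean_id = mean(student_ids)     # computable, but says nothing about the students

print(mean_height)
print(mean_id)
```

The software does not stop us from averaging ID numbers; deciding whether a number is a measurement or a label is up to the analyst.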

Notation

A capitalized character (usually X, Y, Z) is used to denote all possible values of a variable. This is called a major. When we say "variable", we usually mean this.

A lower case character corresponding to the major is used to denote a specific value of that major (such as x, y, z). This is called a statistic.

Random Variable

A random variable is a numerical variable whose outcome is the result of a random process (i.e. we don't know what will happen for certain). See Random Variable.
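
As a rough illustration, assuming the random process is the roll of a fair six-sided die (an example chosen here, not taken from the article), each call below produces one specific value of the random variable:

```python
import random

rng = random.Random(42)  # seeded generator so the run is repeatable


def roll_die():
    """Return one outcome of the random process: an integer from 1 to 6."""
    return rng.randint(1, 6)


# Ten specific values (lower case x) of the random variable (capital X).
rolls = [roll_die() for _ in range(10)]
print(rolls)
```

Before a roll we know only the set of possible values, not which one will occur.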