T-Test Formula


What is the t-Test Formula?

In statistics, the term “t-test” refers to the hypothesis test in which the test statistic follows a Student’s t-distribution. It is used to check whether two data sets are significantly different from each other or not.

One of the variants of the t-test is the one-sample t-test, which is used to determine whether a sample is significantly different from the population. The formula for a one-sample t-test is expressed using the observed sample mean, the theoretical population mean, the sample standard deviation, and the sample size. Mathematically, it is represented as,

t = ( x̄ – μ) / (s / √n)

where

  • x̄ = Observed Mean of the Sample
  • μ = Theoretical Mean of the Population
  • s = Standard Deviation of the Sample
  • n = Sample Size

In case the statistics of two samples are to be compared, then a two-sample t-test must be used. Its formula is expressed using respective sample means, sample standard deviations, and sample sizes. Mathematically, it is represented as:

t = (x̄₁ – x̄₂) / √[(s₁² / n₁) + (s₂² / n₂)]

Where,

  • x̄₁ = Observed Mean of 1st Sample
  • x̄₂ = Observed Mean of 2nd Sample
  • s₁ = Standard Deviation of 1st Sample
  • s₂ = Standard Deviation of 2nd Sample
  • n₁ = Size of 1st Sample
  • n₂ = Size of 2nd Sample

Examples of t-Test Formula

Let’s take an example to understand the calculation of the t-Test Formula in a better manner.

t-Test Formula – Example #1

Let us take the example of a classroom of students that appeared for a test recently. A sample of 10 students was chosen from a total of 150 students. Calculate the sample’s t-test score if the mean score of the entire class is 78 and the mean score of the sample is 74 with a standard deviation of 3.5. Also, comment on whether the sample statistics are significantly different from the population at a 99.5% confidence level.

Solution:

t-Test value is calculated using the formula given below:

t = ( x̄ – μ) / (s / √n)

  • t = (74 – 78) / (3.5 / √10)
  • t = -3.61

Therefore, the sample’s absolute t-test value is 3.61, which is less than the critical value (3.69) at a 99.5% confidence level with 9 degrees of freedom. So, the null hypothesis that the sample mean equals the population mean cannot be rejected: at this confidence level, the sample is not significantly different from the population.
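The arithmetic in Example #1 can be cross-checked with a short script. This is only a minimal sketch: the function name `one_sample_t` is a hypothetical helper chosen for illustration, and the critical value 3.69 is simply the figure quoted in the example, not something the script computes.

```python
import math

def one_sample_t(sample_mean, population_mean, sample_sd, n):
    """Return the one-sample t statistic t = (x̄ - μ) / (s / √n)."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error

# Values from Example #1
t = one_sample_t(sample_mean=74, population_mean=78, sample_sd=3.5, n=10)
critical_value = 3.69  # quoted in the example for 9 degrees of freedom

print(round(t, 2))              # -3.61
print(abs(t) > critical_value)  # False
```

The comparison prints False because the absolute t-value does not exceed the critical value, so the null hypothesis of no difference is not rejected.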

t-Test Formula – Example #2

Let us take the example of two samples to illustrate the concept of a two-sample t-test. The two samples have means of 10 and 12, standard deviations of 1.2 and 1.4, and sample sizes of 17 and 15. Determine if the samples’ statistics are different at a 99.5% confidence level.

Solution:

t-Test value is calculated using the formula given below

t = (x̄₁ – x̄₂) / √[(s₁² / n₁) + (s₂² / n₂)]

  • t = (10 – 12) / √[(1.2² / 17) + (1.4² / 15)]
  • t = -4.31

Therefore, the absolute t-test value is 4.31, which is greater than the critical value (3.03) at a 99.5% confidence level with 30 degrees of freedom. So, the null hypothesis that the two sample means are equal is rejected, and the statistics of the two samples are significantly different.
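The same check can be done for Example #2 from the summary statistics alone. Again this is a sketch under stated assumptions: `two_sample_t` is a hypothetical helper name, and it uses the unpooled standard error from the two-sample formula given earlier.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return t = (x̄1 - x̄2) / sqrt(s1²/n1 + s2²/n2) from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Values from Example #2
t = two_sample_t(mean1=10, sd1=1.2, n1=17, mean2=12, sd2=1.4, n2=15)
critical_value = 3.03  # quoted in the example for 30 degrees of freedom

print(round(t, 2))              # -4.31
print(abs(t) > critical_value)  # True
```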

Explanation

The formula for a one-sample t-test can be derived by using the following steps:

Step 1: Determine the observed sample mean and the specified theoretical population mean. The sample mean and population mean are denoted by x̄ and μ, respectively.

Step 2: Next, determine the standard deviation of the sample, and it is denoted by s.

Step 3: Next, determine the sample size, which is the number of data points in the sample. It is denoted by n.

Step 4: Finally, the formula for a one-sample t-test can be derived using the observed sample mean (step 1), the theoretical population mean (step 1), sample standard deviation (step 2), and sample size (step 3), as shown below.

t = ( x̄ – μ) / (s / √n)

The formula for the two-sample t-test can be derived using the following steps:

Step 1: Determine the observed sample mean of the two samples under consideration. The sample means are denoted by x̄1 and x̄2.

Step 2: Next, determine the standard deviations of the two samples, which are denoted by s₁ and s₂.

Step 3: Next, determine the sizes of the two samples, which are denoted by n₁ and n₂.

Step 4: Finally, the formula for a two-sample t-test can be derived using observed sample means (step 1), sample standard deviations (step 2), and sample sizes (step 3) as shown below.

t = (x̄₁ – x̄₂) / √[(s₁² / n₁) + (s₂² / n₂)]

Relevance and Use of t-Test Formula

It is important for a statistician to understand the t-test, as it is central to deciding whether two data sets have statistics that differ significantly. The test checks the validity of a null hypothesis by comparing the t-value with the critical value at a given confidence level and degrees of freedom. However, please note that Student’s t-test is typically applied to data sets with a small sample size, commonly fewer than 30.

What Is a T-Test?

A t-test is an inferential statistic used to determine if there is a significant difference between the means of two groups and how they are related. T-tests are used when the data sets follow a normal distribution and have unknown variances, like the data set recorded from flipping a coin 100 times.

The t-test is a test used for hypothesis testing in statistics and uses the t-statistic, the t-distribution values, and the degrees of freedom to determine statistical significance.

Understanding the T-Test

A t-test compares the average values of two data sets and determines if they came from the same population. For example, a sample of students from class A and a sample of students from class B would not likely have the same mean and standard deviation. Similarly, samples taken from the placebo-fed control group and those taken from the drug prescribed group should have a slightly different mean and standard deviation.

Mathematically, the t-test takes a sample from each of the two sets and establishes the problem statement. It assumes a null hypothesis that the two means are equal.

Using the formulas, values are calculated and compared against the standard values. The assumed null hypothesis is then either rejected or not rejected. If the null hypothesis qualifies to be rejected, it indicates that data readings are strong and are probably not due to chance.

The t-test is just one of many tests used for this purpose. Statisticians use additional tests other than the t-test to examine more variables and larger sample sizes. For a large sample size, statisticians use a z-test. Other testing options include the chi-square test and the f-test.

Using a T-Test

Consider that a drug manufacturer tests a new medicine. Following standard procedure, the drug is given to one group of patients and a placebo to another group called the control group. The placebo is a substance with no therapeutic value and serves as a benchmark to measure how the other group, administered the actual drug, responds.

After the drug trial, the members of the placebo-fed control group reported an increase in average life expectancy of three years, while the members of the group who are prescribed the new drug reported an increase in average life expectancy of four years.

Initial observation indicates that the drug is working. However, it is also possible that the observation may be due to chance. A t-test can be used to determine if the results are correct and applicable to the entire population.

Four assumptions are made while using a t-test:

  • The data collected follow a continuous or ordinal scale, such as the scores for an IQ test.
  • The data are collected from a randomly selected portion of the total population.
  • The data, when plotted, result in a normal, bell-shaped distribution.
  • The variances are equal or homogeneous, which is the case when the standard deviations of the samples are approximately equal.

T-Test Formula

Calculating a t-test requires three fundamental data values. They include the difference between the mean values from each data set, or the mean difference, the standard deviation of each group, and the number of data values of each group.

This comparison helps to determine the effect of chance on the difference, and whether the difference is outside that chance range. The t-test questions whether the difference between the groups represents a true difference in the study or merely a random difference.

The t-test produces two values as its output: t-value and degrees of freedom. The t-value, or t-score, is a ratio of the difference between the mean of the two sample sets and the variation that exists within the sample sets.

The numerator value is the difference between the mean of the two sample sets. The denominator is the variation that exists within the sample sets and is a measurement of the dispersion or variability.

This calculated t-value is then compared against a value obtained from a critical value table called the T-distribution table. Higher values of the t-score indicate that a large difference exists between the two sample sets. The smaller the t-value, the more similarity exists between the two sample sets.

T-Score

A large t-score, or t-value, indicates that the groups are different while a small t-score indicates that the groups are similar.

Degrees of freedom refer to the values in a study that have the freedom to vary and are essential for assessing the importance and the validity of the null hypothesis. Computation of these values usually depends upon the number of data records available in the sample set.

Paired Sample T-Test

The correlated t-test, or paired t-test, is a dependent type of test and is performed when the samples consist of matched pairs of similar units, or when there are cases of repeated measures. For example, there may be instances where the same patients are repeatedly tested before and after receiving a particular treatment. Each patient is being used as a control sample against themselves.

This method also applies to cases where the samples are related or have matching characteristics, like a comparative analysis involving children, parents, or siblings.

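The formula for computing the t-value and degrees of freedom for a paired t-test is:

t = (x̄₁ – x̄₂) / (s(diff) / √n)

Degrees of freedom = n – 1

where x̄₁ and x̄₂ are the average values of the two sample sets, s(diff) is the standard deviation of the differences between the paired data values, and n is the number of paired differences.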

Equal Variance or Pooled T-Test

The equal variance t-test is an independent t-test and is used when the number of samples in each group is the same, or the variance of the two data sets is similar.

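The formula used for calculating t-value and degrees of freedom for equal variance t-test is:

t = (x̄₁ – x̄₂) / √[sₚ² (1/n₁ + 1/n₂)]

sₚ² = [(n₁ – 1)s₁² + (n₂ – 1)s₂²] / (n₁ + n₂ – 2)

Degrees of freedom = n₁ + n₂ – 2

where sₚ² is the pooled variance of the two sample sets.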
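As an illustration, the equal variance (pooled) calculation can be sketched from summary statistics. The helper name `pooled_t_test` is hypothetical, and the inputs are the means, standard deviations, and sample sizes from Example #2, reused here only as sample numbers; that example itself used the unpooled formula.

```python
import math

def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the equal variance (pooled) two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

t, df = pooled_t_test(10, 1.2, 17, 12, 1.4, 15)
print(round(t, 2), df)  # -4.35 30
```

With these inputs the pooled t-value is close to, but not identical with, the unpooled value of -4.31, because the two sample variances and sizes differ slightly.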

Unequal Variance T-Test

The unequal variance t-test is an independent t-test and is used when the number of samples in each group is different, and the variance of the two data sets is also different. This test is also called Welch’s t-test.

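The formula used for calculating t-value and degrees of freedom for an unequal variance t-test is:

t = (x̄₁ – x̄₂) / √[(s₁² / n₁) + (s₂² / n₂)]

Degrees of freedom = [(s₁² / n₁) + (s₂² / n₂)]² / [(s₁² / n₁)² / (n₁ – 1) + (s₂² / n₂)² / (n₂ – 1)]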
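A minimal sketch of the unequal variance (Welch) calculation follows, using the Welch-Satterthwaite approximation for the degrees of freedom. The function name `welch_t_test` is hypothetical, and the inputs are the summary statistics from Example #2, used purely as sample numbers.

```python
import math

def welch_t_test(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for Welch's unequal variance t-test from summary statistics."""
    a = var1 / n1  # squared standard error contributed by sample 1
    b = var2 / n2  # squared standard error contributed by sample 2
    t = (mean1 - mean2) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation; usually not a whole number
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

t, df = welch_t_test(10, 1.2 ** 2, 17, 12, 1.4 ** 2, 15)
df_for_table = math.floor(df)  # round down before looking up a t-table
print(round(t, 2), df_for_table)
```

Because the approximation rarely yields an integer, the value is rounded down before a table lookup.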

Which T-Test to Use?

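The following considerations can be used to determine which t-test to use based on the characteristics of the sample sets. The key items to consider include the similarity of the sample records, the number of data records in each sample set, and the variance of each sample set. If the sample sets consist of matched pairs or repeated measures, use the paired t-test. Otherwise, use the equal variance t-test when the sets have the same number of records or similar variances, and the unequal variance t-test when they do not.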

Example of an Unequal Variance T-Test

Assume that the diagonal measurement of paintings received in an art gallery is taken. One group of samples includes 10 paintings, while the other includes 20 paintings. The data sets, with the corresponding mean and variance values, are as follows:

Though the mean of Set 2 is higher than that of Set 1, we cannot conclude that the population corresponding to Set 2 has a higher mean than the population corresponding to Set 1.

Is the difference from 19.4 to 21.6 due to chance alone, or do differences exist in the overall populations of all the paintings received in the art gallery? We establish the problem by assuming the null hypothesis that the mean is the same between the two sample sets and conduct a t-test to test if the hypothesis is plausible.

Since the number of data records is different (n1 = 10 and n2 = 20) and the variance is also different, the t-value and degrees of freedom are computed for the above data set using the formula mentioned in the Unequal Variance T-Test section.

The t-value is -2.24787. Since the minus sign can be ignored when comparing the two t-values, the computed value is 2.24787.

The degrees of freedom value is 24.38 and is reduced to 24, owing to the formula definition requiring rounding down of the value to the least possible integer value.

One can specify a level of probability (alpha level, level of significance, p) as a criterion for acceptance. In most cases, a 5% value can be assumed.

Using the degree of freedom value as 24 and a 5% level of significance, a look at the t-value distribution table gives a value of 2.064. Comparing this value against the computed value of 2.247 indicates that the calculated t-value is greater than the table value at a significance level of 5%. Therefore, it is safe to reject the null hypothesis that there is no difference between means. The population set has intrinsic differences, and they are not by chance.

How Is the T-Distribution Table Used?

The T-Distribution Table is available in one-tailed and two-tailed formats. The former is used for assessing cases that have a fixed value or range with a clear direction, either positive or negative. For instance, what is the probability of the output value remaining below -3, or getting more than seven when rolling a pair of dice? The latter is used for range-bound analysis, such as asking if the coordinates fall between -2 and +2.

What Is an Independent T-Test?

The samples of independent t-tests are selected independently of each other, so the data sets in the two groups don’t refer to the same values. They may include a group of 100 randomly selected, unrelated patients split into two groups of 50 patients each. One of the groups becomes the control group and is administered a placebo, while the other group receives a prescribed treatment. This constitutes two independent sample groups that are unpaired and unrelated to each other.

What Does a T-Test Explain and How Are They Used?

A t-test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether a process or treatment has an effect on the population of interest, or whether two groups are different from one another.

T-Test Formula

The t-test is any statistical hypothesis test in which the test statistic follows a Student’s t-distribution under the null hypothesis. It can be used to determine if two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known.

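The t-test uses the means and standard deviations of two samples to make a comparison. The formula for the t-test is given below:

t = (x̄₁ – x̄₂) / √[(s₁² / n₁) + (s₂² / n₂)]

where x̄₁ and x̄₂ are the sample means, s₁ and s₂ are the sample standard deviations, and n₁ and n₂ are the sample sizes.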

T-Test Solved Examples

Question 1: Find the t-test value for the following two sets of values: 7, 2, 9, 8 and 1, 2, 3, 4?

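Solution:

Number of terms in first set: n₁ = 4

Mean for first set of data: x̄₁ = (7 + 2 + 9 + 8) / 4 = 6.5

Squared deviations from the mean for the first set: 0.25, 20.25, 6.25, and 2.25, which sum to 29

Standard deviation for the first set of data: s₁ = √(29 / 3) ≈ 3.11

Number of terms in second set: n₂ = 4

Mean for second set of data: x̄₂ = (1 + 2 + 3 + 4) / 4 = 2.5

Squared deviations from the mean for the second set: 2.25, 0.25, 0.25, and 2.25, which sum to 5

Standard deviation for the second set of data: s₂ = √(5 / 3) ≈ 1.29

t = (6.5 – 2.5) / √[(9.67 / 4) + (1.67 / 4)] ≈ 2.38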
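The calculation for Question 1 can be reproduced from the raw values with Python's standard `statistics` module. This is a sketch that assumes the unpooled two-sample formula given at the start of the article; because both sets have four values, the pooled formula would give the same t-value here.

```python
import math
import statistics

set1 = [7, 2, 9, 8]
set2 = [1, 2, 3, 4]

mean1, mean2 = statistics.mean(set1), statistics.mean(set2)
# statistics.stdev is the sample standard deviation (divides by n - 1)
sd1, sd2 = statistics.stdev(set1), statistics.stdev(set2)

standard_error = math.sqrt(sd1 ** 2 / len(set1) + sd2 ** 2 / len(set2))
t = (mean1 - mean2) / standard_error

print(mean1, mean2)                # 6.5 2.5
print(round(sd1, 2), round(sd2, 2))  # 3.11 1.29
print(round(t, 2))                 # 2.38
```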

One-Sample T-Test Formula

For comparing the mean x̄ of a population, estimated from n samples, with a specified theoretical mean μ, we use a one-sample t-test.

Independent Sample T-Test

Student’s t-test is used to compare the means of two groups of samples. It helps evaluate whether the means of the two sets of data are statistically significantly different from each other.

Paired Samples T-Test

When two sets of measurements are highly correlated, for example pre-test and post-test results from the same people, we use the paired samples t-test.

Examples Using t-test Formula

Example 1: Calculate a t-test for the following data of the number of times people prefer coffee or tea in five time intervals.

Coffee Tea
4 3
5 8
7 6
6 4
9 7

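Solution: Let x₁ be the sample of data that prefers coffee and x₂ be the sample of data that prefers tea.

Let us find the mean, variance, and SD of each sample.

Coffee: mean x̄₁ = 31/5 = 6.2, variance s₁² = 14.8/4 = 3.7, SD s₁ ≈ 1.92

Tea: mean x̄₂ = 28/5 = 5.6, variance s₂² = 17.2/4 = 4.3, SD s₂ ≈ 2.07

Applying the known values in the t-test formula, we get

t = (6.2 – 5.6) / √[(3.7 / 5) + (4.3 / 5)] = 0.6 / √1.6 ≈ 0.47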

Example 2: A company wants to improve its sales. The previous sales data indicated that the average sale of 25 salesmen was $50 per transaction. After training, the recent data showed an average sale of $80 per transaction. If the standard deviation is $15, find the t-score. Has the training provided improved the sales?

Solution:

Null hypothesis: the population mean equals the claimed value ⇒ μ = μ0

Alternative hypothesis: the population mean is not equal to the claimed value ⇒ μ ≠ μ0

Mean sale x̄ = 80, μ = 50, s = 15 and n = 25

Substituting the values, we get t = (80 – 50)/(15/√25)

t = (30 × 5)/15 = 10

Looking at the t-table with n – 1 = 24 degrees of freedom, we find 10 > 1.711 (i.e., the CV for α = 0.05). ∴ The null hypothesis is rejected. Thus we conclude that the training boosted the sales.

Example 3: A pre-test and a post-test were conducted during a survey to find Patrick’s study hours on weekends. Calculate the t-score and determine (for α = 0.25) whether the pre-test and post-test results are significantly different.

Pre-test(X) Post-test(Y) X-Y (X-Y)²
1 2 -1 1
2 4 -2 4

Solution:

Σ(X-Y) = -3, so the mean difference is -3/2 = -1.5

Σ(X-Y)² = 5

s = √{[Σ(X-Y)² – (Σ(X-Y))²/n] / (n-1)} = √[(5 – 9/2) / 1] = √0.5 ≈ 0.71

t = -1.5 / (0.71/√2) = -1.5 / 0.5 = -3, so |t| = 3

Here the degrees of freedom are n-1 = 2-1 = 1, and the corresponding critical value in the t-table for α = 0.25 is 1.

|t| > CV.

Therefore, the scores are significantly different at α = 0.25.
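As a cross-check of Example 3, the paired t-score can be computed with the standard paired formula, the mean of the differences divided by their standard error. This is a minimal sketch using only the two pairs from the table; the critical value of 1 is the figure quoted in the example.

```python
import math
import statistics

pre = [1, 2]
post = [2, 4]

diffs = [x - y for x, y in zip(pre, post)]  # X - Y for each pair
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)           # sample SD of the differences (n - 1)

t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
df = len(diffs) - 1
critical_value = 1  # quoted in the example for alpha = 0.25, df = 1

print(round(t, 2), df)          # -3.0 1
print(abs(t) > critical_value)  # True
```

With these two pairs the absolute t-score is 3, which exceeds the quoted critical value of 1.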

FAQs on T-test Formula

How Do You Calculate The T-test?

The following steps are followed to calculate the t-test.

  • Get the data. Find the mean.
  • Subtract the mean score from each individual score
  • Square the differences.
  • Add up all the squared differences.
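  • Find the variance and standard deviation.
  • Substitute the mean, standard deviation, and sample size of each group into the relevant t-test formula.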
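The steps above can be sketched in code for a single data set. The data are the first set from Question 1 (7, 2, 9, 8), and the variance uses the sample (n - 1) denominator, which is an assumption consistent with the standard deviation of 3.11 reported for that set.

```python
import math

scores = [7, 2, 9, 8]

# Get the data and find the mean
mean = sum(scores) / len(scores)

# Subtract the mean from each individual score
deviations = [score - mean for score in scores]

# Square the differences
squared = [d ** 2 for d in deviations]

# Add up all the squared differences
sum_of_squares = sum(squared)

# Find the variance (sample form, n - 1) and the standard deviation
variance = sum_of_squares / (len(scores) - 1)
std_dev = math.sqrt(variance)

print(mean, sum_of_squares, round(std_dev, 2))  # 6.5 29.0 3.11
```

The resulting mean and standard deviation are the inputs that the t-test formula needs for each group.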

What is the Formula for Finding The Independent T-test?

Student’s t-test is used to compare the means of two groups of samples.

What is a One-Sample t-test?

The one-sample t-test is the statistical test used to determine whether an unknown population mean is different from a specific value. For example, comparing the mean height of the students with respect to the national average height of an adult.

What is a T-test Formula Used For?

We use the T-test Formula to statistically determine if there is a significant difference between the means of two groups that are related in certain aspects. Examples: a gym center tests the weight loss from a few samples, a company hiring candidates is set to determine the skills of 2 candidates from two different universities at the interview, and so on.

When to use a t test

A t test can only be used when comparing the means of two groups (a.k.a. pairwise comparison). If you want to compare more than two groups, or if you want to do multiple pairwise comparisons, use an ANOVA test or a post-hoc test.

The t test is a parametric test of difference, meaning that it makes the same assumptions about your data as other parametric tests. The t test assumes your data:

  1. are independent
  2. are (approximately) normally distributed
  3. have a similar amount of variance within each group being compared (a.k.a. homogeneity of variance)

If your data do not fit these assumptions, you can try a nonparametric alternative to the t test, such as the Wilcoxon Signed-Rank test for data with unequal variances.

What type of t test should I use?

When choosing a t test, you will need to consider two things: whether the groups being compared come from a single population or two different populations, and whether you want to test the difference in a specific direction.

One-sample, two-sample, or paired t test?

  • If the groups come from a single population (e.g., measuring before and after an experimental treatment), perform a paired t test. This is a within-subjects design.
  • If the groups come from two different populations (e.g., two different species, or people from two separate cities), perform a two-sample t test (a.k.a. independent t test). This is a between-subjects design.
  • If there is one group being compared against a standard value (e.g., comparing the acidity of a liquid to a neutral pH of 7), perform a one-sample t test.

One-tailed or two-tailed t test?

  • If you only care whether the two populations are different from one another, perform a two-tailed t test.
  • If you want to know whether one population mean is greater than or less than the other, perform a one-tailed t test.

t test example

In your test of whether petal length differs by species:

  • Your observations come from two separate populations (separate species), so you perform a two-sample t test.
  • You don’t care about the direction of the difference, only whether there is a difference, so you choose to use a two-tailed t test.

Performing a t test

The t test estimates the true difference between two group means using the ratio of the difference in group means over the pooled standard error of both groups. You can calculate it manually using a formula, or use statistical analysis software.

T test formula

The formula for the two-sample t test (a.k.a. the Student’s t-test) is shown below.

t = (x̄₁ – x̄₂) / √[s² (1/n₁ + 1/n₂)]

In this formula, t is the t value, x̄₁ and x̄₂ are the means of the two groups being compared, s² is the pooled standard error of the two groups, and n₁ and n₂ are the number of observations in each of the groups.

A larger t value shows that the difference between group means is greater than the pooled standard error, indicating a more significant difference between the groups.

You can compare your calculated t value against the values in a critical value chart (e.g., Student’s t table) to determine whether your t value is greater than what would be expected by chance. If so, you can reject the null hypothesis and conclude that the two groups are in fact different.

T test function in statistical software

Most statistical software (R, SPSS, etc.) includes a t test function. This built-in function will take your raw data and calculate the t value. It will then compare it to the critical value, and calculate a p-value. This way you can quickly see whether your groups are statistically different.

In your comparison of flower petal lengths, you decide to perform your t test using R. The code looks like this:

t.test(Petal.Length ~ Species, data = flower.data)

Interpreting test results

If you perform the t test for your flower hypothesis in R, you will receive the following output:

The output provides:

  1. An explanation of what is being compared, called data in the output table.
  2. The t value: -33.719. Note that it’s negative; this is fine! In most cases, we only care about the absolute value of the difference, or the distance from 0. It doesn’t matter which direction.
  3. The degrees of freedom: 30.196. Degrees of freedom is related to your sample size, and shows how many ‘free’ data points are available in your test for making comparisons. The greater the degrees of freedom, the better your statistical test will work.
  4. The p value: 2.2e-16 (i.e. 2.2 with 15 zeros in front). This describes the probability that you would see a t value as large as this one by chance if the null hypothesis were true.
  5. A statement of the alternative hypothesis (Ha). In this test, the Ha is that the difference is not 0.
  6. The 95% confidence interval. This is the range of numbers within which the true difference in means will be 95% of the time. This can be changed from 95% if you want a larger or smaller interval, but 95% is very commonly used.
  7. The mean petal length for each group.

t test example

From the output table, we can see that the difference in means for our sample data is −4.084 (1.456 − 5.540), and the confidence interval shows that the true difference in means is between −3.836 and −4.331. So, 95% of the time, the true difference in means will be different from 0. Our p value of 2.2e–16 is much smaller than 0.05, so we can reject the null hypothesis of no difference and say with a high degree of confidence that the true difference in means is not equal to zero.

Presenting the results of a t test

When reporting your t test results, the most important values to include are the t value, the p value, and the degrees of freedom for the test. These will communicate to your audience whether the difference between the two groups is statistically significant (a.k.a. that it is unlikely to have happened by chance).

You can also include the summary statistics for the groups being compared, namely the mean and standard deviation. In R, the code for calculating the mean and the standard deviation from the data looks like this:

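library(dplyr)  # provides %>%, group_by() and summarize()

# Mean and standard deviation of petal length for each species
flower.data %>%
  group_by(Species) %>%
  summarize(mean_length = mean(Petal.Length),
            sd_length = sd(Petal.Length))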

In our example, you would report the results like this:

The difference in petal length between iris species 1 (M = 1.46; SD = 0.206) and iris species 2 (M = 5.54; SD = 0.569) was significant (t (30) = −33.7190; p < 2.2e-16).

The T Score

The t score is a ratio between the difference between two groups and the difference within the groups.

  • Larger t scores = more difference between groups.
  • Smaller t score = more similarity between groups.

A t score of 3 tells you that the groups are three times as different from each other as they are within each other. So when you run a t test, bigger t-values equal a greater probability that the results are repeatable.

T-Values and P-values

How big is “big enough”? Every t-value has a p-value to go with it. A p-value from a t test is the probability that the results from your sample data occurred by chance. P-values are from 0% to 100% and are usually written as a decimal (for example, a p value of 5% is 0.05). Low p-values indicate your data did not occur by chance. For example, a p-value of .01 means there is only a 1% probability that the results from an experiment happened by chance.

Calculating the Statistic / Test Types

There are three main types of t-test:

  • An Independent Samples t-test compares the means for two groups.
  • A Paired sample t-test compares means from the same group at different times (say, one year apart).
  • A One sample t-test tests the mean of a single group against a known mean.

What is a Paired T Test (Paired Samples T Test / Dependent Samples T Test)?

A paired t test (also called a correlated pairs t-test, a paired samples t test or dependent samples t test) is where you run a t test on dependent samples. Dependent samples are essentially connected — they are tests on the same person or thing. For example:

  • Knee MRI costs at two different hospitals,
  • Two tests on the same person before and after training,
  • Two blood pressure measurements on the same person using different equipment.

When to Choose a Paired T Test / Paired Samples T Test / Dependent Samples T Test

Choose the paired t-test if you have two measurements on the same item, person or thing. But you should also choose this test if you have two items that are being measured with a unique condition. For example, you might be measuring car safety performance in vehicle research and testing and subject the cars to a series of crash tests. Although the manufacturers are different, you might be subjecting them to the same conditions.

With a “regular” two sample t test, you’re comparing the means for two different samples. For example, you might test two different groups of customer service associates on a business-related test or test students from two universities on their English skills. But if you take a random sample from each group separately and they have different conditions, your samples are independent and you should run an independent samples t test (also called between-samples and unpaired-samples).

The null hypothesis for the independent samples t-test is μ1 = μ2. So it assumes the means are equal. With the paired t test, the null hypothesis is that the mean of the pairwise differences between the two tests is zero (H0: µd = 0).

Paired Samples T Test By hand

Example question: Calculate a paired t test by hand for the following data:

Step 1: Subtract each Y score from each X score.

Step 2: Add up all of the values from Step 1 then set this number aside for a moment.

Step 3: Square the differences from Step 1.

Step 4: Add up all of the squared differences from Step 3.

Step 5: Use the following formula to calculate the t-score:

t = (ΣD / N) / √{[ΣD² – (ΣD)² / N] / [(N – 1) × N]}

Here N is the number of pairs, and:

  1. ΣD: Sum of the differences X – Y (from Step 2).
  2. ΣD²: Sum of the squared differences (from Step 4).
  3. (ΣD)²: Sum of the differences (from Step 2), squared.

If you’re unfamiliar with the Σ notation used in the t test, it basically means to “add everything up”.
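Steps 1 to 6 can be sketched in code using the sums ΣD and ΣD² directly. The helper name `paired_t_from_sums` and the five pairs of scores below are hypothetical, chosen only to illustrate the calculation; they are not the data from the example question.

```python
import math

def paired_t_from_sums(x, y):
    """Return (t, df) for a paired t test using the sums of differences."""
    diffs = [a - b for a, b in zip(x, y)]  # Step 1: X - Y for each pair
    n = len(diffs)
    sum_d = sum(diffs)                     # Step 2: sum of the differences
    sum_d_sq = sum(d * d for d in diffs)   # Steps 3-4: sum of squared differences
    # Step 5: mean difference divided by its standard error
    numerator = sum_d / n
    denominator = math.sqrt((sum_d_sq - sum_d ** 2 / n) / ((n - 1) * n))
    return numerator / denominator, n - 1  # Step 6: df = N - 1

# Hypothetical illustrative data
x_scores = [3, 5, 4, 6, 2]
y_scores = [4, 7, 5, 6, 4]

t, df = paired_t_from_sums(x_scores, y_scores)
print(round(t, 2), df)  # -3.21 4
```

The result is algebraically the same as dividing the mean difference by the standard deviation of the differences over √N.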

Step 6: Subtract 1 from the sample size to get the degrees of freedom. We have 11 items. So 11 – 1 = 10.

Step 7: Find the critical value in the t-table, using the degrees of freedom in Step 6. If you don’t have a specified alpha level, use 0.05 (5%).

So for this example t test problem, with df = 10, the critical t-value is 2.228.

Step 8: In conclusion, compare your t-table value from Step 7 (2.228) to your calculated t-value (-2.74). The absolute value of the calculated t-value (2.74) is greater than the table value at an alpha level of .05. In addition, note that the p-value is less than the alpha level: p <.05. So we can reject the null hypothesis that there is no difference between means.

However, note that you can ignore the minus sign when comparing the two t-values as ± indicates the direction; the p-value remains the same for both directions.

 
