
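# How to Code the Student's t-Test from Scratch in Python

> Original: [https://machinelearningmastery.com/how-to-code-the-students-t-test-from-scratch-in-python/](https://machinelearningmastery.com/how-to-code-the-students-t-test-from-scratch-in-python/)

Perhaps one of the most widely used statistical hypothesis tests is the Student's t-test.

Because you may use this test yourself someday, it is important to have a deep understanding of how the test works. As a developer, this understanding is best achieved by implementing the hypothesis test yourself from scratch.

In this tutorial, you will discover how to implement the Student's t-test statistical hypothesis test from scratch in Python.

After completing this tutorial, you will know:

* The Student's t-test will comment on whether it is likely to observe two samples given that the samples were drawn from the same population.
* How to implement the Student's t-test from scratch for two independent samples.
* How to implement the paired Student's t-test from scratch for two dependent samples.

Let's get started.

![How to Code the Student's t-Test from Scratch in Python](img/18c211273ce8051c7807ef2e09b805d7.jpg)

How to Code the Student's t-Test from Scratch in Python. Photo by [n1d](https://www.flickr.com/photos/62400641@N07/33385804523/), some rights reserved.

## Tutorial Overview

This tutorial is divided into three parts; they are:

1. Student's t-Test
2. Student's t-Test for Independent Samples
3. Student's t-Test for Dependent Samples

## Student's t-Test

The [Student's t-test](https://en.wikipedia.org/wiki/Student%27s_t-test) is a statistical hypothesis test for testing whether two samples are expected to have been drawn from the same population.

It is named for the pseudonym "_Student_" used by William Gosset, who developed the test.

The test works by checking the means of the two samples to see whether they are significantly different from each other. It does this by calculating the standard error of the difference between the means, which indicates how likely the observed difference is if the two samples have the same mean (the null hypothesis).

The t-statistic calculated by the test can be interpreted by comparing it to a critical value from the t-distribution. The critical value can be calculated from the degrees of freedom and a significance level using the percent point function (PPF).

We can interpret the statistic in a two-tailed test, meaning that if we reject the null hypothesis, it could be because the first mean is smaller or larger than the second mean. To do this, we take the absolute value of the test statistic and compare it to the positive (right-tailed) critical value, as follows:

* **If abs(t-statistic) <= critical value**: Accept (fail to reject) the null hypothesis that the means are equal.
* **If abs(t-statistic) > critical value**: Reject the null hypothesis that the means are equal.

We can also use the cumulative distribution function (CDF) of the t-distribution to retrieve the cumulative probability of observing the absolute value of the t-statistic, in order to calculate a p-value. The p-value can then be compared to a chosen significance level (alpha), such as 0.05, to determine whether the null hypothesis can be rejected:

* **If p > alpha**: Accept (fail to reject) the null hypothesis that the means are equal.
* **If p <= alpha**: Reject the null hypothesis that the means are equal.

In working with the means of the samples, the test assumes that both samples were drawn from a Gaussian distribution. The test also assumes that the samples have the same variance and the same size, although there are corrections to the test if these assumptions do not hold. For example, see [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test).

There are two main versions of the Student's t-test:

* **Independent Samples**. The case where the two samples are unrelated.
* **Dependent Samples**. The case where the samples are related, such as repeated measures on the same population. Also called a paired test.

The independent and dependent Student's t-tests are available in Python via the [ttest_ind()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html) and [ttest_rel()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_rel.html) SciPy functions respectively.

**Note**: I recommend using these SciPy functions to calculate the Student's t-test for your applications, if they are suitable. The library implementations will be faster and less prone to bugs. I would only recommend implementing the test yourself for learning purposes or in the case where you require a modified version of the test.

We will use the SciPy functions to confirm the results from our own version of the tests.

Note, for reference, that all calculations presented in this tutorial are taken directly from the chapter "_t Tests_" in "[Statistics in Plain English](https://amzn.to/2J2Qibd)", Third Edition, 2010. I mention this because you may see the equations in different forms, depending on the reference text that you use.

## Student's t-Test for Independent Samples

We will start with the most common form of the Student's t-test: the case where we are comparing the means of two independent samples.

### Calculation

The calculation of the t-statistic for two independent samples is as follows:

```py
t = observed difference between sample means / standard error of the difference between the means
```

or

```py
t = (mean(X1) - mean(X2)) / sed
```

Where _X1_ and _X2_ are the first and second data samples and _sed_ is the standard error of the difference between the means.

The standard error of the difference between the means can be calculated as follows:

```py
sed = sqrt(se1^2 + se2^2)
```

Where _se1_ and _se2_ are the standard errors for the first and second datasets.

The standard error of a sample can be calculated as:

```py
se = std / sqrt(n)
```

Where _se_ is the standard error of the sample, _std_ is the sample standard deviation, and _n_ is the number of observations in the sample.

These calculations make the following assumptions:

* The samples are drawn from a Gaussian distribution.
* The size of each sample is approximately equal.
* The samples have the same variance.

### Implementation

We can implement these equations easily using functions from the Python standard library, NumPy, and SciPy.

Let's assume that our two data samples are stored in the variables _data1_ and _data2_.

We can start off by calculating the mean for these samples as follows:

```py
# calculate means
mean1, mean2 = mean(data1), mean(data2)
```

We're halfway there.

Now we need to calculate the standard error. We can do this manually, first by calculating the sample standard deviations:

```py
# calculate sample standard deviations
std1, std2 = std(data1, ddof=1), std(data2, ddof=1)
```

And then the standard errors:

```py
# calculate standard errors
n1, n2 = len(data1), len(data2)
se1, se2 = std1/sqrt(n1), std2/sqrt(n2)
```

Alternately, we can use the _sem()_ SciPy function to calculate the standard error directly.

```py
# calculate standard errors
se1, se2 = sem(data1), sem(data2)
```

We can use the standard errors of the samples to calculate the "_standard error of the difference_" between the samples:

```py
# standard error on the difference between the samples
sed = sqrt(se1**2.0 + se2**2.0)
```

We can now calculate the t-statistic:

```py
# calculate the t statistic
t_stat = (mean1 - mean2) / sed
```

We can also calculate some other values to help interpret and present the statistic.

The number of degrees of freedom for the test is calculated as the sum of the observations in both samples, minus two.

```py
# degrees of freedom
df = n1 + n2 - 2
```

The critical value can be calculated using the percent point function (PPF) for a given significance level, such as 0.05 (95% confidence). Because the test is two-tailed, the significance level is split across both tails, so the PPF is evaluated at 1 - alpha/2.

This function is available for the t-distribution in SciPy, as follows:

```py
# calculate the two-tailed critical value
alpha = 0.05
cv = t.ppf(1.0 - alpha / 2.0, df)
```

The p-value can be calculated using the cumulative distribution function of the t-distribution, again in SciPy.

```py
# calculate the p-value
p = (1 - t.cdf(abs(t_stat), df)) * 2
```

Here, we assume a two-tailed distribution, where the rejection of the null hypothesis could be interpreted as the first mean being either smaller or larger than the second mean.

We can tie all of these pieces together into a simple function for calculating the t-test for two independent samples:

```py
# function for calculating the t-test for two independent samples
def independent_ttest(data1, data2, alpha):
    # calculate means
    mean1, mean2 = mean(data1), mean(data2)
    # calculate standard errors
    se1, se2 = sem(data1), sem(data2)
    # standard error on the difference between the samples
    sed = sqrt(se1**2.0 + se2**2.0)
    # calculate the t statistic
    t_stat = (mean1 - mean2) / sed
    # degrees of freedom
    df = len(data1) + len(data2) - 2
    # calculate the two-tailed critical value
    cv = t.ppf(1.0 - alpha / 2.0, df)
    # calculate the p-value
    p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
    # return everything
    return t_stat, df, cv, p
```

### Worked Example

In this section, we will calculate the t-test on some synthetic data samples.

First, let's generate two samples of 100 Gaussian random numbers with the same standard deviation of 5 and differing means of 50 and 51 respectively. We expect the test to reject the null hypothesis and find a significant difference between the samples:

```py
# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
```

We can calculate the t-test on these samples using the built-in SciPy function _ttest_ind()_. This gives us a t-statistic and a p-value to compare against, to ensure that we have implemented the test correctly.

The complete example is listed below.

```py
# Student's t-test for independent samples
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_ind
# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# compare samples
stat, p = ttest_ind(data1, data2)
print('t=%.3f, p=%.3f' % (stat, p))
```

Running the example, we can see a t-statistic and a p-value.

We will use these as our expected values for the test on these data.

```py
t=-2.262, p=0.025
```

We can now apply our own implementation to the same data, using the function defined in the previous section.

The function returns the t-statistic and a critical value. We can use the critical value to interpret the t-statistic and see whether the finding of the test is significant and the means are indeed different, as we expect.

```py
# interpret via critical value
if abs(t_stat) <= cv:
    print('Accept null hypothesis that the means are equal.')
else:
    print('Reject the null hypothesis that the means are equal.')
```

The function also returns a p-value. We can interpret the p-value using an alpha, such as 0.05, to determine whether the finding of the test is significant and the means are indeed different, as we expect.

```py
# interpret via p-value
if p > alpha:
    print('Accept null hypothesis that the means are equal.')
else:
    print('Reject the null hypothesis that the means are equal.')
```

We expect that both interpretations will always match.

The complete example is listed below.

```py
# t-test for independent samples
from math import sqrt
from numpy.random import seed
from numpy.random import randn
from numpy import mean
from scipy.stats import sem
from scipy.stats import t

# function for calculating the t-test for two independent samples
def independent_ttest(data1, data2, alpha):
    # calculate means
    mean1, mean2 = mean(data1), mean(data2)
    # calculate standard errors
    se1, se2 = sem(data1), sem(data2)
    # standard error on the difference between the samples
    sed = sqrt(se1**2.0 + se2**2.0)
    # calculate the t statistic
    t_stat = (mean1 - mean2) / sed
    # degrees of freedom
    df = len(data1) + len(data2) - 2
    # calculate the two-tailed critical value
    cv = t.ppf(1.0 - alpha / 2.0, df)
    # calculate the p-value
    p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
    # return everything
    return t_stat, df, cv, p

# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# calculate the t test
alpha = 0.05
t_stat, df, cv, p = independent_ttest(data1, data2, alpha)
print('t=%.3f, df=%d, cv=%.3f, p=%.3f' % (t_stat, df, cv, p))
# interpret via critical value
if abs(t_stat) <= cv:
    print('Accept null hypothesis that the means are equal.')
else:
    print('Reject the null hypothesis that the means are equal.')
# interpret via p-value
if p > alpha:
    print('Accept null hypothesis that the means are equal.')
else:
    print('Reject the null hypothesis that the means are equal.')
```

Running the example first calculates the test.

The results of the test are printed, including the t-statistic, the degrees of freedom, the critical value, and the p-value.

We can see that both the t-statistic and the p-value match the outputs of the SciPy function. The test appears to be implemented correctly.

The t-statistic and the p-value are then used to interpret the results of the test. We find that, as we expect, there is sufficient evidence to reject the null hypothesis, finding that the sample means are likely different.

```py
t=-2.262, df=198, cv=1.972, p=0.025
Reject the null hypothesis that the means are equal.
Reject the null hypothesis that the means are equal.
```

## Student's t-Test for Dependent Samples

We can now look at the case of calculating the Student's t-test for dependent samples.

This is the case where we collect some observations on a sample from the population, then apply some treatment, and then collect observations from the same sample.

The result is two samples of the same size where the observations in each sample are related or paired.

The t-test for dependent samples is referred to as the paired Student's t-test.

### Calculation

The calculation of the paired Student's t-test is similar to the case with independent samples.

The main difference is in the calculation of the denominator.

```py
t = (mean(X1) - mean(X2)) / sed
```

Where _X1_ and _X2_ are the first and second data samples and _sed_ is the standard error of the difference between the means.

Here, _sed_ is calculated as:

```py
sed = sd / sqrt(n)
```

Where _sd_ is the standard deviation of the differences between the paired observations and _n_ is the total number of paired observations (e.g. the size of each sample).

The calculation of _sd_ first requires the calculation of the sum of the squared differences between the samples:

```py
d1 = sum (X1[i] - X2[i])^2 for i in n
```

It also requires the sum of the (non-squared) differences between the samples:

```py
d2 = sum (X1[i] - X2[i]) for i in n
```

We can then calculate _sd_ as:

```py
sd = sqrt((d1 - (d2**2 / n)) / (n - 1))
```

That's it.

### Implementation

We can implement the calculation of the paired Student's t-test directly in Python.

The first step is to calculate the mean of each sample.

```py
# calculate means
mean1, mean2 = mean(data1), mean(data2)
```

Next, we will require the number of pairs (_n_). We will use this in a few different calculations.

```py
# number of paired samples
n = len(data1)
```

Next, we must calculate the sum of the squared differences between the samples, as well as the sum of the differences.

```py
# sum squared difference between observations
d1 = sum([(data1[i]-data2[i])**2 for i in range(n)])
# sum difference between observations
d2 = sum([data1[i]-data2[i] for i in range(n)])
```

We can now calculate the standard deviation of the differences between the paired observations.

```py
# sample standard deviation of the paired differences
sd = sqrt((d1 - (d2**2 / n)) / (n - 1))
```

This is then used to calculate the standard error of the difference between the means.

```py
# standard error of the difference between the means
sed = sd / sqrt(n)
```

Finally, we have everything we need to calculate the t-statistic.

```py
# calculate the t statistic
t_stat = (mean1 - mean2) / sed
```

The only other key difference between this implementation and the implementation for independent samples is the calculation of the number of degrees of freedom.

```py
# degrees of freedom
df = n - 1
```

As before, we can tie all of this together into a reusable function. The function takes two paired samples and a significance level (alpha) and calculates the t-statistic, the number of degrees of freedom, the critical value, and the p-value.

The complete function is listed below.

```py
# function for calculating the t-test for two dependent samples
def dependent_ttest(data1, data2, alpha):
    # calculate means
    mean1, mean2 = mean(data1), mean(data2)
    # number of paired samples
    n = len(data1)
    # sum squared difference between observations
    d1 = sum([(data1[i]-data2[i])**2 for i in range(n)])
    # sum difference between observations
    d2 = sum([data1[i]-data2[i] for i in range(n)])
    # sample standard deviation of the paired differences
    sd = sqrt((d1 - (d2**2 / n)) / (n - 1))
    # standard error of the difference between the means
    sed = sd / sqrt(n)
    # calculate the t statistic
    t_stat = (mean1 - mean2) / sed
    # degrees of freedom
    df = n - 1
    # calculate the two-tailed critical value
    cv = t.ppf(1.0 - alpha / 2.0, df)
    # calculate the p-value
    p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
    # return everything
    return t_stat, df, cv, p
```

### Worked Example

In this section, we will use the same dataset in the worked example as we did for the independent Student's t-test.

The data samples are not paired, but we will pretend they are. We expect the test to reject the null hypothesis and find a significant difference between the samples.

```py
# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
```

As before, we can evaluate the test problem with the SciPy function for calculating a paired t-test. In this case, the _ttest_rel()_ function.

The complete example is listed below.

```py
# Paired Student's t-test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_rel
# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# compare samples
stat, p = ttest_rel(data1, data2)
print('Statistics=%.3f, p=%.3f' % (stat, p))
```

Running the example calculates and prints the t-statistic and the p-value.

We will use these values to validate the calculation of our own paired t-test function.

```py
Statistics=-2.372, p=0.020
```

We can now test our own implementation of the paired Student's t-test.

The complete example, including the developed function and the interpretation of the results of the function, is listed below.

```py
# t-test for dependent samples
from math import sqrt
from numpy.random import seed
from numpy.random import randn
from numpy import mean
from scipy.stats import t

# function for calculating the t-test for two dependent samples
def dependent_ttest(data1, data2, alpha):
    # calculate means
    mean1, mean2 = mean(data1), mean(data2)
    # number of paired samples
    n = len(data1)
    # sum squared difference between observations
    d1 = sum([(data1[i]-data2[i])**2 for i in range(n)])
    # sum difference between observations
    d2 = sum([data1[i]-data2[i] for i in range(n)])
    # sample standard deviation of the paired differences
    sd = sqrt((d1 - (d2**2 / n)) / (n - 1))
    # standard error of the difference between the means
    sed = sd / sqrt(n)
    # calculate the t statistic
    t_stat = (mean1 - mean2) / sed
    # degrees of freedom
    df = n - 1
    # calculate the two-tailed critical value
    cv = t.ppf(1.0 - alpha / 2.0, df)
    # calculate the p-value
    p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
    # return everything
    return t_stat, df, cv, p

# seed the random number generator
seed(1)
# generate two independent samples (pretend they are dependent)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# calculate the t test
alpha = 0.05
t_stat, df, cv, p = dependent_ttest(data1, data2, alpha)
print('t=%.3f, df=%d, cv=%.3f, p=%.3f' % (t_stat, df, cv, p))
# interpret via critical value
if abs(t_stat) <= cv:
    print('Accept null hypothesis that the means are equal.')
else:
    print('Reject the null hypothesis that the means are equal.')
# interpret via p-value
if p > alpha:
    print('Accept null hypothesis that the means are equal.')
else:
    print('Reject the null hypothesis that the means are equal.')
```

Running the example calculates the paired t-test on the sample problem.

The calculated t-statistic and p-value match what we expect from the SciPy library implementation. This suggests that the implementation is correct.

The interpretation of the t-statistic against the critical value and of the p-value against the significance level both find a significant result, rejecting the null hypothesis that the means are equal.

```py
t=-2.372, df=99, cv=1.984, p=0.020
Reject the null hypothesis that the means are equal.
Reject the null hypothesis that the means are equal.
```

## Extensions

This section lists some ideas for extending the tutorial that you may wish to explore.

* Apply each test to your own contrived sample problem.
* Update the independent test and add the correction for samples with different variances and sample sizes.
* Perform a code review of one of the tests implemented in the SciPy library and summarize the differences in the implementation details.

If you explore any of these extensions, I'd love to know.

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

### Books

* [Statistics in Plain English](https://amzn.to/2J2Qibd), Third Edition, 2010.

### API

* [scipy.stats.ttest_ind API](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html)
* [scipy.stats.ttest_rel API](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_rel.html)
* [scipy.stats.sem API](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.sem.html)
* [scipy.stats.t API](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html)

### Articles

* [Student's t-test on Wikipedia](https://en.wikipedia.org/wiki/Student%27s_t-test)
* [Welch's t-test on Wikipedia](https://en.wikipedia.org/wiki/Welch%27s_t-test)

## Summary

In this tutorial, you discovered how to implement the Student's t-test statistical hypothesis test from scratch in Python.

Specifically, you learned:

* The Student's t-test will comment on whether it is likely to observe two samples given that the samples were drawn from the same population.
* How to implement the Student's t-test from scratch for two independent samples.
* How to implement the paired Student's t-test from scratch for two dependent samples.

Do you have any questions? Ask your questions in the comments below and I will do my best to answer.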
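
As a possible starting point for the second extension listed above (correcting the independent test for samples with different variances and sizes), here is a minimal sketch of Welch's t-test. The function name `welch_ttest` and the sample parameters (a second sample of 40 values with a standard deviation of 10) are my own assumptions for illustration and are not part of SciPy or the reference text. The degrees of freedom use the Welch-Satterthwaite approximation, and the result is checked against `ttest_ind()` with `equal_var=False`.

```py
# Welch's t-test for two independent samples (illustrative sketch)
from math import sqrt
from numpy import mean, var
from numpy.random import seed, randn
from scipy.stats import t, ttest_ind

# welch_ttest is a hypothetical helper name, not a SciPy function
def welch_ttest(data1, data2, alpha):
    # sample sizes and means
    n1, n2 = len(data1), len(data2)
    mean1, mean2 = mean(data1), mean(data2)
    # squared standard error of each mean, from the sample variance (ddof=1)
    v1, v2 = var(data1, ddof=1) / n1, var(data2, ddof=1) / n2
    # standard error of the difference, without assuming equal variances
    sed = sqrt(v1 + v2)
    # calculate the t statistic
    t_stat = (mean1 - mean2) / sed
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    # two-tailed critical value and p-value
    cv = t.ppf(1.0 - alpha / 2.0, df)
    p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
    return t_stat, df, cv, p

# seed the random number generator
seed(1)
# two samples with different sizes and spreads (assumed values)
data1 = 5 * randn(100) + 50
data2 = 10 * randn(40) + 51
# our sketch
t_stat, df, cv, p = welch_ttest(data1, data2, 0.05)
print('welch: t=%.3f, df=%.1f, cv=%.3f, p=%.3f' % (t_stat, df, cv, p))
# SciPy reference with the equal-variance assumption switched off
stat, p_scipy = ttest_ind(data1, data2, equal_var=False)
print('scipy: t=%.3f, p=%.3f' % (stat, p_scipy))
```

The two printed lines should report the same t-statistic and p-value. Note that the degrees of freedom are no longer a whole number, which is expected for this correction.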