Where Do Z-Score Tables Come From? (+ how to make them in R)

Every student learns how to look up areas under the normal curve using Z-Score tables in their first statistics class. But what is less commonly covered, especially in courses where calculus is not a prerequisite, is where those Z-Score tables come from: figuring out the area under the normal curve for all possible places you could chop it into two, then making a table from it.
You get the z-score by evaluating the integral of the equation for the bell-shaped normal curve, usually from -Inf to the z-score of interest. This is the same thing that the R command pnorm does when you provide it with a z-score. Here is the slide presentation I put together to explain the use and origin of the Z-Score table, and how it relates to pnorm and qnorm (the command that lets you input an area to find the z-score at which the area to the left is swiped out). It’s free to use under Creative Commons, and is part of the course materials that is available for use with this 2017 book.
One of the fun things I did was to make my own z-score table in R. I don’t know why anyone would WANT to do this — they are easy to find in books, and online, and if you know how to use pnorm and qnorm, you don’t need one at all. But, you can, and here’s how.
First, let’s create a z-score table just with left-tail areas. Using symmetry, we can also use this to get any areas in the right tail, because the area to the left of any -z is the same as any area to the right of any +z. Even though the z-score table contains areas in its cells, our first step is to create a table just of the z-scores that correspond to each cell:
c0 <- seq(-3.4,0,.1) c1 <- seq(-3.41,0,.1) c2 <- seq(-3.42,0,.1) c3 <- seq(-3.43,0,.1) c4 <- seq(-3.44,0,.1) c5 <- seq(-3.45,0,.1) c6 <- seq(-3.46,0,.1) c7 <- seq(-3.47,0,.1) c8 <- seq(-3.48,0,.1) c9 <- seq(-3.49,0,.1) z <- cbind(c0,c1,c2,c3,c4,c5,c6,c7,c8,c9) z</pre> <pre>c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 [1,] -3.4 -3.41 -3.42 -3.43 -3.44 -3.45 -3.46 -3.47 -3.48 -3.49 [2,] -3.3 -3.31 -3.32 -3.33 -3.34 -3.35 -3.36 -3.37 -3.38 -3.39 [3,] -3.2 -3.21 -3.22 -3.23 -3.24 -3.25 -3.26 -3.27 -3.28 -3.29 [4,] -3.1 -3.11 -3.12 -3.13 -3.14 -3.15 -3.16 -3.17 -3.18 -3.19 [5,] -3.0 -3.01 -3.02 -3.03 -3.04 -3.05 -3.06 -3.07 -3.08 -3.09 [6,] -2.9 -2.91 -2.92 -2.93 -2.94 -2.95 -2.96 -2.97 -2.98 -2.99 [7,] -2.8 -2.81 -2.82 -2.83 -2.84 -2.85 -2.86 -2.87 -2.88 -2.89 [8,] -2.7 -2.71 -2.72 -2.73 -2.74 -2.75 -2.76 -2.77 -2.78 -2.79 [9,] -2.6 -2.61 -2.62 -2.63 -2.64 -2.65 -2.66 -2.67 -2.68 -2.69 [10,] -2.5 -2.51 -2.52 -2.53 -2.54 -2.55 -2.56 -2.57 -2.58 -2.59 [11,] -2.4 -2.41 -2.42 -2.43 -2.44 -2.45 -2.46 -2.47 -2.48 -2.49 [12,] -2.3 -2.31 -2.32 -2.33 -2.34 -2.35 -2.36 -2.37 -2.38 -2.39 [13,] -2.2 -2.21 -2.22 -2.23 -2.24 -2.25 -2.26 -2.27 -2.28 -2.29 [14,] -2.1 -2.11 -2.12 -2.13 -2.14 -2.15 -2.16 -2.17 -2.18 -2.19 [15,] -2.0 -2.01 -2.02 -2.03 -2.04 -2.05 -2.06 -2.07 -2.08 -2.09 [16,] -1.9 -1.91 -1.92 -1.93 -1.94 -1.95 -1.96 -1.97 -1.98 -1.99 [17,] -1.8 -1.81 -1.82 -1.83 -1.84 -1.85 -1.86 -1.87 -1.88 -1.89 [18,] -1.7 -1.71 -1.72 -1.73 -1.74 -1.75 -1.76 -1.77 -1.78 -1.79 [19,] -1.6 -1.61 -1.62 -1.63 -1.64 -1.65 -1.66 -1.67 -1.68 -1.69 [20,] -1.5 -1.51 -1.52 -1.53 -1.54 -1.55 -1.56 -1.57 -1.58 -1.59 [21,] -1.4 -1.41 -1.42 -1.43 -1.44 -1.45 -1.46 -1.47 -1.48 -1.49 [22,] -1.3 -1.31 -1.32 -1.33 -1.34 -1.35 -1.36 -1.37 -1.38 -1.39 [23,] -1.2 -1.21 -1.22 -1.23 -1.24 -1.25 -1.26 -1.27 -1.28 -1.29 [24,] -1.1 -1.11 -1.12 -1.13 -1.14 -1.15 -1.16 -1.17 -1.18 -1.19 [25,] -1.0 -1.01 -1.02 -1.03 -1.04 -1.05 -1.06 -1.07 -1.08 -1.09 [26,] -0.9 -0.91 -0.92 -0.93 -0.94 -0.95 -0.96 -0.97 -0.98 -0.99 [27,] -0.8 -0.81 -0.82 -0.83 -0.84 -0.85 -0.86 -0.87 -0.88 -0.89 [28,] -0.7 -0.71 -0.72 -0.73 -0.74 -0.75 -0.76 -0.77 -0.78 -0.79 [29,] -0.6 -0.61 -0.62 -0.63 -0.64 -0.65 -0.66 -0.67 -0.68 -0.69 [30,] -0.5 -0.51 -0.52 -0.53 -0.54 -0.55 -0.56 -0.57 -0.58 -0.59 [31,] -0.4 -0.41 -0.42 -0.43 -0.44 -0.45 -0.46 -0.47 -0.48 -0.49 [32,] -0.3 -0.31 -0.32 -0.33 -0.34 -0.35 -0.36 -0.37 -0.38 -0.39 [33,] -0.2 -0.21 -0.22 -0.23 -0.24 -0.25 -0.26 -0.27 -0.28 -0.29 [34,] -0.1 -0.11 -0.12 -0.13 -0.14 -0.15 -0.16 -0.17 -0.18 -0.19 [35,] 0.0 -0.01 -0.02 -0.03 -0.04 -0.05 -0.06 -0.07 -0.08 -0.09Now that we have slots for all the z-scores, we can use pnorm to transform all those values into the areas that are swiped out to the left of that z-score. This part is easy, and only takes one line. The remaining three lines format and display the z-score table:
zscore.df <- round(pnorm(z),4) row.names(zscore.df) <- sprintf("%.2f", c0) colnames(zscore.df) <- seq(0,0.09,0.01) zscore.df 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 -3.40 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002 -3.30 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003 -3.20 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005 -3.10 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007 -3.00 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010 -2.90 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014 -2.80 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019 -2.70 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026 -2.60 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036 -2.50 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048 -2.40 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064 -2.30 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084 -2.20 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110 -2.10 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143 -2.00 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183 -1.90 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233 -1.80 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294 -1.70 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367 -1.60 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455 -1.50 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559 -1.40 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681 -1.30 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823 -1.20 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985 -1.10 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170 -1.00 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379 -0.90 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611 -0.80 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867 -0.70 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148 -0.60 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451 -0.50 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776 -0.40 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121 -0.30 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483 -0.20 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859 -0.10 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247 0.00 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641You can also draw a picture to go along with your z-score table, so that people remember which area they are looking up:
x <- seq(-4,4,0.1)
y <- dnorm(x)
plot(x,dnorm(x),type="l", col="black", lwd=3)
abline(v=-1,lwd=3,col="blue")
abline(h=0,lwd=3,col="black")
polygon(c(x[1:31],rev(x[1:31])), c(rep(0,31),rev(y[1:31])), col="lightblue")
[/code]
It looks like this:
In the slides, code to produce a giant-tail z-score table is also provided (where the areas are > 50%).
Pingback: Where Do Z-Score Tables Come From? (+ how to make them in R) | A bunch of data
Pingback: Where Do Z-Score Tables Come From? (+ how to make them in R) – Mubashir Qasim
Pingback: Distilled News | Data Analytics & R
Can I ask a short question? How would I a adapt the light blue filling to a specific z-value? I guess I have to change the following lines:
abline(v=2, lwd=3, col=”blue”) #change the v-value to the z-value we want (here: from -1 to 2):
polygon(c(x[1:31], rev(x[1:31])), c(rep(0,31), rev(y[1:31])), col=”lightblue”) #What to I have to change here?
Many thanks!
Maybe let me specify my question. I know that I have to replace every “31” by a “61” (see below)…. but I just got there by trial and error. How can I compute which value I have to plug in so that it fits the z-score that I want?
polygon(c(x[1:61],rev(x[1:61])), c(rep(0,61),rev(y[1:61])), col=”lightblue”)
Hi Julius 🙂 You can use “which” to find the position in x where your z-score targets lie. So for example if you want to shade from z=-1.4 to z=+0.3, you could do this:
> which(x==”-1.4″)
[1] 27
> which(x==”0.3″)
[1] 44
> polygon(c(x[27:44],rev(x[27:44])), c(rep(0,18),rev(y[27:44])), col=”lightblue”)
(The “18” in there is just index 44 – index 27 + 1.) Does that work for you?
Great, thank you so much! It works. Very useful!