Class Size and Teacher Effects on Student Achievement and Dropout Rates in University-Level Calculus

This paper describes two studies of class size effects on achievement (on a standardized exam) and dropout rates in calculus at the university level. The first, or preliminary, study was conducted in 1995 at a large land-grant institution in the southern United States, and involved one teacher in both a large and a small section of calculus and a total of 293 students. The second, main study was conducted at a large private university in the western United States in 1997 and 1998, and included four teachers, each in both large and small sections of calculus, and a total of 2,118 students. After accounting for other significant factors, we found that class size itself was not a significant factor in student achievement in calculus at the university level, nor were students more likely to drop from large sections than they were from small sections. However, individual teachers did vary in their effectiveness in different class sizes--some were more effective in large classes than in small ones, while others were less effective in large classes than in small ones. More importantly, the most effective teachers in large sections were more effective than almost all of the remaining teachers in small sections. Implications for university administrators are discussed.

Introduction

Class size and its effect on students have been researched repeatedly in recent years, but most of these studies have focused on elementary and secondary schools rather than on university-level teaching, and very few have dealt with mathematics. Yet [Smith and Glass, 1980] gives evidence that class size effects vary with students' age, and other studies indicate that the class size effect varies with subject matter--even within a discipline [McConnell and Sosin, 1984, Raimondo et al., 1990]. This indicates a need for a specific study of the effects of class size in university-level mathematics courses. This paper begins to address this need by studying class size effects on achievement and dropout rates in calculus classes at two large universities.

In a meta-analysis of class-size studies, Glass and Smith [Glass and Smith, 1979, Smith and Glass, 1980] showed for elementary school children that the benefit of small classes is a logarithmic function of size, with the marginal benefit of reducing class size being most significant for classes of size 20 and fewer. Moreover, the marginal benefit is very small when classes are larger than 25 or 30 students; that is, there is little, if any, benefit to reducing class size if the small class has more than 25-30 students. Since most universities cannot afford to reduce class size in introductory mathematics courses to much below 30 students, Glass and Smith's results, if applicable to university-level instruction, would suggest that little or no benefit would be derived from reducing relatively large classes to small classes of approximately 30. This study confirms that hypothesis.

Studies of university-level economics and accounting instruction have repeatedly shown little or no significant effect on student achievement from reduced class size [Bellante, 1972, Hill, 1998, Kennedy and Siegfried, 1997]. But again, since class size effects vary with subject and discipline, it was important to study the effect of class size on student achievement in introductory-level mathematics. Our results agree with those obtained in other disciplines.

One concern with many available class-size studies is the fact that although one would expect the effect of class size to vary substantially with the teacher, few studies account for this teacher effect. Some studies include just one teacher (e.g., [Thompson, 1991]). These have the problem that they only show the significance of a size effect for the one teacher involved. But for many reasons, one teacher might be much less effective in a large class than another is. Other studies (e.g., [Williams et al., 1985]) include many teachers, some in large and some in small classes, but without accounting for the teacher in the study. However, one would expect, and our study confirms, that the effect on student achievement due to variation among teachers is much larger than any effect due to class size; thus any model that does not correct for the effect of different teachers will be unable to accurately identify a class-size effect. Indeed, the studies that fail to account for teacher effects generally have a very poor fit between their model and their data.

This study shows that class size in university calculus classes matters only in relation to the teacher. In particular, averaging over all teachers in the study, class size had no significant effect on students' achievement in calculus. However, some teachers were substantially less effective in a large class than other teachers were, and one teacher was more effective in a large class than in a small one. More significant is the conclusion of a second analysis comparing the relative effectiveness of different teachers. We found that three teachers of large sections were more effective than all but four of 24 teachers of small sections. Consequently, a student would have been better served in a large section with one of the better teachers than in all but four of the small sections.

We should note here that in the main study both large and small sections had associated review sections which met twice weekly. These review sections were all small--about 30 students each. In the preliminary study neither the large nor small sections had associated review sections. It is possible that, although the size of the main lecture is not a significant factor, the size of the review section might have a statistically significant effect on students' achievement. Indeed, Kennedy and Siegfried [Kennedy and Siegfried, 1997] cite work of Attiyeh and Lumsden from 1972, which shows some evidence that this is the case in introductory economics classes. Nevertheless, with or without review sessions, we found no effect on student achievement due to class (main lecture) size.

Of course, student achievement is not the only measure of teaching success. Students' attitudes may also be important, even when they are achieving more in the large sections. Studies of students' attitudes in large and small courses give conflicting results. For example, Wood [Wood et al., 1974] concludes that student ratings of instructors declined as enrollment increased to 240, but beyond that point they began to improve. But others [Marsh et al., 1979] found little correlation between class size and students' attitudes about the course. And Sweeney et al. [Sweeney et al., 1983] found that large economics courses were actually preferred over small ones.

Because student evaluations and attitude surveys are relatively unreliable measures and appear to vary widely from class to class, and from day to day within a class, we chose to use student dropout rates as a more objective measure of student attitude.

In an analysis of student dropout rates for large and small classes (with identical instructors) we found that class size was not a significant factor. Again (aside from students' preparation for the course and natural aptitude) the teacher appears to be the chief factor affecting dropout rates.

Since, both in terms of student achievement and in terms of students' likelihood to complete the course, the teacher was a significant factor, and class size was not, we conclude that even if a university can afford to offer calculus in small sections with first-rate teachers, it may still not be the best strategy. One of those teachers may well be more effective in a large class than most or all of the others in a small class; and therefore, students would be better served in the large section than in most or all of the small ones. At most universities, however, small classes are achieved only by hiring adjuncts, graduate students, and less-qualified faculty. In such a case, reducing class size may be doing students a disservice while simultaneously increasing instruction costs.

Preliminary Study: University A

Model

The object of the preliminary study, which was performed at a large land-grant institution in the southern United States (University A), was to study the effect of class size (S) on the performance of Calculus I students, while adjusting for other important effects such as students' general mathematical ability, as measured by their ACT score (A), students' sex (G), students' ethnic group (E), teacher (T), and semester (M). Performance was measured principally by a standardized departmental final exam, but we were also interested to see if students' success (as measured by their grade) in Calculus II might be differently affected by class size. The following statistical models were used for analysis of the data.

Model to measure size effects on final exam scores

Let $y_{l(i,j,k)}$ be the performance on a standardized departmental exam of student $l$ in class $k$ taught by teacher $j$ in a class of type $i$ (large if $i=1$, small if $i=0$).

There are also potential effects due to interaction between different effects, for example

Also, there is an adjustment for initial aptitude and preparation, as measured by the ACT score $A_{l(i,j,k)}$ and scaled by a linear factor $\beta$.

Letting $\mu$ be the overall mean, the basic model is
$\begin{displaymath} y_{l(i,j,k)} = \mu + S_i + T_j + G_{l(i,j,k)} + E_{l(i,j,k)} + \cdots + \beta A_{l(i,j,k)} + \delta_{k(i,j)} + \epsilon_{l(i,j,k)}. \end{displaymath}$
Here $\delta_{k(i,j)}$ is the random error term associated to each class, and $\epsilon_{l(i,j,k)}$ the random error associated to each student.

The initial factors of sex and ethnic group were decided on because other studies have found them to be significant (see, for example [McConnell and Sosin, 1984]). It also seemed likely that the teacher would have a significant effect on achievement, and so teacher was also included as an initial factor.

This statistical model is similar to a standard split-plot model, but differs in that we have an adjustment for the covariate (ACT score). The use of the covariate is important to compensate for the fact that various factors beyond our control may affect the types of students enrolled in the different-sized classes. For example, the small classes fill up more quickly, leaving those who register late in the large classes (this may also partially account for the common perception that large classes are worse). The covariate helps to account for these differences.
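
The role of the covariate can be illustrated with a minimal sketch, which is not the analysis actually run for this study. It assumes a one-way analysis of covariance with a single pooled within-group slope; the function names `pooled_slope` and `adjusted_means` and the toy scores are hypothetical, chosen only so that the raw difference between sections is due entirely to ACT scores.

```python
from statistics import mean


def pooled_slope(groups):
    """Pooled within-group slope of exam score on the covariate.

    groups maps a section label to a list of (covariate, score) pairs.
    """
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    return sxy / sxx


def adjusted_means(groups):
    """Section means of the score, adjusted to the grand mean of the covariate."""
    slope = pooled_slope(groups)
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    result = {}
    for label, pairs in groups.items():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        result[label] = y_bar - slope * (x_bar - grand_x)
    return result


# Hypothetical toy data (NOT from the study): (ACT math score, final exam score).
toy = {
    "small": [(20, 50), (24, 58), (28, 66)],
    "large": [(18, 46), (22, 54), (26, 62)],
}
raw = {label: mean(y for _, y in pairs) for label, pairs in toy.items()}
adj = adjusted_means(toy)
print("raw means:", raw)
print("adjusted means:", adj)
```

In this toy example the raw section means differ by four points, but the adjusted means are equal, because the whole gap comes from the lower ACT scores in the large section.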

Model to measure Calculus I size effects on performance in Calculus II

The preceding model describes the effects of class size on student mastery of the material in Calculus I as measured by the departmental final exam. But it also seemed possible that class size in Calculus I might have a different effect on students' performance in the follow-up course, Calculus II, than it did on performance on the Calculus I final exam. The model we used to test this hypothesis was similar to the preceding model, but it used the restricted data set of only those students that continued on to Calculus II. Success in Calculus II was measured using students' final grade in Calculus II, after adjusting for the additional effect of different teachers for Calculus II and adjusting for the higher order interactions that might be associated with this teacher effect.

Although it would be interesting and important to know the effects of large class size on subsequent enrollment in mathematics, it is probably impossible to draw conclusions about such effects. The difficulty arises because student enrollment in subsequent mathematics courses, especially in Calculus II, is primarily dependent upon program requirements rather than student preference.

Data

In this study, one instructor taught both a large (ca. 90 students) and a small (ca. 35 students) section of introductory calculus in the course of one academic year (1995-6). Eight other instructors taught small sections, and their data are used to standardize the final exam, in particular to account for differences between semesters. Their data also helped identify some changes that needed to be made for the main study at University B. This set contains data for 293 students.

Conclusions of the preliminary study

It was impossible to measure the interaction of teacher and class size because only one teacher taught both large and small sections. However, for that one teacher, class size was not statistically significant. The interaction effects of class size with ethnic group and with sex were also not significant. See Tables 1 and 2 for more details.

**Table 1:** **University A: Initial model.** The initial model included many potential factors, but most were not significant. Neither class size, nor any of its higher-order interactions are significant. For further discussion of these results, see Section 2.3.

| Model R-Square | Model Pr > F |
| --- | --- |
| 0.501564 | 0.0001 |

| Source | Pr > F |
| --- | --- |
| A = ACTMATH | 0.0001 |
| T = TEACHER | 0.0002 |
| M = SEMESTER | 0.0287 |
| E = ETHN | 0.3037 |
| G = SEX | 0.6897 |
| S = SIZE | 0.5010 |
| ET = ETHN*TEACHER | 0.0908 |
| EM = ETHN*SEMESTER | 0.3148 |
| AE = ACTMATH*ETHN | 0.9240 |
| GT = SEX*TEACHER | 0.0289 |
| GM = SEX*SEMESTER | 0.3812 |
| AG = ACTMATH*SEX | 0.9755 |
| SE = ETHN*SIZE | 0.2297 |
| SG = SEX*SIZE | 0.6945 |

**Table 2:** **University A: Final Model** This model is the result of systematic elimination of the insignificant variables. Note that ethnic group and sex are not significant themselves, but rather only the interaction terms with teacher. That coincides with the intuition that sex and ethnicity themselves do not play a role in achievement, but the way the teacher responds to them does. For further discussion of these results, see Section 2.3.

| Model R-Square | Model Pr > F |
| --- | --- |
| 0.490446 | 0.0001 |

| Source | Pr > F |
| --- | --- |
| TEACHER | 0.0001 |
| ACTMATH | 0.0001 |
| SEMESTER | 0.0276 |
| ETHN | 0.2957 |
| SEX | 0.6877 |
| ETHN*TEACHER | 0.0796 |
| SEX*TEACHER | 0.0306 |

By far the most significant influences on student performance were initial preparation and aptitude, as measured by the ACT math score, and teacher (although only one teacher taught both large and small sections, eight others taught small sections). The fact that the teacher would have an effect is not surprising. But the magnitude of this effect was large compared to all others (except the ACT scores). This seemed to indicate that class size effects cannot be effectively measured without carefully adjusting for teacher. Moreover, the widely varying nature of results of other studies that did not adjust for teacher effect (or which only include one teacher) indicates the potential for a large interaction between size and teacher.

Another large effect was associated with the semester in which the course was taught. This is probably due to the fact that students who are well-prepared for Calculus I by their high school program are likely to take Calculus I in the Fall semester of their freshman year, whereas the remaining students are more likely to take Calculus I after first taking a semester of prerequisites.

The time of day the courses were offered varied through the regular school day (8 am to 4 pm), but despite some expectations to the contrary, time was not significant. It appears that the covariate accounted for essentially all variation associated to differences in time of day.

The model of student performance in the subsequent class, Calculus II, showed that students' grades in Calculus II gave no additional information about their performance in Calculus I. In fact, after systematic removal of insignificant factors, the best model for student performance in Calculus II appeared to be one that depends only on teacher (of the Calculus II section) and the students' score on the Calculus I final. Because tracking students to Calculus II gave no information about student mastery of Calculus I beyond that given by the final, that aspect was dropped in the main study.

Finally, although the R² value of .49 for this model was stronger than many that have been published on class size (ranging from R² = .39 [Glass and Smith, 1979] down to R² = .01 [Williams et al., 1985]), it still seemed relatively weak, indicating a need for a better covariate and for inclusion of other significant factors.

Implications for the Main Study

Main Study: University B

The goal of the main study was to decide if size had a significant effect on student achievement in calculus, and to see if the teacher-size interaction was significant. This study was conducted at a large, private university in the western United States (University B).

Main Study Model

We originally expected the ACT and the pretest to be linearly dependent (at least after accounting for other factors such as students' age). However, a test of this hypothesis showed no significant correlation between the ACT math score and the pretest. This is probably because, as explained above, they actually test different things. Consequently, we included both in the model.

Main Study Data

The primary data consist of pretest and final exam scores for 1,984 students in first-semester calculus and 134 students in second-semester calculus, collected over two years at University B. The data also include the various other potential factors described in Section 3.1 that might influence student achievement.

The final exam was written by a departmental committee to represent the core topics and skills that were considered most important for students to know. This was considered a good measure of learning since it represented the consensus of a large number of mathematics instructors about what constitutes successful calculus learning.

For the purposes of this study, small classes are classes with 20-35 students, while large classes contain 150-240 students. Both kinds of classes included review sessions (20-35 students) twice a week with a teaching assistant.

Students who had taken the course previously were not included. Students who dropped the class were also not included, since they did not take the final exam. There was some concern that weak students might be more likely to drop from a large section, but a separate logistic regression showed that for a given teacher, and after adjusting for pretest scores, students drop essentially randomly (see Section 5).

Calculus I data

The Calculus I data cover four semesters (Fall and Winter of 1997 and 1998), and 27 teachers. One teacher (Teacher Q) taught both small and large sections in Winter 1997, a different teacher (Teacher L) taught both small and large sections in Fall 1997, a third teacher (Teacher T) taught both small and large sections in Winter 1998, and a fourth teacher (Teacher AA) taught both small and large sections in Fall 1998. Some of these four taught large and small sections in other semesters (but not simultaneously). One other teacher (Teacher M) taught only large sections, and the remaining teachers taught only small sections. These other teachers are included for purposes of standardizing the pretest and final, and for estimating the relative magnitude of the teacher-size effect.

This set contains data for 1,984 students. They are divided into six ethnic groups and 11 major colleges (and also the option of an undeclared major).

Calculus II data

One teacher taught both small and large sections of Calculus II in Fall 1997. The total number of students in the data set (after removing students who dropped or who had taken the course before) is 134. The same demographic data were included here as those included in the Calculus I data set.

Main Study Analysis

Calculus I analysis

Initially the model considered the potential effects of many factors, as described above. Class size alone was not significant, nor were most of its higher-order interactions (see Table 3). After a standard, systematic elimination of insignificant variables, the model had as its main factors ACT, pretest, teacher, semester, major college, and the teacher-size interaction. Unlike in the case of University A, at University B the interactions between teacher and ethnicity and between teacher and sex are not significant (see Table 4).

**Table 3:** **University B: Calculus I. Initial model.** The initial model included a large number of potential factors including total hours of credit the student had earned (THOURS), student's age (AGE), ethnic group (ETHN), sex (SEX), current course load (LOAD), and major college (MAJCOLL). Most of these were insignificant and were systematically removed. For discussion of these results, see Section 3.3.1.

| Model R-Square | Model Pr > F |
| --- | --- |
| 0.641257 | 0.0001 |

| Source | Pr > F |
| --- | --- |
| SEMESTER | 0.0001 |
| ACTMATH | 0.0001 |
| PRETEST*SEMESTER | 0.0001 |
| TEACHER | 0.0001 |
| LOAD | 0.0572 |
| AGE | 0.0161 |
| AGE*AGE | 0.0074 |
| MAJCOLL | 0.0292 |
| SEX | 0.9126 |
| ETHN | 0.5431 |
| SIZE | 0.1204 |
| THOURS | 0.7504 |
| THOURS*THOURS | 0.3043 |
| TEACHER*SEX | 0.9633 |
| ACTMATH*AGE | 0.0589 |
| LOAD*MAJCOLL | 0.4501 |
| LOAD*LOAD | 0.2868 |
| SIZE*SEX | 0.4114 |
| SIZE*ETHN | 0.4553 |
| TEACHER*SIZE | 0.1491 |
| TEACHER*ETHN | 0.4676 |
| SEMESTER*TEACHER | 0.0091 |
| AGE*THOURS | 0.4642 |

Major college is probably significant because those who have aptitudes in mathematics are most likely to major in mathematically challenging fields, like engineering. But since interest and aptitude are not easily changed by the university, they are of relatively little interest to teachers and administrators.

The factor that is the most interesting is the (weakly significant) teacher-size interaction term. The fact that size itself is not significant and that this interaction term is weakly significant in the final model shows that the size effect, if there is any size effect at all, depends primarily upon characteristics of each individual teacher.

This final model had an R-squared value of .61--a substantial improvement over the preliminary (University A) study.

**Table 4:** **University B: Calculus I.** The final model, developed by systematic elimination of insignificant variables (including class size), shows that class size itself is not a significant factor, but the teacher-size interaction term might be significant. For a discussion of these results, see Section 3.3.1.

| Model R-Square | Model Pr > F |
| --- | --- |
| 0.612772 | 0.0001 |

| Source | Pr > F |
| --- | --- |
| SEMESTER | 0.0001 |
| ACTMATH | 0.0001 |
| TEACHER | 0.0001 |
| PRETEST | 0.0001 |
| MAJCOLL | 0.0265 |
| TEACHER*SIZE | 0.0475 |

Calculus II analysis

Again, a variety of different potential factors were included and then insignificant terms were systematically eliminated (see Table 5). And again, size is not significant. Teacher-size interaction is not measurable here, since only one teacher is involved.

**Table 5:** **University B: Calculus II.** This model, developed by systematic elimination of insignificant variables, shows that class size is not significant for this particular teacher. For discussion of these results, see Section 3.3.2.

| Model R-Square | Model Pr > F |
| --- | --- |
| 0.452498 | 0.0001 |

| Source | Pr > F |
| --- | --- |
| ACTMATH | 0.0131 |
| PRETEST | 0.0001 |
| SIZE | 0.6948 |
| LOAD | 0.0014 |
| AGE | 0.0238 |
| ACTMATH*AGE | 0.0255 |

Unlike in first-semester calculus, age seems to be a factor here, as does a student's current course load. The appearance of age as a factor may be due to the fact that many students appear to delay taking their second semester of calculus, whereas many students take their first semester of calculus in their freshman year. On the other hand, many students with a high school background in calculus will take second-semester calculus immediately in their freshman year. This gap between some, but not all, students' first and second semesters of calculus seems the most likely explanation for the role that students' age plays. The interaction between ACT math score and age may be significant because students who take second-semester calculus immediately in their freshman year have only recently taken the ACT, whereas those who delay taking second-semester calculus may have taken the ACT several years earlier; the predictive value of the ACT will therefore likely vary somewhat with the student's age.

The variation from student course load is harder to explain, but a possible explanation would be that students in their first semester at the university will often simply take the recommended general education courses (including Calculus I) and the recommended total hours, whereas more experienced students will vary from the norm, perhaps because they know better what they want and how to accomplish it. Consequently, course load in the second-semester calculus courses may reflect a student's personal choices and attitudes toward school work, rather than reflecting the advice of the university or a counselor.

Major college plays no role in Calculus II, but it was significant in Calculus I. We conjecture that the reason is that many majors require only first-semester calculus, which also fulfills some university general education requirements, whereas most majors that require second-semester calculus are in either Engineering or Physical and Mathematical Sciences, which have relatively similar coefficients in the first-semester model (see Table 6). Moreover, few students take the course as an elective.

Conclusions of the Main Study

Class size was not significant, and even the teacher-size interaction effect was only weakly significant. No other interaction terms involving size were significant. This suggests that if there is any effect on students' achievement due to class size, it is a function of the individual teacher and her or his ability and attitude, rather than a function of the size alone.

The important question to ask about class size is whether it is in the students' and the university's best interest to increase or decrease class sizes. The insignificance of size as a factor in achievement is, taken alone, not enough to answer that question. In particular, we must ask whether some teachers in large classes are more effective than others in small classes. Also, it is important to know if more students drop out of large classes, since their data could not be included in the study without final exam scores (failing students who did not drop were part of the main study). These two questions are the subject of the additional analyses described in Sections 4 and 5.

Additional analysis--Net Effect of Teacher and Teacher-Size Effects

In order to decide whether a good teacher in a large section was more effective than other teachers in small sections, we solved for the (biased) coefficients in the previous Calculus I model.

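**Table 6:** **University B: Calculus I.** Estimated coefficients (biased) for the University B Calculus I model. For discussion of these results see Section 4.

| Group | Parameter | Estimate |
| --- | --- | --- |
| INTERCEPT | | 10.34158770 |
| ACTMATH | | 1.81378293 |
| PRETEST | | 0.28145443 |
| SEMESTER | 1997 Winter | -22.81697752 |
| SEMESTER | 1997 Fall | -30.78852062 |
| SEMESTER | 1998 Winter | -31.56136904 |
| SEMESTER | 1998 Fall | 0.00000000 |
| TEACHER | Teacher A | 2.97328398 |
| TEACHER | Teacher B | 14.14475384 |
| TEACHER | Teacher C | -2.84780267 |
| TEACHER | Teacher D | -5.83704943 |
| TEACHER | Teacher E | -1.72930695 |
| TEACHER | Teacher F | 1.42747917 |
| TEACHER | Teacher G | 2.36437273 |
| TEACHER | Teacher H | 8.75866094 |
| TEACHER | Teacher I | 1.99462535 |
| TEACHER | Teacher J | -1.34914128 |
| TEACHER | Teacher K | 1.36048855 |
| TEACHER | Teacher L (large and small) | 5.36603618 |
| TEACHER | Teacher M (large only) | 3.67414925 |
| TEACHER | Teacher N | -3.54302678 |
| TEACHER | Teacher O | 7.33831105 |
| TEACHER | Teacher P | -4.96070232 |
| TEACHER | Teacher Q (large and small) | 10.34994654 |
| TEACHER | Teacher R | 0.41012342 |
| TEACHER | Teacher S | 2.57185348 |
| TEACHER | Teacher T (large and small) | 1.01064676 |
| TEACHER | Teacher U | -0.05621085 |
| TEACHER | Teacher V | -0.53756900 |
| TEACHER | Teacher W | -0.65055273 |
| TEACHER | Teacher X | -4.58577073 |
| TEACHER | Teacher Y | -0.33776313 |
| TEACHER | Teacher Z | 1.50086908 |
| TEACHER | Teacher AA (large and small) | 0.00000000 |
| TEACHER*SIZE | Teacher L large | -6.08437615 |
| TEACHER*SIZE | Teacher L small | 0.00000000 |
| TEACHER*SIZE | Teacher Q large | -5.25948571 |
| TEACHER*SIZE | Teacher Q small | 0.00000000 |
| TEACHER*SIZE | Teacher T large | -3.17380207 |
| TEACHER*SIZE | Teacher T small | 0.00000000 |
| TEACHER*SIZE | Teacher AA large | 2.61853700 |
| TEACHER*SIZE | Teacher AA small | 0.00000000 |
| MAJOR COLLEGE | art | 1.06288729 |
| MAJOR COLLEGE | biology | 3.09426616 |
| MAJOR COLLEGE | business | 1.70441351 |
| MAJOR COLLEGE | education | -1.64708684 |
| MAJOR COLLEGE | engineering | 0.95833944 |
| MAJOR COLLEGE | family science | -8.03393284 |
| MAJOR COLLEGE | health/PE | 0.72086673 |
| MAJOR COLLEGE | humanities | 2.41684870 |
| MAJOR COLLEGE | nursing | 1.55783270 |
| MAJOR COLLEGE | phys/math science | 2.44752224 |
| MAJOR COLLEGE | social science | -0.36290885 |
| MAJOR COLLEGE | undeclared | 0.00000000 |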

The results are listed in Table 6. We found that the best teachers in large sections (three of the five who taught large sections) were better for student achievement than all but four of the remaining 24 teachers who taught in small sections.
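
As an illustration of how the coefficients in Table 6 combine, the following sketch adds up the terms of the final Calculus I model for a single student. The function `fitted_score` and the student's ACT math and pretest values are hypothetical; only the coefficients are taken from Table 6. Because those estimates are labeled as biased, it is probably safer to read the difference between two fitted scores than either score on its own.

```python
# Coefficients copied from Table 6 (University B, Calculus I).
INTERCEPT = 10.34158770
ACTMATH = 1.81378293
PRETEST = 0.28145443
TEACHER_AA_LARGE = 2.61853700  # TEACHER*SIZE term for Teacher AA, large section


def fitted_score(act_math, pretest, semester=0.0, teacher=0.0,
                 teacher_size=0.0, major=0.0):
    """Sum the model terms; each keyword argument is a Table 6 coefficient."""
    return (INTERCEPT + semester + teacher + teacher_size + major
            + ACTMATH * act_math + PRETEST * pretest)


# Hypothetical student (NOT from the data): ACT math 28, pretest 50,
# undeclared major, Fall 1998, Teacher AA. These are all reference levels,
# so their coefficients are 0.0 and can be left at the defaults.
small = fitted_score(28, 50)
large = fitted_score(28, 50, teacher_size=TEACHER_AA_LARGE)
print(round(small, 2), round(large, 2), round(large - small, 2))
```

For any fixed student the two fitted scores differ by exactly the teacher-size coefficient, which is why that coefficient can be read as the effect of moving the same student from Teacher AA's small section to the large one.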

In particular, teacher Q had an effect of 10.3 and a teacher-size interaction effect of -5.2 for large classes, making a total effect of 5.1 to a student's final exam score in teacher Q's large section. Teacher M only taught large sections, and had an effect of 3.7, and teacher AA had a total effect of 2.62 in large classes. However, only four small-section teachers (B=14.1, H=8.8, L=5.4, and O=7.3) had a better effect in their small sections than these three teachers (Q, M and AA) of large sections. The remaining 20 teachers taught only small sections, and they had an effect that ranged from -5.8 up to 2.57.
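
The arithmetic behind these totals can be checked with a short sketch. It assumes, as the paragraph above does, that a teacher's net effect in a large section is the teacher coefficient plus that teacher's large-section interaction coefficient from Table 6, and that Teacher M, who taught only large sections, has no separate interaction term. The names `TEACHER`, `LARGE_INTERACTION` and `net_large_effect` are illustrative.

```python
# Coefficients from Table 6 for the five teachers who taught large sections.
TEACHER = {
    "L": 5.36603618,
    "M": 3.67414925,
    "Q": 10.34994654,
    "T": 1.01064676,
    "AA": 0.0,
}
LARGE_INTERACTION = {
    "L": -6.08437615,
    "Q": -5.25948571,
    "T": -3.17380207,
    "AA": 2.61853700,
}


def net_large_effect(teacher):
    """Teacher coefficient plus the large-section interaction, if one exists."""
    return TEACHER[teacher] + LARGE_INTERACTION.get(teacher, 0.0)


# Rank the large-section teachers from most to least effective.
ranking = sorted(TEACHER, key=net_large_effect, reverse=True)
for name in ranking:
    print(f"Teacher {name}: {net_large_effect(name):+.2f}")
```

Under these assumptions teachers Q, M and AA come out with positive net effects in large sections, and teachers L and T with negative ones.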

Also note that the teacher-size coefficient for teacher AA is positive--indicating that teacher AA was actually more effective in the large class than the small one.

Finally, as one would perhaps expect, the variation due to class size (i.e. the variation among the teacher-size interaction terms) was small (6.08) compared to the variation due to teachers (19.97). This helps explain the differing conclusions (and poor fit between model and data) in many existing class size studies that do not account for variation due to teacher--any size effects are completely masked by teacher effects.

Additional Analysis--Dropping out

Drop Model and Analysis

Using a simple logistic regression, we analyze the influence of class size on dropping in both first- and second-semester calculus. For each of the teachers in the University B Calculus I and Calculus II data sets who taught both large and small sections simultaneously, we let $D_{i(j,k)}$ denote the odds that student $i$ in class $j$ of type $k$ (large or small) will drop the class. Let $P_i$ denote student $i$'s pretest score, which will be scaled by a linear factor $\alpha$, and let $S_k$ denote the effect due to being in a class of type $k$ on the odds of dropping.

For each teacher we compare the two models

$$\log(D_{i(j,k)}) = S_k + \alpha P_i + \delta_{j,k} + \epsilon_{i(j,k)} \tag{2}$$

and

$$\log(D_{i(j,k)}) = \alpha P_i + \delta_{j,k} + \epsilon_{i(j,k)}, \tag{3}$$

where $\delta_{j,k}$ is the random error term associated with each class, and $\epsilon_{i(j,k)}$ is the random error associated with each student.
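To make the role of $S_k$ concrete, the following short derivation sets the random terms $\delta_{j,k}$ and $\epsilon_{i(j,k)}$ aside and converts the log-odds of the model with the size term into a probability. The symbol $p_{i(j,k)}$ for the probability of dropping and the labels "large" and "small" for the two values of $k$ are notation introduced here for illustration.

$$D_{i(j,k)} = e^{S_k + \alpha P_i}, \qquad p_{i(j,k)} = \frac{D_{i(j,k)}}{1 + D_{i(j,k)}} = \frac{1}{1 + e^{-(S_k + \alpha P_i)}}$$

For two students with the same pretest score, one in a large class and one in a small class, the ratio of their odds of dropping is therefore

$$\frac{e^{S_{\text{large}} + \alpha P_i}}{e^{S_{\text{small}} + \alpha P_i}} = e^{S_{\text{large}} - S_{\text{small}}}.$$

If $S_{\text{large}} = S_{\text{small}}$, this ratio is 1 and the size term can be absorbed into a common constant, which is the situation described by the model without the size term. Comparing the two models thus amounts to asking whether the data support a ratio different from 1.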

For two of the four Calculus I teachers (teachers Q and AA), the total number of students who dropped was so small (3 of 183 and 8 of 238, respectively) that no conclusions about dropout rates could reasonably be drawn from their classes. For both of the remaining two teachers of Calculus I and the teacher of Calculus II, the Wald chi-square test indicated that size was not significant in the first model (Equation (2)), and the value of c did not change much when size was deleted (Equation (3)). These results are summarized in Table 7.
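As a reading aid for the two quantities used in this comparison, here is a minimal sketch of how they are conventionally computed. It assumes that "Pr > Wald Chi-Square" is the usual one-degree-of-freedom Wald test on a single coefficient, and that c is the concordance index reported by standard logistic regression software; the paper does not define c, so that reading is an assumption. The function names and the numbers in the example are hypothetical and are not taken from the study's data.

```python
import math


def wald_p_value(estimate, std_error):
    """Pr > Wald chi-square for one coefficient (1 degree of freedom)."""
    wald = (estimate / std_error) ** 2
    # With 1 df, P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    return math.erfc(math.sqrt(wald / 2.0))


def concordance_c(predicted, dropped):
    """Share of (dropper, non-dropper) pairs in which the dropper has the
    higher predicted probability of dropping; tied pairs count one half."""
    droppers = [p for p, d in zip(predicted, dropped) if d]
    stayers = [p for p, d in zip(predicted, dropped) if not d]
    score = 0.0
    for p_drop in droppers:
        for p_stay in stayers:
            if p_drop > p_stay:
                score += 1.0
            elif p_drop == p_stay:
                score += 0.5
    return score / (len(droppers) * len(stayers))


# Hypothetical illustration: four students, two of whom dropped.
example_c = concordance_c([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
example_p = wald_p_value(1.959964, 1.0)
print(f"c = {example_c:.2f}, Pr > Wald chi-square = {example_p:.3f}")
```

Under this reading, a c near 0.5 means the model ranks droppers and non-droppers little better than chance, and a large Pr > Wald Chi-Square for SIZE means the data give no evidence that the size coefficient differs from zero.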

**Table 7:** **Dropout odds:** Logistic regression to evaluate the effect of size on a student's likelihood of dropping the class shows that for these teachers there is no significant effect due to class size. Entries for INTERCPT, SIZE, and PRETEST are Pr > Wald Chi-Square. For a discussion of these results see Section 5.1.
	Calculus I, Teacher L, Winter 1998	Calculus I, Teacher T, Fall 1997	Calculus II
Students	218	257	170
Dropped	21	82	36

With size (Equation (2))
c	0.664	0.587	0.654
INTERCPT	0.0061	0.0999	0.4633
SIZE	0.5187	0.2835	0.5509
PRETEST	0.0001	0.4654	0.0036

Without size (Equation (3))
c	0.658	0.541	0.652
INTERCPT	0.0004	0.0976	0.2871
PRETEST	0.0001	0.4179	0.0038

We conclude that, as in the case of achievement, the influence of class size on the odds of students' dropping is small or nonexistent. If it is a factor, it probably varies with the teacher, but it appears to be insignificant for the three teachers involved in this study.

Summary

Our main conclusion is that class size itself has no significant effect on performance or dropout rate. There was a mild teacher-size effect on student achievement, but good teachers in large classes were more effective than most of the teachers in small classes. Of the factors the university can control, the teacher is by far the most important--much more so than class size. Students are better served in a large class with one of the best teachers than in a small class with most other teachers.

These results apply only to the difference between classes of about 30 and classes of about 180. It is very possible (and even likely, based on evidence from elementary schools [Glass and Smith, 1979]) that a significant difference might exist between very small classes (ten or fewer students) and those which are small for a university mathematics class (20-30 students).

It is also important to remember that, although both the preliminary and the main studies found no significant effect due to size of the lecture section, all classes in the main study were supplemented by small review sections (held twice weekly with a teaching assistant). While the preliminary study showed no significant effect due to class size even without these small review sections, the review sections appeared to be helpful to students in both large and small sections alike. Also, as mentioned in the introduction, it is possible that the size of the review section has a significant impact on student achievement, although the size of the main lecture does not. Further research in this direction is warranted.

We conclude that, both in terms of student achievement and also in terms of cost to the university, probably the best strategy for universities offering calculus classes would be to seek out and reward the best teachers of large sections for their work, rather than hiring average or mediocre instructors to reduce class size. Any resources that otherwise might have been used to reduce class size should probably be used instead to reduce the size of review sections and to reward these expert teachers of large sections. This strategy will simultaneously provide better instruction to calculus students and reduce instruction costs.

Acknowledgements

I am grateful to Govinda Weerakoody and David Whiting for help with the models, and to Ralph Brown, Bruce Collings, Tamara Cooper, Missie Elkins, Pedro Geoffrey, Donald Jarvis, Matt Johnson, and Krishnaswamy Venkata for helpful discussions. Finally, I am grateful to Heidi Jarvis for help with typesetting and proofreading.
