BY MAI MIKSIC | In the early 2000s, the performance of Quebec’s students rivaled that of other top-performing countries on international tests. However, these numbers were deceptive in that they masked Quebec’s high dropout rates. Even today, dropout rates are disappointing, so much so that the government has been accused of hiding the actual numbers. Government officials knew they needed to address this issue, and attempted to do so by implementing the Quebec Education Program (QEP). Up until now, we have largely been in the dark about the effects of this reform on student performance.
A new study by Haeck, Lefebvre, and Merrigan (2014) analyzed the data and reported on the findings. The authors found strikingly negative results, suggesting that Quebec’s administrators rushed too quickly to implement an instructional approach that had little empirical support. At the same time, given that these results are the first to come out of the reform, one can ask whether the authors were too zealous in drawing their conclusions. An in-depth analysis of Haeck et al.’s (2014) study is necessary in order to determine the reliability of their results and whether comparisons to the United States are warranted.
The Quebec Education Program
The QEP was based on a socio-constructivist teaching approach, which “focused on problem-based and self-directed learning…moved away from the traditional/academic approaches of memorization, repetitions, and activity books, to a much more comprehensive approach focused on learning in a contextual setting in which children are expected to find answers for themselves” (Haeck, Lefebvre, & Merrigan, 2014, p. 139). As a result, students took on a more active role in determining the direction of their learning. Students were expected to be able to build on previous knowledge and incorporate new knowledge into their understanding of the concepts. Cross-curricular competency was also emphasized.
All schools, both public and private, were required to implement the new education program. Given that education is regulated at the provincial level, Quebec’s education reform affected only students in the province, and students from other provinces continued as they were. All children in Quebec were exposed to the reform according to the same timeline, such that parents were not able to self-select their children out of the reform, except by moving out of the province. Haeck et al. (2014) were confident that the centralized nature of the reform ensured that implementation was uniform across the entire province, although there was no actual data to back this up.
Importantly, there was also no formal monitoring of the implementation of the reform, which limits our understanding of the mechanism of the implementation. What we do know is that province-wide examinations were adapted to the new curriculum. As a result, we can infer that teachers tended to “teach to the test” in order to ensure that their students maintained adequate test scores. It is pertinent to recall, however, that no consequences or incentives were tied to teacher performance.
Teachers and administrators were provided with training prior to implementation. Principals and teachers started planning for the implementation in June 2000 and enacted their plans in September 2000. Administrators and teachers were allowed to modify the approach to meet the needs of their particular school. Finally, education counselors were assigned to each school in order to support teachers as they adapted to the new teaching methods.
Haeck et al. (2014) used the National Longitudinal Survey of Children and Youth (NLSCY) as their primary dataset for analyses. The NLSCY provided a sample of 7,745 students in Quebec, who were compared to 33,390 students from the rest of Canada. The authors conducted a descriptive analysis of the students’ characteristics in order to determine whether students in Quebec were statistically significantly different from students in the rest of Canada. Results showed that the two samples were comparable, with no significant differences in cognitive ability (as measured by the Peabody Picture Vocabulary Test) or family characteristics (income, parent education, family formation, etc.). Still, the authors included these factors as controls in their analyses in order to factor out their influence on the results.
The main outcome of interest in this study was the NLSCY math test scores. Math scores were collected for the children between 1996 and 2008, although test scores were only available every other year. The main cohort used for analyses started first grade in either 1999 or 2000. In order to further validate the findings, making sure that results were not unique to this specific cohort, the authors also examined other cohorts: children who entered first grade in 2003-2004, those who entered in 2005, and those who entered in 2007. Due to the nature of the dataset, the authors had to adjust their setup, which complicated the available outcomes. In one of the cohorts, the years 1996 and 1998 were dropped because of a low response rate; as a result, the authors only had data from the base year 2000, when the students were in grades 3-4 and grades 5-6.
It is important to note that the NLSCY was not collected with the intention of measuring the outcomes of the QEP. The tests used in the NLSCY were not the tests administered by the province to evaluate the progress of students under the reform. Remember, the province’s tests were adapted to measure the outcomes of the new curriculum. For example, the adapted tests used by the province’s administrators might measure students’ critical thinking skills by requiring them to analyze passages, whereas the test used in the NLSCY might simply be a series of multiple choice questions. Therefore, the test results in the NLSCY may not be an appropriate measure of the outcomes of interest for the reform, but they do offer an independent source of data through which to evaluate the policy.
This study’s greatest strength is the methodology used for analyzing the data. The authors employed a difference-in-differences (DID) method, which is often used by economists to examine the causal effects of policy reforms. DID is used when there are two groups being compared, one of which is exposed to the new policy while the other group is not; there must be two or more time periods in which the outcome variable has been measured. In this case, the children exposed to Quebec’s education reform (the treatment group) were compared to the rest of the children in Canada (the comparison group). Within a regression framework, the average change in scores for children in the rest of Canada is subtracted from the average change for children in Quebec; the difference that remains is the estimated effect of the reform.
The advantage of this approach is that it eliminates biases that result from permanent differences between the treatment group and the comparison group, as well as trend changes that could also affect the outcomes. This method can factor out the effects of other policy changes, such as Quebec’s childcare reform in 2000, which could have affected children’s outcomes, something that merely using control variables cannot do. Additionally, since this method is regression based, control variables can also be employed to adjust for personal characteristics; as mentioned above, the authors controlled for factors such as family income and parent education.
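To make the DID logic concrete, here is a minimal sketch in Python using simulated data. All of the numbers (score levels, the +10 common trend, the -15 "reform effect") are invented for illustration and are not taken from the study; the point is only to show how permanent group differences and shared trends cancel out of the estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # hypothetical number of simulated students per group and period

# Invented score distributions: a common +10 time trend for everyone,
# a permanent +5 level gap for the rest of Canada, and a -15 "reform
# effect" baked into Quebec's post-reform scores.
quebec_pre  = rng.normal(500, 50, n)
quebec_post = rng.normal(495, 50, n)   # 500 + 10 (trend) - 15 (reform)
rest_pre    = rng.normal(505, 50, n)
rest_post   = rng.normal(515, 50, n)   # 505 + 10 (trend), no reform

# Difference-in-differences: the change in Quebec minus the change in
# the rest of Canada. The +5 level gap and the +10 shared trend cancel.
did = (quebec_post.mean() - quebec_pre.mean()) - (rest_post.mean() - rest_pre.mean())
print(round(did, 1))  # recovers roughly the simulated -15 reform effect
```

Note that a simple before-after comparison within Quebec alone would have returned about -5 (trend plus reform confounded); subtracting the comparison group's change is what isolates the reform.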
While DID is a sophisticated econometric method for determining the causal effects of policy reforms, it is not without weaknesses. As a result, in addition to DID, the authors used a change-in-changes (CIC) approach, which addresses some of the faults of DID. While a discussion of the technical differences between the two methods is outside the scope of this article, one can generally say that CIC relaxes some of the assumptions that are fundamental to DID. For example, DID assumes that the effects of the policy reform are additive; CIC relaxes this assumption. In sum, CIC not only measures the average treatment effect but can also measure effects at different points of the outcome distribution.
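A simplified sketch of the CIC idea (in the spirit of Athey and Imbens' estimator, not the authors' actual implementation) is shown below. The function name and all data are hypothetical: each treated pre-period outcome is mapped to its rank in the comparison group's pre-period distribution, then to the comparison group's post-period outcome at that same rank, which yields a counterfactual for what the treated group would have looked like without the policy.

```python
import numpy as np

def cic_effect(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Simplified change-in-changes point estimate (illustrative only).

    The counterfactual post-period outcome for each treated pre-period
    observation is the control post-period quantile at that observation's
    rank in the control pre-period distribution.
    """
    ranks = np.searchsorted(np.sort(ctrl_pre), treat_pre, side="right") / len(ctrl_pre)
    ranks = np.clip(ranks, 0.0, 1.0)
    counterfactual = np.quantile(ctrl_post, ranks)
    return np.mean(treat_post) - np.mean(counterfactual)

# Hypothetical data: controls gain +2 over time; the treated group is
# additionally shifted by -0.5 (the simulated policy effect).
rng = np.random.default_rng(1)
effect = cic_effect(
    treat_pre=rng.normal(1, 1, 4000),
    treat_post=rng.normal(2.5, 1, 4000),  # 1 + 2 (time) - 0.5 (policy)
    ctrl_pre=rng.normal(0, 1, 4000),
    ctrl_post=rng.normal(2, 1, 4000),
)
print(round(effect, 2))  # close to the simulated -0.5 effect
```

Because the mapping works quantile by quantile, the same machinery can report effects separately for low- and high-ranked students rather than a single average, which is how the study could distinguish effects on lower- versus higher-performing students.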
There is a glaring omission in the analyses that would alarm any researcher: there was essentially no information about teachers. The NLSCY only had data on teachers up until 2002 (the analyses extend through 2008), and the teacher response rate was less than 50 percent. Such a low response rate means the data were not representative of teachers in Quebec. The omission of teacher quality is especially concerning when analyzing the effects of an education reform. Researchers typically want, at minimum, teachers’ years of education or experience in their regression equations in order to control for the influence of teacher quality. Another concern that needs to be addressed is whether teacher quality in Quebec changed in any substantial way during the reform compared to the rest of Canada. So, what is the quality of teachers like in Quebec?
Here is what we know about teachers in Quebec. Teachers are required to have a four-year bachelor’s degree and be certified in order to teach. The authors of this study rightly point out that teachers self-select according to their teaching preferences, whether by the level at which they want to teach (kindergarten through secondary school) or by their main teaching domain (language arts, math, science, etc.). As a result, it is assumed that teachers select a field in which they both enjoy teaching and excel. According to the authors, teachers are hired based on their “potential teaching skills” (however that is measured) and strict seniority rules (years of experience), governed by collective bargaining regulations.
Haeck et al. (2014) state that they could not find any evidence that less experienced teachers were assigned to reform groups, and they believe there is no reason to think that such a strategy was adopted across the province. This is confusing, considering the reform was supposed to have been uniformly implemented across the province. Importantly, the authors do not state how exactly they went about determining this, given that there was barely any information about the teachers in the first place. Also, the authors do not mention (and we can assume they did not investigate) the level of teacher support for the reform. As a result, they infer that teacher support and quality are of no concern and did not affect their results.
However, common sense tells us that teacher support and quality could be important when it comes to implementing a reform. First, newer teachers may be more open to new methods of teaching, while more experienced teachers may have established instructional methods and resist new ones. If there is resistance to the reform, then there is reason to believe that its implementation may be compromised. After all, there is no data on whether the reform was implemented as intended. A simple internet search reveals that there was indeed a great deal of resistance from teachers and teacher unions. Thus, the fact that teacher quality and support were not included in the analyses has major implications for how we interpret the results of this study.
While it is not necessarily the authors’ fault that there was essentially no data on teachers, the fact that the authors so cavalierly dismissed the issue is disturbing. Most researchers would include some statement cautioning readers about interpreting the results in light of this lack of information. This hints at the possibility that the authors were biased in favor of or against the reform.
DID and CIC produce highly technical results that are difficult to translate into an everyday metric. As a result, the specifics of the results will not be discussed in depth. Instead, the general findings will be reported and their implications discussed.
The authors reported their main findings to be negative: students did poorly on math tests over time. However, this result is rather misleading, as we will see when we examine the CIC results. According to the authors, the DID results indicated that the reform had consistent, statistically significant negative effects on students’ math scores. The authors claimed that results were the same regardless of whether control variables were used. Additionally, it seemed that lower performing students were more affected by the reform than higher performing students. The authors found that the negative results were stronger in high school than in the lower grades, suggesting a cumulative effect.
To correct for some of the weaknesses of the DID approach, the authors performed a more comprehensive CIC analysis, which revealed more specific findings. The CIC results exposed one of the fundamental flaws of this study: the authors failed to emphasize, in their abstract, introduction, and conclusion, the differential effects on students based on length of exposure to the reform. Remember, the authors looked at several different cohorts of students, who were exposed to the reform for different lengths of time.
Interestingly, the CIC analysis showed that the short term and long term results differed. The short term results, collected from students who were exposed to the reform from fifth grade onward, showed negative findings consistent with the DID results. However, the CIC results showed that the cohort exposed to the reform from second grade through high school (eight years) did not seem to be affected by it. This does not mean, however, that there were any positive effects, simply that there were no effects.
There are a number of possible reasons for these differential effects; for example, teachers may have taken longer to adapt to the new instructional methods. On the flip side, it is possible that teachers realized the reform was not working and reverted to their previous teaching methods. Without any teacher data, though, there is no way to tell what happened. One thing is for sure: the fact that the authors highlight mainly the negative findings indicates a possible bias. An impartial reporting of the results would have emphasized that the results were mixed and that firm conclusions cannot be drawn from them.
The authors of this study make the case that the results of the QEP have practical implications for comprehensive school reform in the United States. This conclusion is questionable, given the differences between Quebec, and more generally Canada, and the United States. The United States has had to contend with not only high dropout rates but also low performance on international tests, whereas Canada has had consistently higher PISA rankings than the U.S. Would a socio-constructivist instructional curriculum work in the United States? There are far too many variables to consider to jump to that conclusion.
More broadly, does a socio-constructivist curriculum work in general? The authors seem to believe that the reform did great harm to students’ learning. However, I argue that the lack of teacher data is a serious omission that could not only influence the outcomes of this study but also mask the mechanism through which the reform worked. Additionally, the long term results of the reform were null, not negative. Further, we know nothing about how the reform affected students’ reading abilities. Also, recall that the authors highlight that Quebec’s administrators’ main concern was the high dropout rate. Yet this study used math test results instead of dropout rates, which may not be relevant to the actual outcome of interest. After all, we know that Quebec had high test results to begin with. In sum, the authors highlight the negative results of this study when, in actuality, they jumped to a sweeping conclusion that was not necessarily backed up by their results.
In the end, though, the authors should be commended for the statistical methods used in this study. Conducting both DID and CIC analyses was comprehensive. Future studies should use similarly rigorous methods to analyze socio-constructivist instructional reforms. If any strong conclusion can be drawn from this study, it is that there needs to be more research on this topic. It is just as important to determine what doesn’t work in closing the achievement gap as it is to determine what does.
Haeck, C., Lefebvre, P., & Merrigan, P. (2014). The distributional impacts of a universal school reform on mathematical achievements: A natural experiment from Canada. Economics of Education Review, 41, 137-160.