Learning by comparing with Wikipedia: the value to students’ learning

The main purpose of this research work is to describe and evaluate a learning technique that actively uses Wikipedia in an online master’s degree course in Statistics. It is based on the comparison between Wikipedia content and standard academic learning materials. We define this technique as ‘learning by comparing’. In order to evaluate the performance of this learning technique, data from different academic semesters was collected. Through different hypothesis tests, the academic performance of the students following a learning-by-comparing strategy is compared with the case where Wikipedia is not used. Additionally, during the course the students are asked about the reliability, currentness, completeness and usefulness of Wikipedia, as rated on a 5-point Likert scale. This data is used to analyse the perceived quality of Wikipedia, for each statistical concept of the course, and to discover its relationship with academic performance. To that end, descriptive statistics, dependence tests, and contrasts of means have been performed.


Introduction
It is well known among lecturers that the complete understanding of a concept or topic commonly arises when one has to explain that concept or topic to students. Is it possible to reproduce such a situation in the learning process of a student? Is it possible to do it on a virtual course? This paper gives a possible answer to those questions, through a case study related to a pilot that was carried out for two semesters on the master's degree programme in Information and Knowledge Society in the IN3 Institute at the Open University of Catalonia (UOC), Spain (http://in3.uoc.edu). The solution proposed in this paper is inspired by the proposal and the positive results obtained in Peters et al. (2013), and is founded on the active use of Wikipedia through a guided-discovery learning strategy. In the case of e-learning, this constructivist strategy is deemed adequate (Moreno & Bailly-Baillière, 2002).
Although Wikipedia is broadly used by students at any academic level as an initial source of information, it is difficult to find higher education courses in which Wikipedia plays a central role in the learning process (Llados et al., 2013). In this paper, we show how it has been used in a course on Advanced Statistics to improve the students' academic performance. During the course, students were asked to answer different questions related to their perceived quality of Wikipedia compared to that of the standard academic learning materials of the course.
In this comparison process, as an important part of their learning performance, they have to provide evidence (examples) about the level of completeness, reliability, currentness and usefulness of Wikipedia in order to support their answers. The students are guided to learn by comparing different sources of information. For each unit of the course, the students are provided with two types of learning resources. On the one hand, students have a Wikipedia article related to the main topic of the unit and, on the other hand, they have several academic documents (from the UOC or from other universities).
In this paper, we will perform two types of analysis. First, we test the influence of the comparative use of Wikipedia in the learning framework. In order to evaluate the performance of this learning technique, data from different academic semesters was collected. The first hypothesis that will be tested is the following:
Hypothesis 1: The students' academic performance (measured with the final grades obtained in the course) is better in those semesters when Wikipedia is used than in the previous ones.
Second, we analyse the students' perceived quality of Wikipedia and its relationship to the learning process. Since the assessment activities of the course focus on very different quantitative concepts, it is important to understand if the perceived quality changes across topics. We are also interested in determining if it depends on the type of students, where type is determined by the students' grades across their assessment. The hypotheses associated with these initial statements and tested in this paper are the following:
Hypothesis 2: Students agree with the idea that Wikipedia has a good level of quality.
Hypothesis 3: The perceived quality of Wikipedia does not depend on the students' grades.
This paper is organised as follows: using the framework of constructivist learning theory, Section 2 defines the learning-by-comparing technique. Section 3 describes the application of this technique to a practical case.

Learning by comparing
Research in analogy shows that comparing two instructional examples facilitates knowledge transfer (Gill, 2011).
One of the main objectives of education is to facilitate the ability to apply knowledge to different situations, which is central to learning abstract concepts in the framework of analogical reasoning (Richland et al., 2004).
In analogical reasoning, students are required to develop the ability to find underlying structural similarities among corresponding objects (Holyoak & Thagard, 1995). In this paper we propose a learning technique that is based on this analogical reasoning. In the learning-by-comparing technique, students have to analyse both the similarities and the differences between two (or maybe more) learning materials that explain the same topic. They have to compare different ways of describing the same subject.
Academic literature provides evidence that students gain robust learning when analogical reasoning is promoted in general (Gentner & Namy, 2006), and also in the specific field of statistics (Thomas, 2008).
Comparison between analogues also yields stronger solution schemas (Gill, 2011). In this paper, we will test those results, in the special case of a virtual course on Quantitative Methods.
In educational research, there is still debate about how comparisons should be made and what the appropriate level of direct instruction is to optimally facilitate that comparison (Koedinger & Aleven, 2007). Constructivist learning theory states that guided-discovery learning is the optimal solution to those questions, where few instructions are given to the students (Mayer, 2004). We will place our theoretical framework here, where there is a balance between the pure constructivist perspective and the instructional approach. In the case of e-learning, this is the most adequate option (Moreno & Bailly-Baillière, 2002). The active use of information and communication technologies (ICTs) in the virtual learning environment for different purposes (communication, text navigation, information access, etc.) results in a very positive experience of learning to build knowledge (Hernández Requena, 2008).

Course description
In the master's degree programme in Information and Knowledge Society at IN3-UOC (http://in3.uoc.edu), a course in which Wikipedia is actively used as a learning tool has been developed: "Advanced Quantitative Methods in Knowledge Society Research".
One of the main objectives of this course is to complement the statistical knowledge developed in previous basic quantitative courses by obtaining a good knowledge of some of the most relevant advanced quantitative techniques, their advantages and disadvantages, their applicability according to the type of data and objects of study, and their complementarity. With these techniques, the students perform various assessment activities by using different statistical packages and discussing possible relationships of dependence or interdependence between variables.
Although it is a practical course, where each technique is applied to particular cases with real data, the students also have basic references, in the form of both Web materials and a recommended bibliography, to understand the theoretical foundations of each technique.

Course methodology
The learning methodology of this course is that of a continuous assessment activity. This activity is a learning strategy integrated into the learning process, conceived as a mechanism for learning and giving reciprocal feedback. It is the most appropriate strategy in the constructivist learning methodology within the framework of e-learning (Jonassen et al., 1999). There are four assessment activities, one for each unit of the course. Each activity has a theoretical part and an applied part. The design of these assessment assignments aims to contribute to the following learning objectives and competences of the master's degree programme of which this course forms part:
• Good knowledge of the most relevant quantitative and qualitative techniques, their advantages and disadvantages, their applicability according to the type of data and objects of study, and their complementarity.
• Ability to determine the feasibility and reliability, and the strengths and weaknesses of different methods and techniques.
• Awareness of the possibilities, opportunities and issues posed by empirical analysis of the Internet and other ICTs.
• Mastering of a statistical suite that facilitates the application of statistical techniques, analysis of data and drawing of conclusions.
To answer the questions proposed in the assessments, the students are provided with the following learning resources:
Theoretical part:
• Wikipedia: this free encyclopaedia is used to introduce different theoretical concepts.
• Learning materials: including some parts of books, or other Web materials. These are used to give the students the foundations of each statistical technique. These materials also introduce the student to the basic concepts associated with each technique. On average, three references are given per unit.
Applied part:
• Research article: an article in which the technique of the unit is used to test the hypothesis. The discussion of the article, through the questions stated in each problem set, is the centre of each assessment activity, and will help students understand its benefits and disadvantages.
• A statistical package and data: since this course is oriented towards the application of the proposed techniques, statistical packages are needed in order to do computations. Different statistical packages are used (such as Gretl, MX and JavaNNS), depending on the characteristics of each topic, to analyse the data and to complement the discussion of the reference article.
In the theoretical part, the students are asked about the similarities between the two sources of information, concerning some aspects of the quality of the information. They have to compare the documents in which the same topic is explained. With respect to the applied part, they have to reproduce the computations that are performed in the research article, but using different variables in the datasets associated with the research article or even different datasets. Finally, they have to interpret the results obtained with the statistical package and to compare them with those proposed in the research article.

Data sources
In order to analyse the active use of Wikipedia in the learning process of the course, we considered two different data sources. The first one is the data obtained from a questionnaire that was introduced in each of the four assessment activities. The students were asked about their perceptions of Wikipedia with respect to the following four quality aspects: completeness, reliability, currentness and usefulness. Students are required to answer these questions by comparing Wikipedia with the "standard" learning materials, and to provide evidence (examples) to support their answers. To facilitate the statistical analysis, they have to rate their answers on a 5-point Likert scale (where 1 = "completely disagree" and 5 = "completely agree").
The second source of information is the students' academic grades in each assessment activity and the final grade for the course. This data will allow us to evaluate the performance of this learning technique. The grades range from 1 (D, poor grade) to 5 (A, excellent grade).
All this data was collected from different academic semesters in order to compare the results obtained. We have considered the second semester of the 2011/2012 academic year, when Wikipedia was not used, and the two semesters of the 2012/2013 academic year, when Wikipedia was actively implemented in the learning technique. Table 1 shows the number of students enrolled on the courses considered in the analysis.

Results
First, a preliminary analysis of the final grades of the course (ranging from 1 to 5), based on continuous assessment, is conducted across the different learning strategies (that is, without and with the Wikipedia comparisons). The results of the ANOVA test in Table 2 show that there is a difference that is significant at the 10% level (p-value = 0.073) between the mean of the two semesters when Wikipedia was used (W=1) and that of the earlier semester, when it was not used (W=0).
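As an illustration of the comparison reported in Table 2, the following minimal sketch computes the one-way ANOVA F statistic for two groups of final grades. The grade lists are hypothetical and are not the data of the study; the function name `one_way_anova` is likewise an assumption made for this example. The sketch stops at the F statistic and its degrees of freedom; the corresponding p-value would be read from the F distribution with a statistical package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical final grades on the 1-5 scale (NOT the data of the study).
grades_without_wikipedia = [4, 5, 3, 4, 5, 4]  # W=0
grades_with_wikipedia = [5, 5, 4, 5, 4, 5, 5]  # W=1

f_stat, df_b, df_w = one_way_anova(
    [grades_without_wikipedia, grades_with_wikipedia]
)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

With only two groups, this F statistic equals the square of the pooled two-sample t statistic, so both tests lead to the same decision.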
Since students are not assigned randomly to the groups, factors other than the use of Wikipedia may have an influence on the explanation of the difference obtained in the students' academic performance (Rosenbaum, 2002).
For the purpose of this paper, and taking into account that the process of recruitment and the structure of the master's degree course are the same for the analysed semesters, we will assume that there is some homogeneity between the students' characteristics. This assumption will allow us to focus our attention on the most relevant difference between semesters: the active use of Wikipedia.
Hence, we can infer that the different academic outcomes are clearly influenced by the use of these two different learning strategies. Furthermore, since the mean when Wikipedia was used (4.61) is greater than in the case where it was not (4.29), there is a positive effect of the active use of Wikipedia on the students' academic performance. This is one of the main results in this paper (Hypothesis 1), and is consistent with the literature showing students gain robust learning when analogical reasoning is promoted (Gentner & Namy, 2006).
After checking the suitability of the proposed technique, we wanted to know if this positive result depended on the different topics of the course. By comparing the means of the students' results in group W=1 in each assessment (AA1=3.83, AA2=4.28, AA3=4.20, and AA4=4.12) through a two-sample t-test, we find that there are no significant differences between them. All p-values are greater than 0.1 (see Table 3).
The perceived quality of Wikipedia is measured by four different factors: the perception of its completeness, its reliability, its currentness and its usefulness (Chuttur, 2009). If we consider that there are no significant differences between assessments (based on the results in Table 3), we can aggregate the perception of each factor through the assessments, and analyse whether these four factors measure a unique construct (the quality of Wikipedia).
As can be seen in Table 4, Cronbach's alpha associated with these items is equal to 0.76, which is greater than the required 0.70 level (Cronbach, 1947). Concerning the item-total correlation, we find that all values are above 0.60, the recommended level for field studies (Ahn et al., 2007). Furthermore, the factor loadings are also greater than the recommended value of 0.60 (Ahn et al., 2007). Hence, from the internal point of view, the construct "Quality of Wikipedia", measured through these four factors, is deemed adequate.
In the last row of Table 5, we can see that students, on average, agree that Wikipedia is complete, reliable, current and useful, since each factor's mean is above 3 (answers ranged from 1="completely disagree" to 5="completely agree"), and closer to 4. Taking into account the results in the previous item analysis, we can affirm that students agree with the idea that Wikipedia has a good quality level (Hypothesis 2). The most valued factor in each assessment, and also in the aggregated case, is always currentness (clearly above 4). Compared with classic academic resources, the students perceive that Wikipedia is current and contains recent information and recent references. This result does not coincide with the opinion of faculty members from different knowledge areas. Preliminary conclusions about the quality of Wikipedia in a more general study (Project WIKI4HE, http://oer.uoc.edu/wiki4HE/about/) show that currentness is the worst valued factor. Only 14% of respondents agree that Wikipedia is up to date.
The least valued factor by students is Wikipedia's completeness. Although all figures are above 3, it seems that this is the factor that has to be improved in order to enhance the perceived quality of Wikipedia. The students believe that the academic learning resources are more complete than Wikipedia. In any case, we have to bear in mind that Wikipedia is just an encyclopaedia, whose aim is to facilitate a good introduction to the subjects. Its aim is not to give a complete view of a topic.
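The internal-consistency check reported in Table 4 relies on Cronbach's alpha, which for k items is k/(k-1) multiplied by one minus the ratio of the sum of the item variances to the variance of the respondents' total scores. The sketch below applies this standard formula to hypothetical Likert ratings of the four quality factors; the ratings and the function name `cronbach_alpha` are assumptions for illustration, not the data or code of the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of k lists, one per item, scored by the same respondents."""
    k = len(items)
    # Total score of each respondent across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 1-5 Likert ratings from six respondents (NOT the study data).
completeness = [3, 4, 3, 2, 4, 3]
reliability = [4, 4, 3, 3, 5, 3]
currentness = [5, 4, 4, 3, 5, 4]
usefulness = [4, 5, 3, 3, 5, 4]

alpha = cronbach_alpha([completeness, reliability, currentness, usefulness])
print(f"Cronbach's alpha = {alpha:.2f}")
```

A value above the 0.70 threshold mentioned in the text would suggest that the four ratings can reasonably be treated as measuring a single construct.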
Finally, we wanted to check if there was some relationship between the perceived quality of Wikipedia and academic performance in each assessment. To that end, we will compare the mean of the perception of each quality factor for each possible grade level. In Table 6 the results of the ANOVA test are presented. Since all p-values are clearly greater than 0.05, we can conclude that there is no difference between groups. Hence, the perception of quality does not depend on the students' grades (Hypothesis 3). This shows the robustness of the measurement of the perception of each factor, and the strong effect of Wikipedia on the improvement of academic performance (tested in Table 2).

Conclusions
The main result of the paper shows that there is a significant difference between the two semesters of the 2012/2013 academic year, when Wikipedia was used, and the previous semester, when it was not. The active use of Wikipedia in the learning process, through the learning-by-comparing technique, improves the students' academic performance.
This conclusion also bears out the known result that students gain robust learning when analogical reasoning is promoted (Gentner & Namy, 2006).
The main findings on the students' perceived quality of Wikipedia indicate that they agree with the idea that the encyclopaedia is complete, reliable, current and useful. Although there is a positive perception of quality, there are some quality factors that obtain better scores than others. The most valued quality aspect was the currentness of the content, and the least valued was its completeness. We have shown that this result is robust, since it does not depend on the topic studied or the grades obtained by the students.
This work also illustrates that Wikipedia can be viewed not just as an encyclopaedia that can be used occasionally to resolve certain doubts, but also as an active element of the learning process. In this paper, we have shown how it can be implemented in an online course, through the learning-by-comparing technique, and its apparent positive effect on student learning.
A limitation of this analysis is that it has been performed on a specific type of course (online master's degree course in Statistics) and on small groups of students. Future research should apply this learning technique to other knowledge areas and bigger groups of students from different higher education levels to assess its usefulness.
Another limitation of the research results is the assumption that the only difference between groups has to be associated with the active use of Wikipedia. Although there are no "external" differences between semesters (for example, in the process of recruitment of the students, or in the structure of the master's degree course, etc.), there may be other "internal" factors (apart from the use of Wikipedia) influencing the students' academic performance.
While the most relevant difference between semesters is the active use of Wikipedia, we have to take into account that there may be significant differences in those factors associated with the students' personal characteristics. The students differ from one semester to the other. Since we do not have data about those factors, it has not been possible to incorporate them into the paper's discussion. In future research, something that needs to be tested is whether the differences between groups can also be explained by other internal factors such as these. Additionally, to avoid the potential bias in the estimated effects of the use of Wikipedia due to the fact that the students are not randomly allocated to a group (W=0 or W=1), adjustments for the propensity score should be made in future research (Rosenbaum & Rubin, 1983).
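To make the proposed propensity-score adjustment more concrete, the sketch below uses inverse-probability weighting, which is only one of several possible propensity-score adjustments and is not a method applied in this paper. It assumes a single categorical student characteristic (here a hypothetical "background" label), estimates the propensity score as the share of W=1 students within each category, and compares the weighted mean grades of the two groups. All records, labels and the function name `ipw_effect` are invented for illustration.

```python
from collections import defaultdict


def ipw_effect(records):
    """records: (stratum, treated, outcome) tuples, with treated in {0, 1}.

    Returns the difference between the inverse-probability-weighted mean
    outcomes of the treated (W=1) and untreated (W=0) groups.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_treated, n_total]
    for stratum, treated, _ in records:
        counts[stratum][0] += treated
        counts[stratum][1] += 1

    weighted_sum = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for stratum, treated, outcome in records:
        n_treated, n_total = counts[stratum]
        propensity = n_treated / n_total  # estimated P(W=1 | stratum)
        # Each record is weighted by the inverse probability of its own group.
        weight = 1 / (propensity if treated else 1 - propensity)
        weighted_sum[treated] += weight * outcome
        weight_total[treated] += weight
    return weighted_sum[1] / weight_total[1] - weighted_sum[0] / weight_total[0]


# Hypothetical records: (background, W, final grade). NOT the study data.
records = [
    ("strong", 1, 5), ("strong", 1, 5), ("strong", 0, 4),
    ("basic", 1, 4), ("basic", 0, 3), ("basic", 0, 3),
]
print(f"Weighted difference in mean grade: {ipw_effect(records):.2f}")
```

In this invented example the unadjusted difference in means is larger than the weighted one, because the W=1 group contains more students with a strong background. The weighting only makes sense when every category contains students from both groups.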
Finally, given the observed discrepancy between the students' and the faculty members' perceived quality of Wikipedia (analysed in the project WIKI4HE), understanding the differences will also be part of the future research plan.