Statistics Education and Educational Research based on Reproducible Computing

Patrick Wessa, Bart Baesens, Stephan Poelmans, Ed van Stee

K.U.Leuven Association, Integrated Faculty of Business and Economics



Workshop website: http://www.freestatistics.org/workshop/



Introduction

What is the focus of this workshop?

This workshop may be useful for anyone with an interest in one of the following three topics:

The main focus of this workshop is on Statistics Education or any type of education where students need to be able to interact with and communicate about empirical research results. In this sense, the workshop may be of interest to academics from various fields.

The second focus is on Educational Research and Quality Control (of the course environment). The workshop illustrates how learning outcomes (as measured by objective exams) can be related to (or predicted by) various factors, such as objectively measured learning activities, learning attitudes, and social interaction/networking. The model that describes such a relationship is useful for research purposes and allows us to control and improve the quality of our education.
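To make the idea of relating exam scores to a measured factor concrete, here is a minimal Python sketch of ordinary least squares with a single predictor. The numbers are made-up toy values, not data from the course, and the names `fit_line`, `activity`, and `exam_score` are hypothetical; the models referred to in this workshop may involve more factors and different methods.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy values (assumption): a count of learning activities per student
# and an exam score constructed to lie exactly on a straight line.
activity = [1, 2, 3, 4, 5]
exam_score = [8, 10, 12, 14, 16]

intercept, slope = fit_line(activity, exam_score)

# Predicted exam score for a hypothetical student with 6 activities.
predicted = intercept + slope * 6
```

With real course data the points would not lie exactly on a line, and the residuals would indicate how much of the outcome the chosen factor leaves unexplained.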

Anyone with an interest in presenting Empirical Research results in a form that allows readers to fully reproduce and reuse the underlying computations may find this workshop useful too. The pedagogical focus of this workshop does not imply that Reproducible Computing technology is solely useful for educational purposes.

What is new? How does this relate to previous research?

The novelty of our newly developed Reproducible Computing technology (see footnote 1) lies in the fact that it empowers students (and the educator) to easily archive, exchange, reproduce, and reuse R computations [21], [22], [32]. This technological innovation allows us to create and maintain a learning environment that supports social constructivism, which can be shown to be very helpful in learning statistics [23], [33]. The basic idea is to create an environment where students are allowed to interact with each other (and the tutor) about (a series of) research-related activities (such as assignments or workshops) based on the R language [16] and the R Framework (see footnote 2).
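As a rough illustration of the archive, reproduce, and reuse cycle described above, the following Python sketch stores a computation (its code plus parameters) under a content-derived key, re-runs it, and derives a new computation with changed parameters. It is not the implementation of our technology: the names `StoredComputation`, `Archive`, `reproduce`, and `reuse` are hypothetical, the in-memory dictionary stands in for permanent storage, and Python is used instead of R only to keep the sketch self-contained.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class StoredComputation:
    """A snapshot of a computation: source code plus the parameters used."""
    source: str   # code that defines a function named `compute`
    params: dict  # keyword arguments passed to `compute`

    def key(self) -> str:
        # Derive a short identifier from the code and the parameters.
        payload = json.dumps(
            {"source": self.source, "params": self.params}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

    def run(self):
        # Execute the stored code in a fresh namespace and call `compute`.
        namespace = {}
        exec(self.source, namespace)
        return namespace["compute"](**self.params)


class Archive:
    """In-memory stand-in (assumption) for a permanent computation store."""

    def __init__(self):
        self._store = {}

    def archive(self, computation: StoredComputation) -> str:
        key = computation.key()
        self._store[key] = computation
        return key

    def reproduce(self, key: str):
        # Re-run the archived computation exactly as it was stored.
        return self._store[key].run()

    def reuse(self, key: str, **changed) -> str:
        # Same code, modified parameters, archived as a new computation.
        original = self._store[key]
        derived = StoredComputation(original.source, {**original.params, **changed})
        return self.archive(derived)


MEAN_SOURCE = "def compute(values):\n    return sum(values) / len(values)\n"

archive = Archive()
key = archive.archive(StoredComputation(MEAN_SOURCE, {"values": [2, 4, 6]}))
result = archive.reproduce(key)

new_key = archive.reuse(key, values=[1, 2, 3, 6])
new_result = archive.reproduce(new_key)
```

The point of the design is that a reader only needs the key (in practice, a URL) to repeat the original computation or to vary it, without access to the author's machine.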

Within the context of ICT-based and math-related education, the academic community has shown great interest in the role and importance of social and individual constructivism ([19], [18], [4], [12]) and its implementation in statistics education in particular [13].

The following quote summarizes the importance and the great interest of educational researchers in constructivism [10]:

Constructivism is a philosophy that supports student construction of knowledge. Since students uniquely construct their knowledge, instructional strategies that support constructivist philosophies naturally advocate student understanding. Instructional trends in the mathematics and statistics education communities support the active-learning orientation of constructivist philosophy. I posit that, while not the only philosophy of teaching and learning, constructivism is one of the best such philosophies. One question remains: "How do instructional strategies that support student knowledge construction address the needs of all students?"

In September 2007, our early research results were presented at the Applied Statistics conference: the relationships between students' learning attitudes [11], social interaction (through group work and Peer Assessments), learning experiences [11], and exam scores were investigated [20]. One of the conclusions in the presentation was that social interaction through Peer Assessment (which is used as a "learning activity" rather than an "evaluation tool") was very beneficial for the learning experiences of students, which in turn are correlated with final exam performance.

In the presentation it was also concluded that the main disadvantage of the proposed constructivist approach to statistics education lies in the fact that students (and the educator) have to assess a series of workshop submissions that are (almost) irreproducible. Solving the difficulties that are involved in reproducing the research results from students is a "conditio sine qua non" if the constructivist approach to statistics learning is to be used on a large scale.

Another important aspect of this problem is related to the fact that educators are only able to assess the output (= submitted paper) when they request students to work on an assignment. The educator has a pretty good idea of what the learning goals are and what the end result should be. There is, however, no information about the learning/research process that leads to the result. Therefore, the educator is unaware of any difficulty that might have occurred during the process:

The Compendium Platform solves all of these problems through its underlying Reproducible Computing technology. The main benefits of Reproducible Computing are based on the fact that it effectively supports:

Goals of the Workshop

These are the goals of the workshop:

Detailed outline of the Workshop

Every session takes about 50 minutes, depending on the feedback we receive from registered participants. There is a 10-minute break between sessions. The detailed outline shown below is subject to change and primarily depends on the reported interests of participants. Registration is required and participants are asked to send us feedback about their interests through an online voting system: aspects with a high number of votes are emphasized during each session. Participants who use R scripts in education/research are encouraged to send us samples so that we can integrate them in the workshop.

Session 1. Brief description of the underlying technology and pedagogical aspects of Reproducible Computing:


Session 2. Hands-on session (create your own calculations and use them in your research texts or courses). Note: a comprehensive multimedia tutorial will be made available on CD to all participants.


Session 3. Hands-on session (requires some basic knowledge of the R or S-PLUS language):

Requirements & Limitations

The following requirements and limitations apply:

Acknowledgments

This research is funded by the OOF/13 2007 grant and supported by the K.U.Leuven Association.

References

[1]. de Leeuw J., “Reproducible research: the bottom line,” in Department of Statistics Papers, 2001031101, Department of Statistics, UCLA, 2001

[2]. Conole, G., Dyke, M., Oliver, M., and Seale, J.: Mapping pedagogy and tools for effective learning design, Computers & Education 43, 2004

[3]. Donoho D. L. and X. Huo, “Beamlab and reproducible research,” International Journal of Wavelets, Multiresolution and Information Processing, 2004

[4]. Eggen P. and D. Kauchak, Educational Psychology: Windows on Classrooms. Upper Saddle River, NJ: Prentice Hall, 5th ed., 2001

[5]. Gentleman R., “Applying reproducible research in scientific discovery,” BioSilico, 2005

[6]. Green P. J., “Diversities of gifts, but the same spirit,” The Statistician, pp. 423–438, 2003

[7]. Koenker R. and A. Zeileis, “Reproducible econometric research (a critical review of the state of the art),” in Research Report Series, no. 60, Department of Statistics and Mathematics Wirtschaftsuniversität Wien, 2007

[8]. Leisch F., “Sweave and beyond: Computations on text documents,” in Proceedings of the 3rd International Workshop on Distributed Statistical Computing, (Vienna, Austria), 2003

[9]. Milis, K., Wessa, P., Poelmans, S., Doom, C., and Bloemen, E.: The Impact of Gender on the Acceptance of Virtual Learning Environments, Proceedings of the International Conference of Education, Research and Innovation, International Association of Technology, Education and Development, 2008

[10]. Miller, J. B.: Examining the interplay between constructivism and different learning styles, www.stat.auckland.ac.nz/~iase/publications/1/8a4_mill.pdf, 2005

[11]. Moodle: A Free, Open Source Course Management System for Online Learning, http://www.moodle.org, 2008

[12]. Moreno L., C. Gonzalez, I. Castilla, E. Gonzalez, and J. Sigut, “Applying a constructivist and collaborative methodological approach in engineering education,” Computers & Education, vol. 49, pp. 891–915, 2007

[13]. Mvududu, Nyaradzo: A Cross-Cultural Study of the Connection Between Students' Attitudes Toward Statistics and the Use of Constructivist Strategies in the Course, Journal of Statistics Education 11(3), 2003

[14]. Peng R. D., F. Dominici, and S. L. Zeger, “Reproducible epidemiologic research,” American Journal of Epidemiology, 2006

[15]. Poelmans, S., Wessa, P., Milis, K., Bloemen, E., and Doom, C.: Usability and Acceptance of E-Learning in Statistics Education, based on the Compendium Platform, Proceedings of the International Conference of Education, Research and Innovation, International Association of Technology, Education and Development, 2008

[16]. R Development Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2008. ISBN 3-900051-07-0

[17]. Schwab M., N. Karrenbach, and J. Claerbout, “Making scientific computations reproducible,” Computing in Science & Engineering, vol. 2, no. 6, pp. 61–67, 2000

[18]. Smith E., “Social constructivism, individual constructivism and the role of computers in mathematics education,” Journal of Mathematical Behavior, vol. 17, no. 4, 1999

[19]. Von Glasersfeld E., “Learning as a constructive activity,” in Problems of Representation in the Teaching and Learning of Mathematics, pp. 3–17, Hillsdale, NJ: Lawrence Erlbaum Associates, 1987

[20]. Wessa, P., Learning Attitudes, Peer Assessment, and Gender in the context of a Social Constructionist Statistics Course, Applied Statistics Conference, 2007

[21]. Wessa P., “Learning statistics based on the compendium and reproducible computing,” in Proceedings of the World Congress on Engineering and Computer Science (International Conference on Education and Information Technology), UC Berkeley, San Francisco, USA, 2008

[22]. Wessa P. and E. van Stee, Statistical Computations Archive (online software at http://www.freestatistics.org). K.U.Leuven Association, Belgium, 2008

[23]. Wessa P., “How reproducible research leads to non-rote learning within a socially constructivist e-learning environment,” in Proceedings of the 7th European Conference on e-Learning, (Cyprus), 2008

[24]. Wessa P., Free Statistics Software (online software at http://www.wessa.net). Office for Research Development and Education, 1.1.23-r2 ed., 2008

[25]. Wessa P., “A framework for statistical software development, maintenance, and publishing within an open-access business model,” Computational Statistics, 2008

[26]. Wessa P., “Measurement and control of statistics learning processes based on constructivist feedback and reproducible computing,” in Proceedings of the 3rd International Conference on Virtual Learning, (Constanta, Romania), 2008

[27]. Wessa, P.: Assessment of Reproducible Computing as an E-Learning Tool in Statistics Education, Proceedings of the World Conference on E-Learning in Corporate, Government, Healthcare, & Higher Education, 2008

[28]. Wessa, P.: Discovering Computer-Assisted Learning Processes based on Objective Exam Score Transformations, Proceedings of the World Congress on Educational Sciences, 2009

[29]. Wessa, P.: Designing Statistical Learning Environments with Educational Compendium Technology, Proceedings of Computer-Assisted Learning (CAL'09), 2009

[30]. Wessa, P., and Baesens, B.: Fraud Detection in Statistics Education based on the Compendium Platform and Reproducible Computing, IEEE Proceedings of the World Congress on Computer Science and Information Engineering (CSIE), 2009

[31]. Wessa, P., and Baesens, B.: Explorative Data Mining of Constructivist Learning Experiences and Activities with Multiple Dimensions, Proceedings of the International Conference on Computer and Instructional Technologies, World Academy of Science, Engineering and Technology, 2009

[32]. Wessa, P.: Reproducible Computing: a new Technology for Statistics Education and Educational Research, IAENG Transactions on Engineering Technologies, American Institute of Physics, Eds: Rieger, Burghard, Amouzegar, Mahyar A., and Ao, Sio-Iong, *forthcoming*, 2009

[33]. Wessa, P.: How Reproducible Computing Leads to Non-Rote Learning Within Socially Constructivist Statistics Education, Electronic Journal of e-Learning 6, *forthcoming*, 2009

[34]. Wessa, P.: Quality Control of Statistical Learning Environments and Prediction of Learning Outcomes through Reproducible Computing, International Journal of Computers, Communications & Control 4(2), 2009

Footnote 1: The purpose of this project is to facilitate the creation, maintenance, and permanent storage of statistical computation objects that empower authors to publish reproducible and reusable research (Compendium) through a series of web services. A Compendium is defined as any document that contains references (URLs) to permanently stored objects that can be retrieved, recomputed, and reused in real time without the need to download or install anything on the client machine. The underlying philosophy is that referencing stored computations allows authors to create reproducible and reusable research. In addition, this mechanism effectively facilitates peer review and collaboration among students and scientists. The use of this system is free of charge for educational and research purposes.

Footnote 2: There are several fundamental problems with statistical software development in the academic community. In addition, the development and dissemination of academic software will become increasingly difficult due to a variety of reasons. To solve these problems, a new framework for statistical software development, maintenance, and publishing was developed: it is based on the paradigm that academic and commercial software should be both cost-effectively created/maintained and published with Marketing Principles in mind. The framework has been seamlessly integrated into a highly successful website (www.wessa.net) that operates as a provider of free web-based statistical software.