As an empirical economist, I crunch a lot of numbers, and Stata is my statistical software of choice, as it is for many economists. Stata is powerful, easy to use, and well designed. It can be run interactively, using an extensive set of drop-down menus, or by writing and submitting batch programs. When our Economics Department decided to develop a new, required data analysis and econometrics course for all our undergrad majors, Stata was the early odds-on favorite. But upon further thought, and with some trepidation, we have adopted R instead. This fall I will be offering the course for the first time... this summer I am teaching myself enough R to stay a step ahead of my undergrads... I hope!
Why R? Unless you are a programmer or a mathematician, it is not an intuitive or user-friendly program. But it is extremely powerful and versatile, increasingly widely used in academia and business, open source, and free. It offers every basic econometric technique, and most advanced techniques. It creates beautiful graphics. The minority of our majors who go on to graduate-level work in Economics or a related discipline can use R in their research, or easily jump to Stata or something else. For the majority of our students who will pursue employment in the private sector, some facility with R will be a real plus in the job market.
So, how to teach R to a class full of undergrads with limited or no programming background and a fairly high incidence of quanti-phobia... while simultaneously teaching the concepts and applications of regression analysis? Our first strategy is to require a 2-unit lab course alongside the regular 4-unit lecture section. The lab will be devoted to hands-on data analysis: Turn on your computers, download that data, and run that R script. I can't expect them to write a program from scratch, so my plan is to provide basic R scripts as templates they can tweak and adapt for their own purposes.
To help make R as painless as we can, we have engaged an R-savvy undergrad, Bobby Fatemi, to help write a course-specific guide to R for Santa Clara students, which will stick to just what they need to know, step by step. We plan to use hands-on examples with data from the assigned textbook, Stock and Watson's Introduction to Econometrics.
In the lecture portion of the class, I plan to use the tablet for board work, alongside a second computer running R for demonstrating the data analysis. Of course I want the students to learn about data analysis: the promise and pitfalls of its use in social science. But I also hope to encourage them to "play" with the techniques. How much does lingering high unemployment really reduce Barack Obama's chances of reelection? Instead of bullshitting about it, we can download some unemployment and election data and estimate a model!
So: onward! More reports to follow... wish me luck.