Unit 5. Statistics with R: Introduction and Descriptive Statistics
Leyre Castro
Summary. In this unit you will learn the basics of how R works and how to get comfortable interacting with this software. In addition to the information here, later units will include examples of how to conduct in R the analyses explained in those units.
Prerequisite Units
Unit 1. Introduction to Statistics for Psychological Science
Unit 2. Managing Data
Unit 3. Descriptive Statistics for Psychological Research
Statistical Software for Data Analysis
Using statistical software will help you avoid computational mistakes and carry out your statistical analyses faster. Spreadsheets (like Excel) are a good place to start because they allow you to organize the data the way you want, sort and filter data, count and summarize values, calculate basic descriptive statistics, and make graphs. But if you want to move beyond summaries and basic graphs, you will need more specialized statistical software. Some common and traditional statistical applications are SPSS and SAS, but they require a (very expensive) commercial license. A non-commercial option is jamovi, an open-source application with a point-and-click interface for common tasks, data exploration, and modeling. But you may have data that do not fit into the rows and columns that standard statistical applications expect, or you may have questions that go beyond what the drop-down menus allow you to do. In that case, you will be better off using a programming language like R, because it gives you the control, flexibility, and power to analyze your data in ways that specifically address the questions that matter to you.
What is R?
R is free, open-source software for programming. Because R is free, you can keep updating it at no cost and have it always available on your computer, regardless of the university or company you are with. In addition, there are many great websites, videos, and tutorials to get you started with R (and to help you acquire advanced knowledge), and to explain how to do statistics with R. We include links to that information below.
R is very versatile. Because R is, basically, a programming language, it can be used for a variety of things, not just statistics. As you get better at using R for data analysis, you are also learning to program. If you are interested in science, it is highly likely that you will need to learn the basics of computer modeling, or you may want to develop useful apps, automate tasks in your business, conduct surveys online, or communicate information through data visualization. All this can be done with R.
Related to the previous point, R is highly extensible. When you download and install R, you get all the basic “packages,” and those are very powerful on their own. Packages are specific units of code that add more functionality to R. Because R is open and so widely used, it has become a standard tool in statistics, and many people write their own packages that extend the system. These packages are freely available too. There is a large R community for code development and support; for almost any kind of special analysis that you need to conduct, you are likely to find an R package for it, along with good explanations of how to perform it. Also, many recent advanced statistical textbooks use R. So, if you learn how to do your basic statistics in R, you will be a lot closer to being able to use the state-of-the-art methods in psychological statistics and data analysis. In summary, learning R is a very good use of your time.
R is a real programming language. To some people this might seem like a bad or scary thing, but in truth, programming is a core research skill across many of the social and behavioral sciences.
Think about how many surveys and experiments are done online or on computers in a laboratory. Think about all those online social environments that you might be interested in studying. Also, think about how much time you will save and how much accuracy you will gain if you collect data in an automated fashion. If you do not know how to program, then learning how to do statistics using R is a good way to start. Indeed, if you have to or want to learn another programming language in the future, your experience with R will make that learning much easier.
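As a small illustration of the points above about packages, the sketch below lists the packages that are attached when R starts and uses a function from one of them. The scores vector is made up for illustration, and the package name psych in the commented lines is only an example of a contributed package; those lines are commented out because installing a package requires an internet connection.

```r
# Packages attached in the current R session (base R attaches several by default).
attached <- search()
print(attached)

# The 'stats' package ships with R, so median() works without installing anything.
scores <- c(4, 8, 15, 16, 23, 42)  # hypothetical values, for illustration only
median_score <- median(scores)
print(median_score)

# A contributed package is installed once and then loaded in each session.
# "psych" is used here only as an example name.
# install.packages("psych")
# library(psych)
```

Once a contributed package has been installed, only the library() line needs to be run again in later sessions.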
Learning R
Before moving forward, we highly recommend that you first watch the following tutorial, which you can access through the LinkedIn Learning button on your MyUI page.
The tutorial is called Learning R, by Barton Poulson, and runs 2 hours and 51 minutes. It is a very clear and well-paced introduction to R. You will learn to install R and RStudio, navigate the RStudio environment, import data from a spreadsheet, visualize data, and perform a number of data analyses.
In addition, these are some of the best materials available:
Learning Statistics with R, by Danielle Navarro
https://learningstatisticswithr.com/
An excellent, entertaining, and very clear book, freely available online. It includes statistical explanations and shows how to conduct all the analyses in R. You can download it as a PDF.
R coder. All about R programming
https://r-coder.com/r-tutorials/
Very clear, well-organized, and helpful tutorials for all basic statistics.
Statistics, R programming, and Data Science with Professor Marin
https://www.youtube.com/c/marinstatlectures/playlists?view=50&shelf_id=15
YouTube videos with excellent explanations and examples about how to use R and how to conduct a variety of analyses.
Descriptive Statistics with R
To start doing descriptive statistics with R, you will find excellent instructions in these specific pages from the websites listed above:
Bar Charts and Pie Charts in R
https://www.youtube.com/watch?v=Eph_Y0BmHU0&list=PLqzoL9-eJTNCzF2A6223SQm0rLYtw6hJE&index=2
Histograms in R
https://www.youtube.com/watch?v=Hj1pgap4UOY&list=PLqzoL9-eJTNCzF2A6223SQm0rLYtw6hJE&index=4
Mean, Standard Deviation, and Frequencies in R
https://www.youtube.com/watch?v=ACWuV16tdhY&list=PLqzoL9-eJTNCzF2A6223SQm0rLYtw6hJE&index=11
Descriptive Statistics in R
https://r-coder.com/r-statistics/
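To give a first feel for what the resources above cover, here is a minimal sketch using only functions that come with base R. The data are hypothetical (exam scores and year in school for ten students) and the variable names are chosen only for this example; with your own data you would import a file instead of typing the values.

```r
# Hypothetical data: exam scores and year in school for ten students.
scores <- c(72, 85, 90, 66, 85, 78, 94, 85, 70, 75)
year <- c("first", "second", "first", "third", "second",
          "first", "third", "second", "first", "second")

# Central tendency and variability.
mean_score <- mean(scores)
sd_score <- sd(scores)        # sample standard deviation (divides by n - 1)
print(mean_score)
print(sd_score)

# Five-number summary plus the mean.
print(summary(scores))

# Frequency table for a categorical variable.
year_counts <- table(year)
print(year_counts)

# Basic graphs: a histogram for the numeric variable
# and a bar chart for the categorical one.
hist(scores, main = "Exam scores", xlab = "Score")
barplot(year_counts, main = "Students per year", ylab = "Frequency")
```

In RStudio the two graphs appear in the Plots pane; when the script is run non-interactively, R typically writes them to a PDF file instead.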