ArabBarometR Graphing Guide
2024-10-18
Introduction
Purpose
This guide explains how to create graphs using the ArabBarometR package. If you do not yet have ArabBarometR installed, see either the README file on the package’s GitHub repo or Installing and Updating ArabBarometR in the Appendix.
This guide was created using the bookdown package, which comes with many fun and useful features. An overview of these features can be found in the Guide Features section in the general ArabBarometR Guide.
As such, this guide assumes a few things:
- That you have the ArabBarometR package installed on your machine.
  - If this is not true, please take a look at the README file on the ArabBarometR GitHub page.
- That you have a basic understanding of how to load data and packages in R.
  - If this is not true, you can peruse either this guide or this website.
Best Practices
Before we begin, there are a few best practices that you are highly encouraged to adhere to.
- Load your data directly from the source.
  - Rather than downloading AB data from the Drive, saving it in a folder on your computer, and re-downloading it every time it gets updated, point your code at the path to the latest data on the Drive.
  - This will save you a significant amount of space on your machine.
  - It is also the best way to guarantee that you are:
    - Using the latest data.
    - Using the original, unedited data each time you start or work on a project.
- Save your code in the same place you save your graphs.
  - Instead of saving your code in one place and your output in another, save everything in the same spot.
  - This way, when you spot an error in your graphs or want to add more, you know exactly where the code used to create the output is.
- Load only the packages you need.
  - Many packages in R do not necessarily play nicely with each other.
  - Often, packages will have functions with the same name that do different things. If you call one of those functions in your code, R will use the function from the last package you loaded.
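The name-clash point can be illustrated with a small sketch. It is not taken from the ArabBarometR package; it assumes only base R, plus dplyr if it happens to be installed. Both stats (attached by default) and dplyr export a function called filter(), so writing the package name explicitly with `::` removes any doubt about which one runs.

```r
# stats (attached by default) and dplyr both export a function named filter().
x <- c(1, 2, 3, 4, 5)

# An explicit namespace always reaches the intended function:
# stats::filter() applies a linear filter (here a centered 3-point moving average).
moving_avg <- as.numeric(stats::filter(x, rep(1 / 3, 3)))

# conflicts() lists names that are defined in more than one attached place.
masked_names <- conflicts()

# dplyr::filter() keeps rows of a data frame that match a condition.
# Guarded so the sketch still runs when dplyr is not installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  df <- data.frame(score = x)
  high_scores <- dplyr::filter(df, score > 3)
}
```

Calling `conflicts()` after loading your packages is a quick way to see which names are masked in the current session.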
General Guide Process
Assuming we are working with clean data, the graphs are created in two steps:
- Create a summary of the data you want to graph.
- Graph that summary.
For each type of graph, the guide will first go through how to create one graph, then how to create many graphs.
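The two-step pattern can be sketched with base R alone. The example below does not use the ArabBarometR functions introduced later in the guide; the data frame, the country names, and the column names are all made up for illustration.

```r
# Hypothetical responses: 1 = agrees, 0 = does not agree.
responses <- data.frame(
  country = c("Country A", "Country A", "Country A", "Country A",
              "Country B", "Country B", "Country B", "Country B"),
  agree   = c(1, 1, 0, 1, 0, 1, 0, 0)
)

# Step 1: create a summary of the data you want to graph
# (here, the share of respondents who agree in each country).
summary_tbl <- aggregate(agree ~ country, data = responses, FUN = mean)
summary_tbl$pct_agree <- 100 * summary_tbl$agree

# Step 2: graph that summary.
barplot(
  height    = summary_tbl$pct_agree,
  names.arg = summary_tbl$country,
  ylab      = "Percent who agree",
  ylim      = c(0, 100)
)
```

Keeping the summary as its own object makes it easy to inspect the numbers before they are plotted.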
Example Data
The examples in this guide use data from Wave VI. Single survey graphs use data from Wave VI, Survey I. Trend graphs use data from Wave VI, Surveys I, II, and III, where Survey III is the final survey we conducted in Wave VI.
As of version 1.13.10.9001, the ArabBarometR package includes a sample of this data, so you will not have to download any extra data sets to follow along. If you are using an older version of ArabBarometR (not recommended), the Wave VI data can be found on the Arab Barometer website.
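One way to check whether your installed version is new enough is sketched below. It relies only on base R's `package_version()` and `utils::packageVersion()`; the helper name `has_sample_data` is hypothetical and is not part of ArabBarometR.

```r
# Minimum version that ships the sample data, as stated in this guide.
required <- package_version("1.13.10.9001")

# Hypothetical helper: TRUE when a version string meets the minimum.
has_sample_data <- function(installed, minimum = required) {
  package_version(installed) >= minimum
}

# Only query the installed version if the package is actually present.
if (requireNamespace("ArabBarometR", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("ArabBarometR"))
  if (has_sample_data(installed)) {
    message("Sample data should be available in version ", installed)
  } else {
    message("Version ", installed, " predates the bundled sample data.")
  }
}
```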
The sample data has an artificial Q13 column. The variable was not collected during Wave VI since this is usually filled out by an enumerator in the survey area and Wave VI was conducted over the phone. For demonstration purposes, a Q13 column was randomly generated and added to the data.
Header Code
In order to create certain graphs (e.g., demographic graphs, comparative graphs for nominal questions, etc.), the data needs to be prepared correctly. The first chapter in this guide introduces several functions included in the ArabBarometR package that allow the user to easily prepare the data.
The code examples in this guide will focus on how to create graphs. That is, the examples will not include loading and preparing the data or loading libraries. As you begin this guide, make sure your header looks something like the following:
setwd("/where/graphs/will/live/")
library(ArabBarometR) # Our package
library(dplyr) # Lots of handy functions
library(purrr) # Mapping functions
# Preparing Survey 1 data with no pipes:
## Adding demographic columns:
survey1 <- add_demographics(survey1)
## Creating dummies for examples:
survey1 <- dummy_all(survey1, Q2061A)
# Preparing Survey 2 data with R native pipe:
survey2 <- survey2 |>
  add_demographics() |>
  dummy_all(Q2061A)
# Preparing Survey 3 data with dplyr pipe:
survey3 <- survey3 %>%
  add_demographics() %>%
  dummy_all(Q2061A)
The above code prepares each of our example data sets using the same two functions, but using three different methods.
When adding demographic variable columns and creating dummy columns from variable Q2061A for survey1 (Wave VI Part 1), each step is applied to the data separately. When doing the same manipulations on survey2 (Wave VI Part 2) and survey3 (Wave VI Part 3), the steps are chained together using pipes. Data preparation for survey2 uses the native pipe |>, while data preparation for survey3 uses the %>% pipe from the dplyr package. All three approaches have the same outcome. Q: Why use one method over the other? A: Personal preference.
To learn more about piping, see the section A Note on Piping in the Appendix.
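The equivalence of the three styles can be checked on a toy example. The sketch below does not use ArabBarometR; `add_one` and `double_it` are made-up stand-ins for the preparation functions. The native pipe requires R 4.1 or later, and the `%>%` part only runs if dplyr is installed.

```r
# Two made-up steps standing in for the data preparation functions.
add_one   <- function(x) x + 1
double_it <- function(x) x * 2

values <- c(1, 2, 3)

# No pipes: each step is applied separately.
no_pipe <- add_one(values)
no_pipe <- double_it(no_pipe)

# Native pipe (R >= 4.1): the steps are chained.
native_pipe <- values |>
  add_one() |>
  double_it()

# dplyr pipe: same chain, guarded in case dplyr is not installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  `%>%` <- dplyr::`%>%`
  dplyr_pipe <- values %>%
    add_one() %>%
    double_it()
  stopifnot(identical(no_pipe, dplyr_pipe))
}

# All approaches give the same result.
stopifnot(identical(no_pipe, native_pipe))
```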
Chapter 1 introduces the functions used to prepare the data.