9 Getting Started with jamovi
Install and Open jamovi
First, you need to install jamovi, if it’s not already on the computer you are working on. To do this, go to https://www.jamovi.org/ and download the correct version for your computer (make sure you get the “solid” version, which is recommended for most users). Once it is installed, open the program to see what it looks like. You will see something like this:
To the left is the spreadsheet view, and to the right is where the results of statistical tests will appear. Down the middle is a bar separating these two regions and this can be dragged to the left or the right to change their sizes.
In jamovi data are represented in a spreadsheet with each column representing a “variable” and each row representing a “case” or “participant.”
It is possible to simply begin typing values into the jamovi spreadsheet as you would in any other spreadsheet software. Alternatively, existing data sets in Excel or the CSV (.csv) file format can be opened in jamovi. Simply click on the “hamburger” ( ) in the top left corner of the jamovi window, click on “Import,” and find the file that you want to import. You can also easily import SPSS, SAS, Stata and JASP files directly into jamovi. To open an existing jamovi file (with the .omv file extension) select the hamburger, select “Open” and then choose the appropriate file.
Entering and Cleaning Your Data
There are four important steps to entering and cleaning your data: checking your data are set up correctly; computing new variables; transforming variables; and using filters.
1. Checking your data are set up correctly
When you enter your data in jamovi, you will typically (for now at least) enter it with one row for each participant. Always create a variable to identify each participant by. For example, each participant in the study might be given a number. The same number would appear on all parts of the study they complete, as well as in jamovi. You can call this variable “Participant” in jamovi, and set it to be an ID variable (see below). If you are collecting data online using a platform like SurveyMonkey or gorilla, those platforms might automatically assign participant IDs to your participants. In those cases, you may as well continue to use the IDs supplied by those platforms, for consistency and so that you can keep track of who is who.
Variables in jamovi
It’s important to understand the different types of variables in jamovi and how they map onto our levels of measurement.
Variables in jamovi can be one of three data types:
- Integer, meaning the values are discrete whole numbers
- Decimal, meaning the values are numbers with decimals
- Text, meaning the values are alphanumeric, not just numeric
Furthermore, variables in jamovi can be one of four measure types:
- Nominal
- Ordinal
- Continuous (meaning jamovi combines interval and ratio and doesn’t distinguish between the two)
- ID (used for any identifying variable you likely wouldn’t ever analyze, like participant ID number or name)
There are a few great things about jamovi when it comes to these data variables. First, jamovi will try to automatically determine what the data and measure types are when you type in data or when you open a dataset; this is fabulous, until it goes wrong. It’s important that you always double check your data and measure types first!
Second, those little icons will be really helpful to let you know what variables can go in which boxes. For example, we would never analyze a nominal variable as our dependent variable for a t-test, and jamovi will help remind you of that. When performing an independent samples t-test, the dependent variables box will have a little ruler icon indicating you should be putting continuous variables in that box. Similarly, it will tell you to put nominal or ordinal variables in the grouping variable (independent variable) box.\
It’s really important to check that the data types and measurement types of your variables are correct. You should open the Setup ( ) option under the Data tab to check.
When you’re in Setup, here are the things you should be doing for all variables:
- Make sure the variable name is meaningful to you and will appear nicely in your data visualizations or tables (e.g., don’t write
Q1
but ratherBDI_1
for the variable that is participants’ responses to the first item on the Beck Depression Inventory). - Add a description to your variable so you have more context. Maybe you compute a mean of all the BDI items and in the description you write
Mean of all BDI items
for the description of yourBDI_Mean
variable. - Check your measure and data types are correct.
- Specify if there is a code for missing values. Make sure the code does not match the code you use for actual variables! For example, if I have a variable that ranges from 0-10, then I wouldn’t use 9 as a code for missing values; instead, I might use 99 or -9.
- Add labels to levels. For example, the variable
Athlete
is 0 for non-athlete and 1 for athlete. Rather than keeping just the 0 and 1, you can specify under Levels that 0 is non-athlete.
So, you might have something that looks like this:
You will see that for setting up the “Athlete” variable, I simply typed “Athlete” into the first box, “Athletic status” into the second box (where you can write a longer description). I selected “Nominal” for Measure type. Initially, the values that I imported into jamovi were 0’s and 1’s, so I translated this information for jamovi by typing “Yes” and “No” respectively into the boxes under “Levels.”
2. Computing Variables
Sometimes you need to create new variables from your raw (meaning uncleaned) data. Perhaps you collected data on a scale that has five items. Normally, we create an average score of all the five items and that new computed average score is what we use in our analyses.
Let’s open the Big 5 dataset built into jamovi. You can open this dataset by clicking the three horizontal lines on the top left of jamovi (the menu), choose Open, then select Data Library. In the main Data Library folder is a dataset called Big 5.
This dataset has the scores on all five subscales of the Big Five personality test. Let’s imagine we want the average score of the entire Big Five test. To do this, first put your cursor in the column where you would like to create the new variable. Then, click on the Data tab and choose Compute
. You may need to double-click on the column to show the computation window above your data. Rename the computed variable (e.g., Big5_Mean
), add in a description, and then create the formula.
In this case, we need to select the function MEAN
. To do so, click on the fx and scroll down below Functions until you see MEAN
. When you click on MEAN
, it provides a template of what the formula should look like. We need to specify the function MEAN()
, add all the variables we want to calculate in the mean (i.e., the five subscales of the Big 5), and there are two alternative options: ignore_missing
is defaulted to 0 (meaning DON’T ignore missing, or rather include missing) and min_valid
is defaulted to 0 (meaning it’s ignoring this; perhaps you only want to include people that have at least three valid cases).
The basic formula, then is to do MEAN(var1, var2, … varn). You can see what we need to do with this dataset below. There’s actually no missing data, so the two additional arguments aren’t necessary for us to worry about.
Note that you must type the variable names exactly as they appear in your data area – if you mistype them, you will get an error and your mean will not be computed!
If you’d like to learn more about computed variables in jamovi, check out this jamovi blog post on the topic.
3. Transforming Data
Sometimes we want to take an existing variable and transform it in some way or we want to do a computation across multiple variables (e.g., reverse-score multiple items in a dataset). If you want to learn more about transforming variables, the jamovi blog has a great blog post on the topic.
Reverse-scoring
Sometimes items need to be reverse-scored because they are in the opposite direction of the entire scale.
Let’s imagine we have a Happiness Scale with the following four items:
- I am happy.
- I am content.
- Life is overall positive.
- I am unhappy.
The happiness scale suggests that higher scores is higher happiness, but the fourth item is opposite. Higher scores on that item actually indicate lower happiness. Therefore, we would need to use “Transform” to recode the items so the highest score is the lowest score and so on. For example, if it were rated on a 5-point scale then you would need to recode so a 1 = 5, 2 = 4, 3 = 3, 4 = 2, and 5 = 1.
Recoding
Maybe we want to recode variables. Perhaps we want to recode the Neuroticism scale into low, moderate, and high extraversion. The scale ranges from 1-5, so I’m going to say that scores between 1-2.333 are low, 2.334 to 3.666 is moderate, and 3.667 to 5 is high. First, I create a new Transform variable. Then I need to specify the transformation. Click Edit to do so (or, when creating a new transformation, click the transformation and select Create New Transform).
We need to specify the recode conditions. Click Add recode condition
twice. For the first formula, we want to specify that if the $source (meaning the score for the variable we’re using for the transformation, in this case Neuroticism) is less than or equal to 2.333, then it will be recoded as low
. Notice the use of apostrophes around the text! We do the same for moderate. Then we can end with an else statement: all other values (else) are recoded as high
. We can either let it auto determine the measure type, but I like to be in control of my data and therefore specify it is an ordinal variable.
Multiple transformations
Maybe we instead want to do a computation across multiple variables. Perhaps we have multiple items that need to be reverse-scored, or in our case we want to use our previous Low_mod_high
transformation to perform on all the subscales of the Big 5.
We can click a new variable (e.g., Openness), select Transform, rename the variable, and select the Low_mod_high
transformation we already used. Voila! The work we did previously can easily be used again in this analysis.
4. Using Filters
Sometimes we only want to analyze certain pieces of our data. We can filter by rows and by columns. Check out this blog post by jamovi on more details of filters.
Row filters
Maybe we only want to analyze data from people who are low in neuroticism. We would create the following filter, by selecting Filters . Type Neuroticism_cat == 'low'
in the box, as shown below.
You’ll notice at the very left of the dataset there is a new column named Filter 1
(the name of the filter) and there will either be an X or a green check mark indicating whether it’s removed (X) or kept (check) in the analyses.
If you want to take off the filter, but keep it available, click on the filter column and toggle the green button on the top right from active
to inactive
. It will then grey out the column.
A couple things to note:
- Notice that to say it equals to
low
you have to use a double equal sign:==
- Another common thing you may want to specify is that the variable is not equal to something. You would use the following:
!=
- Otherwise you should be familiar with the other operations:
<
,>
,<=
,>=
Column Filters
Column filters are useful when you want to use a filter for some but not all of your analyses. Rather than creating a filter, we need to compute a new variable using the FILTER()
function. For example, we can compute a new variable that is FILTER(Neutoricism_cat, Neuroticism_cat == 'low'
. Then we could use that new variable in an analysis (in this case it’s not very useful because there is no variability in this variable, but there are useful times for using column filters for analyses).