Combine multiple waves of data

Stack multiple survey data files to create one aggregate dataset

Updated at June 23rd, 2023

There are two ways to merge data. One appends rows to combine different datasets, and the other joins datasets to merge in new variables.
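As a rough illustration of the difference, the sketch below contrasts the two approaches in plain JavaScript. It does not use Protobi's API, and it assumes each dataset is an array of row objects keyed by column name. The sample rows and the names wave1, wave2, regions, and regionById are made up for this example.

```javascript
// Illustration only: plain JavaScript, not Protobi's API.
// Assumes each dataset is an array of row objects keyed by column name.
const wave1 = [
  { id: 1, q1: "Yes" },
  { id: 2, q1: "No" }
];
const wave2 = [
  { id: 3, q1: "Yes" }
];

// Appending (stacking): the result has more rows and the same columns.
const stacked = wave1.concat(wave2);
console.log(stacked.length); // 3

// Joining: the result has the same rows, with new columns merged in by a key.
const regions = [
  { id: 1, region: "East" },
  { id: 2, region: "West" }
];
const regionById = new Map(regions.map(r => [r.id, r]));
const joined = wave1.map(row => Object.assign({}, row, regionById.get(row.id)));
console.log(joined[0]); // { id: 1, q1: 'Yes', region: 'East' }
```

Appending grows the dataset vertically, while joining grows it horizontally. The rest of this tutorial covers only the first case.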

The example code in this tutorial appends rows of data from multiple surveys vertically, resulting in a larger combined dataset. This is also referred to as "stacked" data. This method is suitable when the surveys use the same data column names for the same questions (e.g. different waves of the same study). New questions are fine, as long as their data column names are not already used by a different question in a previous wave.
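Before stacking, it can help to compare the column names across waves. The sketch below is one possible way to do that in plain JavaScript. The helpers columnNames and compareColumns are hypothetical and not part of Protobi, the sample rows are made up, and the code assumes each wave is an array of row objects keyed by column name.

```javascript
// Hypothetical helper, not part of Protobi:
// collect the column names used in a set of rows.
function columnNames(rows) {
  const names = new Set();
  rows.forEach(row => Object.keys(row).forEach(key => names.add(key)));
  return names;
}

// Hypothetical helper: split the columns of a later wave
// into those shared with an earlier wave and those that are new.
function compareColumns(earlierRows, laterRows) {
  const earlier = columnNames(earlierRows);
  const later = Array.from(columnNames(laterRows));
  return {
    shared: later.filter(name => earlier.has(name)),
    added: later.filter(name => !earlier.has(name))
  };
}

// Made-up sample rows for illustration.
const sampleWave1 = [{ respid: 101, q1: 4, q2: "A" }];
const sampleWave2 = [{ respid: 201, q1: 5, q3: "Often" }];

const report = compareColumns(sampleWave1, sampleWave2);
console.log(report.shared); // [ 'respid', 'q1' ]
console.log(report.added);  // [ 'q3' ]
```

A check like this only compares names. Each shared column still needs a manual review to confirm it refers to the same question in both waves.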

Add data tables

Before creating a data process, add data tables and upload files for each dataset.


Data process code

Declare a variable for each data table that will be combined (e.g. var W1) and assign it a value by referring to the respective data table name (e.g. data["wave1"]).

Pass the variables as an array to the Protobi.stack_rows function.

Add the return rows statement to return the result of the stacking process.

 // Refer to each data table by its name
 var W1 = data["wave1"];
 var W2 = data["wave2"];
 var W3 = data["wave3"];
 var W4 = data["wave4"];
 
 // Append the rows of all waves into one dataset
 var rows = Protobi.stack_rows([W1, W2, W3, W4]);
 
 return rows;

Reminder: For data processes, "Save" and "Run" the process after you are done editing the code view. To use the result of the process as the primary data for the project, you will need to set it as "Primary".
Data processes are specific to each project, and your code may not look identical to our example.
