Basic data process functions

Written in JavaScript

Updated at July 30th, 2020

In the first tutorial of this series, we explained what data processing is and the two types of data processes in Protobi. 

We also mentioned that Protobi uses JavaScript for data processing. JavaScript is one of the most widely used programming languages. There are many resources you can find on the web that teach JavaScript for beginners, but we've included some JavaScript basics here that are relevant to common data processes used in Protobi.

So if you're new to JavaScript, welcome. Keep in mind that you're learning a language that is not specific to Protobi but broadly applicable to many data visualization applications.

Download our Protobi JavaScript cheat sheet.pdf that summarizes this article for quick reference. 

JavaScript basics

Declare a variable

Variables should be declared before they are used. The "var" keyword declares a variable.

A semicolon terminates the declaration. If a statement is not explicitly terminated with a semicolon, the JavaScript engine will in most cases insert one automatically, but it is good practice to write it yourself.

var test1;

There are rules to naming variables:

  • Cannot be a reserved keyword
  • Cannot start with a number
  • Cannot contain a space or a hyphen
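The rules above can be illustrated with a short sketch. The variable names here are made up for illustration; the invalid names are left commented out because each one would stop the code from running.

```javascript
// Valid names: letters, digits, underscores and dollar signs,
// as long as the first character is not a digit.
var q6Threshold = 10;
var Q6_threshold = 10;
var $total = 0;

// Names are case-sensitive, so these are two different variables.
var wave = 1;
var Wave = 2;

// Invalid names (each of these is a syntax error):
// var 6threshold = 10;    // starts with a number
// var q6-threshold = 10;  // contains a hyphen
// var q6 threshold = 10;  // contains a space
// var return = 10;        // reserved keyword
```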

The section below shows the different types of values you can use to initialize a variable (set its initial value).

Variable value types

By default, a variable that you declare in JavaScript without assigning a value is undefined. You give variables values using literals. A literal is a value written directly in the code, and it is that value that gets used when the code runs.
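As a small sketch of the default value, the snippet below declares a variable without a value, checks that it is undefined, and then assigns a string literal. The variable names are only for illustration.

```javascript
var notYetSet;                              // declared, but no value assigned
var isUnset = (notYetSet === undefined);    // true

notYetSet = "now I have a value";           // assign a string literal
var typeAfter = typeof notYetSet;           // "string"
```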

String literal

Set a variable to a string (a sequence of characters). A string is a data type used to represent text rather than numbers. The characters can include spaces and digits. A string is enclosed in single quotes or double quotes.

var test1 = "This is a string";

Numeric literal

Numeric literals can be integers or numbers with decimal points. 

var test1 = 0;

Boolean literal

A Boolean can be either true or false.

Note: Both "true" and "false" are reserved keywords, so they cannot be used as variable names. JavaScript is case-sensitive, so the Boolean values must be written in lowercase.

var test1 = true;

Object literal

You define (and create) a JavaScript object with an object literal. Objects are variables that can contain many values. An object literal is a comma-separated list of name-value pairs wrapped in curly braces. The values are written as name:value pairs (name and value separated by a colon).

var test1 = {
    name:"My first object",
    type:"Object-literal",
    color:"blue"
};
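Once an object exists, you can read and add its values by name. The sketch below uses a made-up object called "respondent" to show both dot notation and square-bracket notation, and what you get back when a name does not exist.

```javascript
var respondent = {
    name: "My first object",
    color: "blue"
};

var viaDot = respondent.color;            // "blue"
var viaBrackets = respondent["color"];    // "blue"

respondent.wave = 1;                      // add a new name:value pair
var missing = respondent.region;          // undefined, because "region" was never set
```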

Array literal

An array is an ordered list of values, separated by commas and wrapped in square brackets. Arrays can store multiple values. Values can be string, number, boolean or object literals.

var test1 = ["1","2","3","4","5","6","7"];
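To read values back out of an array, you refer to them by position. Positions start at 0, not 1. The array name "codes" below is only for illustration.

```javascript
var codes = ["1", "2", "3", "4", "5", "6", "7"];

var first = codes[0];                  // "1" (positions start at 0)
var count = codes.length;              // 7
var last = codes[codes.length - 1];    // "7"
```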

Null

Null is used specifically in situations where you want to explicitly clear the value of a variable. 

var test1 = null;


Operators 

JavaScript includes operators, as other languages do. An operator performs an operation on one or more operands (data values) and produces a result. For example, in 1 + 2, the + sign is the operator, 1 is the left operand, and 2 is the right operand.

Arithmetic

+ Addition
- Subtraction
/ Division
* Multiplication
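Here is a quick sketch of the arithmetic operators with numeric literals. The last two lines show that multiplication is evaluated before addition unless you group with parentheses.

```javascript
var sum = 7 + 2;              // 9
var difference = 7 - 2;       // 5
var quotient = 7 / 2;         // 3.5
var product = 7 * 2;          // 14

var ungrouped = 7 + 2 * 2;    // 11, multiplication happens first
var grouped = (7 + 2) * 2;    // 18, parentheses happen first
```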

Comparison

== Equal
!= Not equal
> Greater than
>= Greater than or equal to
< Less than
<= Less than or equal to
=== Equal to, and the same data type (Identical)
!== Not identical

Note: Double and triple equals are used for comparison, single equal (=) is used for assigning value to a variable.
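The difference between double and triple equals matters when a value might be stored as text. In this sketch we assume an answer stored as the string "1"; whether values in your own project are strings or numbers depends on your data.

```javascript
var answer = "1";                         // a string, not a number

var looseMatch = (answer == 1);           // true: == converts the string before comparing
var strictMatch = (answer === 1);         // false: a string is never identical to a number
var strictString = (answer === "1");      // true: same value and same type
```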

Logical 

&& AND
|| OR


Basic data process functions

Define a global variable

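You can create new variables as you wish. Here we write the variable "Q6_threshold" into existence and give it a value. Because it is declared at the top level of the process code, outside any function, the code that follows (including code inside functions) can refer to it. If you assign a value to a variable that has not been declared at all, JavaScript (outside strict mode) automatically makes it a global variable, which all scripts and functions on a web page can access. Declaring the variable explicitly with "var" is clearer.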

var Q6_threshold = 10
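The sketch below shows why it is handy to define the threshold once at the top: code inside a function further down can still read it. It uses the forEach method described later in this article, and the sample rows are made up for illustration.

```javascript
var Q6_threshold = 10;                        // defined once, at the top

var rows = [{ Q6A: "12" }, { Q6A: "4" }];     // hypothetical sample data
var flagged = 0;

rows.forEach(function (row) {
    // Q6_threshold and flagged were declared outside this function,
    // but they are still visible inside it.
    if (+row.Q6A >= Q6_threshold) flagged = flagged + 1;
});
// flagged is now 1, because only the first row meets the threshold
```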

Declare the "rows" variable

In data processing, we retrieve the data we want to modify by declaring a variable and initializing the value as a data table. Protobi convention is to define the data as "rows". 

Use the dot notation below to retrieve the target file.

var rows = data.main;

Alternatively, instead of dot notation you can use square brackets and pass a string that gives the name of the target file.

var rows = data["main"];
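To try both notations outside of Protobi, you can stand in for "data" with a small object of your own. The shape used here (an object with one array of row objects per table name) is an assumption based on how this article uses "data" and "rows", not a description of Protobi internals.

```javascript
// Hypothetical stand-in for the "data" object available in a data process.
var data = {
    main: [
        { respid: "1", Q6A: "12" },
        { respid: "2", Q6A: "4" }
    ]
};

var viaDot = data.main;
var viaBrackets = data["main"];
var same = (viaDot === viaBrackets);    // true: both refer to the same table

// Square brackets also let you keep the table name in a variable.
var tableName = "main";
var viaVariable = data[tableName];
```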


After you've initialized the rows variable, each element of rows is an object (by convention called "row") that holds the values for one respondent. You can use dot notation on a row to retrieve the value of a data column from the target data file.

Iterate over rows

Iteration is a process that simplifies an algorithm by repeating certain steps until told otherwise.

JavaScript arrays have a method .forEach(fn) that takes a function as its argument and calls that function once for each element, passing the element (here, the row) and its index as arguments. The code nested within the function therefore runs for every respondent in the data file.

Within the curly braces is where you place the block of code. A block is one or more statements grouped together and executed in sequence.

rows.forEach(function(row) {});
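Here is a minimal sketch of forEach doing some work, using three made-up rows. The second argument of the function is the index of the row, which starts at 0. The column name "position" is invented for this example.

```javascript
var rows = [
    { respid: "1" },
    { respid: "2" },
    { respid: "3" }
];

rows.forEach(function (row, index) {
    // Runs once per row: index is 0 for the first row, 1 for the second, and so on.
    row.position = index + 1;
});
// rows[0].position is 1 and rows[2].position is 3
```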

Return rows

The return statement is used to stop the execution of a function and return the result of that function. Data processes typically end in this line of code:

return rows;

Define a new value on the row

You can define a new value on row using dot notation. "test1" can be an already existing data column, or one that is written into existence. The code assigns a value (in this case an empty string value). If the variable has an existing value, it will be overwritten by the value you specify. 

row.test1 = "";

You can assign values based on conditions.

if (row.q2_2 == '1' || row.q2_3 == '1') row.test1 = '1';
if (row.q2_2 == '2' || row.q2_3 == '2') row.test1 = '2';
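The sketch below runs these two conditions on three made-up rows. Note what happens on the second row: both conditions are true, and because the statements run in order, the later assignment wins.

```javascript
var rows = [
    { q2_2: "1", q2_3: "0" },    // only the first condition is true
    { q2_2: "1", q2_3: "2" },    // both conditions are true
    { q2_2: "0", q2_3: "0" }     // neither condition is true
];

rows.forEach(function (row) {
    row.test1 = "";              // start from an empty value
    if (row.q2_2 == "1" || row.q2_3 == "1") row.test1 = "1";
    if (row.q2_2 == "2" || row.q2_3 == "2") row.test1 = "2";
});
// rows[0].test1 is "1", rows[1].test1 is "2", rows[2].test1 is ""
```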

Add up several values on the same row

Notice in the code below that we've put a "+" before each value. A + operator immediately before a variable returns the numeric representation of its value. This ensures that each value is treated as a number rather than as a string, so the values are added numerically (2 + 7 = 9) rather than joined together as text ("2" + "7" = "27").

row.test1 = (+row.test1_1) + (+row.test1_2) + (+row.test1_3)
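The following sketch shows what the leading "+" does with a few kinds of values, using a made-up row. It is worth knowing that an empty string becomes 0, while text that is not a number, or a column that does not exist, becomes NaN ("not a number").

```javascript
var row = { test1_1: "2", test1_2: "7", test1_3: "" };    // hypothetical values

var joined = row.test1_1 + row.test1_2;                   // "27": strings are joined
var summed = (+row.test1_1) + (+row.test1_2);             // 9: numbers are added

var blankAsNumber = +row.test1_3;     // 0: an empty string converts to 0
var textAsNumber = +"n/a";            // NaN: not a number
var missingAsNumber = +row.test1_4;   // NaN: the column does not exist (undefined)
```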

Refer to another value on the row

You can refer to another value on the row by using an "if...else" conditional statement.

Next to "if" we have our condition (row.Q6A is greater than or equal to Q6_threshold). If the condition is met, we execute our statement (row.test1 = 1). If the condition is not met (else), then row.test1 = 0.

if (+row.Q6A >= Q6_threshold) row.test1 = 1;
else row.test1 = 0;

Ternary operator

An alternative to the "if...else" statement above is using ternary logic. The ternary operator is the only JavaScript operator that takes three operands.

We have an expression with the condition in parentheses. Directly after the question mark is the value we will assign to row.test1 if the condition is true. Directly following the colon is the value assigned if the condition is false.

row.test1 = (+row.Q6A >= Q6_threshold) ? 1 : 0
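To confirm that the two forms give the same result, this sketch runs both on made-up rows, one above and one below the threshold. The column names "viaIfElse" and "viaTernary" are invented for the comparison.

```javascript
var Q6_threshold = 10;
var rows = [{ Q6A: "12" }, { Q6A: "4" }];    // hypothetical sample data

rows.forEach(function (row) {
    // if...else form
    if (+row.Q6A >= Q6_threshold) row.viaIfElse = 1;
    else row.viaIfElse = 0;

    // ternary form: condition ? value if true : value if false
    row.viaTernary = (+row.Q6A >= Q6_threshold) ? 1 : 0;
});
// rows[0]: both are 1 (12 meets the threshold); rows[1]: both are 0
```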


Below is a practical example of some of the data processing code we reviewed above.

Example code

Here we declare rows as the data table "main". We then iterate over rows to create a wave column and an s5_sum column in the data. Finally, we return rows.

var rows = data["main"]       //Declare rows as main data table

rows.forEach(function(row) {  //Do something for each row...

    row.wave = 1              //Define a new value (wave)  
    row.s5_sum = (+row.s5_1) + (+row.s5_2) + (+row.s5_3)   //Create sum value by adding values on the row
})
return rows;                  //Return the result of the function
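If you would like to try this example outside of Protobi, one option is to wrap it in a function and pass in a small sample table. The function name "runProcess" and the sample data are our own inventions for this sketch, and the shape of "data" is an assumption based on how it is used in this article.

```javascript
// Hypothetical wrapper so the process can run on its own.
function runProcess(data) {
    var rows = data["main"];          // Declare rows as main data table

    rows.forEach(function (row) {     // Do something for each row...
        row.wave = 1;                 // Define a new value (wave)
        // Convert each value to a number before adding
        row.s5_sum = (+row.s5_1) + (+row.s5_2) + (+row.s5_3);
    });

    return rows;                      // Return the result of the function
}

// Made-up sample data: values are stored as strings.
var sampleData = {
    main: [
        { s5_1: "1", s5_2: "2", s5_3: "3" },
        { s5_1: "10", s5_2: "0", s5_3: "5" }
    ]
};

var result = runProcess(sampleData);
// result[0].s5_sum is 6 and result[1].s5_sum is 15
```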
 


Keeping code organized

Indent nested blocks

In the code below, we indent each block of code that is nested inside the "rows.forEach" function. This makes the code easier to read, especially if there are many lines of code.

Include comments

We recommend adding a comment above each section of code that serves a different purpose. Comments help you, and others who view the code later, if the data process becomes complex. It's useful to include your name and the date the code was implemented. A comment starts with " // ".

rows.forEach(function(row) {

    // Tiffany, 03-07-2020, Clear data at q1 of respondents that we want to exclude
    // (respid is converted to a number first, so "001" and 1 both match)
    if (+row.respid === 1) row.q1 = "";
    if (+row.respid === 2) row.q1 = "";
    if (+row.respid === 3) row.q1 = "";

    // Tiffany, 03-07-2020, Blanking out values in Q6_a and Q6_b if zero values in Q6
    if (row.Q6 == 0) row.Q6_a = "";
    if (row.Q6 == 0) row.Q6_b = "";

});
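When the list of respondents to exclude grows, one way to keep the code organized is to hold the IDs in an array and check each row against it. This is only a sketch: the array name "excludedIds", the author name in the comment, and the sample rows are made up, and it assumes respid holds a numeric ID that may be stored as text.

```javascript
// Hypothetical sample rows; in a real process these come from the data table.
var rows = [
    { respid: "001", q1: "3" },
    { respid: "004", q1: "5" }
];

// Respondents to exclude, kept in one place so the list is easy to review.
var excludedIds = [1, 2, 3];

rows.forEach(function (row) {
    // Alex, 03-07-2020, Clear data at q1 of respondents that we want to exclude
    // indexOf returns -1 when the ID is not in the list.
    if (excludedIds.indexOf(+row.respid) !== -1) row.q1 = "";
});
// rows[0].q1 is now "" (ID 1 is excluded); rows[1].q1 is still "5"
```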


Reminder: For data processes, "Save" and "Run" the process after you are done editing the code view. To use the result of the process as the primary data for the project, you will need to set it as "Primary".
Data processes are specific to each project, and your code may not look identical to our example.






