Weights

Set global or respondent-level weights

Conceptual illustration showing a data table with horizontal bars in a window, balanced on a fulcrum with a triangular weight, representing the concept of data weighting in Protobi.

Protobi allows you to weight data with respondent-level weights.  

By default, Protobi counts each respondent equally. In practice we may need to weight some responses or respondents more than others.

Why weight?

A survey sample might not exactly match the population in some aspect. For instance, in a survey of 1000 consumers we collect a sample that has 600 male and 400 female respondents. However, there is other data that tells us the actual market in that category is 50% male and 50% female. 

Gender    Survey    Population
Male      60%       50%
Female    40%       50%

In this case, we might need to differentially weight the responses to match both the population proportions and the sample size. To do this we can:

  • Up-weight female respondents by 1.250 (i.e. 400 x 1.250 = 500)
  • Down-weight male respondents by 0.8333 (i.e. 600 x 0.8333 = 500)
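The arithmetic above can be sketched in a few lines. This is only an illustration, assuming each group's weight is its population share divided by its sample share; the function name `cellWeights` is hypothetical and not part of Protobi.

```javascript
// Hypothetical helper (not a Protobi function):
// weight for each group = population share / sample share.
function cellWeights(sampleShare, populationShare) {
  const result = {};
  for (const group of Object.keys(sampleShare)) {
    result[group] = populationShare[group] / sampleShare[group];
  }
  return result;
}

const genderWeights = cellWeights(
  { Male: 0.6, Female: 0.4 }, // shares observed in the survey
  { Male: 0.5, Female: 0.5 }  // shares known from external market data
);

// Male is approximately 0.8333 and Female approximately 1.25
console.log(genderWeights);
```

Multiplying each group's count by its weight (600 x 0.8333 and 400 x 1.25) gives roughly 500 each, so the weighted total stays at the original sample size of 1000.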

Another use for weighting is weighting by patient volume or purchase volume.   In a survey of physicians about their treatment patterns, some physicians may treat many patients and some fewer. We can weight respondents by the number of patients the physician treats, so that the results project to the patients they treat.
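As a rough illustration of how volume weighting changes a result, the sketch below compares an unweighted share with a share weighted by patient count. The data and the helper `weightedShare` are made up for this example and are not part of Protobi.

```javascript
// Hypothetical respondents: each physician reports a preferred drug and a patient count.
const physicians = [
  { prefers: "A", patients: 200 },
  { prefers: "B", patients: 50 },
  { prefers: "B", patients: 50 },
];

// Share of the total weight held by rows that match a condition.
function weightedShare(rows, matches, weightOf) {
  let matched = 0;
  let total = 0;
  for (const row of rows) {
    const w = weightOf(row);
    total += w;
    if (matches(row)) matched += w;
  }
  return matched / total;
}

const prefersA = (row) => row.prefers === "A";

// Every physician counts once: 1 of 3 physicians prefers A.
const sharePhysicians = weightedShare(physicians, prefersA, () => 1);

// Every physician counts once per patient: 200 of 300 patients.
const sharePatients = weightedShare(physicians, prefersA, (row) => row.patients);

console.log(sharePhysicians, sharePatients);
```

The same answers give about 33% when projected to physicians and about 67% when projected to patients, which is why the choice of weight should follow the population you want to describe.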

Define a weight column or value

In your data file

To weight data differentially in Protobi, your dataset must have a column that has a weight value for each respondent. If the column is not already in your data file, you can add it manually and upload the revised data. You can name the weight field anything, e.g., 'RESP_WT'. In the example above, the column would have the value 1.25 for female respondents, and 0.8333 for male respondents.
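If you prepare the data file with a script rather than by hand, adding the column might look like the sketch below. The row layout and the `gender` field are assumptions for illustration; only the column name 'RESP_WT' and the weight values come from the example above.

```javascript
// Hypothetical rows, as they might look after parsing a data file.
const surveyRows = [
  { id: 1, gender: "Male" },
  { id: 2, gender: "Female" },
  { id: 3, gender: "Male" },
];

// Weight values from the gender example above.
const weightByGender = { Male: 0.8333, Female: 1.25 };

// Copy each row and add a RESP_WT column looked up from the respondent's gender.
const rowsWithWeight = surveyRows.map((row) => ({
  ...row,
  RESP_WT: weightByGender[row.gender],
}));

console.log(rowsWithWeight);
```

The revised rows can then be written back to the file format you upload.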

If your SPSS SAV file identifies a weight column, Protobi should automatically recognize it as a global weight.

In Protobi

Alternatively, you can define a weight as a scalar (e.g. 1.02 or 100), instead of a field name. This can be useful if you need to weight all respondents equally but by some number other than one. This works with both global and individual weights.

Project properties dialog showing the Weight field input containing the scalar value '1.02', highlighted with a red border. Other visible fields include ID field set to 'ID column', Decimals for percents and counts both set to '1', and buttons for Edit JSON and Local history.

Weight by an element

You can theoretically weight by any element in your project by referencing its key in the Weight field.

For instance, we could weight each customer by their annual sales. Using sales as a weight assigns each respondent a weight equal to their answer to this question.

Protobi element view showing the 'sales' variable labeled as 'Sales in thousands'. The panel displays a scrollable list of numeric values (ranging from 0.11 to 6.556) with each row showing 0.6% frequency. A Filter input box and Apply button are visible at the top.

Some questions are not suitable to use as weights. For instance, if you were to weight by the element below, type, you would be assigning weights based on the underlying unformatted values: 74% of respondents would get a weight of "0", and 26% of respondents a weight of "1".

Protobi element showing 'type' variable displaying Vehicle type with two categories: Automobile (73.9%) and Truck (26.1%). The chart includes horizontal blue bars, with statistics showing Mean of 0.261 and N of 157.

Unformatted type values:

Same 'type' element as previous image, but now showing the underlying unformatted values as category labels: '0' (73.9%) and '1' (26.1%), with Mean 0.261 and N 157. This reveals the actual numeric codes stored in the data.

If we were to apply type as a weight to the element below, we would reduce the N size from 157 to 41, effectively discounting all respondents with vehicle type "Automobile" because their underlying value is "0".

Protobi element showing 'Manufacturer' with a scrollable list of car brands and their percentages. Values range from 2.4% (Cadillac, Chrysler, Lincoln, etc.) to 14.6% (Dodge), with blue horizontal bars. The sample size N is 41, and a footer note reads 'Weighted by type' in italics.
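The drop from 157 to 41 follows if the weighted N is taken to be the sum of the weights, which is an assumption consistent with the figures above. The sketch below rebuilds the type codes from the reported percentages; the array is reconstructed for illustration and is not the actual dataset.

```javascript
// Reconstructed for illustration: 157 vehicles, of which 41 are coded 1 (Truck)
// and the remaining 116 are coded 0 (Automobile).
const typeCodes = [...Array(116).fill(0), ...Array(41).fill(1)];

// Unweighted N counts every row once.
const unweightedN = typeCodes.length;

// Assumed definition: weighted N is the sum of the weights,
// so every row with weight 0 drops out of the base.
const weightedN = typeCodes.reduce((sum, w) => sum + w, 0);

console.log(unweightedN, weightedN);
```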

Set individual weights

More properties dialog

For an individual element, specify a weight by selecting "More properties" from the context menu. In the "weight" field, enter the data column to reference.

Element properties dialog for 'manufact' element showing various configuration fields. The 'weight' field is highlighted with a red border and contains the value 'sales', indicating this element is weighted by the sales variable.

You can also enter Excel-like formulas to specify weights. 

Simple text field showing the weight formula '=q1*q2' with label 'weight' and style 'heritable' displayed to the left. The formula demonstrates Excel-like syntax for combining two variables as a weight calculation.

Note: When adding numeric fields in a formula you need to include a "+" sign before the variable name to tell it to add numerically (2+7=9) not alphabetically ("2" + "7" = "27").  
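The difference is the same one you see in plain JavaScript when values are stored as text. The sketch below assumes the answers arrive as strings and uses ordinary JavaScript rather than Protobi's formula field; the variable names are illustrative.

```javascript
// Assumed for illustration: both answers are stored as text.
const q1 = "2";
const q2 = "7";

// Without the leading "+", the operator joins the two strings.
const joined = q1 + q2;

// With a leading "+", each value is converted to a number before adding.
const summed = +q1 + +q2;

console.log(joined, summed); // prints: 27 9
```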

If weights are specified, a default footnote appears; you can overwrite the footnote.

Protobi chart showing question 'q7' asking about favorability opinions of political institutions. Four items are displayed with green arrow indicators: 'The Republican Party' (28.0%), 'The Democratic Party' (27.6%), 'The Supreme Court' (42.5%), and 'Congress' (15.5%), each with blue horizontal bars. The footer shows 'Compact to Very favorable, or Mostly favorable' and 'Weighted by =q1*q2' in red box.

In JSON editor

Weights for an individual element can be set in JSON as well. Select the element and press "Edit JSON..." to modify the element properties in JSON syntax. Define a property "weight" with the name of the weight field.

Weights on individual elements override any global weights. For instance, to avoid weighting the global weight field by itself, we can specify "weight": null.

The example below is from the car_sales.sav dataset and here we're weighting automobiles by the field sales:

JSON editor dialog titled 'Edit element properties)' displaying JSON code for the 'manufact' element. The code shows various properties including 'roundby': null, 'key': 'manufact', 'title': 'Manufacturer', 'type': 'string', and importantly 'weight': 'sales' highlighted in green, demonstrating how to set weights in JSON syntax.
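One way to read the override rule is sketched below. This is an interpretation of the behavior described above, not Protobi's actual code, and the function `effectiveWeight` is hypothetical. It treats an explicit "weight": null as "use no weight" and a missing "weight" property as "fall back to the global weight".

```javascript
// Hypothetical resolver: an element's own "weight" property, if present, wins
// over the global weight. A value of null counts as present and means unweighted.
function effectiveWeight(element, globalWeight) {
  if (Object.prototype.hasOwnProperty.call(element, "weight")) {
    return element.weight;
  }
  return globalWeight;
}

const projectWeight = "sales";

// No weight property: the element inherits the global weight.
const inherited = effectiveWeight({ key: "type" }, projectWeight);

// Own weight property: the element uses its own weight field.
const overridden = effectiveWeight({ key: "manufact", weight: "RESP_WT" }, projectWeight);

// Explicit null: the weight field itself is shown unweighted.
const unweighted = effectiveWeight({ key: "sales", weight: null }, projectWeight);

console.log(inherited, overridden, unweighted);
```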

Set a global weight

You can define a weight field globally that applies to all elements. Press the ☰ icon under the toolbar to edit Project properties. Enter the name of the column specifying weights.

Protobi toolbar showing navigation tabs (Q1-Q10, Q15-Q23, Q24, Q24-Q40, Q46-Q48, Q49-Q55) with the hamburger menu icon (☰) circled in red. Below is a teal button labeled 'Edit project properties' and the current section 'Q1-Q10' with green indicator showing 'Section 1'.

This will bring up the Project properties dialog where you can specify a global weight:

Project properties dialog with the 'Weight field' containing 'sales' highlighted with a red border. The dialog shows various project settings including P-value (0.05), Toolbar position (top), Crosstab significance tests (Complement), Show bars ((default)), ID field (ID column), and decimal settings (both set to 1).

Toggle weights on/off

If a global weight field is defined for your dataset (e.g., using data column S8), you'll see a new toolbar button  that specifies the weight that the project is using. 

Protobi toolbar showing five buttons in a row: hamburger menu (☰), 'Global' (blue), 'S8' (green), 'N=3688' (white), 'Clear' (light blue), and 'Set base' (blue). The green 'S8' button indicates the active weight scheme being applied.

You can press this to toggle weights on or off. Its name will change from "Weighted" to "Unweighted" so you can quickly see if the results are weighted or unweighted. "Weighted" is the default.

Define more than one global weight scheme

Protobi can include multiple weight schemes. For instance, a study may weight data one way to project results to the population of patients and another way to project results to the population of physicians.

Protobi looks for a special group element with the key $weights, and interprets each child element of this group as a variable that can be used as weights. Each appears in the Weights dropdown.

Press the + button at the end of the list of tabs, and enter "$weights".

Small button with a plus (+) symbol on a light gray background, representing the add button at the end of the tabs list used to create or find special groups like $weights.

This will find or create a group with that key. Within this group add child elements corresponding to each weight column. One way is to drag weight elements from the tree on the left and drop them into the $weights group (optionally hold the Shift key when dropping to copy rather than move).

Alternatively, you can directly specify weight columns by editing the JSON for the $weights group. For example:

{
    "roundby": "auto",
    "key": "$weights",
    "children": [
        "S8v1",
        "S1",
        "S2"
    ],
    "type": "empty"
}

Now the dropdown menu in the Weight button in the toolbar will contain all the children as global weight schemes:

Dropdown menu from the weight toggle button showing 'S8v1' selected (green button at top), followed by options: 'Toggle weights off', 'SELECT WEIGHT FIELD' header, 'S8v1' (highlighted in light blue), 'S1', and 'S2'. The menu demonstrates multiple weight scheme options.

Weight to multiple target characteristics

The weighting examples above are applicable when you only want to apply one weight scheme to the project at a time. However, you might want to weight to multiple target characteristics. We can do this using the Random Iterative Method, also known as RIM weighting or raking.

For example, a survey sample may have set quotas to equally sample physicians and nurses, but in the target market nurses may be 70% of customers.

Role         Survey    Population
Physician    50%       30%
Nurse        50%       70%

Similarly, the distribution of region in the survey may differ from the target population.

Region    Survey sample    Population
East      40%              30%
West      60%              70%

It is possible to calculate one weight scheme that adjusts the distribution for more than one variable.  This is a little more complex, as setting weights for one variable may affect the distribution of other correlated variables.   

For this you can use an iterative algorithm called "RIM weighting", described here and also in the accordion below. This gist shows a function that calculates weights for selected variables and runs in a data process.

Example RIM weighting algorithm

Protobi.get_tables(["main", "OE"], function(err, data) {
    if (err) return callback(err);

    Protobi.get_elements(function(err, protobi) {
        if (err) return callback(err);

        var rows = data["main"]; // primary data file
        protobi.setData(rows);

        // Target proportions for each variable, keyed by the variable's values
        Protobi.calculate_rim_weights(
            protobi,
            'weight',
            {
                "s3": {
                    "1": 0.49,
                    "2": 0.51
                },
                "s2": {
                    "0": 0.84,
                    "1": 0.16
                },
                "region": {
                    "Northeast": 0.17,
                    "Midwest": 0.21,
                    "West": 0.24,
                    "South": 0.38
                }
            }
        );
        return callback(null, rows);
    });
});
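To make the iteration itself concrete, here is a minimal standalone sketch of raking in plain JavaScript. It is not the code behind Protobi.calculate_rim_weights, whose internals are not shown here. The function `rakeWeights`, the helper `shareOf`, and the ten sample rows are all hypothetical, and the sketch assumes every value in the data has a target share and that each variable's shares sum to 1.

```javascript
// Hypothetical raking sketch: repeatedly rescale the weights so that each
// variable's weighted distribution matches its target shares in turn.
function rakeWeights(rows, targets, iterations = 50) {
  const weights = rows.map(() => 1); // start with every respondent counted once
  for (let iter = 0; iter < iterations; iter++) {
    for (const [field, shares] of Object.entries(targets)) {
      // Current weighted total for each category of this variable
      const totals = {};
      let grand = 0;
      rows.forEach((row, i) => {
        totals[row[field]] = (totals[row[field]] || 0) + weights[i];
        grand += weights[i];
      });
      // Scale each row by (target share / current share) of its category
      rows.forEach((row, i) => {
        const currentShare = totals[row[field]] / grand;
        weights[i] *= shares[row[field]] / currentShare;
      });
    }
  }
  return weights;
}

// Made-up sample: 50% physicians and 40% East, with role and region correlated.
const respondents = [
  { role: "Physician", region: "East" }, { role: "Physician", region: "East" },
  { role: "Physician", region: "East" }, { role: "Physician", region: "West" },
  { role: "Physician", region: "West" }, { role: "Nurse", region: "East" },
  { role: "Nurse", region: "West" }, { role: "Nurse", region: "West" },
  { role: "Nurse", region: "West" }, { role: "Nurse", region: "West" },
];

const rimWeights = rakeWeights(respondents, {
  role: { Physician: 0.3, Nurse: 0.7 },
  region: { East: 0.3, West: 0.7 },
});

// Weighted share of respondents with a given value
function shareOf(field, value) {
  let hit = 0;
  let total = 0;
  respondents.forEach((row, i) => {
    total += rimWeights[i];
    if (row[field] === value) hit += rimWeights[i];
  });
  return hit / total;
}

console.log(shareOf("role", "Nurse"), shareOf("region", "West"));
```

Adjusting one variable disturbs the other, which is why the loop repeats: each pass brings both distributions closer to their targets. A fixed iteration count keeps the sketch short; a production version would more likely stop once the weights change by less than a tolerance.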


Support

Weighting can get complex and each firm has its own approaches. Our support team is ready to help you with your specific goals. We can help you set up code that does what you need and show you how to modify it from there. Please contact us at support@protobi.com.

Video Tutorial

Weight data