Baseline data

Set the baseline scenario for comparisons

Comparing results to baseline

Pressing on any value in Protobi creates a filter that applies to every element. This enables you to see how respondents who answered one question answered other questions. Press anywhere to filter everywhere.

Protobi automatically compares every result to a baseline distribution so that you can immediately see important differences and quickly test hypotheses.

But how do we judge what's different? That's typically the first question, and it's what the baseline is for. This tutorial shows you how to view your baseline, then explains how to interpret it.

Quick start guide:

  • Press to create filters for defining one scenario (see Press to query)
  • Press "Set base" to set the current scenario as the baseline
  • Press to create filters for another scenario
  • The solid bars now represent your new scenario
  • The thin black outlines represent your baseline scenario
  • Triangles indicate significant differences between current and baseline
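
The steps above can be sketched as a small Python model. This is an illustration only, not Protobi's API: the Session class, its method names, and the eight toy respondents are all hypothetical, and each press here replaces the previous filter to keep the sketch short.

```python
from collections import Counter


class Session:
    """Toy model of the press / Set base workflow (hypothetical, not Protobi's API)."""

    def __init__(self, rows):
        self.rows = rows      # list of dicts, one per respondent
        self.current = []     # filters defining the current scenario
        self.baseline = []    # filters defining the baseline (empty = all respondents)

    def press(self, field, value):
        # Pressing a value keeps respondents who gave that answer.
        self.current = [(field, value, True)]

    def shift_press(self, field, value):
        # Shift+press keeps respondents who did NOT give that answer.
        self.current = [(field, value, False)]

    def set_base(self):
        # "Set base" copies the current filters into the baseline.
        self.baseline = list(self.current)

    def _select(self, filters):
        return [r for r in self.rows
                if all((r[f] == v) == keep for f, v, keep in filters)]

    def distribution(self, field, which="current"):
        filters = self.current if which == "current" else self.baseline
        rows = self._select(filters)
        if not rows:
            return {}
        counts = Counter(r[field] for r in rows)
        return {answer: n / len(rows) for answer, n in counts.items()}


# Hypothetical toy data: 4 respondents in "Excellent" health, 4 not.
rows = (
    [{"q1": "Very happy", "q2": "Excellent"}] * 3
    + [{"q1": "Pretty happy", "q2": "Excellent"}]
    + [{"q1": "Very happy", "q2": "Good"}]
    + [{"q1": "Pretty happy", "q2": "Good"}] * 2
    + [{"q1": "Not too happy", "q2": "Poor"}]
)

s = Session(rows)
s.press("q2", "Excellent")        # define one scenario
s.set_base()                      # make it the baseline
s.shift_press("q2", "Excellent")  # define another scenario

current = s.distribution("q1")                # solid bars
baseline = s.distribution("q1", "baseline")   # thin black outlines
```

In this toy data, the current scenario (not "Excellent") and the baseline ("Excellent") are disjoint groups, which is the arrangement the rest of this tutorial builds toward.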

Initial view

When you first open a dataset in Protobi, it displays results for all respondents. Below are the first two questions from the Gender and Generations Survey from Pew Research.

We can see 30.5% of all respondents say they are "Very happy," and 25.4% say they are in "Excellent" health.

Protobi data view showing two survey questions side by side. Question q1 asks about happiness levels with responses ranging from "Very happy" (30.5%) to "Not too happy" (15.4%). Question q2 asks about health ratings with responses from "Excellent" (25.4%) to "Poor" (4.9%). Each displays horizontal blue bar charts with percentages, N=2511 respondents.

Press to filter

Press on "Excellent" in q2 to show results for just those respondents. Of the respondents who rated their health "Excellent," 49.6% also said they are "Very happy."

Filtered Protobi view showing only respondents in "Excellent" health (N=637). Question q1 now shows 49.6% are "Very happy" with thin black outlines behind the blue bars indicating baseline comparison. Question q2 shows "Excellent" at 100.0% in yellow/gold, with other health categories at 0.0%, demonstrating the active filter. Blue triangles appear on several bars indicating significant differences from baseline.

Notice that there are now thin black outlines as well as solid color bars.

The percentages and solid color bars reflect the current scenario (i.e., only those in "Excellent" health). The thin black outlines reflect the baseline scenario (which initially includes all respondents).

You can hover over a value to see its baseline frequency. Here, of those in "Excellent" health, 49.6% are "Very happy" compared to 30.5% for all respondents.

Close-up view of question q1 with a blue tooltip displayed over the "Very happy" bar. The tooltip shows "current: 49.6%" and "baseline: 30.5%" comparing the filtered data to the original baseline. The bar has a blue triangle indicator showing statistical significance.

Protobi shows blue triangles wherever the current scenario is significantly different from baseline. Here, the triangle indicates that 49.6% is significantly higher than 30.5%.

Protobi is smart enough to recognize that respondents in "Excellent" health are a subset of all respondents, so in this case the triangle is really comparing respondents in "Excellent" health to those who are not in "Excellent" health.
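
To make the subset comparison concrete, here is a minimal sketch in Python. It assumes a pooled two-proportion z-test with a two-sided 5% threshold; this section does not say which test Protobi actually applies, so treat the test, the threshold, and the function names as assumptions. The inputs are the figures quoted above (49.6% of 637 versus 30.5% of all 2511 respondents).

```python
import math


def complement_share(p_all, n_all, p_sub, n_sub):
    """Share among respondents outside the subset, recovered from the totals."""
    return (p_all * n_all - p_sub * n_sub) / (n_all - n_sub)


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic for two independent groups."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# "Very happy" share among those NOT in "Excellent" health, implied by the totals.
p_rest = complement_share(0.305, 2511, 0.496, 637)

# Compare the subset to its complement rather than to the overlapping total.
z = two_proportion_z(0.496, 637, p_rest, 2511 - 637)
significant = abs(z) > 1.96  # assumed two-sided 5% threshold
```

The implied share for the rest of the sample works out to roughly 24%, and under these assumptions the difference is comfortably significant. Testing the subset against its complement avoids counting the same respondents in both groups.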

Set current filters as baseline

But suppose we want a strict comparison between non-overlapping groups: respondents who are NOT in "Excellent" health versus those who ARE in "Excellent" health.

Press the toolbar button "Set base" to make the baseline scenario equal to the current scenario:

Protobi toolbar showing multiple action buttons. The "Set base" button is highlighted with a red border, positioned between "Clear" and "#/%" buttons. Other visible buttons include N=637 (sample size), Format, [NA], Crosstab, Export, Save, and Scenarios.

Protobi view after setting baseline to "Excellent" health respondents. Question q1 shows happiness distribution (49.6% Very happy, 42.7% Pretty happy, 6.0% Not too happy, 1.7% Don't know/Refused) with N=637. Question q2 displays "Excellent" at 100.0% in yellow with no baseline outlines or triangles, as current and baseline are now identical.

Select other filters

Now shift+press on "Excellent" to select those respondents who are NOT in "Excellent" health:

Protobi comparison view showing respondents NOT in "Excellent" health (N=1874) versus baseline of those in "Excellent" health. Question q1 shows 24.0% Very happy (vs 49.6% baseline), 53.6% Pretty happy, 18.6% Not too happy with blue triangles indicating significant differences. Question q2 shows 0.0% Excellent, 69.3% Good in yellow, 23.8% Only fair, 6.5% Poor, with baseline comparisons visible.

As shown above, we're now running a strict comparison between two distinct groups: those who are NOT in "Excellent" health (current scenario, solid bars) versus those who ARE in "Excellent" health (baseline scenario, thin black outlines).

See also

Baseline scenarios are great for comparing specific subsets. But if your goal is to systematically compare every value, another approach is to create a crosstab:

Protobi crosstab view showing question q1 (happiness) cross-tabulated by q2 (health). The table displays happiness responses in rows (Very happy, Pretty happy, Not too happy, Don't know/Refused) and health categories in columns (Overall, Excellent, Good, Only fair, Poor, Don't know/Refused). Cells show percentages with blue shading indicating higher values, with N values at bottom and Chi Square statistic of 416.9 (p=0.000).
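
For readers curious about the statistic reported under a crosstab, the sketch below computes a Pearson chi-square statistic from a table of counts in plain Python. It is an assumption that this is the variant Protobi reports, and the counts are hypothetical, not the Pew data, so the result will not match the value in the screenshot.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows = happiness answer, columns = health answer.
counts = [
    [30, 10],  # Very happy:   Excellent, Not excellent
    [10, 30],  # Other answer: Excellent, Not excellent
]
stat, dof = chi_square(counts)
```

A larger statistic relative to the degrees of freedom indicates that the row and column answers are less likely to be independent.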