Package 'webgazeR'

Title: Tools for Processing Webcam Eye Tracking Data
Description: A companion package to gazeR. Functions for reading and pre-processing webcam eye tracking data.
Authors: Jason Geller
Maintainer: Jason Geller <[email protected]>
License: GPL-3
Version: 0.10.3
Built: 2026-06-03 06:51:39 UTC
Source: https://github.com/jgeller112/webgazeR

Help Index


Analyze Sampling Rates (Trial and Subject Levels, with Histogram)

Description

Computes the sampling rate for each subject × trial combination and summarizes it at the subject level. Only distinct time points are counted, so duplicated time values do not inflate the rate. Also plots a histogram of the subject-level sampling rates.

Usage

analyze_sampling_rate(eye_data, summary_stat = "Median")

Arguments

eye_data

A dataframe with subject, trial, time columns.

summary_stat

Either "median" (default; spelled "Median" in Usage) or "mean".

Value

A tibble with the columns subject, trial, SR_trial, and SR_subject.
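
The computation described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internals: it assumes time is in milliseconds and that the rate is the number of distinct samples divided by the trial duration in seconds. The toy data and the helper 'sr_one_trial' are hypothetical.

```r
# Toy data (hypothetical): two subjects, one trial each, time in ms.
eye <- data.frame(
  subject = c(1, 1, 1, 1, 1, 2, 2, 2),
  trial   = c(1, 1, 1, 1, 1, 1, 1, 1),
  time    = c(0, 50, 50, 100, 150, 0, 100, 200)  # note the duplicated 50
)

# Keep one row per distinct (subject, trial, time).
eye_distinct <- unique(eye[, c("subject", "trial", "time")])

# Assumed formula: distinct samples / trial duration in seconds.
sr_one_trial <- function(t) length(t) / ((max(t) - min(t)) / 1000)

sr_trial <- aggregate(time ~ subject + trial, data = eye_distinct,
                      FUN = sr_one_trial)
names(sr_trial)[names(sr_trial) == "time"] <- "SR_trial"

# Subject-level summary: median of that subject's trial-level rates.
sr_trial$SR_subject <- ave(sr_trial$SR_trial, sr_trial$subject, FUN = median)
sr_trial
```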


Assign coordinates to areas of interest

Description

Takes a data frame of gaze positions (or other locations), plus screen size and aoi size (or location), and computes the area of interest (AOI) for each location. Defaults assume standard four-corner design.

Usage

assign_aoi(
  gaze,
  screen_size = c(1024, 768),
  aoi_size = c(400, 300),
  aoi_loc = NULL,
  X = "CURRENT_FIX_X",
  Y = "CURRENT_FIX_Y"
)

Arguments

gaze

data frame containing positions

screen_size

size of the screen in pixels. Defaults to c(1024, 768) and assumes reversed vertical (i.e., [0,0] is top left).

aoi_size

size of AOIs in pixels. Defaults to a c(400, 300) width-height pair and assumes AOIs are in screen corners. AOIs will be coded numerically 1 to 4 in reading order (left to right, top to bottom), with 0 as center location.

aoi_loc

location of rectangular AOIs. Use as alternative to aoi_size for non-corner AOIs. Each AOI location should be a separate row in a data frame that has variables xmin, xmax, ymin, and ymax. Assumes reversed vertical (i.e., [0,0] is top left). AOIs will be coded numerically in row order.

X

name of variable containing X coordinates. Defaults to "CURRENT_FIX_X"

Y

name of variable containing Y coordinates. Defaults to "CURRENT_FIX_Y"

Value

Original gaze data frame with AOI column added. Non-AOI and off-screen gazes are marked NA.
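
For the 'aoi_loc' case, the lookup amounts to a point-in-rectangle test coded by row order. The base-R sketch below illustrates that idea; the AOI table, the gaze samples, the helper 'which_aoi', and the choice to treat rectangle edges as inside are all assumptions made for the example.

```r
# Hypothetical AOI table in the aoi_loc layout: one row per AOI.
aoi_loc <- data.frame(
  xmin = c(0, 624),
  xmax = c(400, 1024),
  ymin = c(0, 0),
  ymax = c(300, 300)
)

# Hypothetical gaze samples in pixels; [0, 0] is the top-left corner.
gaze <- data.frame(
  CURRENT_FIX_X = c(100, 900, 512, -20),
  CURRENT_FIX_Y = c(100, 150, 384, 50)
)

# Return the row index of the first AOI containing (x, y), or NA if none does.
# Assumption: points on a rectangle edge count as inside.
which_aoi <- function(x, y, aois) {
  hit <- which(x >= aois$xmin & x <= aois$xmax &
               y >= aois$ymin & y <= aois$ymax)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

gaze$AOI <- mapply(which_aoi, gaze$CURRENT_FIX_X, gaze$CURRENT_FIX_Y,
                   MoreArgs = list(aois = aoi_loc))
gaze
```

The third sample falls between the two rectangles and the fourth is off-screen, so both receive NA.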


Example behavioural data from a Gorilla experiment

Description

Trial-level behavioural output from the same Gorilla experiment as eyedata. Used in the vignette to illustrate merging behavioural responses with gaze samples.

Usage

behav_data

Format

A data frame with 66,350 rows and 78 columns. Notable columns include participant and session identifiers (e.g., Participant.Public.ID, Participant.Private.ID), trial and spreadsheet metadata (Trial.Number, Spreadsheet.Row, Zone.Name), response data (Reaction.Time, Response, Correct), trial design columns (ANSWER, targetword, soundfile, trialtype, the tl*/tr*/bl*/br* picture and code columns), and a subject identifier matched to eyedata.

Source

Gorilla Experiment Builder (https://gorilla.sc/).


Compute Intersubject Correlation (ISC)

Description

Computes ISC using either the **pairwise correlation** method or the **leave-one-out** method. The pairwise method computes a full correlation matrix and averages Fisher z-transformed values. The leave-one-out method computes the correlation of each participant's time series with the mean of all others.

Usage

calculate_isc(
  data_matrix,
  method = c("pairwise", "leave-one-out"),
  cor_method = c("pearson", "spearman"),
  return_matrix = FALSE
)

Arguments

data_matrix

A numeric matrix or data.frame where rows represent time points and columns represent participants.

method

A string specifying the ISC computation method. Options are '"pairwise"' (default) or '"leave-one-out"'.

cor_method

Correlation type passed to [stats::cor()]. One of '"pearson"' (default) or '"spearman"'.

return_matrix

Logical. If 'TRUE' and 'method = "pairwise"', return the full participant-by-participant correlation matrix (self-correlations set to NA) instead of a summary vector. Ignored for 'method = "leave-one-out"'.

Value

A named numeric vector of ISC values (one per participant), or a correlation matrix if 'return_matrix = TRUE'.
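
The two methods can be sketched in a few lines of base R. This is an illustration on simulated data, not the package's code; in particular, converting the averaged Fisher z values back to r with 'tanh()' is an assumption, since the description above only states that z values are averaged.

```r
set.seed(1)
# Hypothetical data: 50 time points (rows) x 4 participants (columns).
shared <- rnorm(50)
data_matrix <- sapply(1:4, function(i) shared + rnorm(50, sd = 0.5))
colnames(data_matrix) <- paste0("P", 1:4)

# Pairwise: full correlation matrix with self-correlations set to NA,
# then average Fisher z values per participant (back-transform assumed).
r_mat <- cor(data_matrix, method = "pearson")
diag(r_mat) <- NA
isc_pairwise <- tanh(rowMeans(atanh(r_mat), na.rm = TRUE))

# Leave-one-out: correlate each participant with the mean of all others.
isc_loo <- sapply(seq_len(ncol(data_matrix)), function(i) {
  cor(data_matrix[, i], rowMeans(data_matrix[, -i, drop = FALSE]))
})
names(isc_loo) <- colnames(data_matrix)

isc_pairwise
isc_loo
```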


Downsample gaze data

Description

This function combines gaze samples into time bins and optionally aggregates the data.

Usage

downsample_gaze(
  dataframe,
  bin.length = 50,
  timevar = "time",
  aggvars = c("subject", "condition", "target", "trial", "object", "time_bin")
)

Arguments

dataframe

DataFrame containing gaze data.

bin.length

Length of time bins (in milliseconds).

timevar

Column name representing time.

aggvars

Vector of variable names to group by for aggregation. Use "none" to skip aggregation.

Value

DataFrame with time bins added and optionally aggregated data.
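
A minimal base-R sketch of binning followed by aggregation is given below. The binning rule (labelling each sample with the lower edge of its bin) and the 'on_target' column are assumptions for illustration; the package may define bins differently.

```r
# Hypothetical samples: time in ms, on_target is a 0/1 look indicator.
gaze <- data.frame(
  subject   = 1,
  trial     = 1,
  time      = c(0, 20, 40, 60, 80, 100, 120),
  on_target = c(0, 0, 1, 1, 1, 0, 1)
)

bin_length <- 50  # ms

# Assumed binning rule: label each sample with the lower edge of its bin.
gaze$time_bin <- floor(gaze$time / bin_length) * bin_length

# Aggregate within subject x trial x time_bin: proportion of target looks.
binned <- aggregate(on_target ~ subject + trial + time_bin,
                    data = gaze, FUN = mean)
binned
```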


Extract AOI-related Columns from Webcam Files and Calculate Locations

Description

This function reads in multiple Gorilla webcam files, extracts the 'loc', 'x_normalised', 'y_normalised', 'width_normalised', and 'height_normalised' columns, and calculates the bounding box coordinates for the AOIs. It also rounds all numeric columns to 3 decimal places.

Usage

extract_aois(file_paths, zone_names = NULL)

Arguments

file_paths

A list of file paths to webcam files (in .xlsx format).

Value

A dataframe containing distinct rows with AOI-related columns and calculated coordinates.
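
The bounding-box step can be illustrated as follows. The sketch assumes that the normalised (x, y) pair marks the corner from which width and height extend; that convention, the toy zone values, and the output column names 'xmin', 'xmax', 'ymin', 'ymax' are assumptions rather than documented behaviour.

```r
# Hypothetical zone rows with normalised position and size.
zones <- data.frame(
  loc               = c("TL", "BR"),
  x_normalised      = c(0.0312, 0.7188),
  y_normalised      = c(0.5000, 0.0000),
  width_normalised  = c(0.2500, 0.2500),
  height_normalised = c(0.3333, 0.3333)
)

# Assumption: (x, y) is the corner from which width and height extend.
zones$xmin <- zones$x_normalised
zones$xmax <- zones$x_normalised + zones$width_normalised
zones$ymin <- zones$y_normalised
zones$ymax <- zones$y_normalised + zones$height_normalised

# Round every numeric column to 3 decimal places and drop duplicate rows.
is_num <- vapply(zones, is.numeric, logical(1))
zones[is_num] <- lapply(zones[is_num], round, digits = 3)
zones <- unique(zones)
zones
```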


Example webcam eye-tracking data

Description

A sample of webcam-based eye-tracking gaze samples exported from a Gorilla visual-world-paradigm experiment. Used to demonstrate the preprocessing pipeline in the package vignette.

Usage

eyedata

Format

A data frame with 104,823 rows and 24 columns. Key columns include:

subject

Participant identifier.

trial

Trial number within a participant.

time

Sample time within a trial (ms).

time_stamp

Raw sample timestamp.

type

Sample type (e.g., prediction, calibration).

screen_index

Screen index within the trial.

x_pred, y_pred

Predicted gaze coordinates in screen pixels.

x_pred_normalised, y_pred_normalised

Predicted gaze coordinates normalised to [0, 1].

convergence

WebGazer convergence value for the prediction.

face_conf

Face-detection confidence for the sample.

zone_name

Name of the interest area (zone) the sample belongs to.

zone_x, zone_y, zone_width, zone_height

Zone position and size in pixels.

zone_x_normalised, zone_y_normalised, zone_width_normalised, zone_height_normalised

Normalised zone position and size.

filename

Source data file the sample came from.

Source

Gorilla Experiment Builder (https://gorilla.sc/).


Filter or Label Data Based on Sampling Rate Threshold

Description

This function applies a sampling rate threshold and either removes or labels "bad" subjects/trials based on their sampling rates.

Usage

filter_sampling_rate(
  data,
  threshold,
  action = c("remove", "label"),
  by = c("subject", "trial", "both")
)

Arguments

data

A dataframe with columns: subject, trial, SR_subject, SR_trial.

threshold

Numeric. Sampling rate threshold to apply.

action

"remove" (default) to delete bad data or "label" to flag bad data.

by

"subject", "trial", or "both" to specify where to apply the threshold.

Value

A dataframe with either rows removed or bad subjects/trials labeled.
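
The difference between removing and labelling can be sketched in base R. The example assumes that "bad" means a sampling rate below the threshold, and the flag column 'is_bad_trial' is a hypothetical name; the package's own comparison and column names may differ.

```r
# Hypothetical output of a sampling-rate step.
sr <- data.frame(
  subject    = c(1, 1, 2, 2),
  trial      = c(1, 2, 1, 2),
  SR_subject = c(22, 22, 4, 4),
  SR_trial   = c(25, 19, 3, 5)
)
threshold <- 5

# Assumption: a subject or trial is "bad" when its rate is below the threshold.
bad_subject <- sr$SR_subject < threshold
bad_trial   <- sr$SR_trial < threshold

# Like action = "remove", by = "both": drop rows failing either check.
removed <- sr[!(bad_subject | bad_trial), ]

# Like action = "label", by = "trial": keep every row and flag bad trials.
labelled <- transform(sr, is_bad_trial = bad_trial)
```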


Find Image Location in a Given Set of Locations

Description

Returns the name of the location at which a given image appears in a set of locations, or NA if the image is not found.

Usage

find_location(locations, image)

Arguments

locations

A character vector of image names at each location.

image

A character string: the image to locate.

Value

A character string representing the location name or NA if not found.
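
The lookup can be pictured with a named character vector. The sketch below is an assumption about how locations and names relate (names are locations, values are image files); 'locate_image' is a hypothetical helper, not the package function.

```r
# Hypothetical trial layout: names are locations, values are image files.
locations <- c(TL = "cat.jpg", TR = "hat.jpg", BL = "dog.jpg", BR = "log.jpg")

# Return the name of the first location holding `image`, or NA if absent.
locate_image <- function(locations, image) {
  idx <- match(image, locations)
  if (is.na(idx)) NA_character_ else names(locations)[idx]
}

locate_image(locations, "dog.jpg")
locate_image(locations, "pig.jpg")
```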


Compute Gaze Dispersion

Description

Calculates centroid and dispersion from gaze data.

Usage

gaze_dispersion(data, x, y, grouping_vars)

Arguments

data

A data frame or tibble.

x

A string: name of the column with X gaze coordinates.

y

A string: name of the column with Y gaze coordinates.

grouping_vars

A character vector of column names to group by (e.g., subject, condition, trial).

Value

A tibble with centroid_x, centroid_y, dispersion, and log_dispersion for each group.
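
The help text does not define dispersion, so the sketch below picks one common choice as an assumption: the root-mean-square distance of samples from the group centroid, with 'log_dispersion' as its natural log. The toy data and the helper 'dispersion_one' are hypothetical.

```r
# Hypothetical normalised gaze samples for two subjects.
gaze <- data.frame(
  subject = c(1, 1, 1, 1, 2, 2, 2, 2),
  x = c(0.4, 0.6, 0.4, 0.6, 0.1, 0.9, 0.1, 0.9),
  y = c(0.4, 0.4, 0.6, 0.6, 0.1, 0.1, 0.9, 0.9)
)

dispersion_one <- function(d) {
  cx <- mean(d$x, na.rm = TRUE)
  cy <- mean(d$y, na.rm = TRUE)
  # Assumed definition: RMS distance of samples from the centroid.
  disp <- sqrt(mean((d$x - cx)^2 + (d$y - cy)^2, na.rm = TRUE))
  data.frame(subject = d$subject[1], centroid_x = cx, centroid_y = cy,
             dispersion = disp, log_dispersion = log(disp))
}

# One row per group (here grouped by subject only).
result <- do.call(rbind, lapply(split(gaze, gaze$subject), dispersion_one))
result
```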


Calculate Out-of-Bounds Proportion by Subject and Trial

Description

This function calculates the number and percentage of gaze points that fall outside the screen dimensions, and optionally removes only the out-of-bounds gaze points.

Usage

gaze_oob(
  data,
  subject_col = "subject",
  trial_col = "trial",
  x_col = "x",
  y_col = "y",
  screen_size = c(1, 1),
  remove = FALSE
)

Arguments

data

A data frame containing gaze data.

subject_col

A string specifying the name of the column that contains the subject identifier. Default is "subject".

trial_col

A string specifying the name of the column that contains the trial identifier. Default is "trial".

x_col

A string specifying the name of the column that contains the X coordinate. Default is "x".

y_col

A string specifying the name of the column that contains the Y coordinate. Default is "y".

screen_size

A numeric vector of length 2 specifying the screen width and height. Default is c(1, 1) assuming normalized coordinates.

remove

Logical; if TRUE, removes points outside of screen dimensions. Default is FALSE.

Value

A list containing:

subject_results

Summary of missingness at the subject level.

trial_results

Summary of missingness at the trial level.

data_clean

Dataset with optional removal of out-of-bounds points and missingness annotations.
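
The out-of-bounds check itself is a simple range test. The base-R sketch below illustrates it on toy data with normalised coordinates; the 'oob' column name and the convention that points exactly on the screen edge count as in bounds are assumptions.

```r
# Hypothetical normalised gaze samples.
gaze <- data.frame(
  subject = c(1, 1, 1, 1, 2, 2),
  trial   = c(1, 1, 2, 2, 1, 1),
  x = c(0.2, 1.3, 0.5, 0.9, -0.1, 0.4),
  y = c(0.5, 0.5, 0.7, 0.2, 0.3, 0.6)
)
screen_size <- c(1, 1)

# A point is out of bounds if it lies outside [0, width] x [0, height].
gaze$oob <- gaze$x < 0 | gaze$x > screen_size[1] |
            gaze$y < 0 | gaze$y > screen_size[2]

# Percentage of out-of-bounds points per subject and per subject x trial.
pct <- function(v) 100 * mean(v)
subject_pct <- aggregate(oob ~ subject, data = gaze, FUN = pct)
trial_pct   <- aggregate(oob ~ subject + trial, data = gaze, FUN = pct)

# Like remove = TRUE: keep only the in-bounds points.
gaze_clean <- gaze[!gaze$oob, ]
```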


Interpolate missing gaze data (X and Y) within trials, with optional max gap

Description

Fills missing X and Y gaze values by interpolation within each trial, with samples ordered by time. Runs of more than 'max_gap' consecutive missing samples are left as NA.

Usage

interpolate_gaze(
  x,
  x_col = "Gaze_X",
  y_col = "Gaze_Y",
  trial_col = "Trial",
  subject_col = "Subject",
  time_col = "Time",
  max_gap = Inf
)

Arguments

x

A data frame containing gaze data.

x_col

The name of the X gaze column (as string).

y_col

The name of the Y gaze column (as string).

trial_col

The name of the trial column (default = "Trial").

subject_col

The name of the subject column (default = "Subject").

time_col

The name of the time column used for sorting (default = "Time").

max_gap

Maximum number of consecutive missing samples to interpolate. Gaps larger than this remain NA (default = Inf).

Value

A tibble with interpolated gaze X and Y columns (replacing originals).
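
The effect of 'max_gap' can be illustrated on a single trial's X values. The sketch assumes linear interpolation via 'stats::approx()', which the help text does not confirm, and 'interp_max_gap' is a hypothetical helper written for this example.

```r
# Linearly interpolate NAs in `v` over `time`, but only for runs of
# at most `max_gap` consecutive missing samples (assumed rule).
interp_max_gap <- function(v, time, max_gap = Inf) {
  ok <- !is.na(v)
  if (sum(ok) < 2) return(v)
  # rule = 1 leaves leading and trailing NAs untouched.
  filled <- approx(time[ok], v[ok], xout = time, rule = 1)$y
  runs <- rle(!ok)
  # TRUE for samples inside a missing run longer than max_gap.
  too_long <- rep(runs$values & runs$lengths > max_gap, runs$lengths)
  filled[too_long] <- NA
  filled
}

time   <- seq(0, 350, by = 50)
gaze_x <- c(0.10, NA, 0.30, NA, NA, NA, 0.70, 0.80)
interp_max_gap(gaze_x, time, max_gap = 2)
```

With 'max_gap = 2', the single missing sample is filled and the run of three stays NA. In practice the same function would be applied separately within each subject and trial.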


Standardize a Dataframe to WebGazer Conventions

Description

This function takes a dataframe and renames columns to match WebGazer conventions: subject, trial, time, x, y. All other columns are preserved.

Usage

make_webgazer(
  data,
  col_map = list(subject = "subject", trial = "trial", time = "time", x = "x", y = "y")
)

Arguments

data

A dataframe containing gaze data.

col_map

A named list mapping your current columns to WebGazer names: 'subject', 'trial', 'time', 'x', 'y'.

Value

A dataframe with the mapped columns renamed; all other original columns are preserved.
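
The renaming can be pictured with a short base-R sketch. The input column names below are hypothetical, and the code only mimics the mapping idea (standard name on the left, current column name on the right); it is not the package's implementation.

```r
# Hypothetical raw data with platform-specific column names.
raw <- data.frame(
  participant_id = c("p1", "p1"),
  trial_index    = c(1, 1),
  time_elapsed   = c(0, 33),
  gaze_x         = c(0.51, 0.52),
  gaze_y         = c(0.48, 0.47),
  condition      = c("related", "related")
)

# Standard name on the left, current column name on the right.
col_map <- list(subject = "participant_id", trial = "trial_index",
                time = "time_elapsed", x = "gaze_x", y = "gaze_y")

# Rename only the mapped columns; everything else is left untouched.
standardised <- raw
idx <- match(unlist(col_map), names(standardised))
names(standardised)[idx] <- names(col_map)
names(standardised)
```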


Merge and Process Webcam Eye-Tracking Files

Description

Read, merge, and standardize webcam eye-tracking data from multiple platforms into a single **long-format** data frame with columns: 'subject', 'trial', 'time', 'x', 'y' (+ any retained trial-level metadata).

Usage

merge_webcam_files(
  file_paths,
  screen_index = NULL,
  kind = c("gorilla", "jspsych", "psychopy"),
  col_map = list(subject = "participant_id", trial = "spreadsheet_row", time =
    "time_elapsed", x = "x", y = "y"),
  array_col = NULL,
  array_key = TRUE,
  trial_filter = NULL,
  out_dir = NULL,
  overwrite = FALSE,
  file_prefix = "eye_long"
)

Arguments

file_paths

Character vector of paths to files to read.

screen_index

Optional. If provided, filters Gorilla data by one or more screen indices (requires a 'screen_index' column in the data).

kind

Data collection platform. One of '"gorilla"', '"jspsych"', '"psychopy"'.

col_map

Named list mapping your file's column names to standardized names: 'subject', 'trial', 'time', 'x', 'y'. For jsPsych/PsychoPy you usually only need 'subject' and 'trial' because 'time/x/y' are read from the sample arrays.

array_col

For 'kind = "jspsych"' or '"psychopy"': name of the column containing per-trial eye samples (list-column or JSON string). Required for these kinds.

array_key

Logical. If 'FALSE', assume unkeyed triplets '(t,x,y)'. If 'TRUE', use keys when present.

trial_filter

Optional function 'f(df) -> df' to filter trial rows before parsing (useful for jsPsych; e.g., keep only trials that contain WebGazer samples).

out_dir

Optional. If not 'NULL', writes one CSV per participant (subject) into this folder.

overwrite

Logical. If 'TRUE', overwrite existing per-participant files in 'out_dir'.

file_prefix

Character. Prefix for per-participant output files written to 'out_dir'.

Details

The function supports:

- **Gorilla** ('kind = "gorilla"'): expects already-long data where 'type == "prediction"', then renames columns to the standardized schema using 'col_map'.

- **jsPsych** ('kind = "jspsych"'): supports either (a) jsPsych **JSON trials exports** that contain a top-level 'data' object (e.g., 'subject-<id>.json'), or (b) tabular files (.csv/.tsv/.xlsx) where each row is a trial and 'array_col' contains per-trial eye samples (as a JSON string). For JSON files, if the subject column is missing, the function will infer it from the filename pattern 'subject-<id>.json'.

- **PsychoPy** ('kind = "psychopy"'): expects tabular trial-level data where 'array_col' contains per-trial eye samples (list-column or JSON string).

Eye sample formats in 'array_col'

The per-trial sample array can be any of:

- **Unkeyed triplets**: '[[t, x, y], ...]' (or a 3-column matrix/data.frame)

- **Keyed objects**: '[{t=..., x=..., y=...}, ...]' or '[{time=..., x=..., y=...}, ...]'

If 'array_key = FALSE', samples are treated as positional '(t, x, y)' and field names are ignored. If 'array_key = TRUE', the parser uses keys (preferring 'col_map$time/x/y', then falling back to '"time"/"t"', '"x"', '"y"').
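
The keyed versus positional handling described above can be sketched with plain R lists. This is a simplified illustration: 'parse_samples' is a hypothetical helper, it starts from already-parsed lists rather than JSON, and it skips the 'col_map' preference step, falling back directly to '"time"' then '"t"'.

```r
# Convert one trial's samples to a data frame with columns time, x, y.
parse_samples <- function(samples, array_key = TRUE) {
  one <- function(s) {
    s <- as.list(s)
    keyed <- array_key && !is.null(names(s))
    if (keyed) {
      # Prefer "time", fall back to "t".
      t_val <- if (!is.null(s[["time"]])) s[["time"]] else s[["t"]]
      c(time = t_val, x = s[["x"]], y = s[["y"]])
    } else {
      # Positional (t, x, y); any field names are ignored.
      c(time = s[[1]], x = s[[2]], y = s[[3]])
    }
  }
  as.data.frame(do.call(rbind, lapply(samples, one)))
}

unkeyed <- list(c(0, 0.51, 0.48), c(33, 0.52, 0.47))
keyed   <- list(list(t = 0, x = 0.51, y = 0.48),
                list(t = 33, x = 0.52, y = 0.47))

parse_samples(unkeyed, array_key = FALSE)
parse_samples(keyed, array_key = TRUE)
```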

Value

A data frame containing the combined long-format eye data from all files.

Examples

## Not run: 
# Gorilla (already-long predictions)
df <- merge_webcam_files(
  file_paths = "gorilla_export.csv",
  kind = "gorilla",
  col_map = list(
    subject = "participant_id", trial = "spreadsheet_row",
    time = "time_elapsed", x = "x", y = "y"
  )
)

# jsPsych JSON trials export (subject inferred from filename if needed)
df <- merge_webcam_files(
  file_paths = "subject-6085bd39a5358.json",
  kind = "jspsych",
  col_map = list(subject = "subject_id", trial = "trial_index"),
  array_col = "webgazer_data",
  array_key = TRUE
)

# jsPsych CSV (requires subject_id column in the CSV)
df <- merge_webcam_files(
  file_paths = "jspsych_export.csv",
  kind = "jspsych",
  col_map = list(subject = "subject_id", trial = "trial_index"),
  array_col = "webgazer_data",
  array_key = TRUE
)

## End(Not run)

Plot Proportion of Looks Over Time for Interest Areas (IAs)

Description

This function creates a time-course plot of the proportion of looks to specified Interest Areas (IAs). Optionally, it can facet the plot by an experimental condition. The 'ia_mapping' argument supplies a custom label for each IA and also defines the display order.

Usage

plot_IA_proportions(
  data,
  ia_column,
  time_column,
  proportion_column,
  condition_column = NULL,
  ia_mapping,
  use_color = TRUE
)

Arguments

data

A data frame containing the data to plot.

ia_column

The name of the column containing Interest Area (IA) identifiers.

time_column

The name of the column representing time (e.g., milliseconds).

proportion_column

The name of the column with the proportion of looks for each IA.

condition_column

Optional. The name of the column representing experimental conditions. If not provided, the plot will not be faceted by condition.

ia_mapping

A named list specifying custom labels for each IA in the desired display order (e.g., 'list(IA1 = "Target", IA2 = "Cohort", IA3 = "Rhyme", IA4 = "Unrelated")').

use_color

Logical. If 'TRUE' (default), the plot will use colors to differentiate Interest Areas. If 'FALSE', different line types, shapes, and line widths will be used instead.

Value

A ggplot2 plot of the proportion of looks over time for each IA, optionally faceted by condition.


Apply a moving average smoothing function to gaze data (X and Y)

Description

Apply a moving average smoothing function to gaze data (X and Y). This is generally recommended after up-sampling the data.

Usage

smooth_gaze(
  x,
  n = 5,
  x_col = "Gaze_X",
  y_col = "Gaze_Y",
  trial_col = "Trial",
  subject_col = "Subject"
)

Arguments

x

A data frame containing gaze data.

n

The window size (in samples) for the moving average.

x_col

The name of the X gaze column (as string).

y_col

The name of the Y gaze column (as string).

trial_col

The name of the trial column used for grouping (default = "Trial").

subject_col

The name of the subject column used for grouping (default = "Subject").

Value

A tibble with smoothed gaze X and Y columns (replacing originals).
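
A moving average with window size 'n' can be sketched with 'stats::filter()'. The centred alignment and the NA values at the edges are properties of this sketch, assumed for illustration; the package may align or pad the window differently. 'moving_average' is a hypothetical helper.

```r
# Centred moving average with window size n (assumed alignment).
moving_average <- function(v, n = 5) {
  as.numeric(stats::filter(v, rep(1 / n, n), sides = 2))
}

gaze_x <- c(1, 2, 3, 4, 5, 6, 7)
moving_average(gaze_x, n = 3)
```

As with interpolation, the smoother would normally be applied within each subject and trial so that windows do not span trial boundaries.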


Time-windowed Intersubject Correlation (ISC) per subject

Description

Computes ISC for each subject by correlating that subject's time series against the mean of all *other* subjects within sliding windows, then Fisher z-transforms correlations, averages them per subject, and converts back to r.

Usage

time_window_isc(
  data,
  window_size = 10,
  step = 1,
  min_overlap = 3,
  method = c("pearson", "spearman"),
  return_per_window = FALSE
)

Arguments

data

A numeric matrix or data.frame with rows = time points, columns = participants.

window_size

Integer, window length in time points (default 10).

step

Integer, step size between consecutive windows (default 1).

min_overlap

Integer, minimum number of non-NA paired points within a window required to compute a correlation (default 3).

method

Correlation method passed to [stats::cor()], usually "pearson" (default), or "spearman".

return_per_window

Logical; if TRUE, also return a data.frame of per-window Fisher-z correlations per subject (default FALSE).

Details

Correlations of exactly +/-1 are clamped to +/-0.9999999 before Fisher z-transformation so they are included in the average rather than silently dropped.

Value

If 'return_per_window = FALSE', a named numeric vector of ISC values (length = n subjects). If 'TRUE', a list with:

- 'isc': named numeric vector of per-subject ISC (as above)

- 'per_window': data.frame with columns 'window_start', 'window_end', 'subject', 'z'
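
The sliding-window procedure can be sketched in base R as follows. 'sliding_isc' is a hypothetical helper that follows the steps in the description (leave-one-out mean, clamping, Fisher z, averaging, back-transform) on simulated data; it uses Pearson correlation only and is not the package's code.

```r
sliding_isc <- function(data, window_size = 10, step = 1, min_overlap = 3) {
  data <- as.matrix(data)
  stopifnot(nrow(data) >= window_size)
  starts <- seq(1, nrow(data) - window_size + 1, by = step)
  z <- matrix(NA_real_, nrow = length(starts), ncol = ncol(data),
              dimnames = list(NULL, colnames(data)))
  for (w in seq_along(starts)) {
    rows <- starts[w]:(starts[w] + window_size - 1)
    for (j in seq_len(ncol(data))) {
      own    <- data[rows, j]
      others <- rowMeans(data[rows, -j, drop = FALSE], na.rm = TRUE)
      ok <- !is.na(own) & !is.na(others)
      if (sum(ok) < min_overlap) next
      r <- suppressWarnings(cor(own[ok], others[ok]))
      if (is.na(r)) next
      # Clamp to +/-0.9999999 so atanh() stays finite.
      r <- max(min(r, 0.9999999), -0.9999999)
      z[w, j] <- atanh(r)
    }
  }
  # Average Fisher z per subject, then convert back to r.
  tanh(colMeans(z, na.rm = TRUE))
}

set.seed(42)
# Hypothetical data: 60 time points x 5 subjects sharing a common signal.
shared <- sin(seq(0, 6, length.out = 60))
sim <- sapply(1:5, function(i) shared + rnorm(60, sd = 0.3))
colnames(sim) <- paste0("S", 1:5)

isc <- sliding_isc(sim, window_size = 10, step = 5)
isc
```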


Up-sample gaze and pupil data

Description

Increase the sampling frequency to 'target_hz' Hz by inserting additional rows. Missing values in gaze and pupil data will be preserved for later interpolation.

Usage

upsample_gaze(
  x,
  pupil_cols = c("Pupil_Diameter"),
  gaze_cols = c("x_pred_normalised", "y_pred_normalised"),
  target_hz = 1000,
  upsample_pupil = TRUE
)

Arguments

x

A dataframe containing gaze and pupil data with columns: 'subject', 'trial', and 'time'.

pupil_cols

Character vector of pupil diameter column names.

gaze_cols

Character vector of gaze position column names.

target_hz

Target sampling frequency (default is 1000 Hz).

upsample_pupil

Logical; if 'TRUE', pupil data will also be upsampled.

Value

A dataframe with up-sampled time points and an 'up_sampled' column.
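
The row-insertion idea can be shown on a single trial. The sketch below builds a regular time grid at the target rate and adds the missing time points as NA rows flagged by 'up_sampled'. Keeping the original time stamps alongside the grid is an assumption of this example, as are the toy values.

```r
# One trial's samples (hypothetical), time in ms.
trial <- data.frame(
  subject = 1, trial = 1,
  time = c(0, 33, 67),
  x_pred_normalised = c(0.50, 0.52, 0.55),
  y_pred_normalised = c(0.40, 0.41, 0.43)
)

target_hz <- 100
step_ms   <- 1000 / target_hz   # 10 ms between rows at 100 Hz

# Regular time grid spanning the trial; keep only time points not yet present.
grid <- seq(min(trial$time), max(trial$time), by = step_ms)
new_times <- setdiff(grid, trial$time)

# Inserted rows carry NA gaze values so a later step can interpolate them.
added <- data.frame(
  subject = 1, trial = 1, time = new_times,
  x_pred_normalised = NA_real_, y_pred_normalised = NA_real_
)
trial$up_sampled <- FALSE
added$up_sampled <- TRUE

upsampled <- rbind(trial, added)
upsampled <- upsampled[order(upsampled$time), ]
upsampled
```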


Visual-world-paradigm fixation counts

Description

Aggregated fixation counts by participant, condition, and time bin for a visual-world-paradigm analysis. Suitable for demonstrating empirical-logit or growth-curve modelling of gaze proportions.

Usage

vwp_counts

Format

A data frame with 814 rows and 6 columns:

subject

Participant identifier.

condition_num

Numeric coding of condition (e.g., -0.5 / 0.5) suitable as a contrast.

time_bin

Time bin relative to target onset (ms).

condition

Condition label (e.g., "unrelated").

fix

Number of samples in the bin that fell on the target interest area.

total_fix

Total number of samples in the bin across all interest areas.

Source

Aggregated from the example webcam eye-tracking data shipped with this package.
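
As a pointer toward the empirical-logit use mentioned in the description, the transform can be computed from 'fix' and 'total_fix' as below. The rows are made up in the 'vwp_counts' layout, and the 0.5 correction and variance formula are the conventional choices, assumed here rather than taken from the package.

```r
# Hypothetical rows in the vwp_counts layout.
counts <- data.frame(
  subject   = c(1, 1),
  time_bin  = c(0, 100),
  fix       = c(2, 8),
  total_fix = c(10, 10)
)

on  <- counts$fix + 0.5                      # target samples, corrected
off <- counts$total_fix - counts$fix + 0.5   # non-target samples, corrected

# Empirical logit and its approximate variance.
counts$elog     <- log(on / off)
counts$elog_var <- 1 / on + 1 / off
counts
```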