| Title: | Tools for Processing Webcam Eye Tracking Data |
|---|---|
| Description: | A companion package to gazeR. Functions for reading and pre-processing webcam eye tracking data. |
| Authors: | Jason Geller |
| Maintainer: | Jason Geller <[email protected]> |
| License: | GPL-3 |
| Version: | 0.10.3 |
| Built: | 2026-06-03 06:51:39 UTC |
| Source: | https://github.com/jgeller112/webgazeR |
Computes sampling rate for each trial (subject × trial) and summarizes at the subject level. Uses distinct timepoints to avoid duplicate time values. Plots histogram of subject-level sampling rates.
analyze_sampling_rate(eye_data, summary_stat = "Median")

| Argument | Description |
|---|---|
| eye_data | A dataframe with subject, trial, time columns. |
| summary_stat | Either "median" (default) or "mean". |

A tibble with subject, trial, SR_trial, SR_subject
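The calculation described above can be sketched in base R. This is an illustration, not the package's implementation: it assumes `time` is in milliseconds and defines a trial's sampling rate as the number of distinct samples divided by the trial duration in seconds.

```r
# Toy data: one subject, two trials, with a duplicated timestamp in trial 1.
eye <- data.frame(
  subject = "s1",
  trial   = c(1, 1, 1, 1, 2, 2, 2),
  time    = c(0, 50, 50, 100, 0, 100, 200)  # milliseconds (assumed unit)
)

# Keep distinct (subject, trial, time) rows so repeated timestamps count once.
eye <- unique(eye[, c("subject", "trial", "time")])

# Samples per second for one trial: number of samples / duration in seconds.
trial_rate <- function(t) length(t) / ((max(t) - min(t)) / 1000)

sr <- aggregate(time ~ subject + trial, data = eye, FUN = trial_rate)
names(sr)[names(sr) == "time"] <- "SR_trial"

# Subject-level summary (median here), repeated on every trial row.
sr$SR_subject <- ave(sr$SR_trial, sr$subject, FUN = median)
sr
```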
Takes a data frame of gaze positions (or other locations), plus screen size and aoi size (or location), and computes the area of interest (AOI) for each location. Defaults assume standard four-corner design.
assign_aoi(gaze, screen_size = c(1024, 768), aoi_size = c(400, 300), aoi_loc = NULL, X = "CURRENT_FIX_X", Y = "CURRENT_FIX_Y")

| Argument | Description |
|---|---|
| gaze | data frame containing positions |
| screen_size | size of the screen in pixels. Defaults to c(1024, 768) and assumes reversed vertical (i.e., [0,0] is top left). |
| aoi_size | size of AOIs in pixels. Defaults to a c(400, 300) width-height pair and assumes AOIs are in screen corners. AOIs will be coded numerically 1 to 4 in reading order (left to right, top to bottom), with 0 as center location. |
| aoi_loc | location of rectangular AOIs. Use as alternative to aoi_size for non-corner AOIs. Each AOI location should be a separate row in a data frame that has variables xmin, xmax, ymin, and ymax. Assumes reversed vertical (i.e., [0,0] is top left). AOIs will be coded numerically in row order. |
| X | name of variable containing X coordinates. Defaults to "CURRENT_FIX_X" |
| Y | name of variable containing Y coordinates. Defaults to "CURRENT_FIX_Y" |

Original gaze data frame with AOI column added. Non-AOI and off-screen gazes are marked NA.
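A minimal base-R sketch of the default four-corner coding may help make the conventions concrete. `corner_aoi()` is a hypothetical helper, not the package function; it assumes inclusive AOI borders and leaves out the center code 0, whose extent is not specified here.

```r
# Hypothetical helper: code corner AOIs 1-4 in reading order, NA elsewhere.
corner_aoi <- function(x, y, screen_size = c(1024, 768), aoi_size = c(400, 300)) {
  left   <- x >= 0 & x <= aoi_size[1]
  right  <- x >= screen_size[1] - aoi_size[1] & x <= screen_size[1]
  top    <- y >= 0 & y <= aoi_size[2]          # y = 0 is the top edge
  bottom <- y >= screen_size[2] - aoi_size[2] & y <= screen_size[2]

  aoi <- rep(NA_integer_, length(x))
  aoi[left  & top]    <- 1L
  aoi[right & top]    <- 2L
  aoi[left  & bottom] <- 3L
  aoi[right & bottom] <- 4L
  aoi
}

gaze <- data.frame(
  CURRENT_FIX_X = c(100, 900, 100, 900, 512, 2000),
  CURRENT_FIX_Y = c(100, 100, 700, 700, 384, 100)
)
gaze$AOI <- corner_aoi(gaze$CURRENT_FIX_X, gaze$CURRENT_FIX_Y)
gaze$AOI
```

The last two samples (screen center and an off-screen point) fall in no corner AOI and stay NA.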
Trial-level behavioural output from the same Gorilla experiment as
eyedata. Used in the vignette to illustrate merging
behavioural responses with gaze samples.
behav_data
A data frame with 66,350 rows and 78 columns. Notable columns include
participant and session identifiers (e.g., Participant.Public.ID,
Participant.Private.ID), trial and spreadsheet metadata
(Trial.Number, Spreadsheet.Row, Zone.Name), response
data (Reaction.Time, Response, Correct), trial design
columns (ANSWER, targetword, soundfile,
trialtype, the tl*/tr*/bl*/br* picture
and code columns), and a subject identifier matched to
eyedata.
Gorilla Experiment Builder (https://gorilla.sc/).
Computes ISC using either the **pairwise correlation** method or the **leave-one-out** method. The pairwise method computes a full correlation matrix and averages Fisher z-transformed values. The leave-one-out method computes the correlation of each participant's time series with the mean of all others.
calculate_isc(data_matrix, method = c("pairwise", "leave-one-out"), cor_method = c("pearson", "spearman"), return_matrix = FALSE)

| Argument | Description |
|---|---|
| data_matrix | A numeric matrix or data.frame where rows represent time points and columns represent participants. |
| method | A string specifying the ISC computation method. Options are '"pairwise"' (default) or '"leave-one-out"'. |
| cor_method | Correlation type passed to [stats::cor()]. One of '"pearson"' (default) or '"spearman"'. |
| return_matrix | Logical. If 'TRUE' and 'method = "pairwise"', return the full participant-by-participant correlation matrix (self-correlations set to NA) instead of a summary vector. Ignored for 'method = "leave-one-out"'. |

A named numeric vector of ISC values (one per participant), or a correlation matrix if 'return_matrix = TRUE'.
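The two methods can be sketched in base R as follows. This is an illustration on simulated data, not the package's code; converting the averaged Fisher z values back to r with `tanh()` in the pairwise branch is an assumption.

```r
set.seed(1)
# Toy matrix: 50 time points (rows) x 4 participants (columns) sharing a signal.
shared <- sin(seq(0, 4 * pi, length.out = 50))
m <- sapply(1:4, function(i) shared + rnorm(50, sd = 0.5))
colnames(m) <- paste0("p", 1:4)

# Pairwise: correlate all pairs, drop self-correlations, average in Fisher z space.
r <- cor(m, method = "pearson")
diag(r) <- NA
isc_pairwise <- tanh(rowMeans(atanh(r), na.rm = TRUE))

# Leave-one-out: correlate each participant with the mean of the others.
isc_loo <- sapply(seq_len(ncol(m)), function(i) {
  cor(m[, i], rowMeans(m[, -i, drop = FALSE]))
})
names(isc_loo) <- colnames(m)
```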
This function combines gaze samples into time bins and optionally aggregates the data.
downsample_gaze(dataframe, bin.length = 50, timevar = "time", aggvars = c("subject", "condition", "target", "trial", "object", "time_bin"))

| Argument | Description |
|---|---|
| dataframe | DataFrame containing gaze data. |
| bin.length | Length of time bins (in milliseconds). |
| timevar | Column name representing time. |
| aggvars | Vector of variable names to group by for aggregation. Use "none" to skip aggregation. |

DataFrame with time bins added and optionally aggregated data.
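As a rough illustration of binning followed by aggregation, the base-R sketch below labels each sample with the start of its bin and averages within bins. The bin-labelling rule (floor to the bin start) and the `target` indicator column are assumptions made for the example.

```r
gaze <- data.frame(
  subject = "s1",
  trial   = 1,
  time    = c(0, 20, 40, 60, 80, 120),
  target  = c(0, 1, 1, 1, 0, 1)   # hypothetical 0/1 look-to-target indicator
)

bin_length <- 50
# Label each sample with the start of its bin: 0-49 -> 0, 50-99 -> 50, ...
gaze$time_bin <- floor(gaze$time / bin_length) * bin_length

# Aggregate within the grouping variables: proportion of target looks per bin.
binned <- aggregate(target ~ subject + trial + time_bin, data = gaze, FUN = mean)
binned
```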
This function reads in multiple Gorilla webcam files, extracts the 'loc', 'x_normalised', 'y_normalised', 'width_normalised', and 'height_normalised' columns, and calculates the bounding box coordinates for the AOIs. It also rounds all numeric columns to 3 decimal places.
extract_aois(file_paths, zone_names = NULL)

| Argument | Description |
|---|---|
| file_paths | A list of file paths to webcam files (in .xlsx format). |

A dataframe containing distinct rows with AOI-related columns and calculated coordinates.
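The bounding-box step can be sketched in base R. The example assumes that 'x_normalised' and 'y_normalised' give one corner of the zone and that width and height extend from it; the `xmin`/`xmax`/`ymin`/`ymax` column names and the zone values are illustrative, not taken from the package.

```r
zones <- data.frame(
  loc               = c("TL", "BR"),
  x_normalised      = c(0.0312, 0.7188),
  y_normalised      = c(0.6204, 0.0796),
  width_normalised  = c(0.25, 0.25),
  height_normalised = c(0.3, 0.3)
)

# Bounding box: origin corner plus width/height (assumed convention).
zones$xmin <- zones$x_normalised
zones$ymin <- zones$y_normalised
zones$xmax <- zones$x_normalised + zones$width_normalised
zones$ymax <- zones$y_normalised + zones$height_normalised

# Round every numeric column to 3 decimal places and keep distinct rows.
num <- sapply(zones, is.numeric)
zones[num] <- lapply(zones[num], round, digits = 3)
zones <- unique(zones)
zones
```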
A sample of webcam-based eye-tracking gaze samples exported from a Gorilla visual-world-paradigm experiment. Used to demonstrate the preprocessing pipeline in the package vignette.
eyedata
A data frame with 104,823 rows and 24 columns. Key columns include:
Participant identifier.
Trial number within a participant.
Sample time within a trial (ms).
Raw sample timestamp.
Sample type (e.g., prediction, calibration).
Screen index within the trial.
Predicted gaze coordinates in screen pixels.
Predicted gaze coordinates normalised to [0, 1].
WebGazer convergence value for the prediction.
Face-detection confidence for the sample.
Name of the interest area (zone) the sample belongs to.
Zone position and size in pixels.
Normalised zone position and size.
Source data file the sample came from.
Gorilla Experiment Builder (https://gorilla.sc/).
This function applies a sampling rate threshold and either removes or labels "bad" subjects/trials based on their sampling rates.
filter_sampling_rate(data, threshold, action = c("remove", "label"), by = c("subject", "trial", "both"))

| Argument | Description |
|---|---|
| data | A dataframe with columns: subject, trial, SR_subject, SR_trial. |
| threshold | Numeric. Sampling rate threshold to apply. |
| action | "remove" (default) to delete bad data or "label" to flag bad data. |
| by | "subject", "trial", or "both" to specify where to apply the threshold. |

A dataframe with either rows removed or bad subjects/trials labeled.
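The difference between labelling and removing can be sketched in base R. The flag column names (`bad_subject`, `bad_trial`) and the strict "below threshold" comparison are assumptions for illustration.

```r
sr <- data.frame(
  subject    = c("s1", "s1", "s2", "s2"),
  trial      = c(1, 2, 1, 2),
  SR_subject = c(22, 22, 4, 4),
  SR_trial   = c(30, 3, 4, 4)
)
threshold <- 5

# "label": keep every row and flag the ones below threshold.
sr$bad_subject <- sr$SR_subject < threshold
sr$bad_trial   <- sr$SR_trial < threshold

# "remove" with by = "both": drop rows that fail either check.
kept <- sr[!(sr$bad_subject | sr$bad_trial), ]
kept
```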
Find Image Location in a Given Set of Locations
find_location(locations, image)

| Argument | Description |
|---|---|
| locations | A character vector of image names at each location. |
| image | A character string: the image to locate. |

A character string representing the location name or NA if not found.
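One way to picture the lookup, assuming 'locations' is a named character vector whose names are the location labels (an assumption; the input format is not spelled out above). `lookup_location()` is a hypothetical stand-in, not the package function.

```r
locations <- c(TL = "cat.png", TR = "dog.png", BL = "car.png", BR = "hat.png")

# Hypothetical stand-in: return the name of the slot holding `image`, or NA.
lookup_location <- function(locations, image) {
  hit <- match(image, locations)
  if (is.na(hit)) NA_character_ else names(locations)[hit]
}

lookup_location(locations, "car.png")
lookup_location(locations, "sun.png")
```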
Calculates centroid and dispersion from gaze data.
gaze_dispersion(data, x, y, grouping_vars)

| Argument | Description |
|---|---|
| data | A data frame or tibble. |
| x | A string: name of the column with X gaze coordinates. |
| y | A string: name of the column with Y gaze coordinates. |
| grouping_vars | A character vector of column names to group by (e.g., subject, condition, trial). |

A tibble with centroid_x, centroid_y, dispersion, and log_dispersion for each group.
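The exact dispersion formula is not stated above. The base-R sketch below assumes one common choice, the root-mean-square distance of samples from the group centroid, purely to illustrate the shape of the output.

```r
gaze <- data.frame(
  subject = rep(c("s1", "s2"), each = 4),
  x = c(0, 2, 0, 2, 0.5, 0.5, 0.5, 0.5),
  y = c(0, 0, 2, 2, 0.1, 0.3, 0.1, 0.3)
)

disp <- do.call(rbind, lapply(split(gaze, gaze$subject), function(g) {
  cx <- mean(g$x, na.rm = TRUE)   # centroid
  cy <- mean(g$y, na.rm = TRUE)
  # RMS distance from the centroid (assumed definition of dispersion).
  d  <- sqrt(mean((g$x - cx)^2 + (g$y - cy)^2, na.rm = TRUE))
  data.frame(subject = g$subject[1], centroid_x = cx, centroid_y = cy,
             dispersion = d, log_dispersion = log(d))
}))
disp
```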
This function calculates the number and percentage of gaze points that fall outside the screen dimensions, and optionally removes only the out-of-bounds gaze points.
gaze_oob(data, subject_col = "subject", trial_col = "trial", x_col = "x", y_col = "y", screen_size = c(1, 1), remove = FALSE)

| Argument | Description |
|---|---|
| data | A data frame containing gaze data. |
| subject_col | A string specifying the name of the column that contains the subject identifier. Default is "subject". |
| trial_col | A string specifying the name of the column that contains the trial identifier. Default is "trial". |
| x_col | A string specifying the name of the column that contains the X coordinate. Default is "x". |
| y_col | A string specifying the name of the column that contains the Y coordinate. Default is "y". |
| screen_size | A numeric vector of length 2 specifying the screen width and height. Default is c(1, 1) assuming normalized coordinates. |
| remove | Logical; if TRUE, removes points outside of screen dimensions. Default is FALSE. |

A list containing:
Summary of missingness at the subject level.
Summary of missingness at the trial level.
Dataset with optional removal of out-of-bounds points and missingness annotations.
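The out-of-bounds check itself is simple to sketch in base R. The `oob` flag name and the treatment of points exactly on the screen edge as in bounds are assumptions.

```r
gaze <- data.frame(
  subject = c("s1", "s1", "s1", "s2", "s2"),
  trial   = c(1, 1, 2, 1, 1),
  x       = c(0.2, 1.3, 0.5, -0.1, 0.9),
  y       = c(0.4, 0.5, 0.5, 0.5, 1.2)
)
screen_size <- c(1, 1)  # normalized width and height

# A sample is out of bounds if either coordinate leaves [0, size].
gaze$oob <- gaze$x < 0 | gaze$x > screen_size[1] |
            gaze$y < 0 | gaze$y > screen_size[2]

# Percentage of out-of-bounds samples per subject.
oob_pct <- tapply(gaze$oob, gaze$subject, function(v) 100 * mean(v))

# Equivalent of remove = TRUE: keep only the in-bounds samples.
in_bounds <- gaze[!gaze$oob, ]
```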
Interpolates missing gaze data (X and Y) within trials, with an optional maximum gap size.
interpolate_gaze(x, x_col = "Gaze_X", y_col = "Gaze_Y", trial_col = "Trial", subject_col = "Subject", time_col = "Time", max_gap = Inf)

| Argument | Description |
|---|---|
| x | A data frame containing gaze data. |
| x_col | The name of the X gaze column (as string). |
| y_col | The name of the Y gaze column (as string). |
| trial_col | The name of the trial column (default = "Trial"). |
| subject_col | The name of the subject column (default = "Subject"). |
| time_col | The name of the time column used for sorting (default = "Time"). |
| max_gap | Maximum number of consecutive missing samples to interpolate. Gaps larger than this remain NA (default = Inf). |

A tibble with interpolated gaze X and Y columns (replacing originals).
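The effect of 'max_gap' can be illustrated on a single trial's vector in base R. `interp_max_gap()` is a hypothetical helper that assumes linear interpolation; the interpolation method used by the package is not stated above.

```r
interp_max_gap <- function(v, time, max_gap = Inf) {
  ok <- !is.na(v)
  if (sum(ok) < 2) return(v)
  # Linear interpolation between observed samples; edges stay NA (rule = 1).
  filled <- approx(time[ok], v[ok], xout = time, rule = 1)$y
  # Restore NA for runs of missing samples longer than max_gap.
  runs <- rle(!ok)
  too_long <- rep(runs$values & runs$lengths > max_gap, runs$lengths)
  filled[too_long] <- NA
  filled
}

time <- seq(0, 300, by = 50)
x    <- c(0.1, NA, 0.3, NA, NA, NA, 0.7)
res  <- interp_max_gap(x, time, max_gap = 2)
res
```

Here the single missing sample is filled, while the run of three missing samples exceeds `max_gap = 2` and stays NA.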
This function takes a dataframe and renames columns to match WebGazer conventions: subject, trial, time, x, y. All other columns are preserved.
make_webgazer(data, col_map = list(subject = "subject", trial = "trial", time = "time", x = "x", y = "y"))

| Argument | Description |
|---|---|
| data | A dataframe containing gaze data. |
| col_map | A named list mapping your current columns to WebGazer names: 'subject', 'trial', 'time', 'x', 'y'. |

A dataframe with the mapped columns renamed and all other original columns preserved.
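The renaming step amounts to a name lookup, sketched here in base R. The raw column names are invented for the example; the 'col_map' layout (standard name = current column name) follows the usage shown above.

```r
raw <- data.frame(
  participant_id  = c("p1", "p1"),
  spreadsheet_row = c(3, 3),
  time_elapsed    = c(0, 40),
  gaze_x          = c(0.42, 0.44),
  gaze_y          = c(0.51, 0.50),
  condition       = "related"      # extra column, should be left untouched
)
col_map <- list(subject = "participant_id", trial = "spreadsheet_row",
                time = "time_elapsed", x = "gaze_x", y = "gaze_y")

std <- raw
idx <- match(unlist(col_map), names(std))  # positions of the columns to rename
names(std)[idx] <- names(col_map)
names(std)
```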
Read, merge, and standardize webcam eye-tracking data from multiple platforms into a single **long-format** data frame with columns: 'subject', 'trial', 'time', 'x', 'y' (+ any retained trial-level metadata).
merge_webcam_files(file_paths, screen_index = NULL, kind = c("gorilla", "jspsych", "psychopy"), col_map = list(subject = "participant_id", trial = "spreadsheet_row", time = "time_elapsed", x = "x", y = "y"), array_col = NULL, array_key = TRUE, trial_filter = NULL, out_dir = NULL, overwrite = FALSE, file_prefix = "eye_long")

| Argument | Description |
|---|---|
| file_paths | Character vector of paths to files to read. |
| screen_index | Optional. If provided, filters Gorilla data by one or more screen indices (requires a 'screen_index' column in the data). |
| kind | Data collection platform. One of '"gorilla"', '"jspsych"', '"psychopy"'. |
| col_map | Named list mapping your file's column names to standardized names: 'subject', 'trial', 'time', 'x', 'y'. For jsPsych/PsychoPy you usually only need 'subject' and 'trial' because 'time/x/y' are read from the sample arrays. |
| array_col | For 'kind = "jspsych"' or '"psychopy"': name of the column containing per-trial eye samples (list-column or JSON string). Required for these kinds. |
| array_key | Logical. If 'FALSE', assume unkeyed triplets '(t,x,y)'. If 'TRUE', use keys when present. |
| trial_filter | Optional function 'f(df) -> df' to filter trial rows before parsing (useful for jsPsych; e.g., keep only trials that contain WebGazer samples). |
| out_dir | Optional. If not 'NULL', writes one CSV per participant (subject) into this folder. |
| overwrite | Logical. If 'TRUE', overwrite existing per-participant files in 'out_dir'. |
| file_prefix | Character. Prefix for per-participant output files written to 'out_dir'. |

The function supports:
- **Gorilla** ('kind = "gorilla"'): expects already-long data where 'type == "prediction"', then renames columns to the standardized schema using 'col_map'.
- **jsPsych** ('kind = "jspsych"'): supports either (a) jsPsych **JSON trials exports** that contain a top-level 'data' object (e.g., 'subject-<id>.json'), or (b) tabular files (.csv/.tsv/.xlsx) where each row is a trial and 'array_col' contains per-trial eye samples (as a JSON string). For JSON files, if the subject column is missing, the function will infer it from the filename pattern 'subject-<id>.json'.
- **PsychoPy** ('kind = "psychopy"'): expects tabular trial-level data where 'array_col' contains per-trial eye samples (list-column or JSON string).
## Eye sample formats in 'array_col'
The per-trial sample array can be any of:
- **Unkeyed triplets**: '[[t, x, y], ...]' (or a 3-column matrix/data.frame)
- **Keyed objects**: '[{t=..., x=..., y=...}, ...]' or '[{time=..., x=..., y=...}, ...]'
If 'array_key = FALSE', samples are treated as positional '(t, x, y)' and field names are ignored. If 'array_key = TRUE', the parser uses keys (preferring 'col_map$time/x/y', then falling back to '"time"/"t"', '"x"', '"y"').
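The two sample formats can be illustrated with plain R lists standing in for already-parsed JSON. `samples_to_df()` is a hypothetical helper, not part of the package, and it only covers the '"time"/"t"' fallback, not the 'col_map' preference described above.

```r
# Hypothetical helper: turn one trial's sample array (already parsed into R
# lists) into a data frame with time, x, y columns.
samples_to_df <- function(samples, array_key = TRUE) {
  rows <- lapply(samples, function(s) {
    if (array_key && !is.null(names(s))) {
      # Keyed object: prefer "time", fall back to "t".
      t_key <- intersect(c("time", "t"), names(s))[1]
      c(time = s[[t_key]], x = s[["x"]], y = s[["y"]])
    } else {
      # Unkeyed triplet: positional (t, x, y).
      c(time = s[[1]], x = s[[2]], y = s[[3]])
    }
  })
  as.data.frame(do.call(rbind, rows))
}

unkeyed <- list(list(0, 0.40, 0.52), list(33, 0.41, 0.50))
keyed   <- list(list(t = 0, x = 0.40, y = 0.52), list(t = 33, x = 0.41, y = 0.50))

a <- samples_to_df(unkeyed, array_key = FALSE)
b <- samples_to_df(keyed, array_key = TRUE)
```

Both calls yield the same two-row data frame with columns `time`, `x`, and `y`.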
A data frame containing aggregated long-format eye data across all files.

    ## Not run:
    # Gorilla (already-long predictions)
    df <- merge_webcam_files(
      file_paths = "gorilla_export.csv",
      kind = "gorilla",
      col_map = list(
        subject = "participant_id",
        trial = "spreadsheet_row",
        time = "time_elapsed",
        x = "x",
        y = "y"
      )
    )

    # jsPsych JSON trials export (subject inferred from filename if needed)
    df <- merge_webcam_files(
      file_paths = "subject-6085bd39a5358.json",
      kind = "jspsych",
      col_map = list(subject = "subject_id", trial = "trial_index"),
      array_col = "webgazer_data",
      array_key = TRUE
    )

    # jsPsych CSV (requires subject_id column in the CSV)
    df <- merge_webcam_files(
      file_paths = "jspsych_export.csv",
      kind = "jspsych",
      col_map = list(subject = "subject_id", trial = "trial_index"),
      array_col = "webgazer_data",
      array_key = TRUE
    )
    ## End(Not run)

This function creates a time-course plot of the proportion of looks to specified Interest Areas (IAs). Optionally, it can facet the plot by an experimental condition. Custom labels for each IA can be specified through the 'ia_mapping' argument to define the display order.
plot_IA_proportions(data, ia_column, time_column, proportion_column, condition_column = NULL, ia_mapping, use_color = TRUE)

| Argument | Description |
|---|---|
| data | A data frame containing the data to plot. |
| ia_column | The name of the column containing Interest Area (IA) identifiers. |
| time_column | The name of the column representing time (e.g., milliseconds). |
| proportion_column | The name of the column with the proportion of looks for each IA. |
| condition_column | Optional. The name of the column representing experimental conditions. If not provided, the plot will not be faceted by condition. |
| ia_mapping | A named list specifying custom labels for each IA in the desired display order (e.g., 'list(IA1 = "Target", IA2 = "Cohort", IA3 = "Rhyme", IA4 = "Unrelated")'). |
| use_color | Logical. If 'TRUE' (default), the plot will use colors to differentiate Interest Areas. If 'FALSE', different line types, shapes, and line widths will be used instead. |

A ggplot2 plot of the proportion of looks over time for each IA, optionally faceted by condition.
Apply a moving average smoothing function to gaze data (X and Y). This is generally recommended after up-sampling the data.
smooth_gaze(x, n = 5, x_col = "Gaze_X", y_col = "Gaze_Y", trial_col = "Trial", subject_col = "Subject")

| Argument | Description |
|---|---|
| x | A data frame containing gaze data. |
| n | The window size (in samples) for the moving average. |
| x_col | The name of the X gaze column (as string). |
| y_col | The name of the Y gaze column (as string). |
| trial_col | The name of the trial column used for grouping (default = "Trial"). |
| subject_col | The name of the subject column used for grouping (default = "Subject"). |

A tibble with smoothed gaze X and Y columns (replacing originals).
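A moving average over 'n' samples can be sketched with `stats::filter()`. The centred window and the NA values at the edges are assumptions; how the package aligns the window or handles edges is not stated above.

```r
moving_average <- function(v, n = 5) {
  # Centred window with equal weights; the first and last floor(n/2) values are NA.
  as.numeric(stats::filter(v, rep(1 / n, n), sides = 2))
}

x <- c(1, 2, 3, 4, 5, 6, 7)
smoothed <- moving_average(x, n = 3)
smoothed
```

In a full pipeline this would be applied separately within each subject and trial so that the window never spans a trial boundary.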
Computes ISC for each subject by correlating that subject's time series against the mean of all *other* subjects within sliding windows, then Fisher z-transforms correlations, averages them per subject, and converts back to r.
time_window_isc(data, window_size = 10, step = 1, min_overlap = 3, method = c("pearson", "spearman"), return_per_window = FALSE)

| Argument | Description |
|---|---|
| data | A numeric matrix or data.frame with rows = time points, columns = participants. |
| window_size | Integer, window length in time points (default 10). |
| step | Integer, step size between consecutive windows (default 1). |
| min_overlap | Integer, minimum number of non-NA paired points within a window required to compute a correlation (default 3). |
| method | Correlation method passed to [stats::cor()], usually "pearson" (default), or "spearman". |
| return_per_window | Logical; if TRUE, also return a data.frame of per-window Fisher-z correlations per subject (default FALSE). |

Correlations of exactly +/-1 are clamped to +/-0.9999999 before Fisher z-transformation so they are included in the average rather than silently dropped.
If 'return_per_window = FALSE', a named numeric vector of ISC values (length = n subjects). If 'TRUE', a list with:
- 'isc': named numeric vector of per-subject ISC (as above)
- 'per_window': data.frame with columns 'window_start', 'window_end', 'subject', 'z'
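The sliding-window procedure (leave-one-out correlation per window, clamping, Fisher z averaging, back-transformation) can be sketched in base R on simulated data. This follows the description above but is not the package's code, and the window settings are chosen only for the example.

```r
set.seed(2)
shared <- sin(seq(0, 6 * pi, length.out = 60))
m <- sapply(1:3, function(i) shared + rnorm(60, sd = 0.4))
colnames(m) <- c("s1", "s2", "s3")

window_size <- 10; step <- 5; min_overlap <- 3
starts <- seq(1, nrow(m) - window_size + 1, by = step)

isc <- sapply(seq_len(ncol(m)), function(i) {
  others <- rowMeans(m[, -i, drop = FALSE], na.rm = TRUE)
  z <- sapply(starts, function(s) {
    idx <- s:(s + window_size - 1)
    ok  <- complete.cases(m[idx, i], others[idx])
    if (sum(ok) < min_overlap) return(NA_real_)
    r <- cor(m[idx, i][ok], others[idx][ok])
    r <- max(min(r, 0.9999999), -0.9999999)  # clamp so atanh() stays finite
    atanh(r)
  })
  tanh(mean(z, na.rm = TRUE))  # average in z space, convert back to r
})
names(isc) <- colnames(m)
isc
```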
Increase the sampling frequency to 'target_hz' Hz by inserting additional rows. Missing values in gaze and pupil data will be preserved for later interpolation.
upsample_gaze(x, pupil_cols = c("Pupil_Diameter"), gaze_cols = c("x_pred_normalised", "y_pred_normalised"), target_hz = 1000, upsample_pupil = TRUE)

| Argument | Description |
|---|---|
| x | A dataframe containing gaze and pupil data with columns: 'subject', 'trial', and 'time'. |
| pupil_cols | Character vector of pupil diameter column names. |
| gaze_cols | Character vector of gaze position column names. |
| target_hz | Target sampling frequency (default is 1000 Hz). |
| upsample_pupil | Logical; if 'TRUE', pupil data will also be upsampled. |

A dataframe with up-sampled time points and an 'up_sampled' column.
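Up-sampling by row insertion can be sketched for a single subject and trial in base R. The example assumes 'time' is in milliseconds and builds a regular time grid at the target rate; inserted rows keep NA gaze values so they can be interpolated later.

```r
gaze <- data.frame(
  subject = "s1", trial = 1,
  time = c(0, 40, 80),                 # ms, i.e. 25 Hz
  x_pred_normalised = c(0.40, NA, 0.44)
)
target_hz <- 100
step_ms   <- 1000 / target_hz          # 10 ms between up-sampled rows

# Regular time grid spanning the trial at the target rate.
grid <- data.frame(
  subject = "s1", trial = 1,
  time = seq(min(gaze$time), max(gaze$time), by = step_ms)
)

# Left join: original samples keep their values, new rows get NA.
up <- merge(grid, gaze, by = c("subject", "trial", "time"), all.x = TRUE)
up$up_sampled <- !(up$time %in% gaze$time)   # TRUE for inserted rows
up
```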
Aggregated fixation counts by participant, condition, and time bin for a visual-world-paradigm analysis. Suitable for demonstrating empirical-logit or growth-curve modelling of gaze proportions.
vwp_counts
A data frame with 814 rows and 6 columns:
Participant identifier.
Numeric coding of condition (e.g., -0.5 / 0.5) suitable as a contrast.
Time bin relative to target onset (ms).
Condition label (e.g., "unrelated").
Number of samples in the bin that fell on the target interest area.
Total number of samples in the bin across all interest areas.
Aggregated from the example webcam eye-tracking data shipped with this package.
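For the empirical-logit use case mentioned above, the transformation can be sketched in base R from per-bin counts. The column names `y` (samples on target) and `N` (total samples) are hypothetical placeholders, and the counts are made up for the example.

```r
counts <- data.frame(
  time_bin = c(0, 100, 200),
  y = c(2, 5, 9),     # samples on the target (hypothetical column name)
  N = c(10, 10, 10)   # total samples in the bin (hypothetical column name)
)

# Empirical logit with the usual 0.5 continuity correction, plus its
# approximate variance, which is often used to weight a regression.
counts$elog <- log((counts$y + 0.5) / (counts$N - counts$y + 0.5))
counts$wts  <- 1 / (counts$y + 0.5) + 1 / (counts$N - counts$y + 0.5)
counts
```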