Represent curves as a weighted sum of spline basis functions.

tfb_spline(data, ...)

# S3 method for data.frame
tfb_spline(
  data,
  id = 1,
  arg = 2,
  value = 3,
  domain = NULL,
  penalized = TRUE,
  global = FALSE,
  resolution = NULL,
  ...
)

# S3 method for matrix
tfb_spline(
  data,
  arg = NULL,
  domain = NULL,
  penalized = TRUE,
  global = FALSE,
  resolution = NULL,
  ...
)

# S3 method for numeric
tfb_spline(
  data,
  arg = NULL,
  domain = NULL,
  penalized = TRUE,
  global = FALSE,
  resolution = NULL,
  ...
)

# S3 method for list
tfb_spline(
  data,
  arg = NULL,
  domain = NULL,
  penalized = TRUE,
  global = FALSE,
  resolution = NULL,
  ...
)

# S3 method for tfd
tfb_spline(
  data,
  arg = NULL,
  domain = NULL,
  penalized = TRUE,
  global = FALSE,
  resolution = NULL,
  ...
)

# S3 method for tfb
tfb_spline(
  data,
  arg = NULL,
  domain = NULL,
  penalized = TRUE,
  global = FALSE,
  resolution = NULL,
  ...
)

# S3 method for default
tfb_spline(
  data,
  arg = NULL,
  domain = NULL,
  penalized = TRUE,
  global = FALSE,
  resolution = NULL,
  ...
)

Arguments

data

a matrix, data.frame or list of suitable shape, or another tf-object containing functional data.

...

arguments passed on to mgcv::s(), which sets up the basis, and to mgcv::magic() or mgcv::gam.fit(), which estimate the coefficients (if penalized is TRUE). If not specified here, tidyfun uses k = 25 cubic regression spline basis functions (i.e., bs = "cr") by default, but this should (!) be set appropriately.

id

The name/number of the column defining which data belong to which function.

arg

optional vector of argument values

value

The name/number of the column containing the function evaluations.

domain

range of the arg.

penalized

should the coefficients of the basis representation be estimated via mgcv::magic() (TRUE, the default) or via ordinary least squares (FALSE)?

global

Defaults to FALSE. If TRUE and penalized = TRUE, all functions share the same smoothing parameter (see Details).

resolution

resolution of the evaluation grid. See details for tfd().

Value

a tfb-object

Details

The basis is set up via a call to mgcv::s(), and all the spline bases discussed in mgcv::smooth.terms() are available, in principle. Depending on the values of the penalized and global flags, the coefficient vector for each observation is then estimated by fitting a GAM (separately for each observation, if !global) via mgcv::magic() (least squares error, the default) or mgcv::gam() (if a family argument was supplied), or by unpenalized least squares / maximum likelihood.
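As a minimal sketch of the two estimation modes, the following fits the same curves once with the penalized default and once with unpenalized least squares. The simulated matrix y and the grid arg are made up for illustration, and the layout (one curve per row, one column per argument value) is an assumption about what "suitable shape" means for the matrix method.

```r
library(tidyfun)

set.seed(1)
# Hypothetical data: 5 noisy sine curves observed on a common grid.
arg <- seq(0, 1, length.out = 51)
y <- t(replicate(5, sin(2 * pi * arg) + rnorm(length(arg), sd = 0.2)))

# Penalized fit (the default); verbose = FALSE suppresses the report.
fb_pen <- tfb_spline(y, arg = arg, penalized = TRUE, verbose = FALSE)

# Unpenalized least squares fit on the same default basis.
fb_ols <- tfb_spline(y, arg = arg, penalized = FALSE, verbose = FALSE)
```

Both calls return one basis-represented curve per row of y; the unpenalized fit will typically follow the noise more closely.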

After the "smoothed" representation is computed, the amount of smoothing that was performed is reported in terms of the "percentage of variability preserved", which is the variance (explained deviance, in the general case) of the smoothed function values divided by the variance of the original values (null deviance, in the general case). Reporting can be switched off with verbose = FALSE.
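The reported ratio can be illustrated for a single curve and squared error loss. This sketch calls mgcv::gam() directly rather than tfb_spline(), so it only mirrors the definition given above and is not the package's internal computation; x, y and fit are made-up names.

```r
library(mgcv)

set.seed(1)
# Hypothetical single noisy curve.
x <- seq(0, 1, length.out = 101)
y <- sin(2 * pi * x) + rnorm(length(x), sd = 0.3)

# Penalized cubic regression spline smooth with k = 25 basis functions.
fit <- gam(y ~ s(x, k = 25, bs = "cr"))

# Variance of the smoothed values divided by variance of the raw values.
preserved <- var(fitted(fit)) / var(y)
round(100 * preserved, 1)
```

A value well below 100 means that much of the raw variability, which may include signal as well as noise, was smoothed away.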

The ... arguments are passed on to both the spline basis (via mgcv::s()) and the estimation (via mgcv::magic() or mgcv::gam()). The most important are:

  • how many basis functions k the spline basis should have; the default is 25.

  • which type of spline basis bs should be used; the default is cubic regression splines ("cr").

  • a family-argument to the fitters for data for which squared errors are not a reasonable criterion for the representation accuracy (see mgcv::family.mgcv() for what's available).

  • an sp-argument for manually fixing the amount of smoothing (see mgcv::s()), which (drastically) reduces the computation time.
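The options listed above might be passed as follows. This is a sketch under assumptions: the data are simulated, the matrix is assumed to hold one curve per row, and the particular values k = 12, bs = "ps" and sp = 1 are arbitrary choices for illustration, not recommendations.

```r
library(tidyfun)

set.seed(2)
# Hypothetical data: 4 noisy cosine curves on a common grid.
arg <- seq(0, 1, length.out = 61)
y <- t(replicate(4, cos(2 * pi * arg) + rnorm(length(arg), sd = 0.1)))

# Smaller basis and a different basis type (P-splines, see mgcv::smooth.terms()).
fb_ps <- tfb_spline(y, arg = arg, k = 12, bs = "ps", verbose = FALSE)

# Fixed smoothing parameter: no per-curve optimization is needed.
fb_sp <- tfb_spline(y, arg = arg, k = 12, sp = 1, verbose = FALSE)
```

Fixing sp trades adaptivity for speed, so it is worth checking the resulting fits against the raw data.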

If global == TRUE, the routine first takes a subset of curves (10% of the curves, sampled deterministically; at most 100, at least 5), estimates a smoothing parameter for each curve in that subset, and then uses the mean of these log smoothing parameters for all curves. On large datasets, this can be much faster than optimizing the smoothing parameter for each curve. For very sparse data, it would be preferable to estimate a joint smoothing parameter directly from all curves, but this is not what is implemented here.
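The pooling step can be sketched in base R. Everything here is illustrative: the evenly spaced index rule is only one possible way to sample deterministically, the per-curve smoothing parameters in sp_subset are invented numbers, and none of the variable names come from the package.

```r
n_curves <- 250

# 10% of the curves, but at least 5 and at most 100.
n_subset <- min(max(ceiling(0.1 * n_curves), 5), 100)

# Assumed deterministic rule: evenly spaced curve indices.
subset_idx <- unique(round(seq(1, n_curves, length.out = n_subset)))

# Hypothetical smoothing parameters estimated on the subset curves.
sp_subset <- exp(seq(-2, 2, length.out = length(subset_idx)))

# Shared smoothing parameter: average on the log scale, then back-transform.
sp_global <- exp(mean(log(sp_subset)))
```

Averaging on the log scale gives the geometric mean, so a few very large per-curve smoothing parameters do not dominate the shared value. In practice this whole procedure is requested simply by setting global = TRUE in the call.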

Methods (by class)

  • data.frame: convert data frames

  • matrix: convert matrices

  • numeric: convert numeric vectors

  • list: convert lists

  • tfd: convert tfd (raw functional data)

  • tfb: convert tfb: modify basis representation, smoothing.

  • default: default method, returning prototype when data is NULL
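For the data.frame method, a long-format input might look like the sketch below. The column names curve, t and y are made up for illustration; id, arg and value then point the method at them by name (column numbers work as well, per the argument descriptions above).

```r
library(tidyfun)

set.seed(3)
arg <- seq(0, 1, length.out = 41)

# Hypothetical long-format data: one row per (curve, argument value) pair.
long <- data.frame(
  curve = rep(1:3, each = length(arg)),
  t = rep(arg, times = 3)
)
long$y <- sin(2 * pi * long$t) + long$curve + rnorm(nrow(long), sd = 0.1)

# Tell the method which columns hold curve id, argument and function value.
fb_long <- tfb_spline(long, id = "curve", arg = "t", value = "y",
                      k = 10, verbose = FALSE)
```

The result should contain one curve per distinct value of the id column.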

See also

mgcv::smooth.terms() for spline basis options.