Create a blueprint

blueprint(
  name,
  command,
  description = NULL,
  metadata = NULL,
  annotate = FALSE,
  metadata_file_type = c("csv"),
  metadata_file_name = NULL,
  metadata_directory = NULL,
  metadata_file_path = NULL,
  extra_steps = NULL,
  ...,
  class = character()
)

Arguments

name

The name of the blueprint

command

The code to build the target dataset

description

An optional description of the dataset to be used for codebook generation

metadata

The associated variable metadata for this dataset

annotate

If TRUE, the cleanup phase "annotates" the dataset by adding a variable attribute for each metadata field, which makes metadata provenance easier to trace and keeps it responsive to code changes.

metadata_file_type

The kind of metadata file. Currently, only "csv" is supported.

metadata_file_name

The file name for the metadata file. If the option blueprintr.use_local_metadata_path is set to TRUE, then the default file name will be the name of the blueprint script, minus the .R extension. Otherwise, this will default to the name of the blueprint.

metadata_directory

Where the metadata file will be stored. If the option blueprintr.use_local_metadata_path is set to TRUE, then the default location will be the folder where the blueprint script is located. Otherwise, this will default to here::here("blueprints").

metadata_file_path

Overrides the metadata file path generated by metadata_directory, name, and metadata_file_type if not NULL.

extra_steps

A list() of extra 'bpstep' objects, which add extra targets to the workflow after the desired dataset has completed its cleanup phase. Uses of this could include generating codebooks or other reports based on the built data. See bp_add_bpstep() for more details.

...

Any other parameters and settings for the blueprint

class

An optional subclass that extends blueprint capabilities; reserved for future work.
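
The precedence among metadata_file_path, metadata_directory, metadata_file_name, and metadata_file_type described above can be sketched in base R. resolve_metadata_path() is a hypothetical helper written for this illustration, not a blueprintr function. It assumes blueprintr.use_local_metadata_path is not set, and it substitutes a plain relative "blueprints" folder for here::here("blueprints") to stay free of dependencies.

```r
# Hypothetical helper; it mirrors the documented precedence only and is
# not how blueprintr is implemented.
resolve_metadata_path <- function(name,
                                  metadata_file_type = "csv",
                                  metadata_file_name = NULL,
                                  metadata_directory = NULL,
                                  metadata_file_path = NULL) {
  # An explicit file path overrides every other setting.
  if (!is.null(metadata_file_path)) {
    return(metadata_file_path)
  }
  # Assumed defaults when blueprintr.use_local_metadata_path is not TRUE.
  if (is.null(metadata_file_name)) metadata_file_name <- name
  if (is.null(metadata_directory)) metadata_directory <- "blueprints"
  file.path(
    metadata_directory,
    paste0(metadata_file_name, ".", metadata_file_type)
  )
}

default_path <- resolve_metadata_path("survey")
custom_path <- resolve_metadata_path("survey", metadata_file_path = "meta/custom.csv")
```

Here default_path is built from the blueprint name, while custom_path is returned untouched because metadata_file_path was supplied.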

Value

A blueprint object
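
A minimal call might look like the sketch below. The blueprint name, the description, and the use of the built-in mtcars dataset as the command are illustrative choices, and the call is guarded so that it only runs where blueprintr is installed.

```r
# Illustrative only: the name, description, and dataset are placeholders.
bp <- NULL
if (requireNamespace("blueprintr", quietly = TRUE)) {
  bp <- blueprintr::blueprint(
    "mtcars_clean",
    description = "Motor Trend car road tests, unchanged",
    command = mtcars
  )
}
```

With every metadata_* argument left at its default, the metadata file location follows the rules given under metadata_file_name and metadata_directory.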

Cleanup Tasks

blueprintr offers post-check cleanup tasks that align the dataset with its metadata as closely as possible. Two tasks run by default:

  1. Reorders variables to match metadata order.

  2. Drops variables marked with dropped == TRUE, provided the metadata contains a dropped field.
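
The two default tasks can be imitated in base R as follows. This is a sketch on toy data, not blueprintr's own cleanup code; the data frames dat and meta and their contents are made up for the illustration, with name and dropped standing in for the corresponding metadata fields.

```r
# Toy dataset with columns out of order and one column slated for removal.
dat <- data.frame(mpg = c(21, 22.8), id = c(1L, 2L), scratch = c("a", "b"))

# Toy metadata: row order defines the desired column order.
meta <- data.frame(
  name = c("id", "mpg", "scratch"),
  dropped = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# 1. Reorder variables to match metadata order.
dat <- dat[, meta$name, drop = FALSE]

# 2. Drop variables flagged with dropped == TRUE, if the field is present.
#    `%in% TRUE` treats a missing (NA) flag as "not dropped".
if ("dropped" %in% names(meta)) {
  keep <- meta$name[!(meta$dropped %in% TRUE)]
  dat <- dat[, keep, drop = FALSE]
}

names(dat)  # "id" "mpg"
```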

The remaining tasks have to be enabled by the user:

  • If labelled = TRUE in the blueprint() command, all columns will be converted to labelled() columns, provided that at least the description field is filled in. If the coding column is present in the metadata, then the categorical levels specified by a coding() will be added to the column as well. If the description field holds detailed column descriptions, a title field can be added to the metadata to supply short titles for the columns.
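
The labelling logic can be approximated in base R by attaching plain attributes. This sketch does not create real labelled() columns or use coding() objects: the "label" and "labels" attributes, the codings list, and the rule that title takes precedence over description are assumptions made for illustration.

```r
dat <- data.frame(sex = c(1L, 2L, 1L), age = c(34, 51, 28))

# Toy metadata: title is optional, description is always filled in.
meta <- data.frame(
  name = c("sex", "age"),
  title = c("Sex", NA),
  description = c("Self-reported sex of the respondent", "Age in years"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for a coding(): a plain named vector per column.
codings <- list(sex = c(Male = 1L, Female = 2L))

for (i in seq_len(nrow(meta))) {
  col <- meta$name[i]
  # Assumed rule: prefer the short title, else fall back to description.
  label <- if (!is.na(meta$title[i])) meta$title[i] else meta$description[i]
  attr(dat[[col]], "label") <- label
  # Attach value labels only where a coding is available.
  if (!is.null(codings[[col]])) {
    attr(dat[[col]], "labels") <- codings[[col]]
  }
}

attr(dat$sex, "label")
attr(dat$age, "label")
```

The sex column ends up with both a variable label and value labels, while age carries only the label taken from its description.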