TransformDefinition

Bases: BaseDefinition

Create and manage a transform to perform a schema-to-schema crosswalk on a tabular data source.

Parameters:

transform : TransformModel | dict | Path | str | None, default None
    Path to a transform definition, or a dictionary conforming to a transform model.

crosswalk : CrosswalkDefinition | CrosswalkModel | dict | Path | str | None, default None
    A definition, or a dictionary conforming to the CrosswalkModel, or a path to a saved definition.

data_source : DataSourceModel | dict | Path | str | None, default None
    Path to a tabular data source, or a dictionary conforming to a data source model.

Example

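Create a new TransformDefinition, perform the crosswalk, then save the definition and the transformed data as follows:

import whyqd as qd

# CROSSWALK, DATASOURCE and DIRECTORY are placeholders for your own crosswalk, data source and output directory
transform = qd.TransformDefinition(crosswalk=CROSSWALK, data_source=DATASOURCE)
transform.process()  # perform the crosswalk
transform.save(directory=DIRECTORY)  # save the definition and the transformed data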

get: TransformModel | None property

Get the transform model.

Returns:

TransformModel | None
    Pydantic TransformModel or None

process()

Perform the crosswalk. After completion, the resulting dataframe, if it exists, is available at .data.

Raises:

ValueError
    If there are missing required destination fields in the crosswalk.
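
Because process() raises ValueError when required destination fields are missing, a caller may want to catch that before attempting to save. The following is a minimal sketch under stated assumptions: process_and_save is a hypothetical helper, not part of whyqd, and it accepts any object that exposes process() and save() as documented on this page.

```python
from pathlib import Path


def process_and_save(transform, directory, mimetype=None):
    """Run the crosswalk, then save; return False if the crosswalk cannot run.

    `transform` is assumed to behave like a TransformDefinition, i.e. to
    expose process() and save() as documented here. Hypothetical helper.
    """
    try:
        transform.process()
    except ValueError as err:
        # Documented failure mode: missing required destination fields.
        print(f"Crosswalk could not be performed: {err}")
        return False
    # save() takes keyword-only arguments and returns True if saved.
    return transform.save(directory=Path(directory), mimetype=mimetype)
```

Returning False instead of re-raising is a design choice for the sketch; a pipeline that must stop on an incomplete crosswalk would let the ValueError propagate.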

save(*, filename=None, mimetype=None, directory=None, created_by=None, hide_uuid=False)

Save the model as a JSON file, and save the crosswalked destination dataframe as the chosen mimetype.

Info

By default, transformed data are saved as Parquet, as this is the most efficient format.

Declare your mimetype like so:

MIMETYPE = "csv" # upper- or lower-case is fine

Parameters:

directory : str | Path | None, default None
    Defaults to working directory

filename : str | None, default None
    Defaults to model name

mimetype : str | None, default None
    whyqd supports saving to CSV, XLS, XLSX, Feather and Parquet files. Defaults to Parquet.

created_by : str | None, default None
    Declare the model creator/updater

hide_uuid : bool, default False
    Hide all UUIDs in the nested JSON output.

Returns:

bool
    Boolean True if saved.
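
The mimetype rules for save() (case-insensitive names, five supported formats, Parquet as the default) can be sketched as a small normaliser. This is an illustration only, not whyqd's own code: SUPPORTED_MIMETYPES and normalise_mimetype are hypothetical names, and raising ValueError for an unsupported name is an assumption.

```python
# Formats listed for save(); the upper-case spelling is a convention of this sketch.
SUPPORTED_MIMETYPES = {"CSV", "XLS", "XLSX", "FEATHER", "PARQUET"}


def normalise_mimetype(mimetype=None):
    """Return an upper-case mimetype name, falling back to PARQUET when None."""
    if mimetype is None:
        return "PARQUET"
    name = mimetype.strip().upper()
    if name not in SUPPORTED_MIMETYPES:
        raise ValueError(f"Unsupported mimetype: {mimetype!r}")
    return name
```

With this helper, "csv", "CSV" and "Csv" all resolve to the same name, which mirrors the note that upper- or lower-case is fine.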

set(*, transform=None, crosswalk=None, data_source=None)

Update or create the transform.

Parameters:

transform : TransformModel | dict | Path | str | None, default None
    Path to a transform definition, or a dictionary conforming to a transform model.

crosswalk : CrosswalkDefinition | CrosswalkModel | dict | Path | str | None, default None
    A definition, or a dictionary conforming to the CrosswalkModel, or a path to a saved definition.

data_source : DataSourceModel | dict | Path | str | None, default None
    Path to a tabular data source, or a dictionary conforming to a data source model.
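
Each parameter above accepts either an in-memory dictionary or a path to a saved definition. Here is a rough sketch of how such an input might be resolved to a plain dictionary, assuming saved definitions are JSON files (as written by save()). load_definition is a hypothetical helper and does not reflect how whyqd loads or validates its models.

```python
import json
from pathlib import Path


def load_definition(source):
    """Return a plain dict from a dict, a path to a JSON file, or None."""
    if source is None:
        return None
    if isinstance(source, dict):
        return dict(source)  # shallow copy so the caller's dict is not mutated
    # Anything else is treated as a path (str or Path) to a saved JSON definition.
    with Path(source).open(encoding="utf-8") as handle:
        return json.load(handle)
```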

validate(*, transform, data_destination, mimetype_destination=None, data_source=None, mimetype_source=None)

Validate the transformation process and all data checksums. All actions are performed on each interim data source.
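
The idea of checking a data checksum can be sketched as below. This is an illustration only: this page does not say which hash whyqd uses, so BLAKE2b is an assumption, and file_checksum and checksum_matches are hypothetical helpers.

```python
import hashlib
from pathlib import Path


def file_checksum(path, chunk_size=65536):
    """Return the hex BLAKE2b digest of a file, read in chunks."""
    digest = hashlib.blake2b()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected):
    """Compare a file's digest with the value recorded for it."""
    return file_checksum(path) == expected
```

Any change to the file's bytes changes the digest, so a mismatch signals that the data differ from what was recorded when the transform was saved.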

Parameters:

transform : TransformModel | dict | Path | str, required
    Path to a transform definition, or a dictionary conforming to a transform model.

data_destination : DataSourceModel | dict | Path | str, required
    Path to a tabular data source, or a dictionary conforming to a data source model. Destination data for crosswalk validation.

mimetype_destination : str | MimeType | None, default None
    whyqd supports reading from CSV, XLS, XLSX, Feather and Parquet files. Required if data_destination is not a DataSourceModel.

data_source : DataSourceModel | dict | Path | str | None, default None
    Path to a tabular data source, or a dictionary conforming to a data source model. Should be defined in transform, but you may have a different version from a different location.

mimetype_source : str | MimeType | None, default None
    whyqd supports reading from CSV, XLS, XLSX, Feather and Parquet files. Required if data_source is provided (i.e. not from the transform) and is not a DataSourceModel.

Raises:

ValueError
    If any steps fail to validate.

Returns:

bool
    A boolean True on successful validation.
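
Since validate() returns True on success and raises ValueError on failure, a caller that wants a plain yes/no answer could wrap it as follows. This is a sketch under stated assumptions: is_valid_transform is a hypothetical helper, and `definition` is any object exposing validate() with the keyword-only signature documented above.

```python
def is_valid_transform(definition, transform, data_destination, mimetype_destination=None):
    """Return True if validation succeeds, False if any step fails to validate."""
    try:
        return definition.validate(
            transform=transform,
            data_destination=data_destination,
            mimetype_destination=mimetype_destination,
        )
    except ValueError as err:
        # Documented failure mode: one or more steps failed to validate.
        print(f"Validation failed: {err}")
        return False
```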