collective.transmute.utils#

collective.transmute.utils.data#

Data utilities for collective.transmute.

This module provides helper functions for sorting and manipulating data structures used in the transformation pipeline. Functions here are designed to be reusable across steps and reporting.

collective.transmute.utils.data.sort_data_by_value(data: dict[str, int], reverse: bool = True) tuple[tuple[str, int], ...][source]#

Sort a dictionary by its values and return a tuple of key-value pairs.

Parameters:
  • data (dict[str, int]) -- The dictionary to sort.

  • reverse (bool, optional) -- Whether to sort in descending order (default: True).

Returns:

A tuple of (key, value) pairs sorted by value.

Return type:

tuple[tuple[str, int], ...]

Example

>>> data = {'a': 2, 'b': 5, 'c': 1}
>>> sort_data_by_value(data)
(('b', 5), ('a', 2), ('c', 1))

collective.transmute.utils.default_page#

Default page utilities for collective.transmute.

This module provides helper functions for handling and merging default page items in the transformation pipeline. Functions here are designed to support merging parent item data into default pages, and to handle special cases such as Link types.

Handle the default page when the item is a Link type.

Parameters:

item (PloneItem) -- The item to process as a Link.

Returns:

The updated item converted to a Document type with link text.

Return type:

PloneItem

collective.transmute.utils.default_page._merge_items(parent_item: PloneItem, item: PloneItem, keys_from_parent: tuple[str, ...]) PloneItem[source]#

Merge selected keys from the parent item into the current item.

Parameters:
  • parent_item (PloneItem) -- The parent item whose keys will be merged.

  • item (PloneItem) -- The current item to update.

  • keys_from_parent (tuple[str, ...]) -- Keys to copy from the parent item.

Returns:

The updated item with merged keys and parent UID.

Return type:

PloneItem

collective.transmute.utils.default_page.handle_default_page(parent_item: PloneItem, item: PloneItem, keys_from_parent: tuple[str, ...]) PloneItem[source]#

Handle the default page by merging the parent item into the current item.

If the item is a Link, it's converted to a Document with link text. Otherwise, selected keys from the parent are merged into the item.

Parameters:
  • parent_item (PloneItem) -- The parent item whose keys will be merged.

  • item (PloneItem) -- The current item to update.

  • keys_from_parent (tuple[str, ...]) -- Keys to copy from the parent item.

Returns:

The updated item with merged keys and parent UID.

Return type:

PloneItem
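
Example

A minimal sketch with hypothetical item dictionaries (real PloneItem payloads carry many more fields):

>>> parent = {'@id': '/folder', 'UID': 'parent-uid', 'title': 'Folder'}
>>> page = {'@id': '/folder/front-page', 'UID': 'page-uid', '@type': 'Document'}
>>> merged = handle_default_page(parent, page, ('title',))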

collective.transmute.utils.exportimport#

Export/import utilities for collective.transmute.

This module provides asynchronous helper functions for preparing and handling metadata and relations during the transformation pipeline. Functions here are used for reading, processing, and writing metadata and relations files, according to the format expected by plone.exportimport.

async collective.transmute.utils.exportimport.initialize_metadata(src_files: SourceFiles, dst: Path) MetadataInfo[source]#

Initialize and load metadata from source files into a MetadataInfo object.

Parameters:
  • src_files (SourceFiles) -- The source files containing metadata.

  • dst (Path) -- The destination path for metadata.

Returns:

The loaded metadata information object.

Return type:

MetadataInfo
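
Example

A brief sketch inside an async context (src_files would typically come from get_src_files()):

>>> metadata = await initialize_metadata(src_files, Path('import'))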

async collective.transmute.utils.exportimport.prepare_metadata_file(metadata: MetadataInfo, state: PipelineState, settings: TransmuteSettings) AsyncGenerator[tuple[dict | list, Path], None][source]#

Prepare and yield metadata files for export, including debug and relations data.

Parameters:
  • metadata (MetadataInfo) -- The metadata information object.

  • state (PipelineState) -- The pipeline state object.

  • settings (TransmuteSettings) -- The transmute settings object.

Yields:

tuple[dict | list, Path] -- Tuples of data and their corresponding file paths.
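
Example

A usage sketch mirroring the other async generators in this module:

>>> async for data, path in prepare_metadata_file(metadata, state, settings):
...     print(path)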

async collective.transmute.utils.exportimport.prepare_redirects_data(redirects: dict[str, str], metadata_path: Path, state_paths: list[tuple[str, str, str]], site_root: str) AsyncGenerator[tuple[dict[str, str], Path], None][source]#

Prepare and yield redirects data for export as a JSON file.

This function takes a mapping of redirects and yields it with the output file path. The output file is named 'redirects.json' and is used by plone.exportimport.

Parameters:
  • redirects (dict[str, str]) -- Mapping of source paths to destination paths.

  • metadata_path (Path) -- Path to the metadata file. Used to determine output location.

  • state_paths (list[tuple[str, str, str]]) -- List of valid paths from the pipeline state.

  • site_root (str) -- The root path for the destination site.

Yields:

tuple[dict[str, str], Path] -- The filtered redirects mapping and the output file path.

Example

>>> async for result in prepare_redirects_data(
...     redirects, metadata_path, state_paths, site_root
... ):
...     data, path = result
...     print(path)

async collective.transmute.utils.exportimport.prepare_relations_data(relations: list[dict[str, str]], to_fix: dict[str, str], metadata_path: Path, state: PipelineState) AsyncGenerator[tuple[list[dict], Path], None][source]#

Prepare and yield relations data for export.

Parameters:
  • relations (list[dict[str, str]]) -- List of relations dictionaries.

  • to_fix (dict[str, str]) -- Mapping of UUIDs to fix.

  • metadata_path (Path) -- Path to the metadata file.

  • state (PipelineState) -- The pipeline state object.

Yields:

tuple[list[dict], Path] -- Tuples of relations data and their corresponding file paths.
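
Example

A usage sketch (all arguments are assumed to already exist in the calling scope):

>>> async for data, path in prepare_relations_data(
...     relations, to_fix, metadata_path, state
... ):
...     print(path)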

collective.transmute.utils.files#

File utilities for collective.transmute.

This module provides asynchronous and synchronous helper functions for reading, writing, exporting, and removing files and data structures used in the transformation pipeline. Functions here support JSON, CSV, and binary blob operations.

collective.transmute.utils.files._sort_content_files(content: list[Path]) list[Path][source]#

Order content files numerically by filename.

Parameters:

content (list[Path]) -- List of file paths to sort.

Returns:

Sorted list of file paths.

Return type:

list[Path]
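
Example

An illustrative sketch of the numeric ordering (Path reprs simplified):

>>> _sort_content_files([Path('10.json'), Path('2.json')])
[Path('2.json'), Path('10.json')]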

collective.transmute.utils.files.check_path(path: Path) bool[source]#

Check if a path exists.

Parameters:

path (Path) -- The path to check.

Returns:

True if the path exists, False otherwise.

Return type:

bool

collective.transmute.utils.files.check_paths(src: Path, dst: Path) bool[source]#

Check if both source and destination paths exist.

Parameters:
  • src (Path) -- The source path.

  • dst (Path) -- The destination path.

Returns:

True if both paths exist.

Return type:

bool

Raises:

RuntimeError -- If either path does not exist.

async collective.transmute.utils.files.csv_dump(data: dict | list, header: list[str], path: Path) Path[source]#

Dump data to a CSV file.

Parameters:
  • data (dict or list) -- The data to write to CSV.

  • header (list[str]) -- The list of column headers.

  • path (Path) -- The file path to write to.

Returns:

The path to the written CSV file.

Return type:

Path
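
Example

A sketch inside an async context; the row shape (dicts keyed by the header columns) is an assumption:

>>> rows = [{'path': '/a', 'status': 'dropped'}]
>>> path = await csv_dump(rows, ['path', 'status'], Path('report.csv'))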

async collective.transmute.utils.files.csv_loader(path: Path) list[dict][source]#

Load data from a CSV file.

Parameters:

path (Path) -- The file path to read from.

Returns:

The loaded data from the CSV file.

Return type:

list[dict]

async collective.transmute.utils.files.export_blob(field: str, blob: dict, content_path: Path, item_id: str) dict[source]#

Export a binary blob to disk and update its metadata.

Parameters:
  • field (str) -- The field name for the blob.

  • blob (dict) -- The blob metadata and data.

  • content_path (Path) -- The parent content path.

  • item_id (str) -- The item identifier.

Returns:

The updated blob metadata including the blob path.

Return type:

dict

async collective.transmute.utils.files.export_item(item: PloneItem, parent_folder: Path) ItemFiles[source]#

Export an item and its blobs to disk.

Parameters:
  • item (PloneItem) -- The item to export.

  • parent_folder (Path) -- The parent folder for the item.

Returns:

An object containing the data file path and blob file paths.

Return type:

ItemFiles
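
Example

A brief sketch inside an async context (item is a hypothetical PloneItem):

>>> item_files = await export_item(item, Path('import/content'))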

async collective.transmute.utils.files.export_metadata(metadata: MetadataInfo, state: PipelineState, consoles: ConsoleArea, settings: TransmuteSettings) Path[source]#

Export metadata to disk, including debug and relations files if needed.

Parameters:
  • metadata (MetadataInfo) -- The metadata information object.

  • state (PipelineState) -- The pipeline state object.

  • consoles (ConsoleArea) -- The console area for logging.

  • settings (TransmuteSettings) -- The transmute settings object.

Returns:

The path to the last written metadata file.

Return type:

Path

collective.transmute.utils.files.get_src_files(src: Path) SourceFiles[source]#

Return a SourceFiles object containing metadata and content files from a directory.

Parameters:

src (Path) -- The source directory to scan.

Returns:

An object containing lists of metadata and content files.

Return type:

SourceFiles

async collective.transmute.utils.files.json_dump(data: dict | list, path: Path) Path[source]#

Dump JSON data to a file asynchronously.

Parameters:
  • data (dict or list) -- The data to serialize and write.

  • path (Path) -- The file path to write to.

Returns:

The path to the written file.

Return type:

Path

collective.transmute.utils.files.json_dumps(data: dict | list) bytes[source]#

Dump a dictionary or list to a JSON-formatted bytes object.

Parameters:

data (dict or list) -- The data to serialize to JSON.

Returns:

The JSON-encoded data as bytes.

Return type:

bytes
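
Example

A small sketch; the exact byte layout depends on the underlying encoder:

>>> payload = json_dumps({'title': 'Page'})
>>> isinstance(payload, bytes)
True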

async collective.transmute.utils.files.json_reader(files: Iterable[Path]) AsyncGenerator[tuple[str, PloneItem], None][source]#

Asynchronously read JSON files and yield filename and data.

Parameters:

files (Iterable[Path]) -- Iterable of file paths to read.

Yields:

tuple[str, PloneItem] -- Filename and loaded JSON data.
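
Example

A usage sketch (files is any iterable of JSON file paths):

>>> async for filename, item in json_reader(files):
...     print(filename)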

collective.transmute.utils.files.remove_data(path: Path, consoles: ConsoleArea | None = None)[source]#

Remove all data inside a given path, including files and directories.

Parameters:
  • path (Path) -- The path whose contents will be removed.

  • consoles (ConsoleArea, optional) -- The console area for logging (default: None).

collective.transmute.utils.item#

Item utilities for collective.transmute.

This module provides helper functions for generating UIDs, handling parent paths, creating image items, and managing relations in the transformation pipeline. Functions support common item operations.

collective.transmute.utils.item.add_annotation(item_uid: str, key: str, value: Any, state: PipelineState)[source]#

Add a new annotation to the pipeline state.

Parameters:
  • item_uid (str) -- The UID of the item to annotate.

  • key (str) -- The annotation key.

  • value (Any) -- The annotation value.

  • state (PipelineState) -- The pipeline state object.

collective.transmute.utils.item.add_relation(src_item: PloneItem, dst_item: PloneItem, attribute: str, metadata: MetadataInfo)[source]#

Add a new relation to the relations list in metadata.

Parameters:
  • src_item (PloneItem) -- The source item for the relation.

  • dst_item (PloneItem) -- The destination item for the relation.

  • attribute (str) -- The attribute name for the relation.

  • metadata (MetadataInfo) -- The metadata object to update.
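
Example

A minimal sketch ('relatedItems' stands in for any relation attribute; the other arguments are assumed to exist):

>>> add_relation(src_item, dst_item, 'relatedItems', metadata)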

collective.transmute.utils.item.all_parents_for(id_: str) set[str][source]#

Given an @id, return all possible parent paths.

Parameters:

id_ (str) -- The item id (path) to process.

Returns:

A set of all parent paths for the given id.

Return type:

set[str]

Example

>>> all_parents_for('a/b/c')
{'a', 'a/', 'a/b', 'a/b/'}

collective.transmute.utils.item.create_image_from_item(parent: PloneItem) PloneItem[source]#

Create a new image object to be placed inside the parent item.

Parameters:

parent (PloneItem) -- The parent item containing image data.

Returns:

A new image item dictionary.

Return type:

PloneItem

Example

>>> parent = {'@id': 'folder', 'image': {'filename': 'img.png'}}
>>> img_item = create_image_from_item(parent)
>>> img_item['@type']
'Image'

collective.transmute.utils.item.generate_uid() str[source]#

Generate a new UID for an item.

Returns:

A unique identifier string without dashes.

Return type:

str

Example

>>> uid = generate_uid()
>>> len(uid) == 32
True

collective.transmute.utils.item.get_annotation(item_uid: str, key: str, default_value: Any, state: PipelineState) Any[source]#

Return an existing annotation value from the pipeline state.

Parameters:
  • item_uid (str) -- The UID of the annotated item.

  • key (str) -- The annotation key.

  • default_value (Any) -- The value returned if the annotation is not set.

  • state (PipelineState) -- The pipeline state object.

Returns:

The value stored in the annotation, or default_value if not set.

Return type:

Any

collective.transmute.utils.item.pop_annotation(item_uid: str, key: str, default_value: Any, state: PipelineState) Any[source]#

Pop an existing annotation from the pipeline state.

Parameters:
  • item_uid (str) -- The UID of the annotated item.

  • key (str) -- The annotation key.

  • default_value (Any) -- The value returned if the annotation is not set.

  • state (PipelineState) -- The pipeline state object.

Returns:

The value stored in the annotation, or default_value if not set.

Return type:

Any
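
Example

A round-trip sketch (state is an existing PipelineState; the shown output assumes the annotation was set as in the first line):

>>> add_annotation('item-uid', 'seen', True, state)
>>> get_annotation('item-uid', 'seen', False, state)
True
>>> pop_annotation('item-uid', 'seen', False, state)
True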

collective.transmute.utils.performance#

Performance utilities for collective.transmute.

This module provides context managers and helpers for timing and reporting performance metrics during the transformation pipeline. Functions support logging execution times.

collective.transmute.utils.performance.report_time(title: str, consoles: ConsoleArea)[source]#

Context manager to report the start and end time of a process.

Parameters:
  • title (str) -- The title or label for the timed process.

  • consoles (ConsoleArea) -- The console area for logging messages.

Example

>>> with report_time('Step 1', consoles):
...     pass  # code to be timed goes here

collective.transmute.utils.pipeline#

Pipeline utilities for collective.transmute.

This module provides helper functions for loading pipeline steps and processors by dotted names, checking step availability, and managing step configuration in the transformation pipeline. Functions support pipeline extensibility and dynamic loading.

collective.transmute.utils.pipeline.check_steps(names: tuple[str, ...]) list[tuple[str, bool]][source]#

Check if pipeline step functions can be loaded from dotted names.

Parameters:

names (tuple[str, ...]) -- Tuple of dotted function names.

Returns:

List of (name, status) tuples indicating if each step is available.

Return type:

list[tuple[str, bool]]
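
Example

A sketch with hypothetical dotted names; the exact entries depend on which modules are importable:

>>> check_steps(('my_module.my_step', 'missing.module.step'))
[('my_module.my_step', True), ('missing.module.step', False)]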

collective.transmute.utils.pipeline.load_all_steps(names: tuple[str, ...]) tuple[PipelineStep | ReportStep | PrepareStep, ...][source]#

Load and return all pipeline step functions from a tuple of dotted names.

Each name should be a string in dotted notation (e.g., 'module.submodule.func'). Steps are loaded using load_step and returned as a tuple. If a step cannot be loaded, a RuntimeError will be raised by load_step.

Parameters:

names (tuple[str, ...]) -- Tuple of dotted function names to load.

Returns:

Tuple of loaded pipeline step functions.

Return type:

tuple[PipelineStep | ReportStep | PrepareStep, ...]

Raises:

RuntimeError -- If any step cannot be loaded.

collective.transmute.utils.pipeline.load_processor(type_: str, settings: TransmuteSettings) ItemProcessor[source]#

Load a processor function for a given type from settings.

Parameters:
  • type_ (str) -- The type for which to load the processor.

  • settings (TransmuteSettings) -- The transmute settings object containing processor configuration.

Returns:

The loaded processor function.

Return type:

ItemProcessor

Raises:

RuntimeError -- If the processor function cannot be found.
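
Example

A brief sketch (assumes a processor is configured for the 'Document' type in settings):

>>> processor = load_processor('Document', settings)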

collective.transmute.utils.pipeline.load_step(name: str) PipelineStep[source]#

Load a pipeline step function from a dotted name.

Parameters:

name (str) -- The dotted name of the function (e.g., module.submodule.func).

Returns:

The loaded pipeline step function.

Return type:

PipelineStep

Raises:

RuntimeError -- If the module or function cannot be found.

Example

>>> step = load_step('my_module.my_step')

collective.transmute.utils.portal_types#

Portal type utilities for collective.transmute.

This module provides helper functions for mapping and fixing portal types based on settings in the transformation pipeline. Functions support type normalization and lookup.

collective.transmute.utils.portal_types.fix_portal_type(type_: str) str[source]#

Return the mapped portal type for a given type using settings.

Parameters:

type_ (str) -- The type to map to a portal type.

Returns:

The mapped portal type, or an empty string if not found.

Return type:

str

Example

>>> fix_portal_type('Document')
'Document'  # or mapped value from settings

collective.transmute.utils.redirects#

Redirect utilities for collective.transmute.

This module provides helper functions for adding, filtering, and normalizing redirect mappings in the transformation pipeline. Functions here prepare the redirects data consumed by plone.exportimport.

collective.transmute.utils.redirects.add_redirect(redirects: dict[str, str], src: str, dest: str, site_root: str) None[source]#

Add a redirect mapping from source to destination.

This function adds a new redirect entry to the provided redirects dictionary. If the source path already exists in the dictionary, it will be overwritten with the new destination path.

Parameters:
  • redirects (dict[str, str]) -- The redirects mapping to update.

  • src (str) -- The source path for the redirect.

  • dest (str) -- The destination path for the redirect.

  • site_root (str) -- The root path for the destination site.

Example

>>> add_redirect(redirects, '/old-path', '/new-path', site_root)

collective.transmute.utils.redirects.filter_redirects(raw_redirects: dict[str, str], valid_paths: set[str]) dict[str, str][source]#

Filter redirects to include only those with valid destination paths.

This function returns a new dictionary containing only redirects whose destination path is either external or present in the set of valid paths.

Parameters:
  • raw_redirects (dict[str, str]) -- Mapping of source paths to destination paths.

  • valid_paths (set[str]) -- Set of valid internal destination paths. Should include site root prefix.

Returns:

Filtered redirects mapping with only valid destinations.

Return type:

dict[str, str]

Example

>>> filtered = filter_redirects(raw_redirects, valid_paths)

collective.transmute.utils.redirects.initialize_redirects(raw_redirects: dict[str, str], settings: TransmuteSettings | None = None) dict[str, str][source]#

Initialize and normalize a mapping of redirects for migration.

This function updates source and destination paths in the redirects mapping according to the configured site roots in settings. If the source and destination roots differ, it replaces the source root prefix in both keys and values with the destination root, ensuring all redirects are valid for the target site.

Parameters:
  • raw_redirects (dict[str, str]) -- Raw mapping of source paths to target paths.

  • settings (TransmuteSettings | None) -- The transmute settings object. If None, the default settings will be used.

Returns:

The cleaned and normalized redirects mapping.

Return type:

dict[str, str]

Example

>>> redirects = initialize_redirects(raw_redirects, settings)

Note

If the source and destination roots are identical, no replacement occurs. Only paths starting with the source root are updated.

collective.transmute.utils.querystring#

Querystring utilities for collective.transmute.

This module provides helper functions for cleaning up, deduplicating, and post-processing querystring definitions used in Plone collections and listing blocks. Functions support normalization and transformation of querystring items and values.

collective.transmute.utils.querystring._process_date_between(raw_value: list[str]) tuple[str, list[str] | str][source]#

Process a date between operation for querystring items.

Parameters:

raw_value (list[str]) -- List containing two date strings.

Returns:

The operation and processed value(s).

Return type:

tuple[str, list[str] | str]

collective.transmute.utils.querystring.cleanup_querystring(query: list[dict]) tuple[list[dict], bool][source]#

Clean up the querystring of a collection-like object or listing block.

Parameters:

query (list[dict]) -- The querystring to clean up.

Returns:

The cleaned querystring and a post-processing status flag.

Return type:

tuple[list[dict], bool]
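
Example

A sketch with a single hypothetical querystring entry; the second return value flags whether post_process_querystring() still needs to run:

>>> query = [{'i': 'path', 'o': 'plone.app.querystring.operation.string.path', 'v': '/folder'}]
>>> cleaned, post_process = cleanup_querystring(query)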

collective.transmute.utils.querystring.cleanup_querystring_item(item: dict) tuple[dict, bool][source]#

Clean up a single item in a querystring definition.

Parameters:

item (dict) -- The querystring item to clean up.

Returns:

The cleaned item and a post-processing status flag.

Return type:

tuple[dict, bool]

collective.transmute.utils.querystring.deduplicate_value(value: list | None) list | None[source]#

Deduplicate values in a list, preserving None.

Parameters:

value (list or None) -- The list to deduplicate.

Returns:

The deduplicated list, or None if input is None.

Return type:

list or None
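
Example

An illustrative sketch, assuming order-preserving deduplication:

>>> deduplicate_value(['a', 'b', 'a'])
['a', 'b']
>>> deduplicate_value(None) is None
True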

collective.transmute.utils.querystring.parse_path_value(value: str) str[source]#

Parse a path value to ensure it is a valid URL or UID reference.

Parameters:

value (str) -- The path value to parse.

Returns:

The parsed path value, possibly converted to UID format.

Return type:

str

Example

>>> parse_path_value('12345678901234567890123456789012')
'UID##12345678901234567890123456789012##'

collective.transmute.utils.querystring.post_process_querystring(query: list[dict], state: PipelineState) list[dict][source]#

Post-process a querystring, replacing UID references with actual paths.

Parameters:
  • query (list[dict]) -- The querystring to post-process.

  • state (PipelineState) -- The pipeline state object containing UID-path mapping.

Returns:

The post-processed querystring.

Return type:

list[dict]

collective.transmute.utils.settings#

Settings utilities for collective.transmute.

This module provides helper classes and functions for handling custom TOML encoding and registration of encoders for settings serialization. Functions and classes support custom data types in configuration files.

class collective.transmute.utils.settings.SetItem[source]#

TOMLKit Array subclass for encoding Python sets as TOML arrays.

Returns:

The set converted to a list of strings.

Return type:

list[str]

unwrap() list[str][source]#

Unwrap the set item to a list of strings.

Returns:

The set as a list of strings.

Return type:

list[str]

collective.transmute.utils.settings._fix_arrays(table: Table) Table[source]#

Ensure all arrays in the table are multiline.

collective.transmute.utils.settings.register_encoders()[source]#

Register custom encoders for tomlkit to handle Python sets.

Example

>>> register_encoders()

collective.transmute.utils.settings.set_encoder(obj: set) Item[source]#

Encode a Python set as a TOMLKit Item (Array).

Parameters:

obj (set) -- The set to encode.

Returns:

The TOMLKit Item representing the set.

Return type:

Item

Raises:

ConvertError -- If the object is not a set.

collective.transmute.utils.settings.settings_to_toml(data: dict) TOMLDocument[source]#

Convert a dictionary containing the settings to a TOMLDocument.

Parameters:

data (dict) -- The dictionary to convert.

Returns:

The resulting TOMLDocument.

Return type:

TOMLDocument
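
Example

A minimal sketch (nested dictionaries become TOML tables):

>>> doc = settings_to_toml({'pipeline': {'steps': ['my_module.my_step']}})
>>> print(doc.as_string())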

collective.transmute.utils.workflow#

Workflow utilities for collective.transmute.

This module provides helper functions for rewriting workflow history and review states in Plone items during the transformation pipeline. Functions support configuration-driven workflow normalization and migration.

collective.transmute.utils.workflow._default_rewrite(settings: dict, actions: list[WorkflowHistoryEntry]) list[WorkflowHistoryEntry][source]#

Convert a list of workflow actions using the given rewrite settings.

Parameters:
  • settings (dict) -- Workflow rewrite settings containing state and workflow mappings.

  • actions (list[WorkflowHistoryEntry]) -- The original list of workflow actions.

Returns:

The converted list of workflow actions.

Return type:

list[WorkflowHistoryEntry]

collective.transmute.utils.workflow.rewrite_settings() dict[source]#

Return workflow rewrite settings from the transmute configuration.

Returns:

Dictionary containing workflow and state rewrite mappings.

Return type:

dict

Example

>>> settings = rewrite_settings()
>>> settings['states']
{'visible': 'published'}

collective.transmute.utils.workflow.rewrite_workflow_history(item: PloneItem) PloneItem[source]#

Rewrite review_state and workflow_history for a Plone item.

Configuration should be added to transmute.toml, for example:

[review_state.rewrite]
states = {"visible" = "published"}
workflows = {"plone_workflow" = "simple_publication_workflow"}

Parameters:

item (PloneItem) -- The item whose workflow history and review state will be rewritten.

Returns:

The updated item with rewritten workflow history and review state.

Return type:

PloneItem

Example

>>> item = {'review_state': 'visible', 'workflow_history': {...}}
>>> rewrite_workflow_history(item)