Aligner

class Aligner(force_download: bool | None = None)

Bases: object

A class for aligning new registries.

Instantiate the aligner.

Attributes Summary

alt_key_match

Set this if there's another part of the data besides the ID that should be matched

alt_keys_match

getter_kwargs

Keyword arguments to pass to the getter function on call

include_new

Whether new entries should be included automatically. Only set this to true for aligners of very high confidence (e.g., OBO Foundry but not BioPortal)

internal_registry

Get the internal registry.

normalize_invmap

skip_deprecated

Set to true if you don't want to align to deprecated resources

subkey

Methods Summary

align([dry, show, force_download])

Align and output the curation sheet.

cli()

Construct a CLI for the aligner.

get_curation_row(external_id, external_entry)

Get the sequence of items that makes up each row in the curation table.

get_curation_table(**kwargs)

Get the curation table as a string, built by tabulate.

get_skip()

Get a mapping from the prefixes that should be skipped to the reasons (strings) for skipping them.

prepare_external(external_id, external_entry)

Prepare a dictionary to be added to the bioregistry for each external registry entry.

print_curation_table(**kwargs)

Print the curation table.

write_curation_table()

Write the curation table to a TSV.

write_registry()

Write the internal registry.

Attributes Documentation

alt_key_match: ClassVar[str | None] = None

Set this if there’s another part of the data besides the ID that should be matched

alt_keys_match: ClassVar[str | None] = None
getter_kwargs: ClassVar[Mapping[str, Any] | None] = None

Keyword arguments to pass to the getter function on call

include_new: ClassVar[bool] = False

Whether new entries should be included automatically. Only set this to true for aligners of very high confidence (e.g., OBO Foundry but not BioPortal)

internal_registry

Get the internal registry.

normalize_invmap: ClassVar[bool] = False
skip_deprecated: ClassVar[bool] = False

Set to true if you don’t want to align to deprecated resources

subkey: ClassVar[str] = 'prefix'
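
Because these attributes are class variables, an aligner is configured by overriding them in a subclass rather than by passing constructor arguments. The sketch below illustrates that pattern with a hypothetical stand-in base class (`ToyAlignerBase`) that only mirrors the defaults listed above. It is not the real `Aligner`, and the `getter_kwargs` key shown is invented for illustration.

```python
from typing import Any, ClassVar, Mapping, Optional


class ToyAlignerBase:
    """Hypothetical stand-in that mirrors the documented defaults only."""

    alt_key_match: ClassVar[Optional[str]] = None
    getter_kwargs: ClassVar[Optional[Mapping[str, Any]]] = None
    include_new: ClassVar[bool] = False
    skip_deprecated: ClassVar[bool] = False
    subkey: ClassVar[str] = "prefix"


class HighConfidenceToyAligner(ToyAlignerBase):
    """Example subclass for a source trusted enough to add new entries."""

    include_new = True
    skip_deprecated = True
    # Assumed keyword name, purely illustrative
    getter_kwargs = {"force_download": False}


# The override affects only the subclass; the base keeps its defaults.
print(ToyAlignerBase.include_new, HighConfidenceToyAligner.include_new)
```

Attributes that the subclass does not override, such as `subkey`, keep the inherited default.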

Methods Documentation

classmethod align(dry: bool = False, show: bool = False, force_download: bool | None = None) → None

Align and output the curation sheet.

Parameters:
  • dry – If true, don’t write changes to the registry

  • show – If true, print a curation table

  • force_download – Force re-download of the data
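
As a rough illustration of how the `dry` and `show` flags might interact, the sketch below uses a hypothetical stand-in class that records method calls instead of writing files. The control flow is an assumption inferred from the parameter descriptions above, not the library's actual implementation. Unlike the documented method, this toy `align` returns the instance so the recorded calls can be inspected.

```python
from typing import List, Optional


class ToyAligner:
    """Hypothetical stand-in; it records calls instead of touching files."""

    def __init__(self, force_download: Optional[bool] = None) -> None:
        self.force_download = force_download
        self.calls: List[str] = []

    def write_registry(self) -> None:
        self.calls.append("write_registry")

    def print_curation_table(self) -> None:
        self.calls.append("print_curation_table")

    @classmethod
    def align(cls, dry: bool = False, show: bool = False,
              force_download: Optional[bool] = None) -> "ToyAligner":
        instance = cls(force_download=force_download)
        if not dry:  # dry=True leaves the registry untouched
            instance.write_registry()
        if show:  # show=True additionally prints the curation table
            instance.print_curation_table()
        return instance  # returned only so the example can inspect the calls


# Preview the alignment without writing anything to the registry
preview = ToyAligner.align(dry=True, show=True)
print(preview.calls)
```

Under these assumptions, a dry run with `show=True` is a way to review the curation table before committing any changes.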

classmethod cli()

Construct a CLI for the aligner.

get_curation_row(external_id, external_entry) → Sequence[str]

Get the sequence of items that makes up each row in the curation table.

Parameters:
  • external_id – The external registry identifier

  • external_entry – The external registry data

Returns:

A sequence of cells to add to the curation table.

Raises:

TypeError – If an invalid value is encountered

The default implementation of this function iterates over the keys in the class variable curation_header and looks up each one, in order, in the external record.

Note

You don’t need to include the external ID in the row; it is automatically added as the first element.
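
The default behavior described above can be sketched as follows. This is a minimal approximation on a hypothetical stand-in class, not the library's code. The header keys, the handling of missing values, and the way collections are joined are all assumptions. The returned row omits the external ID, in line with the note above.

```python
from typing import Any, List, Mapping, Sequence


class ToyAligner:
    """Hypothetical stand-in for an Aligner subclass; not the real class."""

    # Assumed header: the keys to pull from each external record, in order
    curation_header: Sequence[str] = ("name", "homepage", "description")

    def get_curation_row(
        self, external_id: str, external_entry: Mapping[str, Any]
    ) -> Sequence[str]:
        row: List[str] = []
        for key in self.curation_header:
            value = external_entry.get(key)
            if value is None:
                row.append("")  # assumption: missing values become empty cells
            elif isinstance(value, str):
                row.append(value.strip())
            elif isinstance(value, (list, tuple, set)):
                # assumption: collections are flattened into one sorted cell
                row.append("|".join(sorted(str(v) for v in value)))
            else:
                raise TypeError(
                    f"unsupported value for {external_id}[{key}]: {value!r}"
                )
        return row


row = ToyAligner().get_curation_row(
    "chebi", {"name": "ChEBI ", "homepage": "https://www.ebi.ac.uk/chebi"}
)
print(row)
```

A subclass that needs different columns would typically change `curation_header`, and override the method only when a record needs custom formatting.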

get_curation_table(**kwargs) → str | None

Get the curation table as a string, built by tabulate.

get_skip() → Mapping[str, str]

Get a mapping from the prefixes that should be skipped to the reasons (strings) for skipping them.
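
To illustrate the shape of this mapping, the sketch below defines a standalone `get_skip` that returns hypothetical prefixes and reasons, plus an assumed helper (`partition`) showing one way such a mapping could be used to filter external records. Neither the prefixes nor the helper come from the library.

```python
from typing import Any, Dict, Mapping, Tuple


def get_skip() -> Mapping[str, str]:
    """Return hypothetical external prefixes mapped to the reason for skipping."""
    return {
        "example.test": "placeholder entry, not a real resource",
        "old.db": "superseded by another record",
    }


def partition(
    external: Mapping[str, Dict[str, Any]], skip: Mapping[str, str]
) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, str]]:
    """Split external records into kept entries and skipped prefixes with reasons."""
    kept: Dict[str, Dict[str, Any]] = {}
    skipped: Dict[str, str] = {}
    for prefix, entry in external.items():
        if prefix in skip:
            skipped[prefix] = skip[prefix]
        else:
            kept[prefix] = entry
    return kept, skipped


external_records = {
    "chebi": {"name": "ChEBI"},
    "old.db": {"name": "Old Database"},
}
kept, skipped = partition(external_records, get_skip())
print(sorted(kept), skipped)
```

Storing a reason next to each skipped prefix keeps the curation decision documented alongside the code.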

prepare_external(external_id: str, external_entry: Dict[str, Any]) → Dict[str, Any]

Prepare a dictionary to be added to the bioregistry for each external registry entry.

The default implementation returns external_entry unchanged. If you need more than that, override this method.

Parameters:
  • external_id – The external registry identifier

  • external_entry – The external registry data

Returns:

The dictionary to be added to the bioregistry for the aligned entry
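
An override might look like the sketch below, which keeps only a few fields from each external record. Both classes are hypothetical stand-ins: the base class reproduces only the documented default of returning `external_entry` unchanged, and the `keep` tuple is an invented helper attribute.

```python
from typing import Any, Dict, Tuple


class ToyAlignerBase:
    """Hypothetical stand-in reproducing only the documented default."""

    def prepare_external(
        self, external_id: str, external_entry: Dict[str, Any]
    ) -> Dict[str, Any]:
        return external_entry


class TrimmedToyAligner(ToyAlignerBase):
    """Example override that stores a reduced copy of each record."""

    # Invented attribute: the fields worth keeping in the aligned entry
    keep: Tuple[str, ...] = ("name", "homepage")

    def prepare_external(
        self, external_id: str, external_entry: Dict[str, Any]
    ) -> Dict[str, Any]:
        # Build a new dictionary so the external record is left unmodified
        return {k: external_entry[k] for k in self.keep if k in external_entry}


record = {"name": "ChEBI", "homepage": "https://www.ebi.ac.uk/chebi", "internal_flag": 1}
print(ToyAlignerBase().prepare_external("chebi", record) is record)
print(TrimmedToyAligner().prepare_external("chebi", record))
```

Returning a new dictionary rather than mutating `external_entry` avoids side effects on the downloaded data.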

print_curation_table(**kwargs) → None

Print the curation table.

write_curation_table() → None

Write the curation table to a TSV.

write_registry() → None

Write the internal registry.