parse_curie

parse_curie(curie, sep=':')

Parse a CURIE, normalizing the prefix and identifier if necessary.

Parameters
  • curie (str) – A compact URI (CURIE) in the form of <prefix:identifier>

  • sep (str) – The separator for the CURIE. Defaults to the colon “:”; however, the slash “/” is sometimes used in Identifiers.org, and the underscore “_” is used for OBO PURLs.

Return type

Union[Tuple[str, str], Tuple[None, None]]

Returns

A tuple of (prefix, identifier). If the CURIE cannot be parsed, returns (None, None).

The algorithm for parsing a CURIE is very simple: it splits the string on the leftmost occurrence of the separator (usually a colon “:” unless specified otherwise). The left part is the prefix, and the right part is the identifier.

>>> parse_curie('pdb:1234')
('pdb', '1234')
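The leftmost-split step described above can be sketched with str.partition. This is only an illustration, not the library's implementation: split_curie is a hypothetical helper, and it performs none of the prefix or identifier normalization that parse_curie applies.

```python
from typing import Optional, Tuple


def split_curie(curie: str, sep: str = ":") -> Tuple[Optional[str], Optional[str]]:
    """Split on the leftmost separator (hypothetical helper, no normalization)."""
    prefix, found, identifier = curie.partition(sep)
    if not found:
        # No separator present, so there is no prefix/identifier pair to return.
        return None, None
    return prefix, identifier


print(split_curie("pdb:1234"))           # ('pdb', '1234')
print(split_curie("go:GO:1234"))         # ('go', 'GO:1234'), redundant prefix still present
print(split_curie("GO_1234", sep="_"))   # ('GO', '1234'), prefix not lowercased
```

Splitting on the leftmost separator keeps any later separators inside the identifier, which is why a redundant prefix such as “GO:” survives the split and has to be removed in a separate step.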

Address banana problem:

>>> parse_curie('go:GO:1234')
('go', '1234')
>>> parse_curie('go:go:1234')
('go', '1234')
>>> parse_curie('go:1234')
('go', '1234')

Address banana problem with OBO banana:

>>> parse_curie('fbbt:FBbt:1234')
('fbbt', '1234')
>>> parse_curie('fbbt:fbbt:1234')
('fbbt', '1234')
>>> parse_curie('fbbt:1234')
('fbbt', '1234')

Address banana problem with explicit banana:

>>> parse_curie('go.ref:GO_REF:1234')
('go.ref', '1234')
>>> parse_curie('go.ref:1234')
('go.ref', '1234')

Parse OBO PURL curies:

>>> parse_curie('GO_1234', sep='_')
('go', '1234')

Banana with no peel:

>>> parse_curie('omim.ps:PS12345')
('omim.ps', '12345')
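The banana handling shown in the examples above could be approximated as follows. This is a rough sketch under stated assumptions: the BANANAS table and the strip_banana helper are hypothetical, the table is filled in only from the examples on this page, and the way parse_curie actually looks up bananas is not shown here.

```python
from typing import Dict

# Hypothetical banana table, inferred only from the examples above;
# it does not reproduce the real registry data.
BANANAS: Dict[str, str] = {
    "go": "GO:",
    "fbbt": "FBbt:",
    "go.ref": "GO_REF:",
    "omim.ps": "PS",  # banana with no peel (no trailing separator)
}


def strip_banana(prefix: str, identifier: str) -> str:
    """Remove a redundant leading banana from the identifier, ignoring case."""
    banana = BANANAS.get(prefix)
    if banana and identifier.casefold().startswith(banana.casefold()):
        return identifier[len(banana):]
    # Unknown prefix or no banana present: leave the identifier unchanged.
    return identifier


print(strip_banana("go", "GO:1234"))        # 1234
print(strip_banana("fbbt", "fbbt:1234"))    # 1234
print(strip_banana("omim.ps", "PS12345"))   # 12345
print(strip_banana("go", "1234"))           # 1234
```

The case-insensitive comparison mirrors the examples, where both “GO:” and “go:” are stripped. A prefix-only table like this one would also strip the banana from an identifier that legitimately starts with the same characters, so a real implementation needs more context than this sketch has.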