biolord.BiolordModule

class biolord.BiolordModule(n_genes, n_samples, x_loc, ordered_attributes_map=None, categorical_attributes_map=None, n_latent=32, n_latent_attribute_categorical=4, n_latent_attribute_ordered=16, gene_likelihood='normal', reconstruction_penalty=100.0, unknown_attribute_penalty=10.0, use_batch_norm=True, use_layer_norm=False, unknown_attribute_noise_param=0.1, unknown_attributes=True, attribute_dropout_rate=None, decoder_width=512, decoder_depth=4, decoder_activation=True, attribute_nn_width=None, attribute_nn_depth=None, attribute_nn_activation=True, eval_r2_ordered=False, decoder_dropout_rate=0.1, seed=0)

The biolord module.

Parameters:
  • n_genes (int) – Number of input genes.

  • n_samples (int) – Number of samples.

  • x_loc (str) – The expression data location.

  • ordered_attributes_map (Optional[dict[str, int]]) – Dictionary of ordered classes and their dimensions.

  • categorical_attributes_map (Optional[dict[str, dict]]) – Dictionary for categorical classes, containing categorical values with keys as each category name and values as the categorical integer assignment.

  • n_latent (int) – Latent dimension.

  • n_latent_attribute_ordered (int) – Latent dimension of ordered attributes.

  • n_latent_attribute_categorical (int) – Latent dimension of categorical attributes.

  • gene_likelihood (Literal['normal', 'nb', 'poisson']) – The gene_likelihood model.

  • reconstruction_penalty (float) – Weight of the MSE reconstruction term in the loss.

  • use_batch_norm (bool) – Use batch norm in layers.

  • use_layer_norm (bool) – Use layer norm in layers.

  • unknown_attribute_noise_param (float) – Noise strength added to encoding of unknown attributes.

  • unknown_attributes (bool) – Whether to include learning for unknown attributes.

  • attribute_dropout_rate (Optional[dict[str, float]]) – Dropout rate for each attribute, keyed by attribute name.

  • attribute_nn_width (Optional[dict[str, int]]) – Ordered attributes autoencoder layers’ width.

  • attribute_nn_depth (Optional[dict[str, int]]) – Ordered attributes autoencoder number of layers.

  • attribute_nn_activation (bool) – Use activation in ordered attributes.

  • decoder_width (int) – Decoder layers’ width.

  • decoder_depth (int) – Decoder number of layers.

  • decoder_activation (bool) – Use activation in decoder.

  • eval_r2_ordered (bool) – Evaluate the \(R^2\) w.r.t. the ordered attribute. Set to True only if ordered attributes are binned.

  • decoder_dropout_rate (float) – Decoder dropout rate.

  • seed (int) – Random seed.
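
As a minimal sketch of the two attribute maps described above, the dictionaries could be assembled as follows. The attribute names (cell_type, age), the example labels, and the helper build_categorical_map are hypothetical and not part of biolord; the sorted integer assignment is only one possible choice.

```python
def build_categorical_map(labels_by_attribute):
    """Map every category name to an integer code, per attribute (hypothetical helper)."""
    return {
        attribute: {name: code for code, name in enumerate(sorted(set(labels)))}
        for attribute, labels in labels_by_attribute.items()
    }


# Hypothetical per-cell labels; in practice these would come from the dataset annotations.
labels_by_attribute = {"cell_type": ["T", "B", "T", "NK"]}

# Category name -> categorical integer assignment, one inner dictionary per attribute.
categorical_attributes_map = build_categorical_map(labels_by_attribute)

# Ordered attribute name -> dimension (assumed here to be 1 for a scalar such as age).
ordered_attributes_map = {"age": 1}
```

Dictionaries of this form match the descriptions of categorical_attributes_map and ordered_attributes_map given in the parameter list.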

Methods table

generative(latent[, library])

Runs the generative step.

get_expression(tensors, **inference_kwargs)

Computes gene expression means and standard deviations.

get_inference_input(tensors, **kwargs)

Convert tensors to valid inference input.

inference(genes, sample_indices, ...[, ...])

Apply module inference.

loss(tensors, inference_outputs, ...)

Computes the module's loss.

r2_metric(tensors, generative_outputs)

Evaluate the \(R^2\) metric over gene expression.

unknown_attribute_penalty_loss(...)

Computes the content penalty term in the loss.

Methods

generative

BiolordModule.generative(latent, library=None)

Runs the generative step.

Parameters:
  • latent (Tensor) – The concatenated decomposed latent space.

  • library (Optional[Tensor]) – Library sizes for each cell.

Return type:

dict[str, Any]

Returns:

Dictionary with the generative predictions of the expression distribution.

get_expression

BiolordModule.get_expression(tensors, **inference_kwargs)

Computes gene expression means and standard deviations.

Parameters:
  • tensors (dict[str, Tensor]) – Considered inputs.

  • inference_kwargs (Any) – Additional arguments.

Return type:

tuple[Tensor, Tensor]

Returns:

Prediction of gene expression mean and standard deviation.

get_inference_input

BiolordModule.get_inference_input(tensors, **kwargs)

Convert tensors to valid inference input.

Parameters:
  • tensors (dict[Any, Any]) – Considered inputs.

  • kwargs – Additional arguments.

Return type:

dict[str, Any]

Returns:

Dictionary with the module’s expected input tensors (genes, sample_indices, categorical_attribute_dict, and ordered_attribute_dict).

inference

BiolordModule.inference(genes, sample_indices, categorical_attribute_dict, ordered_attribute_dict, nullify_attribute=None)

Apply module inference.

Parameters:
  • genes (Tensor) – Input expression.

  • sample_indices (Tensor) – Indices in the AnnData object of the input samples.

  • categorical_attribute_dict (dict[Any, Any]) – Dictionary with categorical attributes as keys and the attribute sample labels as values.

  • ordered_attribute_dict (dict[Any, Any]) – Dictionary with ordered attributes as keys and the attribute sample values as values.

  • nullify_attribute (Optional[list]) – Attributes to exclude from inferred latent space.

Return type:

dict[str, Any]

Returns:

Dictionary with the inference step outputs, i.e. the inferred latent representations, including the latent_unknown_attributes entry consumed by loss.
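
The effect of nullify_attribute can be illustrated with a small sketch. It assumes that the latent space is a concatenation of per-attribute vectors and that an excluded attribute is replaced by zeros; the documentation above does not state the mechanism, so both points are assumptions. The function compose_latent and the example values are hypothetical.

```python
def compose_latent(attribute_latents, nullify_attribute=None):
    """Concatenate per-attribute latent vectors, zeroing the nullified ones (illustrative only)."""
    nullify = set(nullify_attribute or [])
    latent = []
    for name, vector in attribute_latents.items():
        if name in nullify:
            # Assumption: an excluded attribute contributes a zero vector of the same size.
            latent.extend([0.0] * len(vector))
        else:
            latent.extend(vector)
    return latent


# Hypothetical per-attribute embeddings for a single cell.
attribute_latents = {"cell_type": [0.5, -1.0], "age": [2.0]}
latent = compose_latent(attribute_latents, nullify_attribute=["age"])
```

Under these assumptions the dimensionality of the concatenated latent stays fixed, so the same decoder input size applies whether or not attributes are nullified.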

loss

BiolordModule.loss(tensors, inference_outputs, generative_outputs)

Computes the module’s loss.

Parameters:
  • tensors (dict[str, Tensor]) – Considered model inputs.

  • inference_outputs (dict[Literal['latent_unknown_attributes'], Tensor]) – Inference step outputs.

  • generative_outputs (dict[Literal['distribution', 'means', 'variances'], Tensor]) – Generative step outputs.

Return type:

dict[str, float]

Returns:

The loss elements.
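
How the loss elements might be combined into a single objective can be sketched as a weighted sum. This is an assumption for illustration: the page only says that loss returns the individual elements, and the exact combination is not documented here. The default weights are taken from the class signature (reconstruction_penalty=100.0, unknown_attribute_penalty=10.0); the function name combine_loss_terms and the example values are hypothetical.

```python
def combine_loss_terms(
    reconstruction_loss,
    unknown_attribute_loss,
    reconstruction_penalty=100.0,
    unknown_attribute_penalty=10.0,
):
    """Weighted sum of the two loss elements (assumed combination, not the documented one)."""
    return (
        reconstruction_penalty * reconstruction_loss
        + unknown_attribute_penalty * unknown_attribute_loss
    )


# Hypothetical values of the individual loss elements.
total = combine_loss_terms(reconstruction_loss=0.02, unknown_attribute_loss=0.5)
```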

r2_metric

BiolordModule.r2_metric(tensors, generative_outputs)

Evaluate the \(R^2\) metric over gene expression.

Parameters:
  • tensors – Considered model inputs.

  • generative_outputs – Generative step outputs.

Return type:

tuple[float, float]

Returns:

The \(R^2\) of the mean and standard deviation predictions of the gene expression.
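
For reference, the standard coefficient of determination, \(R^2 = 1 - SS_{res} / SS_{tot}\), can be sketched in plain Python. This is the textbook definition, not necessarily the exact computation or averaging used by r2_metric; the function r2_score and the example values are hypothetical.

```python
def r2_score(observed, predicted):
    """Coefficient of determination; undefined when all observed values are equal."""
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares between observations and predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares around the observed mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-gene observed and predicted mean expression.
observed_means = [1.0, 2.0, 3.0, 4.0]
predicted_means = [1.1, 1.9, 3.2, 3.8]
r2_mean = r2_score(observed_means, predicted_means)
```

The same function could be applied to observed and predicted standard deviations to obtain the second element of the returned tuple.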

unknown_attribute_penalty_loss

static BiolordModule.unknown_attribute_penalty_loss(latent_unknown_attributes)

Computes the content penalty term in the loss.

Return type:

float
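
One plausible form of such a penalty is the squared L2 norm of each sample's unknown-attribute latent vector, averaged over the batch. The sketch below assumes that form; the page does not give the formula, so treat it as an illustration rather than the module's definition. The function name and the example batch are hypothetical.

```python
def unknown_attribute_penalty_sketch(latent_unknown_attributes):
    """Mean over samples of the squared L2 norm of each latent vector (assumed form)."""
    per_sample = [sum(value * value for value in row) for row in latent_unknown_attributes]
    return sum(per_sample) / len(per_sample)


# Hypothetical batch of two samples with a two-dimensional unknown-attribute latent.
latent_unknown_attributes = [[1.0, 2.0], [0.0, 3.0]]
penalty = unknown_attribute_penalty_sketch(latent_unknown_attributes)
```

A penalty of this kind shrinks the unknown-attribute latent toward zero, which limits how much information it can carry relative to the known attributes.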