biolord.Biolord#
- class biolord.Biolord(adata, model_name=None, module_params=None, n_latent=128, train_classifiers=False, split_key=None, train_split='train', valid_split='test', test_split='ood')[source]#
The biolord model class.
- Parameters:
  - adata (AnnData) – Annotated data object.
  - module_params (Optional[dict[str, Any]]) – Hyperparameters for the model’s module initialization, e.g., BiolordModule or BiolordClassifyModule.
  - n_latent (int) – Number of latent dimensions used for the latent embedding.
  - train_classifiers (bool) – Whether to activate a BiolordClassifyModule.
  - split_key (Optional[str]) – Key in anndata.AnnData.obs used to split the data between train, test and validation.
  - train_split (str) – Value in anndata.AnnData.obs['{split_key}'] marking the train set.
  - valid_split (str) – Value in anndata.AnnData.obs['{split_key}'] marking the validation set.
  - test_split (str) – Value in anndata.AnnData.obs['{split_key}'] marking the test set.
Examples
    import scanpy as sc
    import biolord

    adata = sc.read(...)
    biolord.Biolord.setup_anndata(
        adata,
        ordered_attributes_keys=["time"],
        categorical_attributes_keys=["cell_type"],
    )
    model = biolord.Biolord(adata, n_latent=256, split_key="split")
    model.train(max_epochs=200, batch_size=256)
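The constructor expects the column named by split_key to already hold one label per cell. The following is a minimal sketch of how such a column could be built, assuming a purely random assignment; the helper name make_split_labels and the fractions are hypothetical, and in practice the 'ood' cells are usually chosen deliberately (for example, a held-out condition) rather than at random. The labels match the defaults in the signature above, where validation cells are marked 'test' and test cells are marked 'ood'.

```python
import random


def make_split_labels(n_cells, train_frac=0.8, valid_frac=0.1, seed=0):
    """Return one split label per cell (hypothetical helper, not part of biolord).

    Labels follow the constructor defaults: train_split='train',
    valid_split='test', test_split='ood'.
    """
    if train_frac < 0 or valid_frac < 0 or train_frac + valid_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    # Shuffle cell indices reproducibly, then carve out contiguous blocks.
    rng = random.Random(seed)
    order = list(range(n_cells))
    rng.shuffle(order)

    n_train = int(n_cells * train_frac)
    n_valid = int(n_cells * valid_frac)

    labels = ["ood"] * n_cells  # whatever is left over becomes the test set
    for i in order[:n_train]:
        labels[i] = "train"
    for i in order[n_train:n_train + n_valid]:
        labels[i] = "test"
    return labels


labels = make_split_labels(100)
```

With a real dataset, the result would be stored as adata.obs["split"] = labels before calling biolord.Biolord(adata, split_key="split").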
Attributes table#
| Attribute | Description |
| --- | --- |
| data_splitter | Data splitter. |
| model_name | Model's name. |
| module | Model's module. |
| training_plan | The model's training plan. |
Methods table#
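| Method | Description |
| --- | --- |
| compute_prediction_adata | Expression prediction over given inputs. |
| evaluate_retrieval | Returns the accuracy of the retrieval task over the pre-defined |
| get_categorical_attribute_embeddings | Compute embedding of a categorical attribute. |
| get_dataset | Processes |
| get_latent_representation_adata | Return the unknown attributes latent space and full latent variable. |
| get_ordered_attribute_embedding | Compute embedding of an ordered attribute. |
| load | Load a saved model. |
| predict | The model's gene expression prediction for a given |
| save | Save the model. |
| setup_anndata | Setup function. |
| train | Train the Biolord model. |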
Attributes#
data_splitter#
- Biolord.data_splitter#
Data splitter.
model_name#
- Biolord.model_name#
Model’s name.
module#
- Biolord.module#
Model’s module.
training_plan#
- Biolord.training_plan#
The model’s training plan.
Methods#
compute_prediction_adata#
- Biolord.compute_prediction_adata(adata, adata_source, target_attributes, add_attributes=None)[source]#
Expression prediction over given inputs.
- Parameters:
  - adata (AnnData) – Annotated data object containing possible values of the target_attributes.
  - adata_source (AnnData) – Annotated data object whose cells we wish to make predictions over, e.g., by changing their target_attributes.
  - target_attributes (list[str]) – Attributes to make predictions over.
  - add_attributes (Optional[list[str]]) – Additional attributes to add to anndata.AnnData.obs from the original adata to the prediction adata object.
- Return type:
- Returns:
Annotated data object containing predictions of the cells in all combinations of the target_attributes.
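To make "all combinations of the target_attributes" concrete, the sketch below enumerates the Cartesian product of the observed values of each attribute. It only illustrates the counting; it is not biolord's implementation, and the helper name attribute_combinations and the attribute values are hypothetical.

```python
from itertools import product


def attribute_combinations(values_by_attribute):
    """List every combination of attribute values as a dict (illustrative helper)."""
    names = list(values_by_attribute)
    value_lists = [values_by_attribute[name] for name in names]
    # product() yields one tuple per combination, in the order of `names`.
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical values for two target attributes.
combos = attribute_combinations({"cell_type": ["B", "T"], "time": [0, 1, 2]})
for combo in combos:
    print(combo)
```

Two cell types and three time points give six combinations, so the returned object would presumably hold predictions for each source cell under each of the six.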
evaluate_retrieval#
get_categorical_attribute_embeddings#
get_dataset#
get_latent_representation_adata#
- Biolord.get_latent_representation_adata(adata=None, indices=None, batch_size=512, nullify_attribute=None)[source]#
Return the unknown attributes latent space and full latent variable.
- Parameters:
- Return type:
- Returns:
Two AnnData objects providing the unknown attributes latent space and the concatenated decomposed latent, respectively.
get_ordered_attribute_embedding#
load#
- classmethod Biolord.load(dir_path, adata=None, accelerator='auto', device='auto', **kwargs)[source]#
Load a saved model.
- Parameters:
  - dir_path (str) – Directory where the model is saved.
  - adata (Optional[AnnData]) – AnnData organized in the same way as the data used to train the model.
  - accelerator (str) – Supports passing different accelerator types (“cpu”, “gpu”, “tpu”, “ipu”, “hpu”, “mps”, “auto”) as well as custom accelerator instances.
  - device (Union[int, list[int], str]) – The device to use. Can be set to a positive number (int or str), or "auto" for automatic selection based on the chosen accelerator.
  - kwargs (Any) – Keyword arguments for scvi().
- Return type:
- Returns:
The saved model.
predict#
save#
- Biolord.save(dir_path=None, overwrite=False, save_anndata=False, **anndata_save_kwargs)[source]#
Save the model.
- Parameters:
- Return type:
- Returns:
Nothing, just saves the model.
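The save and load signatures above suggest a simple round trip. The sketch below uses only the documented argument names (dir_path, overwrite, save_anndata, adata); StandInModel is a hypothetical stand-in that mimics those signatures so the example runs without biolord installed, and it is not the library's class. When save_anndata=False, the data is assumed to be supplied again through adata at load time, as is the convention in scvi-tools.

```python
def save_and_reload(model, dir_path, adata=None):
    """Save `model` to `dir_path`, then load a fresh copy from the same place."""
    model.save(dir_path, overwrite=True, save_anndata=False)
    # load is a classmethod, so it is called on the model's class.
    return type(model).load(dir_path, adata=adata)


class StandInModel:
    """Hypothetical stand-in that mimics only the save/load signatures."""

    _store = {}  # plays the role of the file system

    def __init__(self, n_latent):
        self.n_latent = n_latent

    def save(self, dir_path=None, overwrite=False, save_anndata=False):
        if dir_path in self._store and not overwrite:
            raise FileExistsError(dir_path)
        self._store[dir_path] = {"n_latent": self.n_latent}

    @classmethod
    def load(cls, dir_path, adata=None, accelerator="auto", device="auto"):
        return cls(**cls._store[dir_path])


reloaded = save_and_reload(StandInModel(n_latent=32), "biolord_model/")
```

With a trained model, the same helper would be called as save_and_reload(model, "biolord_model/", adata=adata).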
setup_anndata#
- classmethod Biolord.setup_anndata(adata, ordered_attributes_keys=None, categorical_attributes_keys=None, categorical_attributes_missing=None, retrieval_attribute_key=None, layer=None, **kwargs)[source]#
Setup function.
- Parameters:
  - adata (AnnData) – Annotated data object.
  - ordered_attributes_keys (Optional[list[str]]) – Valid anndata.AnnData.obs or anndata.AnnData.obsm keys for the ordered attributes.
  - categorical_attributes_keys (Optional[list[str]]) – Valid anndata.AnnData.obs keys for the categorical attributes.
  - categorical_attributes_missing (Optional[dict[str, str]]) – Categories representing missing labels. Only used if train_classifiers=True.
  - retrieval_attribute_key (Optional[str]) – Valid anndata.AnnData.obs key for an attribute to evaluate retrieval performance over.
  - layer (Optional[str]) – Expression layer in anndata.AnnData.layers to use. If None, use anndata.AnnData.X.
  - kwargs (Any) – Keyword arguments for register_fields().
- Return type:
- Returns:
Nothing, just sets up adata.
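Because ordered attributes may live in either obs or obsm while categorical attributes must be obs columns, it can help to check the keys before calling setup_anndata. The following is a small sketch of such a check; missing_attribute_keys is a hypothetical helper, not part of biolord, and it works on plain collections of names so that it runs without an AnnData object.

```python
def missing_attribute_keys(obs_columns, obsm_keys, ordered=None, categorical=None):
    """Return the requested keys that cannot be found (hypothetical helper)."""
    obs_columns = set(obs_columns)
    obsm_keys = set(obsm_keys)
    missing = []
    # Ordered attributes are accepted from either obs or obsm.
    for key in ordered or []:
        if key not in obs_columns and key not in obsm_keys:
            missing.append(key)
    # Categorical attributes are only looked up in obs.
    for key in categorical or []:
        if key not in obs_columns:
            missing.append(key)
    return missing


missing = missing_attribute_keys(
    obs_columns=["cell_type", "split"],
    obsm_keys=["time"],
    ordered=["time"],
    categorical=["cell_type", "batch"],
)
print(missing)
```

With a real object, the first two arguments would be adata.obs.columns and adata.obsm.keys().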
train#
- Biolord.train(max_epochs=None, accelerator='auto', device='auto', train_size=0.9, validation_size=None, plan_kwargs=None, batch_size=128, early_stopping=False, **trainer_kwargs)[source]#
Train the Biolord model.
- Parameters:
  - max_epochs (Optional[int]) – Maximum number of epochs for training.
  - accelerator (str) – Supports passing different accelerator types (“cpu”, “gpu”, “tpu”, “ipu”, “hpu”, “mps”, “auto”) as well as custom accelerator instances.
  - device (Union[int, list[int], str]) – The device to use. Can be set to a positive number (int or str), or "auto" for automatic selection based on the chosen accelerator.
  - train_size (float) – Fraction of the data used for training when the dataset is split randomly into train/validation, i.e., if split_key is not set in the model’s constructor.
  - validation_size (Optional[float]) – Fraction of the data used for validation when the dataset is split randomly into train/validation, i.e., if split_key is not set in the model’s constructor.
  - batch_size (int) – Size of mini-batches for training.
  - early_stopping (bool) – If True, early stopping will be used during training on the validation dataset.
  - plan_kwargs (Optional[dict[str, Any]]) – Keyword arguments for TrainingPlan.
  - trainer_kwargs (Any) – Keyword arguments for TrainRunner.
- Return type:
- Returns:
Nothing, just trains the Biolord model.
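When split_key is not set, train_size and validation_size control a random split. The sketch below shows how the two fractions could translate into cell counts, assuming the scvi-tools convention that validation_size=None means "everything not used for training"; biolord's own splitter may round differently, so random_split_sizes is an illustrative helper rather than the library's logic.

```python
from math import ceil, floor


def random_split_sizes(n_cells, train_size=0.9, validation_size=None):
    """Return (n_train, n_valid, n_unused) for a random split (illustrative)."""
    if not 0 < train_size <= 1:
        raise ValueError("train_size must be in (0, 1]")
    n_train = ceil(train_size * n_cells)

    if validation_size is None:
        # Assumed convention: validation takes all remaining cells.
        n_valid = n_cells - n_train
    else:
        if validation_size < 0 or train_size + validation_size > 1:
            raise ValueError("train_size + validation_size must not exceed 1")
        n_valid = floor(validation_size * n_cells)

    return n_train, n_valid, n_cells - n_train - n_valid


print(random_split_sizes(1000))            # defaults from the train() signature
print(random_split_sizes(1000, 0.8, 0.1))  # leaves some cells unused
```

Under these assumptions, the defaults train_size=0.9 and validation_size=None use every cell, while setting both fractions explicitly can leave a remainder that belongs to neither set.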