Frequently asked questions
How to use the GPU?
Using a GPU may significantly speed up Novae's training or inference.
If you have a valid GPU for PyTorch, you can set the accelerator
argument (e.g., one of ["cpu", "gpu", "tpu", "hpu", "mps", "auto"]
) in the following methods: model.fit(), model.fine_tune(), model.compute_representations().
When using a GPU, we also highly recommend setting multiple workers to speed up the dataset __getitem__
. For that, you'll need to set the num_workers
argument in the previous methods, according to the number of CPUs available (num_workers=8
is usually a good value).
For more details, refer to the API of the PyTorch Lightning Trainer and to the API of the PyTorch DataLoader.
How to load a pretrained model?
We highly recommend loading a pre-trained Novae model instead of re-training from scratch. For that, choose an available Novae model name on our HuggingFace collection, and provide this name to the model.save_pretrained() method:
from novae import Novae
model = Novae.from_pretrained("MICS-Lab/novae-human-0") # or any valid model name
How to avoid overcorrecting?
By default, Novae corrects the batch-effect to get shared spatial domains across slides.
The batch information is used only during training (fit
or fine_tune
), which should prevent Novae from overcorrecting in zero_shot
mode.
If not using the zero_shot
mode, you can provide the min_prototypes_ratio
parameter to control batch effect correction: either (i) in the fine_tune
method itself, or (ii) during the model initialization (if retraining a model from scratch).
For instance, if min_prototypes_ratio=0.5
, Novae expects each slide to contain at least 50% of the prototypes (each prototype can be interpreted as an "elementary spatial domain"). Therefore, the lower min_prototypes_ratio
, the lower the batch-effect correction. Conversely, if min_prototypes_ratio=1
, all prototypes are expected to be found in all slides (this doesn't mean the proportions will be the same overall slides, though).
How do I save my own model?
If you have trained or fine-tuned your own Novae model, you can save it for later use. For that, use the model.save_pretrained() method as below:
Then, you can load this model back via the model.from_pretrained() method:
How to turn lazy loading on or off?
By default, lazy loading is used only on large datasets. To enforce a specific behavior, you can do the following:
# never use lazy loading
novae.settings.disable_lazy_loading()
# always use lazy loading
novae.settings.enable_lazy_loading()
# use lazy loading only for AnnData objects with 1M+ cells
novae.settings.enable_lazy_loading(n_obs_threshold=1_000_000)
How to update the logging level?
The logging level can be updated as below:
import logging
from novae import log
log.setLevel(logging.ERROR) # or any other level, e.g. logging.DEBUG
How to disable auto-preprocessing
By default, Novae automatically run data preprocessing for you. If you don't want that, you can run the line below.
How to contribute?
If you want to contribute, check our contributing guide.
How to resolve any other issue?
If you have any bugs/questions/suggestions, don't hesitate to open a new issue.