novae.utils
novae.utils.spatial_neighbors(adata, radius=100, slide_key=None, pixel_size=None, technology=None, percentile=None, set_diag=False)
Create a Delaunay graph from the spatial coordinates of the cells (in microns). The graph is stored in adata.obsp['spatial_connectivities'] and adata.obsp['spatial_distances']. Long edges are removed from the graph according to the radius argument (if provided).
Note
The spatial coordinates are expected to be in microns and stored in adata.obsm["spatial"]. If the coordinates are in pixels, set pixel_size to the size of a pixel in microns. If you don't know the pixel_size, or if you don't have adata.obsm["spatial"], you can provide the technology argument instead.
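The unit conversion described in this note can be sketched in plain Python. This is only an illustration of the arithmetic: the helper pixels_to_microns and the pixel size of 0.5 microns are hypothetical, and the library may apply the conversion differently internally.

```python
def pixels_to_microns(coords_px, pixel_size):
    """Scale (x, y) pixel coordinates by the size of one pixel in microns.

    Hypothetical helper for illustration; it is not part of novae.
    """
    return [(x * pixel_size, y * pixel_size) for x, y in coords_px]


# Three cell centroids in pixel units (made-up values).
coords_px = [(0, 0), (100, 50), (400, 200)]

# Assumption: one pixel covers 0.5 microns on this hypothetical image.
coords_um = pixels_to_microns(coords_px, pixel_size=0.5)
print(coords_um)
```

With coordinates in microns, a radius of 100 then means 100 microns rather than 100 pixels.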
Info
This function was updated from squidpy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
adata | AnnData | AnnData object | required |
radius | tuple[float, float] \| float \| None | | 100 |
slide_key | str \| None | Optional key in | None |
pixel_size | float \| None | Number of microns in one pixel of the image (use this argument if | None |
technology | str \| None | Technology or machine used to generate the spatial data. One of | None |
percentile | float \| None | Percentile of the distances to use as threshold. | None |
set_diag | bool | Whether to set the diagonal of the spatial connectivities to | False |
Source code in novae/utils/_build.py
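To make the edge-pruning step concrete, here is a minimal standard-library sketch. It rests on several assumptions: a single float radius is treated as a maximum edge length, a tuple as a (min, max) interval, and percentile as a nearest-rank cutoff on the edge lengths. The helper prune_edges and the hand-written edge list are hypothetical; the real function builds the edges with a Delaunay triangulation and may combine these arguments differently.

```python
import math


def prune_edges(coords, edges, radius=None, percentile=None):
    """Keep only the edges whose length passes the given thresholds.

    Hypothetical helper: the semantics of `radius` and `percentile`
    below are assumptions, not the library's confirmed behaviour.
    """
    lengths = {edge: math.dist(coords[edge[0]], coords[edge[1]]) for edge in edges}

    low, high = 0.0, math.inf
    if radius is not None:
        # Assumption: a float is an upper bound, a tuple is (min, max).
        low, high = radius if isinstance(radius, tuple) else (0.0, float(radius))

    if percentile is not None:
        # Nearest-rank percentile of the edge lengths, used as an upper bound.
        ordered = sorted(lengths.values())
        rank = max(1, math.ceil(percentile / 100 * len(ordered)))
        high = min(high, ordered[rank - 1])

    return [edge for edge in edges if low <= lengths[edge] <= high]


# Four cells (coordinates in microns); the last one is far from the others.
coords = [(0.0, 0.0), (30.0, 0.0), (0.0, 40.0), (300.0, 0.0)]
# Hand-written edges standing in for a Delaunay triangulation.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

print(prune_edges(coords, edges, radius=100))        # drops the two long edges
print(prune_edges(coords, edges, radius=(35, 100)))  # also drops the 30-micron edge
```

Pruning matters because a Delaunay triangulation connects every point to some neighbor, so isolated cells would otherwise be linked across large gaps in the tissue.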
novae.utils.prepare_adatas(adata, slide_key=None, var_names=None)
Ensure the AnnData objects are ready to be used by the model.
Note
It performs the following operations:
- Preprocess the data if needed (e.g. normalize, log1p), in which case raw counts are saved in adata.layers['counts']
- Compute spatial neighbors (if not already computed), using novae.utils.spatial_neighbors
- Compute the mean and std of each gene
- Save which genes are highly variable, in case the number of genes is too high
- If slide_key is provided, ensure that all slide_key values are valid and unique
- If using a pretrained model, save which genes are known by the model
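The preprocessing and per-gene statistics steps listed above can be sketched on a tiny cells-by-genes matrix. This is a standard-library illustration only: the helpers preprocess and gene_mean_std, the target_sum value, and the use of the population standard deviation are all assumptions rather than the library's actual choices.

```python
import math
import statistics


def preprocess(counts, target_sum=1e4):
    """Library-size normalize each cell, then apply log1p.

    Returns a new matrix, so the raw counts stay available
    (analogous to keeping them in a separate layer).
    """
    processed = []
    for cell in counts:
        total = sum(cell)
        scale = target_sum / total if total > 0 else 0.0
        processed.append([math.log1p(value * scale) for value in cell])
    return processed


def gene_mean_std(matrix):
    """Return one (mean, std) pair per gene, i.e. per column."""
    columns = zip(*matrix)
    return [(statistics.fmean(col), statistics.pstdev(col)) for col in columns]


# Two cells, two genes (made-up raw counts).
raw_counts = [[1, 3], [2, 2]]
normalized = preprocess(raw_counts, target_sum=4)

print(gene_mean_std(raw_counts))
print(gene_mean_std(normalized))
```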
Parameters:
Name | Type | Description | Default |
---|---|---|---|
adata | AnnData \| list[AnnData] \| None | An | required |
slide_key | str \| None | Optional key of | None |
var_names | set \| list[str] \| None | Only used when loading a pretrained model. Do not use it yourself. | None |
Returns:
Type | Description |
---|---|
list[AnnData] | A list of |
Source code in novae/utils/_validate.py
novae.utils.load_dataset(pattern=None, tissue=None, species=None)
Automatically load slides from the Novae dataset repository.
Selecting slides
The function arguments let you filter the slides by tissue, species, and name pattern. Internally, the function reads the dataset metadata file to select the slides that match the provided filters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str \| None | Optional pattern to match the slide names. | None |
tissue | list[str] \| str \| None | Optional tissue (or tissue list) to filter the slides. E.g., | None |
species | list[str] \| str \| None | Optional species (or species list) to filter the slides. E.g., | None |
Returns:
Type | Description |
---|---|
list[AnnData] | A list of |
Source code in novae/utils/_data.py
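The slide selection described for load_dataset can be pictured as a filter over metadata rows. The sketch below is a standalone illustration under stated assumptions: the metadata rows, their column names, the helper select_slides, and the use of shell-style wildcards for pattern are all hypothetical, and the real metadata file and matching rules may differ.

```python
from fnmatch import fnmatch


def select_slides(metadata, pattern=None, tissue=None, species=None):
    """Return the names of the slides matching every provided filter.

    Hypothetical helper: a filter left as None is ignored, and a single
    string is treated like a one-element list.
    """

    def as_set(value):
        if value is None:
            return None
        return {value} if isinstance(value, str) else set(value)

    tissues, species_set = as_set(tissue), as_set(species)
    return [
        row["name"]
        for row in metadata
        if (pattern is None or fnmatch(row["name"], pattern))
        and (tissues is None or row["tissue"] in tissues)
        and (species_set is None or row["species"] in species_set)
    ]


# Made-up metadata rows; the real file has its own columns and slide names.
metadata = [
    {"name": "slide_brain_1", "tissue": "brain", "species": "human"},
    {"name": "slide_brain_2", "tissue": "brain", "species": "mouse"},
    {"name": "slide_colon_1", "tissue": "colon", "species": "human"},
]

print(select_slides(metadata, tissue="brain"))
print(select_slides(metadata, pattern="*_1", species=["human"]))
```

Because the filters are combined with a logical AND, adding a filter can only narrow the selection.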
novae.utils.toy_dataset(n_panels=3, n_domains=4, n_slides_per_panel=1, xmax=500, n_vars=100, n_drop=20, step=20, panel_shift_lambda=5, slide_shift_lambda=1.5, domain_shift_lambda=2.0, slide_ids_unique=True, compute_spatial_neighbors=False, merge_last_domain_even_slide=False)
Creates a toy dataset, useful for debugging or testing.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_panels | int | Number of panels. Each panel will correspond to one output | 3 |
n_domains | int | Number of domains. | 4 |
n_slides_per_panel | int | Number of slides per panel. | 1 |
xmax | int | Maximum value for the spatial coordinates (the larger, the more cells). | 500 |
n_vars | int | Maximum number of genes per panel. | 100 |
n_drop | int | Number of genes that are randomly removed for each | 20 |
step | int | Step between cells in their spatial coordinates. | 20 |
panel_shift_lambda | float | Lambda used in the exponential law for each panel. | 5 |
slide_shift_lambda | float | Lambda used in the exponential law for each slide. | 1.5 |
domain_shift_lambda | float | Lambda used in the exponential law for each domain. | 2.0 |
slide_ids_unique | bool | Whether to ensure that slide ids are unique. | True |
compute_spatial_neighbors | bool | Whether to compute the spatial neighbors graph. We remove some of the edges of one node for testing purposes. | False |
Returns:
Type | Description |
---|---|
list[AnnData] | A list of |
Source code in novae/utils/_data.py
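To see how xmax and step relate to the number of cells, consider the rough sketch below. It assumes cells sit on a regular square grid starting at 0, which is an illustrative simplification: the helper grid_coordinates is hypothetical, and toy_dataset may lay out or filter cells differently.

```python
def grid_coordinates(xmax=500, step=20):
    """Place one cell every `step` units along both axes, below `xmax`.

    Hypothetical layout used only to show how the cell count scales.
    """
    ticks = range(0, xmax, step)
    return [(x, y) for x in ticks for y in ticks]


default_cells = grid_coordinates()
larger_cells = grid_coordinates(xmax=1000)
denser_cells = grid_coordinates(step=10)

# Doubling xmax or halving step multiplies the cell count by four on this grid.
print(len(default_cells), len(larger_cells), len(denser_cells))
```

Under this assumption, keeping xmax small is the simplest way to keep tests fast.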