Utils
novae.utils.spatial_neighbors(adata, slide_key=None, radius=None, pixel_size=None, technology=None, coord_type=None, n_neighs=None, delaunay=None, n_rings=1, percentile=None, set_diag=False, reset_slide_ids=True)
Create a Delaunay graph from the spatial coordinates of the cells.
The graph is stored in adata.obsp['spatial_connectivities'] and adata.obsp['spatial_distances']. Long edges are removed from the graph according to the radius argument (if provided).
Info
This function was updated from squidpy.
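To make the description above concrete, here is a minimal sketch of building a Delaunay graph and pruning long edges. It is not novae's implementation: the helper name `delaunay_graph_sketch` is hypothetical, it relies only on NumPy and SciPy, and it treats `radius` as a single maximum edge length (the real argument also accepts a tuple).

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.spatial import Delaunay


def delaunay_graph_sketch(coords, radius=None):
    """Hypothetical helper: Delaunay graph with optional pruning of long edges."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]

    # Each triangle of the triangulation contributes three edges.
    simplices = Delaunay(coords).simplices
    edges = np.vstack([simplices[:, [0, 1]], simplices[:, [1, 2]], simplices[:, [0, 2]]])
    # Sort the endpoints so that (i, j) and (j, i) collapse to one undirected edge.
    edges = np.unique(np.sort(edges, axis=1), axis=0)

    dist = np.linalg.norm(coords[edges[:, 0]] - coords[edges[:, 1]], axis=1)
    if radius is not None:
        # Assumption of this sketch: radius is a single maximum edge length.
        keep = dist <= radius
        edges, dist = edges[keep], dist[keep]

    # Store every edge in both directions to get symmetric matrices.
    rows = np.concatenate([edges[:, 0], edges[:, 1]])
    cols = np.concatenate([edges[:, 1], edges[:, 0]])
    connectivities = csr_matrix((np.ones(rows.size), (rows, cols)), shape=(n, n))
    distances = csr_matrix((np.concatenate([dist, dist]), (rows, cols)), shape=(n, n))
    return connectivities, distances


# Four corners of a unit square plus its center.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
connectivities, distances = delaunay_graph_sketch(points, radius=0.8)
```

The two returned sparse matrices play the role of adata.obsp['spatial_connectivities'] and adata.obsp['spatial_distances']. With radius=0.8, the four sides of the square (length 1) are dropped and only the edges to the center remain.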
Parameters:
Name | Type | Description | Default |
---|---|---|---|
adata | AnnData \| list[AnnData] | An … | required |
slide_key | str \| None | Optional key in … | None |
radius | tuple[float, float] \| float \| None | | None |
technology | str \| SpatialTechnology \| None | Technology or machine used to generate the spatial data. One of … | None |
coord_type | str \| CoordType \| None | Either … | None |
n_neighs | int \| None | Number of neighbors to consider. If … | None |
delaunay | bool \| None | Whether to use Delaunay triangulation to build the graph. If … | None |
n_rings | int | See … | 1 |
percentile | float \| None | See … | None |
set_diag | bool | See … | False |
reset_slide_ids | bool | Whether to reset the novae slide ids. | True |
Source code in novae/utils/_build.py
novae.utils.quantile_scaling(adata, multiplier=5, quantile=0.2, per_slide=True)
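Preprocess fluorescence data from adata.X using quantiles of expression.
For each column X, we compute asinh(X / (5 * Q(0.2, X))), where Q(0.2, X) is the 0.2 quantile of X, and store the result back in adata.X. The values 5 and 0.2 are the defaults of the multiplier and quantile arguments.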
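The transformation described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's code: `quantile_scaling_sketch` is a hypothetical name, it works on a dense array rather than an AnnData object, it ignores the per_slide option, and the guard against a zero quantile is a choice made for this sketch.

```python
import numpy as np


def quantile_scaling_sketch(X, multiplier=5.0, quantile=0.2):
    """Hypothetical helper: asinh(X / (multiplier * Q(quantile, X))) per column."""
    X = np.asarray(X, dtype=float)

    # One divider per column (axis=0): the chosen quantile times the multiplier.
    divider = multiplier * np.quantile(X, quantile, axis=0)

    # Assumption of this sketch: avoid dividing by zero when the quantile is 0.
    divider = np.where(divider == 0, 1.0, divider)

    return np.arcsinh(X / divider)


# Two markers on very different intensity scales.
intensities = np.array(
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
)
scaled = quantile_scaling_sketch(intensities)
```

Because each column is divided by its own quantile, the two columns above end up with identical scaled values even though the second marker is ten times brighter.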
Parameters:
Name | Type | Description | Default |
---|---|---|---|
adata | AnnData \| list[AnnData] | An … | required |
multiplier | float | The multiplier for the quantile. | 5 |
quantile | float | The quantile to compute. | 0.2 |
per_slide | bool | Whether to compute the quantile per slide. If … | True |
Source code in novae/utils/_preprocess.py
novae.utils.prepare_adatas(adata, var_names=None)
Ensure the AnnData objects are ready to be used by the model.
Note
It performs the following operations:
- Preprocess the data if needed (e.g. normalize, log1p), in which case raw counts are saved in adata.layers['counts']
- Compute the mean and std of each gene
- Save which genes are highly variable, in case the number of genes is too high
- If using a pretrained model, save which genes are known by the model
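As a rough illustration of the first two operations in the list above, the sketch below normalizes a count matrix, applies log1p, keeps a copy of the raw counts, and computes per-gene mean and standard deviation. It is not the library's implementation: `preprocess_sketch`, the `target_sum` value, and the returned dictionary are assumptions made for this example.

```python
import numpy as np


def preprocess_sketch(counts, target_sum=1e4):
    """Hypothetical helper: normalize + log1p, then per-gene statistics."""
    counts = np.asarray(counts, dtype=float)

    # Normalize each cell (row) to the same total, then log-transform.
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # assumption: leave empty cells unchanged
    X = np.log1p(counts / totals * target_sum)

    return {
        "X": X,
        "counts": counts.copy(),  # raw counts kept aside, like a 'counts' layer
        "mean": X.mean(axis=0),   # one value per gene (column)
        "std": X.std(axis=0),
    }


raw = np.array([[1, 3], [2, 2]])
prepared = preprocess_sketch(raw)
```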
Parameters:
Name | Type | Description | Default |
---|---|---|---|
adata | AnnData \| list[AnnData] \| None | An … | required |
var_names | set \| list[str] \| None | Only used when loading a pretrained model. Do not use it yourself. | None |
Returns:
Type | Description |
---|---|
list[AnnData] | A list of … |
Source code in novae/utils/_validate.py
novae.utils.load_dataset(pattern=None, tissue=None, species=None, custom_filter=None, top_k=None, dry_run=False)
Automatically load slides from the Novae dataset repository.
Selecting slides
The function arguments let you filter the slides by tissue, species, and name pattern. Internally, the function reads the dataset metadata file to select the slides that match the provided filters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str \| None | Optional pattern to match the slide names. | None |
tissue | list[str] \| str \| None | Optional tissue (or tissue list) to filter the slides. E.g., … | None |
species | list[str] \| str \| None | Optional species (or species list) to filter the slides. E.g., … | None |
custom_filter | Callable[[DataFrame], Series] \| None | Custom filter function that takes the metadata DataFrame (the dataset metadata file mentioned above) and returns a boolean Series to decide which rows should be kept. | None |
top_k | int \| None | Optional number of slides to keep. If … | None |
dry_run | bool | If … | False |
Returns:
Type | Description |
---|---|
list[AnnData] | A list of … |
Source code in novae/utils/_data.py
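The custom_filter argument expects a function from a metadata DataFrame to a boolean Series. The sketch below shows what such a function could look like. The metadata table and its column names (`name`, `species`, `n_cells`) are hypothetical and only serve this example; the real metadata file may use different columns.

```python
import pandas as pd

# Hypothetical stand-in for the dataset metadata table (columns are assumptions).
metadata = pd.DataFrame(
    {
        "name": ["slide_a", "slide_b", "slide_c"],
        "species": ["human", "mouse", "human"],
        "n_cells": [150_000, 200_000, 50_000],
    }
)


def large_human_slides(df: pd.DataFrame) -> pd.Series:
    """Keep rows describing human slides with more than 100,000 cells."""
    return (df["species"] == "human") & (df["n_cells"] > 100_000)


# The boolean Series decides which metadata rows (slides) are kept.
mask = large_human_slides(metadata)
selected = metadata[mask]
```

A function with this shape is what would be passed as custom_filter=large_human_slides, assuming the columns it reads exist in the actual metadata.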
novae.utils.toy_dataset(n_panels=3, n_domains=4, n_slides_per_panel=1, xmax=500, n_vars=100, n_drop=20, step=20, panel_shift_lambda=5, slide_shift_lambda=1.5, domain_shift_lambda=2.0, slide_ids_unique=True, compute_spatial_neighbors=False, merge_last_domain_even_slide=False)
Creates a toy dataset, useful for debugging or testing.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_panels | int | Number of panels. Each panel will correspond to one output … | 3 |
n_domains | int | Number of domains. | 4 |
n_slides_per_panel | int | Number of slides per panel. | 1 |
xmax | int | Maximum value for the spatial coordinates (the larger, the more cells). | 500 |
n_vars | int | Maximum number of genes per panel. | 100 |
n_drop | int | Number of genes that are randomly removed for each … | 20 |
step | int | Step between cells in their spatial coordinates. | 20 |
panel_shift_lambda | float | Lambda used in the exponential law for each panel. | 5 |
slide_shift_lambda | float | Lambda used in the exponential law for each slide. | 1.5 |
domain_shift_lambda | float | Lambda used in the exponential law for each domain. | 2.0 |
slide_ids_unique | bool | Whether to ensure that slide ids are unique. | True |
compute_spatial_neighbors | bool | Whether to compute the spatial neighbors graph. We remove some of the edges of one node for testing purposes. | False |
Returns:
Type | Description |
---|---|
list[AnnData] | A list of … |
Source code in novae/utils/_data.py
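To see why a larger xmax or a smaller step yields more cells, the sketch below places cells on a regular square grid. The square layout and the helper name `grid_coordinates_sketch` are assumptions made for illustration; the function's actual layout may differ.

```python
import numpy as np


def grid_coordinates_sketch(xmax=500, step=20):
    """Hypothetical helper: cell coordinates on a square grid from 0 to xmax."""
    # Positions along one axis: 0, step, 2 * step, ..., up to xmax.
    axis = np.arange(0, xmax + 1, step)
    xx, yy = np.meshgrid(axis, axis)
    # One row per cell, with columns (x, y).
    return np.column_stack([xx.ravel(), yy.ravel()])


default_cells = grid_coordinates_sketch()        # 26 x 26 grid positions
larger_cells = grid_coordinates_sketch(xmax=1000)  # 51 x 51 grid positions
```

Under this assumption, doubling xmax roughly quadruples the number of cells, while doubling step divides it by about four.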