Loading/adding datasets
scyan.data.list(dataset_name=None)
Show existing datasets and their different versions/table names.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_name | Optional[str] | Optional dataset name. If provided, only display the version names of the provided | None |
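A minimal usage sketch, based only on the signature shown above. The helper name `show_datasets` is illustrative and not part of scyan; the import is deferred so that the helper can be defined without scyan being installed.

```python
from typing import Optional


def show_datasets(dataset_name: Optional[str] = None) -> None:
    """Display the existing datasets, or only the dataset that is named."""
    import scyan  # deferred import: only needed when the helper is called

    # With no argument, every existing dataset is shown; with a name,
    # the display is restricted to that dataset.
    scyan.data.list(dataset_name)
```

Calling `show_datasets()` shows everything, while `show_datasets("aml")` restricts the display to the `"aml"` dataset.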
Source code in scyan/data/datasets.py
scyan.data.load(dataset_name, version='default', table='default', reducer=None)
Load a dataset, i.e. its AnnData object and its knowledge table. The available public datasets are "poised", "aml", "bmmc", and "debarcoding". If a dataset has not been loaded before, it is downloaded automatically (this requires an internet connection). Existing dataset names and versions/tables can be listed using scyan.data.list.
Note
If you want to load your own dataset, you first have to create it.
Note
By default, the data is saved inside <home_path>/.scyan_data. Optionally, if the scyan repository was cloned, you can create <scyan_repository_path>/data and use it instead of the default data folder.
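The folder choice described in this note can be pictured with the sketch below. It is an illustration of the rule as stated, not scyan's actual implementation: the function `resolve_data_dir` and its parameters `home` and `repo_root` are hypothetical names.

```python
from pathlib import Path
from typing import Optional


def resolve_data_dir(home: Path, repo_root: Optional[Path] = None) -> Path:
    """Return the folder that would hold the datasets (illustrative logic only)."""
    if repo_root is not None:
        repo_data = repo_root / "data"
        # The repository folder is only used if it was created beforehand.
        if repo_data.is_dir():
            return repo_data
    # Otherwise fall back to the default folder inside the home directory.
    return home / ".scyan_data"
```

For example, `resolve_data_dir(Path.home())` returns the default `.scyan_data` folder under the home directory.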
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_name | str | Name of the dataset. Either one of your datasets, or one public dataset among | required |
version | Optional[str] | Name of the | 'default' |
table | Optional[str] | Name of the knowledge table that should be loaded. If | 'default' |
reducer | Optional[str] | Name of the UMAP reducer that should be loaded. If | None |
Returns:
Type | Description |
---|---|
Tuple[AnnData, DataFrame] | Tuple containing the requested data, i.e. by default a tuple |
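A minimal sketch of loading one of the public datasets, assuming only the signature and return type listed above. The names `PUBLIC_DATASETS` and `load_public` are illustrative, and the name check is a convenience of this sketch rather than something scyan is documented to do.

```python
PUBLIC_DATASETS = ("poised", "aml", "bmmc", "debarcoding")


def load_public(dataset_name: str, version: str = "default", table: str = "default"):
    """Load one of the public datasets named on this page."""
    if dataset_name not in PUBLIC_DATASETS:
        raise ValueError(f"{dataset_name!r} is not one of {PUBLIC_DATASETS}")

    import scyan  # deferred import; the first call may download the dataset

    # With the default arguments, the result is unpacked into the AnnData
    # object and the knowledge table (a pandas DataFrame).
    adata, knowledge_table = scyan.data.load(dataset_name, version=version, table=table)
    return adata, knowledge_table
```

A call such as `adata, table = load_public("aml")` would then give both objects in one step.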
scyan.data.add(dataset_name, *objects, filename='default', overwrite=False)
Add one or more objects to a dataset, creating the dataset if it does not exist yet. Objects can be AnnData objects, a knowledge table (i.e. a pd.DataFrame), or a UMAP reducer. The provided filenames are the ones you use when loading data with scyan.data.load.
Note
You will be able to load this dataset with scyan.data.load as long as you have added at least a knowledge table and an adata object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_name | str | Name of the dataset in which the object will be saved. | required |
*objects | List[Union[AnnData, DataFrame, UMAP]] | Object(s) to save. | () |
filename | Union[str, List[str]] | Name(s) without extension of the file(s) to create. The default value ( | 'default' |
overwrite | bool | If | False |
Raises:
Type | Description |
---|---|
ValueError | If the object type is not one of the three required. |
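The sketch below shows one way to create your own dataset and load it back, relying only on the two signatures documented on this page. The helper name `save_and_reload` is hypothetical, and the assumption that objects saved under the default filename are found by a default `scyan.data.load` call is inferred from the matching `'default'` values, not confirmed by the text.

```python
def save_and_reload(dataset_name: str, adata, knowledge_table, overwrite: bool = False):
    """Store an AnnData object and a knowledge table, then load them back."""
    import scyan  # deferred import: only needed when the helper is called

    # Both objects are passed positionally and saved under the default
    # filename; at least an AnnData object and a knowledge table are needed
    # before the dataset can be loaded.
    scyan.data.add(dataset_name, adata, knowledge_table, overwrite=overwrite)

    # Assumption of this sketch: the default load picks up the files saved above.
    return scyan.data.load(dataset_name)
```

Here `adata` is expected to be an AnnData object and `knowledge_table` a pandas DataFrame; any other object type is reported by `scyan.data.add` as a `ValueError`.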
scyan.data.remove(dataset_name, version=None, table=None, reducer=None)
Remove file(s) from a dataset folder.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_name | str | Name of the dataset. Use | required |
version | Optional[str] | Name of the | None |
table | Optional[str] | Name of the | None |
reducer | Optional[str] | Name of the | None |
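A hedged usage sketch for removing files, based only on the signature above. The helper `remove_files` is a hypothetical wrapper; its refusal to run when no file name is given is a precaution of this sketch, since the behavior of `scyan.data.remove` with every optional argument left at `None` is not described here.

```python
from typing import Optional


def remove_files(
    dataset_name: str,
    version: Optional[str] = None,
    table: Optional[str] = None,
    reducer: Optional[str] = None,
) -> None:
    """Remove the named file(s) from a dataset folder."""
    if version is None and table is None and reducer is None:
        # Precaution of this sketch, not documented scyan behavior.
        raise ValueError("Provide at least one of version, table or reducer.")

    import scyan  # deferred import: only needed when the helper is called

    scyan.data.remove(dataset_name, version=version, table=table, reducer=reducer)
```

For instance, `remove_files("my_dataset", table="old_table")` would forward a single table name to `scyan.data.remove`; both names in that call are placeholders.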