Spatial Datasets
Spatial transcriptomic datasets can be visualized and analyzed in rakaia. The preprocessing steps for spatial datasets are slightly different from antibody-based imaging datasets, and will also vary slightly across different technologies.
Required format
Users will likely need a spatial analysis library, such as spatialdata or scanpy, to prepare their data for import. Import of raw instrument outputs, such as 10x output bundles, is not directly supported.
IMPORTANT: If importing with a zarr directory path, users should not add a forward or backward trailing slash (/ or \) to the path, or they may receive an import error like the following:

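One way to guard against this is to strip trailing separators from the path before importing. The following is a minimal sketch; the helper name `strip_trailing_separators` is hypothetical and not part of rakaia.

```python
def strip_trailing_separators(path: str) -> str:
    """Remove any trailing forward or backward slashes from a path string."""
    stripped = path.rstrip("/\\")
    # Keep a bare root such as "/" intact instead of returning an empty string.
    return stripped if stripped else path


# Placeholder path, for illustration only.
print(strip_trailing_separators("/data/visium_sample.zarr/"))  # prints /data/visium_sample.zarr
```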
All spatial datasets that contain marker expression need to be imported into rakaia as one of two formats:
- as a spatialdata zarr directory, created using the appropriate technology-specific reader from spatialdata. This currently supports 10x Visium, Visium HD, and Xenium. More information on spatialdata file reading can be found here
- as an Anndata object with the file extension .h5ad
Specifically, the Anndata object must have a spatial array in the obsm slot that contains both the x and y coordinates for each spatial measurement, whether it corresponds to a cell, spot, etc. scanpy and squidpy are Python libraries that provide readers for converting raw spatial data into this format. These libraries will be referenced in the article below.
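As a quick pre-import check, the sketch below verifies that an object carries a `spatial` entry in `obsm` with at least two columns (x and y). The helper name `spatial_obsm_problems` is hypothetical, and the check only inspects the layout described above; it is not rakaia's own validation.

```python
def spatial_obsm_problems(adata) -> list:
    """Return a list of layout problems; an empty list means the check passed.

    Works on any object exposing an `obsm` mapping whose entries have a
    `.shape` attribute, which is the case for an AnnData object.
    """
    if "spatial" not in adata.obsm:
        return ["obsm has no 'spatial' entry"]
    shape = tuple(adata.obsm["spatial"].shape)
    # Expect one row per spatial measurement and at least x and y columns.
    if len(shape) != 2 or shape[1] < 2:
        return [f"obsm['spatial'] has shape {shape}, expected (n_obs, 2)"]
    return []
```

With a real file, this could be called as `spatial_obsm_problems(anndata.read_h5ad(path))`, where `path` points at the exported .h5ad file.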
rakaia currently expects every spatial dataset sample or slide (e.g. Visium slide/Xenium sample run) to be loaded as its own .h5ad file. If users have multiple samples or slides, each with a unique set of spots or in-situ transcripts, the dataset should be divided into multiple Anndata objects and exported as individual files.
rakaia also enables multi-slide datasets with 10x Visium through spatialdata + zarr (more info below).
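For the .h5ad route, the per-sample split described above can be sketched as follows, assuming the combined Anndata object stores a per-observation sample label in an `obs` column. The column name `sample_id` and both helper names are hypothetical; adjust them to the dataset at hand.

```python
from collections import defaultdict
from pathlib import Path


def rows_by_sample(labels) -> dict:
    """Group row positions by sample label, preserving row order."""
    groups = defaultdict(list)
    for row, label in enumerate(labels):
        groups[str(label)].append(row)
    return dict(groups)


def write_one_file_per_sample(adata, out_dir, sample_key="sample_id") -> list:
    """Write each sample's observations to its own .h5ad file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for sample, rows in rows_by_sample(adata.obs[sample_key]).items():
        out_path = out_dir / f"{sample}.h5ad"
        # Positional indexing returns a view; copy it before writing.
        adata[rows].copy().write_h5ad(out_path)
        written.append(out_path)
    return written
```

Subsetting the object this way also subsets the `spatial` array in `obsm`, so each output file keeps the coordinates for its own observations.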
Importing multiple zarr stores per session
In rakaia 0.26.0 and later, users can import multiple zarr stores into one session, provided that the biomarker panel is the same across all samples. For example, with the following data directory structure:

The user can import all 4 spatial ROIs into the session by simply importing from the parent filepath:

This will parse all 4 zarr directories into the session, allowing for efficient multi-sample or multi-slide spatial analysis.
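Because this import requires an identical biomarker panel across stores, it can help to compare panels beforehand. The sketch below is a hypothetical helper, not part of rakaia. It takes marker names that the user has already collected per sample (for example, from the `var_names` of each table) and reports the samples whose panel differs from the first one. Whether rakaia also requires the same marker order is not stated here, so the comparison ignores order.

```python
def mismatched_panels(panels: dict) -> list:
    """Return the sample names whose marker set differs from the first sample's."""
    names = list(panels)
    if not names:
        return []
    reference = set(panels[names[0]])
    return [name for name in names[1:] if set(panels[name]) != reference]


# Toy sample names and markers, for illustration only.
panels = {
    "roi_1": ["CD3E", "CD8A", "MS4A1"],
    "roi_2": ["MS4A1", "CD3E", "CD8A"],
    "roi_3": ["CD3E", "CD8A"],
}
print(mismatched_panels(panels))  # prints ['roi_3']
```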
Spot-based assays: 10X Visium V1, V2
Spot-based spatial technologies, such as 10x Visium Spatial Gene Expression, profile transcript counts summarized at the spot level. The Visium technology is capable of profiling tens of thousands of markers per spot, providing comprehensive spatial context of the transcriptome.
Raw Visium data (from the Space Ranger standard directory output as described here) should be read using either the read_visium function from scanpy or the read.visium function from squidpy. Below is a minimal example showing how the data can be preprocessed and exported into a compatible file format:
import squidpy as sq

# specify the input directory with the outs subdir
input_dir = "/path_to_visium_raw/outs/"
# squidpy exposes its readers under the read submodule
adata = sq.read.visium(input_dir)
# gene names can repeat, so make them unique before export
adata.var_names_make_unique()
# specify the output file as an anndata object
out_anndata = "/output_dir/visium.h5ad"
adata.write_h5ad(out_anndata)
The output Visium h5ad file should typically be no larger than 200-300 MB. If the size significantly exceeds this range (1-2 GB or larger), the user has likely cached a full-sized whole-slide image (WSI), such as an H&E image, in the file. Caching these large images will result in significant performance slowdowns in rakaia. To avoid this, users should ensure that the uns slot in the object is cleared of any fields except the required scalefactors slot:
# keep only the scale factors for the first library in uns['spatial']
library_id = list(adata.uns['spatial'].keys())[0]
scalefactors = adata.uns['spatial'][library_id]['scalefactors']
adata.uns = {'spatial': {library_id: {'scalefactors': scalefactors}}}
This retains the scale factors required for Visium to render in rakaia, while removing any additional slot caches that rakaia does not use.
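To check an exported file against the size guidance above before importing it, a small helper such as the following can be used. This is a sketch: the function names and the default 1000 MB threshold (the lower end of the "1-2 GB or larger" range) are assumptions, not rakaia settings.

```python
import os


def h5ad_size_mb(path) -> float:
    """Return the file size in megabytes (1 MB = 10**6 bytes)."""
    return os.path.getsize(path) / 1e6


def looks_oversized(path, limit_mb: float = 1000.0) -> bool:
    """True when the file is at or above the size where a cached image is likely."""
    return h5ad_size_mb(path) >= limit_mb
```

A file flagged by `looks_oversized` is a candidate for the uns cleanup shown above, followed by a fresh export.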
When the Anndata object is read into rakaia, the spot size and region dimension will automatically be computed from the spatial coordinates and scale factors:

10x Visium with spatialdata + zarr
Visium assays read through a spatialdata zarr directory can contain multiple samples/slides. Every unique sample or slide should have its own shape frame in the shapes slot. The spatialdata_io.visium reader should be used to generate the zarr store. More information can be found here.
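A minimal sketch of generating the store, assuming `spatialdata_io` is installed. The wrapper `visium_to_zarr` and its `reader` parameter are hypothetical conveniences (the parameter exists only so the sketch can be exercised without real data); `spatialdata_io.visium` and the `write` method of the object it returns do the actual work. The directory layout the reader expects is described in its own documentation.

```python
def visium_to_zarr(raw_dir: str, zarr_path: str, reader=None) -> str:
    """Read a Space Ranger output directory and write it as a zarr store."""
    if reader is None:
        # Imported lazily so this module loads even without spatialdata_io.
        from spatialdata_io import visium as reader
    sdata = reader(raw_dir)
    # Drop trailing slashes so the stored path can be passed to rakaia as-is.
    zarr_path = zarr_path.rstrip("/\\")
    sdata.write(zarr_path)
    return zarr_path
```

With placeholder paths, a call could look like `visium_to_zarr("/path_to_visium_raw/outs/", "/output_dir/visium.zarr")`.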
Non-spot based assays
Non-spot-based assays behave differently in rakaia because the user has more flexibility over the visualization parameters. Specifically, the user may specify a custom visualization size for the data in the image. This isn't supported with spot-based assays because the spot scaling factors are computed automatically from the input data.
Binned expression assays: 10x Visium HD
The HD version of 10x Visium differs slightly from the spot-based technology used in V1 and V2. Instead of using circular spots with gaps between them, HD offers tiled, barcoded squares without gaps, as described here. This results in data that can be binned and summarized at three different micron resolutions: 2, 8, and 16.
Currently, the only supported reader in Python for HD datasets is the spatialdata HD reader. The reader generates aggregate expression profiles for each micron resolution, and each of these profiles can be exported as an Anndata file for visualization in rakaia. This example notebook here shows how to export these binned profiles. Each bin can then be imported into rakaia as a separate ROI with the same set of marker genes, but with different dimensions and resolution.
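The export step can be sketched as follows, assuming `sdata` is the object returned by the spatialdata HD reader and that its `tables` mapping holds one Anndata table per bin size. The helper name and the output file naming are hypothetical; the linked notebook remains the reference for the full workflow.

```python
from pathlib import Path


def export_bin_tables(sdata, out_dir) -> list:
    """Write every table in `sdata.tables` to `<out_dir>/<table_key>.h5ad`."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for key, table in sdata.tables.items():
        out_path = out_dir / f"{key}.h5ad"
        table.write_h5ad(out_path)
        written.append(out_path)
    return written
```

Before importing the exported files, it is worth confirming that each table carries the `spatial` array in `obsm`, as required for any .h5ad input.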

Visium HD with spatialdata + zarr
Visium HD assays read through spatialdata + zarr will be split into multiple ROIs based on the bin size. The zarr store should have a matched shape + table slot for every bin size with the following table keys: 'square_002um', 'square_008um', 'square_016um'. The spatialdata_io.visium_hd reader should be used to generate the zarr store. More information can be found here.
rakaia will output only the 8 and 16 um bins, as the 2 um bin size is both too sparse and too large to render effectively.
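The key convention above can be checked with plain Python before import. In this sketch the constant and function names are hypothetical; the key strings and the 8/16 um selection come from the text, and how the keys are read from a given store is left to the user.

```python
EXPECTED_TABLE_KEYS = ("square_002um", "square_008um", "square_016um")
RENDERED_TABLE_KEYS = ("square_008um", "square_016um")


def missing_table_keys(table_keys) -> list:
    """Expected bin tables that are absent from the store."""
    present = set(table_keys)
    return [key for key in EXPECTED_TABLE_KEYS if key not in present]


def rendered_table_keys(table_keys) -> list:
    """Bin tables that rakaia is described as rendering (8 and 16 um)."""
    present = set(table_keys)
    return [key for key in RENDERED_TABLE_KEYS if key in present]
```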
As of rakaia v0.25.0, segmentation masks for Visium HD are available for the bin sizes above. However, due to the binning resolution, the masks will not retain the shape and resolution shown in the Visium HD cell segmentation summary, so they are better suited to positional inference or object ID annotation than to accurate spatial projection.