A complete summary of our Visium, Scanpy, Squidpy, and Xenium pipelines.
The Core Concept
Grinding up a tissue for sequencing tells you what is there, but destroys the map. Spatial Transcriptomics keeps the "address" of every cell. Today, we used two distinct platforms:
10x Visium
"Digital Staining"
Uses a slide with ~5,000 spots (55 µm each). Captures the entire transcriptome (~20k genes). It's slightly "blurry" because one spot can contain 1-10 cells.
10x Xenium
"Subcellular High-Def"
Uses imaging-based in situ hybridization, conceptually similar to single-molecule FISH. Looks at a curated panel (e.g., 500 genes) but provides X-Y coordinates for every detected RNA molecule inside the cell.
Tutorial 1: Basic Scanpy Spatial
We started with the "Hello World" of spatial biology: a Human Lymph Node mapped using sc.datasets.visium_sge.
Quality Control (QC)
We filtered out spots with extreme total counts and removed spots where more than 20% of counts came from mitochondrial genes (a sign of cell death/stress).
Normalization
Counts in each spot were scaled to a common total to correct for differences in sequencing depth, then log-transformed. The top 2,000 Highly Variable Genes (HVGs) were selected.
Leiden Clustering
PCA → KNN Graph → UMAP. We grouped spots that "spoke the same molecular language" into transcriptional clusters.
Mapping Genes to Tissue
The magic of Scanpy's spatial module is projecting transcriptional data back onto the physical H&E image.
The Key Finding: We identified Cluster 9 as a distinct follicular region, supported by expression of the marker gene CR2. This shows that tissue structures can be identified "digitally", without a pathologist manually labeling every region.
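The Tutorial 1 steps, from QC through projecting clusters back onto the tissue image, can be sketched roughly as follows. This is a minimal sketch that follows the public Scanpy spatial tutorial rather than our exact notebook: the sample ID `V1_Human_Lymph_Node`, the count thresholds, and the function name `run_basic_visium_pipeline` are assumptions. Only the 20% mitochondrial cutoff, the 2,000 HVGs, and the CR2 marker come from the notes above.

```python
# Illustrative thresholds; only MAX_PCT_MT and N_TOP_GENES come from the notes.
MIN_COUNTS = 5000    # assumed lower bound on total counts per spot
MAX_COUNTS = 35000   # assumed upper bound on total counts per spot
MAX_PCT_MT = 20      # spots at or above this % mitochondrial counts are removed
N_TOP_GENES = 2000   # number of highly variable genes to flag


def run_basic_visium_pipeline(sample_id="V1_Human_Lymph_Node", plot=True):
    """QC, normalize, cluster, and optionally plot a Visium dataset."""
    import scanpy as sc  # imported here so the sketch loads without Scanpy

    adata = sc.datasets.visium_sge(sample_id=sample_id)  # downloads the data
    adata.var_names_make_unique()

    # QC: flag mitochondrial genes, then compute per-spot metrics.
    adata.var["mt"] = adata.var_names.str.startswith("MT-")
    sc.pp.calculate_qc_metrics(adata, qc_vars=["mt"], inplace=True)
    sc.pp.filter_cells(adata, min_counts=MIN_COUNTS)
    sc.pp.filter_cells(adata, max_counts=MAX_COUNTS)
    adata = adata[adata.obs["pct_counts_mt"] < MAX_PCT_MT].copy()
    sc.pp.filter_genes(adata, min_cells=10)

    # Normalization: scale each spot to a common total, then log1p.
    sc.pp.normalize_total(adata, inplace=True)
    sc.pp.log1p(adata)
    sc.pp.highly_variable_genes(adata, flavor="seurat", n_top_genes=N_TOP_GENES)

    # PCA -> KNN graph -> UMAP, then Leiden clustering on the KNN graph.
    sc.pp.pca(adata)
    sc.pp.neighbors(adata)
    sc.tl.umap(adata)
    sc.tl.leiden(adata, key_added="clusters")

    if plot:
        # Project clusters and the CR2 marker back onto the H&E image.
        sc.pl.spatial(adata, img_key="hires", color=["clusters", "CR2"])
    return adata
```

Cluster numbering depends on the Scanpy version and random seed, so the follicular region will not necessarily come out as cluster 9 on a rerun.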
Tutorial 2: Visium Fluorescence
We shifted focus from transcriptomics to image space using Squidpy. Instead of an H&E stain, we used DAPI (DNA) and antibody stains (NEUN/GFAP) on a mouse brain slice.
DAPI Image (Raw Fluorescence) → Watershed (Nuclei Segmentation) → Features (Intensity & Texture)
The "Aha" Moment
By running Leiden clustering only on the extracted image features (no sequencing data!), we were able to subdivide the Hippocampus into known anatomical sub-layers. The gene expression data had previously grouped it all as one single block.
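A rough sketch of that image-only workflow with Squidpy is shown here. It assumes `adata` and `img` are the AnnData object and `ImageContainer` for the same section (for example from `sq.datasets.visium_fluo_adata_crop()` and `sq.datasets.visium_fluo_image_crop()`), that the image layer is named `"image"`, and that channel 0 is DAPI. The function name and the output column `image_feature_cluster` are made up for illustration.

```python
def cluster_on_image_features(adata, img, feature_key="features"):
    """Segment nuclei, extract image features per spot, and cluster on them."""
    import anndata as ad
    import scanpy as sc
    import squidpy as sq

    # Smooth the image, then segment nuclei on channel 0 (assumed to be DAPI).
    sq.im.process(img, layer="image", method="smooth")
    sq.im.segment(img, layer="image_smooth", method="watershed", channel=0)

    # Intensity summaries and texture features for the image crop under each spot.
    # Segmentation-based features (e.g. nuclei counts) could be added with
    # features="segmentation" pointing at the watershed layer.
    sq.im.calculate_image_features(
        adata,
        img,
        layer="image",
        features=["summary", "texture"],
        key_added=feature_key,
    )

    # Cluster spots on image features only: no gene expression is used here.
    feats = ad.AnnData(adata.obsm[feature_key])
    sc.pp.scale(feats)  # features live on very different scales
    sc.pp.pca(feats, n_comps=min(10, feats.n_vars - 1))
    sc.pp.neighbors(feats)
    sc.tl.leiden(feats)
    adata.obs["image_feature_cluster"] = feats.obs["leiden"].values
    return adata
```

Plotting `image_feature_cluster` next to the expression-based clusters is what makes the hippocampal sub-layers visible.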
Tutorial 3: Visium H&E Math
We moved past visualization into true Spatial Statistics using Squidpy on a mouse brain H&E section. This let us test spatial relationships statistically instead of judging them by eye.
Neighborhood Enrichment
A permutation test showing that the Pyramidal layer clusters and the Hippocampus cluster are physical neighbors more often than expected by chance.
Co-occurrence
Calculates, for increasing radii around a spot, how much more likely a given cluster is to appear when another cluster is present nearby, relative to its overall frequency.
Ligand-Receptor
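Used Squidpy's ligand-receptor analysis (CellPhoneDB-style permutation testing with interactions from OmniPath) to find candidate ligand-receptor pairs between clusters that share physical borders. These interactions are inferred from gene expression, not measured at the protein level.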
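To make the permutation logic behind neighborhood enrichment concrete, here is a small pure-Python sketch. It is not Squidpy's implementation (there the call would be `sq.gr.nhood_enrichment`). It only illustrates the idea: count edges between two clusters on the neighbor graph, shuffle the labels many times, and express the observed count as a z-score against the shuffled counts. The toy chain of spots and the labels A, B, C are invented for the example.

```python
import random
from statistics import mean, pstdev


def count_pairs(edges, labels, a, b):
    """Count undirected edges joining one node labeled `a` and one labeled `b`."""
    return sum(1 for i, j in edges if {labels[i], labels[j]} == {a, b})


def nhood_enrichment_z(edges, labels, a, b, n_perms=500, seed=0):
    """Z-score of the observed a-b edge count against label permutations."""
    rng = random.Random(seed)
    observed = count_pairs(edges, labels, a, b)
    shuffled = list(labels)
    null_counts = []
    for _ in range(n_perms):
        rng.shuffle(shuffled)  # keep the graph, randomize who carries which label
        null_counts.append(count_pairs(edges, shuffled, a, b))
    spread = pstdev(null_counts)
    z = (observed - mean(null_counts)) / spread if spread > 0 else 0.0
    return observed, z


# Toy chain of 20 spots: A and B alternate in the first half, C fills the second.
labels = ["A", "B"] * 5 + ["C"] * 10
edges = [(i, i + 1) for i in range(len(labels) - 1)]

obs_ab, z_ab = nhood_enrichment_z(edges, labels, "A", "B")
obs_ac, z_ac = nhood_enrichment_z(edges, labels, "A", "C")
print(f"A-B: {obs_ab} shared edges, z = {z_ab:.1f}")
print(f"A-C: {obs_ac} shared edges, z = {z_ac:.1f}")
```

A and B come out enriched (positive z) because they interleave, while A and C come out depleted (negative z) because they never touch.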
The "Clumpiness" Index: Moran's I
Instead of just finding cluster markers, we searched for Spatially Variable Genes (SVGs). We calculated the global spatial autocorrelation.
Moran's I Score
0.76
Example score for gene Olfm1
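A score near 1.0 means the gene's expression is strongly spatially clustered (neighboring spots have similar values). A score near 0 means no spatial pattern, and a negative score means neighboring spots tend to have opposite values.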
Genes like Olfm1 and Plp1 show that expression is spatially organized. These genes aren't randomly distributed in a "soup"; their expression traces the brain's structural layout.
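For intuition, Moran's I can be computed by hand on a tiny grid. The sketch below is a plain-Python illustration with binary neighbor weights, not the Squidpy routine (which would be `sq.gr.spatial_autocorr` with `mode="moran"`). The 4x4 grid and the two toy "genes" are invented for the example.

```python
def morans_i(values, edges):
    """Global Moran's I with binary weights on an undirected neighbor graph."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Each undirected edge contributes twice (w_ij and w_ji are both 1).
    cross = 2 * sum(dev[i] * dev[j] for i, j in edges)
    total_weight = 2 * len(edges)
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


def grid_edges(rows, cols):
    """Edges between horizontally and vertically adjacent cells of a grid."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                edges.append((node, node + 1))
            if r + 1 < rows:
                edges.append((node, node + cols))
    return edges


edges = grid_edges(4, 4)
# "Gene" expressed only in the left half of the tissue: spatially structured.
left_half = [1.0 if c < 2 else 0.0 for r in range(4) for c in range(4)]
# Checkerboard: every neighbor has the opposite value.
checker = [float((r + c) % 2) for r in range(4) for c in range(4)]

print(morans_i(left_half, edges))  # about 0.67
print(morans_i(checker, edges))    # about -1.0
```

The half-and-half pattern scores well above 0, in the same range as the Olfm1 example, while the checkerboard sits at the negative extreme.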
Tutorial 4: The Xenium Era
We stepped into the big leagues: 11,898 individual cells from a human lung cancer sample using the modern SpatialData and Zarr framework.
Xenium datasets contain millions of transcript coordinates (Points) and segmented cell boundaries (Shapes). Loading all of that into memory at once can exhaust RAM; Zarr stores the data in chunks so only the pieces that are needed get read.
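As a minimal sketch of what opening such a store looks like, the helper below reads a SpatialData Zarr store and lists what it holds. It assumes the Xenium output has already been converted to Zarr. The function name and any path passed to it (for example `"xenium_lung.zarr"`) are hypothetical.

```python
def summarize_spatialdata(zarr_path):
    """Open a SpatialData Zarr store and list the elements it contains."""
    import spatialdata as sd  # imported here so the sketch loads without it

    sdata = sd.read_zarr(zarr_path)
    return {
        "images": list(sdata.images.keys()),  # morphology images
        "points": list(sdata.points.keys()),  # transcript coordinates
        "shapes": list(sdata.shapes.keys()),  # cell / nucleus boundaries
        "tables": list(sdata.tables.keys()),  # AnnData tables (counts per cell)
    }
```

The per-cell count matrix lives in one of the tables as a regular AnnData object, so the Scanpy and Squidpy steps from the earlier tutorials still apply.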
Advanced Graphing: Delaunay
Visium spots sit on a perfect grid. Xenium cells are messy, irregularly packed shapes in a real tumor. How do we know who is a neighbor?
Delaunay Triangulation connects cell centroids into a network of triangles; two cells are treated as neighbors when an edge of that network links them.
Centrality Scores: Using this graph, we calculated closeness and degree centrality to identify specific tumor cells acting as "hubs" in the microenvironment.
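The graph logic can be illustrated without any spatial library. The sketch below takes triangles as triples of cell indices (what a Delaunay triangulation returns, e.g. `scipy.spatial.Delaunay(coords).simplices`), turns them into a neighbor graph, and computes degree and closeness centrality per node. The five-cell layout is a toy example. Note that Squidpy's `sq.gr.centrality_scores` summarizes these scores per cluster, whereas this sketch works per cell.

```python
from collections import deque
from itertools import combinations


def edges_from_triangles(triangles):
    """Turn triangles (triples of cell indices) into a set of undirected edges."""
    edges = set()
    for tri in triangles:
        for i, j in combinations(sorted(tri), 2):
            edges.add((i, j))
    return edges


def build_adjacency(n_nodes, edges):
    """Map each node to the set of nodes it shares an edge with."""
    adjacency = {node: set() for node in range(n_nodes)}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes each node touches directly."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def closeness_centrality(adjacency):
    """(n - 1) / sum of shortest-path lengths; assumes a connected graph."""
    n = len(adjacency)
    scores = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search for hop distances
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        scores[source] = (n - 1) / sum(dist.values())
    return scores


# Toy tissue: four cells at the corners of a square (0-3), one in the middle (4).
# These are the triangles a Delaunay triangulation of those centroids produces.
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
adjacency = build_adjacency(5, edges_from_triangles(triangles))
degree = degree_centrality(adjacency)
closeness = closeness_centrality(adjacency)
print(degree[4], closeness[4])  # the central cell: 1.0 1.0
print(degree[0], closeness[0])  # a corner cell: 0.75 0.8
```

The middle cell touches every other cell, so it scores highest on both measures; that is the "hub" pattern the centrality scores are meant to surface.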
Key Terminology Toolkit
Data Structures
AnnData: The master "Data Box." Holds counts (.X), per-spot metadata (.obs), per-gene metadata (.var), and spatial coordinates (.obsm["spatial"]).
SpatialData: The modern standard. Aligns images, transcript points, and cell shapes natively.
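A tiny hand-built AnnData shows where each piece lives. The spot names, counts, and coordinates below are made up; only the slot names (`.X`, `.obs`, `.var`, `.obsm["spatial"]`) reflect the actual structure.

```python
def make_toy_anndata():
    """Build a 3-spot, 2-gene AnnData with spatial coordinates."""
    import anndata as ad
    import numpy as np
    import pandas as pd

    # .X: counts matrix, rows are spots (or cells), columns are genes.
    counts = np.array([[5, 0], [3, 2], [0, 7]], dtype=np.float32)
    # .obs: one row of metadata per spot, e.g. its cluster label.
    obs = pd.DataFrame({"cluster": ["0", "0", "1"]}, index=["spot1", "spot2", "spot3"])
    # .var: one row of metadata per gene.
    var = pd.DataFrame(index=["CR2", "PLP1"])

    adata = ad.AnnData(X=counts, obs=obs, var=var)
    # .obsm["spatial"]: x-y position of each spot on the slide.
    adata.obsm["spatial"] = np.array([[10.0, 20.0], [15.0, 20.0], [40.0, 60.0]])
    return adata
```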
Biological Concepts
Spots vs Cells: Visium uses 55 µm circles (1-10 cells). Xenium segments true single cells.
Clusters: Groups of spots/cells with highly similar gene expression profiles (Leiden).
H&E Staining: The classic pink and purple dye. Hematoxylin (purple) binds to DNA/nuclei. Eosin (pink) binds to proteins/cytoplasm.
The "Holy Trinity" of Spatial
Throughout the day, our pipeline successfully bridged the three critical pillars of modern spatial biology:
Morphology
Image features, texture, and segmentation defining structure.
Proximity
Neighborhood enrichment & co-occurrence graphs.
Function
Moran's I patterns and Ligand-Receptor cell signaling.
Summary & Final Deliverables
Clean Repositories
Segmented tutorials into 4 clean directories, renamed output images sequentially, and drafted Caveman-style Markdown READMEs.
Ready for Research
Moved from basic spot visualization to sub-cellular interactive mapping. We are ready to model tumor microenvironments.