A human embryonic limb cell atlas resolved in space and time

Human tissue sample collection

First trimester human embryonic tissue was collected from elective termination of pregnancy procedures at Addenbrookes Hospital (Cambridge, UK) under full ethical approval from the East of England–Cambridge Central Research Ethics Committee (REC-96/085; for scRNA-seq and Visium), and at Guangzhou Women and Children’s Medical Center (China) under approval of the Research Ethics Committee of Zhongshan School of Medicine (ZSSOM), Sun Yat-sen University (ZSSOM-2019-075) and Guangzhou Women and Children’s Medical Center (2022-050A01, for in situ hybridization and immunohistochemistry). Consent was obtained after the decision was made to terminate pregnancy, in advance of the procedure. Experiments also followed the 2021 International Society for Stem Cell Research (ISSCR) guidelines in working on human embryos. Informed written consent was obtained from all donors before termination of pregnancy and tissue collection. No developmental abnormalities were visible or known in any of the embryos collected. All human data generated from China were registered at the China National Center for Bioinformation (PRJCA012474) and have been approved by the Chinese Ministry of Science and Technology for the Review and the Approval of Human Genetic Resources (2023BAT0445). For light-sheet fluorescence microscopy, tissues were obtained through INSERM’s HuDeCA Biobank and made available in accordance with the French bylaw. Permission to use human tissues was obtained from the French agency for biomedical research (Agence de la Biomédecine, Saint-Denis La Plaine, France; no. PFS19-012) and the INSERM Ethics Committee (IRB00003888). Written, informed consent was given for tissue collection by all patients. Embryonic age (PCW) was estimated using the independent measurement of the crown rump length, using the formula PCW (days) = 0.9022 × crown rump length (mm) + 27.372. PCW was recorded as week and day, separated by a decimal point; for example, PCW5.6 translates to 5 weeks and 6 days.

Human tissue processing and scRNA-seq data generation

Embryonic limbs were dissected from the trunk under a microscope using sterile microsurgical instruments. To capture cells of the interzone, four samples (a hindlimb and a forelimb from both PCW5.6 and PCW6.1) were then further dissected into proximal, middle (containing undisturbed interzone) and distal thirds before dissociation. For the PCW5.1 sample, no further dissection was performed due to small size and the limb was dissociated as a whole. For all other samples, the limb was dissected into proximal and distal halves before dissociation.

Dissected tissues were mechanically chopped into a mash, and then were digested in Liberase TH solution (50 μg ml−1; 05401135001, Roche) at 37 °C for 30–40 min till no tissue piece was visible. Digested tissues were filtered through 40-μm cell strainers followed by centrifugation at 750g for 5 min at 4 °C. Cell pellets were resuspended with 2% FBS in PBS if the embryos were younger than PCW8, otherwise red blood cell lysis (00-4300, eBioscience) was performed. The single-cell suspensions derived from each sample were then loaded onto separate channels of a Chromium 10x Genomics single-cell 3′ version 2 library chip as per the manufacturer’s protocol (PN-120233, 10x Genomics). cDNA sequencing libraries were prepared as per the manufacturer’s protocol and sequenced using an Illumina Hi-seq 4000 with 2 x 150-bp paired-end reads.

Mouse tissue sample collection and scRNA-seq data generation

Timed pregnant C57BL/6J wild-type mice were ordered from Jackson Laboratories. On arrival, timed pregnant mice were housed singly and maintained in solid-bottom Zyfone individually ventilated microisolator caging ((Lab Products) 7′′ wide × 12′′ length × 6′′ height). All cages were sanitized in a cagewash facility with a final rinse temperature of at least 180 F° before use. Each cage contained autoclaved hardwood chip bedding (Aspen Chip Bedding, Northeastern Products) and two sheets of tissue paper for nest building enrichment. All mice were fed irradiated standard rodent diet (PicoLab Rodent Diet 5053, PMI Nutrition International), and provided with ad libitum reverse osmosis water via water pouches (Hydropac, Lab Products) on arrival, before the start of any experimental manipulation. Animal rooms were maintained on a 14:10 h light–dark cycle with an hour-long dawn–dusk period with humidity ranging from 30% to 70% and temperatures ranging from 71 °F to 75 °F in compliance with the Guide for the Care and Use of Laboratory Animals. Animals were checked daily by the animal care staff to check for health and the availability of food, water and cage conditions. Embryos were collected at E12.5, E13.5 and E16.5. Only right-side forelimbs and hindlimbs were used in this study: n = 5 at the E12.5 timepoint, n = 5 at E13.5 and n = 2 at E16.5. No randomization, blinding or sample size choice were done. The sex of the embryos was not known or selected. Hindlimbs and forelimbs were pooled separately in ice-cold HBSS (14175-095, Gibco), and dissected into proximal, middle and distal limb regions, which were again separately pooled in 200 μl of HBSS placed in a drop in the centre of a 6-cm culture plate. Tissues were then minced with a razor blade and incubated with an addition of 120 μl of diluted DNase solution (04716728001, Roche) at 37 °C for 15 min. The DNase solution consisted of 1 ml UltraPure water (10977-015, Invitrogen), 110 μl 10× DNase buffer and 70 μl DNase stock solution. Of diluted Liberase TH (05401151001, Roche), 2 ml was then added to the plate, and the minced tissue suspension was pipetted into a 15-ml conical centrifuge tube. The culture plate was rinsed with 2 ml, and again with 1 ml of fresh Liberase TH, which was serially collected and added to the cell suspension. The suspension was incubated at 37 °C for 15 min, triturated with a P1000 tip and incubated for an additional 15 min at 37 °C. For the Liberase TH solution, 50X stock was prepared by adding 2 ml PBS to 5 mg of Liberase TH. Working solution was made by adding 100 μl 50X stock to 4.9 ml PBS. After a final gentle trituration of the tissue with a P1000 tip, the suspension was spun at 380g in a swinging bucket rotor at 4 °C for 5 min. After removing the supernatant, cells were resuspended in 5 ml of 2% FBS in PBS, and filtered through a pre-wetted 40-μm filter (352340, Falcon). After spinning again at 380g at 4 °C for 5 min, the supernatant was removed and cells were resuspended in 200 µl 2% FBS in PBS. A small aliquot was diluted 1:10 in 2% FBS/PBS and mixed with an equal volume of Trypan Blue for counting on a haemocytometer. The full suspension was diluted to 1.2 million cells per millilitre for processing on the 10x Genomics Chromium Controller, with a target of 8,000 cells per library. Libraries were processed according to the manufacturer’s protocol, using the v3 Chromium reagents. All animal procedures were performed according to protocols approved by the Institutional Animal Care and Use Committee at the California Institute of Technology. Animals were housed in an AAALAC-accredited facility in accordance with the Guide for the Care and Use of Laboratory Animals.

Visium spatial transcriptomic experiments of human tissue

Whole embryonic limb samples at PCW6–PCW8 were embedded in OCT within cryo wells and flash-frozen using an isopentane and dry ice slurry. Ten-micron-thick cryosections were then cut in the desired plane and transferred onto Visium slides before haematoxylin and eosin staining and imaged at ×20 magnification on a Hamamatsu Nanozoomer 2.0 HT Brightfield. These slides were then further processed according to the 10x Genomics Visium protocol, using a permeabilization time of 18 min for the PCW6 samples and 24 min for older samples. Images were exported as tiled tiffs for analysis. Dual-indexed libraries were prepared as in the 10x Genomics protocol, pooled at 2.25 nM and sequenced four samples per Illumina Novaseq SP flow cell with read lengths of 28 bp R1, 10 bp i7 index, 10 bp i5 index and 90 bp R2.

Digit region analysis of Visium data

The Visium data were clustered by the Louvain algorithm after filtering genes that were expressed in less than one spot and performing normalization and logarithmization. After that, the spot clusters of interest were annotated based on haematoxylin and eosin histology and marker genes. The differential expression testing was performed by Wilcoxon test using Scanpy (

Alignment, quantification and quality control of scRNA-seq data

Droplet-based (10x) sequencing data were aligned and quantified using the Cell Ranger Single-Cell Software Suite (v3.0.2, 10x Genomics). The human reference is the hg38 genome refdata-cellranger-GRCh38-3.0.0, available at: The mouse reference is the mm10 reference genome refdata-gex-mm10-2020-A, available at: Published mouse scRNA-seq FASTQ files were downloaded from ENCODE’s portal and the Gene Expression Omnibus76,77,78. The following quality control steps were performed: (1) cells that expressed fewer than 200 genes (low quality) were excluded; (2) genes expressed by less than five cells were removed; and (3) cells in which over 10% of unique molecular identifiers were derived from the mitochondrial genome were removed.

Alignment and quantification of human Visium data

Raw FASTQ files and histology images were processed, aligned and quantified by sample using the Space Ranger software v.1.1.0, which uses STAR v.2.5.1b52 for genome alignment, against the Cell Ranger hg38 reference genome refdata-cellranger-GRCh38-3.0.0, available at:

Doublet detection of scRNA-seq data

Doublets were detected with an approach adapted from a previous study82. In the first step of the process, each 10x lane was processed independently using the Scrublet to obtain per-cell doublet scores. In the second step of the process, the standard Scanpy processing pipeline was performed up to the clustering stage, using default parameters83. Each cluster was subsequently separately clustered again, yielding an over-clustered manifold, and each of the resulting clusters had its Scrublet scores replaced by the median of the observed values. The resulting scores were assessed for statistical significance, with P values computed using a right tailed test from a normal distribution centred on the score median and a median absolute deviation-derived standard deviation estimate. The median absolute deviation was computed from above-median values to circumvent zero truncation. The P values were corrected for false discovery rate with the Benjamini–Hochberg procedure and were used to assess doublet level. The clusters from batch-corrected overall clustering across all the samples that have median scores lower than 0.1 and are supported by an absence of exclusive marker genes or literature were manually curated and removed (1,450 doublets were removed in human data and 958 in mouse data).

Data preprocessing and integration of scRNA-seq data

Preprocessing included data normalization (pp.normalize_per_cell with 10,000 counts per cell after normalization), logarithmization (pp.log1p), highly variable genes detection (pp.highly_variable_genes and select for highly correlated ones as previously described76) per batch and merging, data feature scaling (pp.scale), cell cycle and technical variance regressing (tl.score_gene_cell_cycle and pp.regress_out(adata,[‘S_score’, ‘G2M_score’, ‘n_counts’, ‘percent_mito’])), and principal component analysis (tl.pca with 100 components) performed using the Python package Scanpy (v.1.8.2). bbknn (v.1.5.1) was used to correct for batch effect between sample identities with the following parameters (n_pcs = 100, metric = ‘Euclidean’, neighbors_within_batch = 3, trim = 299, approx = false). Following this, further dimension reduction was performed using uniform manifold approximation and projection (UMAP) (scanpy tl.umap with default parameters) based on the corrected neighbourhood graph of bbknn.

Clustering and annotation of scRNA-seq data

We first applied Leiden graph-based clustering (scanpy tl.leiden with default parameters) to perform unsupervised cell classification. Each cluster was then subclustered if heterogeneity was still observed and was manually annotated (see Supplementary Table 1 for marker genes) and curated as previously described84. To make sure all the curated Leiden clusters could clearly be mapped onto their UMAP embedding coordinates, we performed the partition-based graph abstraction (PAGA) (tl.paga with the Leiden clusters) and reran UMAP with the initial position from the PAGA.

Deconvolution of human Visium data using cell2location

To map clusters of cells identified by scRNA-seq in the profiled spatial transcriptomics slides, we used the cell2location method85. In brief, this involved first training a negative binomial regression model to estimate reference transcriptomic profiles for all the scRNA-seq clusters in the developing limb. Next, lowly expressed genes were excluded as per recommendations for use of cell2location, leaving 13,763 genes for downstream analysis. Next, we estimated the abundance of each cluster in the spatial transcriptomics slides using the reference transcriptomic profiles of different clusters. This was applied to all slides simultaneously, using the sample ID as the batch_key and categorical_covariate_keys. To identify microenvironments of colocalizing cell clusters, we used non-negative matrix factorization implementation in scikit-learn, utilizing the wrapper in the cell2location package86. A cell type was considered part of a microenvironment if the fraction of that cell type in said environment was over 0.2.

Alignment and merging of multiple Visium sections using VisiumStitcher

To analyse the whole PCW8.1 human hindlimb, we took three consecutive 10-µm sections from different regions and placed them on different capture areas of the same Visium library preparation slide. The first section spanned the distal femur, knee joint and proximal tibia (sample C42A1), the second the proximal thigh (sample C42B1) and the third the distal tibia, ankle and foot (sample C42C1).

The images from these three Visium capture areas were then aligned using the TrackEM plugin (Fiji)87. Following affine transformations of C42B1 and C42C1 to C42A1, the transformation matrices were exported to an in-house pipeline ( for complementary alignment of the spot positions from the SpaceRanger output to the reconstructed space. In addition, we arbitrarily decided that overlapping regions would keep the image from the centre portion (see Extended Data Fig. 6a) while keeping all the spots in the data matrix. Next, we merged the three library files and matched the reconstructed image to the unified AnnData object.

Trajectory analysis of human scRNA-seq data

Development trajectories were inferred by combining diffusion maps, PAGA and force-directed graph. The first step of this process was to perform the first nonlinear dimensionality reduction using diffusion maps (scanpy tl.diffmap with 15 components) and recompute the neighbourhood graph (scanpy pp.neighbors) based on the 15 components of diffusion maps. In the second step of this process, PAGA (scanpy tl.paga) was performed to generate an abstracted graph of partitions. Finally, force-directed graph was performed with the initial position from PAGA (scanpt tl.draw_graph) to visualize the development trajectories.

RNA velocity calculations for mesenchymal compartment

The scVelo version 0.24 package for Python was used to calculate a ratio of spliced-to-unspliced mRNA abundances in the dataset88. The data were subclustered to the mesenchymal compartment for a single sample (PCW7.2). The data were then processed using default parameters following preprocessing as described in Scanpy scVelo implementation. The samples were preprocessed using functions for detection of minimum number of counts, filtering and normalization using scv.pp.filter_and_normalize and followed by scv.pp.moments function using default parameters. The gene-specific velocities were then calculated using with mode set to stochastic and functions, and visualized using function.

Cell–cell communication analysis of human scRNA-seq data

Cell–cell communication analysis was performed using (v.2.1.4) for each dataset at the same stage of development89,90. The stage-matched Visium data were used to validate the spatial distance and expression pattern of significant (P < 0.05) ligand–receptor interactions.

Regulon analysis of transcription factors

To carry out transcription factor network inference, analysis was performed as previously described91 using the pySCENIC Python package (v.0.10.3). For the input data, we filtered out the genes that were expressed in less than 10% of the cells in each cell cluster. Then, we performed the standard procedure including deriving co-expression modules (pyscenic grn), finding enriched motifs (pyscenic ctx) and quantifying activity (pyscenic aucell).

Integration of human and mouse scRNA-seq data

Mouse orthologues were first ‘translated’ to human genes using MGI homology database ( Processed human and mouse data were then merged together using outer join of all the genes. The matched dataset was then integrated by MultiMAP92 (the MultiMAP_Integration() function), using separately pre-calculated principal components and the union set of previously calculated mouse and human feature genes (including both orthologues and non-orthologues) to maximize biological variance. Downstream clustering and embedding were performed in the same way as previously described and cell-type annotation was based on marker genes. Cell-type composition of proximal, middle and distal segments of the same limb was visualized using function. To capture the differential expression of sparsely captured genes, the odds ratio of the percentages of non-zero cells between groups of cells was used to select for proximal/distal or forelimb/hindlimb biased genes with a cut-off at 30-fold and 3-fold, respectively.


The limb samples were post-fixed in 4% paraformaldehyde for 24 h at 4 °C followed by paraffin embedding. A thickness of 4-μm sections were boiled in 0.01 M citrate buffer (pH 6.0) after dewaxing. Immunofluorescence staining was then carried out as previously described93. Primary antibodies for RUNX2 (1:50; sc-390715, Santa Cruz), THBS2 (1:100; PA5-76418, Thermo Fisher), COL2A1 (1:200; sc-52658, Santa Cruz), PITX1 (1:30; Ab244308, Abcam), PAX3 (1:1; AB_528426 supernatant, DSHB), ALDH1A3 (1:50; 25167-1-AP, Proteintech) and MYH3 (1:3; AB_528358 supernatant, DSHB) and anti-KERA (1:1,000; HPA039321, Sigma-Aldrich) were incubated overnight at 4 °C. After washing, sections were incubated with appropriate secondary antibodies Alexa Flour 488 goat anti-mouse IgG1 (1:400; A-21121, Invitrogen), Alexa Flour 647 goat anti-mouse IgG2b (1:400; A-21242, Invitrogen), Alexa Flour 488 goat anti-mouse IgG (H + L) (1:400; A-11029, Invitrogen) and Alexa Flour 546 goat anti-rabbit IgG (H + L) (1:400; A-11035, Invitrogen) at room temperature for 1 h, and were mounted using FluorSave Reagent (345789, Calbiochem). For 3,3-diaminobenzidine staining, we used a streptavidin–peroxidase broad spectrum kit (SP-0022, Bioss) and 3,3-diaminobenzidine solution (ZLI-9017, ZSGB-BIO) following the manuals from the manufacturers. The primary antibodies PI16 (1:500; HPA043763, Sigma-Aldrich), FGF19 (1:500; DF2651, Affinity) and NEFH (1:1,000; 2836, Cell Signaling) were applied. Single-plane images were acquired using an inverted microscope (DMi8, Leica).


Fresh tissue samples were embedded in OCT and frozen at −80 °C until analysis. Cryosections were cut at a thickness of 10 μm or 12 μm using a cryostat (Leica CM1950 or CM3050). Before staining, tissue sections were post-fixed in 4% paraformaldehyde for 15 min at 4 °C. After a series of 50%, 70%, 100% and 100% ethanol dehydration for 5 min each, tissue sections were treated with hydrogen peroxide for 10 min. Next, the sections were digested with protease IV (322336, ACD) for 20–30 min at room temperature; alternatively, they were digested with protease III (322337, ACD) for 15 min after heat-induced epitope retrieval. RNA-ISH was then carried out manually or using BOND RX (Leica) by using the RNAscope Multiplex Fluorescent Reagent Kit v2 Assay (323110, ACD) or the PinpoRNA multiplex fluorescent RNA in situ hybridization kit (PIF2000, GD Pinpoease) according to the instructions by the manufacturers. To visualize targeted RNAs from individual channels, different tyramide signal amplification (TSA) fluorescent substrates were incubated. Two sets of fluorophores TSA520, TSA570 and TSA650 (PANOVUE) and Opal 520, Opal 570 and Opal 650 (Akoya Biosciences) were used and consistent results were obtained. For the staining of four probes, the RNAscope 4-plex Ancillary Kit (323120, ACD) was applied additionally, and a combination of fluorophores TSA520, TSA570, Opal620 and Opal690 were used. The stained sections were imaged with either AxioScan.Z1 (Zeiss) or the Opera Phenix High-Content Screening System (PerkinElmer).

RNA-ISH colocalization analysis

Colocalization analysis was performed by first identifying the expressed genes on raw images through the utilization of a pixel classifier trained with the software ilastik94. Subsequently, the predicted mask image was subjected to analysis, with the probability of co-occurrence determined by tallying the instances in which one gene coexists with another at the same 0.14 × 0.14 μm pixel, and dividing this by the total number of pixels in which the gene of interest was expressed, regardless of the presence of the other gene.

Light-sheet fluorescence microscopy

Embryonic and fetal limbs were dissected from morphologically normal specimens collected from PCW5 to PCW6.5. Candidate antibodies were screened by immunofluorescence on cryosections obtained from OCT-embedded specimens as previously described95. Whole-mount immunostaining of the limbs was performed as previously described, with primary antibody incubation at 37 °C reduced to 3 days followed by 1 day in secondary antibodies. Samples were embedded in 1.5% agarose and optically cleared with solvents using the iDisco+ method. Cleared samples were imaged with a Blaze light-sheet microscope (Miltenyi Biotec) equipped with a 5.5MP sCMOS camera controlled by Imspector Pro 7.5.3 acquisition software. A ×12 objective with ×0.6 or ×1 magnification (MI plan NA 0.53) was used. Imaris (v10.0, BitPlane) was used for image conversion, processing and video production.

The antibodies used for light-sheet fluorescence microscopy

IRX1 Sigma-Aldrich cat. no. HPA043160, RRID: AB_10794771 (1/200e); MSX1 R&D Systems cat. no. AF5045, RRID: AB_2148804 (1/500e); LHX2 Abcam cat. no. ab184337, RRID: AB_2916270 (1/1,000e); SOX9 Abcam cat. no. ab196184, RRID: AB_2813853 (1/500e); MAFB Abcam cat. no. ab223744, RRID: AB_2894837 (1/500e); donkey anti-rabbit IgG H&L (Alexa Fluor 555) Abcam cat. no. ab150062, RRID: AB_2801638 (1/800e); and donkey anti-goat IgG H&L (Alexa Fluor 750) Abcam cat. no. ab175745, RRID: AB_2924800 (1/300e).

MSC knockdown in human primary myoblasts

Isolation of human primary myoblast cells

The thighs from human embryos were processed as previously described96, except that the dissociated cells were not treated with erythrocyte lysis solution, and were incubated with anti-human CD31 (12-0319-41, eBioscience), CD45 (12-0459-41, eBioscience) and CD184 (17-9999-41, eBioscience) antibodies for cell sorting. Fluorescent activated cell sorting (BD, influx) sorted CD31CD45CD184+ cells were cultured in complete growth medium DMEM supplemented with 20% FCS and 1% penicillin–streptomycin (15140122, Gibco).

Small interfering RNA transfection

Human primary myoblasts were seeded into a six-well plate one night before transfection. When the cell density reached approximately 50% confluence, oligos of small interfering RNA against MSC and negative control were transfected using Lipofectamine 3000 reagent (L3000015, Invitrogen) at a final concentration of 37.5 nM. After incubation for 16 h, the growth medium was replaced with differentiation medium containing 2% horse serum and 1% penicillin–streptomycin in DMEM. After culturing for an additional 6–8 h, the cells were collected for RNA extraction. Initially, three siRNA oligos (9242-1, 9242-2 and 9242-3, Bioneer) were tested, and the third one with sense sequences 5′-GAAGUUUCCGCAGCCAACA-3′ were used in this study.

RNA extraction and qPCR

Total cell RNA was extracted with the EZ-press RNA purification kit (B0004D, EZBioscience), and the cDNA was synthetized using the PrimeScript RT Master Mix Kit (RR036A, Takara). The qPCR was performed using PerfectStart Green qPCR Super Mix (AQ601, TransGen Biotech) on a real-time PCR detection system (LightCycle480 II, Roche). RPLP0 served as an internal control, and the fold enrichment was calculated using the formula 2−ΔΔCt. The following primers (5′−3′) were used:


Ethics statement

The work done in the UK was supported by the National Institute for Health and Care Research Cambridge Biomedical Research Centre (NIHR203312) and provided by the Cambridge Biorepository for Translational Medicine ( Human fetal samples were provided by the National Institute for Health and Care Research Cambridge Biomedical Research Centre and collected under the Research Ethics Committee-approved study 96/085. The work done in China was approved by the Research Ethics Committee of Zhongshan School of Medicine (ZSSOM), Sun Yat-sen University (ZSSOM-2019-075) and Guangzhou Women and Children’s Medical Center (2022-050A01). At both centres, consent was obtained from the patient following the decision to terminate the pregnancy and in advance of the procedure. All animal procedures were performed according to protocols approved by the Institutional Animal Care and Use Committee at the California Institute of Technology. Details of human tissue sample collection are in the Methods section of this article. The views expressed are those of the authors and not necessarily those of the National Institute for Health and Care Research or the Department of Health and Social Care.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Source link

Back to top button