[ad_1]

### Data

Observational concentrations of DIP, DIC, ALK and O_{2} were downloaded from the Global Ocean Data Analysis Project website, the second version (GLODAPv2 (ref. ^{15}); Extended Data Fig. 1). DOC observations were from ref. ^{16}. The data were then binned into the ocean circulation inverse model (OCIM) model grid that has a horizontal resolution of 2° × 2° and vertically 24 layers, with finer resolution in the upper ocean and coarser resolution in the deep ocean. The updated DOC compilation contains 25,869 valid data points (Extended Data Fig. 2) after binning to our model grid and has better coverage than previously widely used ones (ref. ^{51}) that only had 14,034 valid data points in the model grid. The DOC dataset has a slight seasonal bias, with more samples collected in the summer season. However, we think that the influence is minor because: (1) a substantial proportion of the total DOC is composed of refractory DOC; unlike labile and semi-labile DOC, refractory DOC does not exhibit strong seasonality owing to its long residence time in the ocean; (2) we used tracer data from the full-water depth to constrain our model parameters. The deep ocean experiences lesser seasonal variability compared with the surface ocean. Therefore, using full-water-depth data helps to anchor the stability of the inversion. Two NPP products, carbon-based NPP from Sea-viewing Wide Field-of-view Sensor (SeaWiFS CbPM)^{52} and CAFE, were downloaded from https://doi.org/10.6084/m9.figshare.19074521. The NPP products were interpolated and averaged by Nowicki et al.^{24} to the same model grid as used in this study. The climatological ocean temperature and silicate are from World Ocean Atlas 2018 (refs. ^{53,54}). The projected temperature at 2099 was obtained from a CESM-BGC model prediction under the RCP8.5 scenario^{55}. The historical atmospheric *p*CO_{2} data were obtained from ref. ^{56} for the period from 1850 to 2015 and were downloaded from https://scrippsco2.ucsd.edu/data/atmospheric_co2/primary_mlo_co2_record.html (ref. ^{57}) for the period from 2016 to 2020.

### Biogeochemical inverse model

A schematic of the structure of the biogeochemical model is shown in Fig. 1. The model couples the cycling of phosphorus (P), carbon (C) and oxygen (O). The phosphorus model is the base model that provides a biological uptake rate (*γ*(**r**), in which **r** is a position coordinate) in P units (**G** ≡ (*γ*[DIP])), which is then converted to a DIC uptake rate in the carbon model by incorporating a C:P ratio (*r*_{C:P}). In the P-cycle model, the DIP assimilation rate is modelled using a spatial pattern obtained from satellite-derived NPP (mg C m^{−2} day^{−1}) and a gridded surface DIP climatology as follows

$$\gamma \left({\bf{r}}\right)\equiv \left\{\begin{array}{c}\alpha \frac{{\left[\frac{1}{{r}_{{\rm{C:P}}}}\frac{{\rm{NPP}}({\bf{r}})}{{{\rm{NPP}}}_{0}}\right]}^{\beta }}{\frac{{\left[{\rm{DIP}}\right]}_{{\rm{obs}}}({\bf{r}})}{{[{\rm{DIP}}]}_{0}}},\,{\rm{if}}\,z < {z}_{{\rm{c}}},\\ 0,\,{\rm{otherwise}}\end{array}\right.$$

(1)

in which NPP_{0} and [DIP]_{0} are 1 mmol C m^{−2} day^{−1} and 1 μM that are functioned to remove dimensions of NPP and DIP; *α* and *β* are adjustable parameters that are constrained in the inversion; *r*_{C:P} is the C:P ratio that is used to convert NPP from C unit to P unit and modelled according to Galbraith and Martiny^{58} (*r*_{C:P} = (0.006 + 0.0069[DIP]_{obs})^{−1}). *z* and *z*_{C} are water depth and the euphotic zone depth, respectively. Photosynthesis is assumed to occur only in the euphotic zone and to be zero below. The euphotic zone is defined as the top two model layers (73.4 m).

#### Phosphorus model

The phosphorus model considers four explicit tracers: dissolved inorganic phosphorus (DIP), dissolved semi-labile organic phosphorus (DOP), dissolved labile organic phosphorus (DOP_{l}) and particulate organic phosphorus (POP). We assign an e-folding remineralization time (1/*κ*_{l}) of 12 h for DOP_{l} so that it quickly cycles in the upper ocean, with little being transported below the euphotic zone. We use a parameter *δ* to allocate production to labile pools. The remaining production (total production less production to DOP_{l}) is allocated to DOP and POP. The factions *σ*_{P} and (1 − *σ*_{P} − *δ*) of the production allocated, respectively, to DOP and POP are determined by estimating the parameter *σ*_{P} through our Bayesian inversion procedure. The advective-diffusive transport of dissolved tracers (DIP, DOP and DOP_{l} in the P model; DIC, semi-labile dissolved organic carbon (DOC), labile dissolved organic carbon (DOC_{l}), refractory dissolved organic carbon (DOC_{r}) and ALK in the C model; and O_{2} in the O model) is computed using the OCIM tracer transport matrix, \({\bf{T}}\left[{\rm{C}}\right]\equiv \nabla \cdot \left(\vec{U}\left[{\rm{C}}\right]-{\rm{K}}\nabla \left[{\rm{C}}\right]\right)\), in which \(\vec{U}\) is the velocity vector and K is the diffusive term. **Τ** represents the climatological mean circulation of the ocean. The OCIM tracer transport matrix is constrained using salinity, temperature, sea-surface height, CFC-11, CFC-12, ^{14}C, ^{3}He etc. (see DeVries and Holzer^{59} for details). We neglect the advective-diffusive transport of particulate tracers (POP in the P model and PIC and POC in the C model) so that particulate tracers are transported only vertically. The vertical transport of POP is modelled using a sinking flux divergence operator (\({{\bf{F}}}_{{\rm{POP}}}\equiv \nabla \cdot (\vec{w}[{\rm{POP}}])\)), in which \(\vec{w}\) is the sinking speed of POP and is directed downward. We choose a sinking speed that increases linearly with depth and a constant dissolution rate, *κ*_{P} = (1/30) days^{−1}, so that the attenuation of the vertical flux of POP follows a power-law function, *F*(*z*) = *F*(*z*_{0})(*z*/*z*_{0})^{−b}, in which *F*(*z*) and *F*(*z*_{0}) are fluxes at a depth of *z* and *z*_{0}, respectively^{60}. A sensitivity test with *κ*_{P} = (1/60) days^{−1} suggests that the choice of *κ*_{P} does not markedly influence our results. The exponent *b* for the P model (C model in the following section) is defined in the following way (ref. ^{42}), *b*(P) = *b*_{Pθ}*T* + *b*_{P}, in which *b*_{Pθ} and *b*_{P} are two adjustable parameters and *T* is the average temperature of the model euphotic zone. The initial guess of *b*_{Pθ} is set to zero, thereby avoiding any intentional imposition of temperature dependence. The optimization process determines both the sign and magnitude of *b*_{Pθ}. The governing equations for the phosphorus cycle are as follows:

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right][{\rm{DIP}}]=-\,\gamma [{\rm{DIP}}]+{\kappa }_{{\rm{p}}}[{\rm{POP}}]+{\kappa }_{{\rm{dP}}}[{\rm{DOP}}]+{\kappa }_{{\rm{l}}}[{{\rm{DOP}}}_{{\rm{l}}}]+{\kappa }_{{\rm{g}}}([{\rm{DIP}}]-{\overline{[{\rm{DIP}}]}}_{{\rm{obs}}}),$$

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right][{\rm{DOP}}]={\sigma }_{{\rm{P}}}\gamma [{\rm{DIP}}]-{\kappa }_{{\rm{dP}}}[{\rm{DOP}}],$$

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right][{{\rm{DOP}}}_{{\rm{l}}}]=\delta \gamma [{\rm{DIP}}]-{\kappa }_{{\rm{l}}}[{{\rm{DOP}}}_{{\rm{l}}}],$$

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{{\bf{F}}}_{{\rm{POP}}}\right][{\rm{POP}}]=(1-{\sigma }_{{\rm{P}}}-\delta )\gamma [{\rm{DIP}}]-{\kappa }_{{\rm{P}}}[{\rm{POP}}],$$

(2)

in which *κ*_{dP} is the DOP remineralization rate constant that is a function of temperature defined using a Q_{10} function (*κ*_{dP} = *κ*_{Pθ}Q_{10}^{(T−30)/10}), in which *T* is water temperature from World Ocean Atlas 2018 (ref. ^{54}). *κ*_{Pθ} and Q_{10} are optimized in the inversion. *κ*_{l} is the e-folding remineralization time of DOP_{l}, which is fixed at *κ*_{l} = (1/12) h^{−1}. We tested the sensitivity to a smaller *κ*_{l} = (1/24) h^{−1} and found that the choice of *κ*_{l} did not substantially change the fittings to the tracers but could alter the export flux of labile organic matter. We therefore include different *κ*_{l} values in the uncertainty analysis (see the ‘Uncertainty analysis’ section). *κ*_{g} is prescribed to (1/10^{6}) years^{−1} and is used to set the global mean DIP concentration to the observed global mean concentration ([DIP]_{obs}). *κ*_{P} is a prescribed POP remineralization rate constant (*κ*_{P} = (1/30) days^{−1}). A sensitivity test shows that increases or decreases in the fraction of DOP_{l} production (*δ* in equations (2) and (3)) does not alter the fit to the observational data nor does it change the inferred export fluxes of POC and semi-labile DOC. We therefore set *δ* to be zero in the first-round optimization.

#### Carbon model

The carbon model explicitly simulates seven tracers: DIC, DOC_{l}, DOC, DOC_{r}, POC, PIC and ALK (Fig. 1b). The DIP assimilation rate **G** is converted to the DIC assimilation rate by incorporating a C:P ratio (*r*_{C:P}) that is allowed to vary spatially according to the modelled DIP concentration, *r*_{C:P} = (cc[DIP] + dd)^{−1}, in which cc and dd are estimated as part of the inversion. As in the P model, we set the allocation to DOC_{l} to be zero in the first round of optimization. Subsequently, we prescribe the difference between satellite NPP and model organic carbon production as the production for labile DOC_{l}, so that our model production matches satellite NPP exactly. The fraction, *σ*_{C}, of the organic carbon production allocated to POC and DOC pools is estimated as part of the inversion and does not need to be the same as the fraction *σ*_{P} allocated to the POP and DOP pools. A further adjustable parameter, *η*, is used to control the fraction of DOC that is transferred to the refractory pool by bacterial reworking. The remaining DOC fraction (1 − *η*) is remineralized back to DIC. The e-folding decay times of DOC_{r} (*κ*_{ur} and *κ*_{dr} for the upper and deeper ocean, respectively) are estimated as part of the inversion. POC sinks and is gradually remineralized to DIC in the water column. The downward transport of POC is modelled using a flux divergence operator (**F**_{POC}), which is formulated in the same way as the POP sinking flux-divergence operator **F**_{POP} with independent adjustable parameters *b*_{Cθ} and *b*_{C} that are determined as part of the inversion (Extended Data Table 1). Unlike DIP, DIC experiences sea-to-air gas exchange at the surface. This gas exchange is modelled according to the method used for phase 2 of the Ocean Carbon-Cycle Model Intercomparison Project (OCMIP-2)^{61} using a recalibrated piston velocity (see the next section). Also, freshwater precipitation and evaporation can greatly affect surface ocean DIC and ALK concentrations. Precipitation will dilute, whereas evaporation will concentrate their concentrations. A virtual flux according to OCMIP-2 (ref. ^{61}) is applied to model for the effects of precipitation and evaporation on DIC and ALK (**F**_{vDIC}[DIC]_{s} and **F**_{vALK}[ALK]_{s}, in which [DIC]_{s} and [ALK]_{s} are the mean surface-ocean concentrations of DIC and ALK, respectively).

Production of PIC is modelled to be proportional to the production of POC using two adjustable parameters, *r*_{Si} and *r*_{RR}, that are estimated in the inversion. The parameter *r*_{Si} adjusts PIC production according to silicate concentration in the surface ocean in linear form (*R*_{RR} = *r*_{Si}[SiO_{4}^{4−}] + *r*_{RR}). The downward transport of PIC is modelled using a flux divergence operator (**F**_{PIC}), which generates a PIC flux profile that follows an exponential function *F*_{PIC}(*z*) = *F*_{0}exp((*z* − *z*_{0})/*d*), in which *d* is the PIC dissolution length scale, whose value is estimated as part of the inversion (Extended Data Table 1). Compared with a power-law function, an exponential function with a length scale on the order of several thousand metres leads to a much smaller CaCO_{3} dissolution rate in the shallow water in which CaCO_{3} is supersaturated^{62}. Every mole of PIC production consumes two moles of ALK. By contrast, the dissolution of one mole of PIC releases two moles of ALK (equation (3)). From the perspective of carbon, photosynthesis and remineralization of organic matter do not change alkalinity. However, in the processes of photosynthesis and remineralization, chemical forms of nitrogen change, which influences alkalinity so that a mole of organic carbon production increases alkalinity by *r*_{N:C} moles, whereas a mole of organic carbon remineralization decreases alkalinity by *r*_{N:C} moles. The governing equations for carbon cycling are as follows:

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right][{\rm{D}}{\rm{I}}{\rm{C}}]=-({\bf{I}}+(1-{{\rm{\sigma }}}_{{\rm{C}}}-\delta ){r}_{{\rm{R}}{\rm{R}}}){\bf{G}}{r}_{{\rm{C}}:{\rm{P}}}+\eta {\kappa }_{{\rm{d}}{\rm{C}}}[{\rm{D}}{\rm{O}}{\rm{C}}]+{\kappa }_{{\rm{l}}}[{\rm{D}}{\rm{O}}{{\rm{C}}}_{{\rm{l}}}]+{\kappa }_{{\rm{r}}}[{\rm{D}}{\rm{O}}{{\rm{C}}}_{{\rm{r}}}]+{\kappa }_{{\rm{P}}{\rm{I}}{\rm{C}}}[{\rm{P}}{\rm{I}}{\rm{C}}]+{\kappa }_{{\rm{p}}}[{\rm{P}}{\rm{O}}{\rm{C}}]+{{\bf{F}}}_{{{\rm{C}}{\rm{O}}}_{2}}+{{\bf{F}}}_{{\rm{v}}{\rm{D}}{\rm{I}}{\rm{C}}}{[{\rm{D}}{\rm{I}}{\rm{C}}]}_{{\rm{s}}},$$

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right]\left[{\rm{DOC}}\right]={\sigma }_{{\rm{C}}}{\bf{G}}{r}_{{\rm{C}}:{\rm{P}}}-{\eta \kappa }_{{\rm{dC}}}\left[{\rm{DOC}}\right],$$

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{{\bf{F}}}_{{\rm{POC}}}\right]\left[{\rm{POC}}\right]=\left(1-{\sigma }_{{\rm{C}}}-\delta \right){\bf{G}}{r}_{{\rm{C}}:{\rm{P}}}-{\kappa }_{{\rm{p}}}\left[{\rm{POC}}\right],$$

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{{\bf{F}}}_{{\rm{PIC}}}\right]\left[{\rm{PIC}}\right]=\left(1-{\sigma }_{{\rm{C}}}-\delta \right){R}_{{\rm{RR}}}{\bf{G}}{r}_{{\rm{C}}:{\rm{P}}}-{\kappa }_{{\rm{PIC}}}\left[{\rm{PIC}}\right],$$

$$\begin{array}{l}\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right][{\rm{ALK}}]=-2(1-{\sigma }_{{\rm{C}}}-\delta ){R}_{{\rm{RR}}}{\bf{G}}{r}_{{\rm{C}}:{\rm{P}}}+{r}_{{\rm{N}}:{\rm{C}}}{\bf{G}}{r}_{{\rm{C}}:{\rm{P}}}\\ \,\,\,\,-{r}_{{\rm{N}}:{\rm{C}}}(\eta {\kappa }_{{\rm{dC}}}[{\rm{DOC}}]+{\kappa }_{{\rm{r}}}[{\rm{DO}}{{\rm{C}}}_{{\rm{r}}}]+{\kappa }_{{\rm{l}}}[{\rm{DO}}{{\rm{C}}}_{{\rm{l}}}]+{\kappa }_{{\rm{p}}}[{\rm{POC}}])\\ \,\,\,\,\,+2{\kappa }_{{\rm{PIC}}}[{\rm{PIC}}]-{{\bf{F}}}_{{\rm{vALK}}}{[{\rm{ALK}}]}_{{\rm{s}}}+{\kappa }_{{\rm{g}}}([{\rm{ALK}}]-{\overline{[{\rm{ALK}}]}}_{{\rm{obs}}}),\end{array}$$

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right]\left[{\rm{DO}}{{\rm{C}}}_{{\rm{l}}}\right]=\delta {\bf{G}}{r}_{{\rm{C}}:{\rm{P}}}-{\kappa }_{{\rm{l}}}\left[{\rm{DOC}}\right],$$

$$\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right]\left[{\rm{DO}}{{\rm{C}}}_{{\rm{r}}}\right]=\left(1-\eta \right){\kappa }_{{\rm{dC}}}\left[{\rm{DOC}}\right]-{\kappa }_{{\rm{r}}}\left[{\rm{DO}}{{\rm{C}}}_{{\rm{r}}}\right].$$

(3)

#### Anthropogenic DIC

To use DIC observations to constrain our inverse model, we have to take into account the changing DIC concentration owing to the invasion of anthropogenic CO_{2} into the ocean. To obtain a self-consistent estimate of the anthropogenic carbon signal, we performed a time-dependent simulation using equation (3). Starting from an assumed steady state, we time-stepped our carbon-cycle model forward in time from 1850 to 2020, using an implicit trapezoid-rule time-integration scheme for all terms except for the gas exchange, for which we used an explicit Euler forward scheme. In this calculation, we prescribed the surface SST according to a time-dependent reanalysis product (ref. ^{63}). The transient integration was carried out with a time step size of Δ*t* = 2 months. The atmospheric *p*CO_{2} was prescribed according to ref. ^{56} from 1850 to 2015 and according to ref. ^{57} from 2016 to 2020. We also simulated δ^{14}C to better calibrate the air–sea gas-exchange velocity as described below. The atmospheric δ^{14}C was prescribed according to ref. ^{64} for the period from 1850 to 2015 and according to ref. ^{65} from 2016 to 2020. To produce the initial conditions, we assumed that the system was in steady state in 1850 and used Newton’s method to find the steady state.

To calibrate the air–sea gas exchange parameterization, we re-optimized the scaling factor in the OCIM2 gas-exchange scheme by minimizing the misfit between our modelled δ^{14}C and the GLODAPv2 δ^{14}C data. See Extended Data Fig. 5a,b for the number of observations as a function of time. To compute the misfit, we sampled our model at the location and times of the bottle measurements in the GLODAPv2 database. Our calibration method followed an iterative two-step process in which we first optimized the air–sea gas exchange through a series of transient carbon-cycle simulations. After obtaining the optimal air–sea gas exchange, we subtracted the excess anthropogenic DIC from the GLODAPv2 measurements to produce an estimate of the natural background DIC for the year 1850. The resulting DIC data and optimal gas-exchange velocity were then used for the optimization of the biogeochemistry model (see the ‘Parameter estimation’ section). The optimized biogeochemical model was then used to produce an updated initial condition for the transient carbon-cycle simulation and a re-optimization of the air–sea gas-exchange velocity. We repeated this two-step process until we obtained self-consistent estimates of: (1) the optimal biogeochemical parameter values (Extended Data Table 1); (2) the biogeochemical state; (3) the scaling factor for the air–sea gas–transfer velocity, *a* = 0.234 cm h^{−1} (m s^{−1})^{−2}; (4) transient DIC; and (5) the transient δ^{14}C signal including the combined effects of radioactive decay, the Suess effect and the bomb radiocarbon signal. Extended Data Figure 5c shows a time series of the excess anthropogenic DIC concentration averaged over the top 100 m of the water column and for the water column below 100 m. By 2020, the vertical DIC gradient is reduced by 20%.

#### Oxygen model

Oxygen production is modelled by applying a ratio of oxygen to carbon (*r*_{O:C}) to the DIC assimilation rate (**G***r*_{C:P}). The ratio *r*_{O:C} is optimized in the process of inversion. We convert the DOC and POC remineralization rates (*ηκ*_{dC}[DOC] + *κ*_{r}[DOC_{r}] + *κ*_{l}[DOC_{l}] + *κ*_{p}[POC]) to an oxygen consumption rate using the same *r*_{O:C} ratio and gradually shut down oxygen consumption as the oxygen concentration falls below the critical value (O_{crit} = 5 mmol l^{−1}) using a hyperbolic equation (*R*([O_{2}]) = 0.5 + 0.5tanh[([O_{2}] − O_{crit})/[O_{2}]_{0}]), in which [O_{2}]_{0} (1 mmol l^{−1}) is used to remove the O_{2} dimension. Sea-to-air O_{2} flux (**F**_{O2}) is modelled according to OCMIP-2 (ref. ^{61}):

$$\begin{array}{l}\left[\frac{{\rm{d}}}{{\rm{d}}t}+{\bf{T}}\right][{{\rm{O}}}_{2}]={r}_{{\rm{O}}:{\rm{C}}}{\bf{G}}{r}_{{\rm{C}}:{\rm{P}}}+{{\bf{F}}}_{{\rm{O}}2}-{r}_{{\rm{O}}:{\rm{C}}}{\bf{R}}(\eta {\kappa }_{{\rm{d}}{\rm{C}}}[{\rm{D}}{\rm{O}}{\rm{C}}]\\ \,\,\,\,\,\,\,\,\,+{\kappa }_{{\rm{r}}}[{\rm{D}}{\rm{O}}{{\rm{C}}}_{{\rm{r}}}]+{\kappa }_{{\rm{l}}}[{\rm{D}}{\rm{O}}{{\rm{C}}}_{{\rm{l}}}]+{\kappa }_{{\rm{p}}}[{\rm{P}}{\rm{O}}{\rm{C}}])\end{array}$$

(4)

in which the matrix **R** is a diagonal matrix, whose elements are given by *R*([O_{2}]).

### Parameter estimation

The 21 adjustable parameters of the model (Extended Data Table 1) were estimated using a Bayesian inversion method. In this approach, the solutions to our model equations define the tracer fields as implicit functions of the adjustable parameters, which we then compare with the observations to construct a likelihood function. We obtain the P, C and O fields by finding the steady-state solutions of the governing equations for the P, C and O models (equations (2)–(4)). Because the governing equations for the P model are linear, their steady-state solution can be obtained efficiently by direct matrix inversion after setting the time derivatives in equation (2) to zero. We fix the atmospheric CO_{2} concentration at the preindustrial level (278 ppm) to compute the preindustrial sea-to-air CO_{2} flux (**F**_{CO2}). The steady-state solution for the C model is solved using Newton’s method because of nonlinearity in **F**_{CO2}. The governing equation for O is also nonlinear because of the hyperbolic function (**R**) that turns off oxygen consumption when oxygen concentration is critically low. We solve the oxygen equations using Newton’s method.

To find the most probable parameter values, we minimize the negative logarithm of the posterior probability function, which is equivalent to minimizing the negative log-likelihood because we log-transformed our parameters (except of the slopes of exponent *b*) so that they have flat priors:

$$f=\frac{1}{2}\left({{e}_{{\rm{DIP}}}^{{\prime} }{\bf{W}}}_{{\rm{P}}}{e}_{{\rm{DIP}}}+{e}_{{\rm{DIC}}}^{{\prime} }{{\bf{W}}}_{{\rm{DIC}}}{e}_{{\rm{DIC}}}+{e}_{{\rm{DOC}}}^{{\prime} }{{\bf{W}}}_{{\rm{DOC}}}{e}_{{\rm{DOC}}}+{e}_{{\rm{ALK}}}^{{\prime} }{{\bf{W}}}_{{\rm{ALK}}}{e}_{{\rm{ALK}}}+{e}_{{{\rm{O}}}_{2}}^{{\prime} }{{\bf{W}}}_{{{\rm{O}}}_{2}}{e}_{{{\rm{O}}}_{2}}\right)+{\rm{const.}},$$

(5)

in which the *e*_{X} are column vectors whose elements are given by the difference between modelled and observed concentrations, *e*_{X} = **H**_{X}[X_{mod}] − [X_{obs}], in which the X label denotes the specific tracer and **H**_{x} is a rectangular matrix that picks out the model grid boxes that have observations of tracer X. Because there are no measurements that precisely separate DOC into different pools according to their lability, we sum all three pools in the model (DOC, DOC_{l} and DOC_{r}) and compare the sum to observations. For DIC, we subtracted our estimated anthropogenic DIC from the bottle measurements in the GLODAPv2 database according to the location and time of measurement (see the ‘Anthropogenic DIC’ section). **W**_{X} is a precision matrix for tracer X and is defined in the following way:

$${{\bf{W}}}_{{\rm{X}}}=\frac{1}{{\sigma }_{{\rm{X}}}^{2}}{{\bf{V}}}_{{\rm{X}}},$$

(6)

in which **V**_{X} is a diagonal matrix with the fractional volumes of the model grid boxes (**V** = diag(Δ*V*_{i}/Σ_{i}Δ*V*_{i}), in which the subscript *i* is the index of the grid boxes that have at least one observation) and \({\sigma }_{{\rm{x}}}^{2}\) is the spatial variance of the observations, that is,

$${\sigma }_{{\rm{X}}}^{2}=([{{\rm{X}}}_{{\rm{obs}}}]-{\mu }_{{\rm{X}}}{)}^{{\prime} }\,{{\bf{V}}}_{{\rm{X}}}([{{\rm{X}}}_{{\rm{obs}}}]-{\mu }_{{\rm{X}}}),$$

(7)

with the spatial mean given by

$${\mu }_{X}=\frac{{{\bf{1}}}^{{\prime} }{{\bf{V}}}_{{\rm{X}}}[{{\rm{X}}}_{{\rm{obs}}}]}{{{{\bf{1}}}^{{\prime} }{\bf{V}}}_{{\rm{X}}}},$$

(8)

in which **V**_{X} is a diagonal matrix with the grid-box volumes and the subscript X represents the grid boxes that have observations of tracer X. The bold **1** represents a column vector. The transpose turns it into a row vector. Thus, the numerator yields the volume integral of X_{obs} and the denominator yields the total volume.

The optimization is conducted using MATLAB’s fminunc function, which is computationally efficient because we can supply hand-coded first and second derivatives of the objective function with respect to the adjustable parameters. The optimization generally takes fewer than 100 iterations. The most probable model parameter values are presented in Extended Data Table 1. Parameter error bars that correspond to ±1 standard deviations are calculated using Laplace’s approximation as described in ref. ^{66}.

### Calculation of carbon flux

The two-dimensional vertical-flux field (*f*_{POC}) is calculated by vertically integrating POC remineralization below the euphotic zone (\({f}_{{\rm{P}}{\rm{O}}{\rm{C}}}={\sum }_{i=1}^{z}{\kappa }_{{\rm{p}}}{\rm{P}}{\rm{O}}{{\rm{C}}}_{i}{\Delta V}_{i}{M}_{i}\), in which *i* represents the index for deep grid, Δ*V*_{i} is the volume of the *i*th grid box and *M*_{i} is a mask that is set to 1 below the euphotic zone and 0 elsewhere). In our model, POC is transported vertically in the water column and is not advected to neighbouring grids. This approximation is appropriate for the coarse horizontal resolution of our model. The non-advective-diffusive flux below the first two layers is calculated on the basis of the following power-law function (also known as the Martin curve function), *f*_{POC}(*z*) = *f*_{POC}(*z*_{0})(*z*/*z*_{0})^{−b}, in which *z*_{0} is the euphotic zone depth and *z* is the depth at which non-advective-diffusive flux is calculated. The exponent, *b*, depends linearly on the surface water temperature^{42}. The estimated slope and intercept (*b*_{C} and *b*_{Cθ}) for this linear relationship are presented in Extended Data Table 1. The advective-diffusive fluxes of labile and semi-labile organic carbon are calculated using an adjoint method, which tracks the export and subsequent remineralization of DOC, as described by ref. ^{31}. Only DOC respired below the depth of the euphotic zone is counted as export. The flux of TOC is the sum of the non-advective-diffusive flux and fluxes from labile DOC and semi-labile DOC. We ignore the export of refractory DOC because of its negligible contribution.

To compare the non-advective-diffusive flux to CMIP6 models at their consensus reference depth (about 100 m), we scale our estimated non-advective-diffusive flux using a power-law function with our optimized temperature-dependent *b* exponents.

To compare our export estimates to the geochemical ANCP estimates that are calculated at the base of spatially varying mMLDs obtained from a CESM simulation^{35}, we estimated the export fluxes of POC and semi-labile DOC to the depth *z* = mMLD. For cases in which mMLD is deeper than our euphotic zone depth, we scaled the fluxes using the power-law function with our optimized *b* exponents. For cases in which mMLD is above the base of the euphotic zone, we did not apply the power-law scaling because it tends to amplify errors. In those cases, we used the export flux at the base of the model’s euphotic zone for the comparison. Note that the contribution of labile DOC is ignored when scaling flux down to mMLD owing to the short e-folding decay time (12 h or 24 h).

### Uncertainty analysis

The uncertainty analysis is conducted in two ways. First, we use a Monte Carlo method whereby an ensemble of parameter values is drawn from a multivariate normal distribution whose mean is given by our estimated most probable parameter values and whose covariance matrix is given by the inverse of the matrix of second partial derivatives of the negative logarithm of the posterior probability distribution, that is, by the Hessian matrix. For each ensemble member, we solve the steady-state model equations and calculate the organic carbon fluxes. However, the parameters are so well constrained that their uncertainties are small, and the flux uncertainties calculated this way are small. Second, because the DIP uptake model is constructed with two different satellite NPP products (SeaWiFS CbPM and CAFE), and the 21 adjustable parameters are optimized for each NPP field (Extended Data Table 1), the influence of NPP fields on export fluxes are much larger than that of parameter uncertainties. Also, the e-folding remineralization time of labile DOC is prescribed at 12 h and 24 h. We, therefore, report flux uncertainties estimated from the results based on different initial NPP fields and on different labile DOC e-folding decay timescales. The distributions of the standard deviation of key outputs are illustrated in Extended Data Fig. 9.

### DOC sequestration time

To calculate the DOC sequestration time, we injected unit DOC pulses in the model euphotic zone and tracked this DOC as it was transported by the circulation, respired into DIC (according to the timescale given in Extended Data Table 1) and then transported back to the surface, where it was rapidly removed with a loss frequency of (1 day)^{−1}. We then spatially integrated the removal rate for each DOC pulse to obtain residence-time distributions for the DOC exported from the surface of each water column.

### Sequestration-time-partitioned distribution functions

To compute the sequestration-time-partitioned distribution functions, we use the three-dimensional organic carbon respiration rate to construct a Dirac δ-function pulse of labelled regenerated inorganic carbon. The resulting tracer field is then transported using the circulation model until it is removed in the 36.1-m-thick surface layer of the model using a loss frequency of (1/500) year^{−1}. We integrate the system forward in time for 10,000 years, by which time all of the regenerated-carbon pulse has left the system. We use a second-order-accurate trapezoidal integration rule starting with a time-step size of less than 10^{−4} years and gradually increase it to 10 years by the end of the simulation. A sequestration-time density distribution function is obtained by globally integrating the loss rate and the cumulative distribution function is then obtained by integrating the density function for progressively longer times. To obtain the cumulative sequestration-time distribution for the stock of regenerated DIC, we first integrate the tracer field over the whole volume of the ocean and then integrate the resulting stock for progressively longer sequestration times. By year 10,000, the resulting integral is equal to the global inventory of regenerated DIC.

Source link