Argo: the challenge of continuing 10 years of progress

In only ten years, the Argo Program has grown from an idea into a functioning global observing system for the subsurface ocean. More than 3000 Argo floats now cover all the oceans from the seasonal ice of the Antarctic to the tropics to the Arctic seas. With these instruments operating on 10-day cycles, the array provides 9000 temperature/salinity/depth profiles every month that are quickly available via the GTS or the internet. Argo is recognized as a major advance for oceanography, and a success for Argo’s parent programs, GODAE and CLIVAR, and for the Global Earth Observation System of Systems. The value of Argo data in ocean data assimilation (ODA) and other applications is being demonstrated, and will grow as the dataset is extended in time and as experience of using the data set leads to new applications. The spatial coverage and quality of the Argo dataset are improving, with consideration being given to sampling under seasonal ice at higher latitudes, in additional marginal seas, and to greater depths. Argo data products of value in ODA modeling are being developed, and Argo data are being tested to confirm their consistency with related satellite and in situ datasets. Maintenance of the Argo Program for the next decade and longer is needed for a broad range of climate and oceanographic research and for many operational applications in ocean state estimation and prediction.


Introduction
Ocean data assimilation (ODA) models of the historical ocean from the 1950s to the 1990s are limited to using subsurface datasets that are "opportunistic" in nature rather than purposefully designed for regular global coverage, and are therefore deficient in several respects. The historical data were collected mostly for regional objectives and were restricted to the tracks of research vessels and commercial ships. These limitations ensured sparseness and inhomogeneity in spatial and temporal data distribution, even in the relatively well-sampled northern hemisphere oceans. South of 30°S few measurements were collected. In addition to sparseness, the historical data are of uneven quality, with a mixture of instrument types and resultant problems due to systematic errors (e.g. Wijffels et al., 2008).
The Argo Program presents a unique opportunity to correct many of these shortcomings in order to obtain more continuous, consistent, and accurate sampling of the present-day and future state of the oceans. Profiling float technology removes the constraint of having to have a ship present, making it possible to obtain high quality data anywhere at any time. Argo is designed (Roemmich et al., 1999) to observe large-scale (seasonal and longer, thousand kilometer and larger) sub-surface ocean variability globally. Its high quality temperature and salinity sensors and its comprehensive data management system combine to produce climate-quality data, with new techniques being developed to identify and minimize systematic errors. Argo has achieved a global array of about 3,000 profiling floats, providing 9,000 temperature/salinity profiles per month (Fig 1), far surpassing its historical precursors in global data coverage and overall accuracy. Here we summarize plans for enhancing Argo's value in the coming years, with attention to ODA applications. Plans include improvements to data coverage through array design and implementation, and improvements to data quality, described in the next section. In section 3 we describe Argo data products that are needed for use in ODA models and for evaluation of those models. Section 4 addresses the need to examine the consistency of Argo and in situ and satellite-derived surface datasets. This is another key issue for integrating global observations through ODA models.

The Evolution of Argo
Argo is a broadscale array, designed to accumulate about 100 profiles per season in every 10° square of ocean. Mesoscale noise is reduced by averaging over many profiles in a region, to estimate large-scale variability. The design (Roemmich et al., 1999) was based on statistics from satellite altimetric height and from earlier subsurface ocean datasets. The prescribed 3° by 3° by 10-day spacing of the array between 60°S and 60°N decreases the distance between instruments with increasing latitude, but not as steeply as the statistics of variability indicate appropriate (e.g. Stammer, 1997). The present design is a compromise made to provide a high signal-to-noise ratio for known tropical climate variability (Section 3), including El Niño/Southern Oscillation signals, while taking a more exploratory approach to the high latitude oceans.
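As a rough illustration of the geometry behind this design, the Python sketch below computes the east-west distance spanned by 3° of longitude at several latitudes, and the number of floats per degree of latitude implied by 3° x 3° sampling. The value of 111.2 km per degree (a spherical-Earth approximation), the function names, and the `ocean_lon_extent_deg` input are assumptions introduced for this example; they are not taken from the Argo design documents.

```python
import math

# Assumed spherical-Earth approximation for the length of one degree of latitude.
KM_PER_DEG_LAT = 111.2


def zonal_spacing_km(lat_deg, spacing_deg=3.0):
    """East-west distance (km) spanned by `spacing_deg` of longitude at `lat_deg`."""
    return spacing_deg * KM_PER_DEG_LAT * math.cos(math.radians(lat_deg))


def floats_per_degree_latitude(ocean_lon_extent_deg, spacing_deg=3.0):
    """Floats needed in a 1-degree latitude band for spacing_deg x spacing_deg sampling.

    `ocean_lon_extent_deg` is the (hypothetical) number of degrees of longitude
    occupied by open ocean at that latitude.
    """
    return ocean_lon_extent_deg / spacing_deg / spacing_deg


def equal_area_floats_per_degree(ocean_lon_extent_deg, lat_deg, spacing_deg=3.0):
    """Equal-area variant: the fixed-degree requirement scaled by cos(latitude)."""
    base = floats_per_degree_latitude(ocean_lon_extent_deg, spacing_deg)
    return base * math.cos(math.radians(lat_deg))


for lat in (0.0, 30.0, 60.0):
    # Physical spacing shrinks toward the poles even though the spacing in degrees is fixed.
    print(lat, round(zonal_spacing_km(lat), 1))
```

With a fixed spacing in degrees, the physical distance between instruments at 60° latitude is half that at the equator, which is the latitude dependence described above.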
Using 5 years of global Argo data accumulated from 2004-2008, and simultaneous altimetric height data, the Argo design is being revisited. An important question is whether interannual variability at middle and high latitudes is being resolved adequately. Argo should be sustained in order to increase the value of its present 5-year global time-series, but it can also evolve for greater efficiency and effectiveness.
Beyond the design of the Argo array, recent developments in profiling float technology create opportunities for extending Argo's core objectives. Some floats are now active in the seasonally ice-covered zones poleward of 60°. What should be Argo's sampling plan for the high latitude oceans, and how many additional floats are needed there? Glider technology (Davis et al., 2002) makes systematic sampling of ocean boundary currents a possibility. What are the global requirements for high resolution sampling in the boundary currents and marginal seas, to complement the broadscale Argo array? New sensors for biological and geochemical parameters, for wind and rainfall, and for better sampling of temperature and salinity structure in the ocean's surface layer could increase Argo's value, but also its cost. The addition of oxygen sensors to Argo floats holds high promise for addressing global carbon cycle issues (Gruber et al., 2007).
Another key objective in Argo implementation is to minimize systematic errors in the data stream. Two such systematic errors in Argo data have been identified and to a large extent corrected. First, over a period of years, slow drift in conductivity measurements occurs in some floats, due to bio-fouling or other causes. This drift can be corrected through careful statistical comparison of sequences of float salinity values to nearby high-quality profiles (Wong et al., 2003), which might consist of either shipboard CTD data or nearby float data. Second, pressure offsets have been identified in some floats (e.g. Willis et al., 2007, Uchida and Imawaki, 2008), resulting from pressure sensor drift and from errors in float software. Such systematic errors, some of which were not anticipated, highlight the need for rapid identification and prompt correction of hardware errors or software flaws. A promising technique for detecting systematic errors is comparison of satellite altimetric height with Argo steric height from sequences of profiles, flagging large differences for more careful examination (Guinehut et al., 2008). Another technique is the use of historical climatologies for identifying statistical outlier profiles and instruments. This capability is being improved as Argo-era climatologies replace the earlier ones with more appropriate mean and variability statistics.
The Argo array is not yet "complete" with respect to its original design and objectives. The highest priority for Argo's international partnership is to implement further improvements in data coverage and quality needed to meet these requirements. At the same time as the Argo Program is being improved and maintained for its original goals, extensions to the array should be carefully introduced to increase Argo's long-term value.
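The climatology-based screening for statistical outliers mentioned above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name, the three-standard-deviation threshold, and the sample values are hypothetical and do not describe the procedure actually used in Argo quality control.

```python
import numpy as np


def flag_outliers(profile, clim_mean, clim_std, n_std=3.0):
    """Flag profile levels that depart from a climatology by more than n_std deviations.

    profile, clim_mean, clim_std: values on the same depth levels.
    n_std: assumed threshold (hypothetical; not taken from the text).
    Returns a boolean array, True where a level looks like an outlier.
    """
    profile = np.asarray(profile, dtype=float)
    clim_mean = np.asarray(clim_mean, dtype=float)
    clim_std = np.asarray(clim_std, dtype=float)
    z = (profile - clim_mean) / clim_std  # standardized departure from climatology
    return np.abs(z) > n_std


# Hypothetical temperatures (°C) on three levels, with a made-up climatology.
flags = flag_outliers([10.0, 12.0, 20.0], [10.0, 11.0, 12.0], [1.0, 1.0, 1.0])
print(flags)
```

The usefulness of such a check depends on how well the climatological mean and variability represent the present-day ocean, which is why Argo-era climatologies improve this capability.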

The Argo-era global ocean
A key step in demonstrating the value of Argo is to show how well it represents the present-day ocean, including the mean, annual cycle and large-scale variability. In modeling applications, data climatologies are used as initial states for predictive models, or as mean states with known variance, to limit unrealistic model variability and trends. Climatologies based on Argo data are more realistic representations of the modern ocean than historical data climatologies, for several reasons. First, the oceans have changed substantially in the past several decades, becoming warmer overall (e.g. Domingues et al., 2008, Levitus et al., 2005), and with regional changes in temperature/salinity characteristics (Curry et al., 2003, Wong et al., 1999, Boyer et al., 2005). Second, the Argo-era ocean is better sampled than the historical ocean, especially in the southern hemisphere (Fig 3), leading to lower estimation errors. For the first time it is possible to construct mean temperature and salinity fields for the ocean over a given period of time. Historical data climatologies are created by blending regional data collections from different eras, and as a consequence the end products are weighted toward different years in different regions. The sparseness of historical data and their spatial and temporal inhomogeneity make it difficult to assign error bounds to these climatologies (e.g. Roemmich and Sutton, 1998). Finally, care taken to minimize systematic errors in the Argo dataset leads to Argo-only climatologies that are not contaminated by mixing of different instrument types. The difference between this Argo-era mean and WOA01, which is based on data spread over more than 50 years, is especially notable south of 30°S. There the zonal mean difference is 5 dyn cm between 40 and 50°S, and differences are 10 cm or more in some areas. Main causes of the Argo-minus-WOA01 differences are decadal change and mapping errors due to sparseness in the historical dataset. For modeling the present-day ocean, Argo climatologies will replace the historical data products to provide initialization and background states that are consistent with the era that is represented.
The effectiveness of Argo in resolving large-scale ocean variability can be tested by using satellite altimetric height as a proxy for steric height. The spatial correlation of steric height is very similar to that of altimetric height, though the former is not yet as well-determined by existing global datasets. By sub-sampling altimetric height fields at the locations of Argo profiles, interpolating the sub-sampled data, and then comparing to the full altimetric height dataset, both the signal and noise of Argo fields can be estimated. Many such sampling experiments can be carried out, to test the impact of Argo's increasing coverage between 2004 and 2007, to test its ability to resolve signals of varying space and time-scales, and others. One such experiment is illustrated in Fig 5. Here the goal is to estimate Argo's ability to detect interannual variability over 15 years of sustained sampling, by assuming that the spatial coverage of the year 2007 is maintained. The 15-year gridded altimetric height record (Ducet et al., 2000) from 1993-2007 is sub-sampled each year at the location and year-day of the 2007 Argo dataset. The sub-sampled anomalies from the 15-year mean and annual cycle are objectively interpolated, and then both full and sub-sampled anomaly grids are smoothed with a 10° x 10° x 3-month running mean. After smoothing, the temporal RMS signal is estimated from the full dataset, and the RMS noise from the full-minus-sub-sampled differences. Fig 5 shows the zonal means of the RMS signal and the RMS noise. As expected, the signal-to-noise ratio is highest in the tropics due to enhanced signal and reduced noise. Fig 5 also illustrates how the interannual variability grows with the duration of the dataset as a longer-term mean is estimated and removed. Of course, the error in objectively interpolated Argo data can also be estimated directly, but such error estimates are dependent on details of the spatial correlation statistics used in the interpolation (e.g. Davis, 1998). With imperfect statistics, the altimetric height proxy experiments provide a realistic and independent alternative.
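The logic of such a sub-sampling experiment can be sketched with synthetic data. The Python example below is a simplified illustration, not the procedure used for Fig 5. It assumes a one-dimensional space axis, a made-up anomaly field, and randomly chosen sample positions, and it swaps in simple linear interpolation (`np.interp`) for the objective interpolation described above. All names and parameter values are hypothetical.

```python
import numpy as np


def boxcar(a, width, axis):
    """Running mean of `width` points along `axis`, keeping only fully covered points."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), axis, a
    )


def sampling_experiment(full, sample_idx, x, space_width, time_width):
    """Estimate RMS signal and RMS sampling noise from a fully known anomaly field.

    full: (n_time, n_x) anomaly field standing in for gridded altimetric height.
    sample_idx: one index array per time step, the positions that are "observed".
    Linear interpolation is an assumed stand-in for objective interpolation.
    """
    sub = np.empty_like(full)
    for t in range(full.shape[0]):
        idx = np.sort(sample_idx[t])
        sub[t] = np.interp(x, x[idx], full[t, idx])  # rebuild field from samples only

    def smooth(f):
        # Smooth in space, then in time, as with a space-time running mean.
        return boxcar(boxcar(f, space_width, axis=1), time_width, axis=0)

    full_s, sub_s = smooth(full), smooth(sub)
    signal = np.sqrt(np.mean(full_s ** 2, axis=0))           # temporal RMS of full field
    noise = np.sqrt(np.mean((full_s - sub_s) ** 2, axis=0))  # RMS of full-minus-sub-sampled
    return signal, noise


# Synthetic test field: a large-scale wave plus small-scale variability.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 360.0, 361)
t = np.arange(60)
large = np.sin(2 * np.pi * x[None, :] / 180.0) * np.cos(2 * np.pi * t[:, None] / 24.0)
full = large + 0.3 * rng.standard_normal((t.size, x.size))
sample_idx = [rng.choice(x.size, size=120, replace=False) for _ in t]

signal, noise = sampling_experiment(full, sample_idx, x, space_width=11, time_width=3)
print("mean signal-to-noise ratio:", signal.mean() / noise.mean())
```

Because the full field is known everywhere, the noise estimate does not depend on assumed correlation statistics, which is the advantage of the proxy approach noted above.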
In addition to the interpolated Argo-only dataset (Roemmich and Gilson, 2008) used in Fig 4 for purposes of illustration, similar products are being developed by groups around the world. In order to promote their dissemination and usefulness, the Argo Steering Team (http://www-argo.ucsd.edu) is identifying global Argo analyses that are available for distribution. There is much to be gained from comparing techniques and results of different analyses as well as from developing products for different applications. Most ODA models assimilate interpolated data rather than raw datasets. It is essential to provide these applications with accurate datasets and best estimates of the error covariance. Moreover, while ODA models continue to develop, it is critical to provide individual datasets that are extensive enough for statistical interpolation, to compare with multi-dataset model results and with data-withholding model experiments.

Argo and ocean surface datasets
Argo is the dominant subsurface dataset for the present-day ocean, but ODA models assimilate sea surface datasets as well, including sea surface height, sea surface temperature, and air-sea fluxes of heat, water and momentum. It is important to examine the consistency of these datasets with Argo and with one another where they have complementary or overlapping information content. The ODA models allow for random data errors, but problems may arise when systematic errors create inconsistencies between data types. Sea surface temperature (SST) is estimated from satellite measurements, using ocean surface drifters (at ~1 m depth) and other in situ SST measurements for bias correction (e.g. Reynolds et al., 2002). Argo profiles, which collect their shallowest data around 5 m, are not used in most SST products at present. Questions include the magnitude of stratification between the depth of drifter SST measurements and the shallowest Argo data and whether Argo floats, which are more plentiful than surface drifters, are useful for SST estimation. Similarly, Argo's potential usefulness in combination with a scheduled sea surface salinity satellite mission is of interest. To investigate the issue, the surface drifter and Argo datasets were searched to identify nearby pairs of measurements. There were 21,100 Argo profile/drifter data pairs located within 60 km "scaled-distance" of one another during the period 2004-2008. Here "scaled-distance" includes a time difference term, with 1 day equivalent to 10 km.
A global study of SSH and steric height variations during 1993-2003 (Guinehut et al., 2006) revealed high correlation between the two with some systematic differences due to barotropic ocean forcing. The consistency of global SSH variability in 2003-2007 with the component changes in Argo steric height and ocean mass (from GRACE) was examined by Willis et al. (2008). The annual cycle in globally-averaged SSH was consistent with the sum of steric and mass-related components. However, the four-year increase in SSH by about 12 mm could not be seen in either component. A longer time-series is needed to be more definitive. These and other examples illustrate the strong need to close the ocean's mass and heat budgets with careful measurements of all components over an extended period of time.
As an example of the close relationship between SSH and steric height, Fig 7 shows the zonally-averaged annual cycle of both quantities, using AVISO SSH (Ducet et al., 2000) and steric height from Roemmich and Gilson (2008). In spite of the high similarity in the annual variations of SSH and steric height, there are also significant differences between them, for example in the amplitudes at about 10°N and 40°S. A good test for ODA models is to see whether they reproduce the annual cycles in datasets with overlapping information content and can successfully rationalize differences such as those seen in Fig 7.

Discussion
A key goal of the Argo Program is to provide a global dataset of value for assimilation by ODA models, but also capable of testing the results of those models. The Argo array now includes about 3000 instruments, providing 9000 globally-distributed temperature and salinity profiles monthly, from the sea surface to mid-ocean depth. Nearly 5 years of Argo data, including 400,000 profiles, have been collected since sparse global coverage was achieved in early 2004, comprising stable estimates of the mean and annual cycle for this period. All data are freely available, with about 90% of profiles accessible at two Global Data Assembly Centers within 24 hours of float-surfacing. Argo's ground-breaking open access data policy is central to the value of the program and to building its international partnership. The Argo dataset has been used in a wide variety of basic research problems, and data quality exceeds original expectations. The Argo Program has made rapid progress in the decade since its planning began. Further increases in float numbers and improved coverage in the southern hemisphere, better ability to identify and correct systematic errors, and greater uniformity in production and release of delayed-mode data are all required to achieve the core objectives of the program. The broad Argo user community is needed to demonstrate the high value of the array, and the international Argo partnership must prove its ability to maintain the array for a decade and beyond.
As the Argo era of quasi-uniform, high quality global sampling lengthens, it is important to review and improve the design and objectives of Argo. Low latitude interannual variability is well-resolved in the present dataset, while additional floats are needed at southern latitudes. Continuing advances in profiling float, ocean glider, and sensor capabilities raise new challenges for expansion of Argo's core activities.
Deeper profiling and sampling of seasonal ice-zones, marginal seas, and boundary currents could all extend Argo's limits. Inclusion of new sensors could add important geochemical and biological dimensions. In each case, energy and other added costs need to be weighed against the benefits, and new resources are needed to cover any new costs. An important challenge for Argo is to expand its constituency by demonstrating the value of the dataset in a growing number of applications while maintaining the high data quality and spatial coverage needed for Argo's core objectives.

Figure 1: Map of the number of good Argo profiles obtained in each 1° x 1° box in the period from January 2004 to July 2008 (top). Number of floats per month providing good data and number of good profiles per month (bottom, red and green bars).

Figure 2: The number of Argo floats per degree of latitude providing good profile data, excluding those in marginal seas, is shown by the black line (5° meridional smoothing). Argo's design requirement for 3° x 3° open ocean sampling is shown in red. The blue line indicates what would be required for equal area sampling, multiplying the red line by the cosine of latitude.

Figure 3: Location of all 4093 temperature/salinity stations to depths of at least 1000 m during austral winter (July/August/September), south of 30°S from 1950-2000 (red dots, source: World Ocean Database), compared to 6291 Argo station locations (black dots) from July/August/September 2008.

Figure 4: Contour lines indicate the steric height of the sea surface from Argo data, 0/2000 dbar (dyn cm), 2004-2008 mean. Color shading indicates the difference in steric height, Argo-minus-WOA01.
An example of the differences between Argo (Roemmich and Gilson, 2008) and the WOA01 historical data climatology (Conkright et al., 2002) is shown in Fig 4. Mean steric height from Argo, 0/2000 dbar, is shown averaged over the period 2004-2008, during which the Argo array has had global coverage.

Figure 5: Zonally-averaged large-scale (10° x 10° x 3-months) non-seasonal SSH signal (blue) and Argo sampling noise (red) for 15 years of sustained Argo sampling at the 2007 level, estimated from satellite altimetry (see text). The black line shows how the apparent signal is reduced in a shorter (4 year) record.

Figure 6: Argo-minus-surface drifter mean and standard deviation of temperature difference (°C) as a function of "scaled distance" (see text), in 5 km bins, for 21,100 nearby pairs of observations.
Fig 6 shows the means and standard deviations of Argo-minus-drifter temperature as a function of distance, sorted into 5 km bins. The number of nearby pairs increases from 214 in the 0-5 km bin to 2,987 pairs in the 55-60 km bin. The mean differences are small, 0.02°C and less, and not statistically significant. While there may be stratification between 1 and 5 m depth in low-wind daytime conditions, Argo data likely are a good approximation of bulk SST at most times. The comparison (Fig 6) suggests that Argo data may be valuable for SST estimation on a global basis, but this issue needs further study. The relationship of satellite-derived sea surface height (SSH) and steric height variability is central to Argo.
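The pairing and binning behind the Argo/drifter comparison can be sketched as follows. The text states only that "scaled-distance" includes a time difference term with 1 day equivalent to 10 km. The additive combination of spatial and temporal terms used here is an assumption, as are the function names, the Earth radius value, and the use of a population standard deviation.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius
KM_PER_DAY = 10.0         # from the text: 1 day is treated as equivalent to 10 km


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two positions given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def scaled_distance_km(lat1, lon1, day1, lat2, lon2, day2):
    """Spatial separation plus a time term (ASSUMPTION: the two terms are simply added)."""
    return great_circle_km(lat1, lon1, lat2, lon2) + KM_PER_DAY * abs(day2 - day1)


def bin_differences(pairs, bin_km=5.0, max_km=60.0):
    """Group (scaled_distance_km, temperature_difference) pairs into distance bins.

    Returns {bin_start_km: (count, mean_difference, std_difference)}.
    Pairs at or beyond max_km are discarded.
    """
    bins = {}
    for dist, dtemp in pairs:
        if dist < max_km:
            start = math.floor(dist / bin_km) * bin_km
            bins.setdefault(start, []).append(dtemp)
    return {
        start: (len(v), statistics.mean(v), statistics.pstdev(v))
        for start, v in sorted(bins.items())
    }


# Hypothetical pairs: (scaled distance in km, Argo-minus-drifter temperature in °C).
example = [(2.0, 0.1), (3.0, 0.3), (57.0, -0.2), (75.0, 1.0)]
print(bin_differences(example))
```

If the original analysis combined the spatial and temporal terms differently, for instance in quadrature, only `scaled_distance_km` would need to change.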

Figure 7: The annual cycle of zonally-averaged steric height (0/2000 dbar) from Argo data (left panel), 2004-2007, is compared to that from altimetric height (right panel, AVISO product) from the same period.
Air-sea exchanges of heat and freshwater on seasonal time-scales are nearly balanced by oceanic storage (Gill and Niiler, 1973), with seasonal advection being a small residual term in ocean interiors. Roemmich and Gilson (2008) found good agreement seasonally, on hemispheric and global scale, between the SOC/NOC historical data climatology of air-sea fluxes (Josey et al., 1998) and Argo-derived heat storage. Regionally, maximum seasonal amplitudes in heat storage at 40°N and 35°S exceeded the amplitudes of air-sea flux by about 25 W/m², possibly due to seasonal displacement of the zonal oceanic boundary current fronts at those latitudes. Comparison of air-sea fluxes of freshwater with oceanic freshwater storage is more problematic. Patterns of freshwater storage are spatially more complex than heat storage and estimates of evaporation and precipitation are subject to large errors. A challenge for ODA models is to exploit Argo's global measurements of salinity for improved estimation of variability in the hydrological cycle.
Fig 2). Indeed, over the southern hemisphere the present Argo array is about 750 floats short of its designed number. This shortfall in float numbers shown in Fig 2, in spite of Argo having achieved 3000 active instruments, is due to a combination of factors. Many floats are deployed in marginal seas or poleward of 60°, which are of value but were not considered in Argo's original design. Other instruments are not producing good profile data due to technical failures. The latter problem is being corrected through deployment of improved instruments. The former requires a greater commitment of floats by Argo national programs, with increased attention to southern hemisphere deployments. Recent increases in fuel costs emphasize the value of autonomous instruments, but occasional ship visits to the remote regions of the oceans are still essential for float deployment. Argo is producing more profile data south of 30°S during a single austral winter than in the entire pre-Argo history of oceanography. But it is not yet achieving its ambitious sampling objectives there.