DATA ASSEMBLY AND PROCESSING FOR OPERATIONAL OCEANOGRAPHY: 10 YEARS OF ACHIEVEMENTS

Data assembly and processing centers are essential elements of the operational oceanography infrastructure. They provide data and products needed by modeling and data assimilation systems; they also provide products directly useable for applications. The paper will discuss the role and functions of the data centers for operational oceanography and describe some of the main data assembly centers developed during GODAE: Argo and Coriolis data centers for in-situ data, SSALTO/DUACS and NAVOCEAN for altimetry and GHRSST for SST. An overview of other data centers (wind and fluxes, ocean colour, sea ice) will also be given. Major progress has been achieved over the past 10 years to validate, intercalibrate and merge altimeter data from multiple satellites. Accuracy and timeliness of products have been improved and new products have been developed. The same is true for sea surface temperature data through the GHRSST Pilot Project. A breakthrough in processing, quality control and assembly for in-situ data has also been achieved through the development of the real time and delayed mode Argo data system. In-situ and remote sensing data are now systematically and jointly used for calibration, validation and long term monitoring of the quality and consistency of the global ocean observing system (e.g. altimetry and tide gauges, Argo and altimetry, in-situ and satellite sea surface temperature data). Main results will be illustrated. We will also review the development and use of products merging in-situ and remote sensing data (e.g. surface currents, 3D temperature and salinity fields derived from altimetry, joint use of sea surface temperature, ocean colour and altimeter data). Future issues and main prospects will be discussed.


Introduction
Operational oceanography critically depends on the near real time availability of high quality in-situ and satellite data with a sufficiently dense space and time sampling. This first requires an adequate global ocean observing system. An initial design of a permanent, global and real time observing system was proposed at the OceanObs 1999 conference and endorsed by GOOS, GCOS and JCOMM. Specific GODAE requirements were presented in the GODAE strategic plan (GODAE, 2001) and in Le Traon et al. (2001). They are refined and detailed in Harrison et al. (2008), Wilson et al. (2008) and Oke et al. (2008) (this issue). Satellite data include altimeter, sea surface temperature, ocean colour, sea ice and scatterometer data. The main in-situ data include Argo, data from research vessels and ships of opportunity, surface drifters, gliders, multidisciplinary moorings and tide gauges. There has been major progress over the past 10 years to implement the initial system. The main challenge today is to complete the initial design and to ensure its long-term sustainability. A data processing, validation and dissemination infrastructure must also be set up to ensure that high quality data sets and data products are delivered in a timely fashion according to requirements from modeling and data assimilation centers and other metocean applications. Improvements in data and product serving capabilities are reviewed by Blanc et al. (2008) and Haines et al. (2008) (this issue). This paper focuses on data processing issues; it reviews the major achievements made over the past 10 years to improve the quantity, quality and timeliness of data sets and products needed by global operational oceanography systems.

Data processing issues for operational oceanography
Data processing centers or thematic assembly centers (TAC) are an essential component of the global operational oceanography infrastructure. The quantity, quality and availability of data sets and data products directly impact the quality of ocean analyses and forecasts. Products derived from the data themselves are also used directly for applications (e.g. in the case of a parameter observed from space at high resolution whose measurement noise can be reduced through adequate data processing). More effective data assembly, more timely data delivery, improvements in data quality, better characterization of data errors, and development of new or high level data products are among the key data processing needs for operational oceanography. Thanks to GODAE and related activities, major progress has been achieved, in particular, on high level processing issues (validation, intercalibration, error characterisation, new products). These improvements are mandatory for an effective use of data in assimilation systems and for services.
The role of data processing centers is to provide modelling and data assimilation centers with the "best" data products required for data assimilation both in real-time and delayed mode. Operational oceanography needs two types of data. Firstly, there are those data delivered in near real time that are required for daily/weekly forecasting activities. Secondly there are data that are subject to greater quality control and delivered in delayed mode. These are particularly valuable data for reanalysis work and to assist seasonal forecasting and climate monitoring/prediction where long term stability is essential. This implies clearly defined quality control procedures, validation processes with error characterization as well as reprocessing capabilities for re-analysis purposes.
The data processing centers should also provide data synthesis, climatologies, high level data products (e.g. merging of multiple satellite data sets). Those high level products are needed for applications and research (e.g. a merged altimeter surface current product for marine safety or offshore applications) and are essential to the long-term monitoring of the state of the ocean (e.g. to derive climate indices on the state of the ocean). They are also useful to validate data assimilation systems (e.g. statistical versus dynamical interpolation) and complement products derived through modelling and data assimilation systems.
Data processing centers should also monitor the performance of the observing system (e.g. data availability, possible degradation of performance and/or sampling) and inform their users of the observing system status. It is also essential to establish strong relationships between data processing centers and data assimilation centers. These are needed, in particular, to organize feedback on the quality control performed at the level of data assimilation centers (e.g. comparing an observation with a model forecast), on the impact of data sets and data products in the assimilation systems, and on new or future requirements (e.g. new data products, error characterization). As previously mentioned, data centers should also establish direct links with intermediate or end users to develop products directly useable for applications. Data processing centres thus focus on the science and engineering associated with understanding the information and knowledge content of diverse observations. This approach allows them to provide authoritative uncertainty estimates with observational data products, which are critical to their effective use in data assimilation systems.

Main achievements over the last 10 years
Over the past 10 years, the great technological advances in data storage, telecommunications and IT infrastructure at a global level have provided the platform on which real time and near-real time ocean data centres have been considerably enhanced in order to meet the needs of operational oceanography applications. Consolidated and more robust user requirements from ocean and atmospheric forecasting systems have also been derived, based in part on the activities of GODAE, and data assimilation systems have matured to the point where the use of large observational data sets in an operational context is now feasible. As a consequence, new or improved (e.g. better quality and timeliness, better characterization of data errors) input data sets and data products are now available. We will review here the main achievements. The review is far from exhaustive and we will focus on a few examples of data centers that were deeply involved in GODAE and strongly developed interfaces with modeling and assimilation centers.

Argo: a breakthrough in data management and data processing
During the past decade Argo has revolutionized the distribution of ocean data within the research community. Researchers used to go to sea, acquire data, process them, submit one or more publications using these data and finally submit them to their national center, which would periodically transfer the new data to a World Data Center. WOCE had managed to reduce this exclusivity delay to about 2 years. With Argo, it was stated from the beginning that the data would be freely accessible in real-time, both on the GTS and on the Internet, to serve the meteorological and oceanographic communities, both operational and research.
The issue for Argo was, apart from the obvious need to populate and maintain the ocean with suitable observing floats, to set up an information system able to provide a single entry point to data processed in national centers applying commonly defined quality control procedures at all steps of data processing. Two data streams have been identified. The first is a real-time data stream where data are free from gross errors and may be corrected in real-time when the correction is known. The second stream operates in a delayed mode where data profiles have been subjected to detailed scrutiny by oceanographic experts and the adjusted salinity has been estimated by comparison with high quality ship-based CTD data and climatologies. Two main actors were identified in Argo data management:
• DACs (Data Assembly Centers) receive the data via satellite transmission, then decode and quality control the data according to a set of 19 real time automatic tests agreed within Argo. Erroneous data are flagged, corrected if possible, and then passed to the two global data centers and on to the GTS (Global Telecommunication System). A float is under the responsibility of a single DAC at all stages of its processing.
• GDACs (Global Data Centers), located at Coriolis/France and FNMOC/USA, are in charge of collecting the processed Argo data from the 11 DACs and of providing users with a unique access point to the best version of each Argo profile. Data are available in a common NetCDF format, both via FTP and the WWW. The two GDACs synchronize their databases every day to ensure they provide access to the same dataset.
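As an illustration of the kind of automated real-time checks run by the DACs, the sketch below implements two plausible tests (a gross range check and a spike check) in Python. The function names and numerical limits are illustrative assumptions, not the official Argo test specifications.

```python
import numpy as np

# Illustrative sketch of two automated real-time checks of the kind
# applied by Argo DACs; limits here are plausible placeholders, not
# the official Argo test parameters.

def gross_range_test(temp_c):
    """Flag temperatures outside a plausible global ocean range."""
    return (temp_c < -2.5) | (temp_c > 40.0)

def spike_test(values, threshold):
    """Flag a point that deviates sharply from both neighbours.

    The spike size is the distance of v[i] from the mean of its
    neighbours, minus half the neighbour-to-neighbour spread.
    """
    v = np.asarray(values, dtype=float)
    flags = np.zeros(v.shape, dtype=bool)
    for i in range(1, len(v) - 1):
        spike = abs(v[i] - 0.5 * (v[i - 1] + v[i + 1])) \
                - 0.5 * abs(v[i + 1] - v[i - 1])
        flags[i] = spike > threshold
    return flags

profile = np.array([12.1, 12.0, 11.8, 17.5, 11.5, 11.4])
print(spike_test(profile, threshold=2.0))  # flags only the 17.5 value
```

In the real system, a profile failing such a test is flagged rather than discarded, so delayed-mode experts can later revisit the decision.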
This architecture has proven to be efficient, robust, able to serve both the operational and research communities, and sustainable in the long term as it relies on professional data centres. This model has been adopted by other international programs such as GOSUD (Global Oceanographic Surface Underway Data) and OceanSITES (deep ocean Eulerian observatories), which both have DACs and GDACs and have extended the Argo NetCDF format to handle their data.
A primary objective of Argo's data management system is also to provide climate-quality (delayed-mode) Argo data through the combined use of statistical tools and a strong involvement of scientific experts in the quality control process. The central task is estimation of the slow (multi-year) drift in salinity due to biofouling or other causes. Fortunately, the accuracy and stability of Argo salinity sensors exceed original expectations, with most instruments showing no detectable drift for the first several years of deployment. Successful development of stable low-power salinity sensors by Sea-Bird Electronics, in partnership with the Argo Program, has made high data quality possible. The first step in delayed-mode processing is comparison of a sequence of Argo profiles from each instrument with nearby high-quality data (Wong et al., 2003; Owens and Wong, 2008). The high-quality dataset used for this comparison comes primarily from research vessel cruises; this is supplemented by the more plentiful dataset of previously-verified float profiles. Scientific judgment and regional expertise come into play whenever the research vessel data provide ambiguous or possibly outdated information, or when the nearby float data tell a different story (Figure 2). In addition to salinity adjustment, the quality control process includes additional tests, some still under development, for identification of systematic and random errors. These include (i) comparison of Argo data with climatological means and variability, (ii) comparison of satellite altimetric height with steric height from sequences of Argo profiles to flag suspect instruments for further examination (see section 3.4), and (iii) comparison of nearby floats ("buddies") of differing type, origin, or age to reveal systematic differences. All of these tests become more useful and accurate as the Argo dataset grows and its statistics are better known.
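The core of the delayed-mode salinity adjustment can be caricatured as fitting a slow drift to the float-minus-reference salinity offsets. The sketch below shows only that idea; the actual method of Wong et al. (2003) works on potential-temperature surfaces with objectively mapped reference fields and error covariances. All names and numbers here are illustrative.

```python
import numpy as np

# Hedged sketch of the central delayed-mode task: model the sensor
# offset relative to a reference (ship CTD / climatology) as a slow
# linear drift in time, then remove it.

def estimate_salinity_drift(cycle_days, float_sal, ref_sal):
    """Least-squares linear fit to (float - reference) salinity offsets.

    Returns (offset_psu, drift_psu_per_day) so that
    adjusted = float_sal - (offset + drift * t).
    """
    t = np.asarray(cycle_days, float)
    d = np.asarray(float_sal, float) - np.asarray(ref_sal, float)
    drift, offset = np.polyfit(t, d, 1)   # slope first, then intercept
    return offset, drift

# Synthetic float: 0.01 psu initial offset plus a slow drift.
t = np.array([0.0, 100.0, 200.0, 300.0])
ref = np.full(4, 35.00)
obs = ref + 0.01 + 2e-5 * t
offset, drift = estimate_salinity_drift(t, obs, ref)
adjusted = obs - (offset + drift * t)     # recovers the reference values
```

In practice the fitted adjustment is applied only after expert review, since an apparent "drift" can also be a real regional water-mass change.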
High level ocean products are also derived from Argo. These include, in particular, new temperature and salinity climatologies and deep current atlases. Preparation of such products requires careful delayed-mode data analysis and processing. Argo data can also be used to directly derive climate indices (e.g. ocean heat and salt content anomalies and trends). These products are essential for ocean climate monitoring; they are also used to validate ocean data assimilation systems. Finally, the use of Argo data in scientific research, including over 100 papers published in 2007 alone, can further test the accuracy of Argo data and its consistency with related observations, sometimes revealing new problems (e.g. Willis et al., 2007).

Coriolis: an in-situ data thematic assembly center for operational oceanography
In situ data are sparse and processed in several data centres internationally. Data access is not always easy, in particular when it comes to meeting the near real time needs of operational forecasting systems. Access has improved in recent years with the setting up of Global Data Centres for the major observation programs. However, further work is still needed to integrate the different data streams into a coherent dataset directly usable by operational users.
The role of an in situ data thematic assembly center is to assemble high quality data sets and to provide the "best" in situ products for data assimilation and model validation, both in real-time and delayed mode, to the monitoring and forecasting centres. This implies a clear definition of quality control procedures, validation processes with error characterisation, as well as processing capabilities for re-analysis purposes. It also implies pushing and helping institutions to speed up their data flow to operational users. Contrary to National Oceanographic Data Centres (NODCs), a TAC deals with a limited number of parameters (mainly physical parameters such as temperature, salinity, current and sea level, and a few biogeochemical parameters such as oxygen and chlorophyll-a) over a large area, while NODCs deal with a large variety of parameters acquired by a given country (they are in charge of the long term archiving of these national data).
In Europe, Coriolis has set up an in situ TAC that integrates into a single dataset data from international networks (Argo, GOSUD, OceanSITES, DBCP, GTSPP) and European regional data (EuroGOOS Regional Operational Oceanographic systems). It provides two main products: a real-time product for forecasting activities and a delayed-mode product (updated on a yearly basis) for reanalysis and climate research activities. Over 10 years, the amount of data processed by Coriolis has increased tenfold (Figure 3).
To be able to provide such products, Coriolis developed and implemented additional quality control procedures that look at the data as a whole and are able to detect suspicious measurements that were not detected by the automatic tests, or profiles/time series that are not consistent with their neighbours. Since 2005, Coriolis has also been producing global ocean weekly temperature and salinity fields from the Coriolis database using objective analysis. Statistical methods also allow the detection of outliers in a dataset by exploiting mapping error residuals (Gaillard et al., 2008). An alert system has been set up that detects the profiles for which the error is larger than a threshold. Outliers are scrutinized by an operator who is able to distinguish between an erroneous profile and an oceanographic feature such as an eddy or front. Coriolis is also setting up complementary validation activities for Argo data: comparison of Argo data (individual profiles and/or their average in space and time) with a recent Argo climatology (mean and variability), comparison with altimeter data (see section 3.4) and comparison with neighbouring floats.
Figure 3: 10 days of profile data from the Coriolis database in Sept. 2002 (left) and Sept. 2008 (right).
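A minimal sketch of the residual-based alert described above: after the objective analysis, profiles whose misfit to the analysed field exceeds a threshold scaled by the local mapping error are flagged for operator inspection. The function name, threshold and data values are illustrative assumptions.

```python
import numpy as np

# Sketch of a residual-based alert: compare each profile's observed
# value with the analysed field at the same location and flag the
# profiles whose residual exceeds n_sigma times the mapping error.

def flag_suspect_profiles(obs, analysis_at_obs, mapping_error, n_sigma=3.0):
    """Return indices of profiles whose residual exceeds n_sigma * error."""
    residual = np.abs(np.asarray(obs, float) - np.asarray(analysis_at_obs, float))
    return np.where(residual > n_sigma * np.asarray(mapping_error, float))[0]

obs      = np.array([15.2, 14.9, 18.7, 15.1])   # observed temperatures
analysis = np.array([15.0, 15.0, 15.0, 15.0])   # analysed field at obs points
error    = np.array([0.3, 0.3, 0.3, 0.3])       # local mapping error
print(flag_suspect_profiles(obs, analysis, error))  # only the 18.7 profile
```

As in the Coriolis system, a flag here is only an alert: an operator must still decide whether the outlier is an error or a genuine feature such as an eddy.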

Altimetry and SSALTO/DUACS
Fifteen years after the launch of ERS-1 and Topex/Poseidon and 10 years after the beginning of GODAE, satellite altimetry has become one of the most essential tools for ocean research and operational oceanography. Satellite altimetry is now used for a wide range of scientific and operational applications.
This maturity results from a large number of improvements in a wide variety of domains (e.g. instrumentation, precise orbit determination, processing techniques, geophysical corrections) by many different teams all over the world. The Science Working Team (SWT, from 1998 to 2003) and the Ocean Surface Topography Science Team (OSTST, since 2004) played an important role, allowing scientific and operational users to exchange ideas, share results and propose recommendations and evolutions.
The multi-mission processing of altimeter data developed by CLS as part of the DUACS European project (Developing Use of Altimetry for Climate Studies) started at the same time as GODAE (1997). The system was then integrated into the CNES multi-mission ground segment SSALTO in 2001. SSALTO/DUACS is now maintained, upgraded and operated with CNES funding and shared costs from European Commission projects (MFS, MERSEA, MyOcean). It aims to provide directly usable, high quality near real time and delayed mode (for reanalyses and research users) altimeter products to the main operational oceanography and climate centers in Europe and worldwide. All products are described and available through the AVISO portal (http://www.aviso.oceanobs.com/).
During the last decade, the system has been continuously upgraded. New altimeter missions have been incorporated, starting from the Topex/Poseidon and ERS-1 duo to the high resolution quartet (T/P, Jason-1, ENVISAT, GFO) and now Jason-2. A review of the main SSALTO/DUACS achievements is given below.
It is not possible to describe the evolution of the SSALTO/DUACS system without first mentioning the breakthroughs achieved in several domains such as sensor accuracy, orbit precision, geophysical corrections and reference surfaces. They significantly improved the accuracy and consistency of each altimeter mission, with a global error variance reduction of about 36%. 50% of the improvements come from geophysical corrections, especially corrections of high frequency signals (tide modelling, inverse barometer), and 50% are due to orbit and sensor processing improvements. A complete review was given by Dorandeu et al. (2006) during the 15 Years of Progress in Radar Altimetry Venice symposium.
Figure 5: 15 years of improvements in altimetry performance, illustrated by SSH crossover variance (from Dorandeu et al., 2006).

Figure 4: SSALTO/DUACS and AVISO, a user-driven altimetry service
Major progress has been made on higher level processing issues such as orbit error and long wavelength error reduction, intercalibration and merging of altimeter missions (e.g. Le Traon and Ogor, 1998; Le Traon et al., 1998; Ducet et al., 2000). The main processing steps are product homogenization, data editing, orbit error correction, reduction of long wavelength errors, and production of along-track data and maps of sea level anomalies. The merging methodology, which is based on optimal interpolation, was progressively improved. Initially explored by Ayoub et al. (1997) with the Topex/Poseidon and ERS-1 missions, the method was refined by Le Traon et al. (1998) by explicitly taking into account long wavelength errors in the mapping procedure. Ducet et al. (2000) derived for the first time global high resolution maps from T/P and ERS, allowing a good description of the mesoscale variability (Figure 6). Since 2001 and the launch of several altimeter missions (GFO, Jason-1, ENVISAT), the method has been upgraded to analyse more than two missions (Le Traon et al., 2002; Pascual et al., 2005). During this decade, the merging method also evolved (e.g. improved signal and error covariances) to take into account improvements in the level 2 data processing (reduction of the orbit error, better estimation of the inverse barometer, better management of the high frequency signal aliasing, etc.).
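To make the optimal interpolation step concrete, the toy sketch below maps scattered along-track sea level anomalies from two missions with different noise levels onto a single position. The Gaussian covariance model, correlation scale and noise variances are illustrative assumptions, not the operational SSALTO/DUACS settings.

```python
import numpy as np

# Toy optimal interpolation (OI) of sea level anomalies: the analysed
# value at a grid point is a covariance-weighted combination of nearby
# observations, with per-mission noise entering the weights.

def oi_point(x_grid, x_obs, sla_obs, noise_var, L=100.0, signal_var=1.0):
    """OI estimate of SLA at position x_grid (1-D, distances in km)."""
    x_obs = np.asarray(x_obs, float)
    d = np.abs(x_obs[:, None] - x_obs[None, :])
    B = signal_var * np.exp(-(d / L) ** 2)                 # obs-obs covariance
    R = np.diag(noise_var)                                  # per-mission noise
    c = signal_var * np.exp(-((x_grid - x_obs) / L) ** 2)   # grid-obs covariance
    w = np.linalg.solve(B + R, c)                           # OI weights
    return w @ np.asarray(sla_obs, float)

# Two missions observing nearby: the lower-noise one gets more weight.
x   = [-50.0, 60.0]          # along-track positions (km)
sla = [0.10, 0.30]           # SLA observations (m)
val = oi_point(0.0, x, sla, noise_var=[0.01, 0.09])
```

The operational system additionally models correlated long wavelength errors per mission in the covariances, which is what makes multi-mission merging consistent.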
The mapping method was also used to analyze the mapping capabilities of multiple altimeters and to derive requirements for near real time altimetry (Le Traon and Dibarboure, 1999; Le Traon et al., 2001; Le Traon and Dibarboure, 2002; Pascual et al., 2005, 2006, 2007). The main results are summarized in Oke et al. (2008, this issue).

Figure 6: Global MSLA product (left) and Absolute Topography and currents in the Gulf Stream (right)
The SSALTO/DUACS weekly production also moved to a daily production in 2007 to improve the timeliness of data sets and products. A new real time product was also developed for specific real time mesoscale applications. The SSALTO/DUACS baseline system is based on the IGDR data (Interim Geophysical Data Record), which are delivered within 2 to 4 days in a nominal situation. Thanks to these accurate data, merged maps are produced with a delay of 5 days. Real time level 2 altimeter data (e.g. the so-called OSDR for Jason and Fast Delivery GDR for ENVISAT, and OGDR for Jason-2) are available in a few hours. These data are far less accurate than the IGDR products, so a specific processing was implemented and is now under operational demonstration. As it is not possible to precisely separate the orbit error from the large scale ocean signal in these real time data, a high pass filter is applied to extract the mesoscale signal. The large scale signal is derived from the SSALTO/DUACS baseline system (using IGDRs). This assumes quasi-stationary (over 3 days) large scale ocean signals. The innovative sampling of the mesoscale from the OGDR/FDGR is thus merged with the precise large scale information from the IGDR. Then, all standard algorithms that unify, homogenise and merge the data are applied to produce along-track data and maps.
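The real-time scheme described above can be sketched as follows: a running-mean high-pass removes the large scale signal (and with it the unresolvable orbit error) from the fast-delivery track, and the retained mesoscale part is added to the large scale field taken from the IGDR-based system. The filter type and window length are illustrative choices, not the operational settings.

```python
import numpy as np

# Sketch of the real-time merging: mesoscale from the OGDR/FDGR track
# plus large scale from the IGDR-based baseline system.

def high_pass(along_track, window=51):
    """Remove a running mean (large scale + orbit error) from a track."""
    kernel = np.ones(window) / window
    pad = window // 2
    padded = np.pad(along_track, pad, mode="edge")
    low = np.convolve(padded, kernel, mode="valid")
    return along_track - low

def merge_real_time(ogdr_track, igdr_large_scale, window=51):
    """Mesoscale from the real-time track + large scale from IGDR maps."""
    return high_pass(ogdr_track, window) + igdr_large_scale

# Synthetic track: a large-scale trend plus a mesoscale oscillation.
x = np.linspace(0, 2 * np.pi, 200)
ogdr = 0.5 * x + 0.1 * np.sin(20 * x)
large_scale = 0.5 * x            # assumed known from the IGDR system
merged = merge_real_time(ogdr, large_scale)
```

In the synthetic case the merged track recovers the full signal away from the edges, which is the point of the scheme: the real-time data contribute only the scales they can resolve reliably.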
The mean dynamic topography (MDT) is an essential reference surface for altimetry. Added to the sea level anomalies, it gives access to the absolute sea level and ocean circulation (Figure 6). After a preliminary MDT computed in 2003, a new MDT, called RIO-05, was computed in 2005. It is based on the combination of GRACE data, drifting buoy velocities, in-situ T,S profiles and altimeter measurements. The methodology is detailed in Rio et al. (2005). A first guess deduced from the differences between a mean sea surface (CLS01, Hernandez et al., 2001) and a geoid (EIGEN-GRACE03S from the GFZ) is computed. This large scale guess is then improved using synthetic MDT data obtained by subtracting concurrent altimeter height and velocity anomalies from in-situ dynamic heights and drifting buoy velocities. The MDT was tested and is now used by several GODAE modelling and forecasting centers. The use of this MDT has a positive impact on forecast skill. An updated version based on data from the GOCE mission, additional in-situ measurements and newly reprocessed data will be elaborated when GOCE measurements become available in 2009. This new version will allow us to obtain an MDT with an unprecedented accuracy.
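The synthetic MDT estimate at an in-situ point reduces to a simple difference, sketched below under the assumption that the altimeter anomaly has already been interpolated to the profile position and date. The data values are invented for illustration.

```python
import numpy as np

# Sketch of the "synthetic MDT" idea used to refine the geodetic first
# guess: where an in-situ dynamic height and a concurrent altimeter sea
# level anomaly coexist, their difference estimates the MDT locally.

def synthetic_mdt(dyn_height, concurrent_sla):
    """Pointwise MDT estimate: in-situ dynamic height minus altimeter SLA."""
    return np.asarray(dyn_height, float) - np.asarray(concurrent_sla, float)

dyn = np.array([1.12, 1.08, 1.15])   # dynamic heights (m) from T,S profiles
sla = np.array([0.10, 0.05, 0.13])   # SLA (m) interpolated to the same points
mdt_points = synthetic_mdt(dyn, sla)
```

In RIO-05 these pointwise estimates are then combined with the large scale geodetic first guess through a multivariate objective analysis rather than used directly.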
The sea level information deduced from altimetry enables the monitoring of the ocean variability. Recently, the operational monitoring of ocean climate indicators was set up (e.g. global and regional sea level trends, sea level variations related to El Nino).
These continuous advances have allowed us to provide the research and operational community with accurate, homogeneous and intercalibrated along-track and mapped sea level data. As a result, SSALTO/DUACS is now serving a wide range of users and applications.

Sea surface temperature
Prior to GODAE, sea surface temperature data sets from satellite systems were poorly coordinated. Some large scale SST data processing efforts had been initiated in the USA for the AVHRR sensor set. The AVHRR Pathfinder Project (Kilpatrick et al., 2001) conducted multiple reprocessings of AVHRR data from the mid 1990s and continues through to today, with efforts underway to transition the processing capability from the University of Miami to NOAA's National Oceanographic Data Center. In Europe, EUMETSAT initiated in 1997 the Ocean and Sea Ice Satellite Application Facility (OSI-SAF) project to develop enhanced SST and sea ice data sets for the meteorological and oceanographic communities. During the past ten years, a concerted effort to understand satellite and in situ SST observations has taken place, leading to a revolution in the way we approach the provision of SST data to the user community. New passive microwave and high-accuracy infrared radiometers were flown for the first time, and the scene was set for new and exciting infrared and microwave imaging instruments over the following 10 years. GODAE, recognising the importance of SST data sets for ocean forecasting, initiated the GODAE High Resolution SST Pilot Project (GHRSST-PP) to capitalize on these developments and develop a set of dedicated products and services (see Donlon et al., this issue).

Satellite instrumentation
A large number of satellite SST products, derived by several different groups or agencies from various satellite sensors and platforms, have become available in near real-time. The instruments used are all passive sensors (radiometers), measuring the natural radiation emitted by the Earth's surface and propagating through the atmosphere. Two main families of instruments exist, one measuring the radiation at infra-red wavelengths, the other at microwave wavelengths.

Infra-red radiometers
Since the 70s, operational infra-red radiometers have been carried on board meteorological polar orbiting satellites, in particular the AVHRR instrument on board the NOAA satellites. The primary objective of this instrument was the imaging of clouds and weather systems for meteorological applications, but by combining the radiation measurements from the infra-red window channels at 3.7, 11 and 12 µm, sea surface temperature can be retrieved in the absence of clouds with a typical accuracy of 0.4-0.5 K (an accuracy estimate derived by computing the RMS difference between collocated satellite and buoy measurements).
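The accuracy figures quoted here come from matchup statistics against in-situ data; a minimal sketch of that computation for collocated satellite and buoy SSTs (with invented numbers) is:

```python
import numpy as np

# Matchup statistics of the kind behind quoted SST accuracies:
# bias and RMS difference of satellite-minus-buoy collocations.

def matchup_stats(sat_sst, buoy_sst):
    """Return (bias, rms) of satellite-minus-buoy differences, in kelvin."""
    diff = np.asarray(sat_sst, float) - np.asarray(buoy_sst, float)
    return diff.mean(), np.sqrt(np.mean(diff ** 2))

sat  = np.array([288.3, 290.1, 285.6, 287.9])   # satellite SST (K)
buoy = np.array([288.0, 290.5, 285.8, 288.0])   # collocated buoy SST (K)
bias, rms = matchup_stats(sat, buoy)
```

Note that such statistics mix satellite and buoy errors as well as genuine skin-versus-depth temperature differences, which is one motivation for the careful SST definitions introduced later by GHRSST-PP.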
These measurements offer a good horizontal resolution (1 km) and potentially global coverage, with the important exception of cloudy areas. However, their accuracy is limited not only by the radiometric quality of the AVHRR instrument (noise, calibration), but more importantly by a number of atmospheric effects which have to be accounted for: cloud detection, and atmospheric correction for water vapour and aerosols.
More recently, infra-red radiometers have been designed specifically for the retrieval of high accuracy SSTs, in particular the European ATSR and AATSR instruments flown on board the ESA ERS and ENVISAT satellites. These instruments have improved on-board calibration, and make use of dual views at nadir and at a 20° incidence angle for improved cloud detection and atmospheric correction, leading in principle to an accuracy of 0.2-0.3 K. The main drawback of these instruments is their limited coverage, due to a much narrower swath than the AVHRR instruments. Nevertheless, the use of AATSR as a reference sensor for all other satellite SST data sets is now gaining acceptance in operational SST analysis systems and for reducing the impact of atmospheric aerosols in single view infrared data sets, particularly geostationary observations in the Eastern Atlantic that are affected by Saharan dust outbreaks (see Figure 7).
Since the 90s, the new generation of operational meteorological geostationary satellites (in particular the NOAA GOES and the EUMETSAT MSG series) have carried radiometers with infra-red window channels similar to those of the AVHRR instrument. SST products are being retrieved in real-time from these measurements, with an accuracy similar to that of the AVHRR SSTs. Their horizontal resolution is coarser (3-5 km), but their great contribution comes from their high temporal sampling (up to every 15 minutes), which also significantly improves their geographical coverage (because clouds move). The geostationary SST products also provide a unique data source for the study of the diurnal evolution of sea surface temperature.
Key developments in data processing of infrared SST data sets over the last 10 years are:
• improved radiative transfer schemes used to derive retrieval algorithms by correcting for atmospheric attenuation;
• the development and successful application of the dual view 'along-track-scanning' technique to improve atmospheric correction;
• the derivation of uncertainty estimates based on the statistical analysis and combination of near contemporaneous in situ and satellite observations;
• improvements in flagging cloudy and aerosol contaminated data;
• improvements in data delivery and timeliness;
• better understanding of SST in the marginal ice zone;
• better in situ validation strategies, including sustained (>5 years) deployments of ship-mounted infrared radiometer systems aboard commercial ships;
• better synergy between in situ data providers and satellite systems for the placement of in situ drifting buoys;
• quality control of in situ SST data using infrared satellite observations.

Microwave radiometers
Several microwave radiometers have been developed and flown over the last 10 years. These include the Advanced Microwave Scanning Radiometer flown on the Japanese ADEOS I and II platforms and, more recently, the AMSR-E on board the NASA AQUA satellite. A dedicated low-earth-orbit microwave imager (the TMI) was also flown on the Tropical Rainfall Measuring Mission (TRMM) satellite. TMI is interesting as this system is able to sample diurnal variability over a period of ~1 month, but it has only limited coverage (40°N to 40°S). TMI and AMSR-E are both currently used to retrieve SST products in near real-time. The horizontal resolution of these products is around 25 km, and their accuracy around 0.6-0.7 K. The great advantage of microwave measurements compared to infra-red ones is that SST can be retrieved even through non-precipitating clouds, which is very beneficial in terms of geographical coverage. However, in addition to their coarser resolution, their main drawback is that they cannot be used close to the coast or the sea ice edge, because of contamination by the signal received through antenna secondary lobes.
Key developments in data processing of microwave SST data sets over the last 10 years include:
• stabilisation of good calibration for passive microwave SST observations through improved algorithm development and knowledge of instrument characteristics;
• excellent throughput of data in near real time with good timeliness;
• new uncertainty estimates for each observation;
• better understanding of microwave derived SST in high wind speed regimes;
• improved sea ice products and SST in the marginal ice zone;
• better rain flagging.

The role of the GODAE High Resolution SST (GHRSST) Pilot Project
In 2002, GODAE initiated the GODAE High Resolution SST Pilot Project (GHRSST-PP) to address an emerging need for more accurate high resolution sea surface temperature (SST) products. GHRSST-PP coordinates the harmonisation of a wide variety of SST data streams from satellite and in situ sources that are shared, indexed, processed, quality controlled, analyzed and documented within a Regional/Global Task Sharing (R/GTS) framework implemented in an internationally distributed manner. Large volumes of data (currently over 25 GB per day) and data services are harnessed together to deliver the new generation of global coverage high resolution SST data sets, together with meaningful error estimates for each observation or analysis grid. Research and development within GHRSST-PP projects continue to tackle the problems of diurnal variability, skin temperature deviation and validation/verification of GHRSST-PP products. Ensemble analysis products, together with anomaly SST outputs, are generated each day as part of an ensemble approach to improving analysis systems and providing confidence in analysis outputs (see http://www.ghrsst-pp.org/ for more information). This project, still underway, has been very successful and fruitful, in particular in the following areas:
- The definition and adoption by all satellite SST producers of a set of common definitions for SST (SSTskin, SSTsub-skin, SSTdepth and SSTfoundation; see Donlon et al., 2007), and of common standards for satellite SST product format (the "L2P" format, based on NetCDF CF-1.3) and content (confidence levels, bias and error standard deviation estimates, ancillary information about sea ice, wind, solar radiation, etc.);
- The design, development and implementation of a Regional/Global Task Sharing framework at the international level to share data and develop high-resolution SST analysis data sets.
A full description of the GHRSST-PP is provided in Donlon et al., this issue.

Levels 3&4 data processing
Several new analyzed high resolution SST products have been produced in the framework of GHRSST-PP. As part of the MERSEA project, Ifremer set up the ODYSSEA analysis system producing the global high resolution sea surface temperature fields required by the various ocean models and downstream services, following a methodology jointly defined at European level with Météo-France, Met.No, DMI and CNR. These fields have been produced daily on a 0.1 degree grid since January 2007. They are estimated by an optimal interpolation method merging SST satellite measurements from both infrared and microwave sensors. The processing scheme is broken down into two main steps. The first step is the collection and pre-processing of all available GHRSST L2P products. The pre-processing consists mainly of screening and quality controlling the retrieved observations from each dataset and of constructing a coherent merged multi-sensor set of the most relevant and accurate observations (level 3). The merging of these observations requires a method for bias estimation and correction (relative to a chosen reference, currently AATSR). This inter-sensor correction is updated daily using the latest data, and each source dataset is monitored through a set of statistical tools that allows us to detect any problem with a source and to continuously upgrade and improve the screening of the input data. The gap-free SST foundation field is finally computed from the merged set of selected observations using an objective analysis method. Here again, the performance and properties of the MERSEA analysis are estimated and monitored both daily and on a long term basis through a set of evaluation tools. Particular attention is paid to assessing the true product resolution, the correct retrieval of the main oceanic features, accuracy and long-term consistency.
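The final gap-filling step described above relies on objective analysis (optimal interpolation). A minimal 1-D sketch, with an assumed Gaussian background-error covariance and illustrative scales rather than the actual ODYSSEA settings:

```python
import numpy as np

def optimal_interpolation(x_grid, x_obs, y_obs, background,
                          L=100.0, obs_err=0.3, sig=1.0):
    """1-D optimal interpolation of SST observations onto a grid.
    Gaussian covariance with e-folding scale L (km), signal std sig (K)
    and observation error std obs_err (K) -- illustrative values."""
    def cov(a, b):
        d = a[:, None] - b[None, :]
        return sig**2 * np.exp(-(d / L) ** 2)
    B_oo = cov(x_obs, x_obs) + obs_err**2 * np.eye(len(x_obs))  # obs-obs + noise
    B_go = cov(x_grid, x_obs)                                   # grid-obs covariance
    innov = y_obs - np.interp(x_obs, x_grid, background)        # obs minus background
    weights = np.linalg.solve(B_oo, innov)
    return background + B_go @ weights

x_grid = np.linspace(0.0, 500.0, 51)                 # analysis grid (km)
background = np.full_like(x_grid, 15.0)              # first-guess field (degrees C)
x_obs = np.array([100.0, 250.0, 400.0])              # observation positions
y_obs = np.array([15.8, 14.5, 15.2])                 # observed SST
analysis = optimal_interpolation(x_grid, x_obs, y_obs, background)
```

The analysis is pulled toward each observation near its location (though not all the way, because of the observation-error term) and relaxes back to the background field away from the data.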
Appropriate methods were implemented for the overall assessment of the product quality with respect to each of these criteria (see http://www.mersea.eu.org/Satellite/sst_validation.html).
Similarly, an operational L4 processing system at the Met Office, the Operational SST and Sea Ice Analysis (OSTIA), uses AATSR as a reference sensor together with in situ observations of SST and provides a 6.5 km global analysis every day. This is a UK contribution to the GHRSST-PP, with outputs available at http://ghrsst-pp.metoffice.com/pages/latest_analysis/ostia.html.

Other data sets and products
The main GODAE efforts were focused on the altimeter, SST and in-situ data needed and used by most GODAE global data assimilation systems. Advances were also made for other data sets and products that are critical for specific applications (e.g. sea ice), will become more and more important for operational systems (e.g. ocean colour, high resolution winds) and/or are needed for reanalysis activities (satellite winds and fluxes).

High resolution winds
To enhance the spatial and temporal resolution of surface winds, several attempts have been made to merge remotely sensed data with operational NWP wind analyses over the global oceans. As part of MERSEA, 6-hourly gridded fields of wind speed, zonal and meridional components, wind stress and the corresponding stress components were prepared at global scale by merging ECMWF analyses with remote sensing data. The spatial resolution of the resulting wind fields is 0.25° in longitude and latitude (Figure 8). The remotely sensed wind observations are derived from near real time measurements by the SeaWinds scatterometer onboard the QuikSCAT satellite and by the three Special Sensor Microwave Imager (SSM/I) instruments onboard the DMSP F13, F14 and F15 satellites. ECMWF analyses are available at synoptic times (00h, 06h, 12h, 18h) on a regular latitude-longitude grid of 0.5625°×0.5625°. They are interpolated in space and time over each satellite swath occurring within 3 hours of the synoptic time, and the differences are evaluated at each scatterometer and radiometer wind cell of about 0.25° resolution. These differences are used through an objective method to estimate global wind fields, retaining first the ECMWF-QuikSCAT wind differences in swath regions; in areas not sampled by QuikSCAT in time and/or space, available and valid differences between ECMWF and SSM/I are used. The SSM/I-retrieved surface wind speed is treated as a regionalized variable: it is related on average to QuikSCAT wind speed through a linear relationship determined on one hand from QuikSCAT-buoy and on the other hand from SSM/I-buoy comparison results (Bentamy et al., 2002). More details about the data, the objective method and the computation algorithm can be found in Bentamy et al. (2006).

Sea ice
For use in data assimilation, as well as for model evaluation, quantified knowledge of the uncertainty and errors of the sea ice analysis is needed. It is important that this information is provided along with the data sets.
To say something about the uncertainty, independent information is needed. There are, however, very few in situ observations. The most reliable source for validation is therefore the analyses provided by the national ice services, which derive high resolution ice analyses manually from SAR data and from images acquired by optical instruments. An example of evaluation against ice service products is given at http://saf.met.no/validation/. There is also an increasing demand for sea ice forecasts based on regional ice-ocean models, which require higher resolution input than provided by the global PM data. Today such models have started to use manual ice analyses as input, but direct use of high resolution images such as SAR data is an R&D priority.
The data assimilation centers also require sea ice information as input for their re-analyses. PM data are available back to 1979 (e.g. from NSIDC, http://nsidc.org) and are today the most important data source for this purpose as well. However, long time series of satellite data now also exist for other instruments, e.g. scatterometer and SAR data (back to 1991). In re-processing and re-analysis, great care is needed to ensure climate-consistent products, and inter-sensor calibration is required. Calibration of sea ice observations also exploits the conservative properties of the ice compared to, e.g., atmospheric variations. Although ice coverage and ice motion are now well observed, a weakness of the existing observing system is the lack of regular information about variations in ice volume. Important information is currently extracted from measurements of ice type or from the ice-age dependency of roughness and emissivity. For example, satellite-measured ice roughness combined with in situ ice thickness measurements can give input to ice mass analysis. The ice thickness measurements expected from the advanced altimeter onboard CryoSat-II, due for launch in 2009, will therefore be very welcome.
Ocean colour
GODAE systems were focused on the analysis and forecasting of the physical ocean state, but they are now evolving towards biogeochemistry and ecosystems. Ocean colour is thus taking on central importance.
Over the last decade, the applications of satellite-derived ocean colour data have been considerably extended across disciplines, making important contributions to biogeochemistry, physical oceanography, ecosystem assessment, fisheries oceanography and coastal management. Operational agencies responsible for the protection and conservation of our seas are increasingly adopting ocean colour radiometry (OCR) to ensure sustainable management of coastal and marine resources, ecosystem health and safe navigation.
The MERSEA project has largely contributed to an international effort to provide an accurate and consistent stream of ocean colour data at a resolution and in a format compatible with the operational forecasting of the marine environment at global and regional scales (http://mersea.jrc.ec.europa.eu/). The assembled database consists of surface chlorophyll concentrations and diffuse attenuation coefficients, commonly used as indices of phytoplankton biomass and water transparency, respectively. A strong validation exercise and the refinement of regional algorithms (Volpe et al., 2007) guarantee that this dataset meets requirements for both scientific research and operations. The longest currently-operating ocean colour sensor (SeaWiFS) was launched in 1997. Additional overlapping missions, such as MODIS and MERIS, have the advantage of increasing the spatial coverage of the global ocean, otherwise limited for single observations by sunglint effects or cloud cover. A combination of three ocean colour satellites can improve the daily ocean coverage by 50% compared to a single sensor (IOCCG, 1999). All these instruments grant access to unprecedented views of marine systems with better accuracy than the defunct CZCS, which operated from 1978 to 1986. Another challenge, however, is to optimize the information available by combining data from individual OCR sensors with different viewing geometries, resolutions and radiometric characteristics (Pottier et al., 2006; IOCCG, 2007). The availability of merged datasets through, for example, the ESA GlobColour initiative (http://www.globcolour.info/) allows users to exploit a unique, quality-consistent time-series of ocean colour observations without being concerned with the performance of individual instruments.
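Merging chlorophyll retrievals from several OCR sensors is often done in log space, since chlorophyll concentrations are roughly lognormally distributed. The following inverse-variance weighting sketch uses invented retrieval values and is only one simple possibility, not the GlobColour merging algorithm itself:

```python
import math

def merge_chlorophyll(obs):
    """Merge chlorophyll-a estimates (mg m^-3) from several sensors.
    Averaging is done in log10 space, weighting each sensor by the
    inverse of its (assumed) error variance."""
    num = den = 0.0
    for value, rel_err in obs:
        w = 1.0 / rel_err**2               # inverse-variance weight
        num += w * math.log10(value)
        den += w
    return 10 ** (num / den)

# Hypothetical same-pixel retrievals: (chlorophyll, relative error)
pixel = [(0.32, 0.30), (0.41, 0.35), (0.28, 0.35)]
merged = merge_chlorophyll(pixel)
```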
In the near future, a number of OCR satellite missions (e.g., ESA Sentinel-3, NOAA VIIRS) that are already approved will ensure long-term and reliable time-series of key oceanic parameters at global scale for the next decade. This continuity in the data flow is a prerequisite for identifying and understanding possible responses of the marine system to forcing factors such as climate change, and for supporting decision-making processes that implement ecosystem-based approaches to the management of oceans and seas.

Joint use of in-situ and remote sensing data: validation
The comparison of in-situ and satellite data is needed to validate satellite data, but it can also provide indications of the quality of in-situ data. It is also useful for checking the consistency between the different data sets before they are assimilated into an ocean model. We provide here examples of the joint use of Argo, tide gauge and altimeter data for validation activities. Other outstanding examples (not shown) include the joint use of in-situ and satellite SST (see also previous section).
On the use of satellite altimeter data in Argo quality control
Delayed mode Argo quality control is a challenging task, as it requires high quality CTD data in the float vicinity (see section 3.1). A complementary approach has been proposed based on the analysis of the consistency between Argo and satellite altimeter data. The method compares co-located sea level anomalies (SLA) from altimeter measurements and dynamic height anomalies (DHA) calculated from Argo temperature and salinity profiles for each Argo float time series. By exploiting the correlation between the two data sets and a priori statistical information on their differences, it is possible to use the altimeter measurements to detect random or systematic errors in the float time series (see Wong et al. (2008) for more details on the Argo processing). The second example (WMO 1900249 float) clearly shows a progressive drift of the DHA time series relative to the SLA time series as the float travels from east to west in the Tropical Atlantic Ocean. Additionally, the correlation between the two time series is null, while it is expected to be larger than 0.5, indicating a clear malfunction of one of the sensors.
Despite the very conservative criteria used, the method proves very effective at identifying anomalous floats. It is also quite interesting to work on a vertically integrated field such as dynamic height, which gives a very rapid idea of the behaviour of the float time series. The method should be used as a validation/verification tool, but also to quantify the real impact of the adjusted fields (pressure, salinity). The use of simultaneous altimeter measurements is very powerful, since these satellite data allow us to monitor the mesoscale and interannual variability of the ocean in a way that validation methods based on comparisons with climatological fields cannot. R is the correlation coefficient between the two time series and rms-diff the difference expressed as a percentage of the variance of the altimeter signal.
On the use of the global tide gauge network for monitoring the stability of altimeter missions
Satellite altimeter measurements have been widely used to monitor global and regional mean sea level trends (Beckley et al., 2007). The main underlying assumption is the perfect stability of the different missions over time. However, potential drifts have been identified due to geophysical corrections, orbit errors and the uncertainty in linking the different MSLA time series. The stability of the altimeter missions is commonly assessed by comparing the altimetric sea surface height measurements with those from an array of independent tide gauges (Mitchum, 2000). The basic idea is that differences between tide gauge and altimeter measurements should not show any drift or bias over long time scales. Recent results from Beckley et al. (2007) show that their TOPEX versus tide gauge mean height differences have a residual drift of 0.04 mm/yr. For Jason, they obtained a higher estimate of 0.69 mm/yr. The combined TOPEX/Jason time series shows a residual drift of 0.05 mm/yr (Figure 11). The uncertainty in the estimate is 0.4 mm/yr and is dominated by systematic error from poorly known vertical land motions (Mitchum, 2000). The authors note that this uncertainty also dominates the error budget of their altimeter-derived rate of global mean sea level rise, estimated at 3.36 mm/yr over the 1993 to 2007 period.
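The residual drifts quoted above come down to fitting a linear trend to the altimeter-minus-tide-gauge height differences. A minimal least-squares sketch with synthetic data (an imposed 0.5 mm/yr drift for illustration, not the published TOPEX or Jason values):

```python
def linear_trend(t_years, diffs_mm):
    """Least-squares slope (mm/yr) of altimeter-minus-tide-gauge height
    differences; a non-zero slope indicates a drift in the altimeter."""
    n = len(t_years)
    mt = sum(t_years) / n
    md = sum(diffs_mm) / n
    num = sum((t - mt) * (d - md) for t, d in zip(t_years, diffs_mm))
    den = sum((t - mt) ** 2 for t in t_years)
    return num / den

# Synthetic semi-annual differences with a constant offset plus 0.5 mm/yr drift
t = [1993 + 0.5 * k for k in range(20)]
d = [10.0 + 0.5 * (ti - 1993) for ti in t]
drift = linear_trend(t, d)
```

In practice the differences are averaged over many gauges, and the dominant error term is the poorly known vertical land motion at each gauge rather than the fit itself.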

Joint use of in-situ and remote sensing data: merged products
Products merging in-situ and satellite data through statistical methods have been developed over the past 10 years by several GODAE groups. They have been used both to validate data assimilation systems and to serve applications. Examples from Mercator-Ocean, Bluelink and NOPP are given hereafter. Other outstanding examples (not shown) come from Navocean with the MODAS products.

Mercator-Ocean and CLS Global Observed Ocean Products
Complementary to the Mercator-Ocean model/assimilation systems that provide analyses and forecasts up to 14 days over the global ocean (Drevillon et al., 2008), an observation-based component has been developed. It provides Global Observed Ocean Products (GOOPs) at high temporal and spatial resolution, obtained from a combination of in situ and remote-sensing observations. 3-D thermohaline fields (ARMOR-3D) and surface current fields (SURCOUF) have been developed.
To reconstruct global instantaneous temperature (T) and salinity (S) fields at a weekly period and on a 1/3° Mercator horizontal grid from the surface down to 700 m depth (ARMOR-3D products), a two-step merging method has been developed to combine the accurate but sparse in situ T/S profiles with the high-resolution remote-sensing altimeter and SST measurements (Guinehut et al., 2004; Larnicol et al., 2006) (Figure 12). The first step consists in deriving synthetic T and S profiles from altimeter and SST data through a multiple linear regression method and a priori covariances. The second step consists in combining the synthetic profiles with in situ T and S profiles using an optimal interpolation method. For the temperature fields, statistics over the years 1993 to 2003 indicate that about 40% of the temperature variance at depth can be deduced from altimeter and SST data with a simple statistical technique (Figure 13). The combination method then improves the reconstruction of the field by 25%. For the salinity field, results indicate (as expected) that it is more difficult to infer salinity at depth from altimeter data and simple statistics, and show the need for in-situ data. Additionally, these results show that the optimal combination of in situ and remote-sensing observations is instrumental in reducing the aliasing due to mesoscale variability and in adjusting the analysed fields to the in situ fields (see also Oke et al. (2008) (this issue)).
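The first step, deriving synthetic profiles from surface observables by multiple linear regression, can be sketched on synthetic data as follows. The regression coefficients and noise levels are invented for illustration and are not the ARMOR-3D a priori covariances:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set: temperature anomaly at depth is correlated with
# surface observables (sea level anomaly in m, SST anomaly in K)
n = 200
sla = rng.normal(0.0, 0.1, n)
sst = rng.normal(0.0, 0.8, n)
t_depth = 4.0 * sla + 0.3 * sst + rng.normal(0.0, 0.05, n)  # "truth"

# Fit the regression coefficients (with an intercept term)
X = np.column_stack([sla, sst, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, t_depth, rcond=None)

# Apply to a new surface observation to get a synthetic anomaly at depth
synthetic = coef[0] * 0.12 + coef[1] * 0.5 + coef[2]
```

In the real system the coefficients are estimated per location, depth level and season from historical profiles, and the resulting synthetic profiles are then merged with the in situ profiles by optimal interpolation (step two).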
Global instantaneous surface current fields (SURCOUF products) are computed every day on a 1/3° Mercator horizontal grid by adding two velocity fields (Rio et al., in preparation, 2008). The first corresponds to fields of absolute geostrophic velocities as distributed by AVISO (AVISO, 2008). The second corresponds to fields of Ekman velocities computed from ECMWF wind fields and a two-parameter model. Comparisons with drifting buoy velocities show overall good consistency, with an error of less than 40% of the buoy velocity variance in the zonal direction. Higher errors are obtained in the meridional direction as well as in the equatorial band. When the Ekman component is added to the altimetric geostrophic velocities, the differences from the in situ drifting buoy velocities are reduced by more than 5% (Rio et al., 2008).
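The geostrophic component is obtained from the slope of the sea surface through the geostrophic balance, u = -(g/f) ∂η/∂y, v = (g/f) ∂η/∂x, where f is the Coriolis parameter. A minimal sketch (away from the equator, where f vanishes):

```python
import math

def geostrophic_velocity(deta_dx, deta_dy, lat_deg):
    """Geostrophic surface velocity (m/s) from sea-surface-height slopes (m/m)."""
    g = 9.81                                            # gravity (m/s^2)
    omega = 7.2921e-5                                   # Earth rotation rate (rad/s)
    f = 2.0 * omega * math.sin(math.radians(lat_deg))   # Coriolis parameter
    u = -(g / f) * deta_dy
    v = (g / f) * deta_dx
    return u, v

# A 10 cm SSH drop over 100 km northward at 30N gives an eastward jet
u, v = geostrophic_velocity(0.0, -0.10 / 100e3, 30.0)
```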
Both GOOPs are produced operationally, and re-analyses for the full altimetric period are also available. The GOOPs provide a synthesis of different sources of observations. They provide information both on the physical variables (T, S, currents) and on integrated ocean parameters (transport, heat content, sea level). They are thus quite complementary to model analyses for global ocean monitoring applications, and they can be used to characterise model accuracy and to quantify the contributions of the model dynamics and of the assimilation system. An illustration is given for the inter-comparison of surface currents in the Indian Ocean (Figure 14). Both the SURCOUF analyses and the Mercator-Ocean PSY3 outputs (Drevillon et al., 2008) capture well the seasonal variability of the main features of the surface circulation of the Indian Ocean as described by the climatology (Schott and McCreary, 2001). The combination of the ARMOR-3D and SURCOUF products can also provide very useful global 3-D velocity fields (Rio et al., 2008).

Figure 14: Surface current in the Indian Ocean from SURCOUF analysis, PSY3 forecasting system and a climatology. Top for the summer season (July-Aug 2005), Bottom for the winter season (Jan-Feb 2006). Adapted from Larnicol et al. (2006).

Bluelink altimeter and SST merged products
In June 2007, two groups of observers (one scientific, one industry) noticed that particularly strong currents were occurring on the continental slope near Ningaloo Reef, Western Australia. Long periods of clear skies gave the opportunity to view the motion of the water at high resolution (2 km) by animating HRPT AVHRR thermal imagery (http://www.marine.csiro.au/remotesensing/oceancurrents/Ningaloo2007.AVI). The general clockwise circulation of the eddy was resolved by multi-mission altimetry, but the details of the sub-mesoscale features believed to be associated with the peak velocities were not.
In May 2008, industry attention was again focussed on the current flows over the continental shelf near Ningaloo Reef (Figure 15). Coincidentally, a Surface Velocity Programme drifter transited the area of interest, providing valuable ground-truth data that was clearly consistent with the SST imagery, but not with some of the routinely-generated altimetry maps, which were clearly degraded with only two altimeters (Jason-1 and Envisat) in use. The Maximum Cross Correlation (MCC) method estimates velocities by comparing pairs of thermal images (Bowen et al., 2002). Suitably filtered, these scattered velocity estimates can be used as a gradient constraint when mapping coastal and altimetric sea-level observations. Doing this brought the geostrophic velocity into agreement with the drifter. More work is required on the filtering step, however, before we include this technique in our automatic system. Maps like Figure 15 are very popular with a wide range of mariners all around Australia. One group particularly grateful for a detailed map of the very high velocities of the East Australian Current was a crew rowing across the Tasman Sea from New Zealand in late 2007. Figure 16 shows how a fortuitous positioning of the eddies, and knowledge of this, gave them a rapid passage to Sydney. The other way that SST and altimetry are routinely used together is for the estimation of the sub-surface temperature field (see previous section). This technique works particularly well in the eddy field of the Tasman Sea, as exemplified by Figure 17. The subsurface temperature field is used operationally by the Australian Fisheries Management Authority, via a habitat preference model, to set fishing boundaries, thereby reducing by-catch of high-value species like Southern Bluefin Tuna (Hobday and Hartmann, 2006).
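The MCC method itself amounts to sliding a patch from the first thermal image over a search window in the second and keeping the shift that maximises the normalised cross-correlation; the displacement divided by the time between images gives a velocity estimate. A minimal sketch on synthetic imagery, not the operational Bluelink code:

```python
import numpy as np

def mcc_displacement(img1, img2, patch, search=3):
    """Maximum Cross Correlation: find the (dy, dx) pixel shift of a patch
    from img1 that best matches img2 within a small search window."""
    y0, x0, h, w = patch
    template = img1[y0:y0 + h, x0:x0 + w]
    tz = (template - template.mean()) / template.std()
    best, best_shift = -2.0, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = img2[y0 + dy:y0 + dy + h, x0 + dx:x0 + dx + w]
            cz = (cand - cand.mean()) / cand.std()
            r = float((tz * cz).mean())          # normalised cross-correlation
            if r > best:
                best, best_shift = r, (dy, dx)
    return best_shift, best

# Synthetic SST image and a copy "advected" by (dy=1, dx=2) pixels
rng = np.random.default_rng(1)
img1 = rng.normal(15.0, 0.5, (20, 20))
img2 = np.roll(np.roll(img1, 1, axis=0), 2, axis=1)
shift, r = mcc_displacement(img1, img2, patch=(8, 8, 6, 6))
```

Real imagery requires cloud masking and the filtering of spurious matches mentioned in the text before the vectors are usable as a mapping constraint.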

NOPP Ocean Surface Current Analyses Real-time (OSCAR)
The OSCAR project was initially established in response to a US National Oceanographic Partnership Program (NOPP) call in 2001 to develop new operational applications for satellite sea level and vector wind measurements. The project focused first on the tropical Pacific as a pilot study. It became operational in 2002, providing and displaying satellite-derived surface current data in various formats for researchers and the general public via the website (http://www.oscar.noaa.gov, Figure 18). Since then, the project and later developments and applications have been extended to the global ocean. The main partners in this endeavor include the NOAA National Environmental Satellite, Data, and Information Service (NESDIS), NASA, the University of South Florida (USF), the FSU Center for Ocean-Atmospheric Prediction Studies (COAPS) and the Naval Research Laboratory (NRL). OSCAR is the result of a long-term effort by all its partners, with prior support under the Topex/Poseidon Extended Mission (TPEM) science team, NOPP, and the present cycle of the Ocean Surface Topography Science Team (OSTST).
The approach to computing the surface currents from satellite sea level height and wind observations is a straightforward combination of quasi-steady geostrophic and wind-driven dynamics (Bonjean and Lagerloef, 2002). The geostrophic term is computed from the gradient of surface topography anomaly fields that have been gridded from along-track satellite altimeter sea level anomaly records and superimposed on a mean global ocean surface topography. Wind-driven velocity components are computed from an Ekman/Stommel formulation with variable viscosity, and the OSCAR model also includes a minor thermal wind adjustment using satellite SST data. Improvements to the Ekman/Stommel formulation are the subject of an ongoing study based on the QuikSCAT scatterometer wind data, NCEP and ECMWF wind reanalyses and the surface velocity program (SVP) drifter data. With this model, 1°×1° Eulerian vector fields have been generated every 5 days over the global ocean. A higher-resolution field with a 1/3° resolution will be released in October 2008. The tropical Pacific field (the initial focus of the project) was evaluated against the more sparsely sampled SVP drifters, moored current meters and ADCP transects. Major improvements have been implemented, such as the integration in the processing system of the AVISO sea level height altimetry product, the extension of the calculation to the global ocean, and the development of a quasi-real time analysis. The OSCAR surface currents are routinely validated globally with surface drifter data, and the satellite-derived currents are also systematically compared to mooring data, including TRITON/TAO, PIRATA and Indian Ocean (RAMA) mooring data. All of these validation results are provided from the main OSCAR website, which receives an average of 50,533 hits per month.
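The wind-driven part of such a model can be illustrated with a simple two-parameter Ekman formulation: the surface velocity is the wind stress rotated (to the right of the wind in the northern hemisphere, to the left in the southern) and scaled by an amplitude factor. The angle and amplitude below are illustrative assumptions, not the fitted OSCAR or published parameter values:

```python
import cmath, math

def ekman_surface_velocity(tau_x, tau_y, lat_deg, beta=0.3, theta_deg=-55.0):
    """Two-parameter Ekman model (illustrative beta and theta): surface
    velocity (m/s) is the wind stress (N m^-2) rotated by theta and
    scaled by beta; the rotation sense flips with the hemisphere."""
    tau = complex(tau_x, tau_y)
    theta = math.radians(theta_deg if lat_deg >= 0 else -theta_deg)
    vel = beta * tau * cmath.exp(1j * theta)
    return vel.real, vel.imag

# Pure westerly stress of 0.1 N m^-2 at 45N: flow deflected to the right
u, v = ekman_surface_velocity(0.1, 0.0, 45.0)
```

In practice beta and theta are estimated regionally by regressing drifter velocities (with the geostrophic part removed) against scatterometer wind stress.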
These analyses have contributed significantly to understanding the role of surface transports in the genesis of the 1997/98 El Niño and the 2002/03 El Niño. For several years, monthly tropical surface current analyses have routinely been provided to the NOAA Climate Diagnostics Bulletin. In addition to the ENSO-related applications, the OSCAR analyses have supported a wide range of studies, such as the role of salt transport in the tropical freshwater balance, the influence of tropical instability waves on phytoplankton blooms in the Pacific, and the mixed-layer temperature balance. In the tropical Atlantic, studies have addressed equatorial divergence and upwelling dynamics, interannual equatorial long waves and the mixed-layer salinity balance. The surface current field was also used to assess the seasonal cycle of diabatic heat storage in sub-regions of the Pacific and to correct surface heat flux estimates; the spatial and temporal context of the seasonal-to-interannual timescale currents was examined in order to study intraseasonal variability and dynamics in the tropics. The OSCAR analyses have also been used to assess the surface current fields from the oceanic data assimilation systems of NCEP and ECMWF.
Efforts under the OSCAR project have also been dedicated to collaborating with educators on a NASA-funded educational web site whose subject is ocean motion and surface currents (http://oceanmotion.org).

Conclusion and prospects
Over the past 10 years, capabilities of data assembly and processing centers have been dramatically improved. New or improved data sets and products needed by the modeling and data assimilation systems and for applications have been developed. Accuracy and timeliness of products have also been improved.
In-situ and remote sensing data are now jointly used for calibration, validation, consistency analysis and to derive merged products that provide complementary information to modelling and data assimilation products.
A series of further advances in data processing is expected to impact operational oceanography and its applications. Continuous improvements are needed so that data sets and products evolve according to requirements from modeling and data assimilation systems. As the global ocean observing system evolves, data processing systems must also be ready to incorporate the new data sets and to develop the associated quality control, validation and high level products. This holds, for example, for the development of biogeochemical in-situ sensors (e.g. oxygen, Chl-a) on Argo floats or gliders. New satellite missions for sea surface salinity (SMOS, Aquarius), gravity (GOCE) and very high resolution altimetry (SWOT) will require innovative data processing techniques. There is also a need to improve the data assembly of key data sets such as velocity data (drifters, ADCP, Argo floats) and to compare and merge them with satellite information (altimetry and scatterometry, but also surface currents derived from sea surface temperature and ocean colour data). New theoretical frameworks should also allow us to better exploit the high resolution information in satellite observations (e.g. Isern-Fontanet et al., 2006).