Optimal Planning and Sampling Predictions for Autonomous and Lagrangian Platforms and Sensors in the Northern Arabian Sea

USAGE Permission is granted to copy this article for use in teaching and research. Republication, systematic reproduction, or collective redistribution of any portion of this article by photocopy machine, reposting, or other means is permitted only with the approval of The Oceanography Society. Send all correspondence to: info@tos.org or The Oceanography Society, PO Box 1931, Rockville, MD 20849-1931, USA. Oceanography THE OFFICIAL MAGAZINE OF THE OCEANOGRAPHY SOCIETY


INTRODUCTION
Since the early days of ocean science, fixed platforms or expeditions with vessels have been used to collect varied observations (e.g., Stommel, 1989;Dickey, 2003;Schofield et al., 2010).In the past decade, adventurous research expeditions have been augmented with varied robotics-based platforms and sensors (e.g., Rudnick and Perry, 2003;Bellingham and Rajan, 2007;Nicholson and Healey, 2008).These include vehicles with a relative speed, such as propelled autonomous underwater vehicles (AUVs), underwater gliders, wave gliders, solar-powered vehicles, and surface craft.The experiments also involve vehicles with no or limited relative speeds such as autonomous drifters, floats, hybrid profilers, and semi-drifting surface craft.
A critical benefit of robotics platforms is that they reduce the undersampling of ocean surveys and augment satellite data.However, they are significantly affected by dynamic ocean motions, such as currents and waves.Their ocean measurements are thus a mixture of Eulerian (fixed or not affected by currents) and Lagrangian (current-following) observations.These facts are crystallized in the name "autonomous and Lagrangian platforms and sensors (ALPS), " introduced by pioneering researchers (e.g., Rudnick and Perry, 2003;Dickey et al., 2008, and references therein).To best account for environmental effects, ALPS systems should become "expert systems" (e.g., Jackson, 1998) that plan and optimize their motions, using and integrating predictive models, control algorithms, uncertainty estimates, and data assimilation.Such integration of disciplines was initiated by the Autonomous Ocean Sampling Network (e.g., Curtin et al., 1993;Rudnick et al., 2004;Leonard et al., 2007Leonard et al., , 2010;;Lermusiaux, 2007;Ramp et al., 2009;Curtin and Bellingham, 2009), and it remains an area of active research.
For a reproducible scientific sampling approach, predicting where, when, and what to sample (i.e., optimal ocean sampling), and predicting which sampling locations can be reached and how to best reach them (i.e., reachability and optimal path planning), should be done quantitatively and using fundamental principles.Such principled autonomous sampling then becomes part of the emerging discipline of the "Science of Autonomy" (Steinberg, 2006;Lermusiaux et al., 2016).In simple terms, the Science of Autonomy is the systematic development of fundamental knowledge and reproducible methods about autonomy.This article reviews concepts and illustrates some of our recent progress using this approach, based on real vehicles operating in a large open-ocean basin, within the context of the Northern Arabian Sea Circulationautonomous research (NASCar) program.This US Office of Naval Researchfunded program employs a variety of ALPS systems to investigate the dynamics of the northern Arabian Sea (Centurioni et al., 2017, in this issue).One of the goals of our NASCar contribution is to apply our theory and schemes for optimal path planning and optimal ocean sampling with swarms of autonomous vehicles.The collaborative applications that we describe are results toward this goal.
Effective predictive models of the relevant ocean dynamics and efficient uncertainty prediction and data assimilation schemes are crucial for quantitative optimal path planning and sampling.It is the advances in these disciplines that allow the present results, especially the progress in multiresolution numerical ocean modeling (Robinson et al., 2003;Deleersnijder and Lermusiaux, 2008;Deleersnijder et al., 2010;Haley and Lermusiaux, 2010;Cushman-Roisin and Beckers, 2011;Ringler et al., 2013;Haley et al., 2015;Burchard et al., 2017) and in ensemble uncertainty prediction and Bayesian data assimilation (Lermusiaux and Robinson, 1999;Lermusiaux et al., 2006a,b;Lermusiaux, 2006;Bocquet et al., 2010;Särkkä, 2013;Sondergaard and Lermusiaux, 2013a;Reich and Cotter, 2015;Lolla and Lermusiaux, 2017a).These advances will be employed next.We refer to the articles cited for the necessary details.
In this article, we first provide definitions and review fundamental equation-based results for reachability, path planning, and adaptive sampling.We then showcase results of the February-April 2017 real-time forecasting and ABSTRACT.Where, when, and what to sample, and how to optimally reach the sampling locations, are critical questions to be answered by autonomous and Lagrangian platforms and sensors.For a reproducible scientific sampling approach, answers should be quantitative and provided using fundamental principles.This article reviews concepts and recent progress toward this principled approach, focusing on reachability, path planning, and adaptive sampling, and presents results of a real-time forecasting and planning experiment completed during February-April 2017 for the Northern Arabian Sea Circulation-autonomous research program.The predictive skill, layered fields, and uncertainty estimates obtained using the MIT MSEAS multi-resolution ensemble ocean modeling system are first studied.With such inputs, deterministic and probabilistic three-dimensional reachability forecasts issued daily for gliders and floats are then showcased and validated.Finally, a Bayesian adaptive sampling framework is shown to forecast in real time the observations that are most informative for estimating classic ocean fields and also secondary variables such as Lagrangian coherent structures.
planning experiment in the interior northern Arabian Sea during which we predicted (1) high-resolution ocean fields and their probability; (2) reachable sets, reachability fronts, and time-optimal paths for underwater gliders and floats; and (3) the uncertainty of such reachability fields and optimal paths.Finally, we illustrate the use of our probabilistic modeling and Bayesian adaptive sampling theory to predict the locations that are most informative for estimating ocean fields or revealing coherent structures.

PLANNING FUNDAMENTALS FOR EXPERT ALPS SYSTEMS
We now review theories and methods for reachability, path planning, and adaptive sampling for expert ALPS systems operating in dynamic ocean currents.As schematized in Figure 1, quantitative planning for ALPS takes into account ocean currents and platform constraints (e.g., maximum speed, battery) and involves accurate predictions of the locations or regions that can be reached by the autonomous platforms (i.e., the reachable set), the optimal launch and recovery, the optimal observations to be collected to achieve the experiment's scientific goals, and the optimal paths for ocean vehicles to achieve these goals.Some key questions are: (1) What are the locations that can be reached by an autonomous platform (i.e., reachability)?(2) What are the most informative locations and times at which measurements should be made to achieve the scientific goals (i.e., adaptive sampling)?(3) What are the optimal paths to reach these locations in desired times and in a safe fashion (i.e., path planning)?(4) Are the equations and methods answering these questions practical for real-time expert ALPS systems?In what follows, we focus on our own progress toward answering these questions by developing and using fundamental principles.Our quantitative planning utilizes prior knowledge in the form of simple fundamental relations, conservation laws, and mathematical models.This knowledge is combined with computational and decision-making capabilities, with the aim of obtaining expert systems (Jackson, 1998;Russell and Norvig, 2009).For reviews on marine path planning and adaptive sampling, we refer readers to Leonard et al. (2007), Lermusiaux (2007), Roy et al., (2007), Paley et al. (2008), Lermusiaux et al. (2016, and in press), and Subramani et al. (2017), and the references cited therein.Other results that are useful for marine autonomy but are not covered here include nested autonomy (e.g., Schmidt et al., 2016); underwater navigation, communication, and motion control (e.g., Leonard and Bahr, 2016); and ocean sensing and underwater networks (e.g., Curtin and Bellingham, 2009;Venkatesan et al., 2017).

Path Planning
The prediction of paths along which specific criteria (e.g., time, energy, data collected, and/or safety) are optimized is called path planning.Traditionally, path planning has been developed for robots in complex but mostly static environments.Most planning methods are based on graphs and dynamic programming, fast marching schemes, wave front expansions, potential fields, nonlinear programming, nonlinear optimization, evolutionary algorithms, case-based reasoning, or dynamics-based approaches.Some of these methods consider uncertainties in the ocean estimates, usually by computing the deterministic plan for an ensemble of realizations.However, because ocean currents can be both highly dynamic and strong when compared to ALPS nominal speeds, many methods provide inaccurate solutions or are too expensive for real-time use with ALPS systems.Consequently, we developed theories and schemes that provide exact and practical solutions.They are reviewed next and will be exemplified later in this article (Applications section) by novel applications in the NASCar region.

REACHABILITY
As schematized in Figure 1, we studied the problem of path planning through the fundamental concept of reachability (Lolla et al., 2014a,b, and references therein).The set of locations that can be reached is the reachable set, and its boundary is called the reachability front.The net motion of autonomous vehicles is the sum of the forward thrust and advection by dynamic currents.Based on these first principles, we developed a level-set partial differential equation (PDE) that governs the reachability front of a vehicle moving in strong and dynamic flows.T i * τ Reachability Front at time T i * FIGURE 1. Schematic of reachability, path planning, and adaptive sampling.Reachability sets are predicted for a vehicle operating at nominal speed within dynamic currents.The total vehicle velocity is the vector sum of currents at that location and the nominal velocity.Optimal most informative measurements are taken at each optimal position x i * and time T i *.Predictions can be done in two or three dimensions, and for deterministic or stochastic currents.For stochastic currents, we employ ensembles or efficient dynamically orthogonal equations to evaluate distributions of reachable sets, fronts, and optimal paths.Adaptive sampling aims to predict the location x i * and time T i * of the observations to be collected over the sampling period [0,τ] that are most informative about the to-be-estimated/verification variables.Here, only a single future fixed verification time is sketched, but the estimation time(s) can be in the past or cover a period of time.

OPTIMAL LAUNCH, PICK-UP, AND INTERCEPTION
To save time and costs in ocean sampling, the launch, pick-up, and/or interception of autonomous platforms should be efficient.The use of dynamic reachable sets (Figure 1) is directly applicable to these problems.They provide exact solutions, and their utilization has been exemplified for time-optimal interceptions by surface craft, including natural and vessel wave effects (Mirabito et al., in press) and for pursuit-evasion problems (Sun et al., 2017a,b).

TIME-AND ENERGY-OPTIMAL PATHS
The first time at which the reachability front reaches the target end point or optimal sampling location is the fastest travel time (Figure 1).The time-optimal headings are normal to the reachability fronts.The corresponding time-optimal path can then be extracted from the dynamic reachability front prediction by solving a particle backtracking ordinary differential equation (ODE; Lolla et al., 2014b).These equations were applied to gliders and AUVs for reachability and path planning, and coordinated path planning of swarms, in realistic current simulations (Lolla et al., 2014a(Lolla et al., , 2015)).They were theoretically extended to anisotropic motions and to fully three-dimensional paths, including planning for floats or other vehicles with constrained relative motions.They were also used successfully with real AUVs at sea (Subramani et al., in press; Edwards et al., in press) and extended to uncertain stochastic currents (Wei, 2015) and to energy-optimal path planning (Subramani and Lermusiaux, 2016;Subramani et al., 2017).

COMPUTATIONAL COSTS
The computational time for all above applications is much shorter than the mission time.For example, for reachability and time-optimal planning, the computations take seconds to minutes to plan missions that last days to weeks, and the cost only grows linearly with the spatial resolution.Even for large stochastic energy-optimal planning run on a single CPU, the computational time is less than the time required to integrate modern numerical ocean modeling systems.Optimal paths can thus be recomputed multiple times as new observations arrive, hence allowing feedback and onboard routing.

Adaptive Sampling
Ocean science observation campaigns are designed based on hypotheses or questions to be answered.As schematized in Figure 1, observations are to be collected over a period of time, often referred to as the sampling period.The observational data should provide the most information about the chosen hypotheses or questions, at a fixed time or over a period of time.These data can be utilized on their own, but are in general combined with prior knowledge, including conservation laws and model equations.Observations are then utilized to improve state variable estimates or field predictions, to infer model parameters, to test model formulations, or to estimate secondary variables such as energy, enstrophy, spiciness, or coherent structures of the flow (Figure 1).Given that platforms and sensors at our disposal are limited, the goal of adaptive sampling is thus defined here as guiding the ALPS system optimally so as to collect the observations that are most informative about the scientific objectives.
Practical adaptive sampling is directly linked to path planning: adaptive sampling requires not only identifying the most informative data types, locations, and times but also planning feasible optimal paths.This latter path planning serves two purposes.First, it determines the reachable regions of the platforms that limit the set of possible candidate measurements.Second, after the most informative data are determined, path planning guides the platforms so that they navigate along the optimal paths to collect the desired data.Because each measurement will change both our ocean prediction and the expected information content in later candidate observations, the planned paths and the computed information content should be updated as data are collected.
Previous results on ocean adaptive sampling include methods that involve some linearizations of the system dynamics, such as in the use of adjoint equations for sensitivity analysis or of optimal perturbations and singular vectors.Other methods with limited or no linearizations utilize breeding, ensemble subspaces, ensemble transforms, mixed-integer programming, potential functions, or genetic algorithms.Reviews are provided in Lermusiaux (2007) and Lermusiaux et al. (in press).Note that in their present usage, these methods neglect the non-Gaussian properties of the dynamics.
We recently formulated a novel and rigorous framework by combining uncertainty quantification, Bayesian data assimilation, and information theory.For uncertainty predictions, we employ the variance-optimal reduced-order stochastic dynamically orthogonal (DO) equations (Sapsis and Lermusiaux, 2009;Ueckermann et al., 2013;Feppon and Lermusiaux, in press).To exploit the non-Gaussian statistics, we utilize the Gaussian Mixture Model (GMM)-DO filter (Sondergaard and Lermusiaux, 2013a,b) and smoother (Lolla and Lermusiaux, 2017a,b).To rigorously measure the information content of varied observations in a principled way, we employ mutual information (Cover and Thomas, 1991).Mutual information quantifies the amount of information a variable has about another variable.For reachability computation and optimal navigation, we utilize the previously mentioned path planning equations.Finally, we use dynamic programming to solve the information content optimization problem exactly and so predict the optimal future sampling strategies (Lolla, 2016).

APPLICATIONS
We now illustrate some of the capabilities of the theory and methods reviewed in the last section for planning ALPS operations in the interior northern Arabian Sea during February-April 2017, within the context of the NASCar program.We first describe results on predictive skill, dynamics, and uncertainty obtained using our real-time multiresolution ensemble ocean modeling.The corresponding forecast fields and their probabilities were then input to our path planning and adaptive sampling studies (Figure 2).For the path planning, we exemplify our deterministic and probabilistic three-dimensional reachability forecasts issued daily for gliders and floats as they operated in the region (see later section on Probabilistic Eulerian-Lagrangian Reachability for Gliders and Floats), using the level-set PDEs introduced in the Path Planning section.For the adaptive sampling, we showcase how computing machines using our Bayesian schemes can forecast in real time the observations that are most informative for estimating classic ocean fields and also secondary variables such as Lagrangian coherent structures (Bayesian Adaptive Sampling section).

Multiresolution Ensemble Ocean Forecasting and Dynamics
The MIT Multidisciplinary Simulation Estimation and Assimilation System (MSEAS) was utilized for three months for real-time multiresolution simulations (Lermusiaux et al., 2017).Capabilities employed involved implicit two-way nesting for realistic tidal-to-mesoscale modeling with hydrostatic primitive equations (PEs) and a nonlinear free surface (Haley and Lermusiaux, 2010;Leslie et al., 2010;Haley et al., 2015) and Error Subspace Statistical Estimation (ESSE) for stochastic uncertainty predictions (Lermusiaux, 1999;Lermusiaux et al., 2002).

MODELING SETUP
The NASCar modeling domain (Figure 2a) in the northern Arabian Sea had a 1/60° horizontal resolution and 70 vertical levels with optimized level depths (e.g., higher resolution near the surface or large vertical derivatives).The bathymetry was obtained from the 15 arc-seconds SRMT15 data (Smith and Sandwell, 1997).Initial conditions were downscaled from 1/12° HYCOM (Hybrid Coordinate Ocean Model) analyses (Cummings and Smedstad, 2013) via optimization for our higher-resolution coastlines and bathymetry (Haley et al., 2015).Tidal forcing was computed from the highresolution TPXO8-Atlas from Oregon State University (Egbert andErofeeva, 2002, 2013), again with adjustments to our higher-resolution geometry and quadratic bottom drag.For atmospheric forcing, we employed the wind stress and surface freshwater flux from 1/4° NCEP GFS (National Centers for Environmental Prediction Global Forecast System) one-hourly forecasts (Environmental Modeling Center, 2003) and the net heat flux from 1/2° NAVGEM (Navy Global Environmental Model) three-hourly forecasts (Hogan et al., 2014).

DYNAMICS AND FORECAST SKILL
The MSEAS PE modeling system and its physical and numerical parameters were calibrated for the regional dynamics using more than 600 deterministic simulations and data-model comparisons.Once calibrated, three-to four-day forecasts were issued daily for three months.As we will illustrate, we found that the higher resolution and tides in our calibrated system resolved additional processes and improved the skill of lowerresolution ocean fields.This result is not direct because increasing resolution and adding dynamics can easily increase uncertainties and reduce the skill of deterministic forecasts.
One such example of additional dynamics is the formation of layering in upper-ocean salinity in the form of interleaving subsurface layers of high and low salinity, as observed by Seaglider SG133 during March 4-7 (Figure 3c).This layering was largely absent in the initial conditions from HYCOM (Figure 3a); however, it was created within the higher-resolution MSEAS PE as the forecast simulation progressed (Figure 3b).Similar findings were also obtained by comparisons of model forecasts to another Seaglider (SG137; not shown in figure).This indicates that the MSEAS-PE additional dynamics and increased resolution (1/60° and 70 optimized levels) were likely needed to develop and maintain the complex layered features observed in the region.The addition of tidal forcing also generated internal tides and waves in the western part of the domain at Socotra Island, as well as on nearby ridges (not shown).
To evaluate the effect of model resolution on the forecast skill, Figure 4a-c compares the salinity profiles from the initial conditions for March 11 (Figure 4a) and the MSEAS PE three-day forecast for March 14 (without data assimilation; Figure 4b) against in situ Argo profiles collected during March 14 in the region (Figure 4c).With the increased MSEAS PE resolution, the salinity forecast bias is reduced by 50% and the root mean square error (RMSE) by 25% when compared to the lower-resolution initial conditions.A significant improvement in the salinity is seen in the 100-300 m depth range (driven in part by the formation of the layering).Overall, forecasts showed that high resolution in the model was needed to develop and maintain such layered features.The assimilation of observations further improved the higher-resolution forecasts.Figure 4d-f shows this impact of assimilating the Argo profile data for one week prior to issuing forecasts.Specifically, we compare two higherresolution MSEAS PE three-day salinity forecasts for March 10, one without  (Figure 4d) and one with (Figure 4e) Argo data assimilation, against the same independent verification Argo profile data of March 10.In the simulation with data assimilation (Figure 4e), we assimilated the Argo profiles from March 1-7 prior to the start of the forecast period.
The assimilation corrected salinity features in the upper ocean and other ocean fields during that March 1-7 period, which led to a reduction of both the bias and the RMSE in the salinity forecast for March 10.

ESSE ENSEMBLE FOR UNCERTAINTY FORECASTING
Once the deterministic modeling system was calibrated, uncertainties were forecast.The ESSE uncertainty forecasts consisted of an ensemble of numerical simulations that were each set up by perturbing the initial and boundary conditions in accord with their dominant uncertainties, and by adding stochastic forcing to the external forcing (i.e., uncertainties in the times/phases and amplitudes of the atmospheric fluxes and tides) and to the deterministic equations so as to represent modeling errors occurring during the time and space integrations of the MSEAS PE ocean modeling system.The result is an ensemble of forecasts from which statistics can be computed, including not only classic ensemble standard deviations but also histograms or probability density functions.In the present NASCar experiment, 50-member ensemble forecasts were issued daily using ESSE from mid-March to April.To initialize multiscale ensembles (Lermusiaux et al., 2000;Lermusiaux, 2002Lermusiaux, , 2007)), historical CTD synoptic data for January, February, and March were used to create joint vertical EOFs (empirical orthogonal functions) for temperature (T) and salinity (S).To construct the three-dimensional T and S perturbations, the joint EOFs were combined with an eigendecomposition of a horizontal correlation matrix defined by a Mexican hat correlation function of 25 km decay-scale and 75 km zero-crossing (these numbers were obtained from the calibration PE runs and the literature).These T and S perturbations were used to generate velocity perturbations in accordance with close-to-geostrophic PE balance.The MSEAS-PE model was integrated for four days for each ensemble member's initial conditions, including also novel stochastic error models for tides and atmospheric fields (Lermusiaux et al., 2017).
Figure 5 illustrates one of these ESSE uncertainty forecasts for March 27, at two days into the four-day forecast.The forecast shows significant uncertainties: at the surface, the average ensemble standard deviation forecasts for velocity, salinity, and temperature are about 18 cm s -1 , 0.14,  As shown in Figure 5, at 100 m depth, these same average ensemble standard deviation forecasts are about 13 cm s -1 , 0.2, and 0.6°C, respectively.The uncertainties in T and S follow distinct features at different depths (due to layering) and are less uniform in space at 100 m, while the density (not shown) and velocity uncertainties at 100 m are much more uniform over the domain.In accord with the initialization, in the upper layers, this reflects an uncertain multiscale eddy field that is not far from geostrophic balance with density uncertainties, while T and S variabilities mostly compensate each other.The T and S uncertainties were higher on the eastern and southeastern sides (Figure 5a) of the modeling domain because of the presence of a tighter thermocline/halocline to the east.This is in part because the western side experienced higher vertical mixing due to past strong atmospheric forcing and internal tides.The uncertainty due to the multiscale eddy field dynamics (Figure 5b) is clearly more uniform, with standard deviations generally between 11 cm s -1 and 16 cm s -1 everywhere.The formation of distinct quasi-horizontal layers (e.g., isopycnals) led to a demarcation in the uncertainty contained in the upper layers (e.g., four layers in the upper 200 m in Figure 5c) with significant salinity uncertainty visible within the main halocline from about 150 m to 400 m.All of this is logical because unsampled variability is a strong contributor to uncertainty, especially when the variability is growing in time and space.As the simulations progress to the four-day forecast, advective stirring, instabilities, and vertical mixing spread the salinity uncertainty between the upper layers in about 60% of the domain (not shown).There was also significant growth in the uncertainty near the base of the main pycnocline (200-350 m) and near Socotra Island where internal tides were generated (not shown).Below 200 m depth, the T and S variations were compensated, leading to relatively smaller potential density uncertainties (not shown).

GLIDER REACHABILITY FORECASTS
To assist the Seaglider operations, daily four-day glider reachability forecasts for SG133 and SG137 were issued using our ocean forecasts.Gliders were piloted in a yo-yo pattern (1,000 m dives in four to five hours) and their reachability was computed accordingly (e.g., Lolla et al., 2014a).Figure 2b illustrates such a glider reachability forecast overlaid on a 2 m mesoscale vorticity forecast.In Figure 6, we study four such forecasts.Columns 1 and 2 correspond to Seaglider SG133, and columns 3 and 4 to SG137.
In each Figure 6 panel, the actual location of the glider at the beginning of the forecast window is shown as a gray point.The reachability fronts at every 12 hours are shown by black contour lines overlaid on a forecast of the 2 m vorticity (columns 1 and 3), and depth-averaged currents (columns 2 and 4) are at the midpoint of the forecast time window.As defined previously, the area within the reachability front can be visited by the gliders while the area outside cannot.
Overall, the growth and shape of the reachability fronts are set by the local

SG137
Probabilistic Eulerian-Lagrangian Reachability for Gliders and Floats During February-April 2017, we forecasted the reachability fronts of Seagliders SG133 and SG137 and the dynamic probability of reachable points of ALAMO (Air-Launched Autonomous Micro-Observer) profiling float 9103.The Seagliders were navigating in a bow-tie pattern to track the ALAMO float (Centurioni et al., 2017, in this issue).The Seagliders dove to 1,000 m depth every four to five hours, and the float was profiling to roughly 300 m depth every two hours.circulation and by the glider's propulsion (itself affected by buoyancy fields).For the period March 5 to March 9 (Figure 6, row 1), the integrated flows near both gliders are forecast to be predominantly to the south, and hence the reachability front grows farther to the south (where currents aid motion) than to the north (where currents oppose motion).The forecast mesoscale vorticity field (Figure 6a1,b1) is positive to the east of the gliders and negative to the west.As a result, the reachability forecasts develop a kidney beanshaped pattern, with a bit more growth on the northeast and northwest extremes compared to the immediate north.For the period March 14 to 18 (Figure 6, row 2), the forecast flows around SG133 (Figure 6c2) are southwest in the north, east, and southeast, westward in the west, and northwest in the south.The maximum flow is to the east with magnitude of ~15 cm s -1 .The flows around SG137 (Figure 6d2) are similar except they are nearly zero in the southeast.The growth of the reachability fronts is aided to the east but opposition in other directions is not as strong as from March 5 to March 9.The local shape of the reachability fronts is governed by the mesoscale flows and vorticity.For the periods March 22 to March 26 (Figure 6, row 3) and March 27 to March 31 (Figure 6, row 4), the flows are to the southeast, with stronger flows in the vicinity of the gliders for the latter period when compared to the former.The effect of the anticyclonic eddy in setting the shape of the reachability front during the March 27 to March 31 period is clearly visible in the last two forecasts.

ALAMO FLOAT POSITION PROBABILISTIC FORECASTS
Daily forecasts of four-day future float positions were also issued, with good accuracy.To account for uncertainties in the initial float positions and initial flows around them, we first advected a dynamic set of possible reachable positions, starting from a small circle around the latest known position of each float.This reachable set then estimates the region of likely future locations (i.e., the regions outside of which the likelihood of finding the float is small).It is conceptually different than the reachable set of a glider because a float is assumed horizontally passive while a glider is not; the governing equation is nonetheless the same, up to the propulsion term.These sets were forecast from the latest known position of each float and issued in 12-hour intervals.during the forecast horizon.
On March 18 (Figure 7a1), the float is in a region of low velocity with east-west shears from adjacent stronger currents.The forecast reachable set initially advects westward but then elongates (due to the shear) both eastward and westward.This pattern was consistent with the actual float, which initially moved westward but then reversed and advected eastward.On March 21 (Figure 7a2), the float is positioned between weak and strong southeast flows.Hence, the eastern edge of the reachable set grows farther south than the western edge.In the above two sets, the shear in the forecast flows stretches the reachable set, thereby elongating the initial circle to ellipse shapes.The real float track (Figure 7a4,a5) is consistent with forecasts and improves with the strength of the flow near the float.In the March 26 forecast (Figure 7a3), the float is in the strong flow proper.It is thus predicted to move faster, which again agrees with the real float tracks (Figure 7a6).
The skill of the float's reachable set forecast was mostly governed and limited by the accuracy of the larger timedependent mesoscale flow.To account for its complete dynamic uncertainty, we employed ESSE ensembles to predict the dynamic set of reachable positions for each ensemble member.A rigorous probability field of reachable float positions is then obtained through normalizing by area the union of each ensemble's set of reachable positions.Figure 7b1-b4 depicts such a probabilistic reachable set field forecast for the ALAMO float.It was issued on March 26, 2017, 00Z for the four-day period March 26-30.The four panels show the forecasts every 24 hours from day 1 to day 4 of the forecast (March 27 00Z to March 30 00Z).The color represents the quantitative probability that a location on the map is in the reachable set for the float.Overlaid on the probability map are 0-300 m averaged current vectors from the central MSEAS PE forecast and the real track of the float from the March 26 00Z to the time of the forecast of each panel.This Lagrangian probabilistic reachability planning system shows predictive skill for long periods of time.

Bayesian Adaptive Sampling
We extended the Bayesian GMM-DO data assimilation for simultaneous estimation of Eulerian variables (e.g., velocity, temperature, salinity) and Lagrangian variables (e.g., drifter/float positions) and features derived from these variables (e.g., waves, fronts, eddies, and other coherent structures).Lagrangian coherent structures, or the most influential and persistent material surfaces in a flow (Haller, 2015), have gained prominence as a tool for studying fluid flow transport and particularly advection of hazardous material in environmental flows.A common approach to identifying Lagrangian coherent structures relies on finite-time Lyapunov exponent (FTLE).
In Figure 8, we illustrate how mutual information fields forecasts can be used to identify the locations for observing different types of data that would be most informative about the velocity field or Lagrangian coherent structures.A particular realization of the ensemble forecast of the forward-time FTLE field over an interval of three days (March 27 to March 30, 2017) is shown (Figure 8a).The ridges of the forward-time FTLE field correspond to repelling Lagrangian coherent structures (i.e., material lines from which parcel trajectories separate the most).The white box marks the region for which the zoomed-in FTLE field is shown on the right.Winds and upper-ocean dynamics lead to rapid variability in the surface velocity field (not shown).When the velocity field is rapidly varying, the features in the FTLE field cannot be identified with the velocity field at one particular time instance.In Figure 8c,d we show forecast mutual information fields between candidate observations of salinity anywhere in the small domain (white box in Figure 8a), and the verification variable, which is a field defined over the whole small domain.The mutual information field between salinity and the scalar field of zonal velocity over the small domain indicates that the most informative salinity data locations are around 12.6°N, 58.2°E.The mutual information field between salinity and the FTLE field over the small domain (from which coherent structures can be estimated) indicates the most informative salinity data locations are around 12.5°N, 58.7°E.The informative locations in this field lie on the edge of a high-salinity intrusion (not shown here).We also note the differences in the locations of the most informative data in the two fields, confirming that observation data locations that are highly informative for one verification variable may not be so for another.
In Figure 8e-g we show forecast mutual information fields between velocity and the forward-time FTLE field.The mutual information field between zonal velocity and forward-time FTLE field indicates that the most informative locations are around 12.5°N, 58.8°E.In contrast, when meridional velocity is measured, the most informative data locations lie near 11.6°N, 59.2°E.Also, there are more candidate observation locations that are highly informative about coherent structures when measuring either velocity component than when measuring salinity.Furthermore, if we were to observe both zonal and meridional velocity, the mutual information about the coherent structures is logically higher than when we measure only one of the components.This full velocity mutual information is, however, maximized when the observation locations are around 12.3°N, 58.8°E and 11.7°N, 59.6°E.

CONCLUSIONS
Recent fundamental theories and methods for optimal planning and sampling predictions were reviewed and illustrated in real-time forecasting conditions in the northern Arabian Sea.The emphasis was on the concepts of reachability, path planning, and adaptive sampling.Reachability and path planning equations provide exact and practical solutions for time and energy optimal solutions, naturally including coordination, obstacle avoidance, anisotropic motions, and uncertain flows.A novel and rigorous framework for adaptive sampling was also outlined, integrating uncertainty quantification, Bayesian data assimilation, and information theory.
For the real-time forecasting and planning experiment completed during February-April 2017 in the northern Arabian Sea, the predictive skill, layered fields, and uncertainty estimates obtained using our MIT MSEAS multiresolution ensemble ocean modeling system were studied and validated against independent in situ data.The higher resolution and inclusion of tides in our calibrated system resolved additional processes and improved the skill of lower-resolution ocean fields.The ensemble ocean forecasts enabled probabilistic prediction of the reachable sets, reachability fronts, and time-optimal paths of underwater gliders and floats.These reachable sets and paths were verified by comparison to subsequent vehicle tracks, showing predictive skill for several weeks.The Bayesian adaptive sampling framework was shown to be capable of forecasting the information content of specific observation about different state or secondary variable fields.Specifically, it determined the future surface salinity and velocity observations that would be most informative about the surface velocity field and surface FTLE field (from which Lagrangian coherent structures are estimated).
We expect that the transfer of humans' scientific predictive knowledge and decision-making capabilities to autonomous ocean sampling and computing machines will continue to increase, leading to expert ALPS systems and collaborative human-machine networks for ocean science.

FIGURE 2 .FIGURE 3 .
FIGURE 2. (a) Our multiresolution computational domains in the northern Indian Ocean.The domain bounded by the green outer box (53.66°E-69.18°E;9.38°N-17.83°N)is the main domain used for realtime forecasts to support Northern Arabian Sea Circulation-autonomous research (NASCar) operations.(b) Three-day forecast of the vorticity field at 2 m depth, issued by the MIT Multidisciplinary Simulation Estimation and Assimilation System primitive-equation (MSEAS PE) modeling system for March 7, 2017, 00Z (middle time for the March 5-9, 2017 forecast period for Seaglider SG137).A smaller moving domain (green inner box) surrounds the initial position of Seaglider SG137 on March 5, 2017, 00Z.The black contours are four days of reachability front forecasts, at 12-hour intervals (12 h to 96 h forecasts).The reachability front is the boundary of the largest set (the reachability set) that the glider can reach within that duration.It was forecast by integrating its exact governing level-set partial differential equation.

FIGURE 4 .
FIGURE 4. Effects of resolution and assimilation on the MSEAS-PE forecasts.(a-c) Impact of higher spatial resolution.Salinity profiles from the initial conditions for March 11 (a) and from the three-day higher-resolution forecast for March 14 (b) are compared to Argo profiles of March 14 (data not assimilated).Locations of the March 14 Argo profiles overlaid on the surface salinity forecast for March 14 00Z are shown in (c).Higher resolution enhanced the forecast skill.(d-f) Impact of data assimilation (DA).Salinity profiles from two three-day forecasts without (d) and with (e) DA compared to Argo profiles collected on March 10.The simulation with DA assimilates March 1-7 Argo data, prior to starting the forecast.This DA corrects the salinity features in the upper layers.Locations of the March 1-7 and March 10 Argo profiles overlaid on the surface salinity forecast for March 10 00Z are again shown in (f).For comparison with initial conditions, all profiles of the considered day are compared with the initial conditions regardless of the observation time.For forecast comparison, the Argo profiles are compared with the forecast profile at the data time and location.

FIGURE 5 .
FIGURE 5. Real-time two-day Error Subspace Statistical Estimation (ESSE) ensemble forecast for March 27, 2017, from an ensemble of 50 members.(a) Forecast standard deviation in temperature at 100 m depth.At that depth, temperature (T) and salinity (S) uncertainties are higher on the eastern-southeastern sides of the Arabian Sea because of the presence of a tighter thermocline/ halocline to the east.This is in part because the western side experiences higher vertical mixing due to past strong atmospheric forcing and recurring internal tides.(b) Magnitude of the forecast standard deviation in velocity at 100 m depth.(c) East-west vertical section of forecast standard deviation in salinity.The formation of distinct vertical layers leads to a demarcation in the uncertainty contained in the layers.

FIGURE 6 .
FIGURE 6. Reachability front forecasts for Seagliders SG133 (columns 1 and 2) and SG137 (columns 3 and 4) for March 5-9 (1 st row), March 14-18 (2 nd row), March 22-26 (3 rd row), and March 27-31 (4 th row) 2017.The actual locations of the gliders at the beginnings of the forecast windows are shown as gray points.The reachability fronts forecast at 12-hour intervals (12 h-96 h) are shown by black contour lines.The 48-hour forecasts of 2 m mesoscale vorticity (columns 1 and 3) and the 0-1,000 m depth-averaged flow magnitude overlaid with velocity vectors (columns 2 and 4) are shown in the backgrounds.
Figure 7a1-a6 shows three examples of such forecasts issued on March 18, 21, and 26.They are overlaid on the 0-300 m averaged velocity forecast magnitude and vectors, again at the midpoint of the forecast time window.A plot of the real track of the float accompanies each forecast

FIGURE 7 .
FIGURE 7. ALAMO float position probabilistic forecasts.(a1-a3) Reachability sets forecast from the latest known positions of each float and issued in 12-hour intervals, overlaid on the 0-300 m averaged velocity forecast magnitude and vectors, at the midpoint of the forecast time window.(a4-a6) Plots of the real float tracks during the forecast horizon.The real tracks (a4-a6) overall confirm their corresponding forecasts (a1-a3).(b1-b4) Probabilistic reachable set field forecast for the ALAMO float issued on March 26, 2017, 00Z for the four-day period March 26-30.Overlaid on the probability map are 0-300 m averaged current vectors from the central MSEAS PE forecast and the real track of the float from March 26 00Z to the time of the forecast of each panel.The Lagrangian probabilistic reachability planning system shows predictive skill for long periods of time.

FIGURE 8 .
FIGURE 8. Adaptive sampling predictions for velocity or coherent structure fields.(a-b) Forecast realization of the forward-time finite-time Lyapunov exponent (FTLE) field (a) and of the same FTLE field but zoomed in a small domain (b), marked by the white box in (a).(c-g)Forecast mutual information fields within this small domain, between the observation variable at any location in the domain and the verification variable which is here always a field defined over that small domain.The five mutual information fields forecasts are between each of the following pairs of observation and verification variables: (c) salinity and zonal velocity field, (d) salinity and forward-time FTLE field, (e) zonal velocity and forward-time FTLE field, (f) meridional velocity and forward-time FTLE field, and (g) velocity (both components) and forward-time FTLE field.These mutual information fields forecast the most informative observation locations for estimating the verification variable over the small domain.Note that the color bars of panels (c-g) differ.