The main purpose of centralising oceanographic data at a data centre is to prevent the loss of these data. The loss of data relating to the marine environment is a serious issue for two main reasons:
First, marine data acquisition is extremely costly: scientific cruises on research vessels and the use of satellites require considerable budgets. Overall, the budget devoted to compiling data in databases represents less than 2% (source: MRAG investigation) of the total observation cost (including vessel costs) at Ifremer.
Without permanent archiving, 30% of the data acquired are lost within 10 years of their collection. This is the notion of "intangible heritage".
The second reason is a constraint related to the variability of the marine environment. Marine data are not reproducible. It is impossible to turn back time to record the physical parameters of the water column at a specific location in the world at a given moment in time. Unlike data obtained from laboratory experiments, marine data can only be obtained once.
The quantities and diversity of marine data are continually rising; we must adapt to new observation systems.
There are two different forms of data collection: automatic collection and manual collection.
Automatic collection
Data that arrive directly at SISMER without human intervention are collected automatically. These incoming data flows are generally received by email or deposited on the network drive. They include, in particular, data from automated observatories (satellites, ARGO floats, etc.), various fisheries data flows and ship data.
Automatic collection ensures operational monitoring and reporting for some of these data.
CTD and TSG data transmitted in real time are in low resolution.
Manual collection
While the majority of data arrive at SISMER automatically and at regular intervals, this is not the case for all data. Some data have to be retrieved from other systems (e.g. Météo-France), and sometimes scientists have to be contacted to obtain their data (physics data bank, research cruises, etc.).
CTD and TSG data transmitted in deferred time have a higher resolution.
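The difference in resolution between real-time and deferred-time transmissions can be illustrated with a minimal sketch. It assumes, purely for illustration, that the same profile may exist in both modes and that the deferred-time version should then be preferred; the `Profile` record, its fields, the `best_available` helper and the sample values are all hypothetical and do not describe SISMER's actual processing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    platform: str   # hypothetical platform identifier (e.g. a ship code)
    station: int    # hypothetical station number
    mode: str       # "real_time" or "deferred"
    n_levels: int   # number of vertical levels, used here as a proxy for resolution


def best_available(profiles):
    """Keep one profile per (platform, station), preferring deferred mode."""
    rank = {"real_time": 0, "deferred": 1}
    best = {}
    for p in profiles:
        key = (p.platform, p.station)
        # Replace the stored profile only if the new one has a higher-ranked mode.
        if key not in best or rank[p.mode] > rank[best[key].mode]:
            best[key] = p
    return sorted(best.values(), key=lambda p: (p.platform, p.station))


# Illustrative values only (not real data).
sample = [
    Profile("SHIP_A", 1, "real_time", 50),
    Profile("SHIP_A", 1, "deferred", 2000),
    Profile("SHIP_A", 2, "real_time", 48),
]
selected = best_available(sample)
```

In this sketch, station 1 is served from the deferred-time profile, while station 2 falls back to the real-time profile because no deferred-time version has been received yet.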
Input data for certain information systems
Certain information systems, such as CDOCO, require additional data to implement different processing methods:
- Forcing data (CDOCO)
  - Boundary conditions: Forecasts and reanalysis (Mercator), wave field (WW3/REFDIF, SHOM)
  - Initial conditions: Climatology (Levitus, Bay of Biscay, Medar/Medatlas), Outputs (Mercator)
  - Air/sea interface: Winds and heat flows (Météo-France, NCEP, MM5), Spatial winds (CERSAT), Spatial flows (SAF/OCEAN)
  - Freshwater inputs: River flow rates (SPDIREN/COLIANE, IAV, CNR), Rainfall (Météo-France)
- Reference data
  - Bathymetry (CDOCO)
  - Vocabulary (SIH, SeaDataNet)
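The inventory above could be held as a small machine-readable catalogue, which makes it easy to see which inputs depend on a given provider. The following is only a sketch: the product and source names are copied from the list, but the `InputProduct` record, the `CATALOGUE` variable and the `products_by_source` helper are hypothetical names introduced for illustration, not part of any SISMER or CDOCO system.

```python
from collections import defaultdict
from typing import NamedTuple


class InputProduct(NamedTuple):
    group: str       # "forcing" or "reference"
    category: str    # e.g. "Boundary conditions"
    product: str     # e.g. "Wave field"
    sources: tuple   # providers, models or datasets as named in the list


# Entries transcribed from the list above; the structure itself is an assumption.
CATALOGUE = [
    InputProduct("forcing", "Boundary conditions", "Forecasts and reanalysis", ("Mercator",)),
    InputProduct("forcing", "Boundary conditions", "Wave field", ("WW3/REFDIF", "SHOM")),
    InputProduct("forcing", "Initial conditions", "Climatology",
                 ("Levitus", "Bay of Biscay", "Medar/Medatlas")),
    InputProduct("forcing", "Initial conditions", "Outputs", ("Mercator",)),
    InputProduct("forcing", "Air/sea interface", "Winds and heat flows",
                 ("Météo-France", "NCEP", "MM5")),
    InputProduct("forcing", "Air/sea interface", "Spatial winds", ("CERSAT",)),
    InputProduct("forcing", "Air/sea interface", "Spatial flows", ("SAF/OCEAN",)),
    InputProduct("forcing", "Freshwater inputs", "River flow rates",
                 ("SPDIREN/COLIANE", "IAV", "CNR")),
    InputProduct("forcing", "Freshwater inputs", "Rainfall", ("Météo-France",)),
    InputProduct("reference", "Bathymetry", "Bathymetry", ("CDOCO",)),
    InputProduct("reference", "Vocabulary", "Vocabulary", ("SIH", "SeaDataNet")),
]


def products_by_source(catalogue):
    """Map each source name to the products it supplies, in catalogue order."""
    index = defaultdict(list)
    for item in catalogue:
        for source in item.sources:
            index[source].append(item.product)
    return dict(index)


index = products_by_source(CATALOGUE)
```

With such an index, a lookup like `index["Mercator"]` lists every input that would be affected if that provider's flow were interrupted.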