Open Data: DataPoint and Linked Data
The UK Met Office, as a government organisation, is mandated to make some data open and readily available. Some of this open data is available from the DataPoint site in the form of XML, JSON, CSV and linked data. Currently, the data consist of forecasts at several thousand locations in the UK, every 3 hours for the next three days, at several levels in the atmosphere, and for several parameters such as temperature, rainfall, and wind speed and direction.
This is a sustainable and scalable approach to supporting many data feeds, because each data point is 'elemental' and the number of data points is modest.
There is a requirement to support the dissemination of analysed or forecast gridded data. Currently, the approach is to send a complete grid of about 1 000 000 points, at one level, for one time, of one parameter by FTP. This is too demanding of total bandwidth. The alternative, trimming the grid to a much smaller rectangular subset of grid points for each user request, is compute intensive.
It is envisaged that the million points of the original grid are subdivided into a fixed set of 'tiles' or 'cubes' and then users request the tiles of immediate interest. For example, 10 000 tiles could each contain 10x10 = 100 points.
There would be a set of thin/flat tiles for surface data, and a compatible set of 'cubes' or 'thick tiles' for upper air data.
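The tiling arithmetic can be sketched in a few lines. This is only an illustration: it assumes the grid of about a million points is exactly 1000 x 1000, indexed from zero, and the function names (tile_of, tiles_for_box, tile_count) are hypothetical rather than part of any existing service.

```python
# Assumed layout: 1000 x 1000 = 1 000 000 grid points, cut into 10 x 10 tiles.
GRID_NX, GRID_NY = 1000, 1000
TILE_NX, TILE_NY = 10, 10


def tile_of(i, j):
    """Return the (tx, ty) index of the tile containing grid point (i, j)."""
    if not (0 <= i < GRID_NX and 0 <= j < GRID_NY):
        raise ValueError("grid point outside the grid")
    return i // TILE_NX, j // TILE_NY


def tiles_for_box(i_min, j_min, i_max, j_max):
    """List the tiles intersecting an inclusive rectangle of grid points."""
    tx0, ty0 = tile_of(i_min, j_min)
    tx1, ty1 = tile_of(i_max, j_max)
    return [(tx, ty)
            for ty in range(ty0, ty1 + 1)
            for tx in range(tx0, tx1 + 1)]


def tile_count():
    """Total number of tiles in the fixed tile set."""
    return (GRID_NX // TILE_NX) * (GRID_NY // TILE_NY)
```

Because the tile set is fixed, the server does no per-request trimming: a client asking for grid points (5, 5) to (25, 14) simply requests the six tiles returned by tiles_for_box(5, 5, 25, 14).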
- Server supplying data tiles covering Volume or Period Of Interest.
- One client, of many, receiving data tiles local to the client's Volume and Period Of Interest.
- Intermediate web caches to allow other clients to receive the same tiles if of interest.
The server and client have negotiated a tile service for a Volume and Period Of Interest.
Main Success Scenario
As the client's Point of Interest moves in space and time, the client's local copies of data tiles are replaced, and data tiles that are no longer needed are deleted if required. The data tiles received allow the client to process the data as appropriate.
The recently used/requested tiles remain in the web cache.
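The client-side bookkeeping in this scenario amounts to a set difference. The sketch below assumes a 100 x 100 set of horizontal tiles and a client that keeps every tile within a fixed number of tiles of its Point of Interest; both the neighbourhood rule and the names tiles_around and plan_update are illustrative assumptions.

```python
def tiles_around(tx, ty, radius, n_tx=100, n_ty=100):
    """Set of tile indices within `radius` tiles of (tx, ty), clipped to the grid."""
    return {(x, y)
            for x in range(max(0, tx - radius), min(n_tx - 1, tx + radius) + 1)
            for y in range(max(0, ty - radius), min(n_ty - 1, ty + radius) + 1)}


def plan_update(held, needed):
    """Return (tiles to fetch, tiles that may be deleted) as two sets."""
    return needed - held, held - needed


# The Point of Interest moves one tile eastwards.
held = tiles_around(50, 50, radius=1)
needed = tiles_around(51, 50, radius=1)
to_fetch, to_drop = plan_update(held, needed)
```

Here only the three tiles in the new eastern column are fetched, and the three in the old western column become candidates for deletion. Any other client nearby requesting the same tiles can be served from an intermediate web cache.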
- The client's Point of Interest moves out of range, or out of time, of the service, and the client must negotiate a new service for a new Volume or Period Of Interest.
- The offered service does not have data for the requested parameters.
- The offered service does not have data of the appropriate resolution.
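The three cases above could be detected by a simple check of a request against the offered service. The sketch below is hypothetical throughout: the Service fields, the units, and the reading of 'appropriate resolution' as 'not coarser than the client can accept' are assumptions made for illustration, not a description of an actual negotiation protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical description of an offered tile service."""
    x_range: tuple          # (min, max) grid index covered
    y_range: tuple          # (min, max) grid index covered
    t_range: tuple          # (start, end) in forecast hours
    parameters: frozenset   # parameter names offered
    resolution_km: float    # grid spacing of the offered data


def problems(service, x, y, t, wanted_params, max_resolution_km):
    """Return a list of reasons why the service cannot satisfy the request."""
    issues = []
    inside = (service.x_range[0] <= x <= service.x_range[1]
              and service.y_range[0] <= y <= service.y_range[1])
    if not inside:
        issues.append("out of range")
    if not (service.t_range[0] <= t <= service.t_range[1]):
        issues.append("out of time")
    missing = set(wanted_params) - service.parameters
    if missing:
        issues.append("missing parameters: " + ", ".join(sorted(missing)))
    if service.resolution_km > max_resolution_km:
        issues.append("resolution too coarse")
    return issues
```

An empty list means the current service still applies; any entry is a prompt for the client to negotiate a new service.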
- Each tile could have a time extent with several data times, or each tile could have data only for one instant and separate tiles must be requested to cover a time period.
- Each tile could have a vertical extent with several data altitudes/levels, or each tile could have data only for one level and separate tiles must be requested to cover a vertical extent.
- Each tile could have an extent in some other dimension, with data for several values of that dimension, or each tile could have data only for one dimension value and separate tiles must be requested to cover that dimensional extent. E.g. wavelength of imaging data.
- Each tile could contain several different parameters of interest, or each tile could have data only for one parameter and separate tiles must be requested to cover all the parameters required. E.g. the components of vector or tensor valued parameters, such as wind components or current speed and direction, should probably be in the same tile.
- It is envisaged that tiling is primarily horizontal (x,y) in extent, with z and t coordinates 'added'. It is possible that the primary tiling could be in, say (z,t) space.
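The trade-off behind these packing choices can be made concrete by counting requests. The function below is a rough sketch with made-up numbers: it assumes every horizontal tile is needed at every time, level and parameter, and requests_needed is a hypothetical helper, not part of any proposed interface.

```python
from math import ceil


def requests_needed(h_tiles, n_times, n_levels, n_params,
                    times_per_tile=1, levels_per_tile=1, params_per_tile=1):
    """Number of tile requests needed to cover a Volume and Period Of Interest."""
    return (h_tiles
            * ceil(n_times / times_per_tile)
            * ceil(n_levels / levels_per_tile)
            * ceil(n_params / params_per_tile))


# Illustrative case: 9 horizontal tiles, 24 times, 5 levels, 4 parameters.
flat = requests_needed(9, 24, 5, 4)
packed = requests_needed(9, 24, 5, 4,
                         times_per_tile=8, levels_per_tile=5, params_per_tile=2)
```

With one time, level and parameter per tile the client makes 4320 requests; packing 8 times, all 5 levels and 2 parameters into each tile reduces this to 54 larger requests. Thicker tiles mean fewer requests but more data transferred that a given client may not need, and fewer clients sharing exactly the same cached tile.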
-- Main.clittle - 16 Jun 2015