Optional vs Mandatory properties in GeoSciML v4
The SWG needs to decide on the policy for mandatory + voidable vs optional + voidable attributes. The options considered are presented in the Options considered section below.
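To make the two policies concrete, here is a minimal Python sketch of what a client sees in each case. It assumes the usual GML convention of encoding a void value with xsi:nil="true" plus a nilReason attribute; the un-namespaced element names and literal values are simplifications for illustration only, not real GeoSciML encodings.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def property_state(feature, name):
    """Classify a property as 'value', 'void (<reason>)' or 'absent'."""
    prop = feature.find(name)
    if prop is None:
        return "absent"  # only possible when the property is optional
    if prop.get(f"{{{XSI}}}nil") == "true":
        return "void (" + prop.get("nilReason", "no reason given") + ")"
    return "value"

# Mandatory + voidable: the element must be present, even when it is nil.
MANDATORY_VOIDABLE = ET.fromstring(
    f'<GeologicUnit xmlns:xsi="{XSI}">'
    "<geologicUnitType>lithostratigraphic unit</geologicUnitType>"
    '<rank xsi:nil="true" nilReason="unknown"/>'
    "</GeologicUnit>"
)

# Optional + voidable: the provider may simply omit the element.
OPTIONAL_VOIDABLE = ET.fromstring(
    "<GeologicUnit>"
    "<geologicUnitType>lithostratigraphic unit</geologicUnitType>"
    "</GeologicUnit>"
)

print(property_state(MANDATORY_VOIDABLE, "rank"))
print(property_state(OPTIONAL_VOIDABLE, "rank"))
```

Either way the client has to cope with a missing value; the policies differ in whether that absence is stated explicitly in the document.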
At the last CGI meetings in July 2014 in Tucson, we planned that the GeoSciML 4 Basic level would be (even) better for INSPIREd surveys than the GeoSciML 3.2 equivalent packages, having 4 fewer non-INSPIRE constants needing to be coded for an INSPIRE-focused basic service, and a much simpler INSPIRE-like GeologicEvent model. Eric and GSC have since kindly offered Eric's time, formally in support of OneGeology, to progress this schema production process rapidly, with the plan to present draft OGC conformance documentation by the time we meet at JRC in Italy in late October/early November 2015 at the next formal CGI/OGC face-to-face meetings.
Options considered
Option 1. All Mandatory + Voidable
- Maintain the status quo from GeoSciML version 3.2 and GeoSciML-Portrayal 2.0. (ie, most attributes and associations in the data model remain mandatory + voidable.)
Case for
Consumers and developers are always sure exactly what data structure they are going to get. Mandatory + voidable attributes decrease the variability of delivered web services.
Peter Baumann's comment: "OWS Common is full of optionals and a nightmare. The problem is not the server - implementers can choose or lose anything they want. But a client has to be prepared that any of these data might be missing, so it has to support all possible combinations - which is impossible in practice. During the redesign of WCS, leading to WCS 2 in 2010, we have radically removed all options wherever possible, and have used an explicit scheme (given by the modular core/extension model) for alternatives. This has proven very helpful; implementers (such as OPeNDAP) have applauded the simplicity of the spec."
The creation of the GeoSciML-Basic package is essentially a basic use case for GeoSciML, expressed as an XSD. It is our attempt to identify the really important bits of GeoSciML, and make things as easy as possible for data providers and consumers (especially INSPIRE) without the need for Schematron to enforce a data structure. Attributes in this core part of the data model could remain mandatory + voidable. The attributes in GeoSciML-Basic are:
- GeologicFeature: observationMethod, purpose, classifier, geologicHistory, composition, hierarchy, occurrence
- MappedFeature: MappingFrame, shape, observationMethod, ResolutionRepresentativeFraction, exposure, positionalAccuracy, specification
- GeologicUnit: geologicUnitType, rank
- GeologicUnitHierarchy: role, proportion
- Contact: contactType
- Fold: profileType
- Foliation: foliationType, orientation
- ShearDisplacementStructure: faultType
- GeologicEvent: eventProcess, eventEnvironment, olderNamedAge, youngerNamedAge, numericAge
- CompositionPart: role, proportion, material
- EarthMaterial: color, purpose
- RockMaterial: lithology
- Collection: collectionType
No need for users of GeoSciML to develop Schematrons to constrain particular use cases. It is easier to guarantee valid services if you only have to rely on XSD validation without having Schematron validation as well.
A fully mandated model makes it easier for a client because it knows what to expect. This has been the underlying principle for GeoSciML up until version 4.
Supports multi-purpose data exchange services that may be used by many clients, according to many use cases, only a subset of which we are aware of. Such services need to provide all available data, and in a very disciplined way.
Case against
Assumes that a client can handle an overly inflated and verbose XML stream. I understand this may not be the case for many clients such as mobile devices. Even with everything mandatory, data providers could still supply XML snippets; it's just that they wouldn't be XML compliant, which is perhaps the case for most XML services. (Bruce)
Makes life difficult for application-targeted services due to the cost of configuration, encoding, transmission and parsing for reasons of time, performance, bandwidth and complexity. At LCR, NZ the targeted services are our most valuable assets. (Alistair)
What the WCS community has done is very similar in design to a fully mandatory model, but they are dealing with an API standard while we are dealing with a domain standard. (Eric)
Option 1 is perhaps the ideal, but if it is provoking difficulties for the user community then I think we need to respond to that, particularly as GeoSciML v4 has fewer, more diverse schemas than v3.2, so potentially the problem will be greater. (John)
Option 2. All Optional + Voidable
- Change all attributes and associations in all of GeoSciML-Portrayal, GeoSciML-Basic, GeoSciML-Extension, and all other GeoSciML schemas to optional + voidable.
What is being proposed here is a core model that will allow people to easily create their own extension and set the content rules based on the intended use. (Eric)
For any GeoSciML v4 options that include removing the existing "mandatory" status of GeoSciML properties, we will need to provide schematron rule sets for some common use cases. E.g. for a "GeoSciML v3.2" use case, schematrons would be required for the GeoSciML 4 packages as if they had the original optionalities (i.e. equivalent to the GeoSciML v3.2 conceptual model). These would need to be tested in practice using schema/schematron paired validation with various profiles.
NOTE (StephenRichard): it was suggested in the thread that an OGC GetCapabilities response can only declare ONE profile URI identifier, implying a one service-one profile mapping with profiles that can import/aggregate multiple profiles of feature types being served... However, Eric's reply to Tim in the email trail points out that the OWS 2.0 Profile element has maxOccurs="unbounded", so a service can list several profiles. Different profiles could also be specified as OutputFormats on the FeatureType in the Capabilities document, or encoded as a keyword on the FeatureType.
Case for
Mandatory properties are use case dependent and should be defined for that use case, using Schematron to enforce the structure of the delivered data. In this way, properties are made mandatory in the scope of one use case, and not in another one.
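As a rough illustration of per-use-case mandatory properties, the following Python sketch plays the role a Schematron rule set would play: the same permissive document is checked against two profiles. The namespace URI, the profile names and the plain-text property values are hypothetical placeholders; only the GeologicUnit property names come from the GeoSciML-Basic list above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, for illustration only (not the real GeoSciML URI).
NS = "http://example.org/gsml-sketch"

# Each (hypothetical) profile lists the properties it makes mandatory.
PROFILES = {
    "basic-all-mandatory": {"GeologicUnit": ["geologicUnitType", "rank"]},
    "lean": {"GeologicUnit": ["geologicUnitType"]},
}

def check_profile(xml_text, profile_name):
    """Return one message per mandatory property missing under the profile."""
    root = ET.fromstring(xml_text)
    errors = []
    for feature, props in PROFILES[profile_name].items():
        for elem in root.iter(f"{{{NS}}}{feature}"):
            for prop in props:
                if elem.find(f"{{{NS}}}{prop}") is None:
                    errors.append(f"{feature} is missing {prop}")
    return errors

DOC = f"""<gsml:Collection xmlns:gsml="{NS}">
  <gsml:GeologicUnit>
    <gsml:geologicUnitType>lithostratigraphic unit</gsml:geologicUnitType>
  </gsml:GeologicUnit>
</gsml:Collection>"""

print(check_profile(DOC, "lean"))
print(check_profile(DOC, "basic-all-mandatory"))
```

The document passes the lean profile and fails the all-mandatory one, which is the sense in which a property is mandatory in the scope of one use case and not in another.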
The model is more flexible and appealing to other domain communities who may want to reuse GeoSciML. They may not consider important those properties we thought were important, hence the idea that use cases should drive the content. (Obviously, the risk is that would create a mess of conflicting use cases.)
Web services are bloated with many mandatory attributes containing nil content which offers little to the average consumer. Eric has received criticisms from web developers about this. In particular the Extension parts of GeoSciML are full of attributes that might normally be populated in only very specific use cases.
GroundwaterML has just moved to adopt the all optional + voidable policy.
Making everything optional solves some of the small client problems. (Bruce)
Adopting a more permissive approach will help us support the open world of the Semantic Web. (Alistair)
The UML specifies the domain's types of features, their properties and the patterns between them, rather than a complete representation of the real world. This started with our moving the vocabularies out of the UML (these are now specified through the profile using schematron), and lifting of cardinalities is the next step in this process. (Bruce)
I'd love to see the SWG publish a permissive schema accompanied by an official GeoSciML profile that supports the multi-purpose services for data exchange (the primary use case). I guess consisting of schematron rules making all properties mandatory, and enforcing the content model for the use of CGI vocabularies. (I believe this is called having your crow and eating it.) (Alistair)
Still supports multi-purpose services (nothing prevents a fully nilled service being deployed) and goes a long way towards supporting the targeted services. (Alistair)
Moves the decision to have mandatory properties to use cases, each use case being a profile of GeoSciML tailored for a given use. (Eric)
The client would still know what to expect, but not from the XSD. It would know from the profile, which I expect to be much simpler than dealing with a bunch of irrelevant properties (as far as the use case is concerned). I expect a service could claim to be compliant with many profiles at once, for example by saying it exposes all the properties required for use cases that require rock age and all those that require unit compositions.
GetCapabilities can report which profiles are supported by services. (Eric)
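A client could discover the advertised profiles along the following lines. This is only a sketch: it assumes the OWS 2.0 namespace and the ServiceIdentification/Profile structure cited in the email trail below, and the capabilities fragment and profile URIs are invented for illustration.

```python
import xml.etree.ElementTree as ET

OWS = "http://www.opengis.net/ows/2.0"

def advertised_profiles(capabilities_xml):
    """List the Profile URIs declared under ows:ServiceIdentification."""
    root = ET.fromstring(capabilities_xml)
    path = f".//{{{OWS}}}ServiceIdentification/{{{OWS}}}Profile"
    return [p.text.strip() for p in root.findall(path) if p.text]

def supports(capabilities_xml, required):
    """True when every required profile URI is advertised by the service."""
    return set(required) <= set(advertised_profiles(capabilities_xml))

# Invented capabilities fragment; the profile URIs are hypothetical.
CAPS = f"""<Capabilities xmlns:ows="{OWS}">
  <ows:ServiceIdentification>
    <ows:Title>Sketch service</ows:Title>
    <ows:Profile>http://example.org/profile/rock-age</ows:Profile>
    <ows:Profile>http://example.org/profile/unit-composition</ows:Profile>
  </ows:ServiceIdentification>
</Capabilities>"""

print(advertised_profiles(CAPS))
print(supports(CAPS, ["http://example.org/profile/rock-age"]))
```

A client written for the rock age use case would then only need to confirm that its profile is in the list before requesting features.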
For those of us that have to deploy disciplined, clearly documented services, relying on the XSD alone is an inelegant way of doing it. (Alistair)
Need to make it easier for those pesky JSON programmers. (Tim)
Case against
Because GeoSciML imports many other schemas, all of which contain mandatory properties, there is still excess baggage supplied.
It will generate "the need for new user communities to agree their own schematron controlled use cases which may well not happen in practice...." (Tim)
Option 3. All optional, except status quo in GeoSciML-Portrayal
- Change all attributes and associations in all GeoSciML schemas to optional + voidable.
- Keep status quo for GeoSciML-Portrayal (ie, URI attributes remain mandatory).
Because GeoSciML imports many other schemas, all of which contain mandatory properties, there is still excess baggage supplied. This option recognizes this reality and attempts to compromise between 1 and 2.
As GSML_Portrayal (including the ready-to-publish ERML_Portrayal schema) is a level 0 simple features GML schema, it cannot be made any less mandatory than it already is. Eric therefore considers that any vote for Option 2 is in practice a vote for Option 3, to help Ollie. (Tim, Eric)
Option 4. Extension elements voidable, status quo for Basic and Portrayal
- Change all attributes and associations in GeoSciML-Extension, and other extended schemas (ie, Borehole, LaboratoryAnalysis-Specimen, GeologicTime) to optional + voidable.
- Keep status quo for GeoSciML-Basic attributes (ie, mostly mandatory + voidable).
- Keep status quo for GeoSciML-Portrayal (ie, URI attributes remain mandatory).
Case for
Because GeoSciML imports many other schemas, all of which contain mandatory properties, there is still excess baggage supplied. This option recognizes this reality and attempts to compromise between 1 and 2.
We also note that, in the absence of ready availability and use of secondary schematron validation, this will also fit better with INSPIRE needs for the use of GeoSciML 4 Basic, where ALL INSPIRE attributes are mandatory and voidable precisely because IF the data is available, surveys HAVE to legally serve it. The point is that nearly all surveys DO have all this data, so in long-term practice there will NOT be all those voided/nilled attributes bloating the offered services that the JSON programmers are so worried about; they will be populated with real data! Similarly for developing OneGeology services (consider them as practical and useful supersets of the INSPIRE legal requirements). The INSPIRE argument is similar to why we called Basic "basic": we expect surveys to have that data, and this is what we basically expect them to be able to serve in a worthwhile, useful data service. (Tim)
Discussion
Recognize the distinction between the conceptual model (essentially GeoSciML 3.2 with some minor modifications), expressed in UML, and the XML schema designed for information interchange based on the conceptual model. The names and semantics of the objects, features, data types, associations and attributes are the essence of the conceptual model. Use of conceptual element names for the implementing physical elements is one basis for interoperability.
Optional elements are not hard to test for; what is hard is when there are many ways of encoding the same information, like the CGI_quantity of days gone by that allowed a term (concept), term-range, number, number range, or gml:TM_primitive, which has a whole blizzard of possible concrete encodings.
The heavy usage of profiles for interoperability requires that we have very clear specifications for how an instance document declares the profile it conforms to, for how a service advertises the profiles it offers (for WFS 1.1.0 and 2.0 services), and for how an ISO19139 or 19115-3 metadata record will specify the profile for a distribution offered on a dataset (the last one is lower priority, but sure would be useful). In defense of using profiles, note that for complex features, many XML schemas in use do not provide sufficiently expressive validation constraints to ensure interoperability. ISO19139 is the poster child, but I suspect that's only because it is relatively commonly used; GeoSciML 3.2, WaterML and SensorML are all probably similarly ambiguous on too many points to really enable practical interoperability. Practical interoperability is being able to write, test, and put in production code to utilize somebody else's service and interchange format within a reasonable budget.
Once we're going down the road of all optional elements and schematron rules, can we dispense with the requirement for element ordering? Schematron rules can be agnostic about element ordering (in fact, aren't they always?).
StephenRichard - 21 Mar 2015
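The element-ordering question can be illustrated with a small Python sketch that contrasts a presence-only check (the kind of assertion a simple Schematron rule makes) with a sequence check (the kind xsd:sequence imposes). The helper names and the un-namespaced sample elements are hypothetical and not taken from any GeoSciML schema.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("geologicUnitType", "rank")

def has_all(elem, names):
    """Presence-only check: every named child exists, in any order."""
    present = {child.tag for child in elem}
    return all(name in present for name in names)

def in_sequence(elem, names):
    """Sequence-style check: named children appear once each, in order."""
    tags = [child.tag for child in elem if child.tag in names]
    return tags == list(names)

# Same content, different child order.
A = ET.fromstring("<GeologicUnit><geologicUnitType/><rank/></GeologicUnit>")
B = ET.fromstring("<GeologicUnit><rank/><geologicUnitType/></GeologicUnit>")

for doc in (A, B):
    print(has_all(doc, REQUIRED), in_sequence(doc, REQUIRED))
```

Both documents satisfy the presence check, while only the first satisfies the sequence check. A rule set built purely from presence assertions would therefore not care about ordering, although a rule author could still write order-sensitive assertions if desired.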
Complete Email Trail
- Option 2
- Option 3
- Option 4
- Option 1
EricBoisvert - 11 Mar 2015
This is a tough call, and no matter what decision is made it will be found to be flawed for various reasons. I fully support Peter Baumann's suggestion that a fully mandated model makes it easier for a client because it knows what to expect. This has been the underlying principle for GeoSciML up until version 4. The trouble is it assumes that a client can handle an overly inflated and verbose XML stream. I understand this may not be the case for many clients such as mobile devices. Even with everything mandatory, data providers could still supply XML snippets; it's just that they wouldn't be XML compliant, which is perhaps the case for most XML services.
Making everything optional solves some of the small client problems, but creates others that you and Peter have outlined below. In addition, because GeoSciML imports many other schemas, all of which contain mandatory properties, there is still excess baggage supplied. Options 3 & 4 recognise this reality and are an attempt to compromise between 1 and 2.
I'm starting to see the UML as specifying the domain's types of features, their properties and the patterns between them, rather than a complete representation of the real world. This started with our moving the vocabularies out of the UML (these are now specified through the profile using schematron), and I see the lifting of cardinalities as the next step in this process.
Consequently, my order of preference would be:
- Option 2
- Option 3
- Option 1
- Option 4
BruceSimons - 12 Mar 2015
I agree with Bruce (it happens more and more now we no longer work for the same survey). There are merits to all but option 1.
Since my first GeoSciML test bed, and more so now in NZ's quasi-commercial government environment, we have had to deploy multiple web services for a given app-schema. They can be broadly summarised as:
1. Multi-purpose data exchange services that may be used by many clients, according to many use cases, only a subset of which we are aware of. They need to provide all available data, and in a very disciplined way.
2. Targeted services for a particular community or client (agency or machine) that may operate in a constrained environment (for example limited bandwidth or slow connections). The clients are typically commercial entities that work to very lean data requirements. These services benefit from permissive schema constrained by focused profiles.
Pretty much every data custodian should support a type 1 service*, and do so as a mandatory-nillable one for the reasons Peter Baumann gives. Type 2 services are deployed as required (at LCR these are our most valuable assets).
Voting for Option 1 supports the multi-purpose services but makes life difficult for the targeted services due to the cost of configuration, encoding, transmission and parsing for reasons of time, performance, bandwidth and complexity. The other options still support the multi-purpose services (not one of them prevents a fully nilled service being deployed) and go a long way towards supporting the targeted services.
Also, adopting a more permissive approach will help us support the open world of the Semantic Web (see Bruce's last point as a justification).
So, using Bruce's preferential voting system, my votes are:
- Option 2
- Option 4
- Option 3
- Option 1 (by some distance)
* I know there's work involved, but I'd love to see the SWG publish a permissive schema accompanied by an official GeoSciML profile that supports the multi-purpose services for data exchange (the primary use case). I guess consisting of schematron rules making all properties mandatory, and enforcing the content model for the use of CGI vocabularies. (I believe this is called having your crow and eating it.)
AlistairRitchie - 16 Mar 2015
Ok, bit of clarification on the proposal. I sense people understood "all optional" as a signal that data producers can generate whatever they want. It's not. It just moves the decision to have mandatory properties to use cases, each use case being a profile of GeoSciML tailored for a given use.
> I fully support Peter Baumann's suggestions that a fully mandated model makes it easier for a client because it knows what to expect.
I also agree, and what the WCS community has done is very similar in design, but they are dealing with an API standard while we are dealing with a domain standard.
WCS has a core model and extensions. GSML 4 also had a core model and one extension, but the packaging has been decided top down by the modeler. What is being proposed here is a core model that will allow people to easily create their own extension and set the content rules based on the intended use.
Furthermore:
> But a client has to be prepared that any of these data might be missing, so it has to support all possible combinations - which is impossible in practice.
This problem is in no way resolved by nillables anyway. It's the same problem, just made explicit, and verbose.
The client would still know what to expect, but not from the XSD. It would know from the profile, which I expect to be much simpler than dealing with a bunch of irrelevant properties (as far as the use case is concerned). I expect a service could claim to be compliant with many profiles at once, for example by saying it exposes all the properties required for use cases that require rock age and all those that require unit compositions. GetCapabilities can report which profiles are supported by services.
This is also what I understand from Bruce's comment:
"I'm starting to see the UML as specifying the domain's types of features, their properties and the patterns between them rather than a complete representation of the real world. This started with our moving the vocabularies out of the UML (these are now specified through the profile using schematron) and I see the lifting of cardinalities the next step in this process."
EricBoisvert - 17 Mar 2015
Well put - agree. To be clear I would expect every service to be deployed according to a schema/profile pair (with the latter published as mod spec compliant requirements/conformance classes etc etc).
Assuming you still have a concept of service behaviour that needs to be met to be an approved GeoSciML service, approval would require the publication of the profile, or profiles, that constrain the behaviour of the service. Hence my belief that the community will need to provide an all-encompassing profile for the most comprehensive services.
To be honest, I wouldn't lose sleep if people did deploy services that did whatever they want; the GeoSciML info model is still valuable there. However, for those of us that have to deploy disciplined, clearly documented services, relying on the XSD alone is an inelegant way of doing it.
AlistairRitchie - 17 Mar 2015
It is Tim who has recently further encouraged this active GeoSciML 4.0 schema production process, as he has tight and well defined OneGeology (to work with actual clients already in the OneGeology portal) and EU INSPIRE profiles that should be rolled out in 2015 as live WFS 2.0s by more than 20 European Geological Surveys. Back at the last CGI meetings in July in Tucson we planned that GeoSciML 4 Basic level would be (even) better for INSPIREd surveys than the GeoSciML 3.2 equivalent packages (e.g. 4 fewer non-INSPIRE constants needing to be coded for an INSPIRE focussed basic service, much simpler INSPIRE-like GeologicEvent handling etc). And Eric and GSC have kindly offered Eric's time, formally in support of OneGeology implemented standards, to progress this schema production process rapidly, with the plan to present draft OGC conformance documentation by the time we meet at JRC in Italy in late October/early November at the next formal CGI/OGC face-to-face meetings.
So Eric has really cracked on with this process, and through applying the magnifying glass of actually doing it he has come up with an unexpected, radical (i.e. not proposed in Tucson) proposal below to make all attributes optional in the GeoSciML 4 XML schema.
When we read the initial response arguments from Bruce and Alistair (Ollie - see above) we thought they were going to vote for the compromise option 4 first, but in both cases (and similarly in Eric's also) they voted 4 last! (Ollie - see Alistair's correction below). We think that demonstrates how tricky making such a radical change like this without a meeting can be; however, we have now had a fruitful Skype meeting with Eric to clarify some issues, which Eric encourages us to share with you.
First of all, we were surprised to see GeoSciML_Portrayal included in this voting proposal. As a matter of technical practicality, Eric wishes me to point out that, as _portrayal (including the ready-to-publish ERML_Portrayal schema) is a level 0 simple features GML schema, it cannot be made any less mandatory than it already is, so he considers that any vote for option 2 is in fact practically a vote for option 3, to help Ollie. A small point, just for clarity, but this does raise the issue of ERML version 2.0, which (part of!!) is being used actively in the EU Minerals4U project, complete with its existing mandatories (and reliance on GeoSciML 3.2), as the extension to the basic INSPIRE Minerals schema required by this project. We should not forget that ERML version 2 is CGI governed, whilst GeoSciML 4.0 has become an OGC-governed schema (through the CGI/OGC MOU).
We believe that in PRACTICE having mandatories validated by xsd (with lots of experience and tools for doing basic xsd validation around) has helped geological surveys actually populate and check services for parts of these models. Taking the mandatories away will therefore lead in practice to less interoperability and, as John put it, will generate the need for new user communities to agree their own schematron controlled use cases, which may well not happen in practice.....
So our HEAD says we should vote as a reasonable compromise:
- Option 4
- Option 1
- Option 3
We also note that, in the absence of ready availability and use of secondary schematron validation, this will also fit better with INSPIRE needs for the use of GeoSciML 4 Basic, where ALL INSPIRE attributes are mandatory and voidable precisely because IF the data is available, surveys HAVE to legally serve it. The point is that nearly all surveys DO have all this data, so in long-term practice there will NOT be all those voided/nilled attributes bloating the offered services that the JSON programmers are so worried about; they will be populated with real data! Similarly for developing OneGeology services (consider them as practical and useful supersets of the INSPIRE legal requirements). The INSPIRE argument is similar to why we called Basic "basic": we expect surveys to have that data, and this is what we basically expect them to be able to serve in a worthwhile, useful data service.
However, our HEART tells us we should indeed vote for 3 first, the all optional proposal, because again, as John put it:
"Option 1 is perhaps the ideal but if it is provoking difficulties for the user community then I think we need to respond to that, particularly as GeoSciML v4 has fewer more diverse schemas than v3.2 so potentially the problem will be greater."
We do understand the need to "make it easier for those pesky json programmers", as even Eric expressed it. If our entire case for voting for Option 4 first is predicated on not having the schema/schematron validation pairing practically available, as Alistair put it, then Eric proposes to do the following:
1). He will not change the mandatories in the conceptual UML model (he does not practically need to do so to create the xsi that will output the schema). After all, to do so would be to de-nature (i.e. decrease the information content of) the conceptual geoscience model that is understood globally and that we spent 11 years designing (oh yes we did!) to include concepts like "if you have x then we expect you to have attribute y and supply it".
2). As part of cutting a schema with all attributes optional, he will provide the schematron for Basic level (and please also for the Borehole extension, because that is also required by INSPIRE), just checking that the mandatories in the model are there, i.e. the schema/schematron validation here will be as if GeoSciML-Basic and Borehole were as originally proposed, with some mandatory attributes, and therefore also fit for immediate INSPIRE use. INSPIRE = BGS will then take that schematron and build into it all the necessary INSPIRE data content checking for a full INSPIRE profile.
Eric will also provide the similar schematron for the other GeoSciML 4 extensions, as if they had the original mandatories in. And let's see how this works out in practice using schema/schematron paired validation with various profiles (although we have pointed out to Eric that an OGC GetCapabilities response can only declare ONE profile URI identifier, not many, and that was presumably deliberately done by OGC; maybe we will end up with a one service/one profile mapping, but then a profile id could in practice refer to two or more profiles of feature types being served...).
And let's see how that pans out when we write the OGC conformance tests: will they be weak and all optional, and so too easy to conform to, so hardly any point?
(Alistair, do you have detailed schematrons you could share with us for all the different service types you described? INSPIRE has recently done a survey to see what is out there in terms of data content validation and, no surprises, there is very little out there currently at all in Europe. We at BGS never got further than https://www.seegrid.csiro.au/subversion/GeoSciML/branches/3.0.0/schematron/GeoSciML_v3_Testbed_4.sch which didn't really test real data content.)
TimDuffy - 12 Mar 2015
Eric's reply to Tim
"although we have pointed out to Eric that an OGC GetCapabilities response can only declare ONE profile URI identifier, not many, and that was presumably deliberately done by OGC; maybe we will end up with a one service/one profile mapping, but then a profile id could in practice refer to two or more profiles of feature types being served...)."
No.
http://schemas.opengis.net/ows/2.0/owsServiceIdentification.xsd
<element name="Profile" type="anyURI" minOccurs="0" maxOccurs="unbounded">
<annotation>
<documentation>Unordered list of identifiers of Application Profiles that are implemented by this server. This element should be included for each specified application profile implemented by this server. The identifier value should be specified by each Application Profile. If this element is omitted, no meaning is implied.</documentation>
</annotation>
</element>
EricBoisvert - 21 Mar 2015
Alistair's reply to Tim's comments:
When we read the initial response arguments from Bruce and Alistair below we thought they were going to vote for the compromise option 4 first but in both cases (and similarly in Eric's also) they voted 4 last!
No:
So, using Bruce's preferential voting system, my [Alistair's] votes are:
- Option 2
- Option 4
- Option 3
- Option 1
(I've tweaked the quoting of my vote, the fourth preference, for clarity. The order is unchanged.)
We think that demonstrates how tricky doing such a radical change like this without a meeting can be. - Yes.
any vote for option 2 is in fact practically a vote for option 3 to help Ollie - This doesn't withstand scrutiny. There was a bit to consider when comparing the two and, long story short, option 4 is a compromise between 1 and 2 that has merit. Option 3 is an icky compromise, and much of the ickyness is due to it rather undermining _portrayal's important role as part of a logically consistent set of level 0 and level 1 GeoSciML services. Differing views on the role of portrayal schemas in GeoSciML (and their soil equivalent) mean this is probably a point to be discussed, not a position to be asserted.
bloating the offered services that the JSON programmers/make it easier for those pesky json programmers - The motivation for this is not that simple. At all.
He will not change the mandatories in the conceptual UML model - We are basically discussing how to implement GML profiles, and this is at odds with how I understand profiles work. The second rule of profile club is:
include all mandatory particles (subelements and attributes) of the parent element in GML (GML 3.2.1 clause 20.4).
Which model artefact defines mandatory? The conceptual model, or the physical GML application schema? If the former, does leaving everything mandatory put us in a bit of a bind, and could it leave our behaviour at odds with other MLs (that we depend on)?
Alistair do you have detailed schematrons you could share with us for all the different service types you described? - Sorry, but no. I am drafting the requirements classes as we speak for soil (ANZSoilML), hydrology (WaterML) and environmental observation (O&M and, soon, TimeseriesML) services. However, they only include stubs of conformance tests that will refer to the .sch files that will implement them. I am finalising funding to implement them next financial year (starting June 2015).
AlistairRitchie - 23 Mar 2015
Here's the e-mail I thought I'd sent on Friday-- it was still in my drafts box...
I'm for all optional/voidable and schematrons to define profiles. I can go either way on leaving the mandatories in Portrayal. So that's:
- Option 2
- Option 3
- Option 4
- Option 1
I like Eric's suggestion-- here's my paraphrasing of Tim's paraphrase (did I get it right Eric, Tim?). I created a document summarizing (hopefully all of) the e-mail thread, along with some Discussion from me, its in the GeoSciML /Trunk/Documents in subversion. (Ollie's note: the discussion from the subversion document Steve mentions is now included in this wiki page.)
1. Leave optionality (mandatory {1, 1..*}, mandatory nilable, optional (0..1, 0..*}) in the conceptual UML model as they are; they represent our understanding of the domain, not the requirements for content in an information exchange.
2. Construct schema with all attributes and associations optional, AND provide a schematron rule set testing for the optionalities specified in the Basic level conceptual model for for the Extended model.
The schematron rules would constitute the normative definition of GeoSciML interchange documents.
Its debatable whether making the URIs mandatory in Portrayal really gets us anything-- its entirely schema valid to put any random string in those elements. The xsd really can't enforce that the values are URIs that are registered in some fashion to make them machine processable (i.e. from a standard vocabulary). Bottom line on Portrayal cardinalities... maybe it doesn't matter.
The Europeans would like to include implementation of the Borehole extension because that is also required by INSPIRE. The intention is that a GeoSciML-Basic instance that conforms to the schematron rules would be as originally proposed, with some mandatory attributes, and therefore also fit for immediate INSPIRE use. INSPIRE = BGS will extend that schematron to add rules testing conformance with additional INSPIRE requirements. Fine with me!
Final thought--Once we're going down the road of all optional elements and schematron rules, can we dispense with the requirement for element ordering? Schematron rules can be agnostic about element ordering (in fact, aren't they always?).
StephenRichard - 23 Mar 2015
Steve, Just saw your email (was not addressed to the mailing list so I did not get it)
> "Final thought--Once were going down the road of all optional elements and schematron rules, can we dispense with the requirement for element ordering? Schematron rules can be agnostic about element ordering (in fact, arent they always?)."
Yes and no.
It could be done in XSD if we use xsd:choice instead of xsd:sequence, but the encoding rule uses sequences (we would have to break the encoding rules).
Even with choices, we would still have some sort of sequence because inherited properties come first (so the inherited choice block will come before the new class's choice block). The ultimate solution would be to get rid of XSD altogether and use schematron or RelaxNG.
EricBoisvert - 25 Mar 2015
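The ordering question in the exchange above can be sketched in a few lines of Python. The first function mimics an xsd:sequence of optional elements, where order matters; the second mimics an order-agnostic presence rule of the kind a schematron assertion would typically make. The function names and the two property names are assumptions for illustration, and neither function is a substitute for a real validator.

```python
def follows_sequence(tags, declared_order):
    """True if tags occur in the declared order, each optional and at most once."""
    position = -1
    for tag in tags:
        if tag not in declared_order:
            return False
        index = declared_order.index(tag)
        if index <= position:
            return False
        position = index
    return True


def contains_all(tags, required):
    """Order-agnostic presence test: every required tag occurs somewhere."""
    return set(required) <= set(tags)


# Hypothetical declared property order for one feature type.
DECLARED = ["geologicUnitType", "rank"]
```

With these definitions, the child list `["rank", "geologicUnitType"]` fails `follows_sequence` but passes `contains_all`, which is the difference being debated.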
Thank you all for your thought-provoking responses to this debate. I have had some offline discussions via email and Skype with a few of you, and as many of you have indicated, there are plenty of arguments both ways.
I am swayed by the all optional arguments, but hear the concerns of the INSPIRE user base. As I have told Tim, INSPIRE are already going to have to create a Schematron to check for valid content (ie, vocabularies) in a GeoSciML-Basic service. If we go with Option 2, I don't think it would be too much of an impost to include in that INSPIRE Schematron a rule that states all attributes in GeoSciML-Basic are mandatory. That scenario would effectively implement the primary use case for INSPIRE.
I would perhaps prefer to keep the existing mandatory properties in GeoSciML-Portrayal (eg, Option 3). But that preference is not very strong, and I recognise Steve's comment that there is no way to ensure that valid URIs are used in the Portrayal mandatory properties without a Schematron, so why not also ensure the mandatoryness of those properties in the same Schematron?
As far as retaining the existing optional and mandatory cardinalities in the conceptual model goes, I would rather keep the conceptual and implementation cardinalities in sync because I am worried that it would introduce unnecessary inconsistency and confusion between the conceptual and implementation models if they are different like that. ie, if the mandatory cardinalities are important enough to be represented in the conceptual UML, then why would we not enforce them in our implementation UML? The logic of different cardinalities in different UML models seems counter-intuitive to me and I would rather strip the mandatories out of both models.
So, for me
- Option 3 (but only just in front of Option 2, with no real conviction)
- Option 2
- Option 4
- Option 1
-- OllieRaymond - 25 Mar 2015
PS. The new WaterML spec illustrates how we might incorporate Schematron into our GeoSciML specification. e.g:
I have read all the comments on Ollie's twiki pages and have looked at this issue through ERML, and also through how we have been trying to implement the EarthMaterial part in our Minerals4EU project (voidable problems)! In case you don't remember: in the INSPIRE mineral resource model, EarthMaterial is inside the INSPIRE MR extension, which is not mandatory for member states!
And then, as Tim said, "However our HEART tell us we should vote indeed for 3 first"!!
- Option 3
- Option 2
- Option 4
- Option 1
Main.JouniVuollo - 25 Mar 2015
Thank you Ollie. I guess BGS should formally vote its from-the-heart view, which would be:
- Option 3
- Option 4
- Option 1
- Option 2
But this vote is dependent on Eric fulfilling his intention to cut and make available the mandatory schematrons for all parts of the model (Basic and the extensions, including Borehole), so that we can use them as soon as the schema is published and this domain information is not lost. It will also be useful to see what these schematrons look like. Perhaps we should discuss whether references to these parts of the current conceptual model should be formally part of the OGC documentation when we are face to face in late October/early December. As Steve says so succinctly below (my bold):
1. Leave optionality (mandatory {1, 1..*}, mandatory nilable, optional {0..1, 0..*}) in the conceptual UML model as they are;
they represent our understanding of the domain, not the requirements for content in an information exchange.
P.S. Eric, you are of course correct re OWS 2.0: profile URI identifiers are unbounded, i.e. a service can have many. Even though OWS 2 is not used by WFS 2.0 nor WFS 2.0.2 (they use OWS 1.1), it is still unbounded in OWS 1.1. The point we were making is that no current WFS software implements declarations of even one profile (we will check whether Xtraserver does, hopefully in April), i.e. they have one WFS endpoint = one profile being served = the features-available model. Even if we did have multiple profiles to point people to, it is indeed people who would read these profile declarations; it is fundamentally not an XML machine-readable object, a point you made to me.
TimDuffy - 25 Mar 2015
Tim,
> "the point we were making is that no current WFS software implements declarations of even one profile"
As discussed with James, adding this to a GetCapabilities can be done through a static file; it's just an advertisement to clients / parsers that your service complies with one or more official rules identified by URIs. I am not implying that one should implement a multi-profile service, just advertise those that it actually complies with. The only way WFS 2.0 has to alter the output is to play with outputFormat (and there is a bit more to a profile than just output format, eg: what vocabulary is being used).
A WFS can serve multiple namespaces, so you can technically serve inspire:GeologicUnit and international:GeologicUnit, each following their own profile, but the profiles section won't tell you which profile applies to which feature (unless we use ExtendedDescription in the FeatureSection, but we then invent our own semantics).
EricBoisvert - 26 Mar 2015
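For readers unfamiliar with the profile advertisement discussed above, here is a small Python sketch of what a client could do with it. It assumes the advertisement is carried in `ows:Profile` elements under `ows:ServiceIdentification` (OWS Common 1.1 namespace); the capabilities fragment and both profile URIs are invented placeholders, and `advertised_profiles` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

OWS_NS = "http://www.opengis.net/ows/1.1"

# Hypothetical static capabilities fragment; the profile URIs are placeholders.
CAPABILITIES = """<wfs:WFS_Capabilities xmlns:wfs="http://www.opengis.net/wfs/2.0"
    xmlns:ows="http://www.opengis.net/ows/1.1" version="2.0.0">
  <ows:ServiceIdentification>
    <ows:ServiceType>WFS</ows:ServiceType>
    <ows:ServiceTypeVersion>2.0.0</ows:ServiceTypeVersion>
    <ows:Profile>http://example.org/profiles/geosciml-basic</ows:Profile>
    <ows:Profile>http://example.org/profiles/inspire-ge</ows:Profile>
  </ows:ServiceIdentification>
</wfs:WFS_Capabilities>"""


def advertised_profiles(capabilities_xml):
    """Return the profile URIs listed in the service identification section."""
    root = ET.fromstring(capabilities_xml)
    path = f"{{{OWS_NS}}}ServiceIdentification/{{{OWS_NS}}}Profile"
    return [el.text.strip() for el in root.findall(path) if el.text]
```

As the posts note, the list is service-wide: nothing in it says which feature type follows which profile, so a client can only use it as a hint.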
Sorry for this late answer (and thanks to Ollie for this extra time for voting). After internal discussion with Agnès and Sylvain, the BRGM vote is:
- Option 3
- Option 4
- Option 1 or 2
Main.FrançoisRobida - 25 Mar 2015
I'm sorry also from my side. I had a long discussion with my experts about this, and I also have to keep GSML in line with INSPIRE. Of course I'm not an official member, so my vote may not be influential.
From my perspective the option list is:
- Option 4
- Option 3
- Option 2
- Option 1
CarloCipolloni - 26 Mar 2015