Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
> On Sat, 22 Dec 2007 16:46:38 GMT, JD <jd4x4@<del.this>verizon.net>
> declaimed the following in soc.genealogy.computing:
>
>> Except that XML is specifically extensible. Meaning that at SOME
>> point the user or software would be required to handle "extra" data
>> that may be an enhancement, or partially relevant/not relevant to the
>> software and user's task. Having that built-in to the file format
>> (and therefore the software) is a HUGE step forward, imo.
>>
> And one way of "handling" that extra data is to simply ignore
> unknown tags. No different than current GEDCOM.

I personally wouldn't want software that ignored something and didn't
at least provide me with a means of dealing with it. But if it did, as
you say, the data can be transformed with XSLT, which is exactly what I
did with my publishing source data. I had several sources, each with a
slightly different source schema, but all of the data was related, and
I created my own "final" schema from them for the use I required. And
the software allowed my data to be exported with my schema, which could
then (if desired) be transformed back into the two original source
schemas. And I didn't have to consult multiple sources to find out for
sure whether source A widgets were the same as source B widgets,
because I knew the definition as well as the context from the
respective schema.

> It also means that the XML would not have a doctype and can not be
> used by a validating parser. If it has a defined doctype than it has
> probably gone through some standardization process and the client
> programs have been coded to make use of that "standard".

See the above. My DTP software allowed me to define my own schema and
validate against it. The "standardization" you refer to doesn't mean
that the data has been changed, only reordered... to YOUR schema.
Software doesn't have to do that alone. YOUR schema can be YOUR
ordering of the data.

> Yes, XML is "extensible"...
> But in practically all commercial software, that "extension" ability
> is achieved by first creating a doctype that defines what all the
> VALID entities will be for an application. Parsers then reference
> that doctype, and maybe XSLT transformation rules to change the
> presentation of the data.
>
> While an XSLT transform may be able to work without needing an
> external doctype reference (as long as both the XML document and the
> XSLT rules were created in parallel) these transforms are typically
> from one representation of the data to another. Where I work, we have
> an XML doctype used to define data types, interface methods, etc...
> The transform engine (currently uses Oxygen8) with a set of XSLT
> templates can take the XML to generate either an Ada main program and
> linkage source code, or generate a structured (docbook) set of
> documentation. Neither operation will succeed fully unless all tags
> in the XML validate against the doctype specification file.

And should you be in a position (as I was) to need a third application
for the data, one with a somewhat different output, you could then
define your own schema and validate against it.

> XML is, in effect, a language for defining application specific
> mark-up languages. As such, the end "language" is only useful if all
> sides agree on how to handle the content of the tags.

Not true. Both sides only need to agree on what the most descriptive,
accurate, and useful rendering of the DATA for the content is.

>> Data from a multitude of good sources (along with bad sources) is on
>> the web. Eventually ALL of the data from good sources will be tagged
>> using XML tags at some point. Imagine good data, properly tagged,
>> able to be read by your software and auto-tagged with it's source
>> info, and PASSED ON because the file format supports it.
>
> So far as I know, even current GEDCOM supports source citing -- but
> the users of the programs are not rigorous in doing that.
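For what it's worth, the kind of schema-to-schema reshaping I described
above (and the "ignore unknown tags" strategy) can be sketched in a few
lines. I used XSLT for the real job; this is just a stdlib Python
stand-in to show the shape of the transform, and every tag name here
("catalog", "widget", etc.) is a made-up placeholder, not anything from
an actual GEDCOM or publishing schema:

```python
# Sketch: rename one source schema's elements into my "final" schema,
# silently dropping tags the mapping doesn't know about.
import xml.etree.ElementTree as ET

SOURCE_A = """<catalog>
  <widget><name>gadget</name><price>1.50</price></widget>
  <widget><name>gizmo</name><price>2.75</price><internal>x</internal></widget>
</catalog>"""

# Map source-A element names onto the names my final schema uses.
RENAME = {"catalog": "items", "widget": "item",
          "name": "label", "price": "cost"}

def transform(elem):
    """Recursively rebuild the tree under the final schema's names.
    Unknown tags are simply skipped -- the 'ignore' strategy."""
    if elem.tag not in RENAME:
        return None
    out = ET.Element(RENAME[elem.tag])
    out.text = elem.text
    for child in elem:
        new_child = transform(child)
        if new_child is not None:
            out.append(new_child)
    return out

final = transform(ET.fromstring(SOURCE_A))
print(ET.tostring(final, encoding="unicode"))
```

Running the same idea in reverse (an inverted RENAME table) is how the
data could be exported back into one of the original source schemas.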
So we shouldn't perpetuate that by failing to provide a mechanism that
could facilitate automation, imo.
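The "auto-tagged with its source info, and PASSED ON" idea is easy to
mechanize once the format carries the citation. Here's a rough sketch,
again in stdlib Python with made-up tag and attribute names (real
software would use whatever the agreed schema defines for citations):

```python
# Sketch: when a record is imported, stamp every element with its
# source citation so the provenance travels with the data from then on.
import xml.etree.ElementTree as ET

record = ET.fromstring(
    "<individual><name>John Doe</name><birth>1842</birth></individual>")

def stamp_source(elem, citation):
    """Attach a source citation attribute to an element and all of
    its descendants."""
    elem.set("source", citation)
    for child in elem:
        stamp_source(child, citation)

stamp_source(record, "1850 census, roll 432")
print(ET.tostring(record, encoding="unicode"))
```

Nothing here is beyond what software could do today; the point is that
a file format with citation support makes it automatic rather than
optional.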