First off.. Merry Christmas, Happy Holidays, and a good winter solstice to all.

"Tony Proctor" <tony_proctor@aimtechnology_NoMoreSPAM_.com> wrote:

> "Dennis Lee Bieber" <wlfraed@ix.netcom.com> wrote in message
> news:13mrvtst5hh523c@corp.supernews.com...
>> On Sun, 23 Dec 2007 04:33:37 GMT, JD <jd4x4@<del.this>verizon.net>
>> declaimed the following in soc.genealogy.computing:
>>
>> > I personally wouldn't want software that ignored something and
>> > didn't at least provide me with a means of dealing with it. But if
>> > it did, as you say, it can be transformed with XSLT, which is
>> > exactly what I did with my publishing source data. I had several
>> > sources, each with slightly
>>
>> XSLT still requires a known source and destination format; it can't
>> take an unknown source tag and create a known destination tag with
>> meaning... Maybe it can produce some sort of blanket output for
>> unknown tags, but that will quite likely not be a reversible
>> transformation.
>>
>> > differing source schemas, but all of the data was related and I
>> > created my own "final" schema from them, for the use I required.
>> > And, the
>>
>> You "created"... The software didn't derive a consistent schema...
>>
>> Who "creates" the schema and transforms for all the many programs
>> that currently exist?

Whoever finds it relevant. Really though, Tony's point about the "data
model" is where it has to start, imo. But more on that later..

>> > See the above. The "standardization" you refer to doesn't mean that
>> > the data has been changed, only reordered.. to YOUR schema.
>> > Software doesn't have to do that alone. YOUR schema can be YOUR
>> > ordering of the data.
>>
>> To me, said ordering requires prior knowledge of what the meaning of
>> various tags IS... What if "my" data considers "fourth flood of the
>> river <x> in the reign of <y>" to be acceptable as a date? (Okay, even
>> TMG would consider that a very irregular date.)
>> How would your software treat something that outputs
>> "<date>....</date>" vs
>> "<date><month>...</month><day>...</day><year>...</year></date>"?
>>
>> Besides... I'm buying the software to handle the genealogical data
>> and reporting... I'm not writing my own package in which I have the
>> option of defining transforms into what I think should be used...
>> Unless all the producers of said software agree on what is valid
>> data, commercial software will not be able to /losslessly/ accept the
>> data of others.

I really don't think there is much variation in the agreement on the
actual "core" data.. but that means different things to different
people, mainly those that are stuck on "names" for the data, etc. A
person is a person, a date is a date. It all depends on where you are
starting in your "relevancy" model as to what "extra" bits are attached
to the core data.

>> > And should you be in a position (as I was) to need a third
>> > application for the data that would have a somewhat different
>> > output... you could then define your own schema and validate
>> > against it.
>>
>> How many weekend genealogists are going to even know what an XML
>> transform is, much less write one to handle one source of data?

They shouldn't have to. That's part of my point for having the software
deal with it in a meaningful & useful context.

>> > So we don't need to perpetuate that by not providing a mechanism
>> > that could facilitate automation, imo.
>>
>> Well, we could insist that all extant genealogy programs be modified
>> to refuse to accept any data entry that doesn't have some sort of
>> source citation, even if it is nothing more than "personal knowledge
>> of <xyz>" --
>> Wulfraed   Dennis Lee Bieber   KD6MOG
>> wlfraed@ix.netcom.com   wulfraed@bestiaria.com
>> HTTP://wlfraed.home.netcom.com/
>> (Bestiaria Support Staff: web-asst@bestiaria.com)
>> HTTP://www.bestiaria.com/

> Refuse is pretty harsh/final.
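On Dennis's irregular-date point above: the transform doesn't have to choose between rejecting data and silently mangling it. Here's a rough sketch of what I mean by "dealing with it" -- made-up element names, and Python instead of XSLT purely for brevity:

```python
import xml.etree.ElementTree as ET

def normalize_date(date_el):
    """Collapse a structured <date> into ISO-style text where possible;
    pass free-text dates through unchanged, flagged as irregular."""
    y = date_el.findtext("year")
    m = date_el.findtext("month")
    d = date_el.findtext("day")
    out = ET.Element("date")
    if y and m and d:
        out.text = "%s-%02d-%02d" % (y, int(m), int(d))
    else:
        # Irregular date ("fourth flood of the river..."): keep the text
        # verbatim and mark it, so the transform loses nothing.
        out.text = (date_el.text or "").strip()
        out.set("irregular", "true")
    return out

structured = ET.fromstring(
    "<date><year>2007</year><month>12</month><day>23</day></date>")
print(ET.tostring(normalize_date(structured)).decode())
# -> <date>2007-12-23</date>
```

A free-text date comes through verbatim with irregular="true" on it, so nothing is thrown away and the receiving software can still surface it to a human.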
Giving you an overview of the usage in the context of the sending
schema, and allowing you and/or the software to automate its
transformation (via XSLT, etc.), is preferred.

> XML is touted as some sort of panacea. It is an improvement on the
> plethora of data formats (in all IT areas) that existed previously,
> but it has to be understood for what it is. It is merely a
> standardised syntax for representing hierarchical data. That
> standardisation therefore only applies to the syntax, not to the
> semantics. What this means, in layman speak, is that any XML file is
> instantly recognisable as "XML" but it doesn't make the content any
> more understandable.
>
> Sure, there are lots of tools for loading/viewing/manipulating XML but
> they only know about the syntax, not the semantics. Yes, you can write
> your own XSLT (which I have to say is an awful language) but all
> those transformations would be doing is manipulating the syntax, e.g.
> removing stuff, moving stuff around, extracting stuff, etc. In
> principle, this would all be possible with any documented data format,
> including GEDCOM, at the expense (& risk) of having to write a little
> more of the necessary software yourself.
>
> I firmly believe that a "data model" has to be defined and accepted
> first. This subject has come up several times in this group, and links
> have been posted here about ongoing projects striving to achieve this.
> Once such a data model specification exists, then representation of it
> in any data format (XML, GEDCOM, some other) is almost a mechanical
> operation.

I agree totally. Where I seem to differ with everyone else is that I
think the "data model" has been discussed with a confused mixture of
the actual data element definitions and the use of them in various
hierarchies, as well as with differing attributes and expansions.
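To put Tony's syntax-vs-semantics point in concrete terms: every XML tool will happily parse both of the records below, but only an agreed data model can say they mean the same thing. A quick sketch (invented tag names, Python again just for brevity):

```python
import xml.etree.ElementTree as ET

# Both records parse as perfectly valid XML, and both "mean" the same
# thing -- but no generic XML tool can know that <b> here is a birth year.
a = ET.fromstring("<person><name>JD</name><birth>1950</birth></person>")
b = ET.fromstring("<indi><n>JD</n><b>1950</b></indi>")

# The semantic bridge has to be supplied by a human (or a published
# spec) -- the XML machinery itself can't derive it:
TAG_MAP = {"indi": "person", "n": "name", "b": "birth"}

def translate(el):
    """Rename tags recursively according to the agreed mapping."""
    out = ET.Element(TAG_MAP.get(el.tag, el.tag))
    out.text = el.text
    out.extend(translate(child) for child in el)
    return out

print(ET.tostring(translate(b)).decode())
# -> <person><name>JD</name><birth>1950</birth></person>
```

The TAG_MAP is the part no tool can write for you. That mapping *is* the data model, and it's exactly the thing that has to be agreed first.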
I think once XML is used as simply a transport (meaning at least a
basic, expandable core data set and maybe even a rough hierarchy
defined), then extensions of the data set and differing hierarchies
would emerge naturally.

> Tony Proctor

I haven't really begun to evaluate it yet, but on the surface I think
GenoPro may be on the right track. It lacks some traditional display
concepts and possibly needs work in the research data organization and
entry areas, but it appears to be a very flexible, extendable start.
It's at http://www.genopro.com . Its strength, I think, is in its use
of remapping data through "reports" into XML files as well as displays.
I don't think it helps much with the schema & transforms, but I've only
just installed the trial.