Note: The Rootsweb Mailing Lists will be shut down on April 6, 2023. (More info)
RootsWeb.com Mailing Lists
    1. GUIDS/UUIDS and the whichness of what (Was: How Should We Store Evidence in Genealogical Databases?)
    2. Bob Melson
    3. On Friday 27 May 2011 22:03, Wes Groleau ([email protected]) opined: > On 05-27-2011 22:28, Bob Melson wrote: >> So, will SOMEbody please 'splain me this thing called a >> globally/universally unique ID and its place in the grand scheme of >> things? > Thanks. It's about what I expected - and took away from the previous exchange. > UUID / GUID were not created by genealogists to identify multiple > definitions of a person. They were created by computer types to > distinguish between two things that can't be told apart any other way. > Kind of like a serial number. > > A checksum on the other hand, is a way to be almost certain that > two items have or don't have identical contents without actually > comparing byte by byte. Or to verify that the item has or hasn't > changed since the checksum was generated. And here you have it. My mind insists that *IDs _should_ be like some sort of super checksum, with the same "record" resulting in the same *ID no matter _where_ the record might be found. Dunno if it'd be any more useful than a *ID, but it certainly couldn't be any _less_ useful. (And the pesky critter - my mind, that is - insists there's gotta be a reasonable use for *IDs beyond taking up space in a database.) > > If a record on your machine and one on my machine have identical > UUIDs, then either one of them was copied from the other (NOT > independently generated) or one of us (or our software) was naughty and > altered a UUID. If the UUIDs match and the items do not, then > either someone changed the UUID on another record, or changed the record > without giving it a new UUID. Well-behaved software will not > gratuitously change a UUID. But lots of programs will fail to create a > new UUID when the item ceases to be a copy of the other. That, in my > opinion makes them useless for genealogy. But they were never intended > to be some magic way of automatically identifying independently > generated records as being representations of the same entity. 
Then why bother with 'em? I won't further belabor the point. Kinda like the Kipling poem "East is East and West is West, and never the twain shall meet, 'til earth and sky meet presently at God's great Judgement Seat", except here we have GUID and UUID and ... > > -- > Wes Groleau > > There are two types of people in the world … > http://Ideas.Lang-Learn.us/barrett?itemid=1157 Sighin' Ol' Bob -- Robert G. Melson | Rio Grande MicroSolutions | El Paso, Texas ----- The greatest tyrannies are always perpetrated in the name of the noblest causes -- Thomas Paine
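
The distinction Wes draws between a UUID and a checksum can be sketched in a few lines. This is only an illustration using Python's standard `uuid` and `hashlib` modules; the record text and the variable names are made up, and no genealogy program is claimed to work this way.

```python
import hashlib
import uuid


def checksum(record: str) -> str:
    """Content checksum: identical text always yields the identical digest."""
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


# The same (hypothetical) record typed independently on two machines.
record_on_my_machine = "Mortimer Melson, b. 1850, El Paso, Texas"
record_on_your_machine = "Mortimer Melson, b. 1850, El Paso, Texas"

# A random (version 4) UUID is assigned to a record, not derived from it,
# so two independent entries get two different serial numbers.
my_uuid = uuid.uuid4()
your_uuid = uuid.uuid4()

same_checksum = checksum(record_on_my_machine) == checksum(record_on_your_machine)
same_uuid = my_uuid == your_uuid

# Editing the record changes its checksum; the UUID is simply whatever was
# stored with the record, so it stays the same unless software replaces it.
edited = record_on_my_machine + ", d. 1920"
checksum_changed = checksum(edited) != checksum(record_on_my_machine)

print("same checksum:", same_checksum)
print("same UUID:", same_uuid)
print("checksum changed after edit:", checksum_changed)
```

The checksum answers "do these two records have the same contents?", while the UUID answers "is this a copy of that particular record?", which is the serial-number role described in the quoted text.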

    05/27/2011 04:57:57
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Bob Melson
    3. On Friday 27 May 2011 18:22, Wes Groleau ([email protected]) opined: > On 05-27-2011 10:53, Bob Melson wrote: >> Seems to me at least one requirement for universality is that the same >> data input on different machines results in the same output on those >> machines. This is absolutely not true - in my experience - when dealing >> with UUIDs. Worse yet, not only does the same data NOT result in >> identical UUIDs when input on different machines running different >> software, it doesn't even result in identical UUIDs when input on >> different machines using >> _identical_ software. This would, IMO, support the contention that >> uniqueness and universality is restricted to a single machine and >> application on that machine. > > UUID = Universal Unique ID. If two machines generate the same string, > then they have blown the Unique criteria. It is not intended to be a > tag that is common to distinct items that happen to be identical. > That would be a checksum. :-) > > -- > Wes Groleau > > There are two types of people in the world … > http://Ideas.Lang-Learn.us/barrett?itemid=1157 I'll answer this and the one immediately previous with this. To start, I don't want to re-ignite the previous discussion regarding *IDs, aside from saying that it seems to me that a {Globally|Universally} Unique ID should indeed be unique everywhere; given the identical input even on different machines; the resulting ID should, or so it seems to me, be identical but unique to that input. This is not "blowing" the unique criteria, any more than identical checksums derived from identical strings on different machines "blows the checksum criteria". Matter of fact, I think a checksum would be a helluva lot more useful than a *ID when you come right down to it. All that said, it's more than likely that I have a faulty understanding of what "globally unique" or "universally unique" actually mean. Based strictly on the meanings of the words, though ... 
The ID produced on my machine uniquely identifies a record ON my machine but is otherwise of no value and, to my mind, appears to be redundant as there are other record identifiers that "uniquely" identify that record. That same record on another machine will produce another unique ID, different from the one produced on my machine and valid only on the machine producing it. Go to a third (or a fourth or an Nth) machine with an identical record and you'll get a 3d or 4th or Nth ID, different from all others and, IMO, valueless for identifying the record anywhere except on the machine on which the ID is produced. The end result is that we have N records with N IDs, all unique, and none of 'em (the IDs) useful for any discernible purpose. So, will SOMEbody please 'splain me this thing called a globally/universally unique ID and its place in the grand scheme of things? Stumped Ol' Bob -- Robert G. Melson | Rio Grande MicroSolutions | El Paso, Texas ----- The greatest tyrannies are always perpetrated in the name of the noblest causes -- Thomas Paine

    05/27/2011 02:28:20
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Wes Groleau
    3. On 05-27-2011 10:53, Bob Melson wrote:
       > Seems to me at least one requirement for universality is that the same data
       > input on different machines results in the same output on those machines.
       > This is absolutely not true - in my experience - when dealing with UUIDs.
       > Worse yet, not only does the same data NOT result in identical UUIDs when
       > input on different machines running different software, it doesn't even
       > result in identical UUIDs when input on different machines using
       > _identical_ software. This would, IMO, support the contention that
       > uniqueness and universality is restricted to a single machine and
       > application on that machine.

       UUID = Universal Unique ID. If two machines generate the same string, then they have blown the Unique criteria. It is not intended to be a tag that is common to distinct items that happen to be identical. That would be a checksum. :-)

       --
       Wes Groleau
       There are two types of people in the world …
       http://Ideas.Lang-Learn.us/barrett?itemid=1157

    05/27/2011 02:22:11
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Wes Groleau
    3. On 05-27-2011 03:55, Tom Wetmore wrote:
       > You were right and should have stuck by your guns!!

       I may be misremembering, but I think the disagreement was that someone, perhaps Bob, thought that the UID would be a universal ID for a _person_ which could be used for that person in all databases. Others stated, no, it distinguishes a _record_ from other records that otherwise might seem identical.

       > I don't know where the others got the interpretation that a UID should
       > be unique only within one database. This is certainly not a GEDCOM rule

       I don't recall seeing that interpretation.

       --
       Wes Groleau
       There are two types of people in the world …
       http://Ideas.Lang-Learn.us/barrett?itemid=1157

    05/27/2011 02:18:44
    1. RE: Event-based database software for historians, biographers and genealogists - redux
    2. Harrison Genealogy
    3. Steve Also the following has appeared in the Custodian Forum.... <Snip> Currently, Custodian builds Indexes for Name and Place. I use User Field 1 to keep my paper filing reference in. If this was indexable, and had a View similar to Names (contents of User Field 1 in left frame, Custodian entry in right hand frame), I could check that Custodian has details of everything I have paper copies of, and vice versa. Currently, I believe, I can only run this sort of check on groups of databases (eg Census), not across all databases. Reply .... Hi, Would it be useful to show the User Field 1 field in the Name Index? You can already do this - from the Name Index, choose Views from the Shortcut Bar, then choose the View Wizard. Give the 'new view' a name (or click on Main View to edit the original view)and click on Next and in the next window, find your User Field 1 and tick the check box. When you originally showed the User Field 1 in your form, you could also change the field name to something more meaningful (Filing Ref, or something similar). You might want to go back and do this so the column has a more meaningful title. You can also move the column, once showing in the Name Index, to the front of the other columns or a position where it would be seen without scrolling - just drag and drop the column header and use File, Save Layout. <Snip> Regards Bill -----Original Message----- From: [email protected] [mailto:[email protected]] On Behalf Of Steve Hayes Sent: 27 May 2011 15:47 To: [email protected] Subject: Re: Event-based database software for historians, biographers and genealogists - redux On Fri, 27 May 2011 09:23:03 +0100, "Harrison Genealogy" <[email protected]> wrote: >Sorry about that it is as you say for tracking Documents BUT documents >relate to events .... So can you enlighten me when you say tracking events >what do you actually mean ? Yes, but you can have several document relating to a single event. An event could be: 1. 
A 21st birthday party -- documents relating to who was there, who was drunk, who was sober could include a photo (or album of photos), a diary entry, a letter from someone who was there describing the party. Each document may mention different people who were present, but the event record links to all persons who were listed in any of the documents as being present. 2. A committee meeting -- documents could include minutes, letters etc. 3. A car accident -- documents could include witness astatements etc. 4. Publication of a book -- people involved could be author, editor, literary agent etc. And so it goes. -- Steve Hayes from Tshwane, South Africa Web: http://hayesfam.bravehost.com/stevesig.htm Blog: http://methodius.blogspot.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk ------------------------------- To unsubscribe from the list, please send an email to [email protected] with the word 'unsubscribe' without the quotes in the subject and the body of the message

    05/27/2011 12:15:36
    1. RE: Event-based database software for historians, biographers and genealogists - redux
    2. Harrison Genealogy
    3. Steve

       YES ... you can do something like that with Custodian ... for example, in the Probate section you can enter the whole of a will, then highlight the names of people mentioned in the text and send them to the Names Index. Then with the SQL report you can interrogate the record.

       With the spare fields available you can add whatever data you want to any database.

       Regards
       Bill

       -----Original Message-----
       From: [email protected] [mailto:[email protected]] On Behalf Of Steve Hayes
       Sent: 27 May 2011 15:47
       To: [email protected]
       Subject: Re: Event-based database software for historians, biographers and genealogists - redux

       On Fri, 27 May 2011 09:23:03 +0100, "Harrison Genealogy" <[email protected]> wrote:
       >Sorry about that it is as you say for tracking Documents BUT documents
       >relate to events .... So can you enlighten me when you say tracking events
       >what do you actually mean ?

       Yes, but you can have several documents relating to a single event. An event could be:

       1. A 21st birthday party -- documents relating to who was there, who was drunk, who was sober could include a photo (or album of photos), a diary entry, a letter from someone who was there describing the party. Each document may mention different people who were present, but the event record links to all persons who were listed in any of the documents as being present.
       2. A committee meeting -- documents could include minutes, letters etc.
       3. A car accident -- documents could include witness statements etc.
       4. Publication of a book -- people involved could be author, editor, literary agent etc.

       And so it goes.

       --
       Steve Hayes from Tshwane, South Africa
       Web: http://hayesfam.bravehost.com/stevesig.htm
       Blog: http://methodius.blogspot.com
       E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk

    05/27/2011 11:56:10
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Richard Smith
    3. On May 27, 3:53 pm, Bob Melson <[email protected]> wrote:
       > Seems to me at least one requirement for universality is that the same data
       > input on different machines results in the same output on those machines.

       No, you're misunderstanding the meaning of the 'universal' in a UUID (or 'global' in a GUID, if you prefer the Microsoft terminology). The guarantee that a UUID provides is that every time you generate one, it will be unique across all databases, past, present or future, on any computer, anywhere in the world. UUIDs are not necessarily generated for any particular data, though in practice in a genealogical application, they may well be associated with a piece of data, such as a person, event or place.

       What you're asking for is not possible with a UUID, partly because there's no concept of generating a UUID for a specific piece of data -- there is no input when you generate a UUID.

       What you're asking for sounds more like a hash. This is where you take everything you know (or perhaps just a specific part of the data), and use it to generate a number -- the hash -- which can be used as an identifier. Each time you have the same input, you'll get the same hash out.

       But that's not actually very useful for genealogy. We've all got individuals on our family trees about whom we know very little, and what we do know is poorly documented. Someone in the family (but you can't remember who) said that second-cousin Bob had a nephew called John Smith who lived in London. All we know of this John Smith is that he lived in London. That must describe hundreds or thousands of different people. How do we ensure that the genuinely different John Smiths end up with different identifiers, while also ensuring that two different researchers, without co-operating or even mutual knowledge of each other, can end up assigning the same identifier to the same individual even if they both have exactly the same information?

       It can't be done. It's all well and good saying we'd like it, but it's a technical impossibility. Even with the researchers co-operating in generating identifiers (for example, by using some central internet-based generator), it can't be done because "John Smith in London" simply isn't a unique handle.

       So we must compromise. Either we lose uniqueness -- that is, accept that two different people might sometimes get assigned the same "unique" identifier. Or we lose repeatability -- that is, accept that sometimes the same person, with the same known information, will be assigned multiple identifiers. In the former case, a hash is a good implementation strategy; in the latter, a UUID is good. Or we may decide that because we've lost at least one of these guarantees, we may as well lose both and go for a simpler implementation, such as the xrefs used in GEDCOM (these are the I0001-type things).

       Richard
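
The three strategies Richard compares can be put side by side in a short sketch. This is an illustration under stated assumptions, not anyone's actual implementation: the function names, the normalisation rule and the `genealogy.example` namespace are all hypothetical. One detail worth noting: RFC 4122 also defines name-based UUIDs (versions 3 and 5), which do take an input, but they behave like the hash here and so sit on the same side of the trade-off.

```python
import hashlib
import itertools
import uuid


def hash_id(name: str, place: str) -> str:
    """Repeatable but not unique: derived only from what is known."""
    normalized = f"{name.strip().lower()}|{place.strip().lower()}"
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:12]


def uuid_id() -> str:
    """Unique but not repeatable: a fresh random (version 4) UUID per call."""
    return str(uuid.uuid4())


_counter = itertools.count(1)


def xref_id() -> str:
    """GEDCOM-style xref: a simple counter, meaningful only within one file."""
    return f"I{next(_counter):04d}"


# Two researchers independently record "John Smith, London". They may or may
# not mean the same man; the identifiers cannot tell the difference.
researcher_a = hash_id("John Smith", "London")
researcher_b = hash_id("john smith ", "London")
uuid_a = uuid_id()
uuid_b = uuid_id()

# Name-based UUID (version 5): deterministic, so it is a hash in UUID clothing.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "genealogy.example")  # hypothetical
named_a = uuid.uuid5(NAMESPACE, "john smith|london")
named_b = uuid.uuid5(NAMESPACE, "john smith|london")

first = xref_id()
second = xref_id()

print("hash ids equal:", researcher_a == researcher_b)
print("random UUIDs equal:", uuid_a == uuid_b)
print("name-based UUIDs equal:", named_a == named_b)
print("xrefs:", first, second)
```

With the hash, two genuinely different John Smiths collide; with the random UUID, one John Smith entered twice gets two identifiers. Neither outcome is a bug in the sketch; it is the compromise described above.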

    05/27/2011 11:47:33
    1. Re: Event-based database software for historians, biographers and genealogists - redux
    2. Steve Hayes
    3. On Fri, 27 May 2011 09:23:03 +0100, "Harrison Genealogy" <[email protected]> wrote:
       >Sorry about that it is as you say for tracking Documents BUT documents
       >relate to events .... So can you enlighten me when you say tracking events
       >what do you actually mean ?

       Yes, but you can have several documents relating to a single event. An event could be:

       1. A 21st birthday party -- documents relating to who was there, who was drunk, who was sober could include a photo (or album of photos), a diary entry, a letter from someone who was there describing the party. Each document may mention different people who were present, but the event record links to all persons who were listed in any of the documents as being present.
       2. A committee meeting -- documents could include minutes, letters etc.
       3. A car accident -- documents could include witness statements etc.
       4. Publication of a book -- people involved could be author, editor, literary agent etc.

       And so it goes.

       --
       Steve Hayes from Tshwane, South Africa
       Web: http://hayesfam.bravehost.com/stevesig.htm
       Blog: http://methodius.blogspot.com
       E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
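
The event-centred model Steve describes, where one event has several documents and the event links to everyone named in any of them, might be sketched as below. The class names, fields and sample people are assumptions made for illustration; neither Custodian nor Clooz is claimed to store data this way.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    """One piece of evidence about an event (photo, diary entry, letter...)."""
    title: str
    people_mentioned: frozenset[str]


@dataclass
class Event:
    """An event that several documents may relate to."""
    description: str
    documents: list[Document] = field(default_factory=list)

    def participants(self) -> set[str]:
        """Everyone listed in any document attached to this event."""
        people: set[str] = set()
        for doc in self.documents:
            people |= doc.people_mentioned
        return people


# Hypothetical example: a 21st birthday party recorded in two documents.
party = Event(
    "21st birthday party",
    [
        Document("photo album", frozenset({"Alice", "Ben"})),
        Document("diary entry", frozenset({"Ben", "Carol"})),
    ],
)

print(sorted(party.participants()))
```

The point of the design is that the person-to-event link is derived from the documents rather than entered separately, so adding a newly found letter automatically extends the list of people present.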

    05/27/2011 10:46:31
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Richard Smith
    3. On May 23, 12:52 pm, Tom Wetmore <[email protected]> wrote:
       > This thread is an offshoot from the Linux thread that is going off on a number of tangents.
       >
       > How should we store evidence in genealogical databases?

       [I've been away for a few days, so apologies for coming back into the discussion rather late.]

       I regard genealogical research as a seven stage process, and I tend to handle the data generated at each stage in different ways.

       1) Planning

       Sometimes I've got a specific objective in mind -- something like "find out who Thomas Smith's parents are". For each of these objectives, I create a text file with a few notes about where might be a good place to search for evidence, where I've already looked, and a mixture of speculation and notes to myself. I name the file by surname, name and some additional suffix (say "the boot-maker") to make the person unique; if there's more than one plan per person (there rarely is), I'll disambiguate it in some further way. I also use symlinks (a bit like Windows shortcuts) to maintain an index of such plans by ancestor number in a separate directory.

       As I've got further back, I've found more and more frequently I don't have such a specific objective. The ultimate objective is usually to push back one or more generations, but I'm no longer specifically targeting records with that individual in mind; instead, I'm gathering as much information as I can on the surname in the area. I have a directory with a more general set of plan files with just a surname and area (typically a parish name somewhere near the centre of the area of interest).

       I use a revision control system (currently CVS) to keep track of changes to these plan files, and also to assist in backing them up.

       2) Searching

       Whenever I search for something, I try to note the fact that I've done it in one of the plan files. This is particularly important if the search fails to find anything.
       If I'm in a records office, I tend to have a printout of the plan file and scribble on it, typing the notes up later. Sometimes I do the same for on-line research. I find on-line sites such as ancestry.com and familysearch.org particularly troublesome in this regard -- it's far too easy to spend an hour or two searching for things and forget to note anything down. Neither site keeps a log (at least, not one that's available to the user) of what you've searched for, so you can't go back and write it up later.

       For this reason I no longer use familysearch.org directly. The only time I ever used it was to look up things on the IGI, so I wrote a Perl script to drive the (old) site, do searches for me, download the full data set as GEDCOM and log each search I do to the appropriate plan file. The program requires me to associate the search with a specific plan, so I can't avoid recording the fact I've done a search. Putting these search logs into a database, and associating them with a source and/or repository, would be an obvious improvement.

       I did briefly experiment with gnote and mediawiki for the plan files but gave up -- I found them both overkill for what I wanted.

       The result of the search will vary. It might be a piece of GEDCOM (as per the example above), or an image (e.g. a census image on ancestry.com), or an entry in a book (in which case I may or may not have been able to make a copy of it). Any paper copies I do end up with get scanned, and everything gets stored in directories, classified by type of record and surname. I'm not a big fan of putting things like images in a database, though indexing them in a database would be useful. At the moment, the only index I have is the directory listing. (As with plan files, I sometimes use symlinks if one document should be filed in multiple places.)

       3) Transcribing

       Having found a document, the next job is to transcribe it. Often the result is a flat text file, again one file per source.
       I try to transcribe the document as accurately as plain text will allow, and there's the odd bit of ad hoc mark-up in it to document important bits of formatting: e.g. [struck-through: my daughter Isabella] or [inserted: Hampshire]. I very much like the idea that Nick Matthews suggested elsewhere of using XML for this, and may well start doing so. In longer documents, such as wills, I tend to put asterisks around people's names to assist in searching; similarly, I often add ISO-style dates in parentheses [2011-05-24]. I don't do similar tagging for place names, though if I move to a light-weight XML format, I probably will do.

       In other cases, the source is essentially a long table. Baptism registers or census forms are a good example of this. In these cases, I use a tab-separated text file to record each field. That makes it easy to import into a spreadsheet or database, but at present the primary version is simply in the text files. Sometimes I'll use a spreadsheet to create them too, especially if I'm entering a large number by hand. If I need to add extra notes, they end up in the rightmost column. Tabular data of this sort is, again, an example of something that could usefully go into a database.

       At the moment, the text files get stored in CVS to retain a version history and to back them up.

       4) Translating

       This stage is often irrelevant as the source is often in English (the only language I speak fluently). When it is necessary, I put the translation below the original transcript, in the same file as it. Even in English documents, there's sometimes an element of translation: for example, I'll add a note to remind myself what I think some obscure word or abbreviation means.

       5) Extracting

       This is the stage that seems to be causing all the excitement here. It is when I extract the genealogical content from the source and put it into some computer-readable form. Typically I use GEDCOM as the destination format, simply because of its ubiquity.
       Sometimes I find GEDCOM inadequate for the purpose. For example, if a will mentions two grandchildren but gives no indication of whether the grandchildren are siblings, there's no way of expressing this in GEDCOM. In such a case, I'll either misuse GEDCOM to express what I need as best I can, or simply not bother extracting that bit of information (perhaps instead putting it into a text note).

       For things like censuses, baptisms and so on, because the result of the transcription is already in a nice easy-to-parse tabular form, I have scripts that automatically create GEDCOM from the tables. Sometimes it needs hand editing afterwards to add some extra information that was in the source, but outside of the expected data -- for example, I once found a census on which two children had been grouped together with a big "}" and "twins" written next to them. In earlier baptism registers, the data is often more or less tabular, but with implicit fields recording whatever the priest felt was necessary; and occasionally an entry will have extra information included. Such cases need manual handling.

       I've also got a number of scripts that create blank bits of GEDCOM -- templates, if you like -- that I can then fill in. These fill in suitable source information.

       The result is hundreds of small GEDCOM files, one per source. Some (e.g. from a gravestone) just contain a single individual and little else; others (e.g. from a parish register or from an IGI search) may contain hundreds of individuals, some of whom may be duplicates (for example, if a couple have three children baptised, then the parents will appear three times). These GEDCOM files then get stored in CVS -- even the automatically generated ones.

       I will sometimes upload them into a genealogy program, but as I've not really settled on one that I like, I regard the GEDCOM as the primary version and never (well, rarely, anyway) use the program to make changes. It's just a tool to help me process or visualise the information.
       I've also experimented with converting the GEDCOM to RDF and importing it into an RDF processor (typically the Redland one) so that I can run SPARQL queries against it. This is really powerful, but also painfully tedious to use. I do see a future for something like this, though.

       I've also got a script that can search a directory tree of GEDCOM files looking for people that match specific criteria -- at the moment, it's pretty primitive, basically just doing name, date at a particular event, role in the event. It was originally designed for looking for baptisms, but has expanded a bit.

       6) Reasoning

       This is the stage that most people think of as genealogy. It's where I try to work out how I need to combine the persona-level data extracted from the sources into real people. Was the John Smith in the 1851 census the one baptised in North Dunny or South Dunny, or maybe neither?

       This typically involves looking through all of the extracted persona-level data for people with the same (or a similar) surname in the locality over quite a long period. I tend to the view that unless I can understand every instance of the surname in the source record, I cannot be confident that I've pieced it all together correctly. (And sometimes even then I can't be confident of it.) An unexplained burial could be evidence that what I had considered to be one family was in fact two, for example, and that might have knock-on effects elsewhere.

       How I work at this stage depends on how many people I have. Sometimes there are few enough personae that I can keep everything in my head. For larger groups, I tend to print things out and spread everything out on my dining room table. In the very largest cases that's infeasible. For example, I once had an ancestor called John Smith and all I knew was that he was a cobbler, from Southampton, and an approximate date of birth from the 1841 census. Trying to sort out all of the Smiths in a big town was a complex task.
       (In the end I discovered evidence that he wasn't actually from Southampton after all -- he'd just lived there for a while before his marriage.) In that case, I created a spreadsheet with everyone in it. (And I still use an extended version of that spreadsheet as an index to the other records.)

       Once I've sorted things out into groups, I enter them into Gramps (my current preferred program). I'll import bits of the persona-level GEDCOM because that's a convenient way of keeping source information with it. (Irritatingly I have to strip the repository from the GEDCOM and manually reassign it because Gramps can't, so far as I know, merge repositories as it can with other things, but that's a minor difficulty.)

       But what this doesn't do is give me any way of documenting why I've merged the personae as I have. Sometimes this will be immediately obvious from the sources; but other times it won't. At times, the reasoning process is more sophisticated. I often start with a large number of possibilities, consider each one and gradually discount possibilities as being too improbable until only one remains, which for the time being I regard as probably correct. Documenting such things is tricky, but I really do care about documenting things: not primarily to justify my conclusions to others (though that is useful), but so that I can easily revisit them as further evidence comes to light, or as I correct any mistakes.

       At present, I use the plan files that I create right at the start of the whole research process to add notes on why I came to the conclusions I did. But this means the documentation behind the merging process is not kept with the merged individuals; nor is there a computer-readable link from the source to the documentation. I really want there to be, so that if I have to correct a mistake in my transcription / translation / interpretation of the source, I can readily see what knock-on effects it might have.
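
The computer-readable link wished for at the end of stage 6, from a source to the reasoning built on it, could look roughly like the sketch below. Everything here is hypothetical: the record types, the file names and the rationale text are invented for illustration and are not features of Gramps or GEDCOM.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """One person as described by a single source."""
    persona_id: str
    source_id: str  # the per-source file the persona was extracted from


@dataclass(frozen=True)
class Merge:
    """A conclusion that several personae are the same real person."""
    person: str
    persona_ids: tuple[str, ...]
    rationale: str  # why the personae were combined


def affected_merges(source_id: str, personae: list[Persona],
                    merges: list[Merge]) -> list[Merge]:
    """Return every merge that relies on a persona from the given source."""
    from_source = {p.persona_id for p in personae if p.source_id == source_id}
    return [m for m in merges if from_source.intersection(m.persona_ids)]


# Made-up data in the spirit of the North Dunny / South Dunny question.
personae = [
    Persona("P1", "census-1851.ged"),
    Persona("P2", "north-dunny-baptisms.ged"),
    Persona("P3", "south-dunny-baptisms.ged"),
]
merges = [
    Merge("John Smith", ("P1", "P2"),
          "Census age and birthplace fit the North Dunny baptism."),
]

for merge in affected_merges("north-dunny-baptisms.ged", personae, merges):
    print(merge.person, "->", merge.rationale)
```

If a transcription error turns up in one source file, a lookup like this lists the conclusions that need revisiting, along with the reasoning recorded for each.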
       7) Presentation

       The final step is presenting the data in a good way. That might mean drawing trees (which many programs seem quite poor at), drawing ancestor tables (which they're much better at), or maybe just producing indexes of people. But this step is really beyond the theme of this discussion.

       Like most people, I expect, in practice these seven steps often get blurred together, or some of them are not relevant. But whenever I find myself thinking about how to store some new sort of data, or how to rearrange the way I file things, I do find it very useful to think in terms of these seven steps.

       Richard
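
Stage 5 mentions scripts that create GEDCOM automatically from tab-separated transcriptions. A minimal sketch of what such a script might look like follows; it is not Richard's code. The column layout (ISO date, forename, surname, place, optional notes in the rightmost column), the function names and the sample rows are all assumptions, and the header is deliberately minimal rather than a fully conformant GEDCOM header.

```python
import csv
import io
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def gedcom_date(iso: str) -> str:
    """Convert an ISO date such as 1801-03-12 to GEDCOM's '12 MAR 1801'."""
    d = date.fromisoformat(iso)
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"


def baptisms_to_gedcom(tsv_text: str, source_title: str) -> str:
    """Turn tab-separated baptism rows into one small GEDCOM file."""
    lines = [
        "0 HEAD", "1 GEDC", "2 VERS 5.5", "2 FORM LINEAGE-LINKED",
        "1 CHAR UTF-8",
        "0 @S1@ SOUR", f"1 TITL {source_title}",
    ]
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    for n, row in enumerate(rows, start=1):
        iso_date, forename, surname, place = row[:4]
        lines += [
            f"0 @I{n:04d}@ INDI",
            f"1 NAME {forename} /{surname}/",
            "1 BAPM",
            f"2 DATE {gedcom_date(iso_date)}",
            f"2 PLAC {place}",
            "2 SOUR @S1@",  # every baptism cites the one source record
        ]
        # Extra notes live in the rightmost column, if present.
        if len(row) > 4 and row[4]:
            lines.append(f"1 NOTE {row[4]}")
    lines.append("0 TRLR")
    return "\n".join(lines) + "\n"


sample = (
    "1801-03-12\tJohn\tSmith\tNorth Dunny\t\n"
    "1803-07-02\tMary\tSmith\tNorth Dunny\ttwin of Ann\n"
)
output = baptisms_to_gedcom(sample, "North Dunny baptism register")
print(output)
```

Entries with information outside the expected columns, such as the bracketed "twins" annotation mentioned in stage 5, would still need hand editing after a script like this has run.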

    05/27/2011 03:33:03
    1. RE: Event-based database software for historians, biographers and genealogists - redux
    2. Harrison Genealogy
    3. Steve

       Sorry about that. It is, as you say, for tracking documents, BUT documents relate to events .... So can you enlighten me: when you say tracking events, what do you actually mean?

       Regards
       Bill

       -----Original Message-----
       From: [email protected] [mailto:[email protected]] On Behalf Of Steve Hayes
       Sent: 27 May 2011 01:47
       To: [email protected]
       Subject: Re: Event-based database software for historians, biographers and genealogists - redux

       On Thu, 26 May 2011 18:47:28 +0100, "Harrison Genealogy" <[email protected]> wrote:
       >Steve
       >
       >Such a thing already exists ! .... Its called Custodian 3 and is based on
       >about 12 Access databases all linked together ....
       >
       >A demo version is available for download at http://www.custodian3.co.uk/
       >the only restriction in the demo I believe is the number of records you can
       >enter.
       >
       >You can import from Access, Excel, VRI, IGI etc. AND you can add your own
       >fields and there is a form designer to enable to add your own records etc.

       I did check it out once, and even tried to discuss it here in comparison with Clooz, though there was little response.

       But as I remember it, it was evidence based rather than event based. It seemed to be primarily a tool for tracking documents and relating pieces of evidence, and there has been some discussion about that here too. That is not the same thing as tracking events.

       --
       Steve Hayes from Tshwane, South Africa
       Web: http://hayesfam.bravehost.com/stevesig.htm
       Blog: http://methodius.blogspot.com
       E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk

    05/27/2011 03:23:03
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Bob Melson
    3. On Fri, 27 May 2011 00:55:01 -0700 (PDT), Tom Wetmore<[email protected]> wrote: > >>On Thursday, May 26, 2011 11:31:45 PM UTC-4, Bob Melson wrote: >>> >>> Y'all'd have to check the archives for the down'n'dirty, but Wes and I >>> and I don't remember who all else had a discussion of this very thing >>> (_UID, >>> _UUID, _GUID) some time back. I contended then that UUID (Universal >>> Unique ID) and it's close kin, UID and GUID, meant that the number >>> assigned to Cousin Mortimer should be the same universally - here, >>> there, >>> wherever it appears. Not so. The "universe" is the machine where the >>> data appears and on which the software resides by which the UUID is >>> generated and it's only there that it's guaranteed to be unique. Take >>> that same data to another machine or other software and, guess what?, >>> the >>> Universal Unique ID will be different. Or, say, Mort's data changes - >>> ta DA, you may (or may not) get yet another UUID. >>> >>> The only use I can see for UUIDs is in determining the origin of a >>> particular record - if I publish Mort's data and include the UUID and >>> some time later find that Snively Whiplash has published the exact same >>> data with the exact same UUID and has claimed it to be his own, then >>> I'll know what to think about ol' Snively, won't I? >>> >>> Swell Ol' Bob >>> >>Bob, >> >>You were right and should have stuck by your guns!! >> >>I don't know where the others got the interpretation that a UID should >>be unique only within one database. This is certainly not a GEDCOM rule >>since GEDCOM doesn't even have the UID concept. Their interpretation >>can only be coming from some vendor's misinterpretation, so to treat that >>misinterpretation as if it were a rule or the way things should be is >>wrong. >> >>A UUID is intended to be unique for all time and place. If a vendor says >>they use UID's where the U means "universal" and they don't support this >>then they don't support UID. 
A program should not alter a UID value upon >>import. If it does, though, who really cares, since there's no way to >>take advantage of UID's in genealogy. So no wonder it doesn't matter to >>anyone (yet). >> >>Tom Seems to me at least one requirement for universality is that the same data input on different machines results in the same output on those machines. This is absolutely not true - in my experience - when dealing with UUIDs. Worse yet, not only does the same data NOT result in identical UUIDs when input on different machines running different software, it doesn't even result in identical UUIDs when input on different machines using _identical_ software. This would, IMO, support the contention that uniqueness and universality are restricted to a single machine and application on that machine. Bob -- Robert G. Melson | Rio Grande MicroSolutions | El Paso, Texas ----- The greatest tyrannies are always perpetrated in the name of the noblest causes -- Thomas Paine

    05/27/2011 02:53:57
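The requirement Bob describes (the same data yielding the same identifier on any machine) is the behaviour of a checksum or of a name-based identifier, not of a randomly generated one. A minimal Python sketch of the difference, using only the standard library. The record text and the namespace are invented for illustration, and nothing here is claimed to match how PAF, Legacy, or any other genealogy program actually builds its _UID values:

```python
import hashlib
import uuid

# Hypothetical record text, made up for this sketch.
record = "Mortimer Smith|b. 1850|El Paso, Texas"

# Random (version 4) UUIDs: every call yields a new value,
# regardless of the data the identifier will be attached to.
random_a = uuid.uuid4()
random_b = uuid.uuid4()

# Name-based (version 5) UUIDs: derived from a namespace plus the data,
# so the same input gives the same identifier on any machine.
# NAMESPACE is an arbitrary choice for this sketch.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")
named_a = uuid.uuid5(NAMESPACE, record)
named_b = uuid.uuid5(NAMESPACE, record)

# A checksum behaves the same way: identical contents, identical digest.
checksum_a = hashlib.sha256(record.encode("utf-8")).hexdigest()
checksum_b = hashlib.sha256(record.encode("utf-8")).hexdigest()

# Changing the record changes both the name-based UUID and the checksum.
edited = record + "|d. 1920"
named_edited = uuid.uuid5(NAMESPACE, edited)

print("random UUIDs equal:    ", random_a == random_b)
print("name-based UUIDs equal:", named_a == named_b)
print("checksums equal:       ", checksum_a == checksum_b)
print("survives an edit:      ", named_a == named_edited)
```

The trade-off is the one raised in the thread: a content-derived identifier is reproducible everywhere but changes whenever the record is edited, while a random identifier survives edits but can only match another copy if it was carried along with the record.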
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Bob LeChevalier
    3. Steve Hayes <[email protected]> wrote: >On Thu, 26 May 2011 15:03:39 -0400, singhals <[email protected]> wrote: >>Steve Hayes wrote: >>> On Wed, 25 May 2011 21:36:21 -0400, in soc.genealogy.computing you wrote: >>> >>>> singhals wrote: >>>>> PAF produces a UID for each entry; I'm pretty certain other >>>>> programs do too, particularly the ones that allow >>>>> synchronization of multiple databases. >>>> >>>> Well color me surprised. >>>> >>>> PAF and Legacy both assign a UID, which one can see in the GED. >>>> >>>> Oddly enough, and despite what I've heard said, when I >>>> imported the PAF file into a LEGACY file (direct import, not >>>> GED), the UIDs changed. Seems to me that's a bit awkward >>>> for someone trying to use those UIDs for anything. >>> >>> What is a UID, and where can I find it? >> >>Unique IDentifier -- new with PAF4 as I recall, and I think >>inspired by another program that used them. As I recall, we >>were told that if two people used two computers to enter the >>same data each computer would generate a different UID for >>that person. I don't /think/ you can see them anywhere but >>the GED. The computer can, and in PAF when you run the >>match/merge routine, it asks if you want to use UID. > >When I do that it asks me if I want to use the Ancestral File Number. > >That works quite well if the person has an Ancestral File Number, but those >numbers have now been discontinued, and no new ones are being generated. Actually, nFS is indeed generating new numbers, but calls them something different. Legacy handles them as of version 7.5 (actually, it has space for a User ID, an AF number, and a Family Search ID number, the latter being the new number). Of course, the only way to get a Family Search ID number is to enter your data on their site. lojbab --- Bob LeChevalier - artificial linguist; genealogist [email protected] Lojban language www.lojban.org

    05/27/2011 02:35:28
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Steve Hayes
    3. On Thu, 26 May 2011 15:03:39 -0400, singhals <[email protected]> wrote: >Steve Hayes wrote: >> On Wed, 25 May 2011 21:36:21 -0400, in soc.genealogy.computing you wrote: >> >>> singhals wrote: >>>> PAF produces a UID for each entry; I'm pretty certain other >>>> programs do too, particularly the ones that allow >>>> synchronization of multiple databases. >>> >>> Well color me surprised. >>> >>> PAF and Legacy both assign a UID, which one can see in the GED. >>> >>> Oddly enough, and despite what I've heard said, when I >>> imported the PAF file into a LEGACY file (direct import, not >>> GED), the UIDs changed. Seems to me that's a bit awkward >>> for someone trying to use those UIDs for anything. >> >> What is a UID, and where can I find it? >> > >Unique IDentifier -- new with PAF4 as I recall, and I think >inspired by another program that used them. As I recall, we >were told that if two people used two computers to enter the >same data each computer would generate a different UID for >that person. I don't /think/ you can see them anywhere but >the GED. The computer can, and in PAF when you run the >match/merge routine, it asks if you want to use UID. When I do that it asks me if I want to use the Ancestral File Number. That works quite well if the person has an Ancestral File Number, but those numbers have now been discontinued, and no new ones are being generated. > >> I regularly export my data from FHS to Legacy. >> >> FHS assigns an RID, which Legacy calls a RIN, but if it imports them from the >> GEDCOM file, Legacy scrambles them. >> >> So what I do is import the exported records to PAF 4, and import them directly >> from PAF to Legacy. Then the Legacy RINs correspond to the FHS ones. >> >> But what is the UID? > >The UID is not a RID/RIN; it is specifically assigned to an >individual record and does not mutate when moved around. I'll have to look at some recent Gedcom files to see if I can find them. 
-- Steve Hayes from Tshwane, South Africa Web: http://hayesfam.bravehost.com/stevesig.htm Blog: http://methodius.blogspot.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk

    05/26/2011 09:15:47
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Steve Hayes
    3. On Thu, 26 May 2011 10:09:09 +0100, "Steven Gibbs" <[email protected]> wrote: > >"Bob LeChevalier" <[email protected]> wrote in message >news:[email protected] >> >> That is the difference in my approach. I generally don't add someone >> to my data base unless I have connected them to at least one other >> person in my data base. Unlinked individuals are better dealt with in >> a flat table (spreadsheet) than in a relational data base. > >How do you do that when the data in the record is inadequate to provide >linkage? I used to keep my parish register extractions sorted in text >files, but it became impossible to find things once the files became >significantly large. > >Imagine that you have the will of a John Smith which names his sons as >William and Thomas. Imagine also that you have a marriage certificate for a >Thomas Smith that names his father as John Smith. Clearly on the evidence >I've presented they may or may not be the same people. Can you search your >text files easily to find all candidates for the Thomas Smith who married, >subject to the constraint that his father is called John? If, not having >looked at the family for a few years, you later come across a document which >confirms that Thomas has a brother William, or a document which suggests >that Thomas has no brothers, can you rearrange your thought processes to >take this into account? It might be easier if you transfer the data from a spreadsheet to a database program. It is quite easy to do that in most cases. Database programs usually have better reporting facilities. -- Steve Hayes from Tshwane, South Africa Web: http://hayesfam.bravehost.com/stevesig.htm Blog: http://methodius.blogspot.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk

    05/26/2011 09:11:42
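The search Steven Gibbs asks about (every Thomas Smith whose father is named John) is the kind of query a database handles more easily than sorted text files. A rough sketch with Python's built-in sqlite3 module follows. The table layout, column names, and the "Baptism register" row are assumptions made up for illustration; only the will and the marriage certificate come from the example in the message:

```python
import sqlite3

# In-memory database; a real extraction file would live on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical layout: one row per person mentioned in a source document.
conn.execute(
    """
    CREATE TABLE mentions (
        id           INTEGER PRIMARY KEY,
        source       TEXT,
        given        TEXT,
        surname      TEXT,
        father_given TEXT
    )
    """
)

rows = [
    ("Will of John Smith", "William", "Smith", "John"),
    ("Will of John Smith", "Thomas", "Smith", "John"),
    ("Marriage certificate", "Thomas", "Smith", "John"),
    ("Baptism register", "Thomas", "Smith", "Henry"),  # invented extra row
]
conn.executemany(
    "INSERT INTO mentions (source, given, surname, father_given) VALUES (?, ?, ?, ?)",
    rows,
)

# All candidate records for a Thomas Smith whose father is called John.
candidates = conn.execute(
    """
    SELECT source FROM mentions
    WHERE given = ? AND surname = ? AND father_given = ?
    ORDER BY id
    """,
    ("Thomas", "Smith", "John"),
).fetchall()

for (source,) in candidates:
    print(source)
```

The query only lists candidates. Whether the Thomas in the will and the Thomas on the marriage certificate are the same man remains a judgement for the researcher, which is the point of the original question.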
    1. Re: Event-based database software for historians, biographers and genealogists - redux
    2. Steve Hayes
    3. On Thu, 26 May 2011 18:47:28 +0100, "Harrison Genealogy" <[email protected]> wrote: >Steve > >Such a thing already exists ! .... It's called Custodian 3 and is based on >about 12 Access databases all linked together .... > >A demo version is available for download at http://www.custodian3.co.uk/ >the only restriction in the demo I believe is the number of records you can >enter. > >You can import from Access, Excel, VRI, IGI etc. AND you can add your own >fields and there is a form designer to enable you to add your own records etc. I did check it out once, and even tried to discuss it here in comparison with Clooz, though there was little response. But as I remember it, it was evidence based rather than event based. It seemed to be primarily a tool for tracking documents and relating pieces of evidence, and there has been some discussion about that here too. That is not the same thing as tracking events. -- Steve Hayes from Tshwane, South Africa Web: http://hayesfam.bravehost.com/stevesig.htm Blog: http://methodius.blogspot.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk

    05/26/2011 08:46:59
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Tom Wetmore
    3. On Thursday, May 26, 2011 11:31:45 PM UTC-4, Bob Melson wrote: > > Y'all'd have to check the archives for the down'n'dirty, but Wes and I and > I don't remember who all else had a discussion of this very thing (_UID, > _UUID, _GUID) some time back. I contended then that UUID (Universal > Unique ID) and its close kin, UID and GUID, meant that the number > assigned to Cousin Mortimer should be the same universally - here, there, > wherever it appears. Not so. The "universe" is the machine where the > data appears and on which the software resides by which the UUID is > generated and it's only there that it's guaranteed to be unique. Take > that same data to another machine or other software and, guess what?, the > Universal Unique ID will be different. Or, say, Mort's data changes - ta > DA, you may (or may not) get yet another UUID. > > The only use I can see for UUIDs is in determining the origin of a > particular record - if I publish Mort's data and include the UUID and some > time later find that Snively Whiplash has published the exact same data > with the exact same UUID and has claimed it to be his own, then I'll know > what to think about ol' Snively, won't I? > > Swell Ol' Bob > Bob, You were right and should have stuck by your guns!! I don't know where the others got the interpretation that a UID should be unique only within one database. This is certainly not a GEDCOM rule since GEDCOM doesn't even have the UID concept. Their interpretation can only be coming from some vendor's misinterpretation, so to treat that misinterpretation as if it were a rule or the way things should be is wrong. A UUID is intended to be unique for all time and place. If a vendor says they use UID's where the U means "universal" and they don't support this then they don't support UID. A program should not alter a UID value upon import. If it does, though, who really cares, since there's no way to take advantage of UID's in genealogy. 
So no wonder it doesn't matter to anyone (yet). Tom

    05/26/2011 06:55:01
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. singhals
    3. Tom Wetmore wrote: > Cheryl, > > I stand corrected. I was talking about standard GEDCOM. The _UID tag is an > extension tag used by some systems. As you say, UID's should be immutable > for all time and space. But since it's not a standard tag there's no telling what > might happen to it on import to arbitrary programs. Any program that changes > its value on import should be taken behind the barn and shot. You could be right, but I would have sworn it was in the GedStan. (shrug) I've been wrong before -- and /recently/. Cheryl

    05/26/2011 04:04:08
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. Bob Melson
    3. On Thursday 26 May 2011 20:04, singhals ([email protected]) opined: > Tom Wetmore wrote: >> Cheryl, >> >> I stand corrected. I was talking about standard GEDCOM. The _UID tag is >> an extension tag used by some systems. As you say, UID's should be >> immutable for all time and space. But since it's not a standard tag >> there's no telling what might happen to it on import to arbitrary >> programs. Any program that changes its value on import should be taken >> behind the barn and shot. > > You could be right, but I would have sworn it was in the > GedStan. (shrug) I've been wrong before -- and /recently/. > > Cheryl Y'all'd have to check the archives for the down'n'dirty, but Wes and I and I don't remember who all else had a discussion of this very thing (_UID, _UUID, _GUID) some time back. I contended then that UUID (Universal Unique ID) and its close kin, UID and GUID, meant that the number assigned to Cousin Mortimer should be the same universally - here, there, wherever it appears. Not so. The "universe" is the machine where the data appears and on which the software resides by which the UUID is generated and it's only there that it's guaranteed to be unique. Take that same data to another machine or other software and, guess what?, the Universal Unique ID will be different. Or, say, Mort's data changes - ta DA, you may (or may not) get yet another UUID. The only use I can see for UUIDs is in determining the origin of a particular record - if I publish Mort's data and include the UUID and some time later find that Snively Whiplash has published the exact same data with the exact same UUID and has claimed it to be his own, then I'll know what to think about ol' Snively, won't I? Swell Ol' Bob -- Robert G. Melson | Rio Grande MicroSolutions | El Paso, Texas ----- The greatest tyrannies are always perpetrated in the name of the noblest causes -- Thomas Paine

    05/26/2011 03:31:45
    1. RE: Event-based database software for historians, biographers and genealogists - redux
    2. Harrison Genealogy
    3. Steve Such a thing already exists ! .... It's called Custodian 3 and is based on about 12 Access databases all linked together .... A demo version is available for download at http://www.custodian3.co.uk/ the only restriction in the demo I believe is the number of records you can enter. You can import from Access, Excel, VRI, IGI etc. AND you can add your own fields and there is a form designer to enable you to add your own records etc. Check it out .... Regards Bill (just a user of it - No links to the Co at all) -----Original Message----- From: [email protected] [mailto:[email protected]] On Behalf Of Steve Hayes Sent: 26 May 2011 09:04 To: [email protected] Subject: Event-based database software for historians, biographers and genealogists - redux There has been some discussion in soc.genealogy.computing about event-based software, and also software for recording evidence. This has come up before, and I still feel the need for such software, and have yet to find it. We are spoiled for choice in lineage-linked programs, but I don't know of any event-based programs that will do what I want - namely to record events, and the people and organisations associated with those events. The links would not necessarily be genealogical, but should include friends, employers, teachers, pupils, colleagues and even enemies. I think a program that could do this would be of interest not just to family historians, but also to general historians, biographers and even detectives (where the "events" could be related to a crime, or a series of crimes, and the "people" would be the suspects and their associates). I've tried to summarise the needs in a blog post at: http://su.pr/2vQjRv and sample database tables are available for download from the files section of the genealogy software forum at: http://groups.yahoo.com/group/gensoft/ I'm reopening this subject because I really would like to see (and be able to use) such a program before I die. 
-- Steve Hayes from Tshwane, South Africa Web: http://hayesfam.bravehost.com/stevesig.htm Blog: http://methodius.blogspot.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk

    05/26/2011 12:47:28
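The event-based model Steve Hayes describes (events at the centre, with people and organisations attached in roles that need not be genealogical) might be sketched as below. This is one possible shape under stated assumptions, not the design from his blog post or his sample tables, which are not reproduced in this thread. All class names, field names, and sample data are hypothetical:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Party:
    """A person or an organisation that can take part in an event."""
    name: str
    kind: str = "person"  # or "organisation"


@dataclass
class Event:
    """Something that happened, with the parties involved and their roles."""
    description: str
    date: str
    participants: List[Tuple[Party, str]] = field(default_factory=list)

    def add(self, party: Party, role: str) -> None:
        self.participants.append((party, role))


def events_for(party: Party, events: List[Event]) -> List[Tuple[str, str]]:
    """Return (event description, role) pairs for every event a party took part in."""
    return [
        (event.description, role)
        for event in events
        for participant, role in event.participants
        if participant == party
    ]


# Invented sample data: roles are not limited to family relationships.
thomas = Party("Thomas Smith")
school = Party("Parish School", kind="organisation")
rival = Party("William Jones")

enrolment = Event("School enrolment", "1860")
enrolment.add(thomas, "pupil")
enrolment.add(school, "school")

lawsuit = Event("Boundary lawsuit", "1885")
lawsuit.add(thomas, "plaintiff")
lawsuit.add(rival, "defendant")

for description, role in events_for(thomas, [enrolment, lawsuit]):
    print(f"{description}: {role}")
```

The difference from a lineage-linked program is that nothing here assumes a parent or spouse link: a person is found through the events they appear in, whatever the role.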
    1. Re: How Should We Store Evidence in Genealogical Databases?
    2. singhals
    3. singhals wrote: > singhals wrote: >> Tom Wetmore wrote: >>> On Wednesday, May 25, 2011 12:40:54 PM UTC-4, Wes Groleau wrote: >>>> On 05-24-2011 16:26, Tom Wetmore wrote: >>>>> That doesn't solve the problem of where to store those links. As Cheryl >>>>> points out, if you put a link in a person record, you are making the >>>>> explicit statement that the linked-to evidence refers to that person. >>>> >>>> Always? Many programs support some variation of GEDCOM's TYPE tag. >>>> >>>> No reason a link couldn't have a TYPE subrecord, or a NOTE or .... >>>> Even if GEDCOM doesn't officially support it. >>>> >>>> Maybe some software out there somewhere has tried that. >>>> >>>> (Please don't take my comments as an enthusiastic endorsement >>>> of GEDCOM) >>>> >>> Wes, >>> >>> I wasn't clear enough. The idea I was getting at is this. You have found >>> an item of evidence that you either copy onto your computer as a file >>> or you have as a URL text string. You are pretty sure this evidence refers >>> to a person you are interested in, but you haven't gotten enough info >>> yet to be sure of this or to know exactly what person it refers to. >>> >>> Cheryl made the point that she would keep a link to that file or URL in >>> a person record in her database. My question was directed to the >>> situation where you don't yet have such a person record to hold the link. >>> >> >> Then I create a person-record/persona for it. Hence the >> dozen or so different entries with a single name. >> >>> My preferred approach is to codify that evidence into new persona >>> records and let them sit in the database while you collect more data. >>> These persona records are indexed and searchable and manipulable >>> and editable as easily as regular person records. >>> >> >> Apparently, you're calling what I do "codifying"; I call it >> saving. 
>> >>> I personally find that this simple mechanism solves all problems >>> I have with designing a single system that can seamlessly handle >>> both record-based and person-based genealogy. I just need the >>> software to give the UI to do this. >> >> PAF produces a UID for each entry; I'm pretty certain other >> programs do too, particularly the ones that allow >> synchronization of multiple databases. > > Well color me surprised. > > PAF and Legacy both assign a UID, which one can see in the GED. > > Oddly enough, and despite what I've heard said, when I > imported the PAF file into a LEGACY file (direct import, not > GED), the UIDs changed. Seems to me that's a bit awkward > for someone trying to use those UIDs for anything. > [BLUSH] Sorry; it does what it's supposed to; I have two guys with the same name and I picked up the wrong one when I looked before. The UIDs did remain stable and unchanged from one program to the other. The UID is NOT an RIN/RID; it is a multi-digit number said to be totally unique (hence, UNIQUE Identifier). Cheryl

    05/26/2011 09:06:06
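Cheryl's check (whether UIDs survive a move from one program to another) can be done mechanically rather than by eye, which also avoids picking up the wrong one of two same-named people. A small Python sketch, assuming each individual record carries a level-1 `_UID` line in the GEDCOM export. Since `_UID` is an extension tag, its exact placement and format may differ between programs, and the two sample exports below are invented:

```python
def collect_uids(gedcom_text):
    """Map each _UID value to the NAME of the individual record it appears in.

    Assumes level-0 lines of the form '0 @XREF@ INDI' and level-1
    NAME and _UID lines beneath them.
    """
    names = {}  # xref -> name
    uids = {}   # xref -> uid
    current = None
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == "0":
            # On level-0 lines the second field is the xref, the third the record type.
            current = tag if value == "INDI" else None
        elif level == "1" and current is not None:
            if tag == "NAME":
                names[current] = value
            elif tag == "_UID":
                uids[current] = value
    return {uid: names.get(xref, "") for xref, uid in uids.items()}


# Invented exports: same people, different record numbers, two men with one name.
EXPORT_A = """0 @I1@ INDI
1 NAME John /Smith/
1 _UID AAAA0001
0 @I2@ INDI
1 NAME John /Smith/
1 _UID AAAA0002
0 TRLR
"""

EXPORT_B = """0 @I7@ INDI
1 NAME John /Smith/
1 _UID AAAA0002
0 @I9@ INDI
1 NAME John /Smith/
1 _UID AAAA0001
0 TRLR
"""

uids_a = collect_uids(EXPORT_A)
uids_b = collect_uids(EXPORT_B)

print("UIDs preserved:", set(uids_a) == set(uids_b))
print("UIDs missing after import:", sorted(set(uids_a) - set(uids_b)))
```

Keying on the UID rather than on the name or the record number is what makes the comparison safe when RINs are renumbered on import or when several people share a name.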