On 7/2/2016 1:05 AM, Andrew Lancaster via wrote:
> ...
>
> In my opinion Stewart's position on larger scale online collaboration
> boils down to this: "In my opinion, it is unlikely that the second route
> will ever lead to anything better than "not quite as bad as it used to
> be." Even if it were possible (which would require a method that
> corrects errors faster than they are being introduced), the labor
> involved in cleaning up these messes would be much more than just
> starting from scratch."
>
> In other words Stewart accepts that large scale collaborations possibly
> get slowly better sometimes but insists that he is sure that worsening
> edits will always inevitably come faster overall. I have pointed out
> that this is a statement which can be tested against reality, but the
> only empirical evidence being given to "prove" the statement is a static
> observation of bad articles on wikis.

If edits remain open to those lacking the necessary expertise, then I believe that the chances of overall improvement are slim. And, as I pointed out above, even if the quality of a bad database does increase to something acceptable, that is likely to take longer than what would result from just starting over and doing it right in the first place. Also, any increase in quality will almost certainly be slower in the more difficult cases. Even if a database "improves" to the point where 90% of the pages are reasonable and 10% of the pages are nonsense, that is still pretty bad, especially when claims of "accuracy" are badly overstated.

> On wikitree, just to take that example again, I have not seen any
> article that anyone worked on properly ever revert or worsen in any
> significant way, at least since editing in the medieval sections was
> changed. If that observation holds, then Stewart's model of how large
> scale wikis inevitably have to be fails, because his model is basically
> stating that this can absolutely never be true.
>
> In other words I am saying that if you fix 10 articles on wikitree they
> will almost certainly remain fixed as far as I have seen, at least for
> pre-1500 articles.

Could you point out some examples of high quality wikitree pages? I have yet to see one (in any time period) that was any better than mediocre, but perhaps I have been looking in the wrong place (mainly my own ancestors and various early medieval pages).

> None of this is intended to accept the opposite argument that we might
> as well all work on large scale collaboration because these are
> inevitably the future. Small focused collaborations can achieve things
> more quickly, and with far more attention to quality from the first
> moment. As people get more used to the idea that a quality publication
> does not need to be paper, some online collaborations are increasingly
> seen as equivalent to quality publications on paper. (Whereas largescale
> wikis typically see themselves as places to collect and collate
> information from more focused but un-linked sources.)

And one of the major problems is that the typical writers of large scale genealogy wiki pages show so little discrimination between good and bad sources, often citing the two side by side with no indication of which is which. One example is the wikitree page on Charlemagne (given "Carolingian" as an apparent surname), which does not even cite the most obvious source, La préhistoire des Capétiens, by Christian Settipani and Patrick van Kerrebrouck (although one citation suggests that one contributor used something written by Settipani indirectly).

> And just to make it clear, fixing an article also means making sure it
> has proper sourcing. I have never heard an argument against that. I find
> the comments on why sourcing is important in this discussion pretty weak
> to be honest. Sourcing is important not because people "own" facts, and
> that they can be "stolen". This seems a misapplication of a legal term.
> Owning and stealing are defined in written laws, that can be changed by
> politicians, and vary between countries. I do get the point that some
> bad people on the internet have no respect for what is important, and we
> want to use words that show we feel strongly, but I am not sure this
> helps us make the world better. I have no problem with "pride in work",
> but it is more a description of why we find quality generally important
> in the first place, a comment on human nature, not why sourcing is
> itself part of quality.

I think you are misinterpreting the comments made by Denis. I interpreted "stealing" in this context as referring to the plagiarism which is so common on the Internet. I don't think that anyone in THIS discussion was claiming that facts can be "owned" (although some commercial operations seem to have a position close to that).

> I believe that if we are talking about why sourcing is important for
> quality in good writing of articles about most things, perhaps
> especially genealogy, it is because a good article should distinguish
> what is known from primary sources, and what has been suggested by
> secondary sources including ourselves. If we do not set-out where we got
> our information, mistakes will multiply. That is why I believe the
> standardized formatting for any lasting online collaboration for good
> medieval genealogy will (like the Henry project) include a policy on how
> to set-out and explain the sources, and indeed the debates possible
> about them.

Standardized formatting works OK in routine situations, but it can get in the way in more difficult cases. For example, the silly "Charlemagne Carolingian" mentioned above is probably due to the careless use of a surname field. In addition, "documentation" includes more than just listing where the information came from (ideally with enough details that anyone wanting to check the information can find it).
Good documentation also includes a discussion of the logic behind any interpretation of the evidence that is anything more than routine, and the best documentation makes it clear where the proof of each individual "fact" can be found, not just a list of stuff at the end where we have to check a bunch of sources to find proof of the individual fact of interest. I can't count the number of times that I have checked somebody's work which they clearly believed that they had documented, only to find several supposed facts (dates, places, middle names, etc.) for which the cited sources supplied no evidence.

All of the standardized formats I have seen are far too rigid to deal with complicated cases. For example, a statement in a medieval source that two individuals were cousins might lead to several different partly overlapping scenarios in the scholarly literature attempting to explain the relationship, with varying degrees of probability. Different interpretations of evidence on one particular relationship can have a "cascade effect" which affects other relationships in a complicated way. Two examples of this can be found in the Henry Project pages for Geoffroy, viscount of Châteaudun and Gerberge, mother of Otte-Guillaume of Burgundy:

http://sbaldw.home.mindspring.com/hproject/prov/geoff004.htm
http://sbaldw.home.mindspring.com/hproject/prov/gerbe002.htm

As can be seen, both of these individuals involve complicated discussions which spill over to the pages involving other individuals. The organization of such complicated discussions generally involves some sort of creative process, and often cannot be coherently outlined in the brief "author A says X, author B says Y" form that amateurs writing such outlines seem to prefer. It would be interesting to see a standardized format that is flexible enough to deal with such situations.

Stewart Baldwin