DR Yes, all the long rangers, as far as I can tell, are not controlling for the possibility that the similarities between languages they are working on fall within the range of what one would expect by chance alone.
PS Do you have some sort of objective measure on this question?
DR Yes, I've tried constructing randomised word lists, running tests on them, and as far as I can tell the incidence of chance resemblances is much greater than one would expect just trying to do it by one's own gut feelings. In fact, one can prove it to oneself just by taking a look at those look-alikes within the Indoeuropean family that really have nothing to do with each other historically: English `much' has nothing to do with Spanish `mucho' and so on and so forth, and indoeuropeanists can show that. We can show that because we can show all the sound changes. That sort of thing should at least lead one to be uneasy about the factor of chance and begin to try to put together some kind of method to figure out what the range is. I've only got very crude measures of what the range is right now, but it's clear that it's substantial, and they're just not controlling for it.
PS Okay, how do you control for them?
DR Well, the way that traditional historical linguists control for the factor of chance is going for overkill. We insist upon regular sound change only, we insist upon knowing all the regular sound changes all the time, we insist that every word show nothing but regular sound changes if it's going to be probative of relationship.
PS In every segment?
DR In every segment. Also we insist on perfect semantic matches. Now it's not the case that everything we work with has to show all these characteristics to be probative, but there has to be a hard core that does to prove the relationship in the first place.
PS And you think the hard core is missing in Nostratic?
DR I think that the hard core is so small that it could be the result of chance. For example, if you take a look through the comparative dictionary of Illich-Svitych, you will discover that he does not really work with exceptionless sound changes. The thing is really quite a bit worse than Pokorny's Indoeuropean dictionary, and Pokorny is regarded as disgracefully lax by indoeuropeanists, terrible in fact, bad by our standards.
PS But the 2000 or so items in Pokorny are not all of the hard core you are talking about.
DR Only a tiny fraction are, maybe only 10%, but that's still rather a lot. I didn't find such a hard core in Illich-Svitych, I've been through it and I don't find it there. That's really what I'm talking about. It is possible to get some kind of measure of what the range of chance is and I'm working on that, and I've published a number of things along those lines, and I'm still trying to improve the mathematics, and improve the application of the mathematics to data, to come up with a set of guidelines that will work across the board, so we can sit down and apply the guidelines and come up with a good estimate at the other end. The problem is that everything depends upon the individual sounds. It turns out that the common sounds in any language are going to match the common sounds in any other language pretty often, so you have to sit down and work out the incidence of every phoneme in each language before you start. It's labour intensive and there's no way around it.
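[Editor's note: the point about common sounds matching common sounds can be illustrated with a small Python sketch. It is not DR's published procedure. It assumes that each letter stands for one phoneme, that only word-initial segments are compared, and that meanings are independent of one another. The word lists are invented toy data and the function names are hypothetical.]

```python
from collections import Counter
from math import comb


def initial_freqs(words):
    """Relative frequency of each word-initial segment in a word list."""
    counts = Counter(w[0] for w in words)
    total = sum(counts.values())
    return {seg: n / total for seg, n in counts.items()}


def chance_match_prob(words_a, words_b):
    """Probability that two randomly paired words share an initial segment.

    This is the sum, over segments, of the product of the two frequencies,
    so segments that are common in both languages dominate the total.
    """
    fa = initial_freqs(words_a)
    fb = initial_freqs(words_b)
    return sum(p * fb.get(seg, 0.0) for seg, p in fa.items())


def binomial_tail(n, k, p):
    """Probability of k or more matches in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Invented toy lists in which "t" is a very common initial in both languages.
lang_a = ["tak", "tun", "kal", "tir", "mon", "tes", "kip", "sar", "tol", "nem"]
lang_b = ["tis", "kor", "tam", "pel", "tuk", "ton", "kas", "ter", "mil", "tap"]

p_match = chance_match_prob(lang_a, lang_b)      # about 0.35 for this toy data
tail = binomial_tail(len(lang_a), 3, p_match)    # chance of 3 or more matches
print(f"per-word chance of a match: {p_match:.2f}")
print(f"chance of 3+ matches in {len(lang_a)} words: {tail:.2f}")
```

[With one initial this frequent in both lists, roughly a third of all word pairs match by accident, which is why the frequency of every phoneme has to be worked out before any match is counted as evidence.]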
PS Surely one control measure is that you should be able to come up with a number of equally plausible reconstructions at such great time-depths, and yet from my perspective the Illich-Svitych model is more plausible than, say, Bomhard's. For example, it's not riddled with mistakes as Bomhard's is.
DR Oh sure, sure.
PS It's a principle of science that the best hypothesis at the time you get up and run with.
DR Sure, only the best hypothesis we have at this time is that the relationship is not demonstrable. That is an hypothesis, you can't just rule it out, and I think they really have ruled it out, and while it looks like Illich-Svitych knows his languages better than Bomhard, and is somewhat more careful, he doesn't know them well enough and he's not careful enough, as far as I'm concerned, and as far as indoeuropeanists in general are concerned. We just have higher standards and we think they are necessary in order to avoid a factor of chance. This is why Illich-Svitych's work has been referred to in the German press as witchcraft. They take their lead from the indoeuropeanists in German universities, who are the most rigorous lot in the world. They looked at Illich-Svitych and laughed because he's nowhere near as rigorous as them and I might say as me.
PS This is another issue, but it has been suggested to me that indoeuropeanists really only work at rather shallow time depths, like only a couple of thousand years. That is that they work back from Germanic languages to Proto-Germanic, or Slavonic languages back to Slavonic, or from one of the subgroups back to PIE or from the ancient written languages such as Ancient Greek or Sanskrit back to PIE, and this is quite a different thing to what the Nostraticists are doing.
DR There are two things to say here: bear in mind that comparative Celtic only starts with Old Irish, our oldest substantial Celtic material, and that's the 8th century AD. Using Celtic as evidence for IE therefore involves a jump of well over 3000 years and possibly over 4000 years.
PS But you already have PIE reconstructed.
DR We use Celtic to reconstruct PIE just like we use any other branch. It does in fact have some crucial things that no other branch retains. The same thing with Tocharian, the gap between Proto-Tocharian and PIE is at least 3500 years, and Tocharian is used as one of the primary witnesses for PIE. It's not the case that we reconstruct PIE on the basis of Greek and Sanskrit and then compare it elsewhere, people who think we do that don't understand what we do. If they do they're wrong, it's as simple as that, they're wrong. That's one thing, the other thing to be said is that evidence for proto-language really does self-destruct over time, you can prove this to yourself, in fact you can give yourself a good scare..
PS I've given myself many in my reconstruction work..
DR [hearty laughter]..by taking Clark Hall and Meritt's dictionary of Old English, which is great because it tells you the Modern English words that are descended from the Old English words, open it to any page, any page at all, and the number of Old English words surviving into Modern English will always be less than half the words on the page and often much less than that, and that's only a thousand years!
PS But the English situation is rather unique, with the imports of Norse and French etc..
DR It's not that unique, you'll find the same thing with Latin and French, granted that's a longer period of time, that's a couple of thousand years, but then French re-borrowed from Latin heavily in the Middle Ages, and you get the same result, something like fifty percent loss of the whole vocabulary in a millennium.
PS Okay, the whole vocabulary. Swadesh never pretended to be dealing with the whole vocabulary.
DR That's why we use the Swadesh lists. I point this out because some people say that using basic vocabulary is a bad idea, that we should be looking at the whole vocabulary, you hear that from long rangers, who haven't tried the little test with the modern languages.
PS But surely Dolgopolskij is the champion of sticking to the core vocabulary?
DR He's a little better than some of the others.
PS He satisfies the criteria you specified, and then you say that he's only a little better?
DR He's not satisfying all the criteria. Again, he's not controlling for chance. But since we're talking about time depths, which as you say is a slightly different question, I mean he's a little better over all, he knows what he's doing better, he knows the languages even better than Illich-Svitych did, as far as I can tell.
PS They were contemporaries, and he's been doing it a lot longer.
DR Sure. Anyway, so you take a look at core vocabulary and that's quite a bit better, but even core vocabulary gets lost, there's nothing that can't be lost, English has even lost pronouns, for that matter even Armenian has lost pronouns, lost basic pronouns that English hasn't. The interrogatives in Armenian are a serious problem, they don't have any initial consonants...
PS Which is about all they have in most of the world.
DR Yeah, right, exactly! The second person plural nominative in Armenian is formed by taking the second person singular and slapping the plural marker on it. You know, just gross losses. Anything can be lost, and obviously there is no single constant rate of loss, that was proved for the Swadesh list back in the fifties, but there is a range outside of which languages don't fall over the long run, you know.
PS It evens out?
DR It evens out in the long run. You find that at the absolute outside, after 12 to 15 millennia you've got nothing left, you've got maybe five words on the hundred word list, and that's the point at which you can't tell the difference between real retention and chance similarities.
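[Editor's note: the arithmetic behind these figures can be sketched in Python. The sketch assumes a single constant retention rate per millennium, which DR has just said does not strictly hold, and it reads "five words on the hundred word list" as survivors within one lineage. Both are simplifying assumptions, and the function names are hypothetical.]

```python
def implied_retention_rate(survivors, list_size, millennia):
    """Constant per-millennium retention rate that leaves `survivors`
    items out of `list_size` after `millennia` thousand years."""
    return (survivors / list_size) ** (1.0 / millennia)


def expected_survivors(list_size, rate, millennia):
    """Items expected to survive under a constant retention rate."""
    return list_size * rate**millennia


# DR's outer limit: about 5 of 100 words left after 12 to 15 millennia.
for span in (12, 15):
    rate = implied_retention_rate(5, 100, span)
    at_ten = expected_survivors(100, rate, 10)
    print(f"{span} millennia: rate {rate:.2f} per millennium, "
          f"{at_ten:.1f} words left after 10 millennia")
```

[Under these assumptions the implied retention rate is roughly 0.78 to 0.82 per millennium, and a 10,000-year-old family would still have more survivors on the list than the five-word floor, which is the range PS and DR go on to discuss.]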
PS But Nostratic falls within that range.
DR What range?
PS Well it's not 15,000 years old. I'd argue that it's about 10,000 years old.
DR Then the reason that they're coming up with so little that will stand the test of chance simply has to be that their methods aren't good enough. I return to what I said before about Illich-Svitych, not rigorous enough, and again not controlling for chance.
PS You're giving a joint presentation here?
DR Actually Ann (Taylor) is giving the presentation, we're working jointly with a computer scientist whose mathematics we don't understand, it's a little too complex. I've got enough understanding to field questions if they're not technical. I hope nobody's going to ask how the algorithm works because the last explanation I got lasted 15 minutes, and I went along with it pretty well for 15 minutes and then she said "and then you just do it all with dynamic programming" and I can't even visualise that!
PS Well I'm a strong believer in the principle that if you want mathematics you ask a mathematician, if you want linguistics you ask a linguist and so on and so forth.
DR Yep. I'm trying to get some probabilists interested in this little question of chance resemblance. I've got a bite, and we'll see if I can reel them in and get them to work on this. What I've done so far is more or less correct but it's not sophisticated enough, we need a mathematician for it.
PS I have one more question then. On the question of Proto-World, it has been suggested that when you look for the Proto-World vocabulary, you can find a distinctive vocabulary, there being about 40 or 50 things you can regularly find, but beyond that you run out and don't find any more vocabulary. This is a distinctive distribution, and this distribution has some meaning.
DR I can only say that that has not been my experience working with languages that I don't think are related; I don't think there is any such limit. For example, I sat down with 9 languages from North and Central America: two known to be closely related, Ojibwa and Menominee; two known to be distantly related, Pipil and Hopi; and the rest, in my view, not related at all. All of them would be in Greenberg's Amerind, if Amerind were real, except for Greenlandic Inuit. I found 115 sets in the 100 word list of at least two items that looked pretty good to me, and I was being quite careful; I could have found more than 115 sets. What's more, Greenlandic didn't fall at the bottom in terms of how many sets it had, it fell in the middle, as it should not do if Amerind was a... and mind you I factored out the ones where we know the languages are related, I wasn't even counting those. So my guess is that if these guys are finding a Proto-World vocabulary and it peters out, that's an artefact of what they're doing, that comes from inside their heads.
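[Editor's note: one reason so many sets turn up is that comparing nine languages at once means comparing many pairs. The Python sketch below is only an illustration. The per-pair chance of a look-alike (0.05) is an invented figure, not one DR reports, pairs are treated as independent, and the function names are hypothetical.]

```python
from math import comb


def unrelated_pairs(n_languages, related_pairs):
    """Language pairs left after setting aside pairs known to be related."""
    return comb(n_languages, 2) - related_pairs


def prob_some_pair_matches(pair_prob, n_pairs):
    """Chance that at least one of n_pairs independent pairs shows a
    look-alike for a given meaning."""
    return 1.0 - (1.0 - pair_prob) ** n_pairs


# Nine languages, with Ojibwa-Menominee and Pipil-Hopi set aside.
pairs = unrelated_pairs(9, 2)
# Hypothetical 5% chance that any one pair looks alike for a given meaning.
per_meaning = prob_some_pair_matches(0.05, pairs)
print(f"{pairs} pairs, chance of a look-alike per meaning: {per_meaning:.2f}")
```

[Even with a modest per-pair figure, most meanings on a 100 word list would be expected to yield at least one look-alike pair somewhere among the nine languages.]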
PS It's important that we find where the artefacts are in our methods.
DR Yep, that's right. That's the reason for working on randomised models, cranking them through the mathematics and seeing what comes out the other end. That's objective, there isn't any subjective judgement there. That's what you've got to do.
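[Editor's note: a randomised model of this general kind can be sketched in Python as a permutation test. This technique is swapped in for illustration and is not a description of DR's own calculations. It assumes word lists aligned by meaning, one letter per phoneme, and a match defined as identical initial segments. The data are invented and the function names are hypothetical.]

```python
import random


def count_matches(words_a, words_b):
    """Number of aligned word pairs that share an initial segment."""
    return sum(1 for a, b in zip(words_a, words_b) if a[0] == b[0])


def permutation_p_value(words_a, words_b, trials=2000, seed=0):
    """Estimate how often shuffled pairings match at least as well as
    the real meaning-by-meaning pairing."""
    rng = random.Random(seed)
    observed = count_matches(words_a, words_b)
    shuffled = list(words_b)
    at_least = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break the link between meaning and form
        if count_matches(words_a, shuffled) >= observed:
            at_least += 1
    # Add one to both counts so the estimate is never exactly zero.
    return observed, (at_least + 1) / (trials + 1)


# Invented toy lists, aligned by meaning.
lang_a = ["tak", "tun", "kal", "tir", "mon", "tes", "kip", "sar", "tol", "nem"]
lang_b = ["tis", "kor", "tam", "pel", "tuk", "ton", "kas", "ter", "mil", "tap"]

observed, p_value = permutation_p_value(lang_a, lang_b)
print(f"observed matches: {observed}, permutation p-value: {p_value:.2f}")
```

[The three matches in this toy pairing are no better than what shuffled pairings produce, so they would count for nothing. No judgement about which words "look alike" enters the calculation once the matching rule is fixed.]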
[Editor's note: some months later, in subsequent discussion about the above interview, Don offered the following interesting addition:
Johanna Nichols has, I think, come up with a method that is likely to give nice, rigorous judgments on long-range comparisons without the same amount of grinding work that my attempts have required. This is very recent..... my hard-assed remarks toward the end may now be obsolete, and I know it.]