All right. Thank you, ladies and gentlemen. Please be seated. Dr. Weir, would you resume the witness stand, please. And, Mr. Clarke, you may conclude your direct examination.
Dr. Weir, I think we had been discussing the fact that in this case, you also calculated frequency estimates for the various items of evidence that matched a particular individual; is that right?
Did you also calculate frequencies, that is approximations, of how often certain characteristics that were found in mixture occur in populations?
As a result of the examinations in this case of not only the databases that you've described earlier, but also the actual pieces of evidence and the frequencies attached to that piece of evidence that matched or was consistent with someone, did you describe those in your report as far as this same idea of approximately "1 in"?
Yes. That's a fairly convenient way of writing the numbers. We could do it one of two ways. We could say the frequency is .000 and so on or we could say it's 1 in a million. It's a little easier to think about, the 1 in a million, than write down decimal point, five zeroes and a 1.
In other words, it's a little easier in terms of understanding the math to use the term "1 in" instead of, as you've said, .0235 or whatever combination that may be involved?
It's a little easier to keep track of things. We understand the word million. It's hard for our eyes to distinguish five zeroes in a row from six zeroes. It's that kind of thing I'm talking about.
Now, in your calculation of frequencies, when you use the methods that you have described, would your frequencies be the same as, for instance, as described on these boards by Cellmark or DOJ for various pieces of evidence?
Well, for three reasons. The first one is, I sometimes use different databases. And as we know, as we expect, with different databases we'll get different answers. The second one is, along with my estimate, I attach some measure of confidence in that estimate. As I've said already, ultimately, the answers we come up with are based on a few hundred people. That's just the way it is. It's appropriate, but that's just the way it is. A different set of people will give us a different answer. So as a statistician, I need to say, here is my answer and this is how good I think the answer is. Now, we meet this concept every day. We see in the newspaper 47 percent of the people support the president on some issue plus or minus 3 percentage points. So the president feels good because he's got half the people and the opposition feels good because they've only got 44. It gives everybody room to feel good. But when we see the 47, we stick with that number, but we know whatever the true number is, it's probably going to be in this range of plus or minus 3 percentage points. And it's the plus or minus 3 which, if you think about it, is just about as important as the 47 percent. One without the other is only giving some of the story. The plus or minus is an important concept. In statistics, we call that a range from the--from the 47 minus 3, the 44 percent, up to the 47 plus 3, the 50 percent, that range of 44 to 50, it's a confidence limit, confidence interval. Excuse me. It is an interval with which we have a certain confidence that includes the true answer. Well, it's the same thing here. Our measure of confidence, our confidence interval is going to depend on the data. It's going to reflect the size of the database. If we typed everybody there was, the confidence limit--interval would shrink to nothing and the answer we have would be a true one.
The fewer people we type, the wider becomes this confidence limit, and in essence, the less confidence we have in any specific number in that range. Now, the public opinion polls are generally based on only about a thousand people, 1200, something like that, to give the plus or minus 3 percentage points, but they apply to a single question, do you support the president, yes or no or yes, no or maybe. It's a single question.
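The plus-or-minus figure described in the answer above can be reproduced with the usual normal approximation for a sample proportion. What follows is a minimal sketch and not part of the testimony; the function name `margin_of_error` and the 95 percent confidence level are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a proportion.

    p: observed proportion (0.47 for "47 percent support")
    n: number of people sampled
    confidence: two-sided confidence level (assumed 0.95 here)
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return z * sqrt(p * (1 - p) / n)


# A poll of about 1,000 people reporting 47 percent support:
# the half-width comes out at roughly 3 percentage points.
print(round(100 * margin_of_error(0.47, 1000), 1))

# Fewer people sampled gives a wider interval.
print(round(100 * margin_of_error(0.47, 250), 1))
```

Under these assumptions the interval narrows as the sample grows, which is the point the witness makes about the size of the database.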
What we have here is, it's compound, isn't it because the profile has say 12 bits to it. Each of the bits separately, we could do a plus or minus, and on the databases, we'll get about the same thing plus or minus 1 or 2 percentage points. When I say the frequency of a piece is 10 percent based on a few hundred people, that will be about somewhere in the range 9 to 11 percent, maybe 8 to 12. But it will be very close to 10 percent. Each of those 10 percent numbers, I have a fairly good precision on. But when I start multiplying them together, my precision kind of dies off, and I'm in--and it's kind of dramatic. It's not going to be 3 percent anymore. It's going to be about 50 percent. So when I say--when anybody says an estimate of a profile is about 1 in a million, attached to that 1 in a million is the statement or it could be as low as half a million. And now, the actual numbers vary according to the situation, but a good rule of thumb is that we--that I would divide the original number by 2, and then that division, that smaller number, which is actually a higher frequency, it's a more conservative number. That's sort of given me--it's a 99 percent confidence limit. So I do the estimates as did the other people, and I would attach to them a 99 percent confidence limit. It's just a reflection of the fact that our answer is based on a relatively small number of people.
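The loss of precision under multiplication can be illustrated with a rough calculation. The sketch below is not the method in the witness's report; it assumes each piece of the profile is estimated independently from the same number of observations and uses a standard delta-method approximation on the log scale. The function name, the six pieces of 10 percent, and the sample size of 1,000 are all hypothetical.

```python
from math import exp, log, prod, sqrt
from statistics import NormalDist


def profile_frequency_with_limit(freqs, n, confidence=0.99):
    """Return (point estimate, upper confidence limit) for a product of frequencies.

    Assumes every frequency in `freqs` was estimated independently from
    n observations (an illustrative simplification).
    """
    estimate = prod(freqs)
    # Delta method: Var(log p_hat) is roughly (1 - p) / (n * p) for each piece,
    # and the variances add because log of a product is a sum of logs.
    var_log = sum((1 - p) / (n * p) for p in freqs)
    z = NormalDist().inv_cdf(confidence)  # one-sided critical value
    upper = exp(log(estimate) + z * sqrt(var_log))
    return estimate, upper


# Six pieces, each about 10 percent, multiply to about 1 in a million.
est, upper = profile_frequency_with_limit([0.1] * 6, n=1000)
print(f"estimate: 1 in {1 / est:,.0f}")
print(f"upper limit (more common, more conservative): 1 in {1 / upper:,.0f}")
```

Each piece on its own is known to within a few percentage points, but the uncertainties accumulate in the product, so the conservative limit ends up a sizeable fraction away from the point estimate.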
Is that a step you take to make sure at least to the extent you can that you're not overstating the rarity of something?
There is another step which even further reduces the numbers that we have prepared, and this is a more recent step that's been developed over the last--probably last couple of years by myself and some other scientists in both this country and in England, and it has to do with a somewhat vexing problem in theory. Here we have a crime committed in a certain area and we're going to attach to the evidentiary items a number based on data collected from people all over the country. What relevance does that--like the FBI's data. What relevance does the FBI's data from the whole United States or at least from three different states have to Los Angeles County?
Well, I think it has a lot of relevance because I think these frequencies do not vary very much. But the point has been raised, well, what if the relevant population here is very different as far as its DNA frequencies go. Maybe people in this area or people who might be considered as potential perpetrators for any crime, maybe that set of people has this profile very commonly. Maybe the profile that's found to match occurs once in a hundred of these people. It might occur once in a million over the whole country, but that would be unfair, the argument goes, to give the country-wide figure when we should be talking about this specific area of the crime. So the issue is, do we have groups within the population. It's called population structuring. Well, we do of course. We know that in the U.S. Caucasian population, we have people whose backgrounds are from different European countries. We've got people of Italian descent, we've got people of Irish descent and so forth, just to make sort of an obvious example. And to some extent at least, people with those backgrounds--marry people they know and maybe have similar backgrounds. So we may to some extent have preserved those differences that occur in the European countries. They may still be here.
That's the objection that has been--that's the point that's been raised. Well, it sounds like a valid point. How can we accommodate it? Well, in population genetics, we have a theory which enables us to measure the amounts of difference amongst these groups and it enables us to modify our answers, enables me, for example, to take the band frequencies from the whole country and in essence modify them. It makes them a little bit more frequent to account for the possibility that in any one group, the frequencies are a bit wider than the nationwide average. I do--I have published on that as have other people. So we have done an accommodation for the possibility of population structuring. When I do that, it reduces a number a little bit further.
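One widely used way to make this kind of adjustment is the Balding-Nichols correction, in which a parameter usually written theta measures how much subgroups differ from the population as a whole. The testimony does not say whether this is the exact calculation in the witness's report, so the sketch below is illustrative only; the function name and the theta value of 0.03 are hypothetical.

```python
def genotype_frequency(p, q=None, theta=0.0):
    """Genotype frequency with an allowance theta for population structure.

    p, q: allele frequencies from the broad database.
    q=None means a homozygote (two copies of the allele with frequency p).
    theta=0 recovers the plain values p*p and 2*p*q.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    return 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q) / denom


# Two alleles at 10 percent each: plain figure versus adjusted figure.
plain = genotype_frequency(0.1, 0.1)
adjusted = genotype_frequency(0.1, 0.1, theta=0.03)
print(f"plain: 1 in {1 / plain:.0f}, adjusted: 1 in {1 / adjusted:.0f}")
```

For rare alleles the adjusted frequency is larger than the plain one, so the "1 in" number gets smaller, which is the direction of change the witness describes.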
Is this an example of another step that you take to make sure that you don't overstate the rarity of matching characteristics?
In a criminal case, does that then benefit a Defendant if there's been instances of matches between evidence and a Defendant?
In essence, it downgrades the weight to be attached to any matching evidence, yes. Downgrades the number. Excuse me.
And in fact, you calculated frequencies for these various pieces of evidence that appear to be from one person?
And I believe you also described the fact that you calculated approximate frequencies for what were mixtures of samples; is that right?
Without getting in--well, let me rephrase that. The method that you used to calculate these approximate frequencies for mixtures, what does that number reveal to us? What does it tell us?
It tells us what's the frequency. How likely is it, what's the probability or what's the chance of getting this mixed stain. There are some mixtures which, instead of having two bands or two types of the locus, have three or four. Obviously it was more than one contributor. So how likely is it that we get such a mixture if in fact there were two contributors or how likely would we get this mixture if there were three contributors. It's not different in concept at all to the single stains. We have a single stain with two bands, and we ask the question, how likely is it that we get those two bands in a person, in the one person for a single stain or in two or three people for a mixed stain.
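The question posed in the answer above, how likely it is that a given number of random people jointly show exactly the alleles seen in a mixed stain, can be sketched for a single locus. This is an illustration under stated assumptions (unrelated contributors, alleles drawn independently from the database frequencies), not a reproduction of the figures in the witness's report; the function name and the example frequencies are hypothetical.

```python
from itertools import combinations


def mixture_probability(allele_freqs, contributors):
    """Chance that `contributors` random people jointly show exactly these alleles.

    allele_freqs: frequencies of the alleles seen in the mixture at one locus.
    Every listed allele must appear at least once, and no other allele may appear.
    """
    draws = 2 * contributors  # each person carries two alleles at a locus
    k = len(allele_freqs)
    total = 0.0
    # Inclusion-exclusion over non-empty subsets of the mixture's alleles
    for size in range(1, k + 1):
        sign = (-1) ** (k - size)
        for subset in combinations(allele_freqs, size):
            total += sign * sum(subset) ** draws
    return total


three_alleles = [0.1, 0.2, 0.3]
print(mixture_probability(three_alleles, 1))  # one person cannot carry three alleles
print(mixture_probability(three_alleles, 2))  # two contributors
print(mixture_probability(three_alleles, 3))  # three contributors: a different question
```

With one contributor and two alleles the function reduces to the familiar 2pq for a single stain, and results across several loci would be multiplied together under the same independence assumption.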
So for a mixed stain, you would calculate estimates of how often this combination of markers would be found if, for instance, two people were part of that stain?
Or if three different people contributed to the results that were ultimately obtained?
The method that you use to calculate that--and I'm just referring to the mixtures now--
All right. Did you in fact then make these--and let's start with instances in which there were mixtures.
With the Court's permission, I would like to use the first board of the Bundy crime scene results.
And also with the Court's permission, I intend to elicit from the witness certain frequencies. It was my intent to just simply write them in as opposed to the witness going back and forth, if that's acceptable.
Now, Dr. Weir, I'd like to if I can--and I don't know if you can see--can you see that board?
Yes. You have before you what appear to be legal size or larger Xerox copies of basically what's the board before the jury at this time?
Actually, I would like one, if the People could also provide a copy to the Defense.
Dr. Weir, what I'd like to do is first, as to this results board, you're referring to a document in front of you that appears to be a reduced version of the board; is that right?
Did you have an opportunity to review that board including the numbers that are written in off to the far right?
And does that--without comparing every detail by any means between the board, the exhibit itself and the document you have, do they appear to contain the same information?
What I'd like to do is direct your attention to item no. 78, which is labeled "The Ronald Goldman boot drop." Do you see that?
And in particular, is it your understanding that with respect to--and let's start with the RFLP results.
--that there was observed from the use of 5 different probes 10 bands of greater intensity that were consistent with Nicole Brown Simpson and 4 additional bands that were consistent with 4 of the bands from Ronald Goldman?
Well, I need to be careful. My analyses are based on the bands. I have not gone into the relative intensities. That's beyond my area.
All right. Then limiting what I just described to you as not including the relevant intensities, is that your understanding of the results reported by the laboratory on this test?
Yes. Cellmark has reported the mixtures--the mixed stains, and it looks like 4, 8, 13, 14 bands, 14 RFLP bands, some of which match the profile of Nicole Brown and some of which match the profile of Ronald Goldman.
Did you as a result--and you obtained the actual lengths of these fragments so that you could make a statistical calculation of the approximate frequency that we're about to discuss?
Your Honor, I object to the term "Match" especially as to the bands consistent with Mr. Goldman given the "Some" offered.
Now, as part of this calculation process--and again, just focusing on the RFLP results--were you able to approximate or make estimates of how often these fragments would be found assuming that, first of all, two people contributed to that mixture?
Did you also do that under the assumption that three people contributed to that mixture?
And under those two assumptions, you then calculated these estimates; is that right?
Now, from your review of how this board basically has been described as far as frequencies, there have been ranges; is that right? In other words, using simply the item above 78, number 56, there is written in under frequency 1 in 48 to 1 in 610; is that right?
And in fact, more than one database is used because there's more than one major racial group; is that right?
Well, Cellmark has three databases. They were using three databases in their report, yes.
When you calculated the frequencies of these mixtures--and again, assuming two persons, two people contributed to item 78, and then we'll also have you describe it under the assumption three people contributed to 78--are you able to give us the same types of ranges for these combinations of characteristics?
And in fact, would the estimates be different depending on which of the major groups you used?
Now, specifically with regard to item 78 then, if we put on the board 2 colon, would that be okay to signify your assumption that two people are in the stain?
That's right. That--that number would be the frequency with which two people in essence plucked off the street, two random people had between them profiles which would look like the mixed stain profile.
When you perform that same estimate using three people, does it become a different number than it was with two?
Well, it's a different question. If--the chance of getting two people with blue eyes walk through the door next is different from the chance of getting three people with blue eyes walk through the door next. So if we ask a different question, we'll get different answers.
Okay. With regard to specifically the RFLP mixture in item no. 78, can you give us the most frequent of the estimates using the various databases for two people, that is two persons contributing to that stain, and then also the least frequent? And with the Court's permission, I'll simply write it in on the board.
So would it assist you to refer to the report to insure that you provide us with the exact estimates, that is the exact numbers in your estimates in your report?
All right. Then could you do that, Dr. Weir, with regard to item no. 78, again under the assumption that two people contributed to that RFLP mixture?
So if there were two people, the--and I get confused when I talk about smaller and larger, but the--the--the frequency which is the more common is 1 in about 300 million.
1 in 300 million. And then what would be the least common or the rarest amongst that range?
Well, this is where it starts to get embarrassing, but I will say it's 1 in a trillion.
And I'll put a circle around the two person. And I hope that sound doesn't bother anybody. Can't make the pen sign without the squeak apparently. Did you also make a similar calculation under the assumption that that mixture was a result of three people; that is, three persons contributed to that stain as shown by the RFLP results?
Yes, I did. And I'm just--I'm just checking to make sure that I got the right--I have--I don't seem to have the Cellmark figures for that item 78 for the RFLPs.
All right. While we're on item 78, there were also PCR results indicating a mixture on that same stain from Ron Goldman's boot; is that right?
And did you perform this same calculation under the two assumptions, both that two persons contributed to those PCR results as well as three people?
Let's just do the two first as you mentioned. Can you tell us what that range would be just for the PCR markers?
Yes. For the PCR markers--now, this is using the FBI's data. So the range goes from 1 in 500--
All right. Then perhaps let's go through that. And that's again under the assumption that three persons contributed to this stain; is that right?
That's right. What's the frequency with which three people unknown to us--we don't know who these people are or what their types were. But just the chance of drawing these people at random.
I'm sorry, your Honor. May I just approach the witness? I just think he missed something on his table.
Okay. Let's turn, if we can then, to the assumption that three people contributed to that stain.
That's right. So we're going to get a different answer because we're talking about a different scenario here. Three people, three people randomly, and this is from the--any three racial background. The most common frequency is 1 in 60.
First of all, Dr. Weir--and you've described why there is this difference between two and three people; and that's basically under one situation you're assuming two people contributed to that stain?
That's right. That's right. Yes. Well, we certainly know that there was not one contributor because there are more than two alleles, the probes.
And the second assumption is that three people led to obtaining those results or contributed to that mixture?
As far as the differences--and let's use the last one that you just described, which was from 1 in 60 approximately to 1 in 490,000. Why is there that difference?
Because these PCR markers, the alleles in this mixed stain have very different frequencies in the different racial databases.
In other words, there is some at least difference that's enough to create this difference from 60 to 490,000 because characteristics at these genetic markers differ among racial groups?
Okay. Perhaps we can--what we'll do is return to item 78 once you've had an opportunity to re-examine as to three contributors.
What I would like to call your attention to next is a particular results board dealing with the glove found at Rockingham. Are you familiar with that board?
And do you have one of those similar ledger size sheets that appears to be a Xerox of what's contained on what will shortly be the glove results board?
For the record, your Honor, the exhibit that we've placed up is People's exhibit 272.
First of all, with regard to this board, you not only produced a report, you also produced an addendum; is that right, Dr. Weir?
That's right. I had done an initial analysis on the stains including the mixtures, and then subsequently I did the RFLP results.
And did that include mixtures on various locations from what's been identified as the Rockingham glove?
Do you have that board--I'm sorry. Do you have that Xerox in front of you that appears to contain the same information as People's exhibit 272-B, the results board?
With regard to these gloves--and let's start with the RFLP results--did you calculate these mixture frequencies?
And does that addendum describe these frequencies for the mixtures that we're about to discuss?
Yes. That's right. And it's a little simpler maybe if we recognize that G1 and G3 in fact have the same profile types. I believe that's right. G1 and g--is that right? Is it--G1 and G4. Excuse me.
So that when you give a result, I can write it on more than one stain to make it--
Yes. It looks in fact, doesn't it, that G1, G2 and G4 are the same. They have a DQ-Alpha 1.1, 1.3, 4 and possibly a 1.2. They all have a D1S80 with the 80 and the 24 alleles. So in fact, we--you will be able to write the same numbers for G1, G2 and G4.
Okay. Why don't we deal with--would it be easier to deal with the PCR results first or the RFLP?
Okay. Then let's start with if we can--and I'm going to direct your attention to what's marked G1. And as you've described it, it will also apply to G3 and G4 as to the RFLP results?
Yes. I think--just let me double-check that. Oh, well, excuse me. I've gotten myself confused. Those three have the same PCR types, don't they?
The RFLP--because the band lengths are a little bit different and also, different number of bands were present. So we'd better do them one at a time.
In particular, the RFLP frequency estimations under the assumption that there are two donors to that particular mixture.
All right. And then did you also make an estimate for or under the assumption there were three contributors?
Under the assumption there were two contributors on the RFLP results, then what would the range be?
The range--the range turns out to be the same for G1. The same number of bands were visible in the mixture, and DOJ's method of doing the calculations will result in the same numbers. So it will be the range 1 in 6 million to 600 billion.
Would it be easiest then to move on to G4 and the RFLP results under the assumption of two donors?
So G4 had one less matching band, and so the numbers are a little different, although not dramatically. They go from 1 in a million--
All right. I have--I have the--for item G1, I have the two and the three where I can get at them.
Okay. Let's start with G1, and just the PCR results, again under the assumption that two people contributed to that mixture.
Right. Now, of course, these numbers are quite different. The RFLPs were based on six although based on several RFLP probes, and there's only two of the PCR probes used.
In other words, with regard to these items of the glove, there are just, as far as PCR is concerned, results from two markers, DQ-Alpha and D1S80?
Is that then less information to go on as far as calculating a frequency estimate for these mixtures?
Under the assumption that there were two contributors as far as these two PCR markers are concerned, can you give us the range of frequencies?
Is it convenient to do the three assumption, that is the assumption three persons contributed to this stain?
Yes. If there were three people, the chance of getting that mixture would range from 1 in 400.
And again, these differences, as far as the range is concerned for 400 to 36,000, that's because of the differences in how often we see these genetic characteristics or types in different racial categories?
That's right. The PCR allele frequencies can be quite different amongst the three--amongst the various racial databases.
The numbers you've just given us as far as G1 is concerned for PCR, will they be the same for G2 and G4?
And that applied--that applies to either two or three contributors. The PCR profiles were the same, the same alleles. So the answers are going to be the same.
Now, I've written down for G2 under PCR and the assumption of two persons 1 in 600 to 1 in 11,000, and for three contributors, 1 in 400 to 1 in 36,000?
Dr. Weir, while we're on the glove board, can we look at the remaining areas--well, no. Excuse me. If we could, could we turn to the PCR results on G3?
Yes. I'd like to do G3 and G9 together because the PCR determinations were the same for those two.
Make sure I have this correct. For PCR, on both G3 and G9 and the assumption that two contributors made up the stain, the range is from 1 in 3900 to 1 in 22,000 approximately?
The three contributors for either of those two mixtures, the range is from 1 in 9,000 to 1 in 150,000.
G10 had the same profile as another evidentiary item. That was number 31. And I'm looking at table 33C.
--as that's I believe on a different board. But if we can just stay with the glove board.
Yes. I'm sorry. For G10, for two contributors, the chance that two unknown people would have that mixed stain goes from 1 in 3900--
I believe I have--G12 and G14 seems to be the same--the same--the same in the sense that they both have a D1S80 showing alleles 18 and 24.
And G14. The frequencies that you estimated for these assumed two contributors and assumed three contributors are as a result of just one PCR marker; is that right?
Okay. Then have you calculated an approximate frequency of how often this combination of markers would be found assuming two contributors?
Why are the numbers you've just described--1 in 6 is a fairly common number; is that right?
Why is that the case with the results on G12 and G14 just from looking at the actual results of this PCR test?
--those appear to be more frequent certainly than most if not all of the numbers you've reported so far?
That's right. Because this is only based on only two matching alleles. There's only two alleles we have to account for instead of accounting for a larger number. There's only two things we have to say. These two alleles are contained in the profiles of two people or three people.
Did you also calculate approximations assuming three contributors to these stains?
Yes. G11 and G13 have the same--and I'm looking at table 27A. This is also a D1S80 profile, and I see on the board there are three people not excluded, but there are three alleles showing the D1S80 in this mixture. Three alleles could have been contributed by two people or three people or a larger number. They could not have been contributed by one person. If there were two contributors to the stains 11 and 13--I see. I've got the table Mr. Neufeld labeled--this is table 27C. The numbers will go from 1 in 14 to 1 in 300 for two contributors.
Okay. With regard to those portions of the glove from G1 on down, have we completed the mixtures?