Okay. Good afternoon, Dr. Shields. Dr. Shields, did you bring with you a curriculum vitae--
And I suppose we should have this marked as the next in--Defense next in order, your Honor.
I have an AB in biology from Rutgers University and an MS and a Ph.D. in zoology from the Ohio State University.
I'm a Professor of Biology at the State University of New York, College of Environmental Science and Forestry in Syracuse.
Okay. And I see on your vitae you list some previous positions that you've held, among them distinguished scholar in residence at Northern Arizona University. Can you describe what that position was?
That wasn't a previous position. That was an endowed chair to allow for people to visit during sabbatical. So I had a sabbatical year that year in 1986, '87 and spent a year in Arizona. In fact, that's where I started doing molecular genetic techniques.
Okay. All right. Can you describe for us generally what your areas of research are?
Yes, I can. Since graduate school, about half of the research I've been doing has been in the area of behavioral ecology, primarily with birds. The other half of the research that I do is about evolutionary biology, and in particular, the evolution of population structure and how population structure influences evolution.
Okay. Now, this work in this second area, evolutionary biology, does this include research and publications in the field of population genetics?
Yes. I published a book and--which is peer reviewed and a number of articles as well.
I have published papers that are statistical in nature and in the journals where statistics would be.
Okay. Now, have you published any work relevant to the use of statistics in connection with DNA evidence?
I've published one paper which was not peer reviewed in a proceedings of a Promega Conference that's directly about the use of statistics in forensic DNA typing, and then I've published a number of papers in parts of the book where relevant ideas, ideas that are relevant to understanding how to do forensic DNA typing, profile frequencies are presented.
What teaching experience do you have that's relevant to the fields of statistics or population genetics?
Well, I've taught courses. Well, probably three different ones. I've taught courses in--that have a large genetic component, primarily conservation biology and conservation genetics where one looks at how population genetics influences the conservation of rare and endangered species. I've also taught courses in statistical inference, graduate courses.
And what is the nature of the laboratory work that you--that goes on in your lab?
The laboratory is set up to do what we call field research on genetic variation. We look at a variety of organisms and use three major tools to look at how they vary in nature. We use protein electrophoresis which has been mentioned, we use RFLP DNA typing and we also use PCR.
In about 1990, an attorney in Syracuse, New York asked me to look at some data involved in a criminal case, and that was my initial involvement. As a result of that case, I became involved in his case, and I've continued to occasionally look at a variety of data that are relevant to forensic proceedings.
Uh-huh. Are you familiar with the methods used by laboratories such as Cellmark Diagnostics and DOJ?
I've visited Cellmark. I have visited the Department of Orange County's crime lab. I have visited the--that's about it.
But they don't do their case work there. That's where they do the research underlying the case work.
Okay. And have you studied the protocols developed by the various forensic laboratories that do DNA testing?
I've done similar studies back in 1990, '91 to the sorts that Dr. Weir mentioned earlier, looking at the data that are produced in--by the forensic laboratories for different races and ethnic groups and different geographical regions to examine the issues of independence. I've done that for 40, 50 labs from around the world.
Okay. And does this work have to do with assessing the validity of statistical inferences drawn from forensic DNA evidence?
Well, I've been asked to present my ideas about some of the potentials and problems with--associated with forensic DNA typing, in particular, the statistical aspects by Promega in a human identification conference a number of years ago, by the California Association of Criminalists, again, a number of years ago at a meeting at Bass Lake.
Is this the same California Association of Criminalists that we've heard testimony about in this case from Mr. Fung and other witnesses?
Okay. When you attended these conferences, did you meet with people from the LAPD crime lab?
I've also been asked to in essence do short courses, sometimes a little longer than short, but short courses and other kinds of courses about forensic DNA typing, and again, particularly, the statistics with a variety of legal organizations; for example, public defender's office in Maryland, public defender's office in New Hampshire, the public defender's office in Massachusetts, also, the Bar Association of Tompkins County in New York, so that it had Prosecutors, Judges and Defenders, and I've been asked to engage in debate to explicate these issues, for example, at the University of Iowa with scientists associated with the Royal Canadian Mounted Police.
Have you been asked by any crime labs to consult with them on how to set up their databases?
Yes, I have. Fairly recently, I've become involved with the Monroe County crime lab, that's Rochester, New York, in an attempt to develop databases that they can use with their PCR techniques.
Now, one other question about your laboratory. You mentioned a lot of the statistical work. In your laboratory, do you also do work in molecular biology?
My students do. I don't do actual molecular biology for my own research, but I in essence teach them and assist them in interpreting their molecular work.
Okay. Now, have you been called to testify as an expert witness concerning statistical inferences in forensic DNA evidence cases?
In about 60 different cases. And those cases sometimes would include hearings and/or trials, sometimes both.
Now, do you testify only for the Defense or are you willing to testify or have you testified for both Prosecution and Defense?
Okay. Have you ever been asked to testify as an expert witness by the Los Angeles County District Attorney's office?
Okay. And how did it happen that you were contacted by the Los Angeles County District Attorney's office and asked to be an expert witness for them?
An assistant District Attorney, whose name I do not remember, called and asked if I would be interested, and he said he was calling at the request of another assistant District Attorney.
Okay. Now, Dr. Shields, are you familiar with the National Research Council's report titled "DNA Technology in Forensic Science"?
And do you consider this an authoritative source of scientific information about forensic DNA testing?
Do you consider this an authoritative source of scientific information about forensic DNA testing?
Okay. Now, at page 59 of the NRC report, there's a statement that we've been talking about concerning the way--concerning what statistics should be computed in connection with mixed DNA samples. Are you familiar with that statement?
Are we talking about the statement that's the last sentence in the third paragraph, fourth paragraph?
To be clear, let me read the statement. The statement says: "If a suspect's pattern is found within the mixed pattern, the appropriate frequency to assign to such a match is the sum of the frequencies of all genotypes that are contained within, i.e., that are a subset of the mixed pattern." Now, that's the statement you're familiar with?
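As a rough illustration of the sentence just read, the sum it describes can be sketched for a single locus. This is only a sketch under stated assumptions: the allele names and frequencies are hypothetical (not taken from the case), genotype frequencies are assumed to follow Hardy-Weinberg proportions, and the function names are invented for the example.

```python
from itertools import combinations_with_replacement

def genotype_frequency(a, b, freqs):
    """Hardy-Weinberg genotype frequency (assumed): p^2 if homozygous, else 2pq."""
    if a == b:
        return freqs[a] ** 2
    return 2 * freqs[a] * freqs[b]

def included_frequency(mixture_alleles, freqs):
    """Sum the frequencies of every genotype whose alleles all appear in the mixture."""
    alleles = sorted(mixture_alleles)
    return sum(
        genotype_frequency(a, b, freqs)
        for a, b in combinations_with_replacement(alleles, 2)
    )

# Hypothetical allele frequencies at one locus (illustration only).
freqs = {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40}
mixture = {"A", "B", "C"}  # alleles observed in the mixed pattern

total = included_frequency(mixture, freqs)
print(f"Frequency of individuals not excluded: {total:.4f}")
```

Under these assumptions the sum is the same as squaring the combined frequency of the alleles seen in the mixture, and it requires no statement about how many people contributed.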
Okay. Now, I would like to have you take a look at the exhibit that's been marked Prosecution No. 410. And I don't know if I can put that on the Elmo now.
Actually, I'm sorry, your Honor. I don't. I didn't know you were talking about the exhibit.
Okay. Now, Dr. Shields, has this chart--does this chart show computations that have been done in accordance with the recommendations of the National Research Council concerning mixtures?
Okay. Why is it necessary to compute statistics for mixed DNA stains in a manner that's different for--from the way statistics are computed for an unmixed stain, stain from a single person?
Because there's a different set of evidence that needs to be addressed. In one case, they're--in what we call a single stain, we will see evidence that's consistent with a single stain and no positive evidence of a mixture. In other words, we can assume that it's a single person and not have any evidence that's against that or any evidence that would suggest otherwise. When you have a mixture, you know that there is more than one person involved in that piece of evidence. But what you don't know, there's no positive evidence that it's two, three, four or five.
Or any number. There is no evidence based on simply the mixture that you see in a stain of how many people contributed to that.
And how does that uncertainty about number of contributors affect the statistic that should be presented to characterize the value of a finding of consistency with a mixture?
Well, that will depend on the question that you're attempting to address and what you're trying to present to a jury. If what you're trying to do with any trier of fact or to a student, it doesn't matter who the audience is, if what you're trying to do is get a flavor for how common or rare a match would be, which is presumably what's going on, then the question becomes, how many individuals in the random population or the population of interest in another way would be likely to be if they were taken off the street and typed declared potential contributors to that mixed stain. And that's a way of providing a weighting to that evidence without any assumptions.
Does the National Research Council's method address the question that you just posed?
Okay. Have you reviewed Dr. Weir's method for computing the frequencies in connection with mixed stains?
Okay. Let's limit our discussion to his report of--his most recent report, which is the report of Tuesday. No. Wait a minute.
The report of yesterday. Okay. We'll limit our comments to his version of yesterday. Have you reviewed that document?
Does the method Dr. Weir recommends for computing statistics in connection with mixtures comply with the National Research Council's recommendations?
And why not? In what way does it fail to comply with the National Research Council's method?
The National Research Council is very clear and unequivocal about how this should be done. It says to sum the frequencies of all genotypes that are found in the mixture. Genotypes are individuals. The frequencies of all of the individuals whose genotypes allow them to be potential contributors to that mixture is the only way that one can produce a frequency as suggested by that sentence.
In contrast, Dr. Weir's methodology produces a summation of the frequency of pairs of individuals. Rather than simply summing genotypes, it sums probabilities of pairs of individuals.
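The distinction drawn here, individuals versus pairs of individuals, can be made concrete with a small sketch. It is not Dr. Weir's actual computation, which the testimony does not spell out. It is one possible reading of "pairs of individuals", assuming exactly two contributors, Hardy-Weinberg proportions, and hypothetical allele frequencies. All names are invented for the example.

```python
from itertools import combinations_with_replacement, product

def hw(genotype, freqs):
    """Hardy-Weinberg genotype frequency (assumed): p^2 or 2pq."""
    a, b = genotype
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]

def genotypes_within(mixture):
    """All genotypes whose alleles are a subset of the mixture."""
    return list(combinations_with_replacement(sorted(mixture), 2))

def individuals_not_excluded(mixture, freqs):
    """Sum over single genotypes contained in the mixture; no contributor count assumed."""
    return sum(hw(g, freqs) for g in genotypes_within(mixture))

def pairs_showing_mixture(mixture, freqs):
    """Sum over ordered pairs of genotypes that together show exactly the mixture.
    This assumes exactly two contributors, which the mixture alone does not establish."""
    gs = genotypes_within(mixture)
    return sum(
        hw(g1, freqs) * hw(g2, freqs)
        for g1, g2 in product(gs, repeat=2)
        if set(g1) | set(g2) == set(mixture)
    )

# Hypothetical allele frequencies at one locus (illustration only).
freqs = {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40}
mixture = {"A", "B", "C"}

single = individuals_not_excluded(mixture, freqs)
paired = pairs_showing_mixture(mixture, freqs)
print(f"individuals not excluded: {single:.4f}")
print(f"pairs showing the mixture: {paired:.4f}")
```

The two numbers answer different questions: the first is a proportion of people, the second a probability attached to a pair of people, so they are not directly comparable.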
In your opinion, is Dr. Weir's method an appropriate way to characterize the value of mixture evidence in a case such as this one?
Because it makes assumptions that go beyond the genetic evidence in my opinion. The genetic evidence of a mixture only allows one to say that one or more individuals contributed to that. It doesn't tell you who. If one assumes that one knows who produced a mixture, one is no longer doing a test of a hypothesis. One is just validating one's initial assumption. So in essence, I don't believe it's correct to assume that you know anything about the number of contributors as a geneticist looking at the data. I think it's appropriate to provide a frequency with which individuals could possibly contribute to the mixture. In contrast, Dr. Weir's method is to ask the question, would these known individuals--what's the probability of these known individuals having contributed to this evidence sample. What the NRC would say is, what is the probability of any individuals that could possibly--what's the frequency of, not what's the probability of, but what's the frequency of any individuals that could--and that "could" is an important word--contribute or could have contributed to this evidence.
In your opinion, would other scientists in the field of statistics and population genetics accept Dr. Weir's approach for characterizing mixtures?
Okay. All right. So--and what about the National Research Council's method? What do you think would be the distribution of scientific opinion about that method?
So would it be fair to say that this--this issue of how to compute statistics in connection with mixtures is an issue in controversy in the scientific community?
It's not really been addressed. So I'm not sure how you could call it a controversy yet.
Okay. And why is that? Based on your experience, Dr. Shields, in testifying in a number of cases, why is it that this issue of how to compute statistics on mixtures hasn't--hasn't come up before?
Well, it has come up before. I've been involved in a couple of cases where it's come up. But it doesn't happen frequently enough I think that people have sat down and thought about it except that they did present a way of doing it that's without assumption.
Okay. So is this--is this an issue that has been discussed extensively in the scientific community or is it an issue that's just emerging as a topic of discussion?
Okay. At this point, is there, in your opinion, any consensus in the scientific community about the right way to compute statistics in connection with mixtures?
Is there any consensus about the right way to compute statistics in connection with mixtures?
I'm not sure what "Consensus" means. I don't think so. I mean I--if "Consensus" means everybody sort of agrees, no.
Okay. Which--which method--as between Dr. Weir's method and the NRC method, which method is the most conservative method in terms of making the fewest assumptions?
The fewest assumptions? The NRC's method makes no assumptions that are not there in the genetics.
Okay. And do you think it's bad to make the assumptions that Dr. Weir was making in doing his calculations?
I wouldn't. I don't think it's good. I'm not sure if "Bad" is the right word for it. I think--I don't think it's--it presents the evidence in the light that's the best characterization of that evidence for--from the genetics.
Okay. All right. Now, did you understand Dr. Weir to say that the frequencies that he was computing were likelihood ratios?
A likelihood ratio is in essence just the ratio of two probabilities. And by doing a ratio of two probabilities, one can come up with some feeling for how much more likely one of those hypotheses is versus a second, and it's the probabilities associated with two hypotheses.
Okay. Now, does a likelihood ratio--what's the purpose of a likelihood ratio in the field of statistics? What is it used for?
It can be used for lots of different things. But the notion is that you're trying to get a feel for how much more likely one proposed set of events is relative to another proposed set of events to explain the same sort of outcome.
Do you think that Dr. Weir's statistics, that is statistics computed according to his method, are an appropriate likelihood ratio for characterizing the value of mixture evidence in this case?
Well, there's actually two reasons. One has to do with the fact that, first of all, they're not likelihood ratios unless one makes an assumption. The frequencies as they're presented for single stains are likelihood ratios if one assumes that there is a known individual. And that's--that can be done if you have a known sample, for example, if there's just one individual contributing that stain. You can also do that I think legitimately if you have no extra bands to make an assumption that--to show you that that assumption is absolutely false, that there are in fact more than one contributor to a particular piece of evidence. So if one assumes that there is one individual and they match at each and every band, then the numerator in the likelihood ratio becomes 1. And when you multiply 1 by the frequency, the answer is the frequency.
So the frequency and the likelihood ratio are arithmetically equivalent. But that's based on that 1, and it doesn't have to be 1, the numerator doesn't have to be 1. If you have no knowledge of what the likelihood is, that it was two individuals that contributed a stain or whether it was these two individuals that might have contributed the stain, if you're not using those hypotheses, you can still present the frequency, but it's no longer part of a likelihood ratio. It's simply the frequency of the genetic profile in the populations represented by the databases sampled.
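The single-stain case described above can be written out as a short worked equation. The symbols are assumptions introduced for illustration: $E$ is the observed profile, $H_1$ the hypothesis that the known individual left the stain, $H_2$ the hypothesis that an unrelated person drawn from the sampled population did, and $f$ the profile frequency in the database.

```latex
\[
LR \;=\; \frac{P(E \mid H_1)}{P(E \mid H_2)},
\qquad
P(E \mid H_1) = 1,\quad P(E \mid H_2) = f
\;\;\Longrightarrow\;\;
LR = \frac{1}{f}.
\]
```

In this notation the ratio is the reciprocal of the frequency, so the two carry the same information only while the numerator is fixed at 1. If the numerator is unknown or not assumed, $f$ can still be reported, but only as a frequency and not as part of a likelihood ratio.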
Okay. And so are you telling us then that the frequencies as computed by Dr. Weir would not be an appropriate likelihood ratio unless a number of assumptions are made?
Yes. That's just the assumptions for the numerator. There are also assumptions for the denominator.
And do you think that the assumptions that are required in order for Dr. Weir's frequencies to be treated as a likelihood ratio are appropriate assumptions for an expert witness to make in a criminal case?
Do you think that the assumptions that would be necessary in order for Dr. Weir's frequencies to be treated as a likelihood ratio are appropriate assumptions for an expert witness to make in a criminal case?
If--I don't believe they're appropriate for a genetics expert witness to make because a genetics expert is, in my opinion, intended to be presenting information that stems from the genetic data that are associated with that case and only the genetic data.
And would Dr. Weir's assumptions require the expert to go beyond the genetic data?
All right. I see in Dr. Weir's report that when he computes likelihood ratios, the likelihood ratio purports to characterize the probability of the evidence under two different hypotheses. Did you see that section?
And the two hypotheses are that the particular individual had contact with the scene or did not have contact with the scene.
Is that correct? Now, do you believe that that is a legitimate way to frame competing hypotheses that are to be tested by genetic data?
No, because there are non-genetic influences that will change those probabilities.
Okay. And so if one--if one took the position that frequencies computed from genetic data provided a likelihood ratio for distinguishing those hypotheses, would one be leaving out any important factors?
Well, since what goes into the likelihood ratio here is the probability of this profile given the database information, the frequencies of the variants in a database, it does not address the questions of the likelihood of getting a positive result evidence with a particular profile that's associated with errors of a variety of kinds, cross-contamination that can occur for a variety of reasons, and it also doesn't--all it does is the coincidental match, the probability that someone other than the individual in question might have left the evidence.
Would it be scientifically valid to leave out those other variables you mentioned when presenting a likelihood ratio which purported to show the relative likelihood of the evidence under the hypothesis contact and no contact?
Okay. Would it be biased to present such numbers and to claim that they were a likelihood ratio for the hypothesis contact versus no contact?
Those two--with that set of words, contact and no contact, yes, I think they are.
Okay. Do you think that the assumptions that you've discussed that underlie Dr. Weir's analysis are easy assumptions for people to understand?
You've--based on your experience in teaching statistics and genetics to students, would it be easy to explain what's wrong with Dr. Weir's assumptions?
Well, would it be--would it--let me withdraw the question and rephrase it. Would it be easy to explain the nature of the assumptions that Dr. Weir is making?
I think it shows up more in what happens in the results of Dr. Weir's assumptions and analysis where you end up with a set of what he calls frequencies, but I think they are probabilities associated with different explicit pairs of individuals of different ethnic backgrounds associated with different databases and, therefore, different frequencies so that you end up with instead of a number that gives someone a flavor for how likely it is that anybody might contribute to this mixed stain which is a single number, you have a set of numbers that can range from 1 in 7 to 1 in 17,000. And how one interprets that, I think it's open to individuals to look and say, okay, the 1 in 7 is the biggest number, and so that makes that the likeliest trio, and that trio may have very little relationship to anything that's going on in the case or in any case. It's precise, but I think it can mislead.
Okay. Now, let me direct your attention to the chart that's sitting just to your left. It's the Prosecution's chart labeled "Results of DNA analysis, Bronco automobile." Have you seen that chart before?
Okay. Now, in the column on the far right, that column is headed with a term "Frequency," and in some of the boxes below there, there are certain numbers. Now, do all the numbers that are presented in that chart so far--well, first of all, do you understand what those numbers are and what they mean?
Okay. Would it be correct to characterize all of the numbers that are on the chart at this point as frequencies?
Okay. Now, if we put onto that chart numbers derived from Dr. Weir's computations, would those numbers be comparable as far as their meaning to the numbers that are already on there?
The difference stems from the column to the left of the frequency column. The column to the left says "Not excluded." And, therefore, it's the frequency of individuals who are not excluded. And if it's the frequency of individuals who are not excluded, the NRC's methodology is what gives you that proportion, that frequency. Dr. Weir's method gives you the frequency of pairs of individuals that would not be excluded.
Well, he's also done three's. But it doesn't do the frequency of individuals who would not be excluded, and that's what all of the other ones are.
Okay. So if--are you saying that if we wanted the mixture numbers to be comparable in their meaning to the numbers that are already on there, we should present numbers done the NRC way?
Do you think it would be easy for people to understand the difference between the NRC numbers and Dr. Weir's numbers?
I'm sure that many would [accept Weir's approach]. And do you think some would not? I'm sure that many would not. We have one of each here.