People v Wakefield
2022 NY Slip Op 02771 [38 NY3d 367]
April 26, 2022
DiFiore, J.
Court of Appeals
Published by New York State Law Reporting Bureau pursuant to Judiciary Law § 431.
As corrected through Wednesday, October 5, 2022
The People of the State of New York, Respondent, v John Wakefield, Appellant. |
Argued March 15, 2022; decided April 26, 2022
People v Wakefield, 175 AD3d 158, affirmed.
This appeal primarily concerns the admissibility of DNA mixture interpretation evidence generated by the TrueAllele Casework System. We conclude that Supreme Court did not abuse its discretion in finding, following a Frye hearing, that TrueAllele's use of the continuous probabilistic genotyping approach to generate a statistical likelihood ratio—including the use of peak data below the stochastic threshold—of a DNA genotype is generally accepted in the relevant scientific community. We also hold that there was no error in the court's denial of defendant's request for discovery of the TrueAllele software source code in connection with the Frye hearing or for the purpose of his Sixth Amendment right to confront the witness against him at trial.
On April 12, 2010, the victim was found strangled to death in his apartment, with a guitar amplifier cord wrapped around his neck. Several items had been stolen from the victim's home, including a PlayStation 3, a laptop and a distinctive orange duffel bag. Witnesses observed defendant in the company of the victim the weekend of the homicide and defendant admitted to three individuals that he had choked the victim. Defendant did not dispute that he had been present at the victim's home. A separate witness observed defendant with a distinctive orange duffel bag like the one belonging to the victim, attempting to trade a PlayStation and a laptop for drugs. The victim's PlayStation 3 was recovered from the home of a local drug dealer.
At the scene, the police collected DNA samples from several items of evidence and sent them to the New York State Police Forensic Investigation Center (the Lab) for PCR DNA typing analysis, using the [*2]FBI-selected 15 STR loci standard.[FN1] Defendant's single-source DNA profile was developed from two bottles taken from the victim's home. Relevant to the issues presented here are four DNA profiles developed by the Lab through the{**38 NY3d at 372} samples taken from the front and rear outside collar of the victim's shirt, the victim's dorsal forearm and a section of the amplifier cord used to strangle the victim. The DNA test results developed by the Lab in an electropherogram were then compared to known DNA profiles from defendant and the victim. The Lab concluded that: (1) the two profiles generated from the shirt collar were consistent with at least two donors, one of which was the victim, and defendant could not be excluded as the other contributor; (2) the DNA mixture from the right dorsal forearm was consistent with DNA from the victim, as the major contributor, mixed with at least two additional donors; and (3) the amplifier cord was a mixture of at least two donors, from which the victim could not be excluded as a possible contributor. The results generated from the amplifier cord were not compared to defendant's DNA profile because of the complexity of the mixture.
Since the Lab used an interpretation standard of a stochastic threshold of 50 to 100 relative fluorescence units, the analyst did not call any alleles based on peaks on the electropherogram below that threshold. As a result, there was insufficient data to allow the Lab to calculate probabilities for the unknown contributors to the DNA mixtures found on the amplifier cord and the front of the shirt collar. The Lab was able to call only 4 out of 15 STR loci and the analyst, using the combined probability of inclusion method,[FN2] generated a statistic that the probability an unrelated individual contributed DNA to the outside rear shirt collar was 1 in 1,088. Using the same number of loci for the profile obtained from the victim's forearm, the analyst generated a statistic that the probability an unrelated individual contributed DNA to the profile was 1 in 422.
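For readers unfamiliar with the threshold-based approach described above, the following is a minimal sketch, not the Lab's actual protocol. It assumes the textbook form of the combined probability of inclusion (at each locus, the squared sum of the frequencies of the called alleles, multiplied across loci). The locus names, peak heights, allele frequencies and the 50 RFU cutoff are hypothetical values chosen only for illustration.

```python
from math import prod

def call_alleles(peaks, threshold_rfu):
    """Return the alleles whose peak height (in RFU) meets the threshold."""
    return {allele for allele, height in peaks.items() if height >= threshold_rfu}

def combined_probability_of_inclusion(called, freqs):
    """Multiply, across loci, the squared sum of the called alleles' frequencies."""
    per_locus = []
    for locus, alleles in called.items():
        if not alleles:
            continue  # a locus with no callable alleles is left out of the statistic
        included = sum(freqs[locus][a] for a in alleles)
        per_locus.append(included ** 2)
    return prod(per_locus)

# Hypothetical peak heights (RFU) and allele frequencies, for illustration only.
peaks = {
    "LocusA": {"12": 410, "14": 380, "15": 45},  # the 45 RFU peak falls below the cutoff
    "LocusB": {"8": 220, "9": 190},
}
freqs = {
    "LocusA": {"12": 0.10, "14": 0.20, "15": 0.05},
    "LocusB": {"8": 0.25, "9": 0.25},
}

called = {locus: call_alleles(p, threshold_rfu=50) for locus, p in peaks.items()}
cpi = combined_probability_of_inclusion(called, freqs)
print(f"CPI = {cpi:.4f}, about 1 in {1 / cpi:.0f}")
```

The sketch shows why discarding sub-threshold peaks limits the statistic: with fewer loci and alleles called, fewer terms enter the product, so the resulting figure is less discriminating.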
The electronic data from the DNA testing of the four samples at issue was then sent to Cybergenetics for additional analysis because its TrueAllele Casework System applies a continuous probabilistic genotyping method of calculating a likelihood ratio—using all of the information generated on the electropherogram, including peaks that fall below a laboratory's stochastic threshold. The likelihood ratio in its modern form was developed by Alan Turing during World War II as a code-breaking method. TrueAllele uses a probability model to assess{**38 NY3d at 373} the values of a genotype objectively. It does not consider a reference sample for any particular DNA profile. Following protocol, once the genotypes were inferred based on mathematical computations from all the data in the electropherograms, the system compared defendant's genotype to all of the statistical genotype [*3]possibilities and calculated likelihood ratios as to the presence of defendant's genotype. The ratios were exponentially greater than those generated by the methods employed by the Lab. Specifically, TrueAllele concluded that it was 5.88 billion times more probable that defendant was a contributor to the mixture on the amplifier cord than an unrelated black person, that it was 170 quintillion times more probable that defendant was a contributor to the mixture on the outside rear shirt collar than an unrelated black person, that it was 303 billion times more probable that defendant was a contributor to the mixture on the outside front shirt collar than an unrelated black person, and that it was 56.1 million times more probable that defendant was a contributor to the mixture on the victim's dorsal forearm than an unrelated black person.
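The comparison step described above can be illustrated with a simplified sketch. It is not TrueAllele's computation, whose details are not in the record quoted here. It assumes that, at each locus, the likelihood ratio is the inferred probability of the reference genotype divided by that genotype's frequency in the comparison population, and that loci are independent so the per-locus ratios multiply. All genotypes, probabilities and frequencies below are hypothetical.

```python
from math import log10, prod

def locus_likelihood_ratio(inferred, population, genotype):
    """Inferred probability of `genotype` divided by its population frequency."""
    return inferred.get(genotype, 0.0) / population[genotype]

def combined_likelihood_ratio(per_locus):
    """Multiply per-locus ratios, assuming the loci are independent."""
    return prod(per_locus)

# Hypothetical values for two loci; a genotype is a pair of allele labels.
reference = {"LocusA": ("12", "14"), "LocusB": ("8", "9")}
inferred = {
    "LocusA": {("12", "14"): 0.80, ("12", "12"): 0.20},
    "LocusB": {("8", "9"): 0.50, ("8", "8"): 0.50},
}
population = {
    "LocusA": {("12", "14"): 0.04, ("12", "12"): 0.01},
    "LocusB": {("8", "9"): 0.10, ("8", "8"): 0.05},
}

ratios = [
    locus_likelihood_ratio(inferred[locus], population[locus], reference[locus])
    for locus in reference
]
lr = combined_likelihood_ratio(ratios)
print(f"LR = {lr:.1f}, log10(LR) = {log10(lr):.2f}")
```

Because the ratios multiply across loci, a statistic built on all 15 loci can grow far larger than one built on 4, which is consistent with the gap between the Lab's figures and those reported by TrueAllele.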
Prior to trial, in March 2014, defendant moved to preclude the introduction of any evidence or testimony derived from the TrueAllele Casework System or, in the alternative, for a Frye hearing to determine the general acceptance of TrueAllele in the relevant scientific community. Defense counsel acknowledged having received from the People approximately 1,500 pages of discovery documents, including reports relating to the DNA analysis, but argued that several additional items "must be disclosed"—specifically, the defense sought the "assumptions" and parameters programmed into the TrueAllele system and the software's source code.
In support of the motion, defendant submitted an affidavit from Ranajit Chakraborty, Ph.D., a member of the Scientific Working Group on DNA Analysis Methods (SWGDAM)[FN3] and former member of the New York State Commission on Forensic Science DNA Subcommittee. Parenthetically, the DNA Subcommittee is charged by Executive Law § 995-b (13) (b) to "assess and evaluate all DNA methodologies proposed to be used for forensic analysis" in the state. Dr. Chakraborty acknowledged {**38 NY3d at 374}that both New York's DNA Subcommittee and the full Commission on Forensic Science (CFS) approved TrueAllele for forensic casework without limitation in 2011 while he was a member. However, since TrueAllele "re-analyzes" the electronic data from another laboratory by including "all allele peak height information including that which extends below" the stochastic threshold, he maintained that it was a novel innovation which had not gained general acceptance in the scientific community. Dr. Chakraborty opined that, when the DNA Subcommittee approved TrueAllele, it had not been presented with evidence of its ability to analyze certain types of complex DNA mixtures and that it "ha[d] not been adequately validated for the type of casework[ ] [to which] it is now being applied." Dr. Chakraborty contended that, in the absence of disclosure of the source code for the software and the underlying assumptions programmed into the system, "TrueAllele cannot be meaningfully validated."
Supreme Court granted defendant's motion to the extent of ordering a Frye hearing to determine the admissibility of TrueAllele's methodology.
In July 2014, prior to the Frye hearing, defense counsel made a supplemental demand for discovery seeking several items relating to the TrueAllele process including, as relevant here, the TrueAllele source code. The People, who were not in possession of the source code, denied the [*4]request as the source code was outside the scope of the former demand discovery provision, CPL 240.20.
The Frye hearing commenced in October 2014 and the People called three witnesses—Dr. Mark Perlin from Cybergenetics, and Dr. Barry Duceman and forensic analyst Jay Caponera from the Lab. Dr. Perlin, who has a medical degree as well as a Ph.D. in both mathematics and computer science, founded Cybergenetics in 1994 and is its Chief Scientist and Executive Officer. He testified that TrueAllele is comprised of two systems—TrueAllele Databank, for interpreting single-source reference samples (used by the New York State Police for the state DNA databank), and TrueAllele Casework, for interpreting more complex DNA mixtures. The TrueAllele Casework System does not generate a DNA profile, but analyzes the electronic raw data that has already been generated by a forensic crime lab, separates (or deconvolutes) the genotypes and calculates the{**38 NY3d at 375} likelihood ratio.[FN4] As described at the hearing, the analyst operating the TrueAllele system reviews the raw electronic data and then inputs the data file into the computer, setting parameters such as the number of contributors and the quality of the mixture. The computer then separates the genotypes using the mathematical probability principle of the Markov chain Monte Carlo (MCMC) search to calculate the probability for what the different genotypes could be. The computer conducts a quantitative analysis using all of the electronic data generated in the electropherogram—including patterns and peak heights—as opposed to the qualitative analysis of visual human review, which has a limited data field due to the laboratory's stochastic level.[FN5] Through the mathematical process, the system makes an inference based on the probability of each possibility of the alleles at each locus, then records the genotypes and mixture weights for each contributor. 
The analyst then looks at the results to determine whether the computer has achieved "concordance" and, if not, the analyst has the option to take additional steps—i.e., ask the computer to model for another donor, allow it to run more cycles or ask it to account for degradation in the sample.
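The Markov chain Monte Carlo search referred to in the testimony can be illustrated with a toy example. This is a sketch under stated assumptions and not the TrueAllele algorithm: it holds both contributors' genotypes fixed and uses a Metropolis sampler to estimate only the mixture weight at a single locus, with a flat prior and a Gaussian noise model for peak heights. The genotypes, peak heights, total signal and noise level are hypothetical.

```python
import math
import random

def expected_heights(w, geno_a, geno_b, total):
    """Expected peak height per allele when contributor A supplies fraction w."""
    alleles = set(geno_a) | set(geno_b)
    return {
        a: total * (w * geno_a.count(a) + (1 - w) * geno_b.count(a)) / 2
        for a in alleles
    }

def log_likelihood(w, observed, geno_a, geno_b, total, sigma):
    """Gaussian log-likelihood of the observed heights (additive constant dropped)."""
    expected = expected_heights(w, geno_a, geno_b, total)
    squared_error = sum((h - expected.get(a, 0.0)) ** 2 for a, h in observed.items())
    return -squared_error / (2 * sigma ** 2)

def sample_mixture_weight(observed, geno_a, geno_b, total, sigma,
                          steps=20000, burn_in=2000, step_size=0.05, seed=0):
    """Metropolis sampler over the mixture weight w with a flat prior on (0, 1)."""
    rng = random.Random(seed)
    w = 0.5
    current = log_likelihood(w, observed, geno_a, geno_b, total, sigma)
    samples = []
    for i in range(steps):
        proposal = w + rng.gauss(0.0, step_size)
        if 0.0 < proposal < 1.0:  # proposals outside the prior's support are rejected
            candidate = log_likelihood(proposal, observed, geno_a, geno_b, total, sigma)
            if rng.random() < math.exp(min(0.0, candidate - current)):
                w, current = proposal, candidate
        if i >= burn_in:
            samples.append(w)  # a rejected proposal repeats the current value
    return samples

# Hypothetical single-locus data: two heterozygous contributors, heights in RFU.
observed = {"12": 700, "14": 720, "15": 310, "17": 290}
samples = sample_mixture_weight(observed, ("12", "14"), ("15", "17"),
                                total=2000, sigma=50)
estimate = sum(samples) / len(samples)
print(f"estimated share of contributor A: {estimate:.3f}")
```

Running the sampler for more steps corresponds loosely to the analyst's option, described above, of allowing the computer to run more cycles. A fuller model would also sample the genotypes themselves rather than treating them as known.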
As to general acceptance of the continuous probabilistic genotyping system, the testimony of the People's witnesses established that probabilistic genotyping methods have been recognized by the relevant scientific community such as SWGDAM, the American National Standards Institute [*5]and the{**38 NY3d at 376} National Institute of Standards and Technology (NIST) as a valid approach to DNA interpretation—including the fully continuous probabilistic genotyping approach used by TrueAllele. The mathematical and scientific principles underlying the system (MCMC and Bayes' theorem) are well-established and independent validation of the reliability of the software is available in the form of a free trial that can be used to verify a known sample. Dr. Perlin has presented on the DNA mixture interpretation of TrueAllele at multiple scientific conferences, including internationally, and TrueAllele Casework has been the subject of numerous peer-reviewed published articles in scientific journals. The TrueAllele Casework System had also undergone approximately 25 validation studies—one involving samples created by NIST. Although Dr. Perlin was involved in and coauthored most of the validation studies, his interest in TrueAllele was disclosed as required by the journals who published the studies and the empirical evidence of the reliability of TrueAllele was not disputed. Four validation studies were conducted independently by laboratories.
Two of the independent validation studies were conducted by the New York State Police Lab and were designed to conform to the quality assurance standards of the FBI in order to maintain the Lab's access to the national Combined DNA Index System (CODIS) database. Jay Caponera testified that he performed the validation studies using complex mixtures of up to four contributors and varying amounts of DNA, including low template samples.[FN6] The system was able to separate the known donor samples from the unrelated profiles provided by staff members—demonstrating that it provides both inculpatory and exculpatory results. In sum, the validation studies found the results generated by TrueAllele were reliable as they were sensitive (identifies the correct person), specific (excludes noncontributors), accurate and reproducible.
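The terms "sensitive" and "specific" used above can be made concrete with a short sketch of how such rates might be tallied from a validation run on known samples. The decision rule (a log likelihood ratio above zero counts as an inclusion) and every data value are assumptions made for illustration. They are not drawn from the Lab's studies.

```python
def sensitivity_and_specificity(results, log_lr_cutoff=0.0):
    """Tally rates from (is_true_contributor, log10 likelihood ratio) pairs.

    Sensitivity: share of true contributors scored above the cutoff.
    Specificity: share of noncontributors scored at or below the cutoff.
    """
    contributors = [lr for is_contrib, lr in results if is_contrib]
    noncontributors = [lr for is_contrib, lr in results if not is_contrib]
    sensitivity = sum(lr > log_lr_cutoff for lr in contributors) / len(contributors)
    specificity = sum(lr <= log_lr_cutoff for lr in noncontributors) / len(noncontributors)
    return sensitivity, specificity

# Hypothetical validation results: known donors and unrelated comparison profiles.
results = [
    (True, 9.1), (True, 6.3), (True, 2.4), (True, -0.5),
    (False, -4.2), (False, -7.9), (False, -1.1), (False, 0.3),
]
sensitivity, specificity = sensitivity_and_specificity(results)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```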
The evidence at the hearing further established that NIST used the TrueAllele system to examine the composition and assess the weights of a two-person mixture for its standard reference materials (used by other laboratories to conduct quality{**38 NY3d at 377} assurance).[FN7] Evidence generated by TrueAllele had also been found admissible after hearings conducted in states outside New York, as well as other countries. In addition, the system was used to conduct all forensic casework in the Commonwealth of Virginia and in Kern County, California.
Defendant did not call an expert witness or introduce any evidence at the hearing. He also did not refute the fundamental mathematical principles of the methodology used by TrueAllele. Rather, in cross-examining the People's witnesses, the defense focused on the fact that the TrueAllele system involves artificial intelligence in its mathematical application by drawing inferences on the data, the "black box" nature of the technology and the small number of independent validation studies conducted without Dr. Perlin's involvement. The defense also emphasized that most forensic laboratories still used the stochastic threshold model. Defendant elicited that some [*6]laboratory analysts lack a complete understanding of how the sophisticated mathematical system works and that Dr. Perlin characterized the "main barrier" to implementing the continuous method as the difficulty in educating analysts on the system to the degree they would be able to testify in court.[FN8]
Following the hearing, in a lengthy and detailed opinion, Supreme Court found that TrueAllele was generally accepted in the relevant scientific community and denied defendant's motion to preclude testimony and evidence derived therefrom (47 Misc 3d at 859). The court concluded that the quantitative analysis performed by TrueAllele had been empirically tested and found to be reliable and accurate, that it was the subject of publication in scientific journals and favorable peer review, that it was more efficacious than human review using the stochastic method, that it had been validated and the results were reproducible, and that the scientific and mathematical principles upon which it was based had been accepted by the relevant scientific community long ago. The court observed that, notwithstanding that TrueAllele had been in existence{**38 NY3d at 378} since 1999, there was a lack of negative critical work concerning its methods in the scientific community.
Prior to trial, defendant moved for disclosure of the source code in order "to meaningfully exercise his constitutional right to confront his accusers." He argued that the report generated by TrueAllele was testimonial, that the computer program was the functional equivalent of a laboratory analyst and that the source code was the witness that must be produced to satisfy his right to confrontation. He claimed that Perlin's "surrogate" trial testimony without disclosure of the source code was inadequate—"the TrueAllele Casework System source code itself, and not Dr. Perlin, is the declarant with whom [defendant] has a right to be confronted." The court denied the motion, finding that the source code was not a witness or testimonial in nature, and that defendant would have the opportunity to confront and cross-examine Dr. Perlin—the analyst and the developer of the software.
Defendant again raised his confrontation argument prior to Dr. Perlin's trial testimony, asserting that the TrueAllele Casework System was the witness and that he needed the source code to effectively cross-examine that witness. When the court questioned how one cross-examines a computer program, defendant represented that, once his experts had the opportunity to review the source code, he would then pose questions to Dr. Perlin based on the experts' review. The court denied the request, stating that the issue defense counsel raised was a discovery issue and that defendant's ability to cross-examine Dr. Perlin, the developer of the source code, satisfied his right to confrontation.
Both Dr. Perlin and the analyst from the Lab who conducted the electrophoresis testified at trial to the results of the DNA testing and statistical analysis.
Gary Skuse, Ph.D., a professor of biological sciences at the Rochester Institute of Technology, testified at trial as a defense witness. After reviewing 1,278 pages of documents [*7]relating to the DNA testing in this case, including the electropherograms, Skuse agreed with the DNA interpretation analysis that defendant's DNA was present in the mixtures found on the shirt collar and amplifier cord and that it was "most likely" present on the victim's forearm. Skuse opined, however, that the DNA may have been present as a result of secondary transfer (not directly from defendant).{**38 NY3d at 379}
The jury convicted defendant of murder in the first degree and robbery in the first degree.
The Appellate Division affirmed (175 AD3d 158 [3d Dept 2019]). The Court held that Supreme Court properly concluded that TrueAllele was generally accepted in the relevant scientific community and rejected defendant's argument that disclosure of the TrueAllele source code was required to properly conduct the Frye hearing. The Court noted that "[t]he record reflects that articles evaluating TrueAllele have been published in six separate forensics journals"; "TrueAllele had undergone approximately 25 validation studies, some of which appeared in peer-reviewed publications"; "[t]he DNA Subcommittee of the New York State Forensic Science Commission offered a binding recommendation that TrueAllele be used by the State Police for its forensic casework" and it was later approved by the full Commission; and "[a]t the time of the Frye hearing, TrueAllele had also been used in various states and had been deemed admissible in Virginia, Pennsylvania and California" (175 AD3d at 162-163). As to defendant's claim that the failure to disclose the TrueAllele source code violated his constitutional right to confrontation, the Court concluded that the report generated by TrueAllele was testimonial in nature, but that the source code was not a declarant. The Court cited Dr. Perlin's explanation at the Frye hearing that
"there is human input when utilizing TrueAllele[, including that] a human analyst tells the computer what to download and under what conditions to analyze the data, the analyst tells the computer what questions to ask when interpreting the data and the analyst downloads certain results from the computer, the analyst determines how many 'runs,' or cycles, of the data the system will complete and the analyst then makes comparisons to form the likelihood ratios" (175 AD3d at 169).
The Court further observed that defendant had the opportunity to confront Dr. Perlin, "his true accuser," at trial and that defendant did not preserve the alternative argument that the failure to disclose the source code impaired his ability to cross-examine Dr. Perlin as the declarant (175 AD3d at 170).
In a concurring opinion, one Justice opined that there was no need to address the relative merit of defendant's arguments as to the violation of his right to confrontation, in light of the absence of "any meaningful attempt by defendant to gain access {**38 NY3d at 380}to, or compel disclosure of, the source code prior to trial" (175 AD3d at 173).
A Judge of this Court granted defendant leave to appeal (35 NY3d 1097 [2020]) and we now affirm.
We must address whether the trial court abused its discretion in determining that TrueAllele "is not novel but instead is 'generally accepted' under the Frye standard" (47 Misc 3d at 859). [*8]Defendant argues that the evidence the People presented at the Frye hearing was insufficient because, absent disclosure of the TrueAllele source code for examination by the scientific community, its "proprietary black box technology" cannot be generally accepted as a matter of law.[FN9] He further asserts that, even if such technology could be generally accepted, the People failed to meet their burden at the hearing, given the dearth of independent validation as a result of Dr. Perlin's involvement in the large majority of studies produced at the hearing.
The well-known Frye test applied to the admissibility of novel scientific evidence (Frye v United States, 293 F 1013 [DC Cir 1923]) is "whether the accepted techniques, when properly performed, generate results accepted as reliable within the scientific community generally" (People v Wesley, 83 NY2d 417, 422 [1994]). General acceptance by the relevant scientific community, however, does not require that the procedure be " 'unanimously indorsed' " (83 NY2d at 423, quoting People v Middleton, 54 NY2d 42, 49 [1981]).
At issue in Wesley was the general acceptance of DNA evidence—specifically, the restriction fragment length polymorphism (RFLP) methodology, including the assessment of a visual "match" between DNA samples. There, in a concurring opinion, Chief Judge Kaye warned of the "pitfalls of self-validation by a small group" and urged caution in accepting technology that had been validated by individuals with a commercial or professional interest in promoting its use, "developed {**38 NY3d at 381}in commercial laboratories under conditions of secrecy, preventing emergence of independent views" and had not been peer-reviewed (83 NY2d at 439-440 [internal quotation marks and citations omitted]). Notwithstanding these concerns, Chief Judge Kaye ultimately agreed that, at the time the appeal was decided, "RFLP-based forensic analysis [was] generally accepted as reliable" (id. at 445) and those testing procedures were accepted as the standard methodology used in the scientific community until the advent of the PCR STR method used today.
[1] Here, the evidence presented at the Frye hearing established that the relevant scientific community generally accepted TrueAllele's DNA interpretation process and that the continuous probabilistic genotyping approach is more efficacious than human review of the same data using the stochastic threshold. It was undisputed that the foundational mathematical principles (MCMC and Bayes' theorem) are widely accepted in the scientific community. It was also undisputed that the relevant scientific community was fully represented by those persons and agencies who weighed in on the approach. Although the continuous probabilistic approach was not used in the majority of forensic crime laboratories at the time of the hearing, the methodology has been generally accepted in the relevant scientific community based on the empirical evidence of its validity, as demonstrated by multiple validation studies, including collaborative studies, peer-reviewed publications in scientific journals and its use in other jurisdictions. The empirical studies demonstrated TrueAllele's reliability, by deriving reproducible and accurate results from the interpretation of known DNA samples.
Defendant and the concurrence raise the legitimate concern that the technology at issue is proprietary and the developer of the software is involved in many of the validation studies. This skepticism, however, must be tempered by the import of the empirical evidence of reliability demonstrated here and the acceptance of the methodology by the relevant scientific community. First, Dr. Perlin's hearing testimony established that, in forensic science, "most validation studies are internal and they're not published," but the FBI's Quality Assurance Standards require that "a developmental validation for a particular {**38 NY3d at 382}technology" be published (see also NIST, DNA Mixture Interpretation: A NIST Scientific Foundation Review at 64 [NISTIR 8351-DRAFT June 2021], https://nvlpubs.nist.gov/nistpubs/ir/2021/NIST.IR.8351-draft.pdf). The interest of the developer was addressed at the Frye hearing in this case and, contrary to defendant's argument, Dr. Perlin's involvement in many of the validation studies does not preclude a determination of general acceptance as a matter of law.[FN10] The concurrence's claim to the contrary ignores that the performance of validation studies by Dr. Perlin, the State Lab and independent agencies was entirely consistent with the scientific method (see e.g. Executive Office of the President, President's Council of Advisors on Science and Technology, Report to the President: Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods at 46 [Sept. 2016] https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/PCAST/pcast_forensic_science_report_final.pdf [published after the Frye hearing was held]). Here, unlike in Wesley, there were developer and independent validation studies and laboratory internal validation studies, many published and peer-reviewed. 
The technology was approved for use by NIST and for forensic casework in New York by both the DNA Subcommittee and the full CFS—entities that were established, in part, in response to the concerns raised in Wesley as to the lack of independent scientific validation of DNA technology (Governor's Approval Mem, Bill Jacket, L 1994, ch 737; Executive Law § 995-b). Importantly, the Lab, in accepting the TrueAllele Casework System after approval by the CFS, must still conduct additional validation of the program and the hearing testimony indicates that they did so in this case. In addition, NIST's use of the TrueAllele system [*9]for its standard reference materials{**38 NY3d at 383} likewise demonstrates confidence within the relevant community that the system generates accurate results.[FN11]
Disclosure of the TrueAllele source code was not needed in order to establish at the Frye hearing the acceptance of the methodology by the relevant scientific community.[FN12] First, defendant's initial attempt to obtain the source code was made by a July 2014 supplemental demand under the former demand-discovery provision (former CPL 240.20). Defendant was not entitled to the source code under that provision, as the source code is not a "written report or document" made at the People's request for trial purposes and the proprietary information belonging to Cybergenetics was not in the People's possession or control (see former CPL 240.20 [1] [c]; former CPL 240.45; People v Washington, 86 NY2d 189 [1995]; compare People v DaGata, 86 NY2d 40 [1995] [error to deny defendant access to FBI notes relating to DNA testing not in the People's possession]). As we have previously explained, the former article 240 of the CPL was "a detailed discovery regimen" and "[i]tems not enumerated in article 240 [were] not discoverable as a matter of right unless constitutionally or otherwise specially mandated" (People v Colavito, 87 NY2d 423, 427 [1996]). Outside of his discovery demand, defendant made no further attempt to demonstrate a particularized need for the source code by motion to the court (see former CPL 240.40 [1] [c]).
Moreover, defendant's arguments as to why the source code had to be disclosed pay no heed to the empirical evidence in the validation studies of the reliability of the instrument or to the general acceptance of the methodology in the scientific{**38 NY3d at 384} community—the issue for the Frye hearing—and are directed more toward the foundational concern of whether the source code performed accurately and as intended (see Wesley, 83 NY2d at 429). To the extent the testimony at the hearing reflected that the TrueAllele Casework System may generate less reliable results when analyzing more complex mixtures (see also Executive Office of the President, President's Council of Advisors on Science and Technology, Report to the President: Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods at 80 [Sept. 2016], https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/PCAST/pcast_forensic_[*10]science_report_final.pdf [published after the Frye hearing was held]), defendant did not refine his challenge to address the general acceptance of TrueAllele on such complex mixtures or how that hypothesis would have been applicable to the particular facts of this case. As a result, it is unclear that any such objection would have been relevant to defendant's case, where the samples consisted largely of simple (two-contributor) mixtures with the victim as a known contributor (see also NIST, DNA Mixture Interpretation: A NIST Scientific Foundation Review at 3 [NISTIR 8351-DRAFT June 2021], https://nvlpubs.nist.gov/nistpubs/ir/2021/NIST.IR.8351-draft.pdf).
Defendant also argues that the source code for the software is the declarant and that, in the absence of disclosure of the source code, he was deprived of his Sixth Amendment right to confront the witness against him. He maintains that the TrueAllele system involves artificial intelligence and, to some extent, draws its own inferences from the data. He asserts that Dr. Perlin's testimony was therefore that of a surrogate, merely parroting the results of the analyst.
[2] Here, like the Lab reports on the generated DNA profiles, the report created by TrueAllele providing the likelihood ratio that defendant was a contributor to the DNA mixture profile found on the items of evidence is testimonial. The report was prepared by Cybergenetics at the request of the People for purposes of prosecuting defendant in a pending criminal proceeding. Indeed, the DNA results were sent to TrueAllele precisely because of its "more advanced approach to analyzing the DNA evidence"—i.e., its consideration of patterns and peaks below the stochastic threshold and ability to produce a{**38 NY3d at 385} higher match statistic. Therefore, the report satisfies our primary purpose test and was testimonial (see People v John, 27 NY3d at 303-308).[FN13]
However, we reject defendant's novel argument that the source code is the declarant. Even if the TrueAllele system is programmed to have some measure of "artificial intelligence," the source code is not an entity that can be cross-examined. "[T]he Confrontation Clause provides two types of protections for a criminal defendant: the right physically to face those who testify against him, and the right to conduct cross-examination" (Coy v Iowa, 487 US 1012, 1017 [1988], quoting Pennsylvania v Ritchie, 480 US 39, 51 [1987 plurality]). The essential purpose of the provision was to ensure
" 'a personal examination and cross-examination of the witness in which the accused has an opportunity, not only of testing the recollection and sifting the conscience of the witness, but of compelling him to stand face to face with the jury in order that they may look at him, and judge by his demeanor upon the stand and the manner in which he gives his testimony whether he is worthy of belief' " (California v Green, 399 US 149, 157-158 [1970], quoting Mattox v United States, 156 US 237, 242-243 [1895]).
In Bullcoming v New Mexico, the United States Supreme Court addressed an argument that a laboratory report could be introduced into evidence through the testimony of an analyst who did not personally perform or observe the test because the gas chromatograph machine, used to analyze the blood alcohol content of the accused's blood sample, was the "true accuser" and the analyst who [*11]ran the test was a "mere scrivener" (564 US 647, 659 [2011] [internal quotation marks omitted]). The Court did not expressly address the concept that a machine can be a declarant, but rejected it sub silentio. Instead, the Court focused on the actions taken by the analyst who operated the machine that would be the appropriate subject of cross-examination—e.g., that the blood sample was received in an intact condition, that a particular test was performed on the sample number that corresponded to the case and that the test{**38 NY3d at 386} was performed according to protocol (see 564 US at 660). In other words, the analyst "certified to more than a machine-generated number" (564 US at 661; see also John, 27 NY3d at 311 [declining to "indulge in the science fiction that DNA evidence is merely machine-generated, a concept that reduces DNA testing to an automated exercise requiring no skill set or application of expertise or judgment"]). Similarly, here, the instrument performs its quantitative analysis on electronic data generated by the Lab during the electrophoresis process only after the analyst sets the parameters following a human review of the data. And both the analyst who performed the electrophoresis on the DNA samples and Dr. Perlin, who fully understood the parameters and methodology of the TrueAllele software in its DNA interpretation processes, testified at trial and were subject to cross-examination.
We agree with the Appellate Division that defendant failed to preserve the separate argument that he was entitled to disclosure of the source code in order to fully cross-examine Dr. Perlin as the declarant at trial and, regardless, defendant's argument suffers from the same defect as the request for the source code for purposes of the Frye hearing. Defendant was not entitled to the source code under the former demand discovery statute. After the People refused the demand, defendant failed to make any further attempt to demonstrate a particularized need for the source code by motion to the court (see former CPL 240.40 [1] [c]; CPL 245.30 [3]).
Defendant's remaining arguments, including the arguments raised in his pro se brief, are without merit.
Accordingly, the order of the Appellate Division should be affirmed.
Rivera, J. (concurring in result).
Traditional DNA testing employs human analysis and judgment (see Erin Murphy, The Art in the Science of DNA: A Layperson's Guide to the Subjectivity Inherent in Forensic DNA Typing, 58 Emory LJ 489, 501-508 [2008]). As a consequence, test results may reflect human error and bias, especially in cases involving mixtures of small amounts of DNA from two or more individuals (see e.g. John M. Butler et al., NIST Interlaboratory Studies Involving DNA Mixtures [MIX05 and MIX13]: Variation Observed and Lessons Learned, 37 Forensic Sci Intl: Genetics 81 [2018]; Itiel E. Dror & Greg Hampikian, Subjectivity and Bias in Forensic DNA Mixture Interpretation, 51 Sci & Just 204 [2011]; see also Saul {**38 NY3d at 387}M. Kassin et al., The Forensic Confirmation Bias: Problems, Perspectives, and Proposed Solutions, 2 J Applied Rsch Memory & Cognition 42 [2013]). Enter probabilistic genotyping software—here, a platform branded TrueAllele—a computer-based form of DNA analysis dependent on complex mathematical models and artificial intelligence to derive DNA profiles from complex, multi-person mixtures. While TrueAllele does not completely replace humans, it comes pretty close by eliminating all but the most rudimentary of human participation.
At defendant's trial, the court admitted DNA results developed using the TrueAllele methodology, even though at the time its source code and underlying algorithms were kept from [*12]independent evaluators and the defense as trade secrets. The court's decision was an abuse of discretion as a matter of law because it relied on validation studies by interested parties and evaluations founded on incomplete information about TrueAllele's computer-based methodology. Without defense counsel and objective, expert third-party access to and evaluation of the underlying algorithms and source code, the court could not conclude that TrueAllele's brand of probabilistic genotyping was generally accepted within the forensic science community. However, I concur in the result and would affirm defendant's conviction because the evidence against him was overwhelming and the error, even when considered with the other alleged trial errors raised on this appeal, could not have contributed to the guilty verdict.
In 2014, prior to defendant's trial for murder and robbery, the court held a Frye hearing on the admissibility of the prosecutor's DNA evidence developed by use of a proprietary probabilistic genotyping methodology called TrueAllele. The prosecutor sought to establish the likelihood that defendant committed the crimes with results based on TrueAllele's computerized interpretive analysis of mixtures of more than one individual's DNA. In preparation for the hearing, defendant requested but was denied access to TrueAllele's source code, which would have revealed the underlying algorithmic assumptions that replace human judgment.
The majority adequately describes the science-based, analytical challenges presented by mixed DNA samples, like those in{**38 NY3d at 388} issue at defendant's trial, and I do not repeat them here (see majority op at 374-376). Instead, I focus on the testimonial description of TrueAllele and the studies relied upon by the prosecution and referenced by the majority to support its conclusion that, at the time of the Frye hearing, TrueAllele was generally accepted within the relevant scientific community of forensic scientists.
The most revealing and significant hearing testimony came from Dr. Mark Perlin, M.D., Ph.D., the founder of Cybergenetics and the lead developer of TrueAllele. He testified that TrueAllele Casework was first developed in 1999, after he had already developed and commercialized TrueAllele Databank, a DNA database software.[FN1] Dr. Perlin stated that TrueAllele is an "expert system," defined as a "computer system that replicates human expertise," usually using artificial intelligence, that makes its own inferences. TrueAllele uses a probabilistic genotyping model, which calculates the probabilities that specific allele pairs belong to the genotype of a particular individual in a mixed, low-template DNA sample. Put another way, TrueAllele sifts through scattered genes in a DNA sample and attempts to reconstruct them into a potential genotype, picking from algorithms, equations, and assumptions that can help it to reach a valid conclusion. The core of TrueAllele is a Markov chain Monte Carlo (MCMC) algorithm, an often-used statistical model first developed in the 1950s that determines the probabilities of all the different possibilities that explain a set of data.
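By way of illustration only, the general idea of a Markov chain Monte Carlo calculation can be sketched in a few lines of Python. The sketch is not TrueAllele's code, which was not disclosed; the candidate weights are hypothetical, and the sampler is a textbook Metropolis scheme chosen for simplicity. It estimates the probability of each candidate explanation by repeatedly proposing a candidate and accepting or rejecting it according to how well it explains the data relative to the current one.

```python
import random

def metropolis_probabilities(weights, steps=100_000, seed=0):
    """Estimate normalized probabilities of discrete candidates by Metropolis sampling.

    weights: hypothetical positive, unnormalized scores, one per candidate explanation.
    """
    rng = random.Random(seed)
    n = len(weights)
    current = 0
    counts = [0] * n
    for _ in range(steps):
        proposal = rng.randrange(n)  # symmetric proposal: any candidate, uniformly
        # Accept with probability min(1, weights[proposal] / weights[current]).
        if rng.random() < weights[proposal] / weights[current]:
            current = proposal
        counts[current] += 1
    # The fraction of steps spent at each candidate approximates its probability.
    return [c / steps for c in counts]

# Hypothetical scores for three candidate allele pairs at one locus.
candidate_weights = [1.0, 3.0, 6.0]
estimates = metropolis_probabilities(candidate_weights)
```

With enough steps, the estimates approach the normalized weights (here 0.1, 0.3 and 0.6). How many steps are enough is itself a judgment, which is consistent with the testimony that analysts may request additional MCMC calculations.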
TrueAllele's probabilistic genotyping model, according to Dr. Perlin, was developed in the context of well-known flaws in the field of stochastic, or threshold, modeling in DNA interpretation. Thus, TrueAllele was based on "a need to use normative statistics and mathematical analysis that's [*13]used in all other fields where you use all the data, it's examined objectively, you build and validate and test statistical models, use computers and accept that reliable results . . . can be learned, taught and used."
Following the MCMC calculation, TrueAllele "reports out what the genotype is for each contributor at each locus" by "looking at . . . the probabilities across all the allele pairs at [a] locus for . . . one contributor without particular regard to {**38 NY3d at 389}any particular allele pair." TrueAllele also accounts for various variables and determines which of those variables to apply in its analysis. Finally—and separately from its inferential processes—TrueAllele compares the "inferred" genotypes against a random population to produce a likelihood ratio based upon a known reference sample, using standard methods of DNA comparison.
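The comparison step described in the testimony can be illustrated, under simplifying assumptions, with a short Python sketch. It treats the likelihood ratio at each locus as the inferred probability of the reference genotype divided by that genotype's frequency in a comparison population, and multiplies the per-locus ratios together on the assumption that the loci are independent. The function name, locus names, and numbers are hypothetical, and nothing here is drawn from TrueAllele's undisclosed source code.

```python
from math import prod

def likelihood_ratio(inferred, population, reference):
    """Multiply per-locus ratios of inferred probability to population frequency.

    inferred:   {locus: {genotype: probability inferred from the mixture}}
    population: {locus: {genotype: frequency in the comparison population}}
    reference:  {locus: genotype of the known reference sample}
    """
    return prod(
        inferred[locus].get(genotype, 0.0) / population[locus][genotype]
        for locus, genotype in reference.items()
    )

# Hypothetical two-locus example; genotypes are written as allele pairs.
inferred = {
    "locus_1": {(12, 14): 0.5, (12, 12): 0.5},
    "locus_2": {(8, 9): 0.25, (9, 9): 0.75},
}
population = {
    "locus_1": {(12, 14): 0.125, (12, 12): 0.0625},
    "locus_2": {(8, 9): 0.0625, (9, 9): 0.25},
}
reference = {"locus_1": (12, 14), "locus_2": (8, 9)}

lr = likelihood_ratio(inferred, population, reference)  # (0.5 / 0.125) * (0.25 / 0.0625)
```

A ratio above 1 favors the hypothesis that the reference individual contributed to the sample, and a ratio below 1 favors the alternative. Because the per-locus ratios multiply, small differences in the inferred probabilities can compound across loci.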
Dr. Perlin described an analyst's process of using TrueAllele as follows. Initially, raw data from a .FSA file, produced after an electropherogram is generated in a laboratory from DNA samples, are imported into TrueAllele, at which point TrueAllele analyzes the data "in order to determine the size and heights of the peaks to develop quantitative data." Then, an analyst asks the computer a question, such as "[a]ssuming that there are two contributors, what are the genotypes of those two contributors." In general, those questions are preset for ease of use, but trained analysts have the option to ask more complex questions and set additional parameters. After the analyst asks the computer questions, the computer conducts the relevant calculations, drawing on its algorithm and programming to help it determine which equations or models it should use. Generally, as part of the question, the analyst will tell TrueAllele to run a particular number of MCMC calculations. Dr. Perlin testified that the number of calculations can vary depending on the amount of time and computer resources allocated to the case or whether TrueAllele is working with an "easy mixture[ ]." Using TrueAllele, "[t]he computer does the work, we don't, we just set up the variables. And the amount of computation is proportional to the number of variables that you're considering. So . . . if it's thinking a lot harder then it's going to take longer." The analyst also looks for concordance between genotypes across each run of the software and can ask for additional MCMC calculations if necessary. Finally, the software "produces probabilities and then it writes them down." Thereafter, an analyst can compare the genotype(s) inferred by TrueAllele against known reference samples or DNA databases.
According to Dr. Perlin, unlike threshold DNA models, which exclude data from an analyst's review, TrueAllele is fully continuous, in that it uses "the peak height data and the patterns of those peak rise[s]." In contrast, using a "threshold model," a human analyst "get[s] a fixed list of possible alleles, and then by looking at all pairwise combinations of those alleles, those will be included genotypes." Generally, an analyst {**38 NY3d at 390}will input "the alleles or the pu[t]ative alleles . . . into software by hand and then the computer calculates one over the square of the frequencies of those alleles, the sum of it, and then produces a match statistic."
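The threshold calculation Dr. Perlin described corresponds to what is commonly called the combined probability of inclusion. As a minimal sketch, assuming the standard textbook formula (at each locus, the square of the summed frequencies of the alleles observed in the mixture, multiplied across loci, with the match statistic reported as the reciprocal), the arithmetic might look as follows in Python. The allele frequencies and locus names are hypothetical.

```python
from math import prod

def combined_probability_of_inclusion(observed_allele_freqs):
    """Return the CPI: product over loci of (sum of observed allele frequencies) squared."""
    return prod(sum(freqs) ** 2 for freqs in observed_allele_freqs.values())

# Hypothetical population frequencies of the alleles detected at each locus.
observed = {
    "locus_1": [0.125, 0.125, 0.25],  # sums to 0.5
    "locus_2": [0.25, 0.25],          # sums to 0.5
}

cpi = combined_probability_of_inclusion(observed)  # 0.25 * 0.25
match_statistic = 1 / cpi  # reported as "1 in N"
```

Note that the calculation uses only which alleles cleared the threshold, not their peak heights, which is the distinction from a fully continuous model that the testimony draws.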
In comparing TrueAllele to threshold methods of DNA analysis, Dr. Perlin explained that "the computer can do it or the lab can do it. . . . [H]aving a better way of understanding the data through statistical modeling and computer interpretation is a way of looking at the data you have and extracting more information from it." Computer analysis can thus go beyond human review of a sample using methods and calculations that a human would be unable to replicate. Essentially, this testimony revealed that TrueAllele extrapolates and processes data to make judgments that supplant human choice, leading to conclusions based on assumptions generated by artificial intelligence.
[*14]Dr. Perlin asserted that TrueAllele had gone through 25 unique version numbers for its software since 1999 and had 170,000 lines of source code, inclusive of "the user interface, the way it interacts with databases, [and] the way it solves problems."[FN2] TrueAllele had last received a significant update in 2008, when Dr. Perlin and his team "put in more hierarchal modeling to account for even more of the variation" in inferring genotypes. Dr. Perlin testified that the development team "made no effort to preserve" earlier versions of TrueAllele.
TrueAllele underwent an initial validation study in the early 2000s. The software was first used in a criminal trial in Pennsylvania in 2009, and was subsequently found admissible{**38 NY3d at 391} in Virginia, California, Northern Ireland, and Australia. The New York State Police Forensic Investigation Center was in the process of onboarding TrueAllele, and a state and international crime lab had both fully adopted TrueAllele for forensic work. Seven other labs had also purchased the TrueAllele software package. The DNA Subcommittee of the New York State Commission on Forensic Sciences and the full Commission had approved TrueAllele for laboratory use. Likewise, the National Institute of Standards and Technology (NIST) had used TrueAllele to verify one of its reference material kits. NIST had also hosted conferences and meetings to encourage laboratories to adopt probabilistic genotyping models like TrueAllele.
Dr. Perlin testified that TrueAllele had undergone approximately 25 validation studies, resulting in several peer-reviewed publications and some preprints that had not yet been published at the time of the hearing. Seven of those peer-reviewed studies and two non-peer-reviewed validation studies were entered into evidence.[FN3] Dr. Perlin either conducted or participated in all of these studies. He [*15]also explained that the source code was a "trade secret," but that the basic mathematical equations underlying TrueAllele were published in the literature. However, "[t]he engineering of the elaboration of [the equations] in more levels of hierarchy" was proprietary. {**38 NY3d at 392}Dr. Perlin argued that source code validation was "not needed" because, in his view, "[t]he question when conducting a validation study had to do with the reliability of the system, the reliability of the software, not the text of the code that underlies that reliable system."
Barry Duceman, Ph.D., who was then the Director of the Biological Science Section of FIC, testified that he was in charge of implementing and validating TrueAllele's Databank and Casework systems and obtaining the State Forensic Commission's approval for its use. Dr. Duceman testified to TrueAllele's "increased sensitivity" and "stronger statistical measure," meaning that TrueAllele can do what a human cannot or will not. He also testified to the threshold approaches currently used in DNA testing, explaining that they resulted in data being excluded from analysis. However, Dr. Duceman testified that his laboratory had yet to use TrueAllele because they were still running further validation studies.
Jay Caponera, a Forensic Scientist 3 with FIC, testified that he had conducted internal validation studies on TrueAllele for FIC. Caponera also discussed his training on the use of TrueAllele and testified to attending conferences hosted by NIST on probabilistic genotyping and the use of TrueAllele in other jurisdictions, as well as his review of the scientific literature. Caponera did not have access to the source code.
The court concluded that TrueAllele was generally accepted in the scientific community, based largely on the validation studies and its use in some laboratories and criminal cases in the United States and abroad (47 Misc 3d 850 [Sup Ct, Schenectady County 2015]).
[*16]The victim was discovered in his home, facedown and propped against the couch. His laptop and video game console were missing. At trial, the prosecutor argued that defendant strangled the victim and stole his property to sell for drugs. Undisputed evidence established that the victim had died of asphyxia due to ligature strangulation from a guitar cord, and that defendant and victim were known to each other and were together at the victim's home just before the estimated time of the murder.
The prosecution presented testimonial and physical evidence of defendant's guilt. One witness testified that she knew defendant{**38 NY3d at 393} regularly bought drugs with money and by trading property, and that a few months after the victim's death, defendant, while at a "crack house," told her he had a laptop and video game system to trade for drugs, and that she saw him with a duffel bag which she identified as the same bag the victim was holding in a photograph taken years prior. The prosecutor also presented evidence that the police recovered the receipt for the video game console, and a former employee from the maker confirmed that the console went online five days after the victim's death and that the IP address for the console matched the address of a person who another witness identified as a known drug dealer who had supplied one of the drug houses frequented by defendant.
Three witnesses testified that defendant admitted to killing the victim. One witness testified that he knew defendant for decades and that a few days after the murder, defendant showed him a marijuana pipe which he said he took from a man "he had choked and took out" with a guitar cord. The witness came forward to the police months later, after he heard about the murder and a reward for information about the crime. The two other witnesses testified that defendant separately confessed to them when they were each incarcerated with defendant. One of those witnesses testified that defendant admitted to going to the victim's apartment to steal his musical equipment and then killed him when things got out of hand. The other witness testified that he had known defendant for years and that they had smoked crack together. Defendant admitted to killing the victim and stealing his laptop, gaming system, and other items, which he sold for drugs. Defendant also provided the witness with details of his interaction with the victim the evening before the murder, including the type of liquor they were drinking in the apartment and descriptions of the two other people who were present. The court admitted evidence from the victim's garbage of a liquor bottle with defendant's DNA on it which also matched the slang name of the liquor mentioned by defendant. The two individuals who had been with defendant and the victim the night before the victim was killed also testified to seeing defendant drinking alcohol at the victim's apartment.
The prosecution's additional conventional DNA analysis further inculpated defendant. According to the results, neither defendant nor the victim could be excluded from a sample obtained from the rear collar of the victim's shirt containing a{**38 NY3d at 394} mixed profile, with a combined probability of inclusion (CPI) of 1 in 1,088. A DNA sample obtained from a swab taken from the victim's right dorsal forearm was also found to have a CPI of 1 in 422 with regard to defendant and the victim.
As further proof of defendant's guilt, Dr. Perlin testified that, based on his analysis using the TrueAllele methodology, it was 5.88 billion times more probable that defendant contributed to DNA obtained from the guitar cord than an unrelated African American person, 170 quintillion times more probable that defendant contributed to DNA obtained from the back of decedent's shirt collar than an unrelated African American person, 303 billion times more probable that defendant contributed [*17]to DNA obtained from the front of decedent's shirt collar than an unrelated African American person, and 56.1 million times more probable that defendant contributed to DNA obtained from a swab of the cut on decedent's forearm than an unrelated African American person.
Defendant argued that he was entitled to TrueAllele's source code at trial in order to challenge the DNA evidence pursuant to his Sixth Amendment right to confrontation. The court denied the request, concluding that the source code was neither a witness nor testimonial in nature under the Confrontation Clause.
The jury convicted defendant of one count each of first-degree murder and robbery, and the court sentenced him to life imprisonment without parole. The Appellate Division affirmed the judgment of conviction (175 AD3d 158 [3d Dept 2019]). A Judge of this Court granted defendant leave to appeal (35 NY3d 1097 [2020]). I would affirm, but not for the reasons set forth in the majority analysis. Here, the court erred in admitting the TrueAllele results but the error, either alone or considered with defendant's claims of other alleged errors, was harmless.
"The admissibility and scope of expert testimony are subject to the discretion of the trial court, limiting our scope of review to whether the determination to exclude the proffered expert testimony was an abuse of that discretion as a matter of law" (People v Powell, 37 NY3d 476, 489 [2021] [citation omitted], citing People v Lee, 96 NY2d 157, 162 [2001]). New York courts follow the approach set out in Frye v United States (293 F 1013 [DC Cir 1923]), which asks "whether the accepted techniques, when properly performed, generate results accepted as reliable{**38 NY3d at 395} within the scientific community generally" (People v Wesley, 83 NY2d 417, 422 [1994]). "In determining whether . . . DNA profiling evidence [is] properly admissible, attention must focus on the acceptance of such evidence as reliable by the relevant scientific community" (id.). Under Frye, there must be "consensus[,] [which] has been described as 'a surrogate for determining the reliability of a purported scientific methodology' " (People v Williams, 35 NY3d 24, 37 [2020], quoting Michael M. Martin et al., New York Evidence Handbook § 7.2.3 at 644 [1997]).
"[T]he particular procedure need not be 'unanimously indorsed' by the scientific community but must be 'generally accept[ed] as reliable' " (Wesley, 83 NY2d at 423, quoting People v Middleton, 54 NY2d 42, 49 [1981]). However, "[a] showing that an expert's opinion has 'some support' is not sufficient to establish general acceptance in the relevant scientific community" (Williams, 35 NY3d at 37, quoting Cornell v 360 W. 51st St. Realty, LLC, 22 NY3d 762, 783 [2014]). A novel method "should be supported by those with no professional interest in its acceptance" because "Frye demands an objective, unbiased review" (id. at 42). Indeed, "a proprietary program exclusively developed and controlled by" one individual "is not 'an appropriate substitute for the thoughtful exchange of ideas . . . envisioned by Frye.' It is an invitation to bias" (id. at 41 [citation omitted], quoting Wesley, 83 NY2d at 441 [Kaye, Ch. J., concurring]).
Defendant is correct that TrueAllele's proprietary algorithm was not generally accepted because its source code had not been tested and assessed as reliable by independent third parties within the relevant forensic scientific community.[FN4] {**38 NY3d at 396}Although Dr. Perlin has since offered to release [*18]the source code in other criminal proceedings, at the time of the Frye hearing here he asserted that TrueAllele's source code was a trade secret and refused to turn it over to defendant (see e.g. State v Simmer, 304 Neb 369, 380, 935 NW2d 167, 177 [2019] [noting that "Cybergenetics had recently decided to allow defense experts access to the TrueAllele source code, with limitations"]; State v Baugh, 2019 Ga Super LEXIS 418, *18 [Apr. 29, 2019, No. 2017-CR-618, Palmer, J.] ["Dr. Perlin explained that approximately two years ago he agreed to disclose TrueAllele's source code under specific conditions"]; but see United States v Ellis, 2021 US Dist LEXIS 36176, *1 [WD Pa, Feb. 26, 2021, No. 19-369, Ambrose, J.] ["Cybergenetics was not willing to disclose the source code"], reconsideration denied 2021 WL 1600711, 2021 US Dist LEXIS 78212 [Apr. 23, 2021]). Because there was no opportunity for members of the relevant scientific community to review the source code, and specific software using a complex algorithm cannot be deemed reliable in the scientific community without an independent review of how the software reaches its conclusions—including the inferences made by artificial intelligence—the prosecution failed to satisfy its burden at the Frye hearing. As the President's Council of Advisors on Science and Technology (PCAST) has made clear, while "probabilistic genotyping software programs clearly represent a major improvement over purely subjective interpretation[,] . . . they still require careful scrutiny to determine . . . whether the software correctly implements the methods" (PCAST, Report to the President, Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods 79 [2016], available at{**38 NY3d at 397} https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/PCAST/pcast_forensic_science_report_final.pdf [hereinafter PCAST Report];[FN5] see also State v Pickett, 466 NJ Super 270, [*19]305-306, 246 A3d 279, 301 [2021] ["Allowing independent access to the requested information, for the sole purpose of addressing whether the technology underlying the expert testimony is reliable—specifically, whether the source code for that technology is properly implementing the program's design specifications—is obvious"]).[FN6] Just as a machine may need to be recalibrated and tested regularly for accuracy, the Frye hearing evidence failed to establish that TrueAllele is generally accepted within the forensic scientific community based on its inferential assumptions and its applications of general mathematical principles.
To be sure, Dr. Perlin had published the general mathematical principles behind TrueAllele in the scientific literature at the time of the hearing. But those general principles alone could not serve to validate TrueAllele's unique, proprietary application and inferences that are the basis for its ultimate DNA matching determinations. As Dr. Perlin testified, TrueAllele goes beyond human capabilities in conducting calculations{**38 NY3d at 398} that would be nearly impossible to do by hand; it also substitutes for humans in that it chooses which of its models, algorithms, and equations to run based on the questions posed by human analysts. So while the mathematical principles—such as MCMC—may be commonly accepted and understood, the inferences and interpretive efforts made by TrueAllele that are subject to challenge were unknown and unknowable at the time of the hearing because of Dr. Perlin's refusal to turn over the source code. In effect, disclosure of the formulas, but not the source code, is analogous to a student pointing to the correct equation to use and giving an answer on a mathematics exam, without showing their work. In that case, however, the teacher knows the correct answer to the exam question, notwithstanding whether the student reached the answer incorrectly; here, there is no "generally accepted" correct answer—as a scientific matter—unless the relevant scientific community can determine that TrueAllele was making appropriate choices when doing the math. Put another way, general principles cannot displace the need for an independent source code review of software that makes its own decisions and inferences [*20]and where one error in a decimal place or a missing parenthesis, or a choice to assume the existence of an allele based on the computer's judgment of an acceptable probability, can mean the difference between inclusion and exclusion of a person as a DNA donor. 
Therefore, the court abused its discretion as a matter of law by admitting the results of TrueAllele's analysis at trial.
The majority's view that the Frye hearing record established acceptance within the scientific community is based on an overly favorable but unsupported view of the testimony and documentary evidence. First, contrary to the majority's suggestion, there is no meaningful involvement by the analyst (see majority op at 375, 398). Dr. Perlin testified that TrueAllele "does the work, we don't" and that it relies on its programming to determine the particular analytical tools that are appropriate based upon the data entered into the system. That means there is all the more reason to provide defendant with access to the source code—the algorithm makes the critical decisions and a human being is an "analyst" in name only, as they merely provide the data that TrueAllele computes and probes. Indeed, this is TrueAllele's defining service (see Kwong at 283 ["Dr. Perlin published an article that called the interpretation method used by most laboratories a 'random generator' and argues in interviews that his company's technology produces{**38 NY3d at 399} better probability measurements" (footnote omitted)]).
Second, the majority places undue reliance on evaluations by regulatory entities that had only partial information; the widely accepted nature of TrueAllele's underlying mathematical principles; validation studies in which Dr. Perlin was conflicted due to his financial interest in TrueAllele; and the perceived efficacy of probabilistic genotyping (see majority op at 381-383). We have previously rejected or been skeptical of each of these types of "supporting evidence." In Williams, we made clear that "insular endorsement[s]" by executive branch agencies—here, the DNA Subcommittee of the New York State Forensic Science Committee and NIST—cannot "supplant the courts' obligation to ensure" that a scientific technique has gained general acceptance within the scientific community (35 NY3d at 41). Notably, the reliance on these agencies by the majority—and a description of them as "the relevant scientific community" for the purposes of the Frye hearing (majority op at 375-376)—is a jarring turnabout for the Court, given that it is the same view unsuccessfully advocated by a minority in Williams two years ago (see Williams, 35 NY3d at 49-50 and n 1 [DiFiore, Ch. J., concurring]). That is particularly true in this appeal because Dr. Perlin criticized the PCAST Report, arguing that "NIST lacks expertise in modern statistical analysis"—which would presumably be a prerequisite to being a part of the relevant scientific community that is qualified to evaluate TrueAllele (Letter from Dr. Mark Perlin, Chief Scientific and Executive Officer, Cybergenetics, to Dr. John Holdren, PCAST Co-Chair, Sept. 16, 2016 at 3, available at https://www.cybgen.com/information/newsroom/2016/sep/files/letter.pdf; see Kwong at 289).
The majority—going further—also refers to the American National Standards Institute (ANSI) as being part of the relevant scientific community (see majority op at 375-376). ANSI, however, is a standards-development organization (SDO), a private industry group that writes "voluntary consensus standards" for use in technological applications (see American Socy. for Testing & Materials v Public.Resource.Org, Inc., 896 F3d 437, 440-441 [DC Cir 2018] [discussing role of SDOs in promulgating standards]; Robert W. Hamilton, The Role of Nongovernmental Standards in the Development of Mandatory Federal Standards Affecting Safety or Health, 56 Tex L Rev 1329, 1341-1368 [1978] [discussing ANSI's history and its procedures for developing industry standards]; see generally Harm Schepel, The Constitution of Private Governance: Product{**38 NY3d at 400} Standards in the Regulation of Integrating Markets 145-152 [2005] [discussing the history of SDOs in the United [*21]States]). Including such organizations within the "relevant scientific community," as the majority does, raises similar concerns about professional interest and profit motive analogous to Dr. Perlin's involvement in the validation studies here (see infra at 400-401). Indeed, Dr. Perlin was on a DNA working group that assisted in the development of an NIST standard approved by ANSI that recognized probabilistic genotyping "as a valid approach to DNA interpretation and reporting" (Perlin et al., New York State TrueAllele Casework Validation Study at 1458, citing NIST Special Publication 500-290, ANSI/NIST-ITL 1-2011, American National Standard for Information Systems: Data Format for the Interchange of Fingerprint, Facial & Other Biometric Information [July 2012], available at https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=910136).
In any case, without the source code, the agencies could not adequately evaluate the use of TrueAllele for this type of DNA mixture analysis, and Dr. Perlin's self-serving statements to the contrary were insufficient to satisfy the prosecution's burden. Williams also rejected the notion that the use of known mathematical principles could satisfy the Frye standard, explaining that such reasoning cannot overcome "the proprietary nature of the [software] or the relatively narrow subsection of the relevant scientific community able to examine and endorse that tool" (Williams, 35 NY3d at 42).
Third, as for the validation studies, the majority acknowledges that Dr. Perlin conducted or was involved in the majority of them. Given Dr. Perlin's "professional interest in [TrueAllele's] acceptance" (id.), the studies must be viewed with a healthy dose of skepticism and cannot support a conclusion of general acceptance. The majority attempts to minimize this obvious conflict of interest with an unsupported statement that internal validation studies and validation studies conducted by laboratories using the software are "entirely consistent with the scientific method" and because Dr. Perlin's interest was raised at the hearing (majority op at 382).[FN7] Contrary to the majority's view, such conflicts do present serious concerns about the integrity of the studies (see Wesley, 83 NY2d at 441 [Kaye,{**38 NY3d at 401} Ch. J., concurring]). As the literature shows, conflicts have long been a source of concern within the scientific community and the standard procedure is to disclose such financial interests, not to assume, as the majority does, that they have little impact on the study itself (see e.g. Singapore Statement on Research Integrity, World Conferences on Research Integrity [2010], https://wcrif.org/statement; Sheldon Krimsky & L.S. Rothenberg, Financial Interest and Its Disclosure in Scientific Publications, 280 JAMA 225 [1998]; Jerome P. Kassirer & Marcia Angell, Financial Conflicts of Interest in Biomedical Research, 329 New Eng J Med 570 [1993]). Put another way, mere disclosure of a conflict of interest and subsequent peer review cannot overcome the potential for bias. Dr. Perlin's professional interest in a favorable assessment of TrueAllele was raised at the hearing, but its significance was discounted, and the validation studies were treated as if conducted by disinterested third parties, confirming, rather than refuting, the court's abuse of its discretion. The remaining two validation studies not involving Dr. Perlin are similarly of little value because they were conducted by the New York State Police FIC, which presented a different form of conflict, as they had already invested significant time and money in adopting TrueAllele for internal use.
Additionally, the majority is incorrect that peer review and publication in a scientific journal implies that "the empirical evidence of the reliability of TrueAllele was not disputed" (majority op [*22]at 376). Peer review, while a prerequisite for publication, does not imply that the relevant scientific community has generally accepted a scientific method as reliable. Indeed, as the United States Supreme Court explained in Daubert v Merrell Dow Pharmaceuticals, Inc.:
"Publication (which is but one element of peer review) is not a sine qua non of admissibility; it does not necessarily correlate with reliability, and in some instances well-grounded but innovative theories will not have been published. Some propositions, moreover, are too particular, too new, or of too limited interest to be published. But submission to the scrutiny of the scientific community is a component of 'good science,' in part because it increases the likelihood that substantive flaws in methodology will be detected. The fact of publication (or lack thereof) in a peer reviewed journal thus will be a relevant, though not dispositive,{**38 NY3d at 402} consideration in assessing the scientific validity of a particular technique or methodology on which an opinion is premised" (509 US 579, 593-594 [1993] [citations omitted]; see e.g. David H. Kaye et al., The New Wigmore: Expert Evidence § 8.3.2 [b] [2] [3d ed, 2022 Supp] ["(P)eer review merely signifies that an article has been subject to some basic scrutiny for methodological and logical flaws. In even the best journals, that scrutiny sometimes is inadequate, and peer review is neither a necessary nor a sufficient condition for demonstrating the validity of a theory or technique" (footnotes omitted)]).
Thus, while peer review might be an appropriate factor for consideration under the federal Daubert standard, it is insufficient under Frye (cf. Powell, 37 NY3d at 514 [Rivera, J., dissenting]).
In sum, the majority is incorrect that "[i]t was . . . undisputed that the relevant scientific community was fully represented by those persons and agencies who weighed in on the [TrueAllele] approach" (majority op at 381). The prosecution's case at the Frye hearing turned on Dr. Perlin's validation studies, the opinions of executive agencies and private entities, and the mere fact of peer-reviewed publication, none of which are adequate for the reasons I have discussed.
We are left, then, only with the majority's statement that probabilistic genotyping is more efficacious than other DNA typing. That reasoning is circular because the point of accessing the source code is to evaluate the programming and algorithm and to test the very conclusion that TrueAllele is able to provide a reliable assessment of whether and in what ratio an individual's DNA is likely found in DNA mixtures from a crime scene. On this point, the court below and the majority make the same mistake—assuming that if the system appears reliable then we need not know how the system reached its conclusions (see id. at 381 ["The empirical studies demonstrated TrueAllele's reliability, by deriving reproducible and accurate results from the interpretation of known DNA samples"]). That is contrary to the basic scientific method, which explores not only the end result but the how and the why of an outcome, or "whether the source code performed accurately and as intended" (id. at 383-384). That is no less true in forensic science where, as Dr. Perlin testified, human choice can be based on bias and error. In order to ensure that TrueAllele{**38 NY3d at 403}—created by a human—does not replace one form of bias or flawed assumption with another, third-party independent reviewers must be provided with the source code. As the District Attorney explains,[*23]"[t]he development of the computer inferred probability that an allele is present at a specific locus is the major difference between TrueAllele and a human analyst." That difference is critical to results that either include or exclude an individual as a DNA mixture contributor. Although a human analyst can be asked why they decided a peak was or was not present, or why they believed they could not make a reliable determination of the existence of a peak based on the available data, TrueAllele uses statistical modeling to "infer[ ] the probability that the peak is present." 
Without the source code, no independent third party or defendant could challenge TrueAllele's assumptions for what is essentially a mathematical guess—a computer-run, theoretically-based conclusion, but still a guess.
Defendant further argues that the trial court's denial of his request for the source code so that an expert could review it was a violation of his constitutional right to confrontation. The Sixth Amendment Confrontation Clause provides that, " '[i]n all criminal prosecutions, the accused shall enjoy the right . . . to be confronted with the witnesses against [them]' " (Crawford v Washington, 541 US 36, 42 [2004]). "[T]he principal evil at which the Confrontation Clause was directed was the civil-law mode of criminal procedure, and particularly its use of ex parte examinations as evidence against the accused" (id. at 50). Thus, although "the Clause's ultimate goal is to ensure reliability of evidence, . . . it is a procedural rather than a substantive guarantee. It commands, not that evidence be reliable, but that reliability be assessed in a particular manner: by testing in the crucible of cross-examination" (id. at 61). "The upshot is that the role of the trial judge is not, for Confrontation Clause purposes, to weigh the reliability or credibility of testimonial hearsay evidence; it is to ensure that the Constitution's procedures for testing the reliability of that evidence are followed" (Hemphill v New York, 595 US —, —, 142 S Ct 681, 692 [2022]).
"Therefore, '[a]s a rule, if an out-of-court statement is testimonial in nature, it may not be introduced against the accused at trial unless the witness who made the statement is{**38 NY3d at 404} unavailable and the accused has had a prior opportunity to confront that witness' " (People v John, 27 NY3d 294, 303 [2016], quoting Bullcoming v New Mexico, 564 US 647, 657 [2011]). Under Crawford's "primary purpose test for determining whether evidence is testimonial," this Court has "considered 'whether the statement was prepared in a manner resembling ex parte examination and . . . whether the statement accuses defendant of criminal wrongdoing' " (People v Austin, 30 NY3d 98, 104 [2017], quoting People v Pealer, 20 NY3d 447, 453 [2013]). "Statements that are considered testimonial include 'affidavits, . . . similar pretrial statements that declarants would reasonably expect to be used prosecutorially . . . [and] statements that were made under circumstances which would lead an objective witness reasonably to believe that the statement would be available for use at a later trial' " (John, 27 NY3d at 303, quoting Crawford, 541 US at 51-52).
The DNA evidence was clearly testimonial and subject to the Confrontation Clause. Forensic reports are testimonial in nature because they are " 'functionally identical to live, in-court testimony' and . . . their 'sole purpose' [is] evidentiary in nature" (id. at 303-304 [emphasis omitted], quoting Melendez-Diaz v Massachusetts, 557 US 305, 310-311 [2009]). Further, it is undisputed that defendant did not have access to the source code, and thus no way to identify and challenge the underlying inferences drawn by TrueAllele and the application of those inferences to the DNA. Defendant could not mount a viable challenge to the conclusions based on the application of TrueAllele's algorithm that linked him to the murder.
[*24]Defendant was entitled to cross-examine those who testified about the "sources" of TrueAllele's inferences—i.e., its source code—and "how [it] evaluated those sources in arriving at [its] conclusion" (People v Stone, 35 NY2d 69, 76 [1974]). Dr. Perlin was a declarant, but defendant could only effectively cross-examine him on his programming assumptions, and any potential biases reflected therein, by reviewing the source code. Although a computer cannot be cross-examined, as Dr. Perlin explained, the computer does the work, not the humans, and TrueAllele's artificial intelligence provided "testimonial" statements against defendant as surely as any human on the stand. Defendant makes a compelling argument that he was entitled to challenge TrueAllele's inferences and choices that led to the DNA interpretations connecting him to the crime in the only way possible: by access to the source code and questioning those who served as the human translators of TrueAllele.{**38 NY3d at 405}
The use of artificial intelligence within our system of justice presents challenging questions and may destabilize our established notions of the dividing line between opinion and uncontestable fact (see e.g. Sonia K. Katyal, Private Accountability in the Age of Artificial Intelligence, 66 UCLA L Rev 54, 62-82 [2019]; Andrea Roth, Machine Testimony, 126 Yale LJ 1972, 2021-2022 [2017]). Courts across the country will decide how our federal and state constitutions may be interpreted in light of continued technological advances and their application in the courtroom. For now, although the questions presented are intellectually challenging and the answers will significantly impact defendants' rights, I need not resolve thorny esoteric questions in this appeal, nor the more specific question of whether defendant was entitled to the source code under the Confrontation Clause or another constitutional guarantee. As I discuss below, even assuming defendant is correct, any constitutional error or abuse of discretion related to the admission of the TrueAllele results without defendant's access to the source code was harmless.
An error of constitutional dimension is "harmless beyond a reasonable doubt" when "the proof of the defendant's guilt, without reference to the error, is overwhelming," and there is "no reasonable possibility that the error might have contributed to defendant's conviction" (People v Crimmins, 36 NY2d 230, 237, 241 [1975]; see People v Perez, 36 NY3d 1093, 1093-1094 [2021]). Overwhelming proof exists where "the quantum and nature of proof, excising the error, are so logically compelling and therefore forceful in the particular case as to lead the appellate court to the conclusion that 'a jury composed of honest, well-intentioned, and reasonable [people]' on consideration of such evidence would almost certainly have convicted the defendant" (Crimmins, 36 NY2d at 241-242).
Here, the totality of the evidence presented at trial, excluding the erroneously admitted TrueAllele DNA evidence, was overwhelming proof of defendant's guilt. There was unchallenged DNA evidence obtained using traditional methods that linked defendant to the crime, with combined probabilities of inclusion—for both decedent and defendant—of 1 in 1,088 for the DNA obtained from decedent's shirt collar, and 1 in 422 for the DNA obtained from decedent's right dorsal forearm. Defendant also did not contest that he had been in decedent's{**38 NY3d at 406} apartment. Moreover, three individuals testified that defendant made inculpatory statements with similar details to each of them about stealing from and killing decedent. Although defendant asserts that those witnesses were unreliable, and that the "jailhouse" witnesses in particular had motive to lie, trial counsel successfully brought those matters to the attention of the jury. Given this evidence, there is "no [*25]reasonable possibility" that a jury would have acquitted defendant (id. at 237, 241-242). Thus, because the error was harmless under the more demanding constitutional standard, defendant is not entitled to a new trial (cf. id. at 242 [explaining that an error is harmless under the nonconstitutional standard when "there is a significant probability, rather than only a rational possibility, in the particular case that the jury would have acquitted the defendant had it not been for the error or errors which occurred"]; People v Easley, 38 NY3d 1010, 1016 [2022, Rivera, J., dissenting] [decided today] [explaining that admission of FST DNA results was not harmless under nonconstitutional standard because "(t)here was no eyewitness who saw defendant in possession of the gun, no admission of his guilt, and no video recording depicting him holding the gun at any time"]).[FN8]
The order of the Appellate Division should be affirmed. Although the prosecutor failed to establish that, at the time of the Frye hearing, TrueAllele's methodology was properly validated by disinterested parties with access to the source code, and defendant was denied an opportunity to review the source code because of the developer's proprietary claims, the error, considered alone or with the other alleged constitutional error, was harmless on the facts of this case.
Even though the majority rejects defendant's claim to the source code on the facts of this case, it remains an open question in this Court whether a defendant should be granted access to a proprietary source code under a protective order. This familiar method of ensuring a defendant's right to present a defense would safeguard commercial interests. It provides no help to this defendant, but it is squarely within a court's authority to grant such an order in an appropriate future case.{**38 NY3d at 407}
Judges Garcia, Singas and Cannataro concur; Judge Rivera concurs in result in an opinion, in which Judges Wilson and Troutman concur.
Order affirmed.
Several other articles on the field of probabilistic genotyping were also received into evidence, including: Christopher D. Steele & David J. Balding, Statistical Evaluation of Forensic DNA Profile Evidence (1 Ann Rev Stat & Its Application 361 [2014]); Hannah Kelly et al., A Comparison of Statistical Models for the Analysis of Complex Forensic DNA Profiles (54 Sci & Just 66 [2014]); Duncan Taylor et al., The Interpretation of Single Source and Mixed DNA Profiles (7 Forensic Sci Intl Genetics 516 [2013]); James M. Curran, A MCMC Method for Resolving Two Person Mixtures (48 Sci & Just 168 [2008]).
Footnote 4:It is a close question whether, at the time of the Frye hearing, probabilistic genotyping was generally accepted as reliable for the DNA purposes for which it was used in defendant's prosecution (see e.g. Kelly et al. at 69 ["(T)he continuous model is unfamiliar to many forensic DNA scientists"]; Taylor et al. at 517 ["(P)rogress is still partial with only a few laboratories worldwide implementing or investigating fully continuous methods" (endnote omitted)], citing Bruce Budowle et al., Mixture Interpretation: Defining the Relevant Features for Guidelines for the Assessment of Mixed DNA Profiles in Forensic Casework, 54 J Forensic Scis 810 [2009]). Indeed, the evidence and testimony at the Frye hearing demonstrated that NIST and other forensic organizations sought to encourage the use of probabilistic genotyping in an effort to overcome hesitance about this type of DNA technology among forensic experts (see e.g. Peter Gill et al., DNA Commission of the International Society of Forensic Genetics: Recommendations on the Evaluation of STR Typing Results That May Include Drop-out and/or Drop-in Using Probabilistic Methods, 6 Forensic Sci Intl Genetics 679, 680 [2012] ["Clearly the adoption of probabilistic models has been inhibited by the complexity of concepts that are largely outside the experience of case-working forensic scientists, coupled with lack of suitable training opportunities"]; Michael D. Coble & John M. Butler, NIST, Exploring the Capabilities of Mixture Interpretation Using True Allele Software, Presentation at the 24th Congress of the International Society for Forensic Genetics [Sept. 3, 2011], slides available at https://strbase.nist.gov/pub_pres/ISFG2011-Coble-TrueAllele.pdf). Assuming it was not an abuse of discretion for the court to decide as a threshold matter that this type of analysis was generally accepted, for the reasons I discuss, the evidence is not at all close as regards TrueAllele.
Footnote 5:It is unclear what significance the majority assigns to the PCAST Report's use of the term "black box" to refer to "subjective methods inside an examiner's head" (majority op at 380 n 9). As the majority recognizes, other commenters and the courts routinely use "black box" to refer to technologies with unknown internal workings (see id. at 382 n 10; Williams, 35 NY3d at 33-34; Katherine Kwong, Note, The Algorithm Says You Did It: The Use of Black Box Algorithms to Analyze Complex DNA Evidence, 31 Harv JL & Tech 275, 292-295 [2017] [discussing TrueAllele's black box algorithm and transparency concerns]). And, in common usage, "black box" refers to "a usually complicated electronic device whose internal mechanism is usually hidden from or mysterious to the user" (see Merriam-Webster Online Dictionary, black box [https://www.merriam-webster.com/dictionary/black%20box]). The majority merely engages in a rhetorical comparison of the "black box" as object to the "black box" as an examiner's mind.
Footnote 6:The majority suggests, without citation or support, that "simple (two-contributor) mixtures with the victim as a known contributor" (majority op at 384) should be distinguished from other DNA mixtures. This misses the mark because although a mixture of several contributors apart from the victim may provide additional complicating factors, the core problem is that the DNA is combined in the sample, requiring an analysis of the data by means of justifiable assumptions and choices that accord with a methodology and lead to a likelihood ratio that a particular individual is a contributor to the mixture. As Dr. Perlin explained at the Frye hearing, describing a mixture as simple or complex largely pertains to the ability of a human analyst "to obtain a reportable result using the methods that they had" (see also Kelly et al. at 66).
Footnote 7:The majority recognizes that unpublished internal validation studies alone would be insufficient to meet the Frye standard (see majority op at 381-382 and n 10).
Footnote 8:Defendant's other challenges are either unpreserved or lack merit.