
July 6, 2014

Innovative international risk assessment service is expanding

Try your hand at answering these questions:
  1. When evaluating Aboriginal offenders, how valid are standard risk assessment protocols? 

  2. Among Canadian men, how well does the Danger Assessment (DA) predict domestic violence? 

  3. For sex offenders in Vermont, what instrument is more accurate than the widely used Static-99 for predicting recidivism? 

  4. In screening U.S. soldiers coming back from Afghanistan, is there a valid tool that would help allocate limited therapeutic resources in order to decrease violence risk? 

  5. Finally, what the heck are the Y-ARAT, the CuRV, the START, and the VIO-SCAN, and what (if anything) are they good for?

With the frenetic pace of risk assessment research and practice developments, you couldn't be faulted for not knowing the correct answers to all of the above questions. Hardly anyone does.

That’s where the Executive Bulletin comes in.

Back in February, I told you about the launch of this service for clinicians, attorneys and researchers who want to stay abreast of developments in the field of risk assessment. The publishers scour more than 80 professional journals and create a one-page summary for each article relevant to violence and sex offending risk assessment among adults and juveniles. Using an appealing, easy-to-read format, each summary highlights the study's clinical implications and relevant legal questions, while minimizing statistical jargon. 

In the months since my announcement, the Bulletin has been gaining traction around the world. It now reaches more than 11,000 practitioners, researchers, and policymakers in the United States, Australia, China, Hong Kong, Spain, Germany, Canada, the United Kingdom, Argentina, Israel, the Netherlands, Mexico, Lithuania, Norway and Denmark. Among its largest subscribers are the California Department of State Hospitals -- which requires that its forensic evaluators read each monthly issue in order to stay abreast of peer-reviewed research into evidence-based practice -- and the public policy-oriented Council of State Governments in the United States.

The newly rebranded Global Institute of Forensic Research (GIFR), with the ever-energetic forensic psychologist Jay Singh at its helm, is currently rolling out new features and services, including a new website, a podcast version of the Bulletin for commuters, and expert risk assessment trainings (free to subscribers) that are eligible for continuing education credits from the American Psychological Association and the Canadian Psychological Association.

The service is subscription-based. At $35 per month (and $350 for group subscriptions) it isn’t cheap, but Dr. Singh points out that the alternative is costlier still: tracking down important risk-related articles from more than 80 journals, most of them fee-based, takes both money and time. Without a synthesizing service such as the Bulletin, practitioners risk falling behind and inadvertently violating relevant standards of practice.

My main concern in allowing someone else to find and synthesize research for my consumption is the degree of fidelity and expertise the reviewer brings to bear. Here, the field is fortunate to have someone upon whom we can confidently rely. What I find most valuable about the Bulletin is the level of critical analysis the expert reviewers apply to each of the 15 or 20 articles they summarize every month. (Indeed, that confidence is why I accepted an invitation a while back to serve on the Institute’s advisory board.)

Singh, an epidemiology professor at Molde University in Norway, has published more than 40 cutting-edge articles on violence prediction (a few of which I have featured in prior blog posts). Formerly a fellow of the Florida Mental Health Institute and a Senior Researcher in Forensic Psychiatry for the Swiss Department of Corrections in Zurich, he has also trained and lectured widely on mental illness and violence, including at Harvard, Yale, Columbia, Cornell, Brown, Dartmouth, and the University of Pennsylvania.

To date, his Institute has conducted exclusive interviews on tips and tricks in forensic assessment with leading practitioners and scholars including Jodi Viljoen, Nicholas Scurich, Annelies Vredeveldt and -- most recently -- Jennifer Lanterman of the University of Nevada at Reno. Next month’s featured expert is Seena Fazel of Oxford University. You can browse the website and find a sample issue HERE.

If you decide to sign up (or, better yet, get your institution to sign up), Singh is offering my blog readers and subscribers a special 10 percent discount. Just click on THIS LINK, and enter the discount code INTHENEWS.

February 9, 2014

Risk researchers launching premium literature service

The Alliance for International Risk Research (AIRR) is launching an excellent new risk assessment resource for mental health, correctional, and legal professionals. The AIRR Executive Bulletin is being called "an exceptional resource that lawyers on both sides, judges, examiners and the rest of us practitioners in these areas of forensic mental health, treatment and law should subscribe to in order to begin to implement a uniform body of current literature and ‘Best Practices’ that continually updates going forward to facilitate development of a legal and constitutional body of law."

The subscription-based service is designed for busy professionals who want to stay up to date but simply do not have the time to locate and read the voluminous literature published each month. It aggregates research on risk assessment for violence, sex offending and general recidivism among adults and juveniles.

[Photo: AIRR researchers Jay Singh and Kevin Douglas]
An expert team led by top risk assessment researchers Jay Singh and Kevin Douglas systematically searches more than 80 journals and identifies every new risk assessment article published each month. The average is around 20 articles. Doctors Singh and Douglas then purchase and read every article and write a one-page, easy-to-digest summary without statistical jargon.

In addition to the monthly summary of literature, subscribers also get four online risk assessment training seminars per year from top clinical researchers, and an exclusive monthly interview with an industry leader. It's a convenient way to get continuing education credits, because the trainings are eligible for American Psychological Association and Canadian Psychological Association credits.

A sample issue is available HERE.

You can sign up for either an individual or a group subscription HERE. For questions, contact lead researcher Jay Singh (HERE).

February 4, 2014

Research review II: Sexual predator controversies

Following up on last week’s research review, here are some new articles from the ever-controversial practice niche of sexually violent predator cases:

Facts? Who cares about the facts?!

Once a jury is empaneled to decide whether someone with a prior sex offense conviction is so dangerous to the public that he should be civilly detained, the verdict is a foregone conclusion. Dangerousness is presumed based on the prior conviction, rather than having to be proven.


Researchers Nicholas Scurich and Daniel Krauss confirmed this by giving jury-eligible citizens varying degrees of information in a Sexually Violent Predator (SVP) case and asking them to vote. Some mock jurors were told only that the person had a prior conviction for a sex offense. Others were also given information that the person had a mental abnormality that made him likely to engage in future acts of sexual aggression.


It mattered not a whit. The mock jurors voted to civilly commit at the same rate, whether or not they had heard evidence of current dangerousness.


“The mere fact that a respondent had been referred for an SVP proceeding was sufficient for a majority of participants to authorize commitment,” the researchers found. “These findings raise concerns about whether the constitutionally required due process occurs in SVP commitment proceedings.”


No surprise, really. In this practice niche more than others, fear and hype often overshadow reason. Sex offenders are not the most appealing human beings, and no one wants to shoulder the responsibility of voting to release someone who could go out and rape or molest again.


The study is:

The presumption of dangerousness in sexually violent predator commitment proceedings, Nicholas Scurich and Daniel A. Krauss, Law, Probability and Risk. A copy may be requested from the first author (HERE).

Sexual disorder diagnoses not reliable

Meanwhile, even when jurors do hear evidence of mental abnormality, it is not especially accurate.


Examining the diagnoses given to 375 sex offenders referred for civil commitment in New Jersey, researchers found “questionable” diagnostic reliability to be a widespread problem across the range of clinicians.


Pedophilia was the only diagnosis in which two evaluators were likely to agree at a level above chance. The rates of agreement were far worse for other disorders that are typically rendered in SVP cases, including “Paraphilia Not Otherwise Specified,” Sexual Sadism, Antisocial Personality Disorder and Exhibitionism. In fact, among the six cases in which Exhibitionism was diagnosed, there was not a single case in which both clinicians agreed.
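
To make “agreement at a level above chance” concrete: the standard index here is Cohen’s kappa, which discounts the agreement two raters would reach simply by guessing in line with their own base rates. Here is a minimal sketch with hypothetical diagnoses -- not the Perillo et al. data:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Chance-corrected agreement between two raters over the same cases."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of the raters' marginal rates, summed over labels
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum((c1[lab] / n) * (c2[lab] / n) for lab in set(rater1) | set(rater2))
    return (observed - expected) / (1 - expected)

# Hypothetical example: two evaluators diagnosing the same ten offenders
eval1 = ["Pedophilia", "ASPD", "None", "Pedophilia", "None",
         "ASPD", "None", "Pedophilia", "None", "None"]
eval2 = ["Pedophilia", "None", "None", "Pedophilia", "ASPD",
         "ASPD", "None", "Pedophilia", "None", "ASPD"]
print(round(cohens_kappa(eval1, eval2), 2))  # raw agreement is 0.70; kappa is only 0.54
```

A kappa near zero means the evaluators agree no more often than chance -- the situation the New Jersey study found for most diagnoses other than Pedophilia.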


The study, by Anthony Perillo of John Jay College and colleagues, adds to a burgeoning body of literature (some of which I’ve previously reported on) suggesting that psychiatric diagnoses in SVP evaluations are often dubious and not to be trusted.


The article is:

Examining the scope of questionable diagnostic reliability in Sexually Violent Predator (SVP) evaluations, Anthony D. Perillo, Ashley H. Spada, Cynthia Calkins and Elizabeth L. Jeglic, International Journal of Law and Psychiatry. A copy may be requested from the first author (HERE).

Race bias in actuarial risk prediction

Okay, so the diagnoses aren’t reliable. But we’ve still got another tool of science up our sleeves -- actuarial risk assessment.


Not so fast.


As I’ve previously reported, the predictive accuracy of actuarial risk assessment tools is pretty wimpy. And now, researchers from Sam Houston State University are finding that the most widely used actuarial tool, the Static-99, doesn’t work at all with Latino offenders.


The findings are based on research with a large sample of about 2,000 sex offenders, almost 600 of whom were Latino.


“Findings have implications for fairness in testing and highlight the need for continued research regarding the potentially moderating role of offender race/ethnicity in risk research,” note researchers Jorge Varela and colleagues.
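
Operationally, the moderation question is whether the instrument’s discrimination -- typically indexed by the area under the ROC curve (AUC) -- holds up within each subgroup. A schematic sketch of that check, using synthetic scores and outcomes rather than the Varela et al. data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def simulate_group(n, effect):
    """Synthetic Static-99-style scores; `effect` sets how much higher recidivists score."""
    recidivated = rng.random(n) < 0.15            # illustrative ~15% base rate
    scores = rng.normal(2, 2, n) + effect * recidivated
    return scores, recidivated

# Hypothetical scenario: the tool discriminates in one group but not the other
for group, effect in [("Group A", 1.5), ("Group B", 0.0)]:
    scores, outcome = simulate_group(2000, effect)
    print(group, "AUC =", round(roc_auc_score(outcome, scores), 2))
# An AUC near 0.50 means prediction is no better than a coin flip for that subgroup.
```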


The study is:

Do the Static-99 and Static-99R Perform Similarly for White, Black, and Latino Sexual Offenders? Jorge G. Varela, Marcus T. Boccaccini, Daniel C. Murrie, Jennifer D. Caperton and Ernie Gonzalez Jr. International Journal of Forensic Mental Health. To request a copy from the first author, click HERE.

How to lie with statistics: “The Area Under the Curve”

Listen to any defender of actuarial risk prediction for a few minutes, and you will likely hear “Receiver Operating Characteristics” and “The Area Under the Curve” touted as indicators of statistical accuracy.


But in a new study in the Journal of Threat Assessment and Management, two European scholars argue that such claims are “fundamentally misleading.” Using the Risk Matrix 2000 instrument -- widely deployed in the United Kingdom -- as an exemplar, they found that a prediction of reoffense for an offender who scored in the “Very High Risk” range will be wrong an astounding 93 percent of the time.


“The numbers necessary to detain in order to prevent one instance of recidivism are large,” write David Cooke and Christine Michie. “On further reflection, from a statistical rather than a psychological perspective, should we be surprised? It has long been recognized that low-frequency events are hard to predict.”


The authors argue that the weak performance of actuarials is being systematically camouflaged by “statistical rituals” that are confusing and non-transparent, raising fundamental questions of fairness in legal decision-making.
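
The arithmetic behind such numbers is straightforward Bayes. A minimal sketch, with illustrative figures rather than Cooke and Michie’s actual Risk Matrix 2000 parameters, shows how a tool can boast a respectable-looking ROC curve yet be wrong about most of the people it flags once the base rate is low:

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Probability that a person flagged as high risk actually reoffends (Bayes' rule)."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# Illustrative figures only: a tool that catches 70% of recidivists and correctly
# clears 80% of non-recidivists, applied to a group in which 5% will reoffend
ppv = positive_predictive_value(sensitivity=0.70, specificity=0.80, base_rate=0.05)
print(f"Flagged individuals who reoffend:        {ppv:.0%}")      # ~16%
print(f"Flagged individuals who do not reoffend: {1 - ppv:.0%}")  # ~84% of flags are wrong
```

Plug in the Risk Matrix 2000’s actual operating characteristics and base rates, and you get figures like the 93 percent quoted above.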


The article is:

The Generalizability of the Risk Matrix 2000: On Model Shrinkage and the Misinterpretation of the Area Under the Curve. David Cooke and Christine Michie. Journal of Threat Assessment and Management. To request a copy from the first author, click HERE.

Counterpoint

Not everyone agrees with Cooke and Michie’s analysis. One detractor is Douglas Mossman, of the Department of Psychiatry at the University of Cincinnati College of Medicine. Using a fictional scenario, he attempts to illustrate how “group data have an obvious application to individual decisions.” His paper goes on to argue that “misinterpretations of mathematical concepts and misunderstanding of the aims of risk assessment have led to mistakes about the applicability of group data to individual instances.”


The paper is:

From Group Data to Useful Probabilities: The Relevance of Actuarial Risk Assessment in Individual Instances. (Unpublished.) Douglas Mossman. Paper available online (HERE).

Who is minding the store?

If nothing else, the above research snippets demonstrate the high level of controversy and complexity in the implementation of Sexually Violent Predator laws. If psychologists -- who must master psychometrics and statistics in order to earn our PhDs -- have a hard time with these concepts, imagine how difficult it is for attorneys. With people’s lives at stake, do they have the knowledge base necessary to avoid being hoodwinked, and to educate jurors and judges?


In a new paper, prolific legal scholars Heather Cucolo and Michael L. Perlin of New York Law School argue that more stringent standards for representation are necessary for effective assistance of counsel in SVP cases.


They propose that counsel should be required to “demonstrate a familiarity with the psychometric tests regularly employed at such hearings, and with relevant expert witnesses who could assist in the representation of the client.” They further argue for a pool of qualified experts who could be appointed by the court at no cost, similar to those provided in insanity cases.


“There is no question that the population in question is the most despised group of individuals in the nation. Society’s general revulsion towards this population is shared by judges, jurors and lawyers. Although the bar pays lip service to the bromide that counsel is available for all, no matter how unpopular the cause, the reality is that there are few volunteers for the job of representing these individuals, and that the public's enmity has a chilling effect on the vigor of representation in this area.”


The paper is:

'Far from the Turbulent Space': Considering the Adequacy of Counsel in the Representation of Individuals Accused of Being Sexually Violent Predators. Heather Cucolo and Michael L. Perlin. It is available online HERE.


January 12, 2014

Putting the Cart Before the Horse: The Forensic Application of the SRA-FV

As the developers of actuarial instruments such as the Static-99R acknowledge that their original norms inflated the risk of re-offense for sex offenders, a brand-new method is cropping up to preserve those inflated risk estimates in sexually violent predator civil commitment trials. The method introduces a new instrument, the “SRA-FV,” in order to bootstrap special “high-risk” norms on the Static-99R. Curious about the scientific support for this novel approach, I asked forensic psychologist and statistics expert Brian Abbott to weigh in.

Guest post by Brian Abbott, PhD*

NEWS FLASH: Results from the first peer-reviewed study about the Structured Risk Assessment: Forensic Version (“SRA-FV”), published in Sexual Abuse: Journal of Research and Treatment (“SAJRT”), demonstrate the instrument is not all that it’s cracked up to be.
[Image: Promotional material for an SRA-FV training]
For the past three years, the SRA-FV developer has promoted the instrument for clinical and forensic use despite the absence of peer-reviewed, published research supporting its validity, reliability, and generalizability. Accordingly, some clinicians who have attended SRA-FV trainings around the country routinely apply the SRA-FV in sexually violent predator risk assessments and testify about its results in court as if the instrument has been proven to measure what it intends to assess, has known error rates, retains validity when applied to other groups of sexual offenders, and produces trustworthy results.

Illustrating this rush to acceptance most starkly, within just three months of the instrument’s informal release (February 2011), and in the absence of any peer-reviewed research, the state of California incredibly adopted the SRA-FV as its statewide mandated dynamic risk measure for assessing sexual offenders in the criminal justice system. That decision was rescinded in September 2013, when the SRA-FV was replaced with a similar instrument, the Stable-2007.

The SRA-FV consists of 10 items that purportedly measure “long-term vulnerabilities” associated with sexual recidivism risk. The items are distributed among three risk domains and are assessed either with standardized rating criteria devised by the developer or by scoring certain items on the Psychopathy Checklist-Revised (PCL-R). Scores on the SRA-FV range from zero to six. Examples of items include sexual interest in children, lack of emotionally intimate relationships with adults, callousness, and internal grievance thinking. Patients from the Massachusetts Treatment Center in Bridgewater, Massachusetts, who were evaluated as sexually dangerous persons between 1959 and 1984, served as members of the SRA-FV construction group (number unknown) and the validation sample (N = 418). The instrument was released for use by Dr. David Thornton -- a co-developer of the Static-99R, Static-2002R, and SRA-FV, and research director at the SVP treatment program in Wisconsin -- in December 2010, during a training held in Atascadero, California. Since then, Dr. Thornton has held similar trainings around the nation, at which he asserts that the SRA-FV is valid for predicting sexual recidivism risk, achieves incremental validity over the Static-99R, and can be used to choose among Static-99R reference groups.

A primary focus of the trainings is a novel system in which the total score on the SRA-FV is used to select one Static-99R “reference group” from among three available options. The developer describes the statistical modeling underlying this procedure, which he claims increases predictive validity and power over using the Static-99R alone. However, no reliability data are offered to support this claim. In the December 2010 training, several colleagues and I asked for the inter-rater agreement rate, but Dr. Thornton refused to provide it.

I was astounded but not surprised when some government evaluators in California started to apply the SRA-FV in sexually violent predator risk assessments within 30 days of the December 2010 training. This trend blossomed in other jurisdictions with sexually violent predator civil confinement laws. Typically, government evaluators applied the SRA-FV to select Static-99R reference groups, invariably choosing the “High Risk High Needs” sample, the one with the highest re-offense rates. A minority of clinicians stated in reports and court testimony that the SRA-FV increased predictive accuracy over the Static-99R alone, but they were unable to quantify this effect. The same clinicians argued that the pending publication of the Thornton and Knight study was sufficient to justify its use in civil confinement risk assessments for sexually violent predators, implying that the mere fact that a construction and validation study had been accepted for publication was an imprimatur that the instrument was reliable and valid for its intended purposes. Now that the research has been peer-reviewed and published, the results reflect that these government evaluators apparently put the proverbial cart before the horse.

David Thornton and Raymond Knight penned an article that documents the construction and validation of the SRA-FV. The publication is a step in the right direction, but by no means do the results justify widespread application of the SRA-FV in sexual offender risk assessment in general or sexually violent predator proceedings in particular. Rather, the results of the study only apply to the group upon which the research was conducted and do not generalize to other groups of sexual offenders. Before discussing the limitations of the research, I would like to point out some encouraging results.

The SRA-FV did, as its developer claimed, account for more sources of sexual recidivism risk than the Static-99R alone. However, it remains unknown which of the SRA-FV’s ten items contribute to risk prediction. The study also found that the combination of the Static-99R and SRA-FV increased predictive power. This improved predictive accuracy, however, must be replicated to determine whether the combination of the two instruments will perform similarly in other groups of sexual offenders. This is especially important when considering that the SRA-FV was constructed and validated on individuals from the Bridgewater sample from Massachusetts who are not representative of contemporary groups of sexual offenders. Thornton and Knight concede this point when discussing how the management of sexual offenders through all levels of the criminal justice system in Massachusetts between 1959 and 1984 was remarkably lenient compared to contemporary times. Such historical artifacts likely compromise any reliable generalization from patients at Bridgewater to present-day sexual offenders.

[Image: Training materials presented four months before the State of California rescinded use of the SRA-FV]

Probably the most crucial finding from the study is the SRA-FV’s poor inter-rater reliability. The authors categorize the 64 percent rate of agreement as “fair.” It is well known that inter-rater agreement in research studies is typically higher than in real-world applications, a point addressed previously in this blog in regard to the PCL-R. A field reliability study of the SRA-FV among 19 government psychologists rating 69 sexually violent predators in Wisconsin (Sachsenmaier, Thornton, & Olson, 2011) found an inter-rater agreement rate of only 55 percent for the SRA-FV total score, which is considered poor reliability. In other words, raters disagree on 36 to 45 percent of SRA-FV scores, raising serious concerns over the trustworthiness of the instrument. To their credit, Thornton and Knight acknowledge this as an issue and note that steps should be taken to increase reliable scoring. Nonetheless, the current inter-rater reliability falls far short of the 80 percent floor recommended for forensic practice (Heilbrun, 1992). Unless steps are taken to dramatically improve reliability, the claims that the SRA-FV increases predictive accuracy either alone or in combination with the Static-99R, and that it should be used to select Static-99R reference groups, are moot.
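
To see what these agreement percentages amount to in practice, here is a minimal sketch of a percent-agreement calculation, using hypothetical SRA-FV total scores (on the instrument’s zero-to-six scale) rather than the Wisconsin field data:

```python
def percent_agreement(rater1, rater2, tolerance=0):
    """Share of cases on which two raters' scores agree (optionally within a tolerance)."""
    pairs = list(zip(rater1, rater2))
    return sum(abs(a - b) <= tolerance for a, b in pairs) / len(pairs)

# Hypothetical SRA-FV total scores for ten offenders, each rated by two clinicians
clinician_a = [2, 4, 5, 1, 3, 6, 2, 4, 5, 3]
clinician_b = [2, 3, 5, 2, 3, 4, 2, 5, 5, 3]
print(f"Exact agreement: {percent_agreement(clinician_a, clinician_b):.0%}")
# 60% here -- below the 80 percent floor Heilbrun recommends for forensic practice
```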

It is also important to note that, although Thornton and Knight confuse the terms validation and cross-validation in their article, this study represents a validation methodology. Cross-validation is a process by which the statistical properties found in a validation sample (such as reliability, validity, and item correlations) are tested in a separate group to see whether they hold up. In contrast, Thornton and Knight first considered the available research data from a small number of individuals from the Bridgewater group to determine which items would be included in the SRA-FV. This group is referred to as the construction sample. The statistical properties of the newly conceived measure were then studied on 418 Bridgewater patients, who constitute the validation sample. The psychometric properties found in the validation group have not been tested on other, contemporary sexual offender groups. Absent such cross-validation studies, we simply have no confidence that the SRA-FV works as designed for groups other than the sample upon which it was validated. To their credit, Thornton and Knight acknowledge this limitation and warn readers not to generalize the validation research to contemporary groups of sexual offenders.
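
The practical import of that distinction is that performance estimated on a sample drawn from the same historical population can evaporate when the tool meets a different population. A schematic sketch, with entirely synthetic data and a hypothetical ten-item checklist standing in for the SRA-FV:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

def make_sample(n, item_effects):
    """Synthetic offenders: ten binary 'items' plus an outcome driven by item_effects."""
    items = rng.integers(0, 2, size=(n, 10))
    prob = 1 / (1 + np.exp(-(items @ item_effects - 2.0)))
    return items, rng.random(n) < prob

historical = np.array([1.0, 0.8, 0.6, 0.5, 0.4, 0, 0, 0, 0, 0])
contemporary = np.array([0.3, 0.2, 0.6, 0.1, 0, 0.4, 0.5, 0, 0, 0])  # items behave differently

X_con, y_con = make_sample(400, historical)    # construction sample: derive item weights
model = LogisticRegression().fit(X_con, y_con)

X_val, y_val = make_sample(418, historical)    # validation sample: same population
X_ext, y_ext = make_sample(400, contemporary)  # cross-validation: a different population
print("Validation AUC:      ", round(roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]), 2))
print("Cross-validation AUC:", round(roc_auc_score(y_ext, model.predict_proba(X_ext)[:, 1]), 2))
# Strong validation numbers say nothing about transfer to contemporary offenders.
```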

The data on incremental predictive validity, while interesting, have little practical value at this point for two reasons. One, it is unknown whether the results will replicate in contemporary groups of sexual offenders. Two, no data are provided to quantify the increased predictive power. The study does not provide an experience table of probability estimates at each score on the Static-99R after taking into account the effect of the SRA-FV scores. It seems disingenuous, if not misleading, to inform the trier of fact that the combined measures increase predictive power but to fail to quantify the result and the associated error rate.
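
An “experience table,” for readers unfamiliar with the term, is nothing exotic: it is the observed reoffense rate (ideally with error bounds) at each score. A toy sketch of the kind of table Abbott says the study fails to provide, built from made-up follow-up data:

```python
from collections import defaultdict

def experience_table(scores, reoffended):
    """Observed reoffense probability at each score, from follow-up outcomes."""
    tallies = defaultdict(lambda: [0, 0])  # score -> [reoffenders, total]
    for score, outcome in zip(scores, reoffended):
        tallies[score][0] += outcome
        tallies[score][1] += 1
    return {s: (k / n, n) for s, (k, n) in sorted(tallies.items())}

# Made-up follow-up data: combined-instrument score, and whether the person reoffended
scores     = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5]
reoffended = [0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1]
for score, (rate, n) in experience_table(scores, reoffended).items():
    print(f"score {score}: observed reoffense rate {rate:.0%} (n={n})")
# A real table would also report confidence intervals -- the error rates Abbott asks for.
```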

In my practice, I have seen the SRA-FV used most often to select among three Static-99R reference groups. Invariably, government evaluators in sexually violent predator risk assessments assign SRA-FV total scores consistent with the selection of the Static-99R High Risk High Needs reference group. Only the risk estimates associated with the highest Static-99R scores in this reference group are sufficient to support an opinion that an individual meets the statutory level of sexual dangerousness necessary to justify civil confinement. Government evaluators who have used the SRA-FV for this purpose cannot cite research demonstrating that the procedure works as intended or that it produces a reliable match to the group representing the individual being assessed. Unfortunately, Thornton and Knight are silent on this application of the SRA-FV.

In a recently published article, I tested the use of the SRA-FV for selecting Static-99R reference groups. In brief, Dr. Thornton used statistical modeling based solely on data from the Bridgewater sample to devise this procedure. The reference group selection method was not based on the actual scores of members of each of the three reference groups. Rather, it was hypothetical, presuming that members of a given Static-99R reference group will exhibit a range of SRA-FV scores that does not overlap with either of the other two reference groups. To the contrary, I found that the hypothetical SRA-FV reference group system did not work as designed: the SRA-FV scores between reference groups overlapped by wide margins. In other words, an SRA-FV total score would likely be consistent with selecting two if not all three Static-99R reference groups. In light of these findings, it is incumbent upon the developer to provide research using actual subjects to prove that the SRA-FV total score is a valid method by which to select a single Static-99R reference group and that the procedure can be applied reliably. At this point, credible support does not exist for using the SRA-FV to select Static-99R reference groups.

The design, development, validation, and replication of psychological instruments are guided by the Standards for Educational and Psychological Testing (“SEPT” -- American Educational Research Association et al., 1999). When comparing the Thornton and Knight study to the framework provided by SEPT, it is apparent that the SRA-FV is in the infancy stage of development. At best, the SRA-FV is a work in progress that needs substantially more research to improve its psychometric properties. Aside from its low reliability and the inability to generalize the validation research to other groups of sexual offenders, other important statistical properties await examination, including but not limited to:

  1. standard error of measurement
  2. factor analysis of whether items within each of the three risk domains significantly load in their respective domains
  3. the extent of the correlation between each SRA-FV item and sexual recidivism
  4. which SRA-FV items add incremental validity beyond the Static-99R, and which may be redundant with it
  5. evidence that each item has construct validity.

It is reasonable to conclude that, at its current stage of development, the use of the SRA-FV in forensic proceedings is premature and scientifically indefensible. In closing: in their eagerness to improve the accuracy of their risk assessments, clinicians relied upon Dr. Thornton’s claims in the absence of peer-reviewed research demonstrating that the SRA-FV achieved generally accepted levels of reliability and validity. The history of forensic evaluators deploying the SRA-FV before the publication of the construction and validation study raises significant ethical and legal questions:

  • Should clinicians be accountable for vetting the research presented in trainings by an instrument’s developer before applying a tool in forensic practice?

  • What responsibility do clinicians have to rectify testimony in which they presented the SRA-FV as if its results were reliable and valid?

  • How many individuals have been civilly committed as sexually violent predators based on testimony that the findings from the SRA-FV were consistent with individuals meeting the legal threshold for sexual dangerousness, when the published data do not support this conclusion?

Answers to these questions and others go beyond the scope of this blog. However, in a recent appellate decision, a Washington appeals court questioned the admissibility of the SRA-FV in the civil confinement trial of Steven Ritter. The appellate court determined that the application of the SRA-FV was critical to the government evaluator’s opinion that Mr. Ritter met the statutory threshold for sexual dangerousness. Because the SRA-FV is considered a novel scientific procedure, the appeals court reasoned that the trial court erred by not holding a defense-requested evidentiary hearing to decide whether the SRA-FV was admissible evidence for the jury to hear. The appeals court remanded the issue to the trial court for a Kelly-Frye hearing on the SRA-FV. Stay tuned!

References

Abbott, B.R. (2013). The Utility of Assessing “External Risk Factors” When Selecting Static-99R Reference Groups. Open Access Journal of Forensic Psychology, 5, 89-118.

American Educational Research Association, American Psychological Association and National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.

Heilbrun, K. (1992). The role of psychological testing in forensic assessment. Law and Human Behavior, 16, 257-272. doi: 10.1007/BF01044769.

In Re the Detention of Steven Ritter. (2013, November). In the Appeals Court of the State of Washington, Division III. 

Sachsenmaier, S., Thornton, D., & Olson, G. (2011, November). Structured risk assessment forensic version (SRA-FV): Score distribution, inter-rater reliability, and margin of error in an SVP population. Presentation at the 30th Annual Research and Treatment Conference of the Association for the Treatment of Sexual Abusers, Toronto, Canada.

Thornton, D. & Knight, R.A. (2013). Construction and validation of the SRA-FV Need Assessment. Sexual Abuse: A Journal of Research and Treatment. Published online December 30, 2013. doi: 10.1177/1079063213511120.
* * *


*Brian R. Abbott is a licensed psychologist in California and Washington who has evaluated and treated sexual offenders for more than 35 years. Among his areas of forensic expertise, Dr. Abbott has worked with sexually violent predators in various jurisdictions within the United States, where he performs psychological examinations, trains professionals, consults on psychological and legal issues, offers expert testimony, and publishes papers and peer-reviewed articles.

January 5, 2014

New evidence of psychopathy test's poor accuracy in court

Use of a controversial psychopathy test is skyrocketing in court, even as mounting evidence suggests that the prejudicial instrument is highly inaccurate in adversarial settings.

The latest study, published by six respected researchers in the influential journal Law and Human Behavior, explored the accuracy of the Psychopathy Checklist, or PCL-R, in Sexually Violent Predator cases around the United States.

The findings of poor reliability echo those of other recent studies in the United States, Canada and Europe, potentially heralding more admissibility challenges in court. 

Although the PCL-R is used in capital cases, parole hearings and juvenile sentencing, by far its most widespread forensic use in the United States is in Sexually Violent Predator (SVP) cases, where it is primarily invoked by prosecution experts to argue that a person is at high risk for re-offense. Building on previous research, David DeMatteo of Drexel University and colleagues surveyed U.S. case law from 2005-2011 and located 214 cases from 19 states -- with California, Texas and Minnesota accounting for more than half of the total -- that documented use of the PCL-R in such proceedings.

To determine the reliability of the instrument, the researchers examined a subset of 29 cases in which the scores of multiple evaluators were reported. On average, scores reported by prosecution experts were about five points higher than those reported by defense-retained experts. This is a large and statistically significant difference that cannot be explained by chance. 

Prosecution experts were far more likely to give scores of 30 or above, the cutoff for presumed psychopathy: they reported such scores in almost half of the cases, whereas defense witnesses did so in less than 10 percent of cases.

Looking at interrater reliability another way, the researchers applied a classification scheme from the PCL-R manual in which scores are divided into five discrete categories, from “very low” (0-8) to “very high” (33-40). In almost half of the cases, the scores given by two evaluators fell into different categories; in about one out of five cases the scores were an astonishing two or more categories apart (e.g., “very high” versus “moderate” psychopathy).

Surprisingly, interrater agreement was even worse among evaluators retained by the same side than among opposing experts, suggesting that the instrument’s inaccuracy is not solely due to what has been dubbed adversarial (or partisan) allegiance.

Despite its poor accuracy, the PCL-R is extremely influential in legal decision-making. The concept of psychopathy is superficially compelling in our current era of mass incarceration, and the instrument's popularity shows no sign of waning. 

Earlier this year, forensic psychologist Laura Guy and colleagues reported on its power in parole decision-making in California. The state now requires government evaluators to use the PCL-R in parole fitness evaluations for “lifers,” or prisoners sentenced to indeterminate terms of up to life in prison. Surveying several thousand cases, the researchers found that PCL-R scores were a strong predictor of release decisions by the Parole Board, with those granted parole scoring an average of about five points lower than those denied parole. Having just conducted one such evaluation, I was struck by the frightening fact -- alluded to by DeMatteo and colleagues -- that the chance assignment of an evaluator who typically gives high scores on the PCL-R “might quite literally mean the difference between an offender remaining in prison versus being released back into the community.”

Previous research has established that Factor 1 of the two-factor instrument -- the factor measuring characterological traits such as manipulativeness, glibness and superficial charm -- is especially prone to error in forensic settings. This is not surprising, as traits such as “glibness” are somewhat in the eye of the beholder and not objectively measurable. Yet, the authors assert, “it is exactly these traits that seem to have the most impact” on judges and juries.

Apart from the issue of poor reliability, the authors questioned the widespread use of the PCL-R as evidence of impaired volitional control, an element required for civil commitment in SVP cases. They labeled as “ironic, if not downright contradictory” the fact that psychopathy is often touted in traditional criminal responsibility (or insanity) cases as evidence of badness as opposed to mental illness, yet in SVP cases it magically transforms into evidence of a major mental disorder that interferes with self-control. 

The evidence is in: The Psychopathy Checklist-Revised is too inaccurate in applied settings to be relied upon in legal decision-making. With consistent findings of abysmal interrater reliability, its prejudicial impact clearly outweighs any probative value. However, the gatekeepers are not guarding the gates. So long as judges and attorneys ignore this growing body of empirical research, prejudicial opinions will continue to be cloaked in a false veneer of science, contributing to unjust outcomes.

* * * * *
The study is: 

The Role and Reliability of the Psychopathy Checklist-Revised in U.S. Sexually Violent Predator Evaluations: A Case Law Survey by DeMatteo, D., Edens, J. F., Galloway, M., Cox, J., Toney Smith, S. and Formon, D. (2013). Law and Human Behavior

Copies may be requested from the first author (HERE).

The same research team has just published a parallel study in Psychology, Public Policy and Law:

“Investigating the Role of the Psychopathy Checklist-Revised in United States Case Law” by David DeMatteo, John F. Edens, Meghann Galloway, Jennifer Cox, Shannon Toney Smith, Julie Present Koller and Benjamin Bersoff.


November 5, 2013

Static-99 developers embrace redemption

Sex offender risk plummets over time in community, new study reports

Criminals reform.

Violent criminals reform.

And now -- drum roll -- the authors of the most widely used actuarial tool for assessing sex offender recidivism are conceding that even sex offenders cross a "redemption threshold" over time, such that their risk of committing a new sexual crime may become "indistinguishable from the risk presented by non-sexual offenders."

Tracking a large group of 7,740 sexual offenders drawn from 21 different samples around the world, the researchers found that those who remain free in the community for five years or more after their release are at drastically reduced risk of committing a new sex offense.

The offenders identified as highest risk on the Static-99R saw their rates of reoffending fall the most: from 22 percent at the time of release, to 8.6 percent after five years, to only 4.2 percent after 10 years in the community. Based on these findings, the researchers say that risk factors such as number of prior offenses are time-dependent rather than truly static, or never-changing.

"If high risk sexual offenders do not reoffend when given the opportunity to do so, then there is clear evidence that they are not as high risk as initially perceived," note authors R. Karl Hanson, Andrew J. R. Harris, Leslie Helmus and David Thornton in the article scheduled for publication in the Journal of Interpersonal Violence.

Quoting two of my favorite scholars -- criminologist Shadd Maruna and law professor/forensic psychologist Charles Ewing -- the authors challenge the notion that sex offenders represent a special case of perpetual danger. They question the need for lifelong monitoring and supervision.

"Even if certain subgroups of sexual offenders can be identified as high risk, they need not be high risk forever. Risk-relevant propensities could change based on fortunate life circumstances, life choices, aging, or deliberate interventions."

The effect of offense-free time was similar across all subgroups examined, including those defined by age at release, treatment involvement, pre-selection into a "high risk/high need" category on the Static-99R, or victim type (adults, children, related children).

The authors recommend revising estimates of risk for individuals who do not reoffend after being free in the community for a certain period of time.

"Once given the opportunity to reoffend, the individuals who reoffend should be sorted into higher risk groups, and those who do not reoffend should be sorted into lower risk groups. This sorting process can result in drastic changes from the initial risk estimates."

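The arithmetic of this re-sorting can be illustrated with a simple survival calculation. The yearly hazard rates below are invented, chosen only so that the outputs roughly echo the rates reported above; they are not the study’s data:

```python
def remaining_risk(annual_hazards, years_clean):
    """P(reoffending during the remaining follow-up | offense-free for years_clean years).

    annual_hazards[t] is the chance of a first reoffense in year t+1,
    given no reoffense through year t. Illustrative numbers only.
    """
    survival = [1.0]  # probability of remaining offense-free through each year
    for h in annual_hazards:
        survival.append(survival[-1] * (1 - h))
    return (survival[years_clean] - survival[-1]) / survival[years_clean]

# Invented, declining yearly hazards over a 20-year horizon for a "high risk" group
hazards = [0.04, 0.035, 0.03, 0.025, 0.02, 0.012, 0.011, 0.01, 0.01, 0.009] + [0.004] * 10
for years in (0, 5, 10):
    print(f"after {years} offense-free years: {remaining_risk(hazards, years):.1%} risk remains")
# prints roughly 22%, 9% and 4% -- the same qualitative pattern Hanson et al. report
```
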
The article is: "High Risk Sex Offenders May Not Be High Risk Forever." Copies may be requested from the first author, R. Karl Hanson (HERE).

November 2, 2013

RadioLab explores criminal culpability and the brain

Debate: Moral justice versus risk forecasting


After Kevin had brain surgery for his epilepsy, he developed an uncontrollable urge to download child pornography. If the surgery engendered Klüver-Bucy Syndrome, compromising his ability to control his impulses, should he be less morally culpable than another offender?

Blame is a fascinating episode of RadioLab that explores the debate over free will versus biology as destiny. Nita Farahany, professor of law and philosophy at Duke, is documenting an explosion in the use of brain science in court. But it's a slippery slope: Today, brain scanning technology only enables us to see the most obvious of physical defects, such as tumors. But one day, argues neuroscientist David Eagleman, we will be able to map the brain with sufficient focus to see that all behavior is a function of one perturbation or another.

Eagleman and guest Amy Phenix (of Static-99 fame) both think that instead of focusing on culpability, the criminal justice system should focus on risk of recidivism, as determined by statistical algorithms.

But hosts Jad and Robert express skepticism about this mechanistic approach to justice. They wonder whether a technocratic, risk-focused society is really one we want to live in.

The idea of turning legal decision-making over to a computer program is superficially alluring, promising to take prejudice and emotionality out of the equation. But the notion of scientific objectivity is illusory. Computer algorithms are nowhere near as value-neutral as their proponents claim. Implicit values are involved in choosing which factors to include in a model, humans introduce scoring bias (as I have reported previously in reference to the Static-99 and the PCL-R), and even supposedly neutral factors such as zip codes that are used in crime-forecasting software are coded markers of race and class. 

But that’s just on a technical level. On a more philosophical level, the notion that scores on various risk markers should determine an individual’s fate is not only unfair, punishing the person for acts not committed, but reflects a deeply pessimistic view of humanity. People are not just bundles of unthinking synapses. They are sentient beings, capable of change.

In addition, by placing the onus for future behavior entirely on the individual, the risk-factor-as-destiny approach conveniently removes society’s responsibility for mitigating the environmental causes of crime, and negates any hope of rehabilitation.

As discussed in an illuminating article on the Circles of Support and Accountability (or COSA) movement in Canada, former criminals face a catch-22 situation in which society refuses to reintegrate them, thereby elevating their risk of remaining alienated and ultimately reoffending. Yet when surrounded by friendship and support, former offenders are far less likely to reoffend, studies show.

The hour-long RadioLab episode concludes with a segment on forgiveness, featuring the unlikely friendship that developed between an octogenarian and the criminal who sexually assaulted and strangled his daughter.

That provides a fitting ending. Because ultimately, as listener Molly G. from Maplewood, New Jersey, comments on the segment’s web page, justice is a moral and ethical construct. It’s not something that can, or should, be decided by scientists.

* * * * *

The episode is highly recommended. (Click HERE to listen online or download the podcast.)