Item Response Theory and Other Psychometric Issues


 

11/25/2022

Patient Experience Surveys Are Not Harming Patients or Physicians.
Ron D. Hays, Marc N. Elliott, Paul D. Cleary.

 

3/14/2021

Poor Scientific Quality of Bido et al. paper in Journal of Shoulder and Elbow Surgery

https://pubmed.ncbi.nlm.nih.gov/33220418/


8/27/2020

The PROMIS convention for recoding the 0-10 global pain item into five categories is based on the grouping of the 0-10 response scales for the Sheehan Disability Scale and the Flushing Questionnaire:
 

10  = 5 (worst pain)
7-9 = 4
4-6 = 3
1-3 = 2
0   = 1 (no pain)

 

Positive direction of scoring would be:
 

 

10  = 1 (worst pain)
7-9 = 2
4-6 = 3
1-3 = 4
0   = 5 (no pain)
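A minimal sketch of the recoding above, in Python. The function name recode_pain and the positive flag are illustrative assumptions, not part of any PROMIS software.

```python
def recode_pain(rating: int, positive: bool = False) -> int:
    """Recode a 0-10 pain rating into the five categories listed above."""
    if not 0 <= rating <= 10:
        raise ValueError("rating must be an integer from 0 to 10")
    if rating == 0:
        category = 1          # no pain
    elif rating <= 3:
        category = 2
    elif rating <= 6:
        category = 3
    elif rating <= 9:
        category = 4
    else:
        category = 5          # worst pain
    # Positive scoring reverses the direction so that 5 = no pain.
    return 6 - category if positive else category


print([recode_pain(r) for r in range(11)])
print([recode_pain(r, positive=True) for r in range(11)])
```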

 

12/7/2021: Cut-points for Global Physical and Global Mental

 

Interpreting PROMIS T-scores for:

           Excellent  Very Good  Good  Fair  Poor
Physical   61         54         47    39    32
Mental     60         54         48    43    37

 

The cut points or thresholds for PROMIS Global Physical and Mental score categories of excellent, very good, good, fair, and poor were constructed by 1) creating groups based upon responses to Global01 "In general, would you say your health is excellent, very good, good, fair, or poor?", 2) calculating mean scores for each group, and 3) identifying the midpoint between two adjacent means.  For example, the mean Global Mental score for "Excellent" was 60 and the mean score for "Very Good" was 54.  The midpoint between these scores is 57.
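The midpoint step can be sketched in Python as follows. The inputs are the rounded group means from the table above, so the results are illustrative only and may differ from cut points computed from unrounded means. The function name midpoints is an assumption.

```python
def midpoints(means):
    """Return the midpoint between each pair of adjacent group means."""
    return [(a + b) / 2 for a, b in zip(means, means[1:])]


# Rounded group means, ordered Excellent, Very Good, Good, Fair, Poor.
physical_means = [61, 54, 47, 39, 32]
mental_means = [60, 54, 48, 43, 37]

print(midpoints(physical_means))
print(midpoints(mental_means))
```

The first Mental midpoint reproduces the worked example (60 and 54 give 57).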

Cut points are:


For Gender and Age Subgroup Norms Centered on the US 2000 Census see:

https://www.healthmeasures.net/score-and-interpret/interpret-scores/promis/reference-populations

 

3/16/2022: update with addition of T-scores for PROMIS Global Mental score categories based upon Global02 "overall quality of life":
 

 

         Excellent  Very Good  Good  Fair  Poor
Mental   61         53         45    38    30

 

 

 

12/10/2019 How did this paper get published in Quality of Life Research?
 

Olson, B, Vincent, W., Meyer, J. P., Kershaw, T., Sikkema, K. J., Heckman, T. G., & Hansen, N. B. (2019).  Depressive symptoms, physical symptoms, and health-related quality of life among older adults with HIV. Quality of Life Research, 28, 3313-3322.

Depressive symptoms are indicators of health-related quality of life.pdf

 

9/30/2019: Linear transformation to 0-100 possible range and back


Note: Y = score on the 0-100 possible range, and X = score on the original (not 0-100) possible range

Y = 100 * (X – Xmin.possible)/(Xmax.possible – Xmin.possible)

X = Xmin.possible + (Y * (Xmax.possible – Xmin.possible)/100)
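A short Python sketch of the transformation and its inverse, with a round-trip check. The function names and the 1-5 example range are assumptions for illustration. The back-transformation adds Xmin.possible to the rescaled Y.

```python
def to_0_100(x, x_min, x_max):
    """Rescale x from its possible range [x_min, x_max] to 0-100."""
    return 100 * (x - x_min) / (x_max - x_min)


def from_0_100(y, x_min, x_max):
    """Map a 0-100 score back to the original possible range."""
    return x_min + y * (x_max - x_min) / 100


# Hypothetical item scored 1-5.
y = to_0_100(3, 1, 5)
x = from_0_100(y, 1, 5)
print(y, x)
```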

 

4/5/2019

Ron Hays: PROMIS: The NIH Patient-Reported Outcomes Measurement System.  Based on a presentation at Retina International's November 2016 Interdisciplinary Workshop addressing the topic "Functional Vision versus Visual Function - Working towards integrating the Patient Perspective."

 

9/4/2018

 

In what sense does IRT yield interval-level measurement?

Reise and Haviland (2005, Journal of Personality Assessment, 84(3)) note that:
"There is a linear relation between trait levels and the log odds of endorsing an item.  It is in this sense that the IRT metric provides an interval interpretation.
Equal changes on the latent trait result in equal changes in the log odds of item endorsement regardless of the level of the latent trait" (p.235).
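To illustrate the quoted point, here is a small Python sketch assuming a two-parameter logistic model for a dichotomous item. The slope and location values are hypothetical. A one-unit change in theta shifts the log odds by the slope whether it happens low or high on the trait.

```python
import math


def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item with slope a and location b."""
    return 1 / (1 + math.exp(-a * (theta - b)))


def log_odds(theta, a, b):
    """Log odds of endorsement at trait level theta."""
    p = prob_endorse(theta, a, b)
    return math.log(p / (1 - p))


a, b = 1.5, 0.0  # hypothetical item parameters

# One-unit change at the low end and at the high end of the trait.
change_low = log_odds(-1.0, a, b) - log_odds(-2.0, a, b)
change_high = log_odds(2.0, a, b) - log_odds(1.0, a, b)
print(change_low, change_high)

# The same one-unit changes give unequal changes in probability.
print(prob_endorse(-1.0, a, b) - prob_endorse(-2.0, a, b))
print(prob_endorse(0.5, a, b) - prob_endorse(-0.5, a, b))
```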


9/18/2017

Hays, Ron D., & Spritzer, Karen L. (September 18, 2017 - updated). Estimating theta using existing item parameters with flexMIRT® software.


2/25/2016

Estimating reliability from CAT


12/18/2014

SAS® PROC IRT Example
(promisgph.pdf results)
Item Response Theory: What It Is and How You Can Use the IRT Procedure to Apply It - Xinming An and Yiu-Fai Yung, SAS Institute, Paper SAS364-2014


3/12/2014

Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika, 75, 800-802.
Background on computations: hochberg.doc

SAS code to compute the Hochberg adjustment:
example 1: hochberg.sas, hochberg.lst
example 2: hochberg2.sas, hochberg.lst

Stata has a module that can implement the Hochberg adjustment (install multproc):
example 1: hochberg.log
example 2: hochberg2.log
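For readers without SAS or Stata, a minimal Python sketch of Hochberg step-up adjusted p-values follows. It is not a translation of the hochberg.sas files listed above. The function name and example p-values are assumptions. The k-th largest p-value is multiplied by k, and a running minimum keeps the adjusted values monotone.

```python
def hochberg_adjust(p_values):
    """Return Hochberg step-up adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values ordered from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, idx in enumerate(order, start=1):
        # The k-th largest p-value is multiplied by k (k = 1 for the largest).
        running_min = min(running_min, p_values[idx] * k)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values from three tests.
print(hochberg_adjust([0.01, 0.04, 0.03]))
```

A test is declared significant when its adjusted p-value is at or below the chosen alpha.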


8/20/2013

Cohen's rules of thumb for correlations that correspond to effect sizes of 0.20 SD, 0.50 SD and 0.80 SD are as follows:

0.100 is small correlation
0.243 is medium correlation
0.371 is large correlation

r = d /SQRT((d*d) + 4)    e.g., 0.8/SQRT((0.8*0.8)+4) =0.371

Note, however, that r's of 0.10, 0.30 and 0.50 are often cited as small, medium and large, respectively.

Cohen's d index
d = (2*r)/SQRT(1-(r*r))
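The two conversions can be checked with a few lines of Python. The function names are assumptions. The conversion from d to r reproduces the three correlations listed above.

```python
import math


def r_from_d(d):
    """Convert Cohen's d to a correlation: r = d / sqrt(d^2 + 4)."""
    return d / math.sqrt(d * d + 4)


def d_from_r(r):
    """Convert a correlation to Cohen's d: d = 2r / sqrt(1 - r^2)."""
    return 2 * r / math.sqrt(1 - r * r)


for d in (0.2, 0.5, 0.8):
    r = r_from_d(d)
    # Converting back recovers the original d.
    print(d, round(r, 3), round(d_from_r(r), 3))
```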

Effect size calculators: http://www.polyu.edu.hk/mm/effectsizefaqs/calculator/calculator.html

More about effect sizes: http://effectsizefaq.com/


6/23/2011

Linear transformation of item parameters (using, e.g., Stocking-Lord transformation constants):

Transformed slope = Slope/Slope transformation constant

Transformed thresholds are: (Threshold * Slope transformation constant) + Intercept
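A Python sketch of these two rules, with hypothetical constants and item parameters. The function and argument names are assumptions. The check at the end shows why the rules go together: when theta is rescaled with the same constants, slope * (theta - threshold) is unchanged.

```python
def transform_item(slope, thresholds, slope_constant, intercept):
    """Apply linear transformation constants to one item's parameters."""
    new_slope = slope / slope_constant
    new_thresholds = [b * slope_constant + intercept for b in thresholds]
    return new_slope, new_thresholds


# Hypothetical item and transformation constants.
slope, thresholds = 2.0, [-1.0, 0.5]
A, B = 2.0, 1.0
new_slope, new_thresholds = transform_item(slope, thresholds, A, B)
print(new_slope, new_thresholds)

# Invariance check: theta on the new metric is A * theta + B.
theta = 0.3
new_theta = A * theta + B
print(slope * (theta - thresholds[0]), new_slope * (new_theta - new_thresholds[0]))
```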


10/8/2010
 

Information  Reliability  SE
10           0.90         0.32
6.7          0.85         0.39
5            0.80         0.45

Note: SE = standard error. Calculations are for the z-score metric and ML estimation.

Formulas:
Information = 1/(1-reliability) = 1/SE**2

Reliability = (INF-1)/INF = 1 - SE**2

SE = 1/SQRT(INF) = SQRT(1-Reliability)
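The formulas can be expressed as small Python helpers. The function names are assumptions, and the z-score metric is assumed as in the note above. The loop prints the information and SE for each reliability in the table.

```python
import math


def information_from_reliability(reliability):
    """Information = 1 / (1 - reliability)."""
    return 1 / (1 - reliability)


def reliability_from_information(information):
    """Reliability = (information - 1) / information."""
    return (information - 1) / information


def se_from_information(information):
    """SE = 1 / sqrt(information)."""
    return 1 / math.sqrt(information)


for reliability in (0.90, 0.85, 0.80):
    info = information_from_reliability(reliability)
    print(round(info, 1), reliability, round(se_from_information(info), 2))
```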


7/27/2009

Reeve et al. (2007) in Medical Care provided the following guidelines for good fit to a one-factor model (for evaluation of unidimensionality assumption):
CFI > 0.95
RMSEA < 0.06
SRMR < 0.08
Average absolute residual correlation < 0.10
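A small Python helper for checking a fitted one-factor model against these guidelines. The function name, argument names, and example values are assumptions. The fit indices themselves would come from whatever CFA software is used.

```python
def check_fit_guidelines(cfi, rmsea, srmr, avg_abs_resid_corr):
    """Return which of the four guidelines above are met."""
    return {
        "CFI > 0.95": cfi > 0.95,
        "RMSEA < 0.06": rmsea < 0.06,
        "SRMR < 0.08": srmr < 0.08,
        "Average absolute residual correlation < 0.10": avg_abs_resid_corr < 0.10,
    }


# Hypothetical fit indices from a one-factor model.
for guideline, met in check_fit_guidelines(0.97, 0.07, 0.04, 0.05).items():
    print(guideline, "met" if met else "not met")
```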

 

6/22/2009


Summary of steps to produce raw score conversion to theta estimates for PROMIS global mental health items (6/22/2009) - Karen Spritzer with assistance from Ron D. Hays
Summary of steps to produce raw score conversion to theta estimates for PROMIS global physical health items (6/19/2009)  - Karen Spritzer with assistance from Ron D. Hays
The authors are eternally grateful to Seung Choi for his expertise and guidance.


6/11/2009

PPV = positive predictive value = (sensitivity)(prevalence) / [(sensitivity)(prevalence) + (1-specificity)(1-prevalence)]

NPV = negative predictive value = (specificity)(1-prevalence) / [(specificity)(1-prevalence) + (1-sensitivity)(prevalence)]
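A Python sketch of both formulas. The function names and the example values (sensitivity 0.90, specificity 0.90, prevalence 0.10) are hypothetical.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity, prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from sensitivity, specificity, prevalence."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


print(round(ppv(0.90, 0.90, 0.10), 3))
print(round(npv(0.90, 0.90, 0.10), 3))
```

With a low prevalence, PPV can be modest even when sensitivity and specificity are both high.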


Rasch Model infit and outfit mean square statistics (4/6/2009)
The infit statistic is sensitive to unexpected responses to items near a person’s ability level. The outfit statistic is sensitive to unexpected responses to items far from a person’s ability level. Poor item fit has been defined as infit or outfit < 0.6 or > 1.4.
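A minimal Python sketch of the usual mean square formulas for one dichotomous item, assuming the Rasch model probabilities for each person have already been estimated. The function names and example data are hypothetical. Outfit is the unweighted mean of squared standardized residuals. Infit weights the squared residuals by the response variance.

```python
def infit_outfit(responses, expected):
    """Return (infit, outfit) mean squares for one item.

    responses: observed 0/1 responses, one per person
    expected:  model probabilities of a 1 response, one per person
    """
    sq_resid = [(x - p) ** 2 for x, p in zip(responses, expected)]
    variances = [p * (1 - p) for p in expected]
    # Infit: information-weighted, dominated by persons near the item.
    infit = sum(sq_resid) / sum(variances)
    # Outfit: unweighted, so off-target surprises count heavily.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(responses)
    return infit, outfit


def poor_fit(mean_square, lower=0.6, upper=1.4):
    """Flag a mean square outside the 0.6-1.4 range noted above."""
    return mean_square < lower or mean_square > upper


# Hypothetical data: the last person was expected to fail but succeeded.
infit, outfit = infit_outfit([1, 0, 1, 1], [0.8, 0.3, 0.6, 0.05])
print(round(infit, 2), poor_fit(infit))
print(round(outfit, 2), poor_fit(outfit))
```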