The nine-item Patient Health Questionnaire (PHQ-9) is a depressive symptom scale and diagnostic tool introduced in 2001 to screen adult patients in primary care settings. The instrument assesses for the presence and severity of depressive symptoms and a possible depressive disorder. The PHQ-9 is a component of the larger self-administered Patient Health Questionnaire (PHQ), but can be used as a stand-alone instrument. The PHQ is part of Pfizer's larger suite of trademarked products, called the Primary Care Evaluation of Mental Disorders (PRIME-MD). The PHQ-9 takes less than three minutes to complete. It is scored by simply adding up the individual items' scores. Each of the nine items reflects a DSM-5 symptom of depression. Primary care providers can use the PHQ-9 to screen for possible depression in patients.
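The additive scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring tool: the function names are hypothetical, and the severity bands (0–4 minimal, 5–9 mild, 10–14 moderate, 15–19 moderately severe, 20–27 severe) are the commonly cited PHQ-9 cut points, assumed here rather than taken from this passage.

```python
def phq9_total(items):
    """Sum the nine PHQ-9 item scores, each 0 (not at all) to 3 (nearly every day)."""
    if len(items) != 9:
        raise ValueError("PHQ-9 has exactly nine scored items")
    if any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    return sum(items)


def phq9_severity(total):
    """Map a total score (0-27) to a severity band.

    The band boundaries are an assumption (commonly cited cut points).
    """
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 totals range from 0 to 27")
    bands = (
        (4, "minimal"),
        (9, "mild"),
        (14, "moderate"),
        (19, "moderately severe"),
        (27, "severe"),
    )
    for upper_bound, label in bands:
        if total <= upper_bound:
            return label


responses = [2, 1, 2, 1, 1, 1, 1, 1, 0]  # hypothetical patient responses
total = phq9_total(responses)
print(total, phq9_severity(total))
```

The tenth (functional impairment) question is deliberately left out of the sum, since it does not contribute to the total score.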
The PHQ-9 is the nine-item depression scale found in the 59-item PHQ. The PHQ is a self-administered version of the PRIME-MD, a screening tool that assesses 12 mental and emotional health disorders. It has modules on mood (PHQ-9), anxiety, alcohol, eating, and somatoform disorders. Robert L. Spitzer, Janet B.W. Williams, and Kurt Kroenke developed the PHQ in the mid-1990s and the PHQ-9 in 1999 with
A condition. Mathematically, this can be written as: $\text{specificity} = \frac{TN}{TN + FP}$, where TN is the number of true negatives and FP the number of false positives. A positive result in a test with high specificity can be useful for "ruling in" disease, since the test rarely gives positive results in healthy patients. A test with 100% specificity will recognize all patients without the disease by testing negative, so a positive test result would definitively rule in the presence of the disease. However,
A depression screening instrument when treating depression. Studies found the PHQ-9 is also useful for screening for depression in psychiatric clinics. Researchers have used the PHQ-9 to study the mental health of patients with diabetes, HIV-AIDS, chronic pain, arthritis, fibromyalgia, epilepsy, and substance abuse. It also is used in studies involving patients with physical disabilities as well as older adults, students, and adolescents. The PHQ-9 has been extensively used in research investigating
276-425: A disease. Each person taking the test either has or does not have the disease. The test outcome can be positive (classifying the person as having the disease) or negative (classifying the person as not having the disease). The test results for each subject may or may not match the subject's actual status. In that setting: After getting the numbers of true positives, false positives, true negatives, and false negatives,
345-417: A failure), ability to concentrate, psychomotor problems (speaking/moving slowly or fidgety/restless), and thoughts of suicide. Responses range from “0” (Not at all) to “3” (nearly every day). A tenth question asks about the extent to which the previously mentioned symptoms make functioning in daily life difficult. The response to the tenth question is not factored into the final score; however, clinicians may use
A given confidence level (e.g., 95%). In information retrieval, the positive predictive value is called precision, and sensitivity is called recall. Unlike the specificity vs. sensitivity tradeoff, these measures are both independent of the number of true negatives, which is generally unknown and much larger than the actual numbers of relevant and retrieved documents. This assumption of very large numbers of true negatives versus positives
A grant from Pfizer. A patient may take the PHQ-9 in written form or be presented the survey items in interview form. The PHQ-9 questions reflect the diagnostic criteria for major depressive disorder (MDD) found in the DSM-5. The items ask about the patient's experience in the last two weeks. Questions are about the level of interest/pleasure in doing things (anhedonia), feeling down or depressed, sleep-related problems (sleeping too much/difficulty falling or staying asleep), low energy or fatigue, eating problems (poor appetite or eating too much), self-worth (feeling like
552-418: A high false positive rate, and it does not reliably identify colorectal cancer in the overall population of asymptomatic people (PPV = 10%). On the other hand, this hypothetical test demonstrates very accurate detection of cancer-free individuals (NPV ≈ 99.5%). Therefore, when used for routine colorectal cancer screening with asymptomatic adults, a negative result supplies important data for
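The dependence of predictive values on prevalence can be made concrete with Bayes' rule. The sketch below is illustrative only: the function name is hypothetical, and the counts (20 true positives, 10 false negatives, 180 false positives, 1820 true negatives) are assumed values chosen to be consistent with the percentages quoted for the hypothetical screening test, not figures given in this text.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed counts for illustration (not from the source text).
tp, fn, fp, tn = 20, 10, 180, 1820
sens = tp / (tp + fn)                    # about 66.7%
spec = tn / (tn + fp)                    # 91%
prev = (tp + fn) / (tp + fn + fp + tn)   # about 1.5%

ppv, npv = predictive_values(sens, spec, prev)
print(f"PPV = {ppv:.1%}, NPV = {npv:.2%}")
```

Re-running the function with a higher prevalence while holding sensitivity and specificity fixed raises the PPV, which is why the same test can perform very differently in screening and in symptomatic populations.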
621-420: A lower type I error rate. The above graphical illustration is meant to show the relationship between sensitivity and specificity. The black, dotted line in the center of the graph is where the sensitivity and specificity are the same. As one moves to the left of the black dotted line, the sensitivity increases, reaching its maximum value of 100% at line A, and the specificity decreases. The sensitivity at line A
690-410: A negative result from a test with high specificity is not necessarily useful for "ruling out" disease. For example, a test that always returns a negative test result will have a specificity of 100% because specificity does not consider false negatives. A test like that would return negative for patients with the disease, making it useless for "ruling out" the disease. A test with a higher specificity has
759-554: A seven-item version to assess symptoms of anxiety (GAD-7), and a 15-item version to detect somatic symptoms (PHQ-15) have been developed and validated. The PHQ-9, GAD-7, and the PHQ-15 were combined to create the PHQ-somatic, anxiety, depressive symptoms (PHQ-SADS) and includes questions regarding panic attacks (after the GAD-7 section). Though less commonly used, there are also brief versions of
A significant minority of patients might find interpretation of the PHQ-9 difficult without support. The National Institute for Health and Clinical Excellence endorsed the PHQ-9 for measuring depression severity and responsiveness to treatment in adults in a primary care setting. The Behavioral Risk Factor Surveillance Survey (BRFSS), the National Health and Nutrition Examination Survey, the Medical Expenditure Panel Survey,
897-415: A similar format to that of the PHQ-9. Total scores range from 0 to 21 with scores of 5, 10, and 15 indicating mild, moderate, and severe anxiety. Unlike the PHQ-9, clinicians use the GAD-7 to assess the severity of anxiety only. Unlike the PHQ-9, the GAD-7 does not generate provisional diagnoses. A clinical interview must be given to arrive at a clinical diagnosis. The GAD-2 is a 2-question shortened version of
A total of 10 or above is suggestive of the presence of depression. Listed below are PHQ-9 totals, the levels of depression that they relate to, and suggested treatment for each level of depression: A provisional diagnosis of MDD can be made by using the pattern of responses to PHQ-9 items. According to the DSM-5, MDD is likely if five or more of the nine criterion symptoms are present for “most of
1035-406: A trained clinician can do that. For example, a trained clinician can determine if the symptoms can be better explained by substance use or another medical or psychiatric condition. Clinicians, however, may use the PHQ-9 to evaluate the efficacy of treatments for depression. A change of PHQ-9 score to less than 10 is considered a “partial response” to treatment and a change of PHQ-9 score to less than 5
1104-482: Is 100% because at that point there are zero false negatives, meaning that all the negative test results are true negatives. When moving to the right, the opposite applies, the specificity increases until it reaches the B line and becomes 100% and the sensitivity decreases. The specificity at line B is 100% because the number of false positives is zero at that line, meaning all the positive test results are true positives. The middle solid line in both figures above that show
Is 37 + 8 = 45, which gives a sensitivity of 37 / 45 = 82.2%. There are 40 - 8 = 32 TN. The specificity therefore comes out to 32 / 35 = 91.4%. The red dot indicates the patient with the medical condition. The red background indicates the area where the test predicts the data point to be positive. The number of true positives in this figure is 6, and the number of false negatives is 0 (because all positive condition
Is a 7-item scale designed to assess symptoms of anxiety. Each item is scored on a 0-to-3 point scale ("not at all" to "nearly every day"). Cut points of 5, 10, and 15 correspond to mild, moderate, and severe anxiety. The PHQ-8 is an eight-item scale developed specifically to screen for depression in American epidemiological populations. The Patient Health Questionnaire - Somatic, Anxiety, and Depressive Symptoms (PHQ-SADS) screens for somatic, anxiety, and depressive symptoms using PHQ-9, GAD-7, and PHQ-15, plus
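The GAD-7 cut points mentioned above (5, 10, and 15) lend themselves to a small lookup. The following is a rough sketch with a hypothetical function name; the label "minimal" for totals below 5 is an assumption, since the text only names the mild, moderate, and severe bands.

```python
def gad7_severity(items):
    """Return (total, band) for seven GAD-7 item scores, each 0 to 3."""
    if len(items) != 7:
        raise ValueError("GAD-7 has exactly seven items")
    if any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    total = sum(items)  # ranges from 0 to 21
    if total >= 15:
        band = "severe"
    elif total >= 10:
        band = "moderate"
    elif total >= 5:
        band = "mild"
    else:
        band = "minimal"  # assumed label for scores below the first cut point
    return total, band


print(gad7_severity([2, 2, 2, 2, 2, 0, 0]))
```

As with the PHQ-9, the band only indexes severity; it does not replace a clinical interview.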
Is considered to be “remission.” Kroenke, Spitzer, and Williams conducted validity and reliability research on the PHQ-9 in 2001. With regard to reliability, they found that Cronbach's alpha for the PHQ-9 was 0.89 in a sample comprising 3,000 primary care patients and 0.86 among 3,000 OB-GYN patients. However, some research suggests that the scale is not purely unidimensional, with the scale reflecting two latent factors, somatic and cognitive/affective factors. By contrast,
1380-400: Is correctly predicted as positive). Therefore, the sensitivity is 100% (from 6 / (6 + 0) ). This situation is also illustrated in the previous figure where the dotted line is at position A (the left-hand side is predicted as negative by the model, the right-hand side is predicted as positive by the model). When the dotted line, test cut-off line, is at position A, the test correctly predicts all
Is defined as the smallest amount of substance in a sample that can accurately be measured by an assay (synonymously to detection limit), and "analytical specificity" is defined as the ability of an assay to measure one particular organism or substance, rather than others. However, this article deals with diagnostic sensitivity and specificity as defined at top. Imagine a study evaluating a test that screens people for
Is defined as: $d' = \frac{\mu_S - \mu_N}{\sqrt{\frac{1}{2}\left(\sigma_S^2 + \sigma_N^2\right)}}$. An estimate of d′ can also be found from measurements of the hit rate and false-alarm rate. It is calculated as: $d' = Z(\text{hit rate}) - Z(\text{false-alarm rate})$, where function Z(p), p ∈ [0, 1], is the inverse of the cumulative Gaussian distribution. d′ is a dimensionless statistic. A higher d′ indicates that the signal can be more readily detected. The relationship between sensitivity, specificity, and similar terms can be understood using
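The estimate of d′ from the hit rate and false-alarm rate is conventionally computed as Z(hit rate) − Z(false-alarm rate), with Z the inverse of the standard normal cumulative distribution. A minimal sketch using only the Python standard library follows; the function name is hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Estimate the sensitivity index d' from a hit rate and a false-alarm rate.

    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the endpoints (inv_cdf raises an error there).
    """
    z = NormalDist().inv_cdf  # inverse cumulative standard Gaussian
    return z(hit_rate) - z(false_alarm_rate)


print(round(d_prime(0.9, 0.1), 3))
```

When the hit rate equals the false-alarm rate, the estimate is zero, which corresponds to a signal that cannot be distinguished from noise.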
Is not applicable in the present context. A sensitive test will have fewer Type II errors. Similarly to the domain of information retrieval, in the research area of gene prediction, the number of true negatives (non-genes) in genomic sequences is generally unknown and much larger than the actual number of genes (true positives). The convenient and intuitively understood term specificity in this research area has been frequently used with
Is often claimed that a highly specific test is effective at ruling in a disease when positive, while a highly sensitive test is deemed effective at ruling out a disease when negative. This has led to the widely used mnemonics SPPIN and SNNOUT, according to which a highly specific test, when positive, rules in disease (SP-P-IN), and a highly sensitive test, when negative, rules out disease (SN-N-OUT). Both rules of thumb are, however, inferentially misleading, as
Is rare in other applications. The F-score can be used as a single measure of performance of the test for the positive class. The F-score is the harmonic mean of precision and recall: $F = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$. In the traditional language of statistical hypothesis testing, the sensitivity of a test is called the statistical power of the test, although the word power in that context has a more general usage that
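As a small illustration of the harmonic-mean definition, the sketch below computes precision, recall, and the F-score from raw counts. The function name is hypothetical, and the example counts (32 true positives, 8 false positives, 3 false negatives) are borrowed from the high-sensitivity figure discussed elsewhere in this article purely for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F-score) from true/false positive and false negative counts.

    True negatives are not needed, which is why these measures suit
    settings where the number of true negatives is unknown.
    """
    if tp == 0:
        return 0.0, 0.0, 0.0  # convention chosen here to avoid division by zero
    precision = tp / (tp + fp)  # positive predictive value
    recall = tp / (tp + fn)     # sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = precision_recall_f1(tp=32, fp=8, fn=3)
print(f"precision={precision:.3f} recall={recall:.3f} F={f1:.3f}")
```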
1794-599: Is required. This version of the PHQ has been shown to have good diagnostic sensitivity but poor specificity. The Patient Health Questionnaire 4 item (PHQ-4) combines the PHQ-2 with the Generalized Anxiety Disorder 2 (GAD-2), an ultra-brief anxiety screener containing the first two questions from the Generalized Anxiety Disorder 7 (GAD-7). The Patient Health Questionnaire 15 item (PHQ-15) contains
Is the self-report version of the Primary Care Evaluation of Mental Disorders (PRIME-MD), a diagnostic tool developed in the mid-1990s by Pfizer Inc. The length of the original assessment limited its feasibility; consequently, a shorter version consisting of 11 multi-part questions, the Patient Health Questionnaire, was developed and validated. In addition to the PHQ, a nine-item version to assess symptoms of depression,
Is then 26, and the number of false positives is 0. This results in 100% specificity (from 26 / (26 + 0)). Therefore, sensitivity or specificity alone cannot be used to measure the performance of the test. In medical diagnosis, test sensitivity is the ability of a test to correctly identify those with the disease (true positive rate), whereas test specificity is the ability of the test to correctly identify those without
2001-409: Is usually a trade-off between sensitivity and specificity, such that higher sensitivities will mean lower specificities and vice versa. A test which reliably detects the presence of a condition, resulting in a high number of true positives and low number of false negatives, will have a high sensitivity. This is especially important when the consequence of failing to treat the condition is serious and/or
The gold standard four times, but a single additional test against the gold standard that gave a poor result would imply a sensitivity of only 80%. A common way to do this is to state the binomial proportion confidence interval, often calculated using a Wilson score interval. Confidence intervals for sensitivity and specificity can be calculated, giving the range of values within which the correct value lies at
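The small-sample problem described here can be seen by computing a Wilson score interval for a sensitivity of 4 out of 4. The sketch below uses the standard Wilson formula and only the Python standard library; the function name and the 95% default are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a binomial proportion such as sensitivity."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials**2)) / denom
    )
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, center - half_width), min(1.0, center + half_width)


low, high = wilson_interval(successes=4, trials=4)
print(f"observed 100% sensitivity, 95% interval roughly {low:.2f} to {high:.2f}")
```

With only four trials the lower bound sits near 0.5, so an observed sensitivity of 100% is compatible with a much weaker test.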
2139-403: The panic symptoms question from the original PHQ. The PHQ-A is a four module self-report to evaluate depression, anxiety, substance use and eating disorders in adolescent primary care patients. The PHQ-9 has been used in studies to effectively monitor change following cognitive behavioral treatment. A meta analysis stated that the PHQ-9 had good treatment sensitivity. All versions of
PHQ-9 - Misplaced Pages Continue
The primary care setting, it lacks coverage for disorders seen in psychiatric settings. Some modules are used independently, and variants have been developed based on the original items. The PHQ-9 (DEP-9 in some sources), a tool specific to depression, scores each of the 9 DSM-IV related criteria based on the mood module from the original PRIME-MD. The PHQ-9 is both sensitive and specific in its diagnoses, which has led to its prominence in
2277-420: The GAD-7 focuses on the past two weeks, and the PHQ asks about various time periods from the last two weeks to the last six months. Depending on the time period in question, this may or may not require a revision (i.e., if you are interested in depression over the last six months, you might alter the instructions), which could impact the validity of the measure. The scoring thresholds recommended are influenced by
The GAD-7; it uses the first two items on the GAD-7. A total score that is greater than 3 indicates that a clinician should administer the full GAD-7 and conduct a clinical interview to assess the presence and type of anxiety disorder.
Primary Care Evaluation of Mental Disorders
The Patient Health Questionnaire (PHQ) is a multiple-choice self-report inventory that is used as a screening and diagnostic tool for mental health disorders of depression, anxiety, alcohol, eating, and somatoform disorders. It
The National Epidemiologic Survey on Alcohol and Related Conditions, the Medicare Health Support program, and the Millennium Cohort Study use the full PHQ-9 or a shortened form of it. The Veterans Administration, Department of Defense, and Kaiser Permanente adopted the PHQ-9 as a standard measure for depression screening. The PHQ-9 is also the most commonly used depression measure in the United Kingdom's National Health Service, which requires providers to use
The PHQ are self reports and, consequently, are subject to inherent biases, including social desirability and poor retrospective recall. The influence of these biases can be mitigated by following up with a structured or semi-structured interview, the gold standard for diagnostic assessment. The time period assessed by each scale could also be a limitation; the PHQ-9 asks about the last four weeks, whereas
2553-472: The PHQ website. Both the original Patient Health Questionnaire and later variants are public domain resources; no fees or permissions are required for using or copying the measures. Additionally, the measures have been validated in a number of different populations internationally. The original Patient Health Questionnaire contains five modules; these contain questions about depressive, anxiety, somatoform, alcohol, and eating disorders. Designed for use in
2622-508: The PHQ's somatic symptom scale. It is a well-validated measure, which asks whether symptoms are present and about their severity. A brief version, the Somatic Symptom Scale - 8 was derived from PHQ-15. The development of the PHQ-15 helped address three main problems in the assessment and diagnosis of somatoform disorders. Firstly, traditional methods of diagnosing somatoform disorders would only capture about 20% of true cases due to
2691-463: The PHQ-15 account for 90% of all symptoms that providers observe in primary care settings. Patients must rate the extent to which symptoms bothered them over the last month. Responses range from "not at all" (a score of 0) to "bothered a lot" (a score of 2). Higher scores on the PHQ-15 are strongly associated with functional impairment, disability, and healthcare utilization. The GAD-7 is a seven-item anxiety screening instrument developed in 2006 with
The PHQ-2 will generally lead to the subsequent administration of the PHQ-9. The Veterans Administration uses this method to screen for depression in patients. The PHQ-8 consists of all of the PHQ-9 items except for the last question (suicidal thoughts). The 8-item version of the instrument is commonly used in research on general population samples, which mostly comprise individuals who are not depressed. Researchers generally use
The PHQ-8 because timing and resource constraints may leave researchers unable to intervene with study participants who indicate that they have experienced suicidal thoughts. The absence of the ninth question has little effect on scoring between the PHQ-8 and PHQ-9. A study found that scores between the two tests are highly correlated (r = 0.998). The PHQ-15 is a 15-item scale derived from the larger PHQ. The PHQ-15 inquires about 15 symptoms relating to somatoform disorders. The questions on
2898-421: The PHQ-9 and GAD-7 that may be useful as screening tools in some settings. In recent years, the PHQ-9 has been validated for use in adolescents, and a version for adolescents was also developed and validated (PHQ-A). Although these tests were originally designed as self-report inventories they can also be administered by trained health care practitioners. The PHQ is available in over 20 languages, available on
The PHQ-9, suggesting that those who display depression symptoms on Facebook are experiencing them offline. The Patient Health Questionnaire 2 item (PHQ-2) is an ultra-brief screening instrument containing the first two questions from the PHQ-9. The two screening questions assess the presence of a depressed mood and a loss of interest or pleasure in routine activities, and a positive response to either question indicates further testing
3036-427: The analysis (the number of exclusions should be stated when quoting sensitivity) or can be treated as false negatives (which gives the worst-case value for sensitivity and may therefore underestimate it). A test with a higher sensitivity has a lower type II error rate. Consider the example of a medical test for diagnosing a disease. Specificity refers to the test's ability to correctly reject healthy patients without
The condition are considered "positive" and those who do not are considered "negative", then sensitivity is a measure of how well a test can identify true positives and specificity is a measure of how well a test can identify true negatives: If the true status of the condition cannot be known, sensitivity and specificity can be defined relative to a "gold standard test" which is assumed correct. For all testing, both diagnoses and screening, there
The data set is equal to TP + FN, or 32 + 3 = 35. The sensitivity is therefore 32 / 35 = 91.4%. Using the same method, we get TN = 40 - 3 = 37, and the number of healthy people is 37 + 8 = 45, which results in a specificity of 37 / 45 = 82.2%. For the figure that shows low sensitivity and high specificity, there are 8 FN and 3 FP. Using the same method as the previous figure, we get TP = 40 - 3 = 37. The number of sick people
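The arithmetic in this worked example can be checked with a short script. The helper names are hypothetical; the counts are the ones stated for the high-sensitivity, low-specificity figure (3 FN, 8 FP, 40 data points on each side of the cutoff).

```python
def sensitivity(tp, fn):
    """True positive rate: share of people with the condition who test positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of people without the condition who test negative."""
    return tn / (tn + fp)


points_per_side = 40
fn, fp = 3, 8
tp = points_per_side - fp  # positive results minus false positives = 32
tn = points_per_side - fn  # negative results minus false negatives = 37

print(f"sensitivity = {sensitivity(tp, fn):.1%}")  # 32 / 35
print(f"specificity = {specificity(tn, fp):.1%}")  # 37 / 45
```

Swapping the two error counts (8 FN and 3 FP) reproduces the second figure, where the sensitivity and specificity values trade places.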
3243-457: The day, nearly every day" over the past 2 weeks; however, one of the symptoms must be either depressed mood or anhedonia (questions 1 and 2 on the PHQ-9). Any degree of suicidal thoughts counts toward a provisional diagnosis. The symptoms must also cause significant distress and loss of function. The PHQ-9 is limited to making a provisional diagnosis. It cannot be used to make an actual diagnosis. Only
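The response-pattern rule for a provisional diagnosis can be expressed as a short check. This is a hedged sketch, not a clinical tool: the function name is hypothetical, and the threshold at which an item counts as "present" (a score of 2 or more, i.e. "more than half the days") is an assumption based on the commonly published PHQ-9 algorithm rather than something this text specifies.

```python
def provisional_mdd(items, symptom_threshold=2):
    """Return True if nine PHQ-9 item scores fit the provisional MDD pattern.

    items: scores 0-3 in questionnaire order (items 1 and 2 are the
    core symptoms; item 9 is suicidal thoughts).
    symptom_threshold: assumed minimum score for items 1-8 to count as present.
    """
    if len(items) != 9:
        raise ValueError("expected nine item scores")
    # Items 1-8 count as present once they reach the threshold.
    present = [score >= symptom_threshold for score in items[:8]]
    # Item 9 counts at any degree of suicidal thoughts.
    present.append(items[8] >= 1)
    has_core_symptom = present[0] or present[1]
    return has_core_symptom and sum(present) >= 5


print(provisional_mdd([2, 2, 2, 2, 0, 0, 0, 0, 1]))
```

The check covers only the symptom count. It does not assess distress, loss of function, or alternative explanations for the symptoms, which remain the clinician's judgment.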
The diagnostic power of any test is determined by the prevalence of the condition being tested, the test's sensitivity and its specificity. The SNNOUT mnemonic has some validity when the prevalence of the condition in question is extremely low in the tested sample. The tradeoff between specificity and sensitivity is explored in ROC analysis as a trade-off between TPR and FPR (that is, recall and fallout). Giving them equal weight optimizes informedness = specificity + sensitivity − 1 = TPR − FPR,
3381-424: The disease (true negative rate). If 100 patients known to have a disease were tested, and 43 test positive, then the test has 43% sensitivity. If 100 with no disease are tested and 96 return a completely negative result, then the test has 96% specificity. Sensitivity and specificity are prevalence-independent test characteristics, as their values are intrinsic to the test and do not depend on the disease prevalence in
3450-467: The disease. A test with 100% sensitivity will recognize all patients with the disease by testing positive. In this case, a negative test result would definitively rule out the presence of the disease in a patient. However, a positive result in a test with high sensitivity is not necessarily useful for "ruling in" disease. Suppose a 'bogus' test kit is designed to always give a positive reading. When used on diseased patients, all patients test positive, giving
The example of a medical test for diagnosing a condition. Sensitivity (sometimes also named the detection rate in a clinical setting) refers to the test's ability to correctly detect ill patients out of those who do have the condition. Mathematically, this can be expressed as: $\text{sensitivity} = \frac{TP}{TP + FN}$, where TP is the number of true positives and FN the number of false negatives. A negative result in a test with high sensitivity can be useful for "ruling out" disease, since it rarely misdiagnoses those who do have
The following table. Consider a group with P positive instances and N negative instances of some condition. The four outcomes can be formulated in a 2×2 contingency table or confusion matrix, as well as derivations of several metrics using the four outcomes, as follows: Related calculations This hypothetical screening test (fecal occult blood test) correctly identified two-thirds (66.7%) of patients with colorectal cancer. Unfortunately, factoring in prevalence rates reveals that this hypothetical test has
The level of sensitivity and specificity is the test cutoff point. As previously described, moving this line results in a trade-off between the level of sensitivity and specificity. The left-hand side of this line contains the data points that test below the cutoff point and are considered negative (the blue dots indicate the False Negatives (FN), the white dots True Negatives (TN)). The right-hand side of
The line shows the data points that test above the cutoff point and are considered positive (red dots indicate False Positives (FP)). Each side contains 40 data points. For the figure that shows high sensitivity and low specificity, there are 3 FN and 8 FP. Using the fact that positive results = true positives (TP) + FP, we get TP = positive results - FP, or TP = 40 - 8 = 32. The number of sick people in
The magnitude of which gives the probability of an informed decision between the two classes (> 0 represents appropriate use of information, 0 represents chance-level performance, < 0 represents perverse use of information). The sensitivity index or d′ (pronounced "dee-prime") is a statistic used in signal detection theory. It provides the separation between the means of
3864-469: The number of symptoms required to meet a diagnosis. Secondly, in order to attain more reliable and valid data, assessments need to address more current rather than previous symptoms. Thirdly, continuing to adhere to the "medically unexplained" requirement for symptoms makes it very difficult to make a diagnosis because it is extremely hard to ascertain if a symptom is or is not part of a larger medical condition (ex: chronic fatigue and depression). The GAD-7
3933-433: The patient and doctor, such as ruling out cancer as the cause of gastrointestinal symptoms or reassuring patients worried about developing colorectal cancer. Sensitivity and specificity values alone may be highly misleading. The 'worst-case' sensitivity or specificity must be calculated in order to avoid reliance on experiments with few results. For example, a particular test may easily show 100% sensitivity if tested against
The population of interest. Positive and negative predictive values, but not sensitivity or specificity, are values influenced by the prevalence of disease in the population that is being tested. These concepts are illustrated graphically in this applet Bayesian clinical diagnostic model which shows the positive and negative predictive values as a function of the prevalence, sensitivity and specificity. It
The population of the true positive class, but it will fail to correctly identify the data point from the true negative class. Similar to the previously explained figure, the red dot indicates the patient with the medical condition. However, in this case, the green background indicates that the test predicts that all patients are free of the medical condition. The number of data points that are true negatives
The primary care setting. This tool is used in a variety of different contexts, including clinical settings across the United States as well as research studies. One study that used the PHQ-9 examined whether college students' displays of depression symptoms on Facebook were representative of offline symptoms. Results demonstrated that those who displayed depression symptoms on Facebook scored higher on
4209-399: The relationship between burnout and depression. The instrument is available in over 30 languages and may be valid for use in different ethnic groups. Pfizer owns the copyright of the PHQ-9 and allows it to be accessed for free. The PHQ-2 is a shortened version of the PHQ-9. It contains the first 2 questions of the PHQ-9 and takes less than a minute to administer. A score of 3 or greater on
The response to help gauge the patient's level of impairment. A massive study of almost 60,000 participants (involving 29 samples from seven countries and speaking five languages) that employed exploratory structural equation modeling bifactor analysis showed the PHQ-9 is essentially unidimensional; cognitive-affective and somatic specific factors were relatively weak. The total sum of the responses roughly indexes levels of depression. Scores range from 0 to 27. In general,
4347-425: The results of the massive study by Bianchi et al. (2022) indicate that the PHQ-9's total score is essentially unidimensional. The test-retest reliability was found to be excellent. The correlation between PHQ-9 scores obtained from in-person and phone interviews with the same patients was 0.84. The PHQ-9 showed acceptable psychometric properties in a rural Indian population. In general, psychometric research supports
The samples in which they were validated and correspond with different levels of sensitivity and specificity, which may or may not match well with the intended use of the scale.
Sensitivity and specificity
In medicine and statistics, sensitivity and specificity mathematically describe the accuracy of a test that reports the presence or absence of a medical condition. If individuals who have
4485-418: The sensitivity and specificity for the test can be calculated. If it turns out that the sensitivity is high then any person who has the disease is likely to be classified as positive by the test. On the other hand, if the specificity is high, any person who does not have the disease is likely to be classified as negative by the test. An NIH web site has a discussion of how these ratios are calculated. Consider
The signal and the noise distributions, compared against the standard deviation of the noise distribution. For normally distributed signal and noise with mean and standard deviations $\mu_S$ and $\sigma_S$, and $\mu_N$ and $\sigma_N$, respectively, d′
4623-428: The test 100% sensitivity. However, sensitivity does not take into account false positives. The bogus test also returns positive on all healthy patients, giving it a false positive rate of 100%, rendering it useless for detecting or "ruling in" the disease. The calculation of sensitivity does not take into account indeterminate test results. If a test cannot be repeated, indeterminate samples either should be excluded from
The treatment is very effective and has minimal side effects. A test which reliably excludes individuals who do not have the condition, resulting in a high number of true negatives and low number of false positives, will have a high specificity. This is especially important when people who are identified as having a condition may be subjected to more testing, expense, stigma, anxiety, etc. The terms "sensitivity" and "specificity" were introduced by American biostatistician Jacob Yerushalmy in 1947. There are different definitions within laboratory quality control, wherein "analytical sensitivity"
The use of total scores, i.e., summing the item scores, in research and practice. In an assessment of construct validity, Kroenke et al. found that the correlation between the PHQ-9 and the SF-20 mental health scale was 0.73. To assess criterion validity, a mental health professional validated depression diagnoses from PHQ-9 scores from 580 participants, resulting in 88% sensitivity and 88% specificity. Preliminary work using gold standard readability measures suggests that