The PHQ-9: A Clinician's Complete Guide to Depression Screening

The Patient Health Questionnaire-9 (PHQ-9) has become one of the most widely used depression screening tools in clinical practice, and for good reason. It's brief, validated across dozens of populations, and maps directly onto the DSM-5 symptom criteria for major depressive disorder.

But familiarity breeds complacency. Many clinicians administer the PHQ-9 routinely without drawing on everything it can tell them. This guide goes beyond the basics to help you get more clinical value from every administration.

What the PHQ-9 Measures

The PHQ-9 asks patients to rate nine symptoms of depression over the past two weeks, each scored 0–3 (not at all, several days, more than half the days, nearly every day). The nine items correspond directly to the nine DSM-5 criteria for major depressive disorder:

1. Anhedonia (little interest or pleasure)
2. Depressed mood
3. Sleep disturbance
4. Fatigue or low energy
5. Appetite changes
6. Feelings of failure or guilt
7. Concentration difficulties
8. Psychomotor changes
9. Suicidal ideation

Total scores range from 0 to 27.

Interpreting Severity Levels

The standard severity thresholds are:

0–4: Minimal depression
5–9: Mild depression
10–14: Moderate depression
15–19: Moderately severe depression
20–27: Severe depression

A score of 10 or above is generally considered the threshold for clinically significant depression. In the original validation study, this cutoff had a sensitivity of 88% and a specificity of 88% for major depressive disorder.

However, these cutoffs are guidelines, not diagnoses. A patient scoring 8 who was at 2 last month deserves clinical attention. A patient consistently scoring 12 who functions well at work may need a different conversation than someone newly scoring 12 after a major life event.
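The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not a clinical tool: the function name score_phq9, the SEVERITY_BANDS table, and the returned dictionary keys are assumptions made for this example, and the bands simply restate the thresholds listed in this section.

```python
from typing import Sequence

# Severity bands as listed in this guide: (lowest score, highest score, label).
SEVERITY_BANDS = [
    (0, 4, "minimal"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
]

# Conventional threshold for a positive screen, as described above.
SCREENING_CUTOFF = 10


def score_phq9(responses: Sequence[int]) -> dict:
    """Sum nine item responses (each 0-3) and look up the severity band."""
    if len(responses) != 9:
        raise ValueError("the PHQ-9 needs exactly nine item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2, or 3")
    total = sum(responses)
    severity = next(
        label for low, high, label in SEVERITY_BANDS if low <= total <= high
    )
    return {
        "total": total,
        "severity": severity,
        "positive_screen": total >= SCREENING_CUTOFF,
    }


# Hypothetical responses to items 1-9, in order.
result = score_phq9([2, 2, 2, 2, 1, 1, 1, 1, 0])
print(result["total"], result["severity"], result["positive_screen"])
```

The output is a starting point for the clinical conversation described above; the band label alone says nothing about trajectory or context.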

Beyond the Total Score: Item-Level Analysis

The total score is useful for tracking, but item-level analysis reveals far more about your patient's experience.

Item 9 (suicidal ideation) demands immediate attention regardless of total score. A patient scoring 5 overall but endorsing item 9 at any level warrants a safety assessment. Never let a low total score overshadow this item.
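In a digital workflow, the item 9 rule can be checked independently of the total. The sketch below assumes responses are stored as a list of nine integers in item order; the name needs_safety_assessment is hypothetical and chosen only for this example.

```python
from typing import Sequence

# Item 9 (suicidal ideation) sits at index 8 in a zero-based list of nine responses.
ITEM_9_INDEX = 8


def needs_safety_assessment(responses: Sequence[int]) -> bool:
    """Return True if item 9 is endorsed at any level, whatever the total score."""
    if len(responses) != 9:
        raise ValueError("the PHQ-9 needs exactly nine item responses")
    return responses[ITEM_9_INDEX] > 0


# Hypothetical patient with a total of 5 who endorses item 9 at "several days".
responses = [1, 1, 1, 1, 0, 0, 0, 0, 1]
print(sum(responses), needs_safety_assessment(responses))
```

Keeping this check separate from the severity lookup means a low total can never suppress the flag.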

Sleep and energy items (3 and 4) are often the first to elevate and last to resolve. If these remain high while mood items improve, consider whether a sleep-specific intervention is needed.

Anhedonia (item 1) is increasingly recognized as a distinct dimension of depression with different neurobiological underpinnings than sad mood. Persistent anhedonia despite improving mood scores may signal the need for different therapeutic approaches. Behavioral activation is particularly effective here.

Psychomotor changes (item 8) are the least reliably self-reported item. Patients often struggle to identify these changes in themselves. If a patient endorses this item, it may be worth exploring what they're actually noticing.

Using the PHQ-9 for Measurement-Based Care

The real power of the PHQ-9 emerges through repeated measurement. Administering it at every session (or every other session) creates a trajectory that informs treatment decisions far better than clinical impression alone.

What constitutes meaningful change? A reduction of 5 or more points is generally considered a reliable change (beyond measurement error). A 50% reduction from baseline is typically considered treatment response. Remission is usually defined as a score below 5.
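These three definitions are simple arithmetic on two total scores, and a short sketch makes the differences between them concrete. The function name classify_change and its output keys are assumptions for illustration; the thresholds are the ones stated in the paragraph above.

```python
def classify_change(baseline: int, current: int) -> dict:
    """Compare two PHQ-9 totals against the change definitions in this guide."""
    for score in (baseline, current):
        if not 0 <= score <= 27:
            raise ValueError("PHQ-9 totals range from 0 to 27")
    point_drop = baseline - current
    return {
        "point_drop": point_drop,
        # A drop of 5 or more points: reliable change.
        "reliable_improvement": point_drop >= 5,
        # A reduction of at least 50% from baseline: treatment response.
        "response": baseline > 0 and current <= baseline / 2,
        # A current score below 5: remission.
        "remission": current < 5,
    }


# Hypothetical patient: baseline 18, now 8.
print(classify_change(18, 8))
```

In this hypothetical case the patient shows reliable improvement and treatment response but is not yet in remission, which is why it helps to report all three rather than a single "improved" label.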

When to adjust treatment: Research consistently shows that clinicians overestimate patient improvement. If the PHQ-9 score hasn't dropped by at least 25% after 6 weeks of treatment, the evidence suggests it's time to reassess your approach: adjusting medication, switching therapeutic modalities, or exploring barriers to engagement.
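The six-week rule of thumb above can be expressed as a small check. This is a sketch under the assumptions stated in this guide (a 25% threshold at 6 weeks); the function name should_reassess and its parameters are hypothetical, and the result is a prompt for review, not a treatment decision.

```python
def should_reassess(baseline: int, current: int, weeks_in_treatment: float) -> bool:
    """True when 6 or more weeks have passed and the score has fallen by less than 25%."""
    if baseline <= 0:
        return False  # no baseline symptoms to measure improvement against
    if weeks_in_treatment < 6:
        return False  # too early to apply the rule
    percent_drop = (baseline - current) / baseline * 100
    return percent_drop < 25


# Hypothetical patient: baseline 16, now 14 after 6 weeks (a 12.5% drop).
print(should_reassess(16, 14, 6))
```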

Discussing scores with patients: Sharing PHQ-9 results with patients isn't just good practice; it's therapeutic. Patients who see their scores declining report greater hope and treatment engagement. When scores plateau or increase, framing this as useful clinical information (rather than failure) maintains alliance while motivating treatment adjustment.

Common Pitfalls

Relying solely on the total score. Two patients can both score 15 with completely different symptom profiles. The patient scoring high on cognitive items (guilt, concentration, suicidality) may need a different approach than the patient scoring high on somatic items (sleep, appetite, fatigue, psychomotor).
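To see how two identical totals can hide different profiles, the items can be grouped as in the paragraph above. The grouping below follows this guide's informal cognitive and somatic split and is not a validated subscale; the names COGNITIVE_ITEMS, SOMATIC_ITEMS, and symptom_profile are assumptions for this example.

```python
from typing import Sequence

# Item numbers (1-9) grouped as in the paragraph above; an illustrative split only.
COGNITIVE_ITEMS = (6, 7, 9)   # guilt, concentration, suicidality
SOMATIC_ITEMS = (3, 4, 5, 8)  # sleep, fatigue, appetite, psychomotor


def symptom_profile(responses: Sequence[int]) -> dict:
    """Return the total plus subtotals for the two illustrative item groups."""
    if len(responses) != 9:
        raise ValueError("the PHQ-9 needs exactly nine item responses")

    def subtotal(items):
        # Item numbers are 1-based; list indices are 0-based.
        return sum(responses[item - 1] for item in items)

    return {
        "total": sum(responses),
        "cognitive": subtotal(COGNITIVE_ITEMS),
        "somatic": subtotal(SOMATIC_ITEMS),
    }


# Two hypothetical patients who both score 15.
patient_a = [2, 2, 0, 1, 1, 3, 3, 0, 3]
patient_b = [2, 2, 3, 3, 3, 1, 0, 1, 0]
print(symptom_profile(patient_a))
print(symptom_profile(patient_b))
```

Both hypothetical patients total 15, but the first carries most of the score on the cognitive items and the second on the somatic items.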

Administering too infrequently. Annual screening catches fewer cases than you'd think. Depression fluctuates. Routine monitoring, ideally at every visit, captures episodes that annual screening misses.

Ignoring the "functional impairment" question. The PHQ-9 includes a final question about how much symptoms have made it difficult to function. This isn't scored but provides important context. A patient scoring 10 who reports severe functional impairment may need more urgent intervention than a patient scoring 15 with mild impairment.

Using it as a diagnostic tool. The PHQ-9 is a screening and severity measure, not a diagnostic instrument. A high score warrants clinical assessment for depression, but the diagnosis requires clinical judgment about duration, functional impairment, differential diagnoses, and context.

When to Use Something Else

The PHQ-9 is excellent for unipolar depression screening and monitoring, but it has blind spots. Consider supplementing or substituting when:

Bipolar disorder is suspected: The PHQ-9 doesn't screen for mania. Pair it with the Mood Disorder Questionnaire (MDQ) or a clinical interview.
Anxiety is prominent: Depression and anxiety are highly comorbid. Adding the GAD-7 gives a more complete picture with minimal additional burden.
Somatic symptoms dominate: The PHQ-15 can help distinguish somatization from depression with prominent somatic features.
Broader distress measurement is needed: The DASS-21 captures depression, anxiety, and stress in a single instrument.

The PHQ-9 in Digital Assessment

Administering the PHQ-9 digitally offers several advantages over paper. Automated scoring eliminates calculation errors (which occur in roughly 1 in 5 hand-scored administrations). Digital tracking makes trajectory visualization effortless. And patients tend to disclose more on digital instruments than face-to-face, particularly on sensitive items like suicidal ideation.

The key is making digital assessment frictionless for patients. An anonymous code-based system, where patients simply enter a short code and complete the assessment on their own device, removes barriers that paper forms and patient portals create.

Key Takeaways

The PHQ-9 remains a standard choice for depression screening and monitoring because it's brief, reliable, valid, and clinically actionable. Getting the most from it means looking beyond the total score, tracking change over time, discussing results collaboratively with patients, and knowing when to supplement it with other measures.

Routine, repeated measurement is where the PHQ-9 delivers its greatest clinical value. A single score is a snapshot. A series of scores is a story, and that story guides better treatment decisions.