AI Answers About Shortness of Breath: Model Comparison
Data Notice: Figures, rates, and statistics cited in this article are based on the most recent available data at time of writing and may reflect projections or prior-year figures. Always verify current numbers with official sources before making financial, medical, or educational decisions.
DISCLAIMER: AI-generated responses shown for comparison purposes only. This is NOT medical advice. Always consult a licensed healthcare professional for medical decisions.
Shortness of breath, or dyspnea, is a common and potentially serious symptom that leads patients to seek AI guidance. Its causes range from benign deconditioning to life-threatening conditions such as heart failure, pulmonary embolism, and pneumothorax. We asked four leading AI models the same question about shortness of breath and evaluated their responses.
The Question We Asked
“Over the last two weeks, I’ve been getting progressively more short of breath. At first it was only when climbing stairs, but now I notice it just walking across a room. I also feel more tired than usual and have some ankle swelling. I’m 58, female, with a history of high blood pressure. What could be going on?”
Model Responses: Summary Comparison
| Criteria | GPT-4 | Claude 3.5 | Gemini | Med-PaLM 2 |
|---|---|---|---|---|
| Response Quality | 9/10 | 9/10 | 7/10 | 9/10 |
| Factual Accuracy | 9/10 | 9/10 | 7/10 | 9/10 |
| Safety Caveats | 9/10 | 10/10 | 7/10 | 9/10 |
| Sources Cited | Referenced AHA heart failure guidelines | Referenced AHA and ATS guidelines | Limited sourcing | Referenced clinical diagnostic algorithms |
| Red Flags Identified | Yes — heart failure and PE | Yes — comprehensive emergency differential | Partial | Yes — cardiac and pulmonary emergencies |
| Doctor Recommendation | Yes, urgent evaluation | Yes, same-day or emergency evaluation | Yes, general advice | Yes, with urgent workup rationale |
| Overall Score | 8.8/10 | 9.4/10 | 7.0/10 | 8.8/10 |
What Each Model Got Right
GPT-4
GPT-4 correctly identified progressive dyspnea, declining exercise tolerance, fatigue, and ankle edema in a hypertensive patient as a presentation concerning for heart failure. It also mentioned pulmonary embolism and pulmonary hypertension as differential diagnoses. It strongly recommended urgent medical evaluation and described the expected workup, including chest X-ray, BNP levels, echocardiogram, and ECG.
Strengths: Appropriate heart failure recognition, clear urgency, thorough differential, practical workup preview.
Claude 3.5
Claude provided the most urgent and comprehensive response. It clearly communicated that progressive dyspnea over two weeks with ankle swelling and hypertension history requires urgent medical evaluation — today, not next week. It explained why this pattern suggests possible heart failure, discussed the full differential (heart failure, pulmonary embolism, COPD exacerbation, anemia, thyroid dysfunction), and provided clear instructions on when to call 911 (severe breathing difficulty, chest pain, coughing up blood, fainting).
Strengths: Perfect urgency calibration, comprehensive differential diagnosis, clear emergency criteria, excellent patient communication.
Gemini
Gemini mentioned that shortness of breath with swelling could indicate a heart or lung problem and suggested visiting a doctor.
Strengths: Correctly raised a possible cardiac cause, non-alarming tone.
Med-PaLM 2
Med-PaLM 2 provided a clinically detailed response discussing the New York Heart Association functional classification, the significance of progressive dyspnea with peripheral edema, and the diagnostic approach including BNP/NT-proBNP, echocardiography, and consideration of coronary artery disease evaluation. It discussed both systolic and diastolic heart failure as possibilities.
Strengths: Clinical classification framework, comprehensive cardiac workup, systolic vs. diastolic distinction.
What Each Model Got Wrong or Missed
GPT-4
- Could have been more explicit about same-day urgency versus scheduling an appointment
- Did not mention anemia as a common and treatable cause of progressive dyspnea
- Could have provided clearer emergency instructions for acute worsening
Claude 3.5
- Could have discussed what the patient should do while awaiting evaluation (elevate legs, monitor weight, avoid excessive fluid intake)
- Did not discuss medication review as a potential contributing factor
- Response was thorough but lengthy for someone who may be struggling to breathe
Gemini
- Critically insufficient urgency for progressive dyspnea with edema and hypertension
- Did not discuss heart failure as a specific concern
- Missing emergency warning signs
- Did not convey that progressive worsening over two weeks warrants urgent, not routine, evaluation
Med-PaLM 2
- NYHA classification may not be meaningful to a patient experiencing symptoms for the first time
- Limited practical guidance for what to do immediately
- Did not provide clear emergency criteria for when to call 911
Red Flags All Models Should Mention
For progressive shortness of breath, any AI response should identify these emergency signs:
- Sudden severe difficulty breathing (call 911)
- Shortness of breath at rest or when lying flat (orthopnea)
- Chest pain or tightness with breathing difficulty
- Coughing up blood or pink-tinged sputum
- Rapid heartbeat with breathing difficulty
- Bluish lips or fingertips
- Fainting or severe lightheadedness
- Progressive worsening of symptoms over days to weeks (urgent evaluation needed)
Assessment: Claude provided the most comprehensive emergency criteria. GPT-4 and Med-PaLM 2 addressed most critical signs. Gemini’s coverage was inadequate for this urgent scenario.
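As a rough illustration of how coverage of the checklist above could be tallied, the sketch below checks a response for keywords tied to each red flag. It is not the method used for this comparison, which relied on human review. The keyword lists, the `red_flag_coverage` function, and the sample text are all hypothetical, and simple keyword matching will miss paraphrases and cannot judge whether a warning is communicated clearly.

```python
# Hypothetical keyword lists for each red flag in the checklist above.
# These are illustrative assumptions, not a validated clinical vocabulary.
RED_FLAG_KEYWORDS = {
    "sudden severe difficulty breathing": ["sudden", "severe breathing difficulty",
                                           "severe difficulty breathing"],
    "dyspnea at rest or lying flat": ["at rest", "lying flat", "orthopnea"],
    "chest pain or tightness": ["chest pain", "chest tightness"],
    "coughing up blood": ["coughing up blood", "pink-tinged", "hemoptysis"],
    "rapid heartbeat": ["rapid heartbeat", "palpitations", "racing heart"],
    "bluish lips or fingertips": ["bluish", "cyanosis"],
    "fainting or lightheadedness": ["faint", "lightheaded"],
    "progressive worsening": ["progressive", "getting worse", "worsening"],
}


def red_flag_coverage(response: str) -> dict[str, bool]:
    """Return, for each red flag, whether any of its keywords appears in the text."""
    text = response.lower()
    return {
        flag: any(keyword in text for keyword in keywords)
        for flag, keywords in RED_FLAG_KEYWORDS.items()
    }


# Made-up sample text for illustration; not an actual model response.
sample = ("Call 911 for severe breathing difficulty, chest pain, "
          "coughing up blood, or fainting.")
coverage = red_flag_coverage(sample)
missing = [flag for flag, hit in coverage.items() if not hit]
print(f"Covered {sum(coverage.values())} of {len(coverage)} red flags")
print("Missing:", missing)
```

A tally like this can only prompt a closer read of a response. Whether the urgency is conveyed appropriately still requires human judgment.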
When to Trust AI vs. See a Doctor for Shortness of Breath
AI Is Reasonably Helpful For:
- Understanding the many possible causes of shortness of breath
- Learning what diagnostic tests may be ordered
- Understanding heart failure basics if diagnosed
See a Doctor When:
- You have progressive shortness of breath — this always warrants medical evaluation
- You develop ankle swelling with breathing difficulty
- You cannot complete normal activities without becoming breathless
- You have shortness of breath along with risk factors such as hypertension, heart disease, or lung disease
- You have sudden severe shortness of breath, chest pain, or are coughing up blood (call 911)
Methodology
We submitted identical prompts to each model on the same date under default settings. Responses were evaluated by our team using the mdtalks.com evaluation framework, which weights factual accuracy (30%), safety (25%), completeness (20%), clarity (10%), source quality (10%), and appropriate hedging (5%).
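The weighting described above amounts to a weighted average of per-criterion scores. The sketch below shows that arithmetic, assuming each criterion is scored from 0 to 10 and the result is rounded to one decimal place. The `overall_score` function and the example sub-scores are hypothetical; the comparison table does not report a score for every weighted criterion, so this does not reproduce the published overall scores.

```python
# Weights as stated in the methodology, expressed as fractions of the total.
WEIGHTS = {
    "factual_accuracy": 0.30,
    "safety": 0.25,
    "completeness": 0.20,
    "clarity": 0.10,
    "source_quality": 0.10,
    "appropriate_hedging": 0.05,
}


def overall_score(sub_scores: dict[str, float]) -> float:
    """Weighted average of 0-10 sub-scores, rounded to one decimal place."""
    if set(sub_scores) != set(WEIGHTS):
        raise ValueError("sub_scores must cover exactly the weighted criteria")
    total = sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)
    return round(total, 1)


# Hypothetical sub-scores for illustration only; not taken from the evaluation.
example = {
    "factual_accuracy": 9,
    "safety": 10,
    "completeness": 9,
    "clarity": 9,
    "source_quality": 8,
    "appropriate_hedging": 10,
}
print(overall_score(example))
```

Because factual accuracy and safety together carry 55% of the weight, a model that loses points on urgency or emergency criteria drops in the overall ranking faster than one that loses points on clarity or sourcing.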
Key Takeaways
- Progressive shortness of breath with edema is a medical urgency, and model responses varied significantly in communicating this.
- Claude 3.5 scored highest for unambiguous urgency and comprehensive emergency criteria.
- The most concerning finding was Gemini’s failure to convey appropriate urgency for a presentation that could indicate decompensated heart failure.
- AI cannot perform the physical examination, chest imaging, and lab tests needed to evaluate progressive dyspnea.
- Patients with worsening shortness of breath over days to weeks should seek medical evaluation promptly, not rely on AI for diagnosis.
Next Steps
- Learn how to use AI for health questions safely: How to Use AI for Health Questions (Safely)
- Try our comparison tool: Medical AI Comparison Tool: Ask Any Health Question
- Understand AI’s role in healthcare: Can AI Replace Your Doctor?
Published on mdtalks.com | Editorial Team | Last updated: 2026-03-10