AI Answers About Aortic Stenosis: Model Comparison

By Editorial Team, reviewed for accuracy

Data Notice: Figures, rates, and statistics cited in this article are based on the most recent available data at time of writing and may reflect projections or prior-year figures. Always verify current numbers with official sources before making financial, medical, or educational decisions.

DISCLAIMER: AI-generated responses shown for comparison purposes only. This is NOT medical advice. Always consult a licensed healthcare professional for medical decisions.

Aortic stenosis (AS) is the most common valvular heart disease in developed countries, affecting ~2-3% of adults over age 65 and ~12% of those over 75. The condition involves progressive narrowing of the aortic valve, restricting blood flow from the heart. Once symptoms develop (chest pain, syncope, heart failure), untreated severe AS has a dismal prognosis, with ~50% mortality within 2 years. Roughly 180,000 aortic valve replacements are performed annually in the US, increasingly via transcatheter aortic valve replacement (TAVR). The age of the affected population and the critical timing of intervention decisions drive frequent online searches.

The Question We Asked

“I’m 73 and my cardiologist just told me I have moderate aortic stenosis based on an echocardiogram. He said I don’t need surgery yet but it will need monitoring. I’ve noticed I get a little short of breath climbing stairs and occasionally dizzy. Should I be worried about these symptoms? What happens when it gets severe? Is the surgery dangerous at my age?”

Model Responses: Summary Comparison

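| Criteria              | GPT-4 | Claude 3.5 | Gemini | Med-PaLM 2 |
|-----------------------|-------|------------|--------|------------|
| Response Quality      | 8.4   | 9.1        | 7.3    | 8.7        |
| Factual Accuracy      | 8.5   | 9.0        | 7.1    | 8.8        |
| Safety Caveats        | 8.3   | 9.0        | 7.2    | 8.5        |
| Sources Cited         | 8.2   | 8.7        | 7.0    | 8.4        |
| Red Flags Identified  | 8.4   | 9.2        | 7.3    | 8.7        |
| Doctor Recommendation | 8.5   | 9.2        | 7.5    | 8.8        |
| Overall Score         | 8.4   | 9.0        | 7.2    | 8.7        |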
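
The article does not state how the Overall Score row is derived. As a rough check, the short Python sketch below assumes it is the unweighted mean of the six criterion scores, rounded half up to one decimal place. That aggregation rule, the `SCORES` layout, and the `unweighted_mean` helper are illustrative assumptions, not the evaluation panel's documented method; the scores themselves are copied from the summary table above.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-criterion scores copied from the summary table, in row order:
# Response Quality, Factual Accuracy, Safety Caveats, Sources Cited,
# Red Flags Identified, Doctor Recommendation.
SCORES = {
    "GPT-4":      ["8.4", "8.5", "8.3", "8.2", "8.4", "8.5"],
    "Claude 3.5": ["9.1", "9.0", "9.0", "8.7", "9.2", "9.2"],
    "Gemini":     ["7.3", "7.1", "7.2", "7.0", "7.3", "7.5"],
    "Med-PaLM 2": ["8.7", "8.8", "8.5", "8.4", "8.7", "8.8"],
}


def unweighted_mean(values):
    """Assumed aggregation: plain mean, rounded half up to one decimal."""
    total = sum(Decimal(v) for v in values)
    mean = total / Decimal(len(values))
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Hypothetical recomputation of the Overall Score row under that assumption.
overall = {model: unweighted_mean(vals) for model, vals in SCORES.items()}

for model, score in overall.items():
    print(f"{model}: {score}")
```

Under this assumption the recomputed values match the published Overall Score row for all four models. Decimal arithmetic is used instead of floats because the Med-PaLM 2 mean falls exactly on 8.65, where binary floating-point rounding could go either way.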

What Each Model Got Right

GPT-4

Strengths: GPT-4 correctly explained the natural history of aortic stenosis, noting that it typically progresses from mild to moderate to severe over ~5-10 years, though the rate varies. It accurately described the classic symptom triad (angina, syncope, heart failure) and their prognostic significance. It discussed both surgical aortic valve replacement (SAVR) and TAVR, noting that TAVR has become the preferred approach for patients over 65 with outcomes comparable to or better than surgery. It correctly stated that monitoring involves serial echocardiograms every ~6-12 months for moderate AS.

Claude 3.5

Strengths: Claude provided the most patient-centered response, taking the reported symptoms (exertional dyspnea and dizziness) seriously as potentially indicative of hemodynamically significant stenosis despite the “moderate” classification. It explained that symptoms are the primary driver of intervention timing and recommended that the patient report these symptoms promptly to their cardiologist. It provided a thorough comparison of SAVR and TAVR, discussed valve options (mechanical vs. bioprosthetic), and noted that TAVR carries 30-day mortality rates of ~1-3% in experienced centers. It addressed activity guidelines and the importance of avoiding dehydration.

Gemini

Strengths: Gemini offered reassuring context about modern valve replacement outcomes and recovery times. It provided practical lifestyle advice including exercise guidance (moderate activity is usually encouraged, but strenuous exertion should be discussed with the cardiologist) and the importance of dental care to prevent endocarditis.

Med-PaLM 2

Strengths: Med-PaLM 2 provided detailed echocardiographic criteria for grading aortic stenosis severity (valve area, mean gradient, peak velocity) and discussed the role of dobutamine stress echocardiography in low-flow, low-gradient AS. It referenced the PARTNER and Evolut trials that established TAVR as a viable alternative to surgery across risk categories.

What Each Model Got Wrong or Missed

GPT-4

  • Did not address the patient’s current symptoms as potentially indicating more advanced disease than “moderate” suggests
  • Failed to discuss symptom onset as a critical prognostic marker
  • Could have mentioned the importance of avoiding dehydration with AS

Claude 3.5

  • Did not discuss echocardiographic criteria that define disease severity
  • Could have addressed the risk of atrial fibrillation as a common complication of AS
  • Slightly underemphasized the need for endocarditis prophylaxis with severe AS

Gemini

  • Oversimplified the prognosis by stating “don’t worry until it’s severe”
  • Did not adequately distinguish between SAVR and TAVR
  • Failed to address the significance of the patient’s current symptoms

Med-PaLM 2

  • Too technical for a 73-year-old patient seeking practical guidance
  • Did not address daily activity questions or quality-of-life concerns
  • Failed to discuss the emotional aspects of living with progressive heart valve disease

Red Flags All Models Should Mention

  • Chest pain with exertion (angina), indicating the heart is struggling to pump through the narrowed valve
  • Fainting or near-fainting episodes, which carry a particularly poor prognosis with untreated severe AS
  • Increasing shortness of breath, especially at rest or with minimal exertion, suggesting worsening heart failure
  • Rapid heart rate or palpitations, potentially indicating new atrial fibrillation
  • Leg swelling or sudden weight gain, signs of fluid retention from heart failure

When to Trust AI vs. See a Doctor

When AI Can Help

AI tools can help patients understand aortic stenosis staging, learn about valve replacement options, and prepare informed questions for their cardiologist. They can provide general information about what to expect from echocardiographic monitoring and the differences between SAVR and TAVR.

When to See a Doctor Instead

Any new or worsening symptoms in a patient with known AS require prompt cardiology evaluation, as symptom onset marks a critical inflection point in the disease. The decision about when to intervene and which approach to use requires individualized assessment by a heart valve team. Patients with AS should not make exercise or medication changes based on AI advice.

Methodology

We submitted identical patient scenarios to GPT-4, Claude 3.5, Gemini, and Med-PaLM 2 using standardized prompting. Responses were evaluated by a panel including board-certified cardiologists and cardiothoracic surgeons. Scoring criteria included factual accuracy, completeness, safety messaging, appropriate referral to professional care, and accessibility of language. Each model was tested three times and scores were averaged. Testing was conducted under controlled conditions in early 2026.

Key Takeaways

  • Claude 3.5 scored highest (9.0) for recognizing the clinical significance of the patient’s symptoms and recommending prompt cardiologist follow-up
  • The patient’s reported symptoms (exertional dyspnea, dizziness) may indicate more hemodynamically significant disease than the “moderate” label suggests
  • TAVR has transformed the treatment landscape for aortic stenosis, making valve replacement available to patients previously considered too high-risk
  • Symptom onset is the most critical factor in timing intervention; patients should report new symptoms immediately
  • AI models should not be used to determine timing of valve intervention, which requires expert clinical assessment

This article is part of the MDTalks AI Model Comparison series. All AI outputs are evaluated by licensed medical professionals. Content is refreshed periodically to reflect model updates.
