Do Women Make Better Doctors Than Men? Part II


Ashish Jha

Our recent paper on differences in outcomes for Medicare patients cared for by male and female physicians has created a stir. While the paper has gotten broad coverage and mostly positive responses, there have also been quite a few critiques. There is no doubt that the study raises questions that need to be aired and discussed openly and honestly. Its limitations, which are highlighted in the paper itself, are important. Given the temptation we all feel to overgeneralize, we do best when we stick with the data. It’s worth highlighting a few of the more common critiques that have been lobbed at the study to see whether they make sense and how we might move forward. Hopefully by addressing these more surface-level critiques we can shift our focus to the important questions raised by this paper.

Correlation is not causation

We all know that correlation is not causation. It’s epidemiology 101. People who carry matches are more likely to get lung cancer. Going to bed with your shoes on is associated with a higher likelihood of waking up with a headache. No, matches don’t cause lung cancer any more than sleeping with your shoes on causes headaches. Correlation, not causation.

Seems straightforward and it has been a consistent critique of this paper.  The argument is that because we had an observational study – that is, not an experiment where we proactively, randomly assigned millions of Americans to male versus female doctors – all we have is an association study.  To have a causal study, we’d need a randomized, controlled trial.  In an ideal world, this would be great, but unfortunately in the real world, this is impractical…and even unnecessary.  We often make causal inferences based on observational data – and here’s the kicker: sometimes, we should.  Think smoking and lung cancer.  Remember the RCT that assigned people to smoking (versus not) to see if it really caused lung cancer?  Me neither…because it never happened.  So, if you are a strict “correlation is not causation” person who thinks observational data only create hypotheses that need to be tested using RCTs, you should only feel comfortable stating that smoking is associated with lung cancer but it’s only a hypothesis for which we await an RCT.  That’s silly.  Smoking causes lung cancer.

Why correlation can be causation

How can we be so certain that smoking causes lung cancer based on observational data alone? Because there are several good frameworks that help us evaluate whether a correlation is likely to be causal.  They include presence of a dose-response relationship, plausible mechanism, corroborating evidence, and absence of alternative explanations, among others. Let’s evaluate these in light of the gender paper.  Dose-response relationship? That’s a tough one – we examine self-identified gender as a binary variable…the survey did not ask physicians how manly the men were. So that doesn’t help us either way. Plausible mechanism and corroborating evidence? Actually, there is some here – there are now over a dozen studies that have examined how men and women physicians practice, with reasonable evidence that they practice a little differently. Women tend to be somewhat more evidence-based and communicate more effectively.  Given this evidence, it seems pretty reasonable to predict that women physicians may have better outcomes.

The final issue – alternative explanations – has been brought up by nearly every critic. There must be an alternative explanation! There must be confounding! But the critics have mostly failed to come up with what a plausible confounder could be. Remember, a variable, in order to be a confounder, must be correlated with both the predictor (gender) and the outcome (mortality). We spent over a year working on this paper, trying to think of confounders that might explain our findings. Every time we came up with something, we tried to account for it in our models. No, our models aren’t perfect. Of course, there could still be confounders that we missed. We are imperfect researchers. But that confounder would have to be big enough to explain about half a percentage point of mortality difference, and that’s not trivial. So I ask the critics to help us identify this missing confounder that explains better outcomes for women physicians.
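
To make the definition of a confounder concrete, here is a minimal sketch using the matches-and-lung-cancer example from earlier in this post. The counts are invented purely for illustration and have nothing to do with our data; smoking plays the role of the confounder because, in these made-up numbers, it is tied both to carrying matches and to lung cancer.

```python
# Hypothetical counts, invented for illustration only (not from the study).
# Key: (smoking status, carries matches) -> (people, lung cancer cases)
counts = {
    ("smoker", True): (800, 80),
    ("smoker", False): (200, 20),
    ("nonsmoker", True): (100, 1),
    ("nonsmoker", False): (900, 9),
}


def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    exposed_people, exposed_cases = exposed
    unexposed_people, unexposed_cases = unexposed
    return (exposed_cases / exposed_people) / (unexposed_cases / unexposed_people)


def pooled(carries_matches):
    """Add up people and cases across smoking strata, ignoring smoking."""
    people = sum(n for (_, m), (n, _) in counts.items() if m == carries_matches)
    cases = sum(c for (_, m), (_, c) in counts.items() if m == carries_matches)
    return people, cases


# Crude comparison: match carriers vs. everyone else, smoking ignored.
crude_rr = risk_ratio(pooled(True), pooled(False))

# Stratified comparison: the same question asked within each smoking group.
stratum_rr = {
    status: risk_ratio(counts[(status, True)], counts[(status, False)])
    for status in ("smoker", "nonsmoker")
}

print(f"crude risk ratio: {crude_rr:.2f}")
for status, rr in stratum_rr.items():
    print(f"risk ratio among {status}s: {rr:.2f}")
```

In these invented numbers, match carriers look far more likely to get lung cancer when smoking is ignored, yet the association disappears within each smoking group. That is the pattern a critic would need to show for a proposed confounder: a variable linked to both physician gender and patient mortality that, once accounted for, removes the difference.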

Statistical versus clinical significance

One more issue warrants a comment.  Several critics have brought up the point that statistical significance and clinical significance are not the same thing.  This too is epidemiology 101.  Something can be statistically significant but clinically irrelevant.  Is a 0.43 percentage point difference in mortality rate clinically important? This is not a scientific or a statistical question.  This is a clinical question. A policy and public health question.  And people can reasonably disagree.  From a public health point of view, a 0.43 percentage point difference in mortality for Medicare beneficiaries admitted for medical conditions translates into potentially 32,000 additional deaths. You might decide that this is not clinically important. I think it is. It’s a judgment call and we can disagree.
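
For readers who want to check the arithmetic, the sketch below works backward from the two figures quoted above. The number of hospitalizations is not stated in this post, so the value computed here is only what the 0.43 and 32,000 figures imply together, not a number taken from the paper.

```python
# Figures quoted in the post.
mortality_diff_pp = 0.43      # difference in mortality, in percentage points
additional_deaths = 32_000    # deaths attributed to that difference

# Convert percentage points to a proportion.
mortality_diff = mortality_diff_pp / 100

# Implied number of hospitalizations (an inference from the two figures above,
# not a value stated in the post).
implied_admissions = additional_deaths / mortality_diff

# Patients treated per one death's difference in outcome.
patients_per_death = 1 / mortality_diff

print(f"implied admissions: {implied_admissions:,.0f}")
print(f"patients per one-death difference: {patients_per_death:.0f}")
```

Under these assumptions, the two figures are consistent with roughly 7.4 million hospitalizations, or about one death's difference for every 233 patients treated. Whether that is clinically important remains the judgment call described above.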

Ours is the first big national study to look at outcome differences between male and female physicians. I’m sure there will be more. This is one study – and the arc of science is such that no study gets it 100% right. New data will emerge that will refine our estimates and of course, it’s possible that better data may even prove our study wrong. Smarter people than me – or even my very smart co-authors – will find flaws in our study and use empirical data to help us elucidate these issues further, and that will be good. That’s how science progresses. Through facts, data, and specific critiques. “Correlation is not causation” might be epidemiology 101, but if we got stuck on epidemiology 101, we’d be unsure whether smoking causes lung cancer. We can do better. We should look at the totality of the evidence. We should think about plausibility. And if we choose to reject clear results, such as the finding that women internists have better outcomes, we should have concrete, testable, alternative hypotheses. That’s what we learn in epidemiology 102.

8 replies »

  1. You are going to reply by saying that helpful information can be obtained by finding the differences in style between the genders and trying to use this in improvement of our craft. That is OK by me, but we should also be fair and study some other physician differences between excellence and banality in physician outcomes. E.g., I firmly believe that engineering and science backgrounds in undergraduate schools turn out great docs. N=1.

  2. You could study tall vs short physicians; you could study men physicians who had engineering undergrad backgrounds; you could study matching racial and economic backgrounds in physicians vs patients; you could study blond female physicians vs brunette; on and on… and you could probably uncover real truths. But most of these studies would be impossible to publish because they would be politically incorrect. You just happened upon a study, the results of which pleased the feminist movement. This makes it possible that your effort is to pander to the feminists. It doesn’t make it likely, it just makes it possible, and hence it weakens your conclusions.

    What is the use of this information? We still have to have both sexes as physicians. Your individual male physician can easily be better statistically than a female physician. We are not changing medical school admission policies, are we?

  3. I didn’t find this all that surprising, especially in the setting of your study. The women in our group seem to work better together and have less difficulty asking for help. In our larger institutions they flourish. However, in our smaller settings where they often have to work alone, they seem to have more trouble until they acclimate. I get constant texts from them for the first year or so, then it slows down. Also, there exists a body of literature which shows that groups have better outcomes when there is at least one woman involved. Would be interesting to repeat the study in a small hospital setting.

    As to pay, I have been skeptical about this until recently. My experience, and that of others who are also running medium to large groups (over 100 providers) is that even when you recruit a woman trying to set her up as a future leader, hence more pay, that as soon as that first baby comes they lose interest. Not 100% of the time, but most of the time. That adds up. However, once you leave academia there are plenty of practices set up so that only a few people ever become partners, or they have a senior partner system where the seniors make most of the money. Those seniors seem to be mostly male, at least in the groups of which I am aware.

  4. A much better study would have been to study malpractice and then see what variables create the most malpractice among those practicing similar types of medicine. One of the variables would of course be sex, but age and many different variables could be studied to look for correctable problems. Apparently, based upon some comments, obvious variables that could have a causative effect weren’t looked at. Therefore the study doesn’t tell us much of anything useful and might actually cause more harm than good.

  5. Ashish, interesting study. I have not seen many comments from female physicians so I will give a perspective. Many of us women were told granting us a spot in medical school was a waste because we want children and may work part-time in the future while raising them. This study lends credence to the fact that women physicians unquestionably enhance medicine as a whole; not because they are better or worse than men, or their patients fare better or worse, rather women and men are different and both variations indeed have equally tremendous value.

  7. Curious… I am waiting, now, for your results on white vs. black doctors. Maybe compare gays vs. straights would be interesting too.

  8. There are consistent differences between males and females that could explain the findings. The consistent differences between males and females argue for consistent differences such as physician age, physician origins, and physician practice locations and types.

    As noted, consistent differences in outcomes are more likely to be about differences in the patient factors – the ones that contribute more than smaller clinical contributors.

    Differences in males and females may qualify for apples to oranges research flaws.

    The patients will be different according to physician gender. Female physicians are more likely to be from higher-income, more urban counties and are also more likely to practice in the same kinds of counties. This also is demonstrated in the study across type of practice and location.

    Now that so many are congratulating you and others about confirming female advantages regarding communication and other assumptions that they want to make – you might want to examine your data to see if the males had communication issues such as being more likely to have English as a second language, or mismatches between physician origins and patient origins. Higher physician discipline rates have been associated with international graduates.