USMLE Step 1: Leveling the Playing Field – or Perpetuating Disadvantage?


Let me show you some data.

I’m going to show you the Match rate and mean Step 1 score for three groups of residency applicants. These are real data, compiled from the National Resident Matching Program’s (NRMP) Charting Outcomes in the Match reports.


  • U.S. Allopathic Seniors: 92% match rate; Step 1 232.3
  • U.S. Osteopathic Seniors: 83% match rate; Step 1 225.8
  • International Medical Graduates, or IMGs (both U.S. and non-U.S. citizens): 53% match rate; Step 1 223.6

Now. What do you conclude when you look at these numbers?


In the debate over the U.S. Medical Licensing Examination’s (USMLE) score reporting policy, there’s one objection that comes up time and time again: that graduates from less-prestigious medical schools (especially IMGs) need a scored USMLE Step 1 to compete in the Match with applicants from “top tier” medical schools.

In fact, this concern was recently expressed by the president of the National Board of Medical Examiners (NBME) in an article in Academic Medicine (quoted here, with my emphasis added).

“Students and U.S. medical graduates (USMGs) from elite medical schools may feel that their school’s reputation assures their successful competition in the residency application process, and thus may perceive no benefit from USMLE scores. However, USMGs from the newest medical schools or schools that do not rank highly across various indices may feel that they cannot rely upon their school’s reputation, and have expressed concern in various settings that they could be disadvantaged if forced to compete without a quantitative Step 1 score. This concern may apply even more for graduates of international medical schools (IMGs) that are lesser known, regardless of any quality indicator.”

The funny thing is, when I look at the data above, I’m not sure why we would conclude that IMGs are gaining advantage from a scored Step 1. In fact, we might conclude just the opposite – that a scored Step 1 is a key reason why IMGs have a lower match rate.

So let’s consider this objection further. In this post, I’d like to answer two questions.

  1. Does a scored USMLE ‘level the playing field’ for lower-tier USMGs/IMGs – or does it simply perpetuate disadvantage?
  2. And regardless of the answer to #1 above, does reporting USMLE scores make sense from a policy standpoint?


Some background

It’s undeniable that Step 1 scores matter in residency selection. Here, just for example, are USMLE Step 1 scores by match status and specialty for U.S. seniors, from the most recent NRMP report.


And it’s also true that IMGs face an uphill battle in the Match.


But in light of the data we started with above, there’s a problem with the “level playing field” argument: students at the top U.S. medical schools do quite well on Step 1.

Here is a scatterplot of the mean Step 1 score by medical school, ordered by the 2019 U.S. News & World Report (USNWR) “Best Medical Schools” research rankings.

As you can see, students at the best schools (at least, the “best” schools according to USNWR rankings) have the best USMLE Step 1 scores: look at the clustering of scores in the upper left quadrant.


Many students at less prestigious allopathic schools, osteopathic schools, and Caribbean medical schools differ most from their colleagues at the “top” allopathic schools in one area: Medical College Admission Test (MCAT) scores.

And there is a correlation between other standardized test scores and the USMLE Step 1.

Could it be that Step 1 scores simply carry forward an advantage already set in motion by earlier standardized tests? Are we just measuring the same thing over and over again?

To demonstrate the correlation between these tests, I also plotted the mean MCAT and USMLE Step 1 score for schools who reported their data for the 2019 U.S. News & World Report “Best Medical Schools” rankings.

Some notes on methods appear below, but I take two conclusions from these data.

  • There is a strong linear relationship between the mean MCAT and mean USMLE Step 1 score by school. Every 1-point increase in mean MCAT is associated with a 1.08-point increase in mean USMLE Step 1 (95% CI: 0.89-1.28; p<.001).
  • There aren’t a whole lot of schools that dramatically “overperform” or “underperform” this simple model. The plot of standardized residuals using the above regression equation fits a bell curve distribution. Five schools did have large positive or negative standardized residuals (MCAT/USMLE Step 1/standardized residual): University of Missouri (508/241/+3.30), University of Texas-Galveston (507/237/+2.60), Mayo (516/245/+2.17), New Mexico (505/216/-1.95), UC-Davis (510/220/-2.29). Overall, 55% of the variation in mean USMLE Step 1 scores by school is explained by differences in the MCAT score alone (R²=0.55).
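For anyone who wants to run this kind of check on their own numbers, here is a minimal sketch of the calculation described above: a least-squares line, its R², and standardized residuals. It is not the code behind the figures in this post. The function names are my own, the school-level values in the demo are hypothetical placeholders (not USNWR data), and the residuals are standardized the simple way, by dividing by the residual standard error without a leverage adjustment, which may differ slightly from what a statistics package reports.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def standardized_residuals(x, y, slope, intercept):
    """Residuals divided by the residual standard error (n - 2 degrees of freedom).

    Assumption: no leverage correction is applied.
    """
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    std_err = sqrt(sum(r * r for r in residuals) / (len(x) - 2))
    return [r / std_err for r in residuals]


if __name__ == "__main__":
    # Hypothetical school-level means, for illustration only.
    mean_mcat = [504, 506, 508, 510, 512, 514, 516, 518]
    mean_step1 = [222, 227, 226, 231, 233, 232, 238, 240]

    slope, intercept = fit_line(mean_mcat, mean_step1)
    print("slope:", round(slope, 2))
    print("R^2:", round(r_squared(mean_mcat, mean_step1, slope, intercept), 2))
    for mcat, z in zip(mean_mcat, standardized_residuals(mean_mcat, mean_step1, slope, intercept)):
        print(mcat, round(z, 2))
```

A school with a standardized residual near +2 or -2 is scoring about two standard errors above or below what its mean MCAT would predict, which is how the five schools listed above were flagged.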


The bottom line

When I look at the scatterplots above, I conclude that, contrary to the assertions of Drs. Katsufrakis and Chaudhry above, it doesn’t appear that students at elite medical schools “feel that their school’s reputation assures their successful competition” in residency selection. They seem to be doing quite well on Step 1, thank you very much.

So if the “best” schools have the highest Step 1 scores…why do we not look at these data and conclude that the residency selection advantage gained by graduates of “elite medical schools” is BECAUSE of their Step 1 scores, not in spite of them?

Does the use of Step 1 scores in residency selection “level the playing field”? Or would disadvantaged candidates be better off if we played a different game altogether?


Acknowledging the obvious objection

I know there are many students out there who will reject this logic. I’ve already heard from many of them on Twitter.

The issue, of course, is that even though USMLE Step 1 scores disadvantage IMGs and ‘lower-tier’ USMGs on a systemic level, they offer an individual applicant the hope of catching a program director’s eye.

Listen, I get it. Step 1 is the devil we know, and there are many students out there who would rather “compete” using their Step 1 score than with some other metric. Furthermore, if we just got rid of Step 1 scores and provided nothing new to replace them, students at top-tier medical schools would still enjoy an advantage. (If we simply shift focus to other existing areas of the application, who do you think has better opportunities for research, big name letter writers, etc.?)

But let me make three points in rebuttal.

1. A pass/fail USMLE is a means to an end.

I don’t support a pass/fail USMLE because I think that evaluating candidates using other existing metrics is better. I support it because our idolatry of Step 1 scores in residency selection keeps us from critically evaluating the whole process and working to measure things that matter and truly match candidates with the best program based on both aptitude and goodness of fit.

2. Score reporting policy doesn’t change the basic math of the Match.

For the past 25+ years, there has been a significant mismatch between the number of residency positions and the number of applicants.

How we choose to score Step 1 does nothing to change this basic math. And yet, often we talk about it like it does. Not long ago, I debated a faculty member on Twitter who claimed that with no Step 1 scores, most residency programs would simply stop interviewing IMGs, because that would be the “simplest metric” that could be used.

Is that argument supported by data? Or is it just fearmongering?

We’re still going to have the same number of spots – and U.S. allopathic graduates can’t fill all of them. (So far as I know, Harvard and Johns Hopkins aren’t planning to increase their entering class size ten-fold if Step 1 is reported pass/fail.)

Sure, programs who currently match IMGs could certainly choose to stop interviewing them. If they don’t care if their program goes unmatched, that is.

Remember, many programs do not interview or rank IMGs as it is: considering all programs in all specialties, only 42% do, according to the 2018 NRMP Residency Program Directors’ Survey.

But that figure varies widely based on the competitiveness of the specialty. For instance, 90% of pathology programs consider non-U.S. IMGs, while only 6% of orthopedic surgery programs do. In fact, the more competitive the specialty is for U.S. seniors, the fewer programs there are who interview/rank IMGs.

The competitiveness of the Match process is driven by basic math – how many candidates, and how many positions in the field. How the USMLE reports its results doesn’t change that one bit. In my opinion, programs who find value in training IMGs (or who need IMGs to fill their positions) will still choose to interview and rank IMGs, regardless of the evaluation methods available. In fact, if we had more meaningful metrics – metrics that actually predicted residency success, unlike the USMLE – more programs might be willing to consider IMGs. So why not choose methods that are more meaningful?

3. Weighting individual payoffs over societal payoffs is bad policy.

If you consider residency selection policy only from the standpoint of individual applicants, it’s a zero sum game. There are a fixed number of residency spots, and any metric we use to select candidates will benefit one applicant at the expense of another. Use Step 1 scores, or use something else… but if there are two candidates and only one spot, one candidate wins and the other loses.

However, considering the payoffs only to individual applicants takes an unnecessarily narrow view of residency selection policy. What we should really be asking ourselves is, to the extent that the end product of medical education is a public good, are we selecting residents in a way that leads to increased good for society?

I say we aren’t. I don’t think students or their future patients gain anything from the hours spent memorizing soon-forgotten basic science minutiae for the USMLE Step 1. Even if it is true that our best students have the highest scores on Step 1 – is that really how we want them to spend their energy and talent? If students have to compete for residency positions, then let’s at least have them compete in an endeavor that makes them all better doctors.

Listen, I took USMLE Step 1 in 2005. On a personal level, I don’t have anything to gain from USMLE scores being reported as a three-digit number, a ‘P’ or ‘F’, or anything else. I’ve gotten interested in this issue because I see the excessive focus on USMLE Step 1 as something that’s harming the quality of both undergraduate and graduate medical education.

I think that sucks – and I think we can do better.


To ensure transparency in the analyses above, a couple of comments on the methods:

  • Some have asked why I didn’t report regression statistics for the first scatterplot (on Step 1 scores and USNWR ranking). I’m not trying to hide anything – I just chose not to do a linear regression for a couple of reasons. For one thing, the data on the x axis are ordinal (that is, it’s not clear what 1 unit measures mathematically, or whether the distance between ranks 1 and 2 is the same as the distance between ranks 53 and 54). For another, the relationship between USNWR ranking and Step 1 score is not an independent one – the scores do factor into the school’s rank. Nonetheless, USNWR rankings are a simple way to capture the way we think about which medical schools are most prestigious, and to me, the scatterplot alone is sufficient to demonstrate that students at these schools perform the best on Step 1.
  • The data points for the MCAT/USMLE scatterplot represent schools, not individuals.  However, the correlation between (old) MCAT and USMLE scores for individuals is good.
  • These graphs consider only allopathic medical schools, since many students at osteopathic schools may pursue an alternative licensure pathway (COMLEX).  
  • There are 144 allopathic medical schools in the U.S.  Not all of these schools provided data to USNWR.  Thus, the plots above consider data for only 67% (97/144) of allopathic schools.
  • It seems likely that schools who decline to report their data to USNWR do so non-randomly – i.e., those with lower MCAT/Step 1 scores are less likely to disclose their data than schools with higher scores. My suspicion is that these schools would have points in the lower left quadrant of the MCAT/Step 1 plot above.  However, it is possible that the relationship between MCAT and USMLE Step 1 scores is different for schools who did not report their data. 
  • The USNWR Step 1 data are reported by the individual schools (i.e., not provided independently from the NBME, for instance).  There may be differences in how schools calculate their mean score in terms of which students are included.
  • I did remove one outlier: Wright State, which was listed as having a mean USMLE Step 1 score of 260.  If anyone out there is privy to information suggesting that is an accurate figure, I’d be happy to re-run the regression. 
  • Also, if you’re like me, and you remember the MCAT being three sections graded up to 15 points, the correlation between “old” MCAT scores and Step 1 is essentially the same. Only 91 allopathic schools reported these data, but the model performed similarly (slope 1.62; 95% CI: 1.27-1.98; R²=0.48).
  • Lastly, just like always, if you want to check my figures, please do.  The data are available from U.S. News & World Report.  And if you find a mistake, please let me know.  You don’t have to agree with my analysis or conclusions – but the data should be honestly presented. 

Dr. Carmody is a pediatric nephrologist and medical educator at Eastern Virginia Medical School. This post originally appeared on The Sheriff of Sodium.

2 replies »

  1. If the end product of a medical school education were a public good, we can then assume that total demand would be very high, because there are–by definition–no prices (and prices tell what and how much to produce and what and how much to buy.) Thus, almost any specialty would have a Pareto positivity and no one would know how many slots to establish for what specialty. A post graduate medical residency program would thus be chaotic to say the least.

    It may be that the USMLE Step 1, although overtly testing for “soon-forgotten basic science minutiae”, may actually be testing for something that is not obviously apparent: executive function and the ability to show up, the discipline to spend time reviewing and rapidity of re-learning, the coping with stress, the preparation for terribly uneven and unexpected hours in future practice…in short, learning how to act as a professional. Where else can this be learned before it is needed?

    Thus, the USMLE Step 1 may not be an opportunity cost.

  2. AND How does the post-graduate funding budget and its allocation process skew the whole process? Medicare annually distributes around $18 Billion for this funding, favors the institutions with higher research budgets and lacks any connection with the availability of Primary Physicians. Paradigm paralysis at its worst.

    Indeed, we must do better.