Disagreeing with a PubMed Instructor about MeSH

Rachel (at Women’s Health News) wrote yesterday about her week-long course in biomedical informatics in a post wonderfully titled “Dispatches from Nerd Camp.” She writes:

…our PubMed instructor declared, “I’m over the whole MeSH thing,” in the context of explaining that she thinks it’s completely unnecessary to know about and too hard to use.


The instructor suggested that keyword searching is always just fine because it will map to MeSH anyway. This is often true. However, I did a keyword search on “HRT” (a popular topic that most adult humans would understand) and discovered that this does not map to the “Hormone Replacement Therapy” MeSH, and returns a pretty poor set of search results. Discuss.

[My emphases]

Rachel’s instructor is right that using MeSH effectively can be difficult, but many good and useful things are not immediately easy, so I’d categorically reject this as a legitimate reason to be “over MeSH.”

The other reasoning that Rachel reports was presented by the instructor is that “keyword searching is always just fine because it will map to MeSH anyway.”

But as Rachel points out, it isn’t always just fine. Sometimes, it is downright inadequate.

Where PubMed fails to map
Check out the search that Rachel describes, searching for HRT in PubMed. Because PubMed doesn’t map HRT to the appropriate MeSH term, it defaults to a keyword search, producing a number of hits in the first 20 results that really aren’t about hormone replacement therapy. So it seems clear that Rachel’s instructor overestimates how effective PubMed’s automatic mapping to MeSH terms is. (I’m not faulting PubMed for this; the mapping must be incredible work to maintain and update.)
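One way to check a mapping for yourself is to look at the query translation PubMed reports (the Details view in the web interface). The Python sketch below assumes NCBI's E-utilities `esearch` endpoint with JSON output, which reports the translated query in a `querytranslation` field. The helper names and the sample payload are made up for illustration, no request is actually sent, and PubMed's real translation for any given term may differ from what this post describes.

```python
import json
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(term, retmax=20):
    """Build an esearch URL for a PubMed query (nothing is fetched here)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)

def mapped_to_mesh(response_text):
    """Return True if the reported query translation contains a MeSH clause."""
    result = json.loads(response_text)["esearchresult"]
    return "[mesh terms]" in result.get("querytranslation", "").lower()

# Hand-written payload for illustration only, NOT a real PubMed response.
sample = json.dumps({
    "esearchresult": {"count": "0", "idlist": [],
                      "querytranslation": "hrt[All Fields]"}
})

print(esearch_url("HRT"))
print(mapped_to_mesh(sample))
```

If the translation comes back without a MeSH clause, the term was searched as plain text, which is the situation Rachel ran into.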

“Unnecessary to know about?”
For a moment, we’ll embrace as true the premise that PubMed usually maps as we’d like and reveals the results we’re looking for. If we don’t know how to utilize MeSH, how will we get what we need on the occasions where PubMed doesn’t do this?

Improving Keyword Searching in PubMed
That isn’t to say that keyword searching couldn’t be made much more effective in PubMed. Try searching for “HRT” (without the quotes) at ReleMed and page through the first 25 hits (I stopped at 25); you’ll find they’re all about hormone replacement therapy, despite the fact that the NLM hasn’t mapped “HRT” to the relevant MeSH term(s). ReleMed does two things that are awesome. First, it uses UMLS to translate “HRT” into:

(hormone replacement therapy) OR (hormone replacement therapies) OR hrt OR (hormone replacement rx)

I’d rather it left the “OR hrt” out of the search, but I’m just impressed with ReleMed’s translation compared with PubMed’s mapping (or failure to map).

Secondly, it intelligently sorts the results by relevance (details here).
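The first of those two steps, expanding an abbreviation into an OR'd set of variants, can be sketched in a few lines of Python. This is only an illustration of the idea, not ReleMed's actual code: the `SYNONYMS` table is a hypothetical stand-in for a UMLS lookup, and the relevance sorting is not covered.

```python
# Hypothetical synonym table; a real system would look these up in UMLS.
SYNONYMS = {
    "hrt": [
        "hormone replacement therapy",
        "hormone replacement therapies",
        "hrt",
        "hormone replacement rx",
    ],
}

def expand(term):
    """OR together the known variants of a term, parenthesising phrases."""
    variants = SYNONYMS.get(term.lower(), [term])
    parts = [f"({v})" if " " in v else v for v in variants]
    return " OR ".join(parts)

print(expand("HRT"))
```

A term with no entry in the table is passed through unchanged, so an unmapped abbreviation would still be searched as a plain keyword.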

I hope that the NLM will either buy the technology from ReleMed or develop a similar capability to do the same thing in PubMed, something like this:

(Previously suggested here).

MeSH + PubMed = Best possible results
Even so, the ReleMed search for HRT still isn’t as effective as executing a search in PubMed like “hormone replacement therapy”[majr].
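For anyone who builds such searches often, here is a small Python sketch of assembling fielded PubMed queries. The helper names are invented for illustration. The field tags `[majr]` (MeSH Major Topic), `[mesh]` and `[tiab]` (Title/Abstract) are standard PubMed tags, but the broader MeSH-or-keyword combination is just one possible strategy, not a recommendation from NLM.

```python
def tagged(term, tag):
    """Quote a term and attach a PubMed search field tag, e.g. [majr]."""
    return f'"{term}"[{tag}]'

def any_of(*clauses):
    """Combine clauses with OR inside one pair of parentheses."""
    return "(" + " OR ".join(clauses) + ")"

# The focused search from this post: articles where the heading is a major topic.
major = tagged("hormone replacement therapy", "majr")

# A broader variant: the MeSH heading OR the abbreviation in title/abstract.
broad = any_of(tagged("hormone replacement therapy", "mesh"),
               tagged("HRT", "tiab"))

print(major)
print(broad)
```

The resulting strings can be pasted straight into the PubMed search box.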

Over MeSH?
No. Not by a long shot, and I wouldn’t want a medical librarian who found MeSH “too hard to use” doing literature searches for our library’s patrons. Our clinical patrons need and deserve the best possible results we can deliver and that takes an understanding of MeSH. MeSH is necessary, it is not too hard to use, and keyword-to-MeSH mapping doesn’t always work as we’d hope.

Unless I’m wrong
This seems pretty clear to me, but I’ve only been working in a medical library for less than two years. I’d love to hear more experienced medical libraryfolk weigh in on the topic. Please consider leaving a comment or blogging about it yourself.

22 thoughts on “Disagreeing with a PubMed Instructor about MeSH”

  1. Hmmm…I’m very interested that this was coming from someone who presumably is at NLM. I know that when I teach PubMed to non-librarians, I tell them that they can usually get along just fine w/o using MeSH &/or the Details tab, but it’s important to know about it and how it works, so that when they don’t get the results they’re looking for they can find out why. Abbreviations tend to map poorly, terms w/ multiple meanings (blind) don’t get you what you want at all, and adding a term like “pediatrics” to a search, in hopes of limiting by age, will add the specialty of pediatrics. For my own searching, I use MeSH in combination w/ keyword–just as you’d see they do in Cochrane reviews. I don’t trust either to do the job completely.

    • Me too… if searching, for instance, for interleukin-21, I would usually search for interleukin-21 (mesh) OR IL-21 (keyword) etc… and combine with other relevant keywords. I was told at a seminar once that MeSH terms aren't added to new articles straightaway, i.e. to get the most fresh research articles, you would have to step outside the MeSH comfort zone 🙂

  2. I agree, Erika. I’m not, after all, suggesting that keyword searching isn’t ever useful, just that the ability to search by MeSH is indispensable.

  3. Had an exchange about this w/ a colleague (one of our web guys) who apparently chatted w/ or saw a presentation by an NLM “uber-scientist” at Computers in Libraries. His take is that “it sounds like the code behind Entrez is *very* complex, like chaos theory complex (he talked about things as probabilities rather than hard predictions as to what the system would do 😉 ). Not trusting the system to do your mapping properly every time is probably a wise strategy, esp. with things like acronyms and terms that evolve over time (and probably aren’t updated as access points in a reliable way).”

    And that doesn’t even add in the human factor in indexing…

  4. Oh, there’s no question the code behind Entrez is insanely complex. That’s why I point out I’m not faulting it (or the marvelous geeks who do the coding). Given the complexity of the data and processes, I think it does what it does about as well as could be hoped.

    I think we’re pretty well in agreement. 🙂

  5. btw, searching PubMed by keyword IS searching with MeSH (IOW, it’s not geek driven), because the PubMed mapping is primarily driven by the entry terms in MeSH, which are human constructed. If one checks the MeSH entry for “Hormone Replacement Therapy”, the only entry terms are “Replacement Therapy, Hormone” and “Therapy, Hormone Replacement”. The reason why “HRT” does not automatically map is because it is not an entry term for this MeSH term.

    Based on this, one would think (hope?) that NLM would hire a couple more vocabulary specialists to beef up the number of entry terms in MeSH! 🙂

    For further information on how PubMed maps keyword queries, see their discussion on Automatic Term Mapping here: http://tinyurl.com/3e26av

  6. slm-

    I think I agree generally, but to nitpick:

    1. As far as the keyword-only user is concerned, he/she is plugging in keywords and crossing his/her fingers, not consciously seeking a mapping to MeSH.

    2. Keywords that fail to map *are* searched as text strings, so keyword searching *isn’t* always searching with MeSH.

    Thank you for forgiving this nitpick. 🙂


  7. Every PubMed class that I have taught always included both ways of searching, with the emphasis on using MeSH. One of the things that I like to do is to do a kw and a MeSH search and then combine them to get the most out of my search, then apply limits to drill down.

    I am a little surprised that this came from someone from NLM. I mean, try doing an EBM search using kws only and see how far you get.

  8. David, thanks for this post, and everyone else, thanks for your insights. I’m quite sure I’ll continue to use both MeSH and keywords, and teach users both.

  9. “hormone replacement therapy” does expand to “hormone replacement therapy”[MeSH], so it’s perhaps a bit surprising that HRT doesn’t. I suspect the reason is that there are multiple meanings for the abbreviation HRT – as the initial search results showed – so if the expansion was automatic you might actually lose a lot of valid results if you were searching for another meaning of HRT.

  10. David, great entry! I’m with you and Erika, combining keywords and MeSH is the most comprehensive way to search, rather than using one OR the other. My usual examples for the perils of keyword-only searching are to search “nursing” (which maps both to the profession AND to breastfeeding) or to search “cavities” (which doesn’t map to any MeSH heading and finds lots of articles not related to dental caries). Thanks for the great new example!

  11. alf-

    If you’re right and “HRT” doesn’t map to anything because it could potentially map to multiple things, that would be (IMHO) a design failing.

    Rather than mapping to nothing, PubMed could be set to acknowledge when a search term has multiple potential mappings and offer the user the option of selecting the mapping(s) he/she prefers. For example, if we searched for “HRT”, PubMed might reply:

    Click here for “Hormone replacement therapy.” Click here for “Heart Rate Turbulence.” …etc.

  12. PubMed generally does a bad job (or at least doesn’t do what you’d expect) w/ acronyms and abbreviations. Every year I critique M2 students’ searches, and they get confused when they search CI (contraindications) and get items pertaining to chemically-induced instead.

    But wouldn’t it be cool if David’s suggestions were followed up on?

    My example of why you need to know about MeSH (even if you hardly ever use it) is “blind.” If you’re looking for visually impaired persons, or the condition of being blind, but just enter “blind,” you get 150,000+ citations, b/c it picks up double-blind, triple-blind, etc. Once you search on Blindness, or visually-impaired persons, that number goes way down…

  13. I think MeSH is essential and definitely not too hard. Sometimes there are terms that are not in MeSH, for example, “Acute Chest Syndrome.” But I think it is worthwhile to check MeSH first and only add free keywords to the search if necessary. I always show patrons and students MeSH – it is the very first place I take them to. To me, MeSH is even more important for non-librarians to use because it provides standard wording/phrasing, subheadings that help limit a search, suggestions in case your spelling and wording are off, and a brief definition for each heading so users can be certain they are choosing the right term. While I understand that it appears to make searches easier when words are automatically mapped to MeSH, I think it is an unfortunate thing in the long run because people are unaware that it is happening and might not realize that results using words different from what they used to conduct their search are still about the topic of their interest. I had a student who needed articles on decreased oxygen in babies with heart defects. She kept getting articles about anoxia or hypoxia (both terms used to describe decreased oxygen) and was in tears because she thought she wasn’t finding anything. Expanding our vocabularies is always a good thing.

  14. I am really interested in this thread, because lately I’ve been thinking about whether it is really necessary or effective to teach MeSH to end user clinician types. I’ve taught a few sections of residents lately, and although I know they’ve received library instruction during their medical education, it seems like most still don’t understand how to use MeSH! When I ask for a show of hands for who prefers PubMed vs. Ovid Medline, usually it’s about half and half, or sometimes a small majority prefers PubMed. Those who like PubMed like it because they get better results from whatever they’ve plugged in. The bottom line is, they’ve been taught about MeSH, but just because we teach it, doesn’t mean they learn it. So do I go over with them (once again) how to use MeSH and why you should do so? Or do I accept that they prefer keyword searching, and try to teach them to do it in a way that will hopefully increase their odds of getting good results (things like Boolean operators, not using acronyms, searching one concept at a time and combining sets, etc.)?

    I’m even more inclined to abandon MeSH when I have a group of nurses or other hospital staff. Some of my users are very unfamiliar with medical databases and not overly comfortable on the Internet, and within the time I have, it’s hard to teach them about subject headings, plus all the other aspects of database searching like using the results manager, not to mention a general overview of library services. I had a very long, one-on-one session with a nurse a couple of weeks ago, where we went over CINAHL headings in detail. The other day, she was back, very apologetic, to say she had another topic and she really wasn’t able to use the search engine the way I’d shown her.

    Having said all this, I think it’s preposterous that someone in bioinformatics would claim to be “so over” MeSH. For advanced users, researchers, and librarians, MeSH is an absolute must, and if you can’t learn it, maybe you’re in the wrong field.

  15. Knowing how MeSH works is essential because you still need to know when the mapping doesn’t work well, or does not work the way you expect it to. Example: there are separate MeSH for Fertility and Infertility. Typing either of those terms into PubMed will get you the one you typed in, but not the ‘see also’ note that is important for making the distinction. Knowing that MeSH treats these concepts differently makes a huge difference when you search.

    Of course, when I teach any database, I always tell them they only actually *need* to know two things: my e-mail and my phone number. If they forget absolutely everything else, they need to know to contact me. However, having an idea about how literature is organized, by NLM or anyone else, does help me help them.

  16. Pingback: HubLog

  17. MeSH not necessary? Perhaps for non-librarians and non-systematic-reviewers, but any librarian who wants to be sure that they haven’t missed anything needs to use MeSH (or EMTREE or Cinahl Subject Headings or any other controlled vocabulary you can get your hands on). I would expect this kind of argument from a non-librarian; the idea that information is not currently easily searched without some sort of human organizational intervention is probably the most important thing to be conveyed to everyone. Yes, PubMed makes KW searching more effective but ONLY because it’s including the MeSH occasionally. I would argue that KW searching should almost never be used unless absolutely necessary (i.e. no direct subject heading available or you have the time to sift through every possible item that could possibly apply).

    MeSH too hard? Hardly. This is one reason I stay away from PubMed and prefer Ovid Medline: easy access to and manipulation of MeSH. MeSH is dead simple, when you consider the alternatives. KWs are easy but not if you actually want to find what you’re looking for. And compare MeSH to almost any other controlled vocab. I’m almost afraid to do any lit searching outside of anything biomedical since there’s nothing like MeSH out there!!! LOL

  18. Pingback: federated searching in medicine « omg tuna is kewl

  19. Pingback: davidrothman.net » novo|seek (3rd-Party PubMed/MEDLINE Tool)

  20. Let's face it, librarians: Folks just want to search Medline like they search Google.

    Medline is an elegantly designed database. It has a thesaurus you can look at, you can "explode" terms, there are 88+ clinical subheadings to combine with a medical subject heading. CINAHL, PsycInfo, ERIC, ChemAbs and EMBASE databases are also rather transparent sources. Harder to discover and apply are the subject headings used in Web of Science or Scopus. No, it is not a good idea to be "over" using MeSH terms.

    As others have said, sometimes a MeSH term does not exist for what the human searcher is trying to find. For example, a gastroenterologist asked for assistance in searching on the term "NASH". I blanked on what the term means. (A car made by American Motors in the 1950s?). NASH is non-alcoholic steatohepatitis, and there's no formal subject heading for that term either. Plus in MeSH there are over 130 hepatitis terms. Pretty confusing for a novice searcher. Another difficult-search example is a search on "nursing and narcotics". If you search CINAHL you'll get about one million citations for nursing treating substance abusers. However, what this person was looking for was nurses who are themselves substance abusers and are swiping meds while at work by signing themselves into the locked narcotics closet. Eventually we found (in the CINAHL thesaurus) that what she needed to search is "professional impairment" but that took more than a few minutes. These are durable examples.

    But it IS fun to dissuade scientists from using Google Scholar in favor of Medline! It's a real librarian-victory!