A health news explorer with MeSH headings integration

Guest post by Joao Pita Costa from Quintelligence

Searching worldwide news contributes to a global perspective on world health, and also sheds light on aspects of regional health that public health institutes can act upon. It can also support the evaluation of public health campaigns by giving decision-makers access to what the media is talking about, which often reflects the opinion of communities. This article presents an example of innovation at MIDAS: a news explorer with integrated MeSH headings to ease health news search.

Figure 1 – MEDLINE’s record year on research articles labeled as Public Health is 2002, counting more than 5 thousand instances 

Ease of use is of great value in a tool of this nature, particularly when it derives from familiarity with tools already in use. For this reason, MIDAS integrated the newly developed automated MeSH classifier into the already available external news dashboard. Users of the new MIDAS external news dashboard can search news by MeSH headings simply by choosing them from a dropdown menu (see Figure 2). This closely resembles the well-known medical research explorer PubMed, where MeSH headings are used to search for scientific articles.

Figure 2 – The MIDAS news dashboard with integrated MeSH headings annotations (see dropdown menu named “Categories”).

This design aims to facilitate the monitoring of worldwide news by health professionals who are used to the best practices of the PubMed search engine. We believe that this similarity will also allow public health users to better align their news monitoring with a specific health study, by using the same Medical Subject Heading classes in both. Moreover, some of the visualisation modules available in the news dashboard expose this integration with the MeSH headings. One example is the Article Categories module, which shows the user the percentage of news articles that talk about a specific MeSH heading (see Figure 3).

Figure 3 – A screenshot of one of the visualisation modules at the MIDAS external news dashboard, showcasing the percentage of news articles that talk about a specific MeSH heading
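As an illustration of the kind of figure the Article Categories module reports, the following minimal Python sketch computes the percentage of articles annotated with each MeSH heading. It is not the dashboard's implementation: the article records, the `mesh_headings` field name and the `heading_shares` function are all hypothetical, chosen only to show that such a view needs nothing more than a list of headings per article.

```python
from collections import Counter


def heading_shares(articles):
    """Return {heading: percentage of articles annotated with that heading}."""
    total = len(articles)
    if total == 0:
        return {}
    counts = Counter()
    for article in articles:
        # Count each heading once per article, even if it is listed twice.
        counts.update(set(article.get("mesh_headings", [])))
    return {heading: 100.0 * n / total for heading, n in counts.items()}


# Hypothetical records; "mesh_headings" is an assumed metadata field name.
articles = [
    {"title": "Article A", "mesh_headings": ["Diabetes Mellitus", "Pediatric Obesity"]},
    {"title": "Article B", "mesh_headings": ["Mental Health"]},
    {"title": "Article C", "mesh_headings": ["Diabetes Mellitus"]},
    {"title": "Article D", "mesh_headings": []},
]

shares = heading_shares(articles)
for heading, pct in sorted(shares.items()):
    print(f"{heading}: {pct:.1f}%")
```

Because an article can carry several headings, the percentages in such a view need not add up to 100.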

With this MeSH headings integration we also expect improved analysis of regional news, supporting evidence-based decision-making in line with the research review, to which the MIDAS external dashboard can also contribute. Moreover, the automated MeSH classifier and its integration are not tied to the Event Registry news monitoring system. Choosing a different news monitoring system does not affect the core of the integration, which depends only on the alternative news dashboard using the metadata that the MeSH classifier attaches to each news article.

To evaluate the quality of the classifier, we compared its results to the MEDLINE hand annotations. In that evaluation phase we held out one year of MEDLINE abstracts to be used as the evaluation dataset. In a second phase we evaluated the classifier in the context of news articles. For this purpose, we asked four MIDAS experts (i.e., health professionals with experience in the usage of MeSH headings) to annotate news articles on the four MIDAS use cases (pediatric obesity, mental health, diabetes and child care) using the MeSH headings. Based on the analysis of the prior evaluation over research articles, we considered that the annotation could go down to the fourth level of depth in the MeSH tree. We therefore provided each of the four experts with a set of 20 news articles and a spreadsheet in which to annotate each article with three to ten MeSH headings.

For each case under analysis, four diagrams (as below) show the evaluation results: the X-axis is the tree depth, the Y-axis is the similarity cut-off threshold, and the colour encodes the evaluation result (precision, recall, F1 and F0.5 measure, respectively). For example, in the leftmost graph of Figure 4, for a similarity cut-off above 0.3 at tree depth 3, the precision is around 60%.

Figure 4 – Precision, recall, F1 and F0.5 measures in the comparison between the MeSH tree depth and the cut-off based on similarity for major MeSH headings for the diabetes use case.

As expected, a higher cut-off yields higher precision and lower recall. Moreover, shallower tree depths yield both better precision and better recall. Roughly, and in most cases, at tree depth 3 precision rises above 50% for cut-off values over approximately 0.35, and recall rises above 50% for cut-off values below approximately 0.25. At tree depth 3, which is the depth the researchers aim for when annotating news items, the results are of acceptable quality.
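To make the comparison across cut-offs and tree depths more concrete, here is a minimal Python sketch of how precision, recall, F1 and F0.5 could be computed for a single article. It is an illustration under stated assumptions, not the evaluation code used in MIDAS: it assumes that headings are compared by their dotted MeSH tree numbers truncated to the chosen depth, that a predicted heading is kept when its similarity score is at or above the cut-off, and that matching is exact after truncation. The function names, tree numbers and scores are hypothetical.

```python
def truncate(tree_number, depth):
    """Keep only the first `depth` levels of a dotted MeSH tree number."""
    return ".".join(tree_number.split(".")[:depth])


def f_beta(precision, recall, beta):
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def evaluate(predicted, expert, cutoff, depth):
    """Score one article.

    predicted: dict mapping tree number -> similarity score (assumed format)
    expert:    iterable of tree numbers chosen by the human annotator
    """
    kept = {truncate(t, depth) for t, score in predicted.items() if score >= cutoff}
    gold = {truncate(t, depth) for t in expert}
    hits = len(kept & gold)
    precision = hits / len(kept) if kept else 0.0
    recall = hits / len(gold) if gold else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f_beta(precision, recall, 1.0),
        "f0.5": f_beta(precision, recall, 0.5),
    }


# Hypothetical classifier output and expert annotation for one article.
predicted = {"C18.452.394.750": 0.62, "C19.246.267": 0.41, "F03.600": 0.28, "N02.421": 0.12}
expert = ["C18.452.394.750.149", "C19.246", "G07.203"]

for depth in (2, 3):
    for cutoff in (0.1, 0.3):
        print(depth, cutoff, evaluate(predicted, expert, cutoff, depth))
```

On this toy data, raising the cut-off from 0.1 to 0.3 at depth 3 raises precision from 0.25 to 0.5, and moving from depth 3 to depth 2 at cut-off 0.3 raises both precision and recall, which mirrors the trends reported above. A real evaluation would average such scores over all annotated articles.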

The MeSH classifier was released within MIDAS and is now available as open source under the Apache 2.0 license at: https://github.com/quintelligence-health/medline_classifier