Microsoft Academic

EMNLP Conference Analytics

The Microsoft Academic Graph makes it possible to gain analytic insights about any of the entities within it: publications, authors, institutions, topics, journals, and conferences. In this series, we present analytic insights about current conferences, which we hope will help you prepare for attending each event. All the insights within are derived from the Microsoft Academic Graph and visualized in Microsoft Power BI. You can generate your own insights by accessing the Microsoft Academic Graph through the Academic Knowledge API or through Azure Storage (please contact us for the latter option). If you would like to learn how we generated the insights below, please see the repository with source code.

In this post, we present a historical trend analysis of EMNLP, the Conference on Empirical Methods in Natural Language Processing, taking place in Brussels, Belgium, from October 31 to November 4, 2018. We derive insights from 1996 to the latest available year.

Click on each image for current trends and data hosted by Microsoft Academic Graph.

EMNLP paper output

The chart below shows how the number of conference papers has changed from one conference year to the next.

Conference Papers for Each Conference Year

In the following chart, the black bars represent the average numbers of papers each conference paper references for each conference year. The data show that recent publications tend to reference more papers. The green bars show the average numbers of citations received by a conference paper for each conference year. Note that the citations are raw counts and are not normalized by the age of publications. This is because normalizing citation counts turns out to be a nontrivial problem and may well be application dependent. Please treat the raw data presented as an invitation to conduct research on this topic!

2 Average reference/citations per paper

*Average Citations: Average number of citations an EMNLP paper received for a given conference year

*Average References: Average number of references made by an EMNLP paper for a given conference year
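
As a rough illustration of the two averages defined above, here is a minimal sketch. It is not the code behind the chart: the record layout `(conference_year, reference_count, citation_count)`, the function name `average_per_year`, and the sample numbers are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical records: (conference_year, reference_count, citation_count).
# The values are invented for illustration, not taken from the graph.
papers = [
    (2002, 20, 100),
    (2002, 30, 300),
    (2016, 40, 10),
    (2016, 50, 30),
]


def average_per_year(rows):
    """Return {year: (avg_references, avg_citations)} from raw counts.

    Citations are not normalized by publication age, matching the
    raw counts described in the text.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # references, citations, papers
    for year, refs, cites in rows:
        totals[year][0] += refs
        totals[year][1] += cites
        totals[year][2] += 1
    return {year: (r / n, c / n) for year, (r, c, n) in totals.items()}


averages = average_per_year(papers)
```

Because the citation counts are raw, older conference years have had more time to accumulate citations, which is worth keeping in mind when comparing the green bars across years.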

The notable citation spike in 2002 is mostly due to Pang, Bo, et al. “Thumbs up? Sentiment Classification Using Machine Learning Techniques.” Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, 2002, pp. 79–86, with 7,817 citations.

The chart below shows the citation distribution among EMNLP papers.

3 Citation Distribution

Diving in deeper, 12.5 percent of the conference papers received at least 50 citations. As these conference papers contributed 79.04 percent of the total citations received by the conference, the Pareto principle roughly applies.

4 Conference Paper/Citations Received
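
The concentration figures above can be reproduced with a short calculation. This is a sketch under stated assumptions: the function name `citation_concentration` and the sample counts are hypothetical, and only the 50-citation threshold comes from the text.

```python
def citation_concentration(citation_counts, threshold=50):
    """Return (paper_share, citation_share) in percent.

    paper_share: percentage of papers with at least `threshold` citations.
    citation_share: percentage of all citations that those papers received.
    """
    total_papers = len(citation_counts)
    total_citations = sum(citation_counts)
    if total_papers == 0 or total_citations == 0:
        return 0.0, 0.0
    top = [c for c in citation_counts if c >= threshold]
    paper_share = 100 * len(top) / total_papers
    citation_share = 100 * sum(top) / total_citations
    return paper_share, citation_share


# Made-up citation counts for eight papers (not real EMNLP data).
counts = [400, 60, 10, 10, 5, 5, 5, 5]
paper_share, citation_share = citation_concentration(counts)
```

With these invented counts, a quarter of the papers hold 92 percent of the citations. The real data give 12.5 percent and 79.04 percent, which is why the Pareto principle only roughly applies.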

Memory of references

How old are papers cited by EMNLP papers? Follow a given year’s column to see the age of papers cited in conference papers published that year. For example, in 2016, EMNLP papers collectively cited 1,131 papers published in 2015, 782 papers published in 2014, and so on.

5 Memory of References/Conference paper year

*If some years appear to cite publications from the future, it is likely due to one of two scenarios. First, they cited papers that were later published in journals. Second, they cited books. When a new edition of a book appeared, it replaced the previous one in the Microsoft Academic Graph, so the citation appears to point to the future. In this representation, we remove all instances of papers citing papers more than one year in the future to generate a cleaner view.
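
The tabulation and the future-citation filter described above might look like the following sketch. The input format, a list of `(citing_year, cited_year)` pairs, and the name `memory_of_references` are assumptions for illustration. The sketch counts references, so a paper cited twice is counted twice. If the chart counts distinct cited papers instead, the pairs would need deduplicating by cited paper first.

```python
from collections import Counter

# Hypothetical (citing_year, cited_year) pairs, one per reference.
reference_pairs = [
    (2016, 2015),
    (2016, 2015),
    (2016, 2014),
    (2016, 2017),  # one year in the future: kept
    (2016, 2019),  # more than one year in the future: dropped
    (2015, 2014),
]


def memory_of_references(pairs, max_future_years=1):
    """Count references per (citing_year, cited_year) cell.

    References that point more than `max_future_years` past the citing
    year are removed, following the cleanup rule described in the text.
    """
    counts = Counter()
    for citing_year, cited_year in pairs:
        if cited_year - citing_year > max_future_years:
            continue
        counts[(citing_year, cited_year)] += 1
    return counts


matrix = memory_of_references(reference_pairs)
```

Reading all cells with the same citing year gives one column of the chart.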

Outgoing references

What venues do EMNLP papers cite?

The bar chart shows the top 10 venues cited by EMNLP papers. ACL, EMNLP, NAACL and Computational Linguistics emerge as the top 4.

6 Top Referenced Venues

The 100 percent stacked bar chart below shows the percent of references given by EMNLP papers to each of the top 10 venues, year by year.

7 Referenced Venues Over Time
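
To make the 100 percent stacked view concrete, here is a minimal sketch of how per-year shares for the top venues could be computed. It assumes references arrive as `(citing_year, venue)` pairs and that each year's percentages are taken over the top venues only, so that they sum to 100. The function name, the venue "Workshop X", and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def venue_shares_by_year(references, top_n=10):
    """Return (top_venues, {year: {venue: percent}}).

    The top venues are chosen by overall reference count. Each year's
    percentages are computed over references to those venues only.
    """
    references = list(references)
    overall = Counter(venue for _, venue in references)
    top = [venue for venue, _ in overall.most_common(top_n)]

    per_year = defaultdict(Counter)
    for year, venue in references:
        if venue in top:
            per_year[year][venue] += 1

    shares = {}
    for year, counts in per_year.items():
        total = sum(counts.values())
        shares[year] = {venue: 100 * counts[venue] / total for venue in top}
    return top, shares


# Made-up references; top_n=3 keeps the example small.
sample = [
    (2015, "ACL"), (2015, "ACL"), (2015, "EMNLP"), (2015, "NAACL"),
    (2016, "ACL"), (2016, "EMNLP"), (2016, "EMNLP"), (2016, "EMNLP"),
    (2016, "NAACL"), (2016, "Workshop X"),
]
top_venues, shares = venue_shares_by_year(sample, top_n=3)
```

The same computation applies to the incoming-citation view if the pairs describe citing venues instead of cited ones.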

Incoming citations

What venues cite EMNLP papers?

The bar chart below shows the top 10 venues that cite EMNLP papers. Again, ACL is at the top, followed by EMNLP and NAACL. See the table for year-by-year details of citations coming from each of the top 10 venues.

8 Top Citing Venues

The 100 percent stacked bar chart below shows the citation distribution from the top 10 citing venues, year by year.

9 Citing Venue Over Time

Most-cited authors

Who are the most-cited authors of all time by EMNLP papers? The chart below ranks the most-cited authors by two measures: the number of their publications cited by the conference and the number of citations they received from the conference. Authors do not have to have published in EMNLP to appear on this chart.

10 Most-Cited Authors
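
The two measures behind this ranking can be sketched as follows. The input format, triples of `(citing_paper_id, cited_paper_id, cited_author)`, is an assumption, as are the function name and the placeholder author names. The sort order (citations first, then distinct publications) is also just one plausible choice, not necessarily the one used in the chart.

```python
from collections import defaultdict


def rank_cited_authors(citations):
    """Return [(author, publications_cited, citations_received), ...].

    publications_cited counts distinct cited papers per author.
    citations_received counts every citation from a conference paper.
    """
    cited_papers = defaultdict(set)
    citation_counts = defaultdict(int)
    for _citing_paper, cited_paper, author in citations:
        cited_papers[author].add(cited_paper)
        citation_counts[author] += 1
    rows = [
        (author, len(cited_papers[author]), citation_counts[author])
        for author in citation_counts
    ]
    # Sort by citations, then distinct publications, then name for stability.
    return sorted(rows, key=lambda row: (-row[2], -row[1], row[0]))


# Placeholder data: e1..e3 are citing EMNLP papers, p1..p3 are cited papers.
sample_citations = [
    ("e1", "p1", "Author A"),
    ("e2", "p1", "Author A"),
    ("e2", "p2", "Author B"),
    ("e3", "p3", "Author A"),
]
ranking = rank_cited_authors(sample_citations)
```

Note that the cited papers need not be EMNLP papers, which is how authors who never published at the conference can still appear.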

Who are the rising stars among the top-cited authors in EMNLP? The 100 percent stacked bar chart below shows the citation distribution by the top 20 authors, year by year.

11 Most Cited Authors Over Time

Top institutions

The bubble chart visualizes the top institutions at EMNLP by citation count. The size of the bubble is proportional to the total number of publications from that institution at EMNLP.

12 Top Institutions

Get the most current data and also explore the top institutions at the conference in more detail by clicking the chart. Once on the underlying Microsoft Power BI report, click on a column to rank the top institutions by publication or citation count.

13 Top Institutions

Top authors

The next three charts show author rankings according to different criteria.

The bubble chart displays EMNLP authors ranked by citation count, with bubble size being relative to publication count.

14 Top Authors

Get the most current data and explore the top authors at the conference in more detail by clicking the chart. Once on the underlying Microsoft Power BI report, click on a column to rank the top authors by Microsoft Academic rank, publication count, or citation count.

15 Top Authors

The bubble chart below visualizes author rank, which is calculated by Microsoft Academic by using a formula that is less susceptible to citation counts than similar measures. The X axis shows author rank. The higher an author’s rank, the closer they are to the right side. The Y axis normalizes the rank by publication count and enables us to identify impactful authors who might not have had a very large number of publications. The closer an author is to the top, the higher their normalized rank. Of course, the area of the chart that represents the highest rank is the top right corner.

16 Top Authors by Microsoft Academic Rank

We hope you have enjoyed the analytic insights into this conference made possible by the Microsoft Academic Graph. You can use our knowledge graph to generate your own custom analytics about an institution, a topic, an author, a publication venue, or any combination of these.

As always, we would like to hear from you either through the feedback link at the bottom right of the website, or on Twitter. You can also find our project home page with this blog on the Microsoft Research site at aka.ms/msracad.