31 July 2018
The ACL 2018 programme chairs ran several quantitative analyses of the submissions to gain insight into the accepted papers and the research trends at ACL 2018.
First, we analyse how the overall score correlates with several review variables. The three variables most strongly correlated with the overall score are substance, soundness and originality. Reviewer confidence is negatively correlated with the overall score, though only weakly. This negative correlation arises because many rejected papers receive low overall scores from highly confident reviewers.
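To illustrate how low scores paired with high confidence can produce a negative correlation, here is a minimal sketch. It assumes the Pearson coefficient, although the post does not say which correlation measure was used. The review records are made-up toy values, not ACL 2018 data, and the field names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy reviews (invented for illustration): the low-scoring papers come
# with high reviewer confidence, mirroring the pattern described above.
reviews = [
    {"overall": 2, "substance": 2, "confidence": 5},
    {"overall": 1, "substance": 1, "confidence": 5},
    {"overall": 2, "substance": 3, "confidence": 4},
    {"overall": 4, "substance": 4, "confidence": 3},
    {"overall": 5, "substance": 5, "confidence": 4},
    {"overall": 4, "substance": 4, "confidence": 3},
]

overall = [r["overall"] for r in reviews]
corr_substance = pearson(overall, [r["substance"] for r in reviews])
corr_confidence = pearson(overall, [r["confidence"] for r in reviews])

print(f"substance vs overall:  {corr_substance:+.2f}")
print(f"confidence vs overall: {corr_confidence:+.2f}")
```

On this toy data the substance correlation comes out positive and the confidence correlation negative, which is the qualitative pattern reported above.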
The next figure shows the distribution of overall review scores among the accepted (blue) and rejected (red) papers. Most accepted papers received scores of 4 or 5, but a substantial number of papers with a score of 4, and even some with a score of 5, were rejected.
The next figure presents the distribution of the Originality scores. Among papers with a score of 3, roughly as many were accepted as rejected. Among those with a score of 4, around two-thirds were accepted and the remaining third rejected. Most of the rejected papers have low originality scores.
The next figure presents the distribution of the Readability scores. Accepted and rejected papers differ little in their readability scores.
Next, we present some word-cloud-based analyses, using Venncloud to generate the comparison word clouds. We first look at the words in the titles of the accepted and rejected papers at ACL 2018. The left (blue) column shows words that appear often in accepted papers but not very often in rejected papers, the right (red) column shows words that appear often in rejected papers but not very often in accepted papers, and the middle (black) column shows words that appear often in both.
Word clouds showing the most frequent words. Left: accepted papers only; Middle: both accepted and rejected papers; Right: rejected papers only
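The three-column split described above can be approximated with simple word counts. The sketch below is not Venncloud's actual algorithm, which the post does not describe. It assumes a word counts as "often" when it occurs at least `min_count` times in a set of titles. The function names, the threshold and the example titles are all hypothetical.

```python
import re
from collections import Counter


def word_counts(titles):
    """Count lowercase alphabetic tokens across a list of titles."""
    counts = Counter()
    for title in titles:
        counts.update(re.findall(r"[a-z]+", title.lower()))
    return counts


def split_columns(accepted_titles, rejected_titles, min_count=2):
    """Return (left, middle, right) word lists.

    left:   frequent in accepted titles only
    middle: frequent in both sets
    right:  frequent in rejected titles only
    Words that are infrequent in both sets are dropped.
    """
    acc = word_counts(accepted_titles)
    rej = word_counts(rejected_titles)
    left, middle, right = [], [], []
    for word in sorted(set(acc) | set(rej)):
        often_acc = acc[word] >= min_count
        often_rej = rej[word] >= min_count
        if often_acc and often_rej:
            middle.append(word)
        elif often_acc:
            left.append(word)
        elif often_rej:
            right.append(word)
    return left, middle, right


# Invented titles, for illustration only.
accepted = [
    "Neural Attention for Parsing",
    "Attention Networks for Translation",
    "Neural Parsing with Graphs",
]
rejected = [
    "Neural Features for Tagging",
    "Rule Features for Neural Tagging",
]

left, middle, right = split_columns(accepted, rejected)
print("accepted only:", left)
print("both:", middle)
print("rejected only:", right)
```

A real comparison would likely also remove stop words such as "for" and compare relative rather than absolute frequencies, since the two sets of titles differ in size.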
Finally, the next figure compares the words in the titles of the papers accepted at ACL 2017 and ACL 2018. The words attention, network, knowledge, sequence and language are popular in both years. In addition, sentence, embeddings and sentiment were less popular in 2017 but became more prominent in 2018.
Word clouds showing the most frequent words. Left: ACL 2017 only; Middle: both ACL 2017 and ACL 2018; Right: ACL 2018 only