08.05.2017 11:05

Plazi 2016 publications mining statistics


2016 was the first full year in which Plazi had a fully automated data mining process implemented and running. This includes the export of illustrations to the Biodiversity Literature Repository (BLR) and of bibliographic references to RefBank.

The following statistics of the data mining activities cover articles published in 2016.

  • Number of articles mined: 3,223 (total in TreatmentBank: 19,969)
  • Number of journals covered: 74 (609)
  • Number of pages: 61,826 (445,764)
  • Number of figures: 16,420 (total exported to BLR: 109,130)
  • Number of bibliographic references: 133,457 (750,218)
  • Number of treatments: 43,767 (194,619)
  • Number of n. sp.: 5,095 (39,814)
  • Number of nov. comb.: 753 (4,025)
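
To put the 2016 figures in proportion, the share of each overall total that comes from articles published in 2016 can be worked out directly from the numbers listed above. The following is a minimal sketch, not part of Plazi's tooling; it assumes that the values in parentheses are cumulative totals that include the 2016 counts, and the names `stats` and `share_of_total` are made up for illustration.

```python
# Pairs of (count for articles published in 2016, overall total), copied from the list above.
# Assumption: each overall total includes the 2016 count.
stats = {
    "articles mined": (3_223, 19_969),
    "journals covered": (74, 609),
    "pages": (61_826, 445_764),
    "figures": (16_420, 109_130),
    "bibliographic references": (133_457, 750_218),
    "treatments": (43_767, 194_619),
    "n. sp.": (5_095, 39_814),
    "nov. comb.": (753, 4_025),
}


def share_of_total(count_2016: int, total: int) -> float:
    """Return the 2016 count as a percentage of the overall total."""
    return 100.0 * count_2016 / total


# Percentage of each overall total contributed by 2016 publications.
shares = {name: share_of_total(count, total) for name, (count, total) in stats.items()}

for name, pct in shares.items():
    print(f"{name:<26}{pct:5.1f}%")
```

Under that assumption, articles published in 2016 account for roughly a sixth of all mined articles and a little over a fifth of all treatments.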

Data processing relies either on text and data mining (TDM) of PDF articles or on the processing of born-digital articles encoded in TaxPub, an extension of the Journal Archival Tag Suite (JATS). Statistics are available for articles or treatments. Data analysis: May 8, 2017.