This paper classifies six publicly available biomedical corpora according to various corpus design features and characteristics. We then present usage data for the six corpora. We show that corpora that are carefully annotated with respect to structural and linguistic characteristics and that are distributed in standard formats are more widely used than corpora that are not. These findings have implications for the design of the next generation of biomedical corpora.