In anticipation of the “Diachronic Corpus” to be built at the National Institute for Japanese Language and Linguistics, this project conducts basic research on the design of diachronic corpora. Using representative materials from several periods, ranging from ancient to early modern times, an experimental model of the “Diachronic Corpus” will be created. In parallel, research centered on the following three points will lead to the actual construction of a partial corpus.
(1) The grounds for selecting materials for the corpus
(2) How classical texts are to be digitized, and what kinds of information (variant texts, text notations, variant characters, quotations, writing styles, etc.) are to be added
(3) How morphological analysis suited to the vocabulary and grammar of each period and writing style is to be conducted