Address: 10-2 Midori-cho, Tachikawa City, Tokyo 190-8561 Tel. +81-42-540-4300 / Fax +81-42-540-4333
© National Institute for Japanese Language and Linguistics
The Gen-Nichi-Ken Corpus of Workplace Conversation was published on August 20, 2018
via the online corpus search application
Chunagon. This corpus has been created based on the transcripts obtained
through the two research projects described below, namely "Josei no Kotoba: Shokuba
Hen (Language of Women at Work)" and "Dansei no Kotoba: Shokuba Hen (Language of Men
at Work)."
These projects were pioneering attempts, made in the 1990s, in which research
cooperators recorded their own workplace conversations in order to collect natural
speech, and the resulting data have been highly praised as groundbreaking.
A total of 22 research articles using these data are published in "Gappon Josei
no Kotoba/Dansei no Kotoba: Shokuba Hen (Language of Women at Work and Language of
Men at Work in One Volume)" (edited by Gendai Nihongo Kenkyukai and published
by Hituzi Syobo in 2011). These research projects have revealed that gender-specific
expressions, often referred to as "men's language" and "women's language,"
and words used more frequently by men or by women, are becoming less and
less apparent. They have also clarified, in many respects, the realities of spoken
Japanese, which differ from those of written Japanese. For the Corpus of Everyday Japanese Conversation,
which is now being developed, everyday conversations have been recorded by research
cooperators themselves, with reference to how earlier studies like the above have
been implemented. Gendai Nihongo Kenkyukai has also continued to collect and study
everyday conversations, the results of which have been published in "Danwa Shiryo:
Nichijo Seikatsu no Kotoba (Transcripts and Analysis: Japanese Daily
Interaction)" (edited by Gendai Nihongo Kenkyukai and published by Hituzi
Syobo in 2016).
Gendai Nihongo Kenkyukai carried out a research project in September and October 1993, in which 19 working women (in their 20s to 50s) in the Tokyo metropolitan area participated as research cooperators. The participants recorded their natural conversations in their respective workplaces. These conversations were recorded with recorders hung around the necks of cooperators or placed near them. These 19 cooperators were all working in different workplaces. Each person recorded one hour of conversation in the morning after arrival in their workplace, one hour of meetings, and one hour of break time, from each of which about 10 minutes of consecutive conversation was extracted and transcribed. "Josei no Kotoba: Shokuba Hen (Language of Women at Work)" (edited by Gendai Nihongo Kenkyukai), published by Hituzi Syobo, includes a CD-ROM containing the transcripts and 10 research articles based on them. (This book is now out of print. Consult the combined edition described below.)
Gendai Nihongo Kenkyukai carried out a research project from October 1999 through December 2000, in which 21 working men (in their 20s to 50s) in the Tokyo metropolitan area participated as research cooperators. The participants recorded their natural conversations in their respective workplaces. These conversations were recorded with recorders hung around the necks of cooperators or placed near them. These 21 cooperators were all working in different workplaces. Each person recorded one hour of conversation in the morning after arrival in their workplace, one hour of meetings, and one hour of break time, from each of which about 10 minutes of consecutive conversation was extracted and transcribed. "Dansei no Kotoba: Shokuba Hen (Language of Men at Work)" (edited by Gendai Nihongo Kenkyukai), published by Hituzi Syobo, includes a CD-ROM containing the transcripts and 12 research articles based on them. (This book is now out of print. Consult the combined edition described below.) This research project received financial assistance from the Faculty of Language and Literature, Bunkyo University, in the form of joint research funding, from FY1999 through FY2001.
The above two books, including the CD-ROM data, were subsequently combined into "Gappon Josei no Kotoba/Dansei no Kotoba: Shokuba Hen (Language of Women at Work and Language of Men at Work in One Volume)" (edited by Gendai Nihongo Kenkyukai), which was published by Hituzi Syobo in 2011. This combined edition is still in print.
These transcripts were provided to the National Institute for Japanese Language and Linguistics through the understanding and courtesy of Gendai Nihongo Kenkyukai and Isao Matsumoto of Hituzi Syobo.
The transcripts provided to the National Institute for Japanese Language and Linguistics have been morphologically analyzed using the analyzer MeCab with the UniDic dictionary, and the results have been published under the new name "Gen-Nichi-Ken Corpus of Workplace Conversation."
In this project, the transcripts of the Gen-Nichi-Ken Corpus of Workplace Conversation, accompanied by morphological information (short-unit information), are published via the online corpus search application Chunagon.
The Gen-Nichi-Ken Corpus of Workplace Conversation is licensed under a Creative
Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
(CC BY-NC-ND 4.0).
When you use the Gen-Nichi-Ken Corpus of Workplace Conversation in publishing your
research results or in any other publication, you must cite the following
source:
"Gappon
Josei no Kotoba/Dansei no Kotoba: Shokuba Hen (Language of Women at Work and
Language of Men at Work in One Volume)" (edited by Gendai Nihongo Kenkyukai)
Number | Content | Possible values | Notes |
---|---|---|---|
(1) | Male/Female | M, F | M: Data sourced from "Dansei no Kotoba: Shokuba Hen (Language of Men at Work)"; F: Data sourced from "Josei no Kotoba: Shokuba Hen (Language of Women at Work)" |
(2) | Cooperator code | 01, 02, ... | The same identification codes as those of the research cooperators in the original data |
(3) | Scene 1 | A, K, Q | "Asa (Morning)," "Kaigi (Meeting)," and "Kyukei (Break)" in the original data |
(4) | Scene 2 | 01, 02, ... | New serial numbers |
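
As a rough illustration, the four components listed above can be treated as fields of a single identifier. The sketch below assumes, purely for illustration, that the components are concatenated in the order (1) to (4) without separators and that both numeric parts are exactly two digits wide (for example `F01A01`); the exact layout is not stated here, and the names `ConversationId` and `parse_conversation_id` are hypothetical.

```python
import re
from dataclasses import dataclass

# Scene 1 codes and their meanings, as listed for component (3).
SCENE_NAMES = {"A": "Morning", "K": "Meeting", "Q": "Break"}

# Source volume for each value of component (1).
SOURCES = {
    "M": "Dansei no Kotoba: Shokuba Hen (Language of Men at Work)",
    "F": "Josei no Kotoba: Shokuba Hen (Language of Women at Work)",
}

# ASSUMPTION: components are concatenated in the order (1)-(4), with no
# separators, and each numeric component has exactly two digits.
_ID_PATTERN = re.compile(r"([MF])(\d{2})([AKQ])(\d{2})")


@dataclass(frozen=True)
class ConversationId:
    source_code: str  # (1) "M" or "F"
    cooperator: str   # (2) cooperator code, e.g. "01"
    scene1: str       # (3) "A", "K", or "Q"
    scene2: str       # (4) serial number, e.g. "01"

    @property
    def scene_name(self) -> str:
        return SCENE_NAMES[self.scene1]

    @property
    def source(self) -> str:
        return SOURCES[self.source_code]


def parse_conversation_id(text: str) -> ConversationId:
    """Split an identifier in the assumed layout into its four components."""
    match = _ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not in the assumed layout: {text!r}")
    return ConversationId(*match.groups())
```

The cooperator code and serial number are kept as strings so that leading zeros survive; if the real identifiers use a different layout, only `_ID_PATTERN` would need to change.
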
Item | Description |
---|---|
Scene 1 | "Morning," "Meeting," or "Break." |
Scene 2 | Subcategories of <Scene 1>. |
Cooperator code | Identification codes of research cooperators (who recorded the conversations). |
Speaker code | Identification codes of speakers. |
Gender | Genders of speakers: "Male," "Female," or "*." Here "*" means "unknown" or "no information"; the same applies to the other items. |
Age group | Age groups the speakers were in at the time of the research. 10-year increments. |
Occupation | Occupations of speakers. May be indistinguishable from "job category" in some cases. |
Job category | Job categories of speakers. May be indistinguishable from "occupation" in some cases. |
Title | Job titles of speakers. Recorded as "(None)" when the face sheet states that the speaker has no job title. |
Home prefecture | Prefectures from which the speakers come. |
Place of longest residence | Prefectures in which the speakers lived longest between the ages of 4 and 15 years (which are not necessarily prefectures in which they spent their formative years for language learning). |
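
When processing these speaker attributes, the "*" convention usually needs explicit handling. The following is a minimal sketch under stated assumptions: the dictionary keys, the example values, and the helper `normalize_speaker` are all hypothetical, and the actual column names in data obtained through Chunagon may differ.

```python
from typing import Optional

# "*" marks "unknown" or "no information" for any item.
UNKNOWN = "*"


def normalize_speaker(record: dict[str, str]) -> dict[str, Optional[str]]:
    """Return a copy of the record with "*" values replaced by None."""
    return {
        key: (None if value == UNKNOWN else value)
        for key, value in record.items()
    }


# Hypothetical record; keys and values are invented for illustration only.
example = {
    "gender": "Female",
    "age_group": "30s",
    "title": "(None)",
    "home_prefecture": "*",
}
cleaned = normalize_speaker(example)
```

Note that "(None)" is deliberately left as a string: it records that the speaker has no job title, which is different from the title being unknown.
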