Summary of the Data

1. Summary of C-JAS (Data)

C-JAS stands for "Corpus of Japanese as a Second Language". It is a spoken corpus of learners of Japanese as a second language. Its intended audience is Japanese language teachers and anyone who is interested in, or doing research on, foreign language learners of Japanese.
The corpus has the following four features:

(1) It includes oral task data from a fixed group of learners with two different native languages, collected over a 3-year period.

(2) It includes natural conversation data collected for the purpose of grammar acquisition studies.

(3) It can be used online through a search system provided with the corpus.

(4) Tags are attached to indicate learners' misuse of syntax, grammar, pronunciation, etc.

 

2. Summary of the Target Learners

Table 1 shows each learner's gender, native language, age during the research period, and learning environment. All six learners were classroom learners who studied Japanese at the same language school during the same period. The details are as follows.

Table 1. Summary of Target Learners

| Learner | Gender | Native language | Age during the research period | Learning environment |
|---|---|---|---|---|
| C1 | Female | Chinese | 25-28 years old | 1st research period: student at the Japanese language school |
| | | | | 3rd-4th research periods: first-year college student (nursing school) |
| | | | | 5th-8th research periods: second-year college student |
| C2 | Female | Chinese | 20-23 years old | 1st research period: student at the Japanese language school |
| | | | | 2nd-5th research periods: first-year junior college student (Japanese literature) |
| | | | | 6th-8th research periods: second-year junior college student |
| C3 | Female | Chinese | 22-25 years old | 1st-2nd research periods: student at the Japanese language school |
| | | | | 3rd-4th research periods: research (auditing) student at a college (department of commerce) |
| | | | | 6th-8th research periods: first-year student (department of commerce at another college) |
| K1 | Male | Korean | 21-24 years old | 1st-2nd research periods: student at the Japanese language school |
| | | | | 3rd-4th research periods: student at another Japanese language school |
| | | | | 5th-8th research periods: first-year vocational school student |
| K2 | Male | Korean | 18-21 years old | 1st-2nd research periods: student at the Japanese language school |
| | | | | 3rd-4th research periods: first-year college student (engineering department) |
| | | | | 5th-8th research periods: second-year college student |
| K3 | Female | Korean | 21-24 years old | 1st-3rd research periods: Japanese language school (left after the 3rd semester) |
| | | | | 4th-5th research periods: housewife working part-time |
| | | | | 6th-8th research periods: first-year college student (department of commerce) |

 

3. Period of Data Collection

The data were collected from July 1991 to March 1994. Eight research sessions were planned for each learner. Each session lasted about 60 minutes and took the form of a conversation. Each data set is labeled according to its research period (1st to 8th). Only C1's second data set is missing (*1), so there are 47 data sets in total (6 learners × 8 research periods − 1). K1's second session lasted only 30 minutes (*2). Table 2 below lists each data set together with the date on which the session was conducted.

Table 2. Data Contents and Research Dates

C1-C3: native speakers of Chinese; K1-K3: native speakers of Korean.

| Research period | C1 | C2 | C3 | K1 | K2 | K3 |
|---|---|---|---|---|---|---|
| First research | ’91/7/24 | ’91/6/27 | ’91/8/22 | ’91/9/9 | ’91/7/10 | ’91/9/12 |
| Second research | *1 | ’92/5/1 | ’92/3/15 | ’92/2/24 (*2) | ’91/12/4 | ’92/3/13 |
| Third research | ’92/8/5 | ’92/7/19 | ’92/7/16 | ’92/7/22 | ’92/7/17 | ’92/7/5 |
| Fourth research | ’92/12/20 | ’92/11/30 | ’92/11/23 | ’92/12/21 | ’92/12/5 | ’92/11/29 |
| Fifth research | ’93/4/26 | ’93/3/2 | ’93/3/21 | ’93/4/20 | ’93/4/2 | ’93/3/18 |
| Sixth research | ’93/7/27 | ’93/7/16 | ’93/8/2 | ’93/7/27 | ’93/8/31 | ’93/8/22 |
| Seventh research | ’93/12/12 | ’93/12/16 | ’93/12/29 | ’93/11/27 | ’93/12/27 | ’93/11/11 |
| Eighth research | ’94/3/9 | ’94/3/8 | ’94/3/8 | ’94/3/10 | ’94/3/4 | ’94/3/12 |

 

4. Data Contents

【Research】7-8 sessions per learner (about 60 minutes per session)

【Data volume】47 data sets (about 46 hours 30 minutes in total; about 570,000 words)

【Research Methods】Free conversation with native speakers of Japanese (a common topic for each research period)

【Method of publishing the data】Plain text (text files); online search (online reference tool "Chunagon")
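
The figures given in Sections 3 and 4 (47 data sets, about 46 hours 30 minutes) can be cross-checked with a short calculation. The Python sketch below is only an illustration and is not part of the corpus tools. It assumes every session lasted exactly the nominal 60 minutes, except K1's second session (30 minutes), and that C1's second data set is the only one missing. All names in it (`LEARNERS`, `MISSING`, `SHORT_SESSIONS`, `session_minutes`) are hypothetical.

```python
# Cross-check of the data-set count and total recording time.
# Assumption: sessions are treated as exactly 60 minutes long, although the
# source only says "about 60 minutes"; the result is therefore approximate.

LEARNERS = ["C1", "C2", "C3", "K1", "K2", "K3"]
PERIODS = range(1, 9)                 # research periods 1 to 8
MISSING = {("C1", 2)}                 # *1: C1 has no second data set
SHORT_SESSIONS = {("K1", 2): 30}      # *2: K1's second session was 30 minutes
NOMINAL_MINUTES = 60


def session_minutes():
    """Map (learner, period) to nominal minutes for each existing data set."""
    return {
        (learner, period): SHORT_SESSIONS.get((learner, period), NOMINAL_MINUTES)
        for learner in LEARNERS
        for period in PERIODS
        if (learner, period) not in MISSING
    }


sessions = session_minutes()
total_minutes = sum(sessions.values())
hours, minutes = divmod(total_minutes, 60)

print(f"{len(sessions)} data sets, about {hours} h {minutes} min")
# prints: 47 data sets, about 46 h 30 min
```

Under these assumptions the nominal total matches the stated data volume: 46 full sessions plus one 30-minute session.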

 

5. The Materials Used

The following textbook was used by all learners at their Japanese language school:

『NIHONGO SHOHO (Elementary Japanese)』 The Japan Foundation, Japanese Language Institute

 

6. Interview Topics

A common topic was decided in advance for each of the eight research sessions. Each session took the form of a casual conversation with a native speaker that included the topic.
The common topic for each research period is as follows:

First research: Memories of elementary and junior high school teachers

Second research: Looking back on the year of study abroad in Japan

Third research: My Japanese friends

Fourth research: My school life

Fifth research: About Japanese people

Sixth research: How we spend our holidays

Seventh research: Clothing, food, and housing in Japan

Eighth research: Looking back on my three years in Japan