Evaluations on TEM-4 Oral Test

Source: 教师·中
  Abstract: The TEM-4 Oral Test is designed to measure the English speaking skills of second-year English majors in China. This article first introduces the purpose, format and content, and scoring criteria and procedures of the TEM-4 Oral Test. It then evaluates the test in terms of validity, reliability and backwash. Though the test is feasible at present, it still has some problems in implementation. In the second section, the author also offers some advice on improving the test.
  Key words: language testing; TEM-4 Oral Test; validity; reliability; backwash
  Ⅰ Introduction of TEM4 Oral Test
  1. The purpose of TEM4 Oral Test
  Test for English Majors-Band 4 (TEM4) is a national examination for students in the first two years of a degree program in English and literature in P. R. China. It is an achievement test that has been made compulsory for English language majors all over China. The objectives of the TEM tests are:
  To evaluate English language teaching and learning at the end of the foundation stage and at the end of the four-year undergraduate program in the light of the national teaching syllabuses; and to bring about beneficial backwash effects on teaching and learning. (Zou, 1998:1)
  TEM was “originally regarded as a means to check the implementation of the national teaching syllabus” and “it was agreed that the test content should reflect the requirements in the syllabuses” (Zou, 1998:2). Now TEM is widely taken by English majors as proof of English proficiency.
  As a subtest of TEM4, the TEM4 Oral Test is designed for second-year English majors in China to assess their spoken English in different types of situations and on a wide variety of topics. According to the requirements for second-year English majors in the teaching syllabus (translated from the syllabus, 2000), candidates have to:
  (1) Be able to ask and answer questions, retell and discuss based on given material, and give a 3-4 minute presentation after 1-2 minutes of preparation time.
  (2) Be able to communicate with people of English speaking countries on daily and social subjects.
  2. Test Format and Content
  The TEM4 Oral Test is a recorded (semi-direct) test consisting of three tasks, each involving a particular speech activity.
  Part One is retelling, which requires the candidates to listen to a passage of approximately 300 words. The passage may be a story, a narration of an experience or an anecdote. The candidates have to start their retelling immediately after they have heard the passage twice. This task lasts about 6 minutes.
  Part Two is a talk on a given topic. The candidates are very often given a topic related to the general theme of the passage they heard in Part One. For example, the passage may describe someone’s miserable early childhood, and the candidates are then required to talk about one happy incident in their own childhood. After they have heard the topic spoken twice on the test tape, the candidates have three minutes to prepare and another three minutes to talk. So this part lasts about 6 minutes.
  Part Three involves two candidates. Each of them gets a sheet describing a specific situation and the role he/she is expected to play. Although the situation is the same for the two candidates, their roles are different. For example, the given situation may be borrowing money: student A wants to borrow 20 dollars from student B. Student A must try his/her best to persuade student B to lend the money, while student B must politely refuse A’s request. The preparation time is three minutes, and the conversation is limited to four minutes.
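  As a quick arithmetic check of the format described above, the stated durations can be added up. The sketch below is only illustrative: the dictionary name and labels are hypothetical, and the figures are the approximate minutes given in the text (Part One about 6; Part Two 3 + 3; Part Three 3 + 4).

```python
# Approximate durations in minutes, taken from the task descriptions above.
# The names used here are illustrative, not official terminology.
TASK_MINUTES = {
    "Part One (retelling)": 6,            # stated total for the task
    "Part Two (talk on a topic)": 3 + 3,  # preparation + talk
    "Part Three (role play)": 3 + 4,      # preparation + conversation
}

total_minutes = sum(TASK_MINUTES.values())
print(total_minutes)  # 19
```

  The total of about 19 minutes is the figure used later when the length of the test is discussed.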
  The TEM4 Oral Test requires students to take part in various kinds of oral communication, and scores are given according to their performance. The test is designed to measure not only language competence, including pronunciation and intonation, fluency, and grammatical and lexical features, but also communicative competence, including discourse and pragmatic ability. Discourse ability refers to the capability to produce extended and coherent speech, while pragmatic ability involves flexibility in dealing with different situations and topics and appropriacy of language use in a specific context. The first two parts of the test are concerned more with language competence, and the third part more with communicative competence.
  3. Scoring Criteria and Procedures
  In terms of test classification, the TEM4 Oral Test is a criterion-related, norm-referenced test. It aims to test the candidates’ oral proficiency level with scientific scoring criteria and procedures. The TEM Committee has set criteria for candidates’ oral language ability on a four-band scale: excellent (4 points), fairly good (3 points), pass (2 points) and fail (1 point). To ensure fairness and reliability, the scorers should assess in the light of the following criteria (translated from Syllabus for TEM4 Oral Test (2005)):
  Based on the scoring criteria, two scorers each assess the answer tape independently; neither knows the scores assigned by the other. Candidates’ scores are the average of these independent scorings. Pre-training is necessary to ensure intra-scorer and inter-scorer consistency. If the two scorings do not show adequate agreement, the tape is scored by another two independent scorers. After the final score is worked out, candidates’ bands are decided according to their rank order within their groups and the theoretical norm of all candidates’ scores. In recent years, about 3% of candidates have been rated excellent, 14.5% fairly good, 59% pass and 23% fail (Wen, 2003:368-373).
  Ⅱ Evaluation of TEM4 Oral Test
  1. Validity
  (1) Content validity. “A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.” (Hughes, 1989:22)
  Syllabus for TEM4 Oral Test (2005) specifies that the test is to evaluate English language teaching and learning. The scope of the test should not be beyond the requirement for second-year English majors in the teaching syllabus.
  Part One and Part Three meet the first requirement of the teaching syllabus, “be able to ask and answer questions, retell and discuss based on given material”. Part Two may seem to have no connection with the teaching syllabus, but it is actually covered by the second requirement, “be able to communicate with people of English speaking countries on daily and social subjects”. For reasons of feasibility the test cannot involve real communication with native speakers, but its content basically corresponds to the requirements of the teaching syllabus. So the TEM4 Oral Test has content validity.
  (2) Face validity. “A test is said to have face validity if it looks as if it measures what it is supposed to measure.” (Hughes, 1989:27)
  The face validity of the retelling task in Part One is questionable, because the candidates’ speaking performance depends heavily on their listening skill and memory.
  In Heaton’s (1988) view, “the skills of listening, speaking, reading and writing are also separated from one another as much as possible because it is considered essential to test one thing at a time.” (P15) He also said that “the test must aim to provide a true measure: to the extent that it measures external knowledge and other skills at the same time, it will not be a valid test.” (P159)
  It should be emphasized that this is an oral test, which is supposed to measure only candidates’ speaking skill. However, Part One measures not only candidates’ speaking skill but also their listening skill and memory. A good speaker may be a poor listener. Or, even if he/she is good at listening, he/she may have a bad memory. In either case he/she cannot retell the listening material in detail. Yet the standard for the pass band is that the candidate can retell the important information of the listening material, though not necessarily in order. This means that a candidate who speaks fluently and with accurate grammar but changes the content of the story, however vividly he/she tells it, still cannot pass this part. This is unfair, because he/she has shown good speaking ability.
  However, Wen and her doctoral candidate Wang (2009) believe that the task has relatively strong communicative value. They argue that in real life people often need to tell others what they saw on TV or read in a newspaper; in some special circumstances, such as court testimony, they also have to report others’ speech. Wen and Wang also point out that a good memory is an essential qualification for interpreters.
  The author agrees that Part One can test communicative competence, but the test is defined as an oral test, not an integrative one. It is not advisable to judge candidates’ speaking skill by a score that is influenced by other factors.
  (3) Construct validity. “A test, part of a test, or a testing technique is said to have construct validity if it can be demonstrated that it measures just the ability which it is supposed to measure.” (Hughes, 1989:26)
  In the Part Three conversations, the candidates’ scores are based on the number of viewpoints they present. The problem is that some students come up with many points without further explanation, while others may give only two points but analyze each in depth. Thus the scoring criterion seems unfair to the latter.
  2. Reliability
  Hughes (1989) offers several pieces of advice for making an oral test more reliable:
  “Make the oral test as long as is feasible. It is unlikely that much reliable information can be obtained in less than about 15 minutes, while 30 minutes can probably provide all the information necessary for most purposes.” (P105) The whole TEM4 Oral Test lasts about 19 minutes, which basically meets this requirement of length.
  “If possible and if appropriate, more than one format should be used.” (P105) TEM4 Oral Test has three different formats; however, each of them has some problems.
  The problem with Part One has been mentioned above.
  In Part Two candidates are required to narrate their own experiences. However, some candidates may feel embarrassed to share certain kinds of experience, which may affect their performance. For example, suppose the topic given is “an unforgettable experience”. For some candidates this may be a shameful experience that they do not want others to know about. They may waste time deciding whether to tell it or another story instead, or they may omit details so as not to sound foolish, and they may feel so nervous in the process that they do not perform well. An alternative topic such as “a happy incident” may be better, because nearly everyone is willing to share a happy experience with others. Though, as Bradshaw (1990) observed, “very little is at present known about the attitudes and concerns of test-takers, and even less about the features of test items or the testing situation which can cause negative reactions which may affect performance or lead to feelings of post test resentment” (P13), affective factors need to be taken into consideration. Hughes also said that leaving candidates alone to prepare a monologue must create stress. (P110)
  Another problem in this part is security. Because the topics in this part are limited to one’s own experience, it is easy for candidates to adapt an answer they prepared or memorized before the test. They can guess the possible topics beforehand, find good sample answers in commercially available training materials and change the content slightly, or write a draft themselves with ample preparation; all they then need to do in the test is recite the answer. So the author thinks that giving the candidates a topic with which they are very familiar may not be a good idea. But giving an unfamiliar topic may increase the task difficulty. Perhaps the task could be changed to “talk based on a 4-frame comic”. However, whether candidates can understand the comic may become another problem.
  Part Three is pair work: two candidates have to interact with each other. This raises the problem of interaction. “The interactive nature of speech and the level of personal involvement which even formal speaking will lead to meaning that it is extremely hard to eliminate the effects of one speaker on another.” (Hughes, 2005:79) Cooperation between the two candidates may become a problem if there is a large gap in their language competence. The stronger candidate may give many hints to help the weaker one talk, but the weaker one may still fail to continue the dialogue and thus prevent the stronger one from holding a good conversation. Or the stronger candidate may take so many turns that the weaker one does not have enough chances to speak. The scoring scale does not clearly describe how to score a candidate who takes over his/her partner’s chances to speak. If a candidate gets a high score only because he/she speaks more than his/her partner, it is quite unfair.
  In recent years, the conversation task has taken the form of a debate. The author took the test herself in 2007, and she and her partner were required to discuss whether an undergraduate should take a well-paid job as a hotel bellboy. One candidate was designated to support the proposal and the other to oppose it. The problem is that supporting and opposing a proposal are not always equally difficult. It may be easy to collect arguments in support but much harder to find arguments against. So a candidate’s role may decide whether he/she has an advantage in the task. What is worse, the scorers’ subjective views may affect their judgment. The author heard that in a national English debate contest a judge once gave a higher mark to the proposition than to the opposition simply because she supported the proposal herself, although nearly all the audience thought the opposition had performed better. She apologized later, but the result could not be changed. How can we be sure that such a thing will not happen in the test scoring? At the very least, the scorers need to be trained to score objectively.
  The paired format may have another problem: if the partner is absent, how is the candidate to finish Part Three? The author still remembers the situation just before she took the test. She was told that her test partner had been taken to hospital with a sudden illness a few days earlier and that there was no substitute. She was very anxious, because if her partner could not come, she would get scores only for the first two tasks and her total score would not be enough to pass. Luckily, her partner arrived on time. However, the author could not stay as calm as usual, and her performance in the test was not as good as it should have been.
  “Use a second tester for interview.” (P106)
  “As a general rule, and certainly where testing is subjective, all script should be scored by at least two independent scorers. Neither scorer should know how the other has scored a test paper. Scores should be recorded on separate score sheets and passed to a third, senior, colleague, who compares the two sets of scores and investigates discrepancies.” (P42)
  The TEM4 Oral Test fully meets this requirement. However, there are other problems in scoring.
  The scale description is ambiguous, for example “A few obvious grammatical mistakes”. It is not clear how many grammatical mistakes count as “a few” and how many as “many”, or what kind of grammatical mistake is serious. Therefore, scorers make their judgments according to their own teaching experience, and that causes discrepancies in scoring. In Wang’s empirical study (2007:52), some scorers considered more than three grammatical mistakes to be many, while others believed that more than five was unacceptable, and some scorers did not count the mistakes at all and instead scored by intuition. Under such conditions the scoring has little reliability.
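  The effect of such differing thresholds can be illustrated with a small sketch. The mistake counts below are invented for illustration (they are not data from Wang’s study); only the two thresholds, “more than three” and “more than five”, come from the text.

```python
# Hypothetical numbers of grammatical mistakes made by eight candidates.
mistake_counts = [1, 2, 4, 5, 6, 3, 7, 4]


def judged_many(count: int, threshold: int) -> bool:
    """Return True if a scorer using this threshold would call the mistakes 'many'."""
    return count > threshold


scorer_a = [judged_many(c, 3) for c in mistake_counts]  # "more than three is many"
scorer_b = [judged_many(c, 5) for c in mistake_counts]  # "more than five is many"

# Count the candidates on whom the two scorers reach different judgments.
disagreements = sum(a != b for a, b in zip(scorer_a, scorer_b))
agreement_rate = 1 - disagreements / len(mistake_counts)
print(disagreements, agreement_rate)  # 3 0.625
```

  Under these assumptions the two scorers disagree on every candidate who makes four or five mistakes, which is exactly the range the scale description leaves undefined.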
  Another problem in scoring is that the TEM4 Oral Test is norm-referenced in order to achieve high discrimination. As a result, a candidate rated “fairly good” in one group might be rated “excellent” in another group, and even if the candidates in one group are all at a high level, some of them still cannot pass the exam. The result could be unfair.
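  A minimal sketch can make this group effect concrete. Everything in it is an assumption made for illustration: the band shares are round hypothetical numbers rather than the TEM Committee’s actual quotas, the two groups and their scores are invented, and the function `assign_bands` is not part of any official procedure. It only shows what happens when bands are decided purely by rank order within a group.

```python
from typing import Dict, List, Tuple

# Hypothetical band shares in percent, from the top band down.
# Round numbers chosen for illustration; not the actual quotas.
BAND_SHARES: List[Tuple[str, int]] = [
    ("excellent", 10),
    ("fairly good", 20),
    ("pass", 50),
    ("fail", 20),
]


def assign_bands(scores: Dict[str, float]) -> Dict[str, str]:
    """Assign bands purely by rank order within one group."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    n = len(ranked)
    bands: Dict[str, str] = {}
    start = 0
    cumulative = 0
    for band, share in BAND_SHARES:
        cumulative += share
        end = (cumulative * n + 50) // 100  # rounded cut-off rank
        for name in ranked[start:end]:
            bands[name] = band
        start = end
    return bands


# Two invented groups of ten candidates; scores are on the 1-4 scale.
strong_group = {f"s{i:02d}": s for i, s in enumerate(
    [3.9, 3.8, 3.7, 3.6, 3.5, 3.4, 3.3, 3.2, 3.1, 3.0], start=1)}
weak_group = {f"w{i:02d}": s for i, s in enumerate(
    [3.7, 3.0, 2.8, 2.6, 2.5, 2.4, 2.2, 2.0, 1.8, 1.5], start=1)}

strong_bands = assign_bands(strong_group)
weak_bands = assign_bands(weak_group)

print(strong_bands["s03"])  # fairly good (score 3.7, ranked third)
print(weak_bands["w01"])    # excellent (score 3.7, ranked first)
print(strong_bands["s10"])  # fail (score 3.0, ranked last)
```

  In this invented data the same score of 3.7 earns “fairly good” in the strong group but “excellent” in the weak one, and a score of 3.0 fails in the strong group while it would be rated “fairly good” in the weak group.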
  The condition of the test equipment may also influence test reliability. The whole process of the TEM4 Oral Test is recorded by computer. Each candidate has a file, which is sent to the examination center after the test is finished. The equipment is checked before the test, but problems still sometimes occur.
  The recording may contain noise that covers the candidate’s voice (some candidates are so shy that they cannot speak loudly). And sometimes candidates do not place the microphone properly, so that their breathing can be clearly heard in the recording. These problems make it very hard for the scorers to hear the candidates’ speech clearly.
  And no one can guarantee that the equipment will always work well during the test. In 2006, a candidate in the author’s university did not get a score simply because a technical problem had prevented his voice from being recorded. Ironically, he got the highest mark in the TEM4 written test in the author’s university that year. One can imagine that if the computer had not broken down, he might also have got a high mark in the oral test. However, he was not allowed to take the test again. It is a pity that he has not got the oral test certificate.
  3. Backwash
  The oral test is separate from the written test of TEM4, which focuses on listening, reading and writing. The oral test is also optional: English majors need not take it, so it does not matter much if they choose to take the test and do not pass. But if they fail the written test of TEM4, they cannot get their bachelor’s degree as English majors and will find it hard to get a job in a related field.
  The negative backwash is that the speaking skill may be ignored in teaching. Education in China is still exam-oriented. Teachers and students prefer to concentrate on practicing the other three skills. Some universities may even suspend normal teaching to prepare students for the TEM4 written test and even help the students cheat in the test!
  Credibility and order are more important than ability. On the other hand, more attention must be paid to the speaking skill. In the global era, international communication is becoming more and more frequent and is not limited to the written form. People from different countries need to talk with each other face to face. Many activities cannot be done without speaking. English speaking proficiency is an important qualification not only for English majors but also for students of other majors. So the author insists that English majors must hold at least one certificate to prove their English speaking proficiency.
  “While we should not teach ‘towards’ a test, we can use tests as teaching tools.” (Brown, 1994:266) Wen (2001), Y. Huang (2002), H. Huang (2004) and Wang