A Validity Study of Multiple-Choice Questions in the National Matriculation English Test (NMET)

Source: 中外教育研究
  【Abstract】Drawing on theories from the international testing literature and on specific items from domestic test papers, this paper analyses the advantages, disadvantages, and washback effect of multiple-choice questions. The author concludes that the negative side of the multiple-choice question outweighs the positive side, that it produces an extremely negative washback on English teaching and learning, and that it is of low validity. Some changes to the NMET are then suggested so as to better guide English teaching and testing in secondary schools and to foster students’ ability to apply English.
  【Key words】Validity; Multiple-choice; NMET; Washback effect
  【CLC number】G642.47 【Document code】A 【Article ID】1006-9682(2008)12-0019-03
  
  Ⅰ、Introduction
  
  China is a country with a centralized educational system, in which the National College Entrance Examination (NCEE hereafter) is officially used as the only authoritative selection device for higher education. With growing awareness of the importance of mastering English for China’s opening-up and reform policy, and of young people’s need to take part in a global economy and information network, English has been made a compulsory subject in the NCEE, examined through the National Matriculation English Test (NMET hereafter). The function of the NMET is to test candidates’ language ability, no matter what curriculum they followed in school. Its contents are not drawn from the candidates’ textbooks; it tests language proficiency in order to determine whether a candidate’s English has reached the level required for university study (Hughes, 1989:9). At the same time, it influences the contents and methods of English teaching in secondary schools, where it acts like a conductor’s baton. At present the whole of society pays close attention to the NCEE, which puts schools under tremendous pressure and makes the examination their central task. The scores students obtain in the NCEE have naturally become an important basis for measuring the quality of teaching. As a result, what the NCEE tests is what teachers teach and what students learn.
  The NCEE should be an integrative test that directly tests students’ language abilities in listening, speaking, reading, and writing. As such an important examination, it must remain absolutely objective and fair: it must be rigorously designed so as to be free of errors, and each question must have a single standard answer (criterion-referenced).
  
  Ⅱ、Testing Validity
  
  Validity is the quality which most affects the value of a test, prior to, though dependent on, reliability. The validity of a language test is established by the extent to which it succeeds in providing an accurate concrete representation of an abstract concept, for example proficiency, achievement, or aptitude (Davies, A. et al, 2002:221).
  Validity is the extent to which a test measures what it claims, intends, or is supposed to measure, or the extent to which the results of a group of test takers correlate with a particular criterion. Validity is commonly divided into several types: face validity, predictive validity, concurrent validity, response validity, construct validity, and content validity.
  Face validity means whether the test looks like what it is supposed to test. So it is also called surface credibility or public acceptability. Direct tests have higher face validity than indirect tests. Qualitative measures are used to determine the face validity of a test, such as public opinion and expert judgment.
  Predictive validity is the extent to which a test can predict students’ future performance. The two measurements under comparison should be separated by a time span. For placement tests, predictive validity is very important.
  Concurrent validity is established by comparing a test with an established test. The new test and the established one are given to the same group of students, and the data are analyzed through quantitative measures (correlation, ranging from -1 to +1). A correlation coefficient of more than 0.7 is satisfactory.
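  The concurrent-validity check described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the correlation meant is the Pearson product-moment coefficient (the source does not name one); the function name `pearson_r` and the two score lists are hypothetical and are not data from the NMET.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (>= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to exist")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of the same six students on the two tests.
new_test = [62, 75, 81, 58, 90, 70]
established_test = [60, 78, 85, 55, 88, 72]

r = pearson_r(new_test, established_test)
# 0.7 is the threshold the text treats as satisfactory.
print(f"r = {r:.2f}, satisfactory: {r > 0.7}")
```

  With real data, the two lists would hold each student's score on the new test and on the established test, in the same student order.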
  Response validity concerns whether the tasks of a test elicit the intended performance from the students. Sometimes students use test-taking strategies rather than real language skills, so response validity concentrates on the process of test-taking rather than on test results.
  Construct validity, in language testing, concerns the language ability or competence (the trait) being measured. This construct is very abstract; it includes proficiency, strategies, skills, etc., and is related to score interpretation.
  Content validity is about how well the test measures the knowledge and abilities the test takers should attain. It refers to the representativeness of the content selected for assessment, and it also requires adequate sampling (a sufficient number of items) of that content. Content validity can be assessed through expert judgment and study of the test specification.
  
  Ⅲ、Validity Analysis of MCQ in NMET
  
  At present the NMET includes two parts, the objective and the subjective, with a total of 150 points. The objective part accounts for 115 points and the subjective part for 35 points, so the latter is less than one third of the former. The objective part consists entirely of multiple-choice questions (MCQ). Can MCQ really test comprehensive language-applying ability? The following analysis focuses on the advantages and disadvantages of MCQ in the NMET.
  First of all, what kind of item is the MCQ, and how are the best answer and the distracting options written? An MCQ is composed of a stem (a question or an incomplete sentence) and four options: the key and three distractors. Questions on grammar and communicative expressions are used to test students’ mastery of grammar and their communicative ability. Reading comprehension tests candidates’ contextual knowledge (ideas, meaning), background knowledge, language knowledge (grammar, words), and orthography (spelling), as well as their ability to distinguish central ideas from non-central ideas and details, to synthesize information from all paragraphs, and to identify concepts and viewpoints. There is just one right answer among the four options, and the other three distractors take the following forms: (a) If the question asks candidates to identify concepts or ideas, the distractors will be ① a contrary view, or ② a definition, fact, or idea that does not appear in the article. (b) If it asks for the central idea, the distractors will be a supporting view or a sentence from the article that does not express the central idea. (c) If it asks for comprehensive information, the distractors will be ① a statement contrary to the definitions, facts, or views in the article, or ② items that are not mentioned in the article but that candidates tend to take for granted on the basis of common sense (Huhta, A., K. Sajavaara and S. Takala, 1993:82).
  The advantages of multiple-choice questions are as follows. First, they are highly objective and fair, and therefore highly reliable. Second, the subject of each question is clear and the answer is simple. Third, they are easy to predict: the difficulty coefficient of each item or of the whole test can be estimated in advance, so the test can be modified in time to ensure its quality.
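  As an illustration of the third point, the sketch below estimates difficulty from pre-test answers. It assumes the "difficulty coefficient" is the common item facility index, that is, the proportion of test takers who choose the key (the source gives no formula). The function `item_facility` and the response data are hypothetical.

```python
from collections import Counter


def item_facility(responses, key):
    """Proportion of test takers whose chosen option equals the key."""
    if not responses:
        raise ValueError("no responses for this item")
    return sum(1 for choice in responses if choice == key) / len(responses)


# Hypothetical pre-test data: item number -> (key, options chosen by 10 students).
pretest = {
    21: ("B", list("BBCBABDBBC")),
    30: ("B", list("BBBABBBDBB")),
    32: ("B", list("BABBCBBBDA")),
}

facilities = {
    number: item_facility(responses, key)
    for number, (key, responses) in pretest.items()
}
# A simple whole-test estimate: the mean facility over all items.
test_facility = sum(facilities.values()) / len(facilities)

for number, (key, responses) in pretest.items():
    # The option counts also show how often each distractor was chosen.
    print(number, f"facility={facilities[number]:.2f}", dict(Counter(responses)))
print(f"whole test: {test_facility:.2f}")
```

  A value near 1 marks an easy item and a value near 0 a hard one, so items outside the intended range can be revised before the live test.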
  The disadvantages of multiple-choice questions are as follows. First, the MCQ is an indirect test and cannot test the real language ability of the candidates. Many Chinese students have learned English for many years, are good at doing MCQs, and obtain good examination results, yet cannot speak and are poor at listening; they become victims of “dumb English”. Weir (1990:44) argues that the MCQ is not a valid test item, because in real life people rarely choose among four options to show whether they have understood; we show our understanding of what we hear and read by speaking and writing. For example, the following is an MCQ testing candidates’ knowledge of English usage:
  21. ________ beauty, everyone has his own view.
  A. In the term of B. In terms of C. In the eye of D. In store for
  The phrase in option C, “in the eye of”, does not exist in English; there is only “in the eyes of”. Such an item is not only meaningless but also misleads students’ language acquisition. Option A cannot be found in the Oxford or Longman dictionaries and should be replaced by “in the long term”. Students can therefore select the correct answer as long as they remember the phrase “in terms of” from their textbooks, so the item cannot test the candidates’ ability to use language.
  In accordance with the definitions and theories of validity in Part Ⅱ and the analysis above, it can be concluded that MCQ is of very low validity.
  
  Ⅳ、Washback Effect of MCQ on Teaching and Learning
  
  Testing, as one of the assessment tools, plays a significant role in curriculum design and instruction, mainly because some test results are used in policy-making. The larger the scale of a test, the greater the impact it produces; moreover, the impact of testing on teaching and learning ripples out to the educational system and to society as a whole (Zou Shen, 2005). “Washback” and “impact” should be distinguished. A language test can have macro-level impacts on society or the education system and micro-level impacts on individuals, and washback is regarded as one aspect of impact (Bachman and Palmer, 1996). Washback refers to how tests affect teaching and learning, while impact covers their broader influence on education and society (Hamp-Lyons, 1997).
  In the whole education system, teaching and testing are the two most important factors; they are interrelated and interdependent, and have a relationship that Hughes (1989) called ‘partnership’. Given this inherent relationship between teaching and examination, it has been widely assumed and acknowledged that testing influences teaching. However, it was not until the 1950s that scholars deliberately turned their attention to the influence of tests on teaching and learning. At the macro level of the whole education system, for example, Vernon (1956:116) first claimed that examinations distorted the curriculum and that teachers taught to the test. At the micro level of language education, in the 1980s more researchers (Kellaghan et al, 1982; Alderson, 1986; Smith et al, 1989; Pearson, 1988; Hughes, 1989) studied washback, and most of them concluded that language testing exerted a negative influence on language teaching and learning, such as narrowing the curriculum (Madaus, 1988), losing instructional time (Smith et al, 1989), reducing the emphasis on skills that require complex thinking or problem-solving (Frederiksen, 1984; Darling-Hammond & Wise, 1985), teaching to the test, and so on. Others, however, saw washback in a more positive way. Pearson (1988:107) considered good examinations beneficial to teaching and learning. Alderson (1986:104) even argued for innovation in the language curriculum through innovation in language testing.
  The washback of MCQ testing in the NCEE on teaching and learning is clearly negative. It leads English teaching away from its purpose. It is well known that the real purpose of learning English is to communicate with people who speak English, so the most important thing is to be able to speak it and to understand what one hears. However, the NMET misleads candidates into believing that a high score in the examination shows their level of English, and MCQs, which make up about 80% of the paper, are the key to high scores. MCQs are therefore naturally favored by teachers and students.
  As a result, students consider that learning English well means being good at multiple-choice questions, and teachers focus on teaching how to do them. They spend much time teaching MCQ skills such as guessing and exclusion, with which students can answer even without understanding the topic or text.
  For example, consider items 30 and 32 of the NMET in the 2005 NCEE (Shanghai), Part II (Grammar and Vocabulary):
  30. More than a dozen students in that school _________abroad to study medicine last year.
  A. sent B. were sent C. had sent D. had been sent
  32. He got well-prepared for the job interview, for he couldn’t risk ________ the good opportunity.
  A. to lose B. losing C. to be lost D. being lost
  For item 30, by “exclusion”: the time is last year, so C and D are excluded; the voice is passive, so A is excluded; this leaves the correct answer, B.
  For item 32, A and C are excluded because “risk” is followed by a noun or gerund, and the voice is active, so D is excluded, leaving the correct answer, B. Candidates can reach the correct answer without understanding the meaning. Being good at MCQs therefore does not reveal students’ real ability to use and communicate in English, and this deviates from the purpose of learning English.
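  The pay-off of the exclusion strategy can be made concrete with a small calculation. The sketch below rests on a simplifying assumption that is not in the source: after excluding some distractors on formal grounds, the candidate guesses uniformly among the remaining options and never excludes the key. The function `guess_probability` is a hypothetical helper.

```python
from fractions import Fraction


def guess_probability(n_options, n_excluded):
    """Chance of hitting the key by a uniform guess among the options left."""
    if not 0 <= n_excluded < n_options:
        raise ValueError("must leave at least one option to choose from")
    return Fraction(1, n_options - n_excluded)


# A four-option item, with zero to three distractors excluded.
for excluded in range(4):
    p = guess_probability(4, excluded)
    print(f"{excluded} distractor(s) excluded -> P(correct) = {p}")
```

  In items 30 and 32 all three distractors can be excluded from surface cues of tense and voice, so under this assumption the key is reached with certainty, whether or not the sentence has been understood.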
  It is very difficult to design such tests, especially for beginners. Test designers should undergo rigorous training and run a pre-test. The four options of each question should be similar in length and independent of one another, and the distractors should be equally distracting, distinguishable, and fit for testing (Weir, 1990:44). The requirements on the item writer are therefore high and the task is difficult. Currently in our country, reading comprehension tests do not achieve an adequate degree of discrimination. Some items do not consist of four similar options, and for many questions the answer is an original sentence from the text, so candidates can find the answer directly in the text.
  For example: 2005 National College Entrance Examination (Shanghai), Part IV (Reading Comprehension, Passage A, items 65 and 66).
  65. Scientists believe that _______.
  A. some babies are born with a sense of direction
  B. people learn a sense of direction as they grow older
  C. people never lose their sense of direction
  D. everybody possesses a sense of direction from birth
  66. What is true of seven-year-old children according to the passage?
  A. They never have a sense of direction without maps.
  B. They should never be allowed out alone if they lack a sense of direction.
  C. They have a sense of direction and can find their way around.
  D. They can develop a good sense of direction if they are driven around in a car.
  The sentence “Scientists say we are all born with a sense of direction, but it is not properly understood how it works.” can be found in Paragraph 2 of the article. And the sentence “Children as young as seven have the ability to find their way around. However, if they are not allowed out alone or are taken everywhere by car, they never develop the skills.” can be found in Paragraph 3. So the candidates can find the correct answers(65. D, 66. C)directly from the article.
  
  Ⅴ、Conclusion
  
  To sum up, MCQs do more harm than good. The author therefore suggests that: ① MC items be made authentic, for example through information matching and picture finding; ② the weighting of constructed responses for listening and reading be increased; ③ MC items for grammar and vocabulary be cancelled; ④ students be given a choice, with more writing and reading tasks to choose from; ⑤ questions be written and arranged in accordance with the cognitive requirements of different items.
  Scores for MCQ should be reduced, and questions that can test candidates’ ability to apply language should be increased. For example, reading comprehension can be tested by answering questions, true-or-false items, completing sentences, filling in forms or pictures, matching meanings between paragraphs, etc. In cloze tests, students can fill in the missing words, and the first letter of each word can be given to reduce the difficulty. For listening, candidates can be asked to choose pictures that match the content, complete sentences, fill in forms or pictures, etc. An oral test has been added to the NCEE, so the most important language ability has been emphasized. Only by adjusting the proportion between MC and non-MC questions and improving the contents of the test can we enhance the validity of the NCEE, foster students’ ability to apply language and to learn throughout life, change the “dumb English” situation, and bring our practice into line with international English teaching.
  
  References
  1 Alderson, J. C. 1986. Innovations in Language Testing. In Portal (ed): 93~10
  2 Alderson, J. C. & D. Wall. 1993. ‘Does Washback Exist?’ Applied Linguistics, 14: 115~29
  3 Bachman, L. F. 1997. Fundamental Considerations in Language Testing. Oxford: Oxford University Press
  4 Bachman, L. F. & A. S. Palmer. 1996. Language Testing in Practice. Oxford: Oxford University Press
  5 Bailey, K. M. 1996. ‘Working for washback: a review of the washback concept’. Language Testing, 13/3
  6 Hughes, A. 1989. Testing for Language Teachers [M]. Cambridge: CUP. Chapter 3: Kinds of tests and testing, 9~21
  7 Zou Shen. 1998. English Language Testing: Some Theoretical and Practical Considerations. Shanghai: Shanghai Foreign Language Education Press
  8 Weir, C. 1990. Communicative Language Testing [M]. New York: Prentice Hall. Chapter 4: Test Methods, 42~79
  9 成善德. 从语用学看近年来 NMET 的构卷效度 [The test-construction validity of recent NMET papers from the perspective of pragmatics]. 外语与外语教学 (增刊), 1999
  10 邓卫东. 对高考英语题型改革的建议 [Suggestions for reforming the question types of the NCEE English test]. 试题研究 (高中英语), 2002(369)
  11 黄大勇、杨炳均. 语言测试的反拨效应研究概述 [An overview of research on the washback effect of language testing]. 外语教学与研究, 2002(7)