Welcome!

Hi! Jessica, Lorena and Eduardo welcome you to this blog! Here you can see all the documents and tasks we completed this semester for the subject "Métodos y Técnicas de Evaluación Educativa". Feel free to post a comment.

Wednesday, October 27, 2010

UNIT 4


TESTING AND ASSESSING THE FOUR SKILLS

Which assessment techniques do you prefer? Why?

Based on our learning objectives for Unit 6 of the book New Headway and on our assessment plan, we prefer to assess the four skills with the following techniques:
READING AND LISTENING
20% Reading and listening skills can be tested with these techniques:
· True / False / Don’t know. Example: Mark the sentences T, F or DK: “Lions are cats.” F. REASONS: Easy to write. Quite realistic. Tests gist or intensive understanding well.
· Multiple choice. Example: Choose the correct answer: Carolina swims a) rarely b) sometimes c) never. REASONS: Very easy to mark, and good for checking gist or intensive understanding.
· Sequencing (texts/pictures). Example: Listen and put the paragraphs in order. REASONS: Easy to construct. Good for stories (listening) and for linking of discourse.
· Text completion. Example: Listen and complete the information about the film. REASONS: Quite realistic. Good for listening for specific information.
· Identifying the topic (text/paragraph). Example: Match the title with the text. REASONS: Good for gist reading. Easy to construct and mark.
· Linking. Example: What does the underlined word refer to? “It arrived late.” REASONS: Good for testing intensive understanding and linking within a text (cohesion).
· Identifying linking words in a text. Example: after / next. REASONS: Good for working out how a text holds together (cohesion).
· Discrepancies. Example: Read the text, then listen and list the differences. REASONS: Realistic integrative test of both reading and listening.
LANGUAGE
20% Grammar and vocabulary can be tested with these techniques:
· Gap-fill. Selected words in a text are blanked out and students have to fill in the blanks. REASONS: Good for testing different structures; provides clear contexts. Easy to write and mark.
· Identifying structures, e.g. tenses or parts of speech. Example: Underline the correct verb form in the sentence: While he rode/was riding in the forest, he lost/was losing his wig. REASONS: Tests knowledge of the grammatical system and of metalanguage.
· Table completion. Example: Complete the table with these adjectives. REASONS: Good for testing knowledge of irregulars and word-building.
· Lexis classification. Example: Match the words with the topics. REASON: Good for testing lexical sets.
· Matching words/definitions. Example: One list contains verbs and a second list contains nouns and phrases related to those verbs; students who know the vocabulary can match each verb with its phrase. REASONS: Good for specific words and links with dictionary skills.
We prefer these techniques for reading, listening and language (vocabulary and grammar) because we want high reliability in the formal assessment. The techniques above call for short, practical answers and are usually quick and easy to mark.
WRITING
20% Writing can be tested with these techniques:
· Guided writing, using pictures, notes or diagrams (giving students some input of information). REASONS: More realistic than essays, because the input can create a reason for communication. Gives students help, so it is good for lower levels. Easier to mark than free writing.
· Punctuation (punctuating a text). REASONS: Good for testing specific knowledge of punctuation.
· Dictation. Example: Listen and write down the text. REASON: Realistic. A good integrative test of listening and writing (spelling).
· Combined. Example: Read the letter and write a reply. REASON: Realistic and very good for writing.
SPEAKING
20% Speaking can be tested with the following techniques:
· Role play. Students assume roles (with or without cued information). REASONS: Very good for testing interaction, and a commonly used task in most materials.
· Oral presentations. Students prepare and give short talks. REASONS: Realistic, and gives the tester time to assess performance.
· Picture description (using a drawing or photo). REASON: Gives the tester time to listen and gives students something concrete to talk about.
We prefer the techniques above for writing and speaking because we also want our formal assessment to be valid, so we propose integrative, open-ended techniques like these. Used by teachers, they involve students in communication and interaction.
Finally, discrete tests built from high-reliability techniques and integrative tests built from high-validity techniques both have advantages and disadvantages. The solution is therefore to combine them: we prefer to focus on the reliability-oriented techniques for the receptive skills (reading and listening) and for language (grammar and vocabulary), and to use the validity-oriented techniques for the productive skills (speaking and writing).



ACTIVITY 2 “ITEMS”

LEVEL: LOW INTERMEDIATE

READING ITEM

1. - Read the story “The Bald Knight”.

The Bald Knight

Once upon a time, a long time ago, there was a knight who, as he grew older, lost all his hair. He became as bald as an egg. He didn’t want anyone to see his bald head, so he bought a beautiful, black, curly wig.
One day some lords and ladies from the castle invited him to go hunting with them, so of course he put on his beautiful wig. ‘How handsome I look!’ he thought to himself. Then he set off happily for the forest.

However, a terrible thing happened. His wig caught on a branch and fell off in full view of everyone. How they all laughed at him! At first the poor knight felt very foolish, but then he saw the funny side of the situation, and he started laughing, too.
The knight never wore his wig again.

2. - The following sentences have been taken from the story. Read it again and decide where they fit in the text. Write the correct letter in the text. There is only one place for each letter.

A …as he was dressing in front of his mirror.
B He was riding along, singing merrily to himself, when he passed under an oak tree and…
C They were all still laughing when they arrived back at the castle.


GRAMMAR ITEM

Read the story “The Bald Knight”. Put the verbs in brackets into the Past Simple. They are all irregular.

The Bald Knight

Once upon a time, a long time ago, there was a knight who, as he ______ (grow) older, ______ (lose) all his hair. He ______ (become) as bald as an egg. He didn’t want anyone to see his bald head, so he ______ (buy) a beautiful, black, curly wig.
One day some lords and ladies from the castle invited him to go hunting with them, so of course he ______ (put) on his beautiful wig. ‘How handsome I look!’ he ______ (think) to himself. Then he ______ (set) off happily for the forest.

However, a terrible thing happened. His wig ______ (catch) on a branch and ______ (fall) off in full view of everyone. How they all laughed at him! At first the poor knight ______ (feel) very foolish, but then he ______ (see) the funny side of the situation, and he started laughing, too.
The knight never ______ (wear) his wig again.


LISTENING ITEM

Practice the pronunciation of the words. Listen to the following words and, for each one, write the number that matches the pronunciation of its –ed ending in the correct column.
1 /t/ 2 /d/ 3 /ɪd/

Arrived __________ __________ __________
Cooked __________ __________ __________
Wanted __________ __________ __________
Finished __________ __________ __________
Started __________ __________ __________
Lived __________ __________ __________
Traveled __________ __________ __________
Visited __________ __________ __________
Laughed __________ __________ __________
Danced __________ __________ __________
Listened __________ __________ __________
Invited __________ __________ __________
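
The answer key for this item follows a regular rule: the ending is pronounced as an extra syllable after a /t/ or /d/ sound, as /t/ after other voiceless sounds, and as /d/ everywhere else. Below is a minimal Python sketch of that rule. It is our own illustration, not part of the test: the function name `ed_ending` and the list of letters are hypothetical, and the function guesses from spelling, which only approximates sound (it works for the twelve words above, but a word like "weighed" would fool it). It returns the column number used in the key.

```python
# Hypothetical helper: guesses the -ed pronunciation from spelling alone.
# Spellings that usually stand for a voiceless final sound (an assumption;
# "gh" is included only because of words like "laugh").
VOICELESS_ENDINGS = ("p", "k", "f", "s", "x", "c", "sh", "ch", "gh")

def ed_ending(word):
    """Return 1 for /t/, 2 for /d/, 3 for the extra-syllable ending."""
    w = word.lower()
    if not w.endswith("ed"):
        raise ValueError("expected a regular past form ending in -ed")
    stem = w[:-2]  # "wanted" -> "want", "danced" -> "danc"
    if stem.endswith(("t", "d")):
        return 3   # extra syllable after a t or d sound
    if stem.endswith(VOICELESS_ENDINGS):
        return 1   # /t/ after a voiceless sound
    return 2       # /d/ in all other cases

words = ["Arrived", "Cooked", "Wanted", "Finished", "Started", "Lived",
         "Traveled", "Visited", "Laughed", "Danced", "Listened", "Invited"]
for word in words:
    print(word, ed_ending(word))
```

A teacher could use a list like this to prepare the key, but should still check each word by ear.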


SPEAKING ITEM

1. - Get together with a partner and make a short conversation. Tell him/her what you did last weekend and then listen to him/her. You may use such questions as:

· What did you do last Saturday?
· Did you do anything fun?

Note: Do not forget to use the past tense.
Remember that the assessment criterion will be the correct use of the past simple. Focus on the endings of the verbs, on yes/no questions, and on the sequence of events.


REFLECTION ON UNIT 4

In this unit, “Testing and assessing the four skills”, we learnt to focus on planning our assessment program. The questions that helped us organize our plan were: What level are we going to assess? When are we going to test? And, most important, what are we going to test? The answer depends on the skills that you as a teacher select and on the percentage assigned to each skill.

We noticed that the syllabus priorities are speaking, listening, reading, writing, grammar and vocabulary. We also reviewed the concepts of reliability, validity and practicality, because they are very important when deciding which techniques are the best choices for our own classes.

We reviewed some techniques for the four skills and learnt to focus on testing, not teaching. We became aware of different techniques such as multiple choice, gap-filling, summaries, picture description, etc. Each technique is chosen by the teacher for its specific characteristics of reliability, validity or practicality.

The most useful part of this unit was looking at items in books or designing them ourselves, because we could focus on the kinds of techniques that offer more objectivity, are easier to grade and are clear for the students. We always tried to build reliability, validity and practicality into our tests.

In this unit we also corrected our mistakes, looked for extra information in books and compared the best techniques for our own English classes. The three of us on this team work as English teachers in an elementary school, so this unit was very useful for us. In fact, we planned, selected and used the best techniques to test our students, based on their characteristics, such as level and age. In summary, the kind of test we selected depended on the purpose of our assessment. We are very happy to be using this important knowledge already. Thank you.

Friday, October 15, 2010

UNIT 3 TESTING


Kinds of Tests and Testing

Proficiency Tests

Proficiency testing measures a person’s command of a language, regardless of any academic background or training that person might have. A person is considered proficient on the basis of what he or she can do in that language. An example is the First Certificate in English (FCE).
What does FCE involve?
FCE has five papers:
Reading: 1 hour. You will need to be able to understand information in fiction and non-fiction books, journals, newspapers and magazines.
Writing: 1 hour 20 minutes. You will have to show you can produce two different pieces of writing such as a short story, a letter, an article, a report, a review or an essay.
Use of English: 45 minutes. Your use of English will be tested by tasks which show how well you control your grammar and vocabulary.
Listening: 40 minutes. You need to show you can understand the meaning of a range of spoken material, including news programs, speeches, stories and anecdotes and public announcements.
Speaking: 14 minutes. You will take the Speaking test with another candidate or in a group of three, and you will be tested on your ability to take part in different types of interaction: with the examiner, with the other candidates and by yourself.
Achievement Tests

Achievement testing is tied directly to language courses: it is applied to individual students, groups of students, or the courses themselves to establish how successful they have been in achieving objectives. There are two types of achievement test: the progress achievement test and the final achievement test. Progress achievement testing monitors students’ progress during a course of study, and final achievement testing measures the student’s overall achievement at the end of a course. Sample:
Here is a list of all of the skills students learn in second grade! The skills are organized into categories, and you can move your mouse over any skill name to see a sample question. To start practicing, just click on any link. IXL will track your score, and the questions will even increase in difficulty as you improve!
Geometry
· T.1 Identify planar and solid shapes
· T.2 Compare sides, vertices, edges, and faces
· T.3 Count sides, vertices, edges, and faces
· T.4 Symmetry
· T.5 Congruent
· T.6 Flip, turn, and slide
· T.7 Perimeter
· T.8 Perimeter - word problems
· T.9 Area
Fractions
· U.1 Halves, thirds, and fourths
· U.2 Identify the fraction
· U.3 Which shape illustrates the fraction?
· U.4 Parts of a group
· U.5 Word problems
· U.6 Compare fractions
· U.7 Order fractions
Probability and statistics
· V.1 More, less, and equally likely
· V.2 Certain, probable, unlikely, and impossible
· V.3 Median, mode, and range
· V.4 Interpret graphs to find median, mode, and range
Multiplication
· W.1 Multiplication sentences
· W.2 Multiplication tables up to 5
· W.3 Multiplication tables up to 10
Division
· X.1 Divisors and quotients up to 5
· X.2 Divisors and quotients up to 10

Diagnostic Test
This test lets the teacher know the student’s weak and strong points. Teachers can write specific tests that target the weaknesses of certain students and, in the process, reveal the strengths of others. An example is The Lollipop Test - A Diagnostic Test of School Readiness.
http://r3cc.ceee.gwu.edu/standards_assessments/EAC/eac0181.htm
From the CEEE and the ERIC Clearinghouse on Assessment and Evaluation
Another example is:
Diagnostic test
20 QUESTIONS, 15 MINUTES
SENTENCE COMPLETION:
1. The revolution in art has not lost its steam; it ____ on as fiercely as ever.
A. trudges B. meanders C. edges
D. ambles E. rages
2. Each occupation has its own ____ ; bankers, lawyers and computer professionals, for example, all use among themselves language which outsiders have difficulty following.
A. merits B. disadvantages C. rewards
D. jargon E. problems

3. ____ by nature, Jones spoke very little even to his own family members.
A. garrulous B. equivocal C. taciturn
D. arrogant E. gregarious
4. Many people at that time believed that spices help preserve food; however, Hall found that many marketed spices were ____ bacteria, moulds and yeasts.
A. devoid of B. teeming with C. improved by
D. destroyed by E. active against
5. If there is nothing to absorb the energy of sound waves, they travel on ____ , but their intensity ____ as they travel further from their source.
A. erratically - mitigates B. eternally – alleviates
C. forever – increases D. steadily – stabilizes
E. indefinitely - diminishes
6. The two artists differed markedly in their temperaments; Palmer was reserved and courteous, Frazer ____ and boastful.
A. phlegmatic B. choleric C. constrained
D. tractable E. stoic
7. The conclusion of his argument, while ____ , is far from ____ .
A. stimulating - interesting B. worthwhile – valueless
C. esoteric – obscure D. germane – relevant
E. abstruse - incomprehensible
http://www.scribd.com/doc/2123225/IBA-Sample-Diagnostic-Test-fake-but-similar-pattern
Placement Tests
This test is designed to show the teacher where to place students according to their level. An example is the UW English Placement Test (EPT): http://translate.google.com.mx/translate?hl=es&sl=en&u=http://testing.wisc.edu/english%2520test.html&ei=dTKcTKiOK4b4swPL2IHWAQ&sa=X&oi=translate&ct=result&resnum=1&ved=0CBsQ7gEwAA&prev=/search%3Fq%3Dexample%2Bof%2Bplacement%2Btest%26hl%3Des%26rlz%3D1W1ADSA_es
Direct vs. Indirect Testing
Direct testing requires the candidate to perform precisely the skill that we want to measure. For example, if we want to test writing, we ask the candidate to write a composition. Indirect testing measures understanding or recognition of the language rather than actual performance. An example of indirect testing is one section of the writing ability part of the TOEFL: it contains items with underlined elements, and the candidate has to identify which one is erroneous or inappropriate in formal standard English.
Discrete Point vs. Integrative Testing
Discrete point testing tests one element at a time, item by item. Integrative testing is the opposite, since the student must combine many language elements to complete a task. Example of discrete point testing:
Multiple True False
This alternative to ETS-style format is an item set such as the following:
Below are references to creatures. Mark “A” if absurd and “B” if realistic:
1. Aquatic mammal
2. Fish with a lung
3. Fern gametophyte with spores
4. Algae with no nucleus
5. Chordate without a notochord
6. Single-celled metazoa
7. Featherless, flying mammal
8. Flatworm with a skeleton
9. Amoeba with a fixed mouth
10. Warm-blooded reptile
Advantages
Students prefer them over multiple choice questions.
A large number of questions can be asked in a given time period, improving reliability.
Disadvantages
Truncated range of scores per item (50%-100%) may encourage guessing.
Preliminary research indicates that this format may be better suited to basic knowledge than to complex competence. For example, it is best at testing for examples vs. non-examples or characteristics vs. non-characteristics. http://www.teachopolis.org/library/how_to_write_multiple_choice_questions.htm
An example of integrative testing could be a French test that assesses the four skills (speaking, writing, listening and reading) together.
Norm-Referenced vs. Criterion-Referenced Testing
Norm-referenced testing compares one student’s performance with that of other students. Criterion-referenced testing classifies students according to whether they can perform certain tasks satisfactorily. An example of a norm-referenced test is the admission exam to UABC, and the Berkshire Certificate of Proficiency in German is an example of criterion-referenced testing.
Examples of norm-referenced measures: the Graduate Record Exam (GRE), the SAT and the ACT.
The GRE is taken by college students wishing to enter graduate schools.
Examples of criterion-referenced measures: performance assessments, and classroom quizzes and exams that are based on course objectives.
These are most appropriate for determining the progress of smaller numbers of students on higher-order learning tasks.
http://www.edtech.vt.edu/edtech/id/assess/purposes.html

Objective Testing vs Subjective Testing
Objective testing is when the scorer does not need to use judgement to evaluate the answers. An example is a multiple choice test. Subjective testing is the opposite, since the scorer does have to use his/her judgement to score. An example is a composition written by the student.
Another example is:
OBJECTIVE TEST
SAMPLE QUESTIONS
1. Which of the following is NOT a valid ending for a graphics file?
a. .jpg
b. .tif
c. .mid
d. .gif
e. .psd
2. With regards to e-mail addresses:
a. they must always contain an @ symbol
b. they can never contain spaces
c. they are case-INsensitive
d. [all of the above]
e. [none of the above]
3. When saving a document, a difference between 'Save' and 'Save As...' is:
a. there is no difference — they do exactly the same thing all the time
b. 'Save' will save a file that has already been assigned a name and saving location
c. 'Save As...' allows one to change or set the name and/or saving location for the file
d. [ b & c only ]
e. [ none of the above ]
Computer Adaptive Testing
This type of test is administered, evaluated and graded by computer. It offers a more efficient way of collecting information on people’s ability. At the beginning, all students are presented with the same item of average difficulty. Students who respond correctly continue with a more difficult item, and those who respond incorrectly continue with an easier item. An example of this is the Computer Adaptive Test (CAT):
http://www.kaptest.com/GMAT/Learn-and-Discuss/Everything-GMAT/the-cat.html
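
The step-up / step-down rule described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the five difficulty levels, the starting level of 3 and the function names are hypothetical, and real adaptive tests choose items with statistical models that are far more refined than this simple rule.

```python
def next_difficulty(current, answered_correctly, lowest=1, highest=5):
    """Move one level up after a correct answer, one level down otherwise."""
    step = 1 if answered_correctly else -1
    # Keep the level inside the assumed range of difficulty levels.
    return max(lowest, min(highest, current + step))

def run_adaptive_test(responses, start=3):
    """Return the difficulty level of each item presented.

    responses: list of True/False values, one per item answered.
    start: the 'average difficulty' level every student begins with.
    """
    levels = []
    level = start
    for correct in responses:
        levels.append(level)              # item shown at the current level
        level = next_difficulty(level, correct)
    return levels

# A student answers right, right, wrong, right.
print(run_adaptive_test([True, True, False, True]))  # [3, 4, 5, 4]
```

Two students therefore see different items after the first one, which is why the computer needs fewer items to locate each student’s level.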

Communicative Language Testing
It tests the ability to communicate. Example:
Examples of Communicative Test Tasks
Speaking/Listening
Information gap. An information gap activity is one in which two or more testees work together, though it is possible for a confederate of the examiner rather than a testee to take one of the parts. Each testee is given certain information but also lacks some necessary information. The task requires the testees to ask for and give information. The task should provide a context in which it is logical for the testees to be sharing information.
The following is an example of an information gap activity.
Student A
You are planning to buy a tape recorder. You don't want to spend more than about 80 pounds, but you think that a tape recorder that costs less than 50 pounds is probably not of good quality. You definitely want a tape recorder with auto reverse, and one with a radio built in would be nice. You have investigated three models of tape recorder and your friend has investigated three models. Get the information from him/her and share your information. You should start the conversation and make the final decision, but you must get his/her opinion, too.
(Information about three kinds of tape recorders)
Student B
Your friend is planning to buy a tape recorder, and each of you investigated three types of tape recorder. You think it is best to get a small, light tape recorder. Share your information with your friend, and find out about the three tape recorders that your friend investigated. Let him/her begin the conversation and make the final decision, but don't hesitate to express your opinion.
(Information about three kinds of tape recorders)
This kind of task would be evaluated using a system of band scales. The band scales would emphasize the testee's ability to give and receive information, express and elicit opinions, etc. If its intention were communicative, it would probably not emphasize pronunciation, grammatical correctness, etc., except to the extent that these might interfere with communication. The examiner should be an observer and not take part in the activity, since it is difficult to both take part in the activity and evaluate it. Also, the activity should be tape recorded, if possible, so that it could be evaluated later and it does not have to be evaluated in real time.
Role Play. In a role play, the testee is given a situation to play out with another person. The testee is given in advance information about what his/her role is, what specific functions he/she needs to carry out, etc. A role play task would be similar to the above information gap activity, except that it would not involve an information gap. Usually the examiner or a confederate takes one part of the role play.
The following is an example of a role play activity.
Student
You missed class yesterday. Go to the teacher's office and apologize for having missed the class. Ask for the handout from the class. Find out what the homework was.
Examiner
You are a teacher. A student who missed your class yesterday comes to your office. Accept her/his apology, but emphasize the importance of attending classes. You do not have any extra handouts from the class, so suggest that she/he copy one from a friend. Tell her/him what the homework was.
Again, if the intention of this test was to test communicative language, the testee would be assessed on his/her ability to carry out the functions (apologizing, requesting, asking for information, responding to a suggestion, etc.) required by the role.
http://iteslj.org/Articles/Kitao-Testing.html.



ANALYSIS OF A TEST

READER ACTIVITIES
1.-What is the purpose of the test?

The purpose of this language test is to measure grammatical knowledge through non-contextualized exercises. The test establishes how successful individual students have been in achieving objectives. It is a final achievement test because it is given at the end of a course of study, and its content must be related to the course with which it is concerned, in this case Morfosintaxis del Segundo Idioma.

2.-Does it represent direct or indirect testing (or a mixture of both)?

This test is direct because it requires the candidate to perform accurately the grammar skills covered in the five sections (subject, predicate, direct object, indirect object, predicate words, etc.).

3. - Are the items discrete point or integrative (or a mixture of both)?

This test is discrete point because it tests one element at a time, item by item. The test consists of a series of items, each testing a particular grammatical structure: Part I assesses grammatical categories; Part II assesses the singular and plural forms of the subjects of sentences; Part III assesses types of conjunctions; Part IV assesses infinitive, participial and gerund phrases; and Part V assesses building relative clauses.

4. - Which items are objective, and which are subjective? Can you order the subjective items according to degree of subjectivity?

Activities I through V are objective: they have only one correct answer and do not require the teacher’s judgment. The last part (VI) is subjective; it could be considered a way for the teacher to receive feedback from the students, but it was not part of the test.
5. - Is the test norm-referenced or criterion-referenced?
The test is criterion-referenced because the idea is to see how much each student knew about the topic. The teacher needs to know what the students can actually do in the language.
6. - Does the test measure communicative abilities? Would you describe it as a communicative test? Justify your answers.

No, this test does not measure communicative abilities. It deals only with grammar; the student does not need to speak or take part in any act of communication to pass it.

7. - What relationship is there between the answers to question 6 and the answers to the other questions?
For us, the relationship is that the answers to questions 1 to 5 justify the answer to question 6 (that the test is not a communicative language test). The purpose of the test is to assess students at the end of the course (final achievement test) and to measure grammatical knowledge through non-contextualized exercises. The test is criterion-referenced and has discrete-point items; consequently, it consists of objective exercises plus an additional subjective feedback comment for the teacher.
LANGUAGE TEST



VALIDITY, RELIABILITY, PRACTICALITY, AND BACKWASH EFFECTS

A test is valid when its items accurately measure what they are intended to measure. In addition, the inferences drawn from the assessment must be meaningful, useful and appropriate to the purpose of the assessment. Validity has subtypes such as content validity and criterion-related validity. Content validity concerns the content of the test (skills, structures, etc.); which structures are included will depend on the purpose of the test and the level of the students. Criterion-related validity is the degree to which results on the test agree with those of some other, independent assessment. There are two types: concurrent validity and predictive validity. For concurrent validity, the test and the criterion are administered at about the same time, and their agreement is expressed as a correlation coefficient, a mathematical measure of similarity. A coefficient of 1 is obtained when there is total agreement between the two sets of scores; a coefficient of zero is obtained when there is no agreement at all. Predictive validity is the degree to which a test can predict candidates’ future performance. Validity in scoring depends on scoring only the construct being tested (reading, spelling, grammar, etc.), which matters especially when scoring short answers.
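
To make the correlation coefficient concrete, here is a minimal Python sketch that computes it (the Pearson coefficient) for two sets of scores from the same students. The scores and the names `new_test` and `established_test` are invented for illustration; they are not data from our classes.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two sets of scores vary together, and how each varies alone.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: fails with a division by zero if all scores in a list are equal.
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores of five students on our test and on an established one.
new_test = [12, 15, 9, 18, 14]
established_test = [11, 16, 10, 17, 15]
print(round(pearson(new_test, established_test), 2))
```

The closer the printed value is to 1, the more the two tests rank the students in the same way, which is the evidence concurrent validity looks for.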

Reliability is the consistency of the test’s measurement. To estimate it, the conditions, the method of measurement and the students should all be the same. We can use the test-retest method or the split-half method to obtain the reliability of a test. The reliability coefficient allows us to compare the reliability of different tests. The ideal value is 1, which means that the students would get exactly the same results each time they took the test. In summary, there are two components of test reliability: the consistency of students’ performance from occasion to occasion, and the reliability of the scoring.
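
The split-half method can be sketched in Python as follows. We assume (our choice, for illustration) that the test is split into odd-numbered and even-numbered items, that each item is scored 0 or 1, and that the correlation between the two halves is adjusted with the Spearman-Brown formula, a standard correction for the fact that each half is only half as long as the full test. The scores are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """item_scores: one list per student, with one 0/1 score per item."""
    # Total of items 1, 3, 5, ... and of items 2, 4, 6, ... for each student.
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd_totals, even_totals)
    # Spearman-Brown correction: estimate for the full-length test.
    return 2 * r_half / (1 + r_half)

# Hypothetical results: five students, six items each.
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
print(round(split_half_reliability(scores), 2))
```

The advantage of this method over test-retest is that the students only have to take the test once.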

Practicality means that a test is suitable, useful and efficient. The test should be easy to administer and have an accurate and appropriate scoring procedure; it should also not be too short, too long or too expensive.

Backwash is the impact that tests have on students, teachers, educational systems and even society. Several variables contribute to backwash, and there are some suggestions for making it positive. For example, the teacher can test the abilities whose development he or she wants to encourage (for example, listening or oral abilities). The teacher has to ensure that the test is known and understood by the students. Other suggestions are to give students feedback during and after the test, to give them specific comments, and to give them their grades throughout the course.

REFLECTION

This unit was very interesting because we learned about the different types of exams; we analyzed some examples of each and discovered their real use in context. We also became aware of the differences and similarities between them. We learned to analyze a school language test, which confirmed in practice what the theory says about the characteristics of tests. We made a few small mistakes in the test analysis, but we noticed and corrected them; we had to review the information and make some changes. It is very important to know how each kind of test works and when, where and how we as teachers need to use it. We think this unit is one of the most important for our teaching, because testing will be part of our job as teachers; we are always going to assess students with formal and informal assessments and in different contexts. We felt very happy to be a little bit closer to real teaching practice. We want to say thank you, teacher Rocio, for your support; it made us feel comfortable.