- A test that is relatively free of measurement error is considered _______
- Reliable
- Conceptualization of Error
- Measurement Error:
- There will ALWAYS be error in measurement
- Goal: design tests relatively free of error
- The observed score consists of the true score plus measurement error (O = T + E); see the simulation sketch at the end of this section
- "Rubber Yardstick" comparison
- A carpenter will never get the same measurement twice with a rubber yardstick
- Systematic error is biased: it consistently pushes scores in one direction, unlike random error
- What can increase measurement error?
- How the test was created, and situational factors
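As a rough illustration of O = T + E (not from the lecture itself), here is a tiny Python sketch: a made-up true score of 100 plus random error with mean zero. The specific numbers and the normal-error assumption are just for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

true_score = 100                       # T: the test-taker's "real" standing (made up)
error = rng.normal(0, 5, size=1000)    # E: random measurement error, mean 0
observed = true_score + error          # O = T + E

print(observed.mean())   # close to 100: random errors cancel out on average
print(observed.std())    # about 5: the "rubber yardstick" wobble
```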
- What is Reliability
- Methods to test for Reliability
- Test-retest reliability
- Test someone now, then test the same person again later
- Will the same person respond to the same test in the same way? (see the sketch below)
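A minimal sketch of what test-retest reliability looks like in practice, using hypothetical scores for six people tested twice: it is simply the correlation between the two administrations.

```python
import numpy as np

# Hypothetical scores for the same six people, tested at two points in time.
time1 = np.array([88, 92, 75, 81, 95, 70])
time2 = np.array([90, 89, 78, 80, 96, 72])

# Test-retest reliability: the correlation between the two administrations.
r_test_retest = np.corrcoef(time1, time2)[0, 1]
print(round(r_test_retest, 2))   # close to 1.0 means people kept their rank order
```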
- Alternate-form/parallel-forms reliability
- Example
- Give a vocabulary test, then give another vocabulary test with the content slightly altered
- Different versions of the same test
- Split-half reliability
- Take the scores on the first half of the test and compare them to the scores on the second half of the test
- Tests often get harder as they go on, so a common fix is to split the test into odd- and even-numbered questions instead of first half vs. second half (see the sketch below)
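A sketch of an odd/even split-half estimate with made-up half-test totals. The Spearman-Brown step-up is the standard correction for the fact that each half is only half as long as the full test.

```python
import numpy as np

# Made-up totals on the odd- and even-numbered items for six test-takers.
odd_half  = np.array([10, 14, 9, 16, 12, 18])
even_half = np.array([11, 13, 9, 15, 13, 17])

# Correlate the two halves...
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# ...then step it up with the Spearman-Brown correction, because each half
# is only half as long as the real test.
r_full = (2 * r_half) / (1 + r_half)
print(round(r_half, 2), round(r_full, 2))
```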
- Inter-item consistency: Cronbach's Alpha
- What does it mean if your Cronbach's alpha is .95?
- It means the items are all telling you the same thing, so you probably don't need so many of them; some items are redundant
- The more items you have, the more reliable your test will be...but at what cost?
- So if your Cronbach's alpha is .95, you could say the test is almost too reliable: the extra items aren't adding much new information. But if it is below .70, it is too low (see the sketch below)
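A sketch of Cronbach's alpha computed directly from its definition: k/(k-1) times one minus the ratio of summed item variances to total-score variance. The five-item, six-person data set is invented and deliberately redundant, so alpha comes out very high.

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = people, columns = items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented 5-item scale answered by six people (1-5 ratings), nearly redundant items.
scale = [[4, 5, 4, 4, 5],
         [2, 2, 3, 2, 2],
         [5, 5, 5, 4, 5],
         [3, 3, 2, 3, 3],
         [4, 4, 4, 5, 4],
         [1, 2, 1, 1, 2]]
print(round(cronbach_alpha(scale), 2))   # about .97: the items overlap heavily
```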
- Kappa coefficient
- Similar in spirit to Cronbach's alpha, but it also takes chance agreement into account
- Inter-rater reliability
- Multiple raters score the same test-takers so you can check that the ratings agree and the information is accurate (see the kappa sketch below)
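A sketch of Cohen's kappa for two hypothetical raters making yes/no judgments on the same ten cases: raw percent agreement, minus the agreement you would expect by chance, scaled by the maximum possible improvement over chance. The ratings are invented.

```python
import numpy as np

# Hypothetical yes(1)/no(0) ratings from two raters on the same ten cases.
rater_a = np.array([1, 1, 0, 1, 0, 1, 1, 0, 0, 1])
rater_b = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 1])

p_observed = np.mean(rater_a == rater_b)   # raw proportion of agreement

# Chance agreement: both happen to say "yes" plus both happen to say "no".
p_chance = (rater_a.mean() * rater_b.mean()
            + (1 - rater_a.mean()) * (1 - rater_b.mean()))

kappa = (p_observed - p_chance) / (1 - p_chance)
print(round(p_observed, 2), round(kappa, 2))   # 0.8 raw agreement, kappa about .58
```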
- Conceptual Definitions of Reliability
- The degree to which test-takers' scores reflect "true" abilities
- Domain Sampling Model
- Domain: extremely large collection of items
- The larger the sample of items, the more accurately it measures the domain (see the Spearman-Brown sketch below)
- Might help to think of a test item as a person in a study
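The Spearman-Brown prophecy formula makes the "more items, more reliability" point concrete: it projects what reliability becomes when a test is lengthened by a factor n. A minimal sketch, starting from an assumed reliability of .70.

```python
# Spearman-Brown prophecy formula: projected reliability when a test is
# lengthened by a factor n (n = 2 means doubling the number of items).
def spearman_brown(r, n):
    return (n * r) / (1 + (n - 1) * r)

r_current = 0.70   # assumed starting reliability
for n in (1, 2, 3, 4):
    print(n, round(spearman_brown(r_current, n), 2))
# Reliability climbs (.70 -> .82 -> .88 -> .90) with diminishing returns,
# which is the "but at what cost?" trade-off from the notes.
```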
- Classical Test Score Theory
- Because we assume error is random we also make the assumption that the distribution of error is the same for everyone
- A wide variance in the error distribution = lots of measurement error
- Less variance = less error
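One standard way to quantify how wide the error distribution is in classical test theory is the standard error of measurement, SEM = SD x sqrt(1 - reliability). A minimal sketch with made-up numbers (an IQ-style SD of 15 and a reliability of .90):

```python
import math

sd_test     = 15     # made-up test standard deviation (IQ-style scale)
reliability = 0.90   # made-up reliability coefficient

# Standard error of measurement: the standard deviation of the error
# distribution around a person's true score.
sem = sd_test * math.sqrt(1 - reliability)
print(round(sem, 2))   # about 4.74: higher reliability -> narrower error distribution
```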
- Test Construction
- Test Administration
- On some tests it's just as interesting to know how the person arrived at their score, and what influenced it, as to know the score itself
- Test Environment
- different environments affect scores
- Test-taker variables
- What if the test-taker didn't eat breakfast?
- Examiner-related variables
- Perhaps the test-taker is being defiant so the examiner gives her an ultimatum to either take the test or he will call the police. Will this affect the test scores? How so?
- What if you are having a bad day or if you are biased in some way? We need to be aware of our biases because we all have them.
- Neuropsychologists give simple tests to check for malingering. For example: asking someone with brain damage to put their fingers together and then pull them apart. Anyone can do this, but someone who is malingering will pretend not to be able to.
Monday, February 6, 2012
304: Test Reliability