Test Reliability: Responses to Two Classmates’ Discussion Posts - 1 Paragraph Each (2 Total)

Read a selection of your colleagues’ postings.

Respond by Day 6 to at least two of your colleagues’ postings in one or more of the following ways:

  • Ask a probing question.
  • Share an insight from having read your colleague’s posting.
  • Offer and support an opinion.
  • Validate an idea with your own experience.
  • Make a suggestion.
  • Expand on your colleague’s posting.
  • APA format

Classmate 1:

“Test-retest reliability is a very simple way of measuring test reliability over time. In this method, a person is given a test, then takes the same test again after a varying length of time. In an employment setting, this can be helpful in measuring the effectiveness of training programs. The first test establishes a baseline of the individual’s knowledge of a certain subject. After training or further education, the same test is taken again and improvements (if any) can be measured. This method requires only one test to be created, which saves both money and time. However, test-retest reliability has fairly significant difficulties. Its most notable problem is that it is highly susceptible to practice or carryover effects (Anastasi & Urbina, 1997). Memory of the test questions and/or the ability to recall test questions and answers on the second test can also skew true results. The testing interval can also add to the error rate of the test. If the interval between tests is too long, the results may reflect factors other than the variable being measured (Anastasi & Urbina, 1997). For instance, if a retest is given several weeks after training has concluded, we cannot be sure that information acquired outside the training program did not influence the test results.

Split-half reliability is another way to measure the reliability of a testing instrument. In this method, one test is split into two equivalent halves and the scores on the two halves are correlated. A major benefit of this method is the savings in time and money from using one test instead of two. However, using only one test is also a noted downside of this method. Creating two equivalent halves of one test can be challenging. Simply splitting the test in half (e.g., questions 1-25 as one half and questions 26-50 as the other) fails to take into consideration outside influences, such as fatigue or boredom (Anastasi & Urbina, 1997). A more useful way of splitting the test is to assign all even-numbered questions to one half and all odd-numbered questions to the other. This reduces the effect of outside influences and creates two nearly equivalent halves (Anastasi & Urbina, 1997).

Coefficient alpha, also known as Cronbach’s alpha, is another method of estimating test reliability that uses a single test. However, unlike split-half reliability, which focuses on consistency in content sampling, this method deals specifically with internal reliability (Anastasi & Urbina, 1997). Cronbach’s alpha expresses the consistency between test items. The higher the alpha, the more consistent the test items. For instance, if you were testing the level of satisfaction in the workplace, you would want to use questions that relate to that specific topic. The more closely the questions are related, the higher the alpha would be. If you added unrelated questions, the alpha would be lowered. Understanding this coefficient is extremely helpful in test creation. If you find that your test has an alpha near 1.0, you would know that your test questions are redundant and you could reduce the number of questions asked (Cortina, 1993). Conversely, if your alpha is found to be closer to zero (or even negative!), you may word your questions differently or add more questions. As stated above, increasing the length of the test can increase reliability.”
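The odd-even split and coefficient alpha described in Classmate 1’s post can be illustrated with a minimal Python sketch. The function names and the sample response matrix below are hypothetical and not taken from the post or its sources. Alpha is computed with the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items.

```python
from statistics import variance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x ** 0.5 * ss_y ** 0.5)


def odd_even_split_half(scores):
    """Correlate each respondent's odd-item total with their even-item total.

    `scores` is a list of rows, one row of item scores per respondent.
    """
    odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    return pearson(odd_totals, even_totals)


def cronbach_alpha(scores):
    """Coefficient alpha from item variances and total-score variance."""
    k = len(scores[0])                                   # number of items
    item_variances = [variance(col) for col in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 4 satisfaction items (1-5 scale).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]

print("odd-even split-half r:", odd_even_split_half(responses))
print("Cronbach's alpha:", cronbach_alpha(responses))
```

With data like this, replacing one column with answers unrelated to the others should lower alpha, which matches the post’s point about unrelated questions.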

Classmate 2:

“Psychologists consider three types of consistency: over time (test-retest reliability), across items (internal consistency), and across different researchers (inter-rater reliability) (Urbina, Kaufman, & Kaufman, 2014). Three effective methods to assess the reliability of employment tests are test-retest, split-half, and alternate forms. Test-retest indicates the repeatability of test scores with the passage of time. The scores from the two administrations are correlated, and if the reliability coefficient is positive and high, the test is reliable. The disadvantage of the test-retest method is that obtaining results is time-consuming. This method is commonly used with tests of an applicant’s cognitive abilities. According to an article titled ‘Is This Test Valid? : A Guide to for Determining the Validity of a Pre-employment Test’, cognitive ability testing is the best indicator of future performance (Anonymous, 1992). This type of pre-employment testing could be very valuable in predicting future success on the job in spite of the disadvantage of being time-consuming.

The split-half method assesses the internal consistency of a test, such as a psychometric test or questionnaire; it measures the extent to which all parts of the test contribute equally to what is being measured. The split-half method is a quick and effortless way to establish reliability. The disadvantage of the split-half method is that it is only effective with large questionnaires in which every question measures the same construct. This means it would not be appropriate for tests that measure different constructs. Reliability can be reduced by differences between tests, test administrators, testing environments, and other factors. The split-half method increases the reliability of the test (Green et al., 2016).

Alternate forms reliability indicates how consistent test scores are likely to be if a person takes two or more forms of a test. Scores on the two forms should show a high positive reliability coefficient (correlation). An example would be a test to certify or license professional engineers (PE). If the applicant for PE credentialing is knowledgeable in the content area, the applicant should be able to earn passing scores on alternate versions of the qualifying test. According to Bischoff (2002), careful design is necessary to ensure alternate forms of a test measure what they were intended to measure.”

