An important part of assessment selection involves obtaining and understanding technical documentation, which may be available on the assessment publisher’s website or by contacting the publisher directly. Technical documentation will help determine the quality of the assessment and how well it meets its intended purpose. For example, it should describe the test’s purpose and target audience, the skills and knowledge being measured, how assessment items were constructed, evidence that the test was evaluated against common psychometric standards, and the appropriate uses and limitations of results.
Availability of technical documentation

  Is supporting documentation available from the assessment publisher?

  Are the assessment’s purpose and content clearly identified?

  Are external independent reviews of the assessment available (e.g., subject matter expert reviews, psychometric analysis)?

  Are assessment limitations and benefits described?

Validity (i.e., content, construct, and criterion-related validity) refers to how well the assessment measures what it is intended to measure, including skills and knowledge, the underlying theoretical construct, and the relationship between test scores and external measures of success (e.g., job readiness or college success). In other words, how trustworthy are assessment scores in representing student proficiency in the skills and knowledge being assessed? Validity information helps the user understand how results can be interpreted and for what purposes. It is often reported as a correlation coefficient (ranging from .00 to 1.00). The higher the validity coefficient, the stronger the relationship between assessment scores and the performance the assessment is intended to reflect.

  What is the assessment's validity coefficient?

  Is the validity value appropriate for your intended purpose and use?
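
As an illustration of how a criterion-related validity coefficient might be computed, the sketch below correlates assessment scores with an external measure of success. It is a minimal example, not a publisher's method: the function name pearson_r, the variable names, and all of the data are hypothetical and invented for this illustration. It assumes the coefficient in question is a Pearson correlation, which is one common choice.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data, invented for illustration only:
# one assessment score and one external outcome per student.
assessment_scores = [52, 61, 70, 75, 88]
first_year_gpa = [2.1, 2.4, 3.0, 2.9, 3.7]

validity_coefficient = pearson_r(assessment_scores, first_year_gpa)
print(f"criterion-related validity coefficient: {validity_coefficient:.2f}")
```

A correlation can in principle be negative, so a value outside the reported .00 to 1.00 range would suggest that scores and the external measure move in opposite directions. A sample this small is used only to keep the example short; a published coefficient should rest on a much larger group.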

An assessment with high reliability will produce consistent results each time it is given. Simply put, a reliability value indicates the degree of confidence a user should have in assessment results. Reliability is typically reported as a correlation coefficient (ranging from .00 to 1.00). The higher the value of the reliability coefficient, the more consistent the scores. Several types of reliability coefficients can be calculated (for example, test-retest and internal-consistency coefficients), and each provides different information about the scores, so it is important to determine not just whether an assessment is reliable, but also what reliability means for that particular assessment.

  What is the assessment's reliability coefficient?

  Does the assessment produce consistent results over time, each time it is given?

Fairness means that an assessment is free from bias, ensuring that test-takers are able to demonstrate their degree of proficiency without the interference of unrelated external factors that may affect their performance. A fair assessment is not biased toward or against a particular population, nor does it employ regional or other stereotypes. Fairness factors include age, culture, socioeconomic status, race, and gender, as well as whether the assessment attempts to measure skills that students were not taught. Fairness is a significant concern in assessing employability skills because cultural and social biases can be difficult to eliminate, for example, when assessing interpersonal skills. Users should consider how fair the test is and whether it is appropriate for the target population and aligned with the skills and knowledge intended for assessment.

  What evidence exists to support the fairness of the assessment?

  Is the test fair and appropriate for your target population?