What is the purpose of validation of assessment?

The assessment validation process checks that an RTO’s assessment system consistently produces valid assessment judgements: that learners are assessed against all of the tasks identified in the unit of competency, and that the evidence collected matches what is outlined in the associated assessment requirements. Validation usually takes place after assessment is complete, so the training organisation can consider the validity of both its assessment practices and its judgements.

Assessment validation is critically important for RTOs, but many get it wrong and are surprised when auditors come in and find a mountain of issues.

What are the assessment validation responsibilities of an RTO?

RTOs need to check that their assessment tools have produced valid, reliable, sufficient, current and authentic evidence. By reviewing a statistically valid sample of completed assessments, the RTO can make reasonable validation judgements and then recommend improvements to its assessment tools, processes and outcomes.

What does a validation team within an RTO need to consider?

As a part of validation, your RTO must have a documented plan which describes:

1. Who will lead and participate in the validation activities – this can be one person or a team, drawn from within or outside the RTO, collectively holding:

  • vocational competencies and current industry skills relevant to the assessment being validated
  • current knowledge and skills in vocational teaching and learning, and
  • the TAE40110 Certificate IV in Training and Assessment (or its successor) or the TAESS00001 Assessor Skills Set (or its successor)

2. Which training products will be the focus of the validation

3. When assessment validation will occur

  • An RTO’s validation schedule is a five-year plan: each training product needs to be validated at least once in that five-year period, and at least half of the training products need to be validated within the first three years of the schedule.

4. How the outcomes of those activities will be documented and acted upon 

  • make sure you have a template to guide staff in what they need to look for when validating assessments and to record the outcomes
  • RTOs also need to ensure they are validating a statistically valid sample: one that is large enough for the validation outcomes to be generalised to the entire set of judgements, and that is drawn at random from the set of assessment judgements being considered. Find ASQA’s validation sample size calculator here.
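
To make the sampling step concrete, here is a minimal Python sketch of drawing a random sample of assessment judgements. The sample-size calculation (Cochran’s formula with a finite population correction, assuming a 95% confidence level and 5% margin of error) is an illustrative assumption only; rely on ASQA’s calculator for the figures you actually use.

```python
import math
import random

def sample_size(population, z=1.96, margin_of_error=0.05, proportion=0.5):
    """Estimate a sample size using Cochran's formula with a finite
    population correction. Illustrative assumption only -- defer to
    ASQA's validation sample size calculator for actual figures."""
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n0 / (1 + (n0 - 1) / population))

def draw_validation_sample(judgement_ids, seed=None):
    """Randomly select assessment judgements from the sample pool."""
    k = sample_size(len(judgement_ids))
    return random.Random(seed).sample(judgement_ids, k)

# Example: a pool of 300 assessment judgements for one training product
pool = [f"JUDGEMENT-{i:03d}" for i in range(1, 301)]
sample = draw_validation_sample(pool, seed=42)
print(len(sample))  # about 169 with the assumed parameters
```

Recording the seed or the drawn identifiers also makes it easier to show later how the sample was taken.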

The validation team also needs to focus on whether the principles of assessment (fairness, flexibility, reliability and validity) are adhered to. 

RTOs must have a records management process to retain evidence of the validation. RTOs should retain evidence of:

  • the person/people leading and participating in the validation activities (including their qualifications, skills and knowledge)
  • the sample pool
  • the validation tools used
  • all assessment samples considered, and
  • the validation outcomes.
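
As a rough illustration of how that evidence might be captured in a records management system, the sketch below defines a simple record structure covering the items listed above. The field names and example values are assumptions, not a prescribed format.

```python
from dataclasses import dataclass, field

@dataclass
class ValidationRecord:
    """Illustrative record of one validation activity; the field names
    are assumptions, not a mandated format."""
    training_product: str        # unit or qualification being validated
    validators: list             # who led/participated, with their qualifications and skills
    sample_pool_size: int        # size of the set of judgements the sample was drawn from
    assessment_samples: list     # identifiers of every assessment sample considered
    validation_tools: list       # templates or checklists used
    outcomes: str                # findings and recommended improvements
    actions_taken: list = field(default_factory=list)  # how the outcomes were acted upon

record = ValidationRecord(
    training_product="Example unit code",
    validators=["J. Citizen - TAE40116, current industry skills"],
    sample_pool_size=300,
    assessment_samples=["JUDGEMENT-004", "JUDGEMENT-117"],
    validation_tools=["Validation checklist v2"],
    outcomes="Tools meet the principles of assessment; minor wording updates recommended",
)
print(record.training_product)
```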

RTOs with a student management system integrated with a learning management system will find the scheduling and management of the validation process, and retention of evidence, easier and more streamlined.

How to systematically monitor assessment practice

RTOs should develop a planned and ongoing process to systematically evaluate and improve assessment. 

According to the Department of Training and Workforce Development WA[1], systematic approaches that support improvement include:

  • planning where data will be collected from, how it will be collected, the form it will take, how often it will be collected, and how it will be collated, analysed and used
  • ensuring that data collection and analysis confirm good practice and show where improvements need to be made
  • making improvements where analysis demonstrates that they are needed
  • regularly reviewing data collection to assess its usefulness for improving products and services
  • giving feedback to those who have contributed to the data.
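
If it helps to see those planning decisions written down, this small sketch shows one way a documented data-collection plan could be structured; the sources, frequencies and responsibilities are assumptions for a hypothetical RTO, not requirements.

```python
# Illustrative data-collection plan; every value here is an assumption.
data_collection_plan = [
    {
        "source": "completed assessment judgements",
        "method": "random sample drawn per the validation schedule",
        "form": "marked assessment instruments and evidence",
        "frequency": "at each scheduled validation of a training product",
        "analysis_and_use": "validation team review; improvements logged for action",
        "feedback_to": "trainers and assessors who contributed the samples",
    },
    {
        "source": "learner and employer feedback",
        "method": "online survey",
        "form": "ratings plus free-text comments",
        "frequency": "each study period",
        "analysis_and_use": "collated and trended by the compliance manager",
        "feedback_to": "respondents, via a summary of improvements made",
    },
]

for item in data_collection_plan:
    print(f"{item['source']} -> collected {item['frequency']}")
```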

Risk indicators

An RTO might choose to validate its training products more frequently than its schedule requires, for example where risk indicators warrant it. Indicators of risk might include:

  • the use of new assessment tools
  • delivery of training products where safety is a concern
  • the level of experience of the assessor, or
  • changes in technology, workplace processes, legislation, and licensing requirements.
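
As a sketch of how risk indicators might translate into a shorter review cycle, the function below maps indicators to an illustrative validation interval. The indicator names, thresholds and intervals are all assumptions; each RTO sets its own risk ratings within its five-year schedule.

```python
def suggested_validation_interval(risk_indicators):
    """Return a suggested validation interval in years. The indicator
    names and intervals are illustrative assumptions, not ASQA rules."""
    high_risk = {
        "new_assessment_tool",
        "safety_critical_delivery",
        "inexperienced_assessor",
        "recent_change_in_technology_or_legislation",
    }
    hits = len(high_risk & set(risk_indicators))
    if hits >= 2:
        return 1   # validate annually
    if hits == 1:
        return 2   # validate every two years
    return 5       # otherwise, at least once within the five-year schedule

print(suggested_validation_interval({"new_assessment_tool", "safety_critical_delivery"}))  # 1
```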

Implementing recommendations

If the validation outcomes recommend improvements to the assessment tool, these recommendations should be implemented across all training products, not only those included in the sample. If changes are made to the assessment tool, quality checks need to be completed and the revised tool should be reviewed prior to implementation.

What happens when positive changes from the validation process are implemented?

Your learners will want to leave glowing reviews about their student experience. This is because learners will be clear about what is expected of them, they will better engage with the assessment because they can see how it relates to the world of work, and learner outcomes will be maximised[1].   

For more detail on assessment validation, read ASQA’s information here.

Want more info? Check out this great video on tips for validation from Lauren Hollows.

References

1. Department of Training and Workforce Development WA.

If your RTO is ready to take your business success to the next level – and elevate assessments, compliance and the student experience – learn how aXcelerate's One System solution can help you here.

Don't forget to subscribe to VET:eXpress for more important VET insights and news.
