Intraobserver and interobserver reliability assessment
of tibial plateau fracture classification systems

Anıl Taşkesen; İsmail Demirkale; Mustafa Caner Okkaoğlu; Mahmut Özdemir; Mustafa Gökhan Bilgili; Murat Altay

doi:10.5606/ehc.2017.56816

Anıl Taşkesen¹, İsmail Demirkale¹, Mustafa Caner Okkaoğlu¹, Mahmut Özdemir¹, Mustafa Gökhan Bilgili², Murat Altay¹

¹Department of Orthopedics and Traumatology, University of Health Sciences, Keçiören Training and Research Hospital, Ankara, Turkey
²Department of Orthopedics and Traumatology, University of Health Sciences, Bakırköy Dr. Sadi Konuk Training and Research Hospital, Istanbul, Turkey

Keywords: Classification; interobserver variation; reliability; tibial plateau fracture

Abstract

Objectives: This study aims to assess the intra- and interobserver reliability of commonly used tibial plateau fracture classification systems.
Patients and methods: This retrospective cohort study included computed tomography (CT) images and plain radiographs (anteroposterior and lateral X-rays) of 60 patients (40 males, 20 females; mean age 45.9 years; range, 18 to 80 years) who presented to two orthopaedic clinics with unilateral tibial plateau fractures between January 2011 and January 2015. All plain X-rays (XR) and CT images were evaluated by four observers on two separate occasions, 1.5 months apart. All fractures were classified according to the Arbeitsgemeinschaft für Osteosynthesefragen-Orthopaedic Trauma Association (AO/OTA), Schatzker, Hohl and Moore, Luo and revised Duparc systems. Intraobserver reliability was measured with Cohen’s kappa (κ) coefficient and interobserver reliability with Fleiss’ kappa coefficient.
Results: With the Schatzker classification, interobserver reliability was moderate for XR (κ=0.51) and substantial for CT (κ=0.61). With the AO/OTA classification, interobserver reliability was moderate for both imaging methods (κXR=0.43 and κCT=0.54). With the Hohl and Moore classification, interobserver reliability was also moderate for both imaging methods (κXR=0.45 and κCT=0.51). The revised Duparc classification showed the lowest interobserver reliability, ranging from fair to moderate (κXR=0.27-0.55 and κCT=0.44-0.61). Interobserver reliability for the Luo classification was κCT=0.47. Intraobserver reliability for CT with the Luo classification was substantial for observers 1, 2 and 3 (κCT=0.67-0.71) and almost perfect for observer 4 (κCT=0.84). Intraobserver reliability was substantial for the Schatzker classification and moderate for the other classifications.
Conclusion: Among the classification systems compared in this study, the Schatzker classification was the most reliable, particularly when CT was used. In contrast, the revised Duparc classification showed the worst reliability, owing to its complexity and different morphological subtypes.
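
For readers unfamiliar with the two statistics named in the methods, the following is a minimal sketch of how Cohen's kappa (two rating sessions by one observer) and Fleiss' kappa (several observers rating the same cases) can be computed. It is illustrative only and is not the authors' analysis code. The function names, the toy ratings, and the verbal bands are assumptions: the abstract does not name its interpretation scale, and the bands below follow the commonly used Landis and Koch thresholds, which appear consistent with the labels reported above.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two sets of ratings of the same cases."""
    if not first or len(first) != len(second):
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    # Observed proportion of cases rated identically in both sessions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from each session's category frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[cat] * c2[cat] for cat in c1) / (n * n)
    if expected == 1.0:  # both sessions used a single identical category
        return 1.0
    return (observed - expected) / (1.0 - expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; `ratings` holds one list of labels per case,
    with the same number of observers for every case."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    totals = Counter()
    agreement_sum = 0.0
    for case in ratings:
        counts = Counter(case)
        totals.update(counts)
        # Proportion of agreeing observer pairs for this case.
        pairs = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += pairs / (n_raters * (n_raters - 1))
    p_bar = agreement_sum / n_cases
    # Chance agreement from the pooled category proportions.
    p_e = sum((t / (n_cases * n_raters)) ** 2 for t in totals.values())
    if p_e == 1.0:  # all observers used one category for every case
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


def landis_koch(kappa):
    """Verbal label for a kappa value (assumed Landis and Koch bands)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Toy data (hypothetical, not from the study): one observer, two sessions.
session_1 = ["II", "II", "VI", "VI"]
session_2 = ["II", "II", "VI", "II"]
print(cohen_kappa(session_1, session_2))  # 0.5
print(landis_koch(0.61))                  # substantial
```

In this layout, intraobserver reliability would be obtained by calling `cohen_kappa` once per observer on that observer's two reading sessions, and interobserver reliability by calling `fleiss_kappa` on the four observers' ratings from a single session.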
