
Inter-rater agreement of observable and elicitable neurological signs

Mark Thaller and Thomas Hughes
DOI: https://doi.org/10.7861/clinmedicine.14-3-264
Clin Med June 2014;14(3):264–267
Mark Thaller
Glangwili Hospital, Carmarthen, UK
Roles: foundation year 1 doctor
For correspondence: mark.thaller@doctors.org.uk
Thomas Hughes
Department of Neurology, University Hospital of Wales, Cardiff, UK
Roles: consultant neurologist

Abstract

This paper reports on a study that aimed to assess the inter-rater agreement of observable neurological signs in the upper and lower limbs (eg inspection, gait, cerebellar tests and coordination) and elicitable signs (eg tone, strength, reflexes and sensation). Thirty patients were examined by two neurology doctors, at least one of whom was a consultant. The doctors’ findings were recorded on a standardised pro forma. Inter-rater agreement was assessed using the kappa (κ) statistic, which is chance corrected. There was significantly better agreement between the two doctors for observable than for elicitable signs (mean ± standard deviation [SD] κ, 0.70 ± 0.17 vs 0.41 ± 0.22, p = 0.002). Almost perfect agreement was seen for cerebellar signs and inspection (a combination of speed of movement, muscle bulk, wasting and tremor); substantial agreement for strength, gait and coordination; moderate agreement for tone and reflexes; and only fair agreement for sensation. The inter-rater agreement is therefore better for observable neurological signs than for elicitable signs, which may be explained by the additional skill and cooperation required to elicit rather than just observe clinical signs. These findings have implications for clinical practice, particularly in telemedicine, and highlight the need for standardisation of the neurological examination.

KEYWORDS 
  • Elicitable signs
  • inter-rater reliability
  • neurological examination
  • observable signs
  • telemedicine

Introduction

A neurological consultation comprises a verifiable history, a reliable examination, appropriate investigations and their subsequent interpretation. When a specialist opinion is sought using telemedicine, the remote clinician relies on another doctor's neurological examination. Some neurological signs have to be elicited by the examining physician, eg tone, strength and sensory deficits, but other valuable signs can be seen and heard by both the remote and the examining physician, eg walking, speed of finger movements and maintaining the outstretched arm in a particular posture.

The inter-rater reliability of the National Institutes of Health Stroke Scale (NIHSS; Table 1), which grades motor function in five categories – no drift (0), drift before 10 seconds (1), falls before 10 seconds (2), no effort against gravity (3) and no movement (4) – has been investigated before, as has that of the traditional neurological examination (Table 2). However, these studies did not analyse their data according to whether the clinical signs were elicitable or could simply be observed from the end of the bed.

Table 1.

Inter-rater reliability of NIHSS.

Table 2.

Inter-rater reliability of components of the traditional neurological examination.

Telemedicine has been used to provide an out-of-hours stroke thrombolysis service to hospitals in south-east Wales since April 2012. We therefore investigated the inter-rater agreement of some elicitable and observable neurological signs in the upper and lower limbs to inform an assessment of their utility in the clinical examination performed using telemedicine.

Methods

Thirty patients (mean ± standard deviation [SD] age 55 ± 15 years) recruited over a 4-week period in a routine neurology outpatient clinic gave written consent to be examined by a consultant and, in the same clinic session, one other neurology doctor (foundation year 2, core medical trainee, specialty registrar or consultant). The second examiner, blinded to the findings of the first, repeated the examination of the upper and lower limbs. Examiners were asked to record their findings immediately on a standardised pro forma (Table 3) by selecting from binary options (eg present/absent for clonus) and categorical options (eg absent, depressed, normal or brisk for reflexes and Medical Research Council grades 0–5 for strength). Clinicians did not undertake any special training or instruction in clinical examination as part of this study and were asked to examine patients in accordance with their usual clinical practice, with appropriate equipment provided.

Table 3.

Pro forma for examination of the limbs (replicated for each side).

Inter-rater agreement was assessed using the κ statistic, which makes no assumptions about which doctor is correct – only whether they agree. The κ benchmarks used in this paper were those of Landis and Koch: <0 represents poor agreement, 0–0.20 slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement and 0.81–1.00 almost perfect agreement.15 A difference in agreement was considered significant if the 95% confidence intervals for the κ values did not overlap. The significance of differences between grouped data was assessed by applying the t-test to mean κ values. The analysis was performed using Microsoft Excel 2007 spreadsheet software.
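The chance-corrected κ calculation and the Landis and Koch bands can be illustrated with a short Python sketch (the study's own analysis used Excel; the reflex gradings below are hypothetical, not study data):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    n = len(rater_a)
    # Observed agreement: proportion of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    return (p_o - p_e) / (1 - p_e)

def landis_koch(kappa):
    """Landis and Koch (1977) descriptive benchmark for a kappa value."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label

# Hypothetical reflex gradings from two examiners across ten patients.
a = ["normal", "brisk", "normal", "absent", "normal",
     "normal", "brisk", "normal", "normal", "depressed"]
b = ["normal", "brisk", "normal", "normal", "normal",
     "brisk", "brisk", "normal", "normal", "depressed"]
k = cohens_kappa(a, b)
print(round(k, 2), landis_koch(k))  # → 0.65 substantial
```

Because κ subtracts the agreement expected by chance, two raters who agree on 8 of 10 gradings here score 0.65 rather than 0.80.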

The study was part of a medical student placement and was approved by the North Wales Research Ethics (Central & East) Proportionate Review Sub-Committee (11-WA-0311) and the Cardiff and Vale Research and Development Department (11/CMC/5212).

Results

The results are summarised in Fig 1 and Table 4. The inter-rater reliability for observable signs was better than for elicitable signs (mean ± SD κ value 0.70 ± 0.17 vs 0.41 ± 0.22, p = 0.002). We considered whether the difference between observable and elicitable signs was a consequence of the variable number of available options – for example, reflexes could be normal, brisk, reduced or absent but speed of movement could only be normal or slow. We therefore recalculated the inter-rater agreement for all data using a binary grouping – for example, reflexes could be abnormal (brisk, reduced or absent) or normal and strength could be abnormal (any grading ≤4) or normal (grade 5). The difference in the inter-rater agreement between observable and elicitable signs was still significant (mean ± SD κ value 0.76 ± 0.09 vs 0.46 ± 0.21, p = 0.014).
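The binary regrouping described above can be sketched as follows (a self-contained Python illustration using hypothetical MRC strength grades, not the study data; collapsing the categories changes κ because both the observed and the chance agreement change):

```python
from collections import Counter

def kappa(a, b):
    """Cohen's kappa for two equal-length lists of ratings."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[c] * cb[c] for c in ca) / n**2  # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Hypothetical MRC strength grades (0-5) from two examiners.
grades_a = [5, 5, 4, 5, 3, 5, 5, 2, 5, 5]
grades_b = [5, 5, 4, 5, 4, 5, 5, 2, 5, 4]

# Full categorical agreement vs the binary grouping used in the paper:
# normal = grade 5, abnormal = any grade <= 4.
k_full = kappa(grades_a, grades_b)
k_binary = kappa([g == 5 for g in grades_a], [g == 5 for g in grades_b])
print(round(k_full, 2), round(k_binary, 2))  # → 0.63 0.78
```

In this hypothetical example, near-miss disagreements (grade 5 vs 4) vanish under the binary grouping, so κ rises; the study's point is that the observable-vs-elicitable gap persisted even after this recoding.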

Fig 1.

Agreement of neurological signs. Note: κ results for chorea, fasciculation and pronator drift are zero, because they were not observed by any of the clinicians.

Table 4.

Main components of the neurological examination of the limbs with combined data for each aspect.

Discussion

Signs that have to be elicited involve skill on the part of the examiner, the cooperation of the patient and then interpretation – for example, to test tone, the patient must be relaxed and comfortable and the examining doctor must have an understanding of the actions required to elicit the clinical features of spasticity and rigidity. Informal observation of the techniques used by different doctors in this study suggested marked variations in technique and interpretation, which may explain the poor inter-rater agreement. By comparison, it is more straightforward to observe patients at rest or when performing actions such as walking or tapping the finger and thumb to assess speed of movement, which may explain the better agreement seen for these observable signs. Miller and Johnston16 found foot tapping to be both more reliable (κ = 0.73 vs 0.30) and more accurate (sensitivity 86% vs 35%; specificity 84% vs 77%) for detecting upper motor neurone weakness than the Babinski (plantar reflex) test.

The previous literature (Tables 1 and 2) shows wide variation in agreement on elicitable signs, with κ values ranging from 0.29 to 1.00 (mean 0.65) for strength and from 0.15 to 1.00 (mean 0.46) for sensation. This variation in the reliability of the peripheral neurological examination, together with the results of this study, highlights that relying on another doctor's assessment may affect diagnosis and management.

One of the concerns of clinicians providing opinions about patients they are not able to examine in person is that their clinical examination is impoverished by the lack of direct patient contact. However, this study suggests that those signs that require elicitation have poorer inter-rater reliability than ‘end-of-the-bed’ signs, which can be observed by both the attending physician and the remote physician using telemedicine equipment. The importance of being a good noticer17 is as relevant today as it ever was, and rather than compromising clinical skills, the technology of telemedicine may demand of clinicians a review of the parts of the clinical examination that are most reliable.

Conclusion

Observable neurological signs have significantly better inter-rater agreement than elicitable signs. These findings have implications for clinical practice, including telemedicine.

Acknowledgements

We would like to thank clinical colleagues in the department of neurology for their help and support. This work was first presented at the video conference All Wales Stroke Meeting (AWSM).

  • © 2014 Royal College of Physicians

References

  1. Anderson ER, Smith B, Ido M, Frankel M. Remote assessment of stroke using iPhone 4. J Stroke Cerebrovasc Dis 2013;22:340–4. doi:10.1016/j.jstrokecerebrovasdis.2011.09.013
  2. Demaerschalk BM, Vegunta S, Vargas BB, et al. Reliability of real-time video smartphone for assessing National Institutes of Health Stroke Scale scores in acute stroke patients. Stroke 2012;43:3271–7. doi:10.1161/STROKEAHA.112.669150
  3. Gonzalez MA, Hanna N, Rodrigo ME, et al. Reliability of prehospital real-time cellular video phone in assessing the simplified National Institutes of Health Stroke Scale in patients with acute stroke: a novel telemedicine technology. Stroke 2011;42:1522–7. doi:10.1161/STROKEAHA.110.600296
  4. Meyer BC, Lyden PD, Al-Khoury L, et al. Prospective reliability of the STRokE DOC wireless/site independent telemedicine system. Neurology 2005;64:1058–60. doi:10.1212/01.WNL.0000154601.26653.E7
  5. Handschu R, Littmann R, Reulbach U, et al. Telemedicine in emergency evaluation of acute stroke: interrater agreement in remote video examination with a novel multimedia system. Stroke 2003;34:2842–6. doi:10.1161/01.STR.0000102043.70312.E9
  6. Meyer BC, Hemmen TM, Jackson CM, Lyden PD. Modified National Institutes of Health Stroke Scale for use in stroke clinical trials: prospective reliability and validity. Stroke 2002;33:1261–6. doi:10.1161/01.STR.0000015625.87603.A7
  7. Shafqat S, Kvedar JC, Guanci MM, et al. Role for telemedicine in acute stroke: feasibility and reliability of remote administration of the NIH stroke scale. Stroke 1999;30:2141–5. doi:10.1161/01.STR.30.10.2141
  8. Brott T, Adams HP Jr, Olinger CP, et al. Measurements of acute cerebral infarction: a clinical examination scale. Stroke 1989;20:864–70. doi:10.1161/01.STR.20.7.864
  9. Goldstein L, Bertels C, Davis J. Interrater reliability of the NIH stroke scale. Arch Neurol 1989;46:660–2. doi:10.1001/archneur.1989.00520420080026
  10. Carswell C, Rañopa M, Pal S, et al. Video rating in neurodegenerative disease clinical trials: the experience of PRION-1. Dement Geriatr Cogn Dis Extra 2012;2:286–97. doi:10.1159/000339730
  11. Hand P, Haisma JA, Kwan J, et al. Interobserver agreement for the bedside clinical assessment of suspected stroke. Stroke 2006;37:776–80. doi:10.1161/01.STR.0000204042.41695.a1
  12. Jepsen J, Laursen LH, Hagert CG, et al. Diagnostic accuracy of the neurological upper limb examination I: inter-rater reproducibility of selected findings and patterns. BMC Neurol 2006;6:8. doi:10.1186/1471-2377-6-8
  13. Lindley RI, Warlow CP, Wardlaw JM, et al. Interobserver reliability of a clinical classification of acute cerebral infarction. Stroke 1993;24:1801–4. doi:10.1161/01.STR.24.12.1801
  14. Shinar D, Gross CR, Mohr JP, et al. Interobserver variability in the assessment of neurologic history and examination in the Stroke Data Bank. Arch Neurol 1985;42:557–65. doi:10.1001/archneur.1985.04060060059010
  15. Landis J, Koch G. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74. doi:10.2307/2529310
  16. Miller T, Johnston SC. Should the Babinski sign be part of the routine neurologic examination? Neurology 2005;65:1165–8. doi:10.1212/01.wnl.0000180608.76190.10
  17. Asher R. Clinical sense. BMJ 1960;1:985–93. doi:10.1136/bmj.1.5178.985