Diagnostic accuracy of artificial intelligence dermatology apps compared to clinical evaluation in Indian patients with common skin conditions

Authors

  • Pavithra Reddy Police, Department of Dermatology, Sathya Hospital and Sathya Diagnostics, Hyderabad, Telangana, India
  • Samyuktha Danda, Department of Medicine, Sathya Hospital and Sathya Diagnostics, Hyderabad, Telangana, India
  • Satya Revanth Karri, Department of Psychiatry, Apollo Institute of Medical Sciences and Research, Hyderabad, Telangana, India

DOI:

https://doi.org/10.18203/issn.2455-4529.IntJResDermatol20252064

Keywords:

Artificial intelligence, Dermatology apps, Diagnostic accuracy, Indian skin types, Skin tone bias

Abstract

Background: Artificial intelligence (AI)-based dermatology mobile applications are increasingly used for preliminary diagnosis of skin conditions. However, their diagnostic accuracy in populations with darker skin tones, such as those in India, remains poorly studied. This research aimed to assess and compare the accuracy of three AI-based dermatology apps against dermatologist consensus diagnoses in Indian patients presenting with common dermatoses.

Methods: A prospective, cross-sectional study was conducted among 32 patients attending a dermatology outpatient clinic in Hyderabad, India. Each patient was diagnosed clinically by board-certified dermatologists and then evaluated with three AI apps: Aysa, Skinner by Arboreal, and AI Dermatologist Skin Scanner. Each app's diagnostic output was compared with the dermatologist consensus. Accuracy was calculated for the top diagnosis and for the top three diagnoses, and Cohen’s kappa was used to assess agreement.
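
The metrics named in the Methods can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the five toy cases, and the diagnosis labels are hypothetical and chosen only to show how top-1 accuracy, top-3 accuracy, and Cohen's kappa are typically computed from a reference diagnosis and a ranked list of app suggestions.

```python
from collections import Counter

def top_k_accuracy(reference, predictions, k):
    """Fraction of cases whose reference label appears among the first k predictions."""
    hits = sum(ref in preds[:k] for ref, preds in zip(reference, predictions))
    return hits / len(reference)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one nominal label per case."""
    n = len(rater_a)
    # Observed agreement: share of cases where both raters give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal frequency, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical toy data (NOT from the study): dermatologist consensus per case,
# and each app's ranked top-three suggestions for the same case.
reference = ["acne", "tinea", "psoriasis", "vitiligo", "acne"]
predictions = [
    ["acne", "rosacea", "folliculitis"],
    ["eczema", "tinea", "psoriasis"],
    ["eczema", "tinea", "lichen planus"],
    ["vitiligo", "pityriasis alba", "tinea versicolor"],
    ["acne", "folliculitis", "rosacea"],
]

top1 = top_k_accuracy(reference, predictions, k=1)
top3 = top_k_accuracy(reference, predictions, k=3)
# Agreement is assessed between the consensus and each app's first suggestion.
kappa = cohens_kappa(reference, [preds[0] for preds in predictions])
print(f"top-1 = {top1:.2f}, top-3 = {top3:.2f}, kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually lower than top-1 accuracy. On the commonly used Landis and Koch scale, values from 0.21 to 0.40 are read as fair agreement and values from 0.41 to 0.60 as moderate agreement.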

Results: Aysa showed the highest top diagnosis accuracy (59.4%), followed by Skinner (53.1%) and AI Dermatologist (46.9%). Diagnostic agreement varied by condition, with higher accuracy observed for acne vulgaris and tinea corporis. Performance was poorest for pigmentary and inflammatory disorders such as vitiligo and psoriasis. Diagnostic accuracy declined with increasing Fitzpatrick skin type, indicating skin tone bias in the AI models. Cohen’s kappa indicated moderate agreement for Aysa (κ=0.43) and fair agreement for the other two apps.
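
The reported percentages can be related to patient counts with a short calculation. The abstract does not state raw counts or confidence intervals, so the sketch below assumes that all 32 patients were scored for every app, and it uses the Wilson score interval as one common choice for small samples. The helper name `wilson_interval` and the choice of a 95% level are assumptions made for illustration, not part of the study.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

n_patients = 32  # sample size stated in the Methods
reported_top1 = {"Aysa": 59.4, "Skinner": 53.1, "AI Dermatologist": 46.9}

implied_counts = {}
for app, pct in reported_top1.items():
    # Assumption: every app was scored on all 32 patients.
    correct = round(pct / 100 * n_patients)
    implied_counts[app] = correct
    low, high = wilson_interval(correct, n_patients)
    print(f"{app}: {correct}/{n_patients} correct "
          f"({100 * correct / n_patients:.1f}%), "
          f"approx. 95% interval {100 * low:.1f}% to {100 * high:.1f}%")
```

Under that assumption the percentages correspond to 19, 17, and 15 correct top diagnoses out of 32. With a sample this small the intervals are wide and overlap, so the ranking of the three apps is best read as tentative.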

Conclusions: While AI dermatology apps show moderate accuracy for certain conditions, they remain inconsistent and biased toward lighter skin tones. These tools may serve as preliminary screening aids but cannot substitute clinical judgment. Enhanced training on diverse skin tones is necessary for equitable AI deployment.

Published

2025-06-27

How to Cite

Police, P. R., Danda, S., & Karri, S. R. (2025). Diagnostic accuracy of artificial intelligence dermatology apps compared to clinical evaluation in Indian patients with common skin conditions. International Journal of Research in Dermatology, 11(4), 284–290. https://doi.org/10.18203/issn.2455-4529.IntJResDermatol20252064

Section

Original Research Articles