Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments

Publication: Contribution to book/report/anthology/conference proceedings › Paper in conference proceedings › Research › Peer-reviewed

Authors

  • Oliver Beren Kaul
  • Kersten Behrens
  • Michael Rohs

Details

Original language: English
Title of host publication: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021
Place of publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 1-7
ISBN (electronic): 9781450380959
Publication status: Published - 8 May 2021
Event: CHI Conference on Human Factors in Computing Systems: Making Waves, Combining Strengths (CHI EA 2021) - Virtual, Online, Japan
Duration: 8 May 2021 - 13 May 2021

Publication series

Name: Conference on Human Factors in Computing Systems - Proceedings

Abstract

People with visual impairments face challenges in scene and object recognition, especially in unknown environments. We combined the mobile scene detection framework Apple ARKit with MobileNet-v2 and 3D spatial audio to provide an auditory scene description to people with visual impairments. The combination of ARKit and MobileNet allows keeping recognized objects in the scene even if the user turns away from the object. An object can thus serve as an auditory landmark. With a search function, the system can even guide the user to a particular item. The system also provides spatial audio warnings for nearby objects and walls to avoid collisions. We evaluated the implemented app in a preliminary user study. The results show that users can find items without visual feedback using the proposed application. The study also reveals that the range of local object detection through MobileNet-v2 was insufficient, which we aim to overcome using more accurate object detection frameworks in future work (YOLOv5x).
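The paper's implementation is not included in this record; as an illustration only, the core of the auditory-landmark idea (a tracked object's world position driving a directional audio cue relative to the user's current heading) can be sketched as follows. All names here (`azimuth_to_object`, `stereo_gains`) are hypothetical, and a real ARKit implementation would feed the object's anchor position into a 3D audio engine's spatializer rather than computing a manual stereo pan.

```python
import math

def azimuth_to_object(user_pos, user_heading_rad, obj_pos):
    """Angle (radians, in [-pi, pi]) from the user's facing direction
    to the object: 0 = straight ahead, positive = to the user's right."""
    dx = obj_pos[0] - user_pos[0]
    dz = obj_pos[1] - user_pos[1]
    world_angle = math.atan2(dx, dz)                 # bearing in world coordinates
    rel = world_angle - user_heading_rad
    return math.atan2(math.sin(rel), math.cos(rel))  # wrap into [-pi, pi]

def stereo_gains(azimuth_rad):
    """Constant-power pan: map an azimuth to (left, right) channel gains
    so a landmark cue appears to come from the object's direction."""
    # Clamp to the frontal hemisphere for a simple two-channel pan.
    a = max(-math.pi / 2, min(math.pi / 2, azimuth_rad))
    pan = a / (math.pi / 2)                          # -1 (full left) .. +1 (full right)
    left = math.cos((pan + 1) * math.pi / 4)
    right = math.sin((pan + 1) * math.pi / 4)
    return left, right
```

An object straight ahead yields equal left/right gains; as the user turns away, the gains shift so the cue stays anchored at the object's position, which is what lets a recognized object act as an auditory landmark even when it leaves the camera view.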

ASJC Scopus subject areas

Cite

Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments. / Kaul, Oliver Beren; Behrens, Kersten; Rohs, Michael.
Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021. New York, NY, USA: Association for Computing Machinery (ACM), 2021. pp. 1-7, Article 394 (Conference on Human Factors in Computing Systems - Proceedings).

Kaul, OB, Behrens, K & Rohs, M 2021, Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments. in Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021., 394, Conference on Human Factors in Computing Systems - Proceedings, Association for Computing Machinery (ACM), New York, NY, USA, pp. 1-7, CHI Conference on Human Factors in Computing Systems: Making Waves, Combining Strengths (CHI EA 2021), Virtual, Online, Japan, 8 May 2021. https://doi.org/10.1145/3411763.3451611
Kaul, O. B., Behrens, K., & Rohs, M. (2021). Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021 (pp. 1-7). Article 394 (Conference on Human Factors in Computing Systems - Proceedings). Association for Computing Machinery (ACM). https://doi.org/10.1145/3411763.3451611
Kaul OB, Behrens K, Rohs M. Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021. New York, NY, USA: Association for Computing Machinery (ACM). 2021. pp. 1-7. Article 394. (Conference on Human Factors in Computing Systems - Proceedings). doi: 10.1145/3411763.3451611
Kaul, Oliver Beren ; Behrens, Kersten ; Rohs, Michael. / Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments. Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021. New York, NY, USA : Association for Computing Machinery (ACM), 2021. pp. 1-7 (Conference on Human Factors in Computing Systems - Proceedings).
@inproceedings{40e4e0a96de542cc80a88c5de98672b0,
title = "Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments",
abstract = "People with visual impairments face challenges in scene and object recognition, especially in unknown environments. We combined the mobile scene detection framework Apple ARKit with MobileNet-v2 and 3D spatial audio to provide an auditory scene description to people with visual impairments. The combination of ARKit and MobileNet allows keeping recognized objects in the scene even if the user turns away from the object. An object can thus serve as an auditory landmark. With a search function, the system can even guide the user to a particular item. The system also provides spatial audio warnings for nearby objects and walls to avoid collisions. We evaluated the implemented app in a preliminary user study. The results show that users can find items without visual feedback using the proposed application. The study also reveals that the range of local object detection through MobileNet-v2 was insufficient, which we aim to overcome using more accurate object detection frameworks in future work (YOLOv5x).",
keywords = "Accessibility, Auditory Scene Description, Collision Warnings, Mobile Scene Recognition, Object Detection, Spatial Audio, Text-to-Speech, Visually Impaired",
author = "Kaul, {Oliver Beren} and Kersten Behrens and Michael Rohs",
year = "2021",
month = may,
day = "8",
doi = "10.1145/3411763.3451611",
language = "English",
series = "Conference on Human Factors in Computing Systems - Proceedings",
publisher = "Association for Computing Machinery (ACM)",
pages = "1--7",
booktitle = "Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021",
address = "United States",
note = "CHI Conference on Human Factors in Computing Systems: Making Waves, Combining Strengths (CHI EA 2021), CHI EA 2021 ; Conference date: 08-05-2021 Through 13-05-2021",

}


TY - GEN

T1 - Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments

AU - Kaul, Oliver Beren

AU - Behrens, Kersten

AU - Rohs, Michael

PY - 2021/5/8

Y1 - 2021/5/8

N2 - People with visual impairments face challenges in scene and object recognition, especially in unknown environments. We combined the mobile scene detection framework Apple ARKit with MobileNet-v2 and 3D spatial audio to provide an auditory scene description to people with visual impairments. The combination of ARKit and MobileNet allows keeping recognized objects in the scene even if the user turns away from the object. An object can thus serve as an auditory landmark. With a search function, the system can even guide the user to a particular item. The system also provides spatial audio warnings for nearby objects and walls to avoid collisions. We evaluated the implemented app in a preliminary user study. The results show that users can find items without visual feedback using the proposed application. The study also reveals that the range of local object detection through MobileNet-v2 was insufficient, which we aim to overcome using more accurate object detection frameworks in future work (YOLOv5x).

AB - People with visual impairments face challenges in scene and object recognition, especially in unknown environments. We combined the mobile scene detection framework Apple ARKit with MobileNet-v2 and 3D spatial audio to provide an auditory scene description to people with visual impairments. The combination of ARKit and MobileNet allows keeping recognized objects in the scene even if the user turns away from the object. An object can thus serve as an auditory landmark. With a search function, the system can even guide the user to a particular item. The system also provides spatial audio warnings for nearby objects and walls to avoid collisions. We evaluated the implemented app in a preliminary user study. The results show that users can find items without visual feedback using the proposed application. The study also reveals that the range of local object detection through MobileNet-v2 was insufficient, which we aim to overcome using more accurate object detection frameworks in future work (YOLOv5x).

KW - Accessibility

KW - Auditory Scene Description

KW - Collision Warnings

KW - Mobile Scene Recognition

KW - Object Detection

KW - Spatial Audio

KW - Text-to-Speech

KW - Visually Impaired

UR - http://www.scopus.com/inward/record.url?scp=85105779312&partnerID=8YFLogxK

U2 - 10.1145/3411763.3451611

DO - 10.1145/3411763.3451611

M3 - Conference contribution

AN - SCOPUS:85105779312

T3 - Conference on Human Factors in Computing Systems - Proceedings

SP - 1

EP - 7

BT - Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021

PB - Association for Computing Machinery (ACM)

CY - New York, NY, USA

T2 - CHI Conference on Human Factors in Computing Systems: Making Waves, Combining Strengths (CHI EA 2021)

Y2 - 8 May 2021 through 13 May 2021

ER -