Details
Original language | English
---|---
Title of host publication | Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021
Place of Publication | New York, NY, USA
Publisher | Association for Computing Machinery (ACM)
Pages | 1-7
ISBN (electronic) | 9781450380959
Publication status | Published - 8 May 2021
Event | CHI Conference on Human Factors in Computing Systems: Making Waves, Combining Strengths (CHI EA 2021) - Virtual, Online, Japan. Duration: 8 May 2021 → 13 May 2021
Publication series
Name | Conference on Human Factors in Computing Systems - Proceedings
---|---
Abstract
People with visual impairments face challenges in scene and object recognition, especially in unknown environments. We combined the mobile scene detection framework Apple ARKit with MobileNet-v2 and 3D spatial audio to provide an auditory scene description to people with visual impairments. The combination of ARKit and MobileNet allows keeping recognized objects in the scene even if the user turns away from the object. An object can thus serve as an auditory landmark. With a search function, the system can even guide the user to a particular item. The system also provides spatial audio warnings for nearby objects and walls to avoid collisions. We evaluated the implemented app in a preliminary user study. The results show that users can find items without visual feedback using the proposed application. The study also reveals that the range of local object detection through MobileNet-v2 was insufficient, which we aim to overcome using more accurate object detection frameworks in future work (YOLOv5x).
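As a hypothetical illustration (not the authors' implementation), the core of a spatial-audio cue as described above is deriving the object's direction relative to the user's heading from tracked world positions, so the sound can be panned left or right. A minimal sketch, assuming positions on a horizontal plane and a heading measured in degrees from the +z axis:

```python
import math

def azimuth_to_object(user_pos, user_heading_deg, obj_pos):
    """Horizontal angle (degrees, -180..180) from the user's facing
    direction to the object; negative means the object is to the left."""
    dx = obj_pos[0] - user_pos[0]
    dz = obj_pos[1] - user_pos[1]
    # Absolute bearing of the object, with 0 deg along the +z axis.
    bearing = math.degrees(math.atan2(dx, dz))
    # Wrap the relative angle into -180..180.
    return (bearing - user_heading_deg + 180.0) % 360.0 - 180.0

# Object directly ahead of a user facing +z:
print(azimuth_to_object((0, 0), 0.0, (0, 5)))   # 0.0
# Object to the user's right:
print(azimuth_to_object((0, 0), 0.0, (5, 0)))   # 90.0
```

A spatial-audio engine would map this angle to stereo panning or a 3D audio source position; the function names and coordinate conventions here are assumptions for illustration only.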
Keywords
- Accessibility, Auditory Scene Description, Collision Warnings, Mobile Scene Recognition, Object Detection, Spatial Audio, Text-to-Speech, Visually Impaired
ASJC Scopus subject areas
- Computer Science (all)
- Human-Computer Interaction
- Computer Graphics and Computer-Aided Design
- Software
Cite this
- Standard
- Harvard
- APA
- Vancouver
- BibTeX
- RIS
Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021. New York, NY, USA: Association for Computing Machinery (ACM), 2021. p. 1-7 394 (Conference on Human Factors in Computing Systems - Proceedings).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Mobile Recognition and Tracking of Objects in the Environment through Augmented Reality and 3D Audio Cues for People with Visual Impairments
AU - Kaul, Oliver Beren
AU - Behrens, Kersten
AU - Rohs, Michael
PY - 2021/5/8
Y1 - 2021/5/8
N2 - People with visual impairments face challenges in scene and object recognition, especially in unknown environments. We combined the mobile scene detection framework Apple ARKit with MobileNet-v2 and 3D spatial audio to provide an auditory scene description to people with visual impairments. The combination of ARKit and MobileNet allows keeping recognized objects in the scene even if the user turns away from the object. An object can thus serve as an auditory landmark. With a search function, the system can even guide the user to a particular item. The system also provides spatial audio warnings for nearby objects and walls to avoid collisions. We evaluated the implemented app in a preliminary user study. The results show that users can find items without visual feedback using the proposed application. The study also reveals that the range of local object detection through MobileNet-v2 was insufficient, which we aim to overcome using more accurate object detection frameworks in future work (YOLOv5x).
AB - People with visual impairments face challenges in scene and object recognition, especially in unknown environments. We combined the mobile scene detection framework Apple ARKit with MobileNet-v2 and 3D spatial audio to provide an auditory scene description to people with visual impairments. The combination of ARKit and MobileNet allows keeping recognized objects in the scene even if the user turns away from the object. An object can thus serve as an auditory landmark. With a search function, the system can even guide the user to a particular item. The system also provides spatial audio warnings for nearby objects and walls to avoid collisions. We evaluated the implemented app in a preliminary user study. The results show that users can find items without visual feedback using the proposed application. The study also reveals that the range of local object detection through MobileNet-v2 was insufficient, which we aim to overcome using more accurate object detection frameworks in future work (YOLOv5x).
KW - Accessibility
KW - Auditory Scene Description
KW - Collision Warnings
KW - Mobile Scene Recognition
KW - Object Detection
KW - Spatial Audio
KW - Text-to-Speech
KW - Visually Impaired
UR - http://www.scopus.com/inward/record.url?scp=85105779312&partnerID=8YFLogxK
U2 - 10.1145/3411763.3451611
DO - 10.1145/3411763.3451611
M3 - Conference contribution
AN - SCOPUS:85105779312
T3 - Conference on Human Factors in Computing Systems - Proceedings
SP - 1
EP - 7
BT - Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021
PB - Association for Computing Machinery (ACM)
CY - New York, NY, USA
T2 - CHI Conference on Human Factors in Computing Systems: Making Waves, Combining Strengths (CHI EA 2021)
Y2 - 8 May 2021 through 13 May 2021
ER -