
Visual and tactile perception techniques for braille recognition

Abstract

For visually impaired people, written communication often relies on braille, a system that depends on both vision and touch. This study developed visual and tactile perception techniques for braille character recognition. In the visual perception approach, braille characters were recognized using a deep learning model (Faster R-CNN–FPN–ResNet-50) trained on a custom-made braille dataset constructed through data augmentation and preprocessing. The model achieved an mAP50 of 94.8 and an mAP75 of 70.4 on the generated dataset. In the tactile perception approach, braille characters were recognized using a flexible capacitive pressure sensor array. The sensor size and density were designed according to braille standards, and single sensors with a size of 1.5 mm × 1.5 mm were fabricated into a 5 × 5 array using a printing technique. The sensitivity was further improved by incorporating a pressure-sensitive microdome-structured array layer. Finally, braille character recognition was visualized as a video-based heatmap. These results can serve as a cornerstone for assistive technologies for the visually impaired based on the fusion of visual and tactile sensing.

Introduction

There is growing interest in sensor technologies that replicate human-level sensory modalities such as vision, touch, and smell [1,2,3]. These technologies are used in assistive devices and systems for people with impairments [4,5,6]. For visually impaired people, written communication often relies on braille, a system that depends on both vision and touch. As the degree of impairment progresses, visual function gradually diminishes, and individuals become increasingly reliant on their sense of touch. According to a 2019 report by the World Health Organization (WHO), there are 238 million visually impaired people, of whom 39 million are blind and therefore rely solely on their sense of touch. Unfortunately, these individuals face difficulties in learning braille because current braille education and rehabilitation technologies are insufficient both domestically and internationally [7]. Therefore, assistive technologies that utilize both visual and tactile senses are essential for the education and rehabilitation of visually impaired individuals.

Braille consists of six small dots, each 1.5 mm in diameter and separated by 1 mm, and both visual and tactile perception techniques have been introduced to recognize braille characters. On the visual side, deep-learning-based vision techniques have enabled object detection, segmentation, and classification thanks to advances in graphics processing units (GPUs) for parallel computing and the availability of high-quality datasets [8,9,10,11]. However, vision-based braille recognition remains susceptible to external environmental factors, such as variations in light intensity, distance, and viewing angle, because braille characters are small. Therefore, a new braille dataset must be collected and preprocessed before a deep learning model can recognize braille reliably. As an alternative unaffected by these constraints, tactile techniques show clear advantages for braille perception. Through tactile perception, braille structures can be detected using microfabricated pressure sensors and sensor arrays [12,13,14,15,16]. To recognize braille patterns through tactile feedback, the sensors must be fabricated on a millimeter scale, formed into high-density arrays, and exhibit sufficiently high sensitivity to detect the small braille structures. However, existing tactile sensors and arrays remain insufficient in size and density for braille recognition [17, 18], and the combination of visual and tactile techniques has not yet been demonstrated for the education and rehabilitation of visually impaired individuals.

As a preliminary investigation into the fusion of visual and tactile perception, this study devised visual and tactile approaches for recognizing braille. The visual perception technique uses a deep learning model to recognize braille in RGB images (Fig. 1a), whereas the tactile perception technique uses a flexible capacitive pressure sensor array (Fig. 1b). Finally, we discuss the advantages and disadvantages of the visual and tactile perception approaches toward realizing a human-level visuotactile fusion technique.

Fig. 1

Vision- and tactile-based braille recognition. a Visual perception based on deep learning technique and b tactile perception based on pressure sensor technique

Results and discussion

Visual perception based on deep learning

To enable visual perception, a transfer learning approach based on a deep learning model was adopted. Transfer learning reuses a pretrained deep learning model and retrains only the output layers, which improves computational efficiency [19]. It is effective for visual perception because it offers fast processing and strong performance even with relatively small datasets. In this study, the transfer learning model, namely the Faster region-based convolutional neural network (Faster R-CNN) with a feature pyramid network (FPN)–ResNet-50 backbone, was used for object detection [20]. The Faster R-CNN model is fast and accurate because it combines the region proposal and detection steps, identifying and classifying object locations within a single network. The FPN–ResNet-50 backbone extracts features from images and constitutes a critical component of the object detection model [21]. In the Faster R-CNN–FPN–ResNet-50 pipeline shown in Fig. 2a, images were processed using the FPN to derive feature maps. These feature maps were used by the region proposal network (RPN) to identify candidate object locations. Region of interest (RoI) pooling was performed on the proposed locations, and the results passed through a fully connected layer. Finally, predictions were generated using a classifier for object detection and bounding box (B-box) regression for object localization.
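The paper does not disclose its training code; as a hedged illustration, the sketch below shows how such a transfer-learning setup is commonly configured with torchvision. The library choice, the class count (26 letters plus 10 special symbols plus background), and the decision to freeze the backbone are assumptions, not details reported by the authors.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Assumed class count: background + 26 alphabet characters + 10 special symbols
NUM_CLASSES = 1 + 26 + 10

# Load a Faster R-CNN with an FPN-ResNet-50 backbone pretrained on COCO
# (requires torchvision >= 0.13 for the `weights` argument).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box predictor head so it can be fine-tuned on braille classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# In a transfer-learning setup, the backbone can optionally be frozen so that
# only the detection heads are trained, speeding up convergence on a small dataset.
for param in model.backbone.parameters():
    param.requires_grad = False
```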

Fig. 2

Model architecture and braille for creating custom-made dataset with constraints. a Faster R-CNN–FPN–ResNet-50 model architecture. b Position of the braille for the alphabet “a.” c Distance of the camera from braille for all alphabets

In this study, a new braille dataset was constructed to facilitate accurate braille recognition under varying environmental factors such as position and distance, because prior research on braille and open-source braille datasets is limited [22, 23]. The new dataset was divided into two categories: online and custom-made. In the online category, 210 digital braille images covering all 26 alphabetic characters from "a" to "z" were collected. In addition, 210 real braille images, including the 26 alphabetic characters and 10 special symbols, were captured with an RGB camera under three controlled conditions: the number of braille characters, the position of the braille, and the height of the camera (Fig. 2b, c). Using data augmentation techniques provided by the Albumentations library [24], these images were expanded to 520 images to construct the custom-made braille dataset. The Albumentations library provides various augmentation algorithms, such as image flipping, random resized cropping, and random gamma adjustment, which randomly varies the brightness tone of the braille images.
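A minimal sketch of such an augmentation pipeline is shown below, assuming the Albumentations 1.x API; the image size, probabilities, and parameter ranges are illustrative assumptions rather than values reported in the paper.

```python
import albumentations as A

# Augmentation pipeline in the spirit of the text: flip, random resized crop,
# and random gamma, composed with bounding-box bookkeeping so the braille
# annotations stay consistent with the transformed images.
transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomResizedCrop(height=512, width=512, scale=(0.8, 1.0), p=0.5),
        A.RandomGamma(gamma_limit=(80, 120), p=0.5),  # randomly adjusts brightness tone
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)

# Usage (image as a NumPy array, bboxes as [x_min, y_min, x_max, y_max] lists):
# augmented = transform(image=image, bboxes=bboxes, labels=labels)
```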

As mentioned earlier, braille consists of small dots, each with a diameter of 1.5 mm. When recognizing braille from an RGB image, all objects except the braille dots are considered background. Therefore, preprocessing is necessary to accentuate the braille dots while suppressing background noise. Figure 3a illustrates the preprocessing algorithms used in this study, which involve several steps: contrast-limited adaptive histogram equalization (CLAHE), binarization, median filtering, erosion, and dilation. CLAHE divides an image into multiple tiles and equalizes each tile based on its local pixel values, yielding natural equalization. Subsequently, binarization with an adaptive threshold emphasizes the dots, while dilation and erosion weaken the background. Finally, residual noise is removed with a median filter. Figure 3b compares images before and after preprocessing, showing the enhanced braille dots. The online and custom-made datasets, with data augmentation, were preprocessed and trained using the transfer learning technique. As a result, mAP50 and mAP75, the mean average precision (mAP) over all classes at intersection-over-union (IoU) thresholds of 0.5 and 0.75, respectively, were evaluated for images with and without preprocessing. Overall, the preprocessed images yielded an improvement of approximately 30% over the images that had not been preprocessed (Fig. 3c).
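The following OpenCV sketch illustrates this preprocessing chain; the tile grid, kernel sizes, and threshold parameters are illustrative assumptions rather than the authors' settings.

```python
import cv2
import numpy as np

def preprocess_braille(image_bgr):
    """Sketch of the described chain: CLAHE -> adaptive-threshold binarization
    -> dilation/erosion -> median filtering. Parameter values are assumptions."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)

    # Contrast-limited adaptive histogram equalization on local tiles
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    equalized = clahe.apply(gray)

    # Adaptive-threshold binarization to emphasize the braille dots
    binary = cv2.adaptiveThreshold(
        equalized, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 5)

    # Dilation followed by erosion to suppress background structures
    kernel = np.ones((3, 3), np.uint8)
    morphed = cv2.erode(cv2.dilate(binary, kernel, iterations=1), kernel, iterations=1)

    # Median filter to remove residual salt-and-pepper noise
    return cv2.medianBlur(morphed, 5)
```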

Fig. 3

Preprocessing algorithms, final image, and accuracy for visual perception. a Preprocessing algorithms for model performance improvement. b Braille image before and after preprocessing. c Accuracy of the original image and preprocessed image for mAP50 and mAP75

The braille detection model uses Faster R-CNN with FPN–ResNet-50 as the backbone network and was trained on the aforementioned dataset with the transfer learning technique. Training involved applying data augmentation to increase the number of data instances and preprocessing the augmented data using the algorithms described above. In the final output of the trained model, the class is predicted by the classifier and the location by B-box regression. The mAP50 and mAP75 results are reported for both the preprocessed and non-preprocessed braille datasets; overall, the model achieved higher accuracy when trained on preprocessed images.
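As a complementary illustration of how mAP50 and mAP75 can be computed for such a detector, the sketch below assumes the torchmetrics library, the `model` and `device` from the earlier sketch, and a validation loader yielding torchvision-style targets; none of these tools are named in the paper.

```python
import torch
from torchmetrics.detection import MeanAveragePrecision

# Evaluate mAP at IoU thresholds 0.5 and 0.75, matching the reported metrics.
metric = MeanAveragePrecision(iou_thresholds=[0.5, 0.75])

model.eval()
with torch.no_grad():
    # val_loader is assumed to yield (images, targets) with targets as lists of
    # dicts containing "boxes" and "labels" tensors, as in torchvision detection.
    for images, targets in val_loader:
        images = [img.to(device) for img in images]
        preds = model(images)
        preds = [{k: v.cpu() for k, v in p.items()} for p in preds]
        metric.update(preds, targets)

results = metric.compute()
print("mAP@0.50:", results["map_50"].item(), " mAP@0.75:", results["map_75"].item())
```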

Tactile perception based on capacitive pressure sensor array

Pressure sensors are transducers that convert applied pressure into changes in electrical signals (e.g., resistance and capacitance) [25, 26]. In this study, a capacitive sensing approach with a micropatterned pressure-sensitive layer was used to enhance sensitivity and response/recovery time, as shown in Fig. 4a. In capacitive pressure sensors, the sensing performance is primarily determined by the gap between the top and bottom electrodes and by the relative permittivity of the sensing layer. When pressure is applied, the polymeric microdome structure gradually deforms, which decreases the gap distance and increases the effective permittivity [27, 28]. Owing to this synergistic effect, the capacitance increases with the applied pressure.
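This mechanism follows the standard parallel-plate relation (general physics, not an equation reproduced from the paper):

```latex
C(P) = \frac{\varepsilon_0\,\varepsilon_r(P)\,A}{d(P)},
\qquad
\frac{\Delta C}{C_0} = \frac{\varepsilon_r(P)}{\varepsilon_{r,0}}\cdot\frac{d_0}{d(P)} - 1
```

Here, d(P) is the electrode gap and ε_r(P) the effective relative permittivity at pressure P. Flattening of the microdomes reduces d and raises ε_r simultaneously, so both factors drive ΔC/C₀ upward, which is the synergistic effect described above.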

Fig. 4

Capacitive pressure sensor and sensor array. a Mechanism of the capacitive pressure sensor. b Fabrication process using a printing technique: the bottom electrode is printed on a PI film, the interdigitated electrode is printed on the opposite side of the film through an alignment process, and the top electrode with a dome-structured layer is attached to complete the capacitive sensor. c Exploded view of the capacitive pressure sensor array. d Top electrodes with a dome diameter of 30 μm and a pitch of 60 μm in the dome-structured array

To recognize braille characters whose single dots are 1.5 mm in diameter, the pressure sensors must be fabricated on a millimeter scale and formed into large-area, high-density sensor arrays; here, this was achieved using a printing technique (V-One, Voltera). Each sensor was designed to be 1.5 mm × 1.5 mm with a parallel-plate capacitor structure comprising a plane electrode on top and interdigitated electrodes at the bottom [29, 30]. Between the electrodes, a microfabricated dome-structured array was used as the pressure-sensitive layer. The fabrication process is shown in Fig. 4b. The bottom electrode was first printed on the back side of a polyimide (PI) substrate using a silver nanoparticle paste (AgNP paste, 75% AgNPs and 20% glycol; Voltera). The interdigitated electrodes were then printed on the front side of the PI substrate. The overlapping area between the back-side and front-side electrodes formed a static junction capacitor that provided an electrical connection without a direct vertical interconnection, which reduced wiring complexity and allowed the array to be expanded efficiently. Subsequently, a top electrode with a dome-structured layer was integrated onto the bottom interdigitated electrodes to form a parallel-plate capacitor, as shown in Fig. 4c and d. The dome-structured array serving as the pressure-sensitive layer was fabricated from a deformable polymer (styrene–ethylene–butylene–styrene, SEBS) with a dome diameter of 30 μm and a pitch of 60 μm (Fig. 4d) [31, 32].

By attaching the film to the top electrode, the capacitive pressure sensor, comprising a static junction capacitor and a parallel-plate capacitive pressure sensor, was completed. The circuit model covers both a single sensor and multiplexed sensors forming a 5 × 5 array, as shown in Fig. 5a. Figure 5b shows the measured relative capacitance change as a function of the applied pressure and compares the sensing performance of the dome-structured film with that of a non-structured blank film. The sensor with the dome structure exhibited an eightfold improvement over the sensor with the blank film, and it maintained stable performance even under high pressures exceeding 100 kPa, as shown in Fig. 5c. The pressure sensing measurements were conducted under flat conditions using an LCR meter (E4980A, Keysight). Finally, a multiplexed sensor system with the 5 × 5 array was implemented to generate real-time heatmap images for braille character recognition, as shown in Fig. 6. The tested braille word was "wearable," and each braille character ("w," "e," "a," "r," "a," "b," "l," and "e") was gently pressed onto the sensing area. The input braille characters matched the heatmap images well (Fig. 6).
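A minimal visualization sketch of such a heatmap readout is given below; the acquisition function is a hypothetical stand-in for the multiplexer/LCR-meter readout, which the paper does not detail, and here simply returns simulated values.

```python
import numpy as np
import matplotlib.pyplot as plt

ROWS, COLS = 5, 5  # sensor array dimensions from the paper

def read_capacitance_matrix():
    """Placeholder for the multiplexed readout of the 5 x 5 array.
    Returns simulated capacitances (arbitrary units) purely for illustration."""
    return 1.0 + 0.5 * np.random.rand(ROWS, COLS)

def to_relative_change(c_matrix, c_baseline):
    """Convert raw capacitances into relative changes (C - C0) / C0 for the heatmap."""
    return (c_matrix - c_baseline) / c_baseline

c_baseline = read_capacitance_matrix()  # baseline scan with no braille pressed
fig, ax = plt.subplots()
image = ax.imshow(np.zeros((ROWS, COLS)), cmap="viridis", vmin=0.0, vmax=1.0)
ax.set_title("Braille heatmap (relative capacitance change)")

for _ in range(200):  # finite loop standing in for a continuous real-time stream
    frame = to_relative_change(read_capacitance_matrix(), c_baseline)
    image.set_data(frame)
    plt.pause(0.05)  # refresh roughly every 50 ms
```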

Fig. 5

Equivalent circuit model and pressure sensing performance. a Equivalent circuit model of single pressure sensor and sensor array. b Comparing pressure sensor performance using dome-structured film with non-structured blank film. c Pressure sensor performance under high pressure (> 100 kPa) conditions

Fig. 6

Heatmap-based braille recognition using capacitive pressure sensor array. Multiplexed sensor system configuration (top) and heatmap-based braille recognition for the letters “WEARABLE” (bottom)

Conclusion

This study examined visual and tactile perception techniques for recognizing braille characters for the visually impaired. In the visual perception technique, a new dataset was built using data augmentation, and braille recognition was implemented on this dataset using transfer learning and preprocessing. In the tactile perception technique, small pressure sensors were expanded into an array conforming to braille standards, and a heatmap-based braille recognition technique was developed and implemented. The proposed vision- and tactile-based braille recognition techniques exhibited complementary characteristics. Visual perception offers fast and efficient data processing, making it suitable for handling large datasets, but it is limited by environmental factors such as light, distance, and angle. Conversely, tactile perception overcomes these environmental constraints but suffers from slower data processing and imposes demanding sensor requirements, including small size, high density, and high sensitivity.

In future work, we will enhance the visual technology to improve braille character recognition accuracy across various scenarios and the tactile technology to increase sensor density and sensitivity for precise braille character recognition. We also plan to develop an adaptive sensor fusion technology incorporating a four-layer fusion module to address the limitations of visual information (susceptibility to the environment) and tactile information (slow frame rate). The fusion module is expected to selectively use visual or tactile information depending on the surrounding environment, thereby enhancing braille character recognition. These results can serve as a foundation for future assistive technologies for the visually impaired based on visuo-tactile fusion.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding authors on reasonable request.

References

  1. Jung YH, Park B, Kim JU, Kim TI (2019) Bioinspired electronics for artificial sensory systems. Adv Mater 31(34):1803637
  2. Lee GJ, Choi C, Kim DH, Song YM (2018) Bioinspired artificial eyes: optic components, digital cameras, and visual prostheses. Adv Funct Mater 28(24):1705202
  3. Chortos A, Bao Z (2014) Skin-inspired electronic devices. Mater Today 17(7):321–331
  4. Elmannai W, Elleithy K (2017) Sensor-based assistive devices for visually-impaired people: current status, challenges, and future directions. Sensors 17(3):565
  5. Tapu R, Mocanu B, Zaharia T (2020) Wearable assistive devices for visually impaired: a state of the art survey. Pattern Recogn Lett 137:37–52
  6. Manjari K, Verma M, Singal G (2020) A survey on assistive technology for visually impaired. Internet of Things 11:100188
  7. Harris LN, Gladfelter A, Santuzzi AM, Lech IB, Rodriguez R, Lopez LE, Soto D, Li A (2023) Braille literacy as a human right: a challenge to the "inefficiency" argument against braille instruction. Int J Psychol 58(1):52–58
  8. Zhao ZQ, Zheng P, Xu ST, Wu X (2019) Object detection with deep learning: a review. IEEE Trans Neural Netw Learn Syst 30(11):3212–3232
  9. Minaee S, Boykov Y, Porikli F, Plaza A, Kehtarnavaz N, Terzopoulos D (2021) Image segmentation using deep learning: a survey. IEEE Trans Pattern Anal Mach Intell 44(7):3523–3542
  10. Kotsiantis SB, Zaharakis I, Pintelas P (2007) Supervised machine learning: a review of classification techniques. Emerg Artif Intell Appl Comput Eng 160(1):3–24
  11. Voulodimos A, Doulamis N, Doulamis A, Protopapadakis E (2018) Deep learning for computer vision: a brief review. Comput Intell Neurosci 2018:1–13
  12. Zhang Y, Hu Y, Zhu P, Han F, Zhu Y, Sun R, Wong CP (2017) Flexible and highly sensitive pressure sensor based on microdome-patterned PDMS forming with assistance of colloid self-assembly and replica technique for wearable electronics. ACS Appl Mater Interfaces 9(41):35968–35976
  13. Ji B, Mao Y, Zhou Q, Zhou J, Chen G, Gao Y, Tian Y, Wen W, Zhou B (2019) Facile preparation of hybrid structure based on mesodome and micropillar arrays as flexible electronic skin with tunable sensitivity and detection range. ACS Appl Mater Interfaces 11(31):28060–28071
  14. Tee BCK, Chortos A, Dunn RR, Schwartz G, Eason E, Bao Z (2014) Tunable flexible pressure sensors using microstructured elastomer geometries for intuitive electronics. Adv Funct Mater 24(34):5427–5434
  15. Yang JC, Kim JO, Oh J, Kwon SY, Sim JY, Kim DW, Choi HB, Park S (2019) Microstructured porous pyramid-based ultrahigh sensitive pressure sensor insensitive to strain and temperature. ACS Appl Mater Interfaces 11(21):19472–19480
  16. Yang J, Luo S, Zhou X, Li J, Fu J, Yang W, Wei D (2019) Flexible, tunable, and ultrasensitive capacitive pressure sensor with microconformal graphene electrodes. ACS Appl Mater Interfaces 11(16):14997–15006
  17. Zhao F, Hang Z, Lu L, Xu K, Zhang H, Yang F, Zhang W (2020) A skin-like sensor for intelligent Braille recognition. Nano Energy 68:104346
  18. Yan Y, Hu Z, Shen Y, Pan J (2022) Surface texture recognition by deep learning-enhanced tactile sensing. Adv Intell Syst 4(1):2100076
  19. Weiss K, Khoshgoftaar TM, Wang D (2016) A survey of transfer learning. J Big Data 3(1):1–40
  20. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
  21. Lin TY, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2117–2125
  22. Gonçalves D, Santos G, Campos M, Amory A, Manssour I (2020) Braille character detection using deep neural networks for an educational robot for visually impaired people. In: Anais do XVI Workshop de Visao Computacional, pp 123–128. SBC
  23. Li R, Liu H, Wang X, Xu J, Qian Y (2020) Optical braille recognition based on semantic segmentation network with auxiliary learning strategy. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp 554–555
  24. Buslaev A, Iglovikov VI, Khvedchenya E, Parinov A, Druzhinin M, Kalinin AA (2020) Albumentations: fast and flexible image augmentations. Information 11(2):125
  25. Ruth SRA, Feig VR, Tran H, Bao Z (2020) Microengineering pressure sensor active layers for improved performance. Adv Funct Mater 30(39):2003491
  26. Cui X, Huang F, Zhang X, Song P, Zheng H, Chevali V, Wang H, Xu Z (2022) Flexible pressure sensors via engineering microstructures for wearable human-machine interaction and health monitoring applications. iScience 25(4):104148
  27. Jafarizadeh B, Chowdhury AH, Khakpour I, Pala N, Wang C (2022) Design rules for a wearable micro-fabricated piezo-resistive pressure sensor. Micromachines 13(6):838
  28. Park J, Kim J, Hong J, Lee H, Lee Y, Cho S, Kim S, Kim JJ, Kim SY, Ko H (2018) Tailoring force sensitivity and selectivity by microstructure engineering of multidirectional electronic skins. NPG Asia Mater 10(4):163–176
  29. Al Rumon MA, Shahariar H (2021) Fabrication of interdigitated capacitor on fabric as tactile sensor. Sensors Int 2:100086
  30. Rahman MT, Rahimi A, Gupta S, Panat R (2016) Microscale additive manufacturing and modeling of interdigitated capacitive touch sensors. Sens Actuators A Phys 248:94–103
  31. Zhang YY, Zhang J, Wang GL, Wang ZF, Luo ZW, Zhang M (2019) Manufacturing and characterizing of CCTO/SEBS dielectric elastomer as capacitive strain sensors. Rare Met 42:2344–2349
  32. Wang S, Nie Y, Zhu H, Xu Y, Cao S, Zhang J, Li Y, Wang J, Ning X, Kong D (2022) Intrinsically stretchable electronics with ultrahigh deformability to monitor dynamically moving organs. Sci Adv 8(13):eabl5511


Acknowledgements

This research was supported by the convergence R&D over Science and Technology Liberal Arts Program through the National Research Foundation of Korea funded by the Ministry of Science and ICT (Grant No. 2022M3C1B6081061), the Samsung Electronics Future Technology Development Center (Grant No. SRFC-TD2103-01), and INHA UNIVERSITY Research Grant.

Funding

This research was supported by the convergence R&D over Science and Technology Liberal Arts Program through the National Research Foundation of Korea funded by the Ministry of Science and ICT (Grant No. 2022M3C1B6081061), the Samsung Electronics Future Technology Development Center (Grant No. SRFC-TD2103-01), and an INHA UNIVERSITY Research Grant. This work was also supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (Grant No. RS-2023-00211348).

Author information

Authors and Affiliations

Authors

Contributions

BP and SI fabricated the devices and conducted experiments. BP, SI, and HL analyzed the data. BP and SI drafted the figures and manuscript. YTL, CN, SH, and MK supervised the experiments and revised the manuscript. All the authors have read and approved the manuscript.

Corresponding author

Correspondence to Min-gu Kim.

Ethics declarations

Ethics approval and consent to participate

The authors declare that they have no competing interests.

Consent for publication

The authors consent to the Springer Open license agreement to publish the article.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Park, BS., Im, SM., Lee, H. et al. Visual and tactile perception techniques for braille recognition. Micro and Nano Syst Lett 11, 23 (2023). https://doi.org/10.1186/s40486-023-00191-w

