15–16 Oct 2024
School of Sciences, JAIN (Deemed-to-be University), JC Road, Bengaluru-560027
Asia/Colombo timezone
Deadline for Abstract Submission: 11:59 pm IST, 10th September 2024.

The art of audio: generating text and images

Not scheduled
20m
Jain University School Of Sciences, JC Road, 34, 1st Cross Rd, Near Ravindra Kalakshetra, Bengaluru, Karnataka 560027
Poster Innovation and Technology for Sustainability

Speaker

Aashita Paliwal

Description

In this paper, we explore a novel multimodal approach that bridges the gap between audio, text, and visual representations by utilizing the Clotho dataset, a comprehensive collection of audio recordings with detailed textual annotations. Our methodology involves two primary stages. First, we employ state-of-the-art audio processing and natural language processing techniques to transcribe audio data into accurate textual representations. Second, we transform these transcriptions into visual forms, creating an innovative way to visualize the content and structure of audio information.
Through this study, we aim to advance the field of multimodal data analysis by demonstrating how audio, text, and visual elements can be integrated to offer enriched user experiences and deeper analytical capabilities. The proposed framework contributes to ongoing research in this field and opens up new possibilities for future exploration and applications.
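
The two-stage flow described above (audio to text, then text to image) could be wired together roughly as follows. This is only an illustrative sketch: the abstract does not name the models or libraries used, so the stage signatures `AudioToText` and `TextToImage`, the `run_pipeline` helper, and the placeholder stages are all hypothetical stand-ins rather than the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; the abstract does not specify the models.
AudioToText = Callable[[bytes], str]   # stage one: audio -> text
TextToImage = Callable[[str], bytes]   # stage two: text -> encoded image


@dataclass
class PipelineResult:
    text: str     # stage-one output: textual description of the audio
    image: bytes  # stage-two output: image bytes generated from the text


def run_pipeline(audio: bytes, to_text: AudioToText,
                 to_image: TextToImage) -> PipelineResult:
    """Chain the two stages: audio -> text, then text -> image."""
    text = to_text(audio)
    image = to_image(text)
    return PipelineResult(text=text, image=image)


# Placeholder stages so the sketch runs without any model (assumptions only).
def dummy_to_text(audio: bytes) -> str:
    return f"audio clip of {len(audio)} bytes"


def dummy_to_image(text: str) -> bytes:
    return text.encode("utf-8")


result = run_pipeline(b"\x00\x01\x02", dummy_to_text, dummy_to_image)
print(result.text)
```

Keeping each stage behind a plain callable means a real audio captioning model or image generator can be swapped in without changing the pipeline code.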

Primary authors
