research and things

hii! this is a summary of all the research i've done since around 2020. my main interests are natural language processing and creative and generative ai, although i also have some experience with image processing and musical/audio analysis. i am currently working on the speechbrain project at mila, particularly on automatic speech recognition and on using large language models to enhance it. most recently, i published on the use of language modelling to analyze and create influential political speech, and presented this at the aaai workshop on ai and diplomacy in february 2023!

feel free to also check out my google scholar at this link!

additionally, below, you can find brief summaries/abstracts of my publications. enjoy! ^-^

President Botrick: An Analysis of Deep Learning-Based Conversational AI Models to Identify and Create Influential Political Speeches [PAPER]

Conference: AAAI 2023 Workshop for AI and Diplomacy

             "This paper explores the defining qualities of language that are considered influential 
              and charismatic in the context of political speech. Transformer-based models have shown
              to be efficient in analyzing contextual clues and generating coherent texts in a variety
              of domains. With limited research in the identification and exploration of the replication 
              of persuasion in natural human language and generation of influential speech, we seek to 
              analyze the aspects of public speech that are deemed persuasive and impactful, and generate 
              text accordingly. We propose a two-part experiment: First, we train a BERT-based encoder 
              to weigh segments of speech in order to predict its influence on an audience; second, we 
              train a GPT-based decoder to use an established understanding of persuasion to generate new 
              political speech. We show that, using these models, a speech can be created that mimics the 
              natural language habits of prominent political figures."

Comparing Approaches to Language Understanding for Human-Robot Dialogue: An Error Taxonomy and Analysis [PAPER]

Conference: Language Resources and Evaluation Conference 2022

             "In this paper, we compare two different approaches to language understanding for a
              human-robot interaction domain in which a human commander gives navigation instructions to a
              robot. We contrast a relevance-based classifier with a GPT-2 model, using about 2000 input-output
              examples as training data. With this level of training data, the relevance-based model outperforms
              the GPT-2 based model 79% to 68%, and an Oracle combination set an upper-bound of 85%. We also 
              present a taxonomy of types of errors made by each model, indicating that they have somewhat
              different strengths and weaknesses, so we also examine the potential for a combined model."

ML-Based Eye Tracking for Augmented Reality Heads-Up Displays (AR HUDs) [PAPER]

Conference: Society for Information Display Annual Display Week 2021

             "3D Augmented Reality (AR) Heads-up Displays (HUDs) have the potential of overlaying
              virtual objects at the correct locations with accurate motion parallax. Accurate overlays 
              require tracking the pupils of the driver’s eyes. We developed an ML-based pupil tracking 
              system based on a convolutional neural network (CNN) to find the precise location of the pupils."
