All Work
-
‘A Tool for Mutation Analysis in Racket’
“Racket is a functional programming language that is used to teach CS1 at many high schools and colleges. … In order to evaluate [mutation analysis’s] efficacy in our college’s introductory programming courses, we created a prototype mutation analysis tool for Racket. We describe the design and features of the tool and perform a feasibility study using two assignments from an intro CS course where student test suite thoroughness was evaluated by hand by human graders.” Find the paper and full list of authors in the 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops.
-
‘SliceLens: Guided Exploration of Machine Learning Datasets’
“SliceLens is a tool for exploring labeled, tabular, machine learning datasets. To explore a dataset, the user selects combinations of features in the dataset that they are interested in. The tool splits those features into bins and then visualizes the label distributions for the subsets of data created by the intersections of the bins. SliceLens guides the user in determining which feature combinations to explore. Guidance is based on a user-selected rating metric, which assigns a score to the subsets created by a given combination of features.”
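The binning-and-intersection step described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the SliceLens implementation: the function names (`equal_width_bin`, `slice_label_counts`), the equal-width binning strategy, and the toy rows are all assumptions made here, and the rating metric that guides exploration is not covered.

```python
from collections import Counter, defaultdict


def equal_width_bin(value, lo, hi, n_bins):
    """Map a numeric value to one of n_bins equal-width bins over [lo, hi]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls in the last bin


def slice_label_counts(rows, features, label, n_bins=2):
    """Count labels in each subset formed by intersecting the feature bins."""
    # Hypothetical choice: bin each feature over its observed min/max range.
    ranges = {
        f: (min(r[f] for r in rows), max(r[f] for r in rows)) for f in features
    }
    slices = defaultdict(Counter)
    for r in rows:
        # One bin index per selected feature identifies the subset.
        key = tuple(equal_width_bin(r[f], *ranges[f], n_bins) for f in features)
        slices[key][r[label]] += 1
    return dict(slices)


# Toy labeled, tabular dataset (made up for illustration).
rows = [
    {"age": 20, "income": 10, "label": "no"},
    {"age": 25, "income": 90, "label": "yes"},
    {"age": 60, "income": 15, "label": "no"},
    {"age": 65, "income": 95, "label": "yes"},
    {"age": 62, "income": 85, "label": "no"},
]

slices = slice_label_counts(rows, ["age", "income"], "label", n_bins=2)
for key in sorted(slices):
    print(key, dict(slices[key]))
```

Each key is a tuple of bin indices, one per selected feature, and each value is the label distribution a tool of this kind would then visualize for that subset.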
-
‘“Everyone is Covered”: Exploring the Role of Online Interactions in Facilitating Connection and Social Support in Black Churches’
“Faith institutions provide social support and community care for many in the United States (U.S.). Notably, churches with predominantly Black populations have served as a site for social change and care provision. … However, the pandemic has emphasised how localising these care networks in physical spaces can limit access to social support. … Through interviews and focus groups with nine church members, we explore how hybrid faith communities that bridge offline and online contexts can enable social support and care provision.” Find the paper and full list of authors in the 2023 CHI Conference on Human Factors in Computing Systems…
-
‘Game Level Blending Using a Learned Level Representation’
“Game level blending via machine learning, the process of combining features of game levels to create unique and novel game levels using Procedural Content Generation via Machine Learning (PCGML) techniques, has gained increasing popularity in recent years. However, many existing techniques rely on human-annotated level representations, which limits game level blending to a limited number of annotated games. … In this paper, we present a novel approach to game level blending … that can serve as a level representation for unannotated games and a unified level representation across games without … human annotation.” Find the paper and full list of…
-
‘Typed-Untyped Interactions: A Comparative Analysis’
“The literature presents many strategies for enforcing the integrity of types when typed code interacts with untyped code. This article presents a uniform evaluation framework that characterizes the differences among some major existing semantics for typed–untyped interaction. Type system designers can use this framework to analyze the guarantees of their own dynamic semantics.” Find the paper and full list of authors in ACM Transactions on Programming Languages and Systems.
-
‘EMShepherd: Detecting Adversarial Samples via Side-channel Leakage’
“Deep Neural Networks (DNN) are vulnerable to adversarial perturbations – small changes crafted deliberately on the input to mislead the model for wrong predictions. Adversarial attacks have disastrous consequences for deep learning-empowered critical applications. … Inspired by the fact that electromagnetic (EM) emanations of a model inference are dependent on both operations and data and may contain footprints of different input classes, we propose a framework, EMShepherd, to capture EM traces of model execution, perform processing on traces and exploit them for adversarial detection.” Find the paper and full list of authors at ArXiv.
-
‘Injecting Language Workbench Technology Into Mainstream Languages’
“Eelco Visser envisioned a future where DSLs become a commonplace abstraction in software development. He took strides towards implementing this vision with the Spoofax language workbench. However, his vision is far from the mainstream of programming today. How will the many mainstream programmers encounter and adopt language workbench technology? We propose that the macro systems found in emerging industrial languages open a path towards delivering language workbenches as easy-to-adopt libraries.” Find the paper and full list of authors at the Dagstuhl Research Online Publication Server.
-
‘Analysis of Catastrophic Forgetting for Random Orthogonal Transformation Tasks in the Overparameterized Regime’
“Overparameterization is known to permit strong generalization performance in neural networks. In this work, we provide an initial theoretical analysis of its effect on catastrophic forgetting in a continual learning setup. We show experimentally that in Permuted MNIST image classification tasks, the generalization performance of multilayer perceptrons trained by vanilla stochastic gradient descent can be improved by overparameterization, and the extent of the performance increase achieved by overparameterization is comparable to that of state-of-the-art continual learning algorithms.” Find the paper and full list of authors in the Proceedings of Machine Learning Research.
-
Enhancing security for brain-inspired computing
“Electrical and computer engineering assistant professor Xiaolin Xu, in collaboration with Shaolei Ren from the University of California-Riverside, was awarded a $600,000 NSF grant for ‘Securing Brain-inspired Hyperdimensional Computing against Design-time and Run-time Attacks for Edge Devices.’”
-
Eco-friendly passive cooling with recycled packaging plastics
“Mechanical and industrial engineering associate professor Yi Zheng’s research group Nano Energy Laboratory published their research on ‘Oil-paper-umbrella-inspired passive radiative cooling using recycled packaging foam’ in the Journal of Materials Chemistry A and ‘Eco-friendly passive radiative cooling using recycled packaging plastics’ in Materials Today Sustainability. In both works, they studied eco-friendly passive cooling materials made of recycled packaging plastics for a greener and cleaner community. The low material cost and ease of fabrication provide a path for effective passive cooling, which requires zero external energy consumption (e.g., air conditioners), especially in less developed areas.”
-
‘Information Transfer in Multitask Learning, Data Augmentation and Beyond’
“A hallmark of human intelligence is that we continue to learn new information and then extrapolate the learned information onto new tasks and domains (see, e.g., Thrun and Pratt (1998)). While this is a fairly intuitive observation, formulating such ideas has proved to be a challenging research problem and continues to inspire new studies. Recently, there has been increasing interest in AI/ML about building models that generalize across tasks, even when they have some form of distribution shifts. How can we ground this research in a solid framework to develop principled methods for better practice?”
-
‘Universal Amplification of KDM Security: From 1-Key Circular to Multi-Key KDM’
“An encryption scheme is Key Dependent Message (KDM) secure if it is safe to encrypt messages that can arbitrarily depend on the secret keys themselves. In this work, we show how to upgrade essentially the weakest form of KDM security into the strongest one. In particular, we assume the existence of a symmetric-key bit-encryption that is circular-secure in the 1-key setting, meaning that it maintains security even if one can encrypt individual bits of a single secret key under itself.” Find the paper and full list of authors at the Cryptology ePrint Archive.
-
‘Leveraging Large Language Models for Mental Health Prediction via Online Text Data’
“The recent technology boost of large language models (LLMs) has empowered a variety of applications. However, there is very little research on understanding and improving LLMs’ capability for the mental health domain. In this work, we present the first comprehensive evaluation of multiple LLMs … on various mental health prediction tasks via online text data. We conduct a wide range of experiments, covering zero-shot prompting, few-shot prompting, and instruction finetuning. The results indicate the promising yet limited performance of LLMs with zero-shot and few-shot prompt designs for mental health tasks.” Find the paper and full list of authors at ArXiv.
-
‘On-Robot Bayesian Reinforcement Learning for POMDPs’
“Robot learning is often difficult due to the expense of gathering data. The need for large amounts of data can, and should, be tackled with effective algorithms and leveraging expert information on robot dynamics. Bayesian reinforcement learning (BRL), thanks to its sample efficiency and ability to exploit prior knowledge, is uniquely positioned as such a solution method. Unfortunately, the application of BRL has been limited due to the difficulties of representing expert knowledge. … This paper advances BRL for robotics by proposing a specialized framework for physical systems.” Find the paper and full list of authors at ArXiv.
-
‘A Guessing Entropy-Based Framework for Deep Learning-Assisted Side-Channel Analysis’
“Recently deep-learning (DL) techniques have been widely adopted in side-channel power analysis. A DL-assisted SCA generally consists of two phases: a deep neural network (DNN) training phase and a follow-on attack phase using the trained DNN. However, currently the two phases are not well aligned, as there is no conclusion on what metric used in the training can result in the most effective attack in the second phase. … We propose to conduct DNN training directly with a common SCA effectiveness metric, Guessing Entropy (GE).” Find the paper and full list of authors in IEEE Transactions on Information Forensics and…
-
‘DRIP: Domain Refinement Iteration With Polytopes for Backward Reachability Analysis of Neural Feedback Loops’
“Safety certification of data-driven control techniques remains a major open problem. This letter investigates backward reachability as a framework for providing collision avoidance guarantees for systems controlled by neural network (NN) policies. … Existing methods conservatively assume a domain over which to relax the NN, which causes loose over-approximations of the set of states that could lead the system into the obstacle. … To address this issue, we introduce DRIP, an algorithm with a refinement loop on the relaxation domain, which substantially tightens the BP set bounds.” Find the paper and full list of authors at IEEE Control Systems Letters.
-
‘Using Sequences of Life-Events to Predict Human Lives’
“Machine learning has revolutionized computers’ ability to analyze text through flexible computational models. Due to their structural similarity to written language, transformer-based architectures have also shown promise as tools to make sense of a range of multi-variate sequences. … We can also represent human lives in a way that shares this structural similarity to language. From one perspective, lives are simply sequences of events. … Here, we exploit this similarity to adapt innovations from natural language processing to examine the evolution and predictability of human lives based on detailed event sequences.” Find the paper and full list of authors at…
-
‘Defense Against Shortest Path Attacks’
“Identifying shortest paths between nodes in a network is an important task in applications involving routing of resources. Recent work has shown that a malicious actor can manipulate a graph to make traffic between two nodes of interest follow their target path. In this paper, we develop a defense against such attacks by modifying the weights of the graph that users observe. … In this context, we also consider a zero-sum version of the game, in which the defender’s goal is to minimize cost while achieving the minimum possible attack probability.” Find the paper and full list of authors at…
-
‘On Regularity Lemma and Barriers in Streaming and Dynamic Matching’
“We present a new approach for finding matchings in dense graphs by building on Szemerédi’s celebrated Regularity Lemma. This allows us to obtain non-trivial albeit slight improvements over longstanding bounds for matchings in streaming and dynamic graphs.” Find the paper and full list of authors in the Proceedings of the 55th Annual ACM Symposium on Theory of Computing.
-
‘Jointly Extracting Interventions, Outcomes and Findings From RCT Reports With LLMs’
“Results from Randomized Controlled Trials (RCTs) establish the comparative effectiveness of interventions, and are in turn critical inputs for evidence-based care. However, results from RCTs are presented in (often unstructured) natural language articles describing the design, execution, and outcomes of trials; clinicians must manually extract findings pertaining to interventions and outcomes of interest from such articles. … We propose and evaluate a text-to-text model built on instruction-tuned Large Language Models (LLMs) to jointly extract Interventions, Outcomes, and Comparators (ICO elements) from clinical abstracts, and infer the associated results reported.” Find the paper and full list of authors at ArXiv.
-
‘Revisiting Relation Extraction in the Era of Large Language Models’
“Relation extraction (RE) is the core NLP task of inferring semantic relationships between entities from text. Standard supervised RE techniques entail training modules to tag tokens comprising entity spans and then predict the relationship between them. Recent work has instead treated the problem as a sequence-to-sequence task, linearizing relations between entities as target strings to be generated conditioned on the input. Here we … [use] larger language models (GPT-3 and Flan-T5 large) than considered in prior work and evaluat[e] their performance on standard RE tasks under varying levels of supervision.” Find the paper and full list of authors at ArXiv.
-
‘RedHOT: A Corpus of Annotated Medical Questions, Experiences and Claims on Social Media’
“We present Reddit Health Online Talk (RedHOT), a corpus of 22,000 richly annotated social media posts from Reddit spanning 24 health conditions. … We collect additional granular annotations on identified claims. Specifically, we mark snippets that describe patient Populations, Interventions, and Outcomes (PIO elements) within these. Using this corpus, we introduce the task of retrieving trustworthy evidence relevant to a given claim made on social media. We propose a new method to automatically derive (noisy) supervision for this task which we use to train a dense retrieval model.” Find the paper and full list of authors at ACL Anthology.
-
‘SemEval-2023 Task 8: Causal Medical Claim Identification and Related PIO Frame Extraction From Social Media Posts’
“Identification of medical claims from user-generated text data is an onerous but essential step for various tasks including content moderation, and hypothesis generation. SemEval-2023 Task 8 is an effort towards building those capabilities and motivating further research in this direction. This paper summarizes the details and results of shared task 8 at SemEval-2023 which involved identifying causal medical claims and extracting related Populations, Interventions, and Outcomes (‘PIO’) frames from social media (Reddit) text.” Find the paper and full list of authors at ACL Anthology.
-
‘Speak Much, Remember Little: Cryptography in the Bounded Storage Model, Revisited’
“The goal of the bounded storage model (BSM) is to construct unconditionally secure cryptographic protocols, by only restricting the storage capacity of the adversary, but otherwise giving it unbounded computational power. Here, we consider a streaming variant of the BSM, where honest parties can stream huge amounts of data to each other so as to overwhelm the adversary’s storage, even while their own storage capacity is significantly smaller than that of the adversary.” Find the paper and full list of authors at Advances in Cryptology—EUROCRYPT 2023.
-
‘Exploring the Role of Audio in Video Captioning’
“Recent focus in video captioning has been on designing architectures that can consume both video and text modalities, and using large-scale video datasets with text transcripts for pre-training, such as HowTo100M. … In this work, we present an audio-visual framework, which aims to fully exploit the potential of the audio modality for captioning. Instead of relying on text transcripts extracted via automatic speech recognition (ASR), we argue that learning with raw audio signals can be more beneficial, as audio has additional information including acoustic events, speaker identity, etc.” Find the paper and full list of authors at ArXiv.
-
‘Semi-Quantitative Detection of Pseudouridine Modifications and Type I/II Hypermodifications in Human mRNAs … Direct Long-Read Sequencing’
“Here, we develop and apply a semi-quantitative method for the high-confidence identification of pseudouridylated sites on mammalian mRNAs via direct long-read nanopore sequencing. A comparative analysis of a modification-free transcriptome reveals that the depth of coverage and specific k-mer sequences are critical parameters for accurate basecalling. By adjusting these parameters for high-confidence U-to-C basecalling errors, we identify many known sites of pseudouridylation and uncover previously unreported uridine-modified sites, many of which fall in k-mers that are known targets of pseudouridine synthases.” Find the paper and full list of authors in Nature Communications.
-
‘Pulcherrimin Protects Bacillus subtilis Against Oxidative Stress During Biofilm Development’
“Pulcherrimin is an iron-binding reddish pigment produced by various bacterial and yeast species. In the soil bacterium Bacillus subtilis, this pigment is synthesized intracellularly as the colorless pulcherriminic acid by using two molecules of tRNA-charged leucine as the substrate; pulcherriminic acid molecules are then secreted and bind to ferric iron extracellularly to form the red-colored pigment pulcherrimin. … In this study, we identified that pulcherrimin is primarily produced under biofilm conditions and provides protection to cells in the biofilm against oxidative stress.” Find the paper and full list of authors in npj Biofilms and Microbiomes.