arXiv Research Digest

April 06, 2026 • 125 papers across 5 interests

🔬

Efficient ML / Edge AI

🟢 Applied

Multi-Aspect Knowledge Distillation for Language Model with Low-rank Factorization

💡 This research compresses language models with knowledge distillation.

Multi-aspect Knowledge Distillation (MaKD) mimics the self-attention and feed-forward modules in greater depth to capture rich language knowledge from different aspects. MaKD achieves competitive performance compared with strong baselines under the same storage parameter budget.

Abstract ↗ PDF ↗

🟢 Applied

QVAD: A Question-Centric Agentic Framework for Efficient and Training-Free Video Anomaly Detection

💡 This research detects video anomalies without additional training.

Video Anomaly Detection (VAD) is a fundamental challenge in computer vision. We argue that the bottleneck in VAD is not necessarily model capacity, but rather the static nature of inquiry. We propose QVAD, a question-centric agentic framework that treats VLM-LLM interaction as a dynamic dialogue.

Abstract ↗ PDF ↗

🟢 Applied

Prompt Compression in the Wild: Measuring Latency, Rate Adherence, and Quality for Faster LLM Inference

💡 This research measures how prompt compression affects LLM inference speed and quality.

Prompt compression has established itself as a cost-effective and low-latency method for accelerating inference in large language models. We present the first systematic, large-scale study of this trade-off. We show effective compression can reduce memory usage enough to offload workloads from the data center.

Abstract ↗ PDF ↗

🟢 Applied

Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization

💡 This research distills multiple teacher models for low-resource abstractive summarization.

A human-validated, multi-judge LLM evaluation further reveals calibration bias in single-judge pipelines. Logit-level KD provides the most reliable gains, while complex distillation improves semantic similarity for short summaries but degrades longer outputs.
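To make "logit-level KD" concrete, here is a minimal sketch of the widely used temperature-softened formulation, where the student is trained to match the teacher's output distribution. This is an assumption about the general technique, not the paper's exact loss: the function names, the temperature value, and the example logits are all illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student))
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2

# Hypothetical logits over a three-token vocabulary.
teacher = [4.0, 1.0, 0.5]
close_loss = kd_loss([3.5, 1.2, 0.4], teacher)  # student close to the teacher
far_loss = kd_loss([0.5, 1.0, 4.0], teacher)    # student far from the teacher
print(close_loss, far_loss)
```

A student whose logits track the teacher's yields a smaller loss than one that ranks the tokens in the opposite order.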

Abstract ↗ PDF ↗

🟢 Applied

DSBD: Dual-Aligned Structural Basis Distillation for Graph Domain Adaptation

💡 This research transfers knowledge across graph domains through structural distillation.

Graph domain adaptation (GDA) aims to transfer knowledge from a labeled source graph to an unlabeled target graph under distribution shifts. Existing methods are largely feature-centric and overlook structural discrepancies. Dual-Aligned Structural Basis Distillation (DSBD) for GDA is a novel framework that explicitly models and adapts cross-domain structural variation. DSBD constructs a differentiable structural basis by synthesizing continuous probabilistic prototype graphs, enabling gradient-based optimization over

Abstract ↗ PDF ↗

🟢 Applied

MI-Pruner: Crossmodal Mutual Information-guided Token Pruner for Efficient MLLMs

💡 This research makes multimodal language models more efficient by pruning visual tokens.

For multimodal large language models, visual information is relatively sparse compared with text. MI-Pruner is simple, efficient and non-intrusive, requiring no access to internal attention maps or architectural modifications.

Abstract ↗ PDF ↗

🟢 Applied

SFFNet: Synergistic Feature Fusion Network With Dual-Domain Edge Enhancement for UAV Image Object Detection

💡 This research detects objects in UAV images.

Traditional methods struggle to effectively separate objects from intricate backgrounds and fail to fully leverage the rich multi-scale information contained within images. We have developed a synergistic feature fusion network (SFFNet) with dual-domain edge enhancement specifically tailored for object detection in UAV images.

Abstract ↗ PDF ↗

🟡 Advanced

JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency

💡 This research optimizes language AI.

JoyAI-LLM Flash is an efficient Mixture-of-Experts (MoE) language model designed to redefine the trade-off between strong performance and token efficiency in the sub-50B parameter regime. The model comprises 48B total parameters while activating only 2.7B parameters per forward pass, achieving a substantially higher sparsity ratio than contemporary industry-leading models.
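As a quick check on the sparsity figure, the active fraction follows directly from the two parameter counts quoted in the summary. The variable names are illustrative, and the sketch assumes the counts are accurate to the stated precision.

```python
# Parameter counts quoted in the summary, in billions.
total_params_b = 48.0
active_params_b = 2.7

# Fraction of all parameters used in one forward pass.
active_fraction = active_params_b / total_params_b
# Ratio of total to active parameters.
total_to_active = total_params_b / active_params_b

print(f"active fraction: {active_fraction:.4f}")
print(f"total-to-active ratio: {total_to_active:.1f}x")
```

Under these numbers, fewer than 6% of the parameters are active per forward pass.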

Abstract ↗ PDF ↗

🟢 Applied

Not All Frames Deserve Full Computation: Accelerating Autoregressive Video Generation via Selective Computation and Predictive Extrapolation

💡 This research speeds up machine learning.

Autoregressive (AR) video diffusion models enable long-form video generation but remain expensive due to repeated multi-step denoising. Existing training-free acceleration methods rely on binary cache-or-recompute decisions, overlooking intermediate cases where direct reuse is too coarse yet full recomputation is unnecessary. SCOPE introduces a tri-modal scheduler over cache, predict, and recompute.

Abstract ↗ PDF ↗

🟢 Applied

The Eleventh NTIRE 2026 Efficient Super-Resolution Challenge Report

💡 This research reviews efficient image super-resolution methods.

This paper reviews the NTIRE 2026 challenge on efficient single-image super-resolution. The aim of this challenge is to devise a network that reduces one or several aspects, such as runtime, parameters, and FLOPs. The challenge had 95 registered participants, and 15 teams made valid submissions.

Abstract ↗ PDF ↗

🟢 Applied

The Compression Gap: Why Discrete Tokenization Limits Vision-Language-Action Model Scaling

💡 This research improves language AI.

Scaling vision-language-action models by upgrading the vision encoder is expected to improve downstream manipulation performance. We show that this expectation fails when actions are represented as discrete tokens. In any visuomotor pipeline, scaling behavior is governed by the location of the tightest information bottleneck.

Abstract ↗ PDF ↗

🟢 Applied

SkillRT: Compiling Skills for Efficient Execution Everywhere

💡 This research explores techniques in language AI.

SkillRT is a compilation and runtime system designed for portable and efficient skill execution. SkillRT achieves up to 3.2x speedup with enhanced parallelism, and 19-50x latency reduction through code solidification.

Abstract ↗ PDF ↗

🟢 Applied

VOSR: A Vision-Only Generative Model for Image Super-Resolution

💡 This research generates super-resolved images with a vision-only model.

Most recent generative image super-resolution (SR) methods rely on adapting large text-to-image (T2I) diffusion models. While effective, this paradigm starts from a generic T2I generator. We propose VOSR, a Vision-Only generative framework for SR.

Abstract ↗ PDF ↗

🟢 Applied

Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation

💡 This research surveys context augmentation strategies for language models.

Large language models (LLMs) encode vast world knowledge in their parameters, yet they remain limited by static knowledge, finite context windows, and weakly structured causal reasoning. This survey provides a unified account of augmentation strategies along a single axis: the degree of structured context supplied at inference time.

Abstract ↗ PDF ↗

🟢 Applied

EffiMiniVLM: A Compact Dual-Encoder Regression Framework

💡 This research builds a compact vision-language regression model.

EffiMiniVLM is a compact dual-encoder vision-language regression framework that integrates an EfficientNet-B0 image encoder and a MiniLM-based text encoder with a lightweight regression head. The proposed model contains 27.7M parameters and requires 6.8 GFLOPs, yet achieves a CES score of 0.40.

Abstract ↗ PDF ↗

🟢 Applied

Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation

💡 This research speeds up video generation through distillation.

Distilling video generation models to extremely low inference budgets (e.g., 2-4 NFEs) is crucial for real-time deployment. Trajectory-style consistency distillation often becomes conservative under complex video dynamics, yielding an over-smoothed appearance and weak motion. We propose Self-Consistent Distribution Matching Distillation (SC-DMD), which explicitly regularizes the endpoint-consistent composition of consecutive denoising updates.

Abstract ↗ PDF ↗

🟢 Applied

STEAR: Layer-Aware Spatiotemporal Evidence Intervention for Hallucination Mitigation in Video Large Language Models

💡 This research explores techniques in language AI.

Video Large Language Models (Video-LLMs) remain prone to spatiotemporal hallucinations. STEAR identifies high-risk decoding steps and selects token-conditioned visual evidence from grounding-sensitive middle layers. It uses this shared evidence for two coupled purposes: restoring missing local grounding in middle layers, and constructing temporal counterfactuals to falsify inconsistent reasoning during late-layer decoding.

Abstract ↗ PDF ↗

🟢 Applied

NeuReasoner: Towards Explainable, Controllable, and Unified Reasoning via Mixture-of-Neurons

💡 This research achieves better machine learning.

Large Reasoning Models (LRMs) have recently achieved remarkable success in complex reasoning tasks. However, closer scrutiny reveals persistent failure modes that compromise performance and cost. NeuReasoner is an explainable, controllable, and unified reasoning framework driven by Mixture-of-Neurons (MoN).

Abstract ↗ PDF ↗

🟢 Applied

HyperCT: Low-Rank Hypernet for Unified Chest CT Analysis

💡 This research proposes a method for computer vision.

HyperCT is a framework that dynamically adapts a Vision Transformer backbone via a Hypernetwork. Validated on a large-scale dataset of radiological and cardiological tasks, it outperforms various strong baselines.

Abstract ↗ PDF ↗

🟢 Applied

Hierarchical Planning with Latent World Models

💡 This research explores techniques in machine learning.

Model predictive control (MPC) with learned world models has emerged as a promising paradigm for embodied control. We demonstrate that this hierarchical approach enables zero-shot control on real-world non-greedy robotic tasks, achieving a 70% success rate on pick-&-place using only a final goal specification.

Abstract ↗ PDF ↗

🟢 Applied

Learning the Signature of Memorization in Autoregressive Language Models

💡 This research detects training-data memorization in language models.

Learned Transfer MIA (LT-MIA) trains a membership inference classifier exclusively on transformer-based models. It transfers zero-shot to Mamba (state-space), RWKV-4 (linear attention), and RecurrentGemma (gated recurrence), all of which exceed the performance on held-out transformers.

Abstract ↗ PDF ↗

🟢 Applied

Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action Model

💡 This research explores techniques in computer vision.

MV-VDP is a multi-view video diffusion policy that jointly models the 3D spatio-temporal state of the environment. The core idea is to simultaneously predict multi-view heatmap videos and RGB videos. It also aligns the representation format of video pretraining with action finetuning, and specifies what actions the robot should take.

Abstract ↗ PDF ↗

🟢 Applied

PRISM: LLM-Guided Semantic Clustering for High-Precision Topics

💡 This research proposes a method for language AI.

PRISM is a structured topic modeling framework combining rich representations captured by LLMs with the low cost and interpretability of latent semantic clustering methods. PRISM fine-tunes a sentence encoding model using a sparse set of LLM-provided labels on samples drawn from some corpus of interest. We segment this embedding space with thresholded clustering, yielding clusters that separate closely related topics within some narrow domain.

Abstract ↗ PDF ↗

🟢 Applied

Beyond Precision: Importance-Aware Recall for Factuality Evaluation in Long-Form LLM Generation

💡 This research explores techniques in language AI.

Evaluating the factuality of long-form output generated by large language models remains challenging. We propose a comprehensive factuality evaluation framework that jointly measures precision and recall. Our method leverages external knowledge sources to construct reference facts and determine whether they are captured in generated text.

Abstract ↗ PDF ↗

🟢 Applied

SD-FSMIS: Adapting Stable Diffusion for Few-Shot Medical Image Segmentation

💡 This research introduces a new approach to computer vision.

Few-Shot Medical Image Segmentation (FSMIS) aims to segment novel object classes in medical images using only minimal annotated examples. Diffusion Models (DM) excel in visual tasks but their potential for FSMIS remains largely unexplored. We propose that the rich visual priors learned by large-scale DMs offer a powerful foundation for a more robust and data-efficient segmentation approach.

Abstract ↗ PDF ↗

🔬

Privacy-Preserving ML

🟢 Applied

Enhancing Robustness of Federated Learning via Server Learning

💡 This research enhances privacy-preserving AI.

This paper explores the use of server learning for enhancing the robustness of federated learning against malicious attacks. We propose a heuristic algorithm that uses server learning and client update filtering in combination with geometric median aggregation.
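To illustrate why geometric median aggregation helps against malicious clients, the sketch below approximates the geometric median with Weiszfeld's iteration and compares it with a plain mean. It covers only the aggregation step; server learning and client update filtering are not shown, and the function name, toy updates, and iteration settings are assumptions rather than the paper's algorithm.

```python
import math

def geometric_median(points, iters=100, eps=1e-8):
    """Approximate the geometric median with Weiszfeld's iteration."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    estimate = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iters):
        # Weight each point by the inverse of its distance to the estimate,
        # clamping the distance to avoid division by zero.
        weights = [1.0 / max(math.dist(p, estimate), eps) for p in points]
        total = sum(weights)
        new_estimate = [
            sum(w * p[i] for w, p in zip(weights, points)) / total
            for i in range(dim)
        ]
        if math.dist(new_estimate, estimate) < eps:
            return new_estimate
        estimate = new_estimate
    return estimate

# Four honest client updates near (1, 1) and one malicious outlier.
updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.05], [50.0, -50.0]]
mean = [sum(u[i] for u in updates) / len(updates) for i in range(2)]
median = geometric_median(updates)
print("mean:", mean)
print("geometric median:", median)
```

The single outlier drags the mean far from the honest cluster, while the geometric median stays close to it.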

Abstract ↗ PDF ↗

🟡 Advanced

On Data-Driven Koopman Representations of Nonlinear Delay Differential Equations

💡 This research explores techniques in machine learning.

This work establishes a rigorous bridge between infinite-dimensional delay dynamics and finite-dimensional Koopman learning, with explicit and interpretable error guarantees.

Abstract ↗ PDF ↗

🟢 Applied

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis

💡 This research explores techniques in language AI.

Agent Skills is an emerging open standard that defines a modular, filesystem-based packaging format enabling LLM-based agents to acquire domain-specific expertise on demand. Despite rapid adoption across multiple agentic platforms, the security properties of Agent Skills have not been studied.

Abstract ↗ PDF ↗

🟢 Applied

Open Challenges for Secure and Scalable Wi-Fi Connectivity in Rural Areas

💡 This research explores techniques in machine learning.

Pay-for-use Wi-Fi hotspots are emerging as a scalable solution to provide affordable Internet access in underserved and rural regions. Despite their growing adoption, their security properties remain largely unexplored.

Abstract ↗ PDF ↗

🟢 Applied

FedSQ: Optimized Weight Averaging via Fixed Gating

💡 This research stabilizes federated fine-tuning with fixed gating masks.

Federated learning (FL) enables collaborative training across organizations without sharing raw data. FedSQ freezes a structural copy of the pretrained model to induce fixed binary gating masks during fine-tuning. Fixing the gating reduces learning to within-regime affine refinements, which stabilizes aggregation under heterogeneous partitions.

Abstract ↗ PDF ↗

🟢 Applied

Explainable Machine Learning Reveals 12-Fold Ucp1 Upregulation and Thermogenic Reprogramming in Female Mouse White Adipose Tissue After 37 Days of Microgravity: First AI/ML Analysis of NASA OSD-970

💡 This research presents techniques for machine learning.

This paper presents the first machine learning (ML) analysis of NASA Open Science Data Repository (OSDR) dataset OSD-970, derived from the Rodent Research-1 (RR-1) mission. The analysis uses RT-qPCR data from 89 adipogenesis and thermogenesis pathway genes in gonadal WAT of 16 female C57BL/6J mice.

Abstract ↗ PDF ↗

🟢 Applied

STDDN: A Physics-Guided Deep Learning Framework for Crowd Simulation

💡 This research explores techniques in machine learning.

Accurate crowd simulation is crucial for public safety management, emergency evacuation planning, and intelligent transportation systems. The proposed STDDN method has demonstrated significantly superior simulation performance compared to state-of-the-art methods on long-term tasks.

Abstract ↗ PDF ↗

🟢 Applied

Analytic Drift Resister for Non-Exemplar Continual Graph Learning

💡 This research presents techniques for privacy-preserving AI.

Non-Exemplar Continual Graph Learning (NECGL) seeks to eliminate the privacy risks intrinsic to rehearsal-based paradigms. We propose Analytic Drift Resister (ADR), a novel and theoretically grounded NECGL framework. ADR exploits iterative backpropagation to break free from the frozen pre-trained constraint, adapting to evolving task graph distributions and fortifying model plasticity.

Abstract ↗ PDF ↗

🟢 Applied

Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents

💡 This research explores techniques in language AI.

Environment-injected Trajectory-based Agent Memory Poisoning (eTAMP) is the first attack to achieve cross-session, cross-site compromise without requiring direct memory access. eTAMP achieves substantial attack success rates: up to 32.5% on GPT-5.2, 23.4%.

Abstract ↗ PDF ↗

🟢 Applied

A Tsetlin Machine-driven Intrusion Detection System for Next-Generation IoMT Security

💡 This research explores techniques in edge computing.

The Internet of Medical Things (IoMT) is transforming healthcare by enabling seamless connectivity among medical devices, systems, and services. This paper proposes a novel Tsetlin Machine (TM)-based Intrusion Detection System. The TM is a rule-based and interpretable machine learning (ML) approach that models attack patterns using propositional logic.

Abstract ↗ PDF ↗

🟢 Applied

PR3DICTR: A modular AI framework for medical 3D image-based detection and outcome prediction

💡 This research provides a modular framework for 3D medical image classification.

PR3DICTR (Platform for Research in 3D Image Classification and sTandardised tRaining) is built on community-standard distributions (PyTorch and MONAI) using modular design principles and standardization.

Abstract ↗ PDF ↗

🟢 Applied

Real-Time Surrogate Modeling for Personalized Blood Flow Prediction and Hemodynamic Analysis

💡 This research builds real-time surrogate models for blood flow prediction.

Cardiovascular modeling has rapidly advanced over the past few decades due to the rising need for health tracking and early detection of cardiovascular diseases. Certain hemodynamic parameters, such as terminal resistance and compliance, are difficult to estimate clinically and often yield non-physiological hemodynamics when sampled naively. We present a systematic framework for training machine learning (ML) models capable of instantaneous hemodynamic prediction and parameter estimation.

Abstract ↗ PDF ↗

🟢 Applied

Gradient Boosting within a Single Attention Layer

💡 This research explores techniques in machine learning.

Transformer attention computes a single softmax-weighted average over values, a one-pass estimate that cannot correct its own errors. We show that a single Hopfield-style update erases all query information orthogonal to the stored-pattern subspace. Further iteration under local contraction can collapse distinct queries in the same region to the same fixed point.
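For readers unfamiliar with the "single softmax-weighted average" that the summary starts from, here is a minimal sketch of standard single-query scaled dot-product attention. It shows only this baseline one-pass computation, not the paper's boosting method; the function names and toy vectors are illustrative.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Single-query scaled dot-product attention: one weighted average of values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Toy keys and values; the query is aligned with the first key.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
output = attend([2.0, 0.0], keys, values)
print(output)
```

The output is a convex combination of the value vectors computed in one pass, with no later step that could revise the weights.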

Abstract ↗ PDF ↗

🟢 Applied

Reflective Context Learning: Studying the Optimization Primitives of Context Space

💡 This research explores techniques in machine learning.

Reflective Context Learning (RCL) is a unified framework for agents that learn through repeated interaction, reflection on behavior and failure modes, and iterative updates to context. In RCL, reflection converts trajectories and current context into a directional update signal analogous to gradients.

Abstract ↗ PDF ↗

🟢 Applied

Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models

💡 This research improves language AI.

The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption of RL for post-training Multimodal Large Language Models to enhance their visual reasoning capabilities. The Hallucination-as-Cue Framework introduces hallucination-inductive, modality-specific corruptions that remove or replace essential information required to derive correct answers.

Abstract ↗ PDF ↗

🟢 Applied

HyperFitS -- Hypernetwork Fitting Spectra for metabolic quantification of ${}^1$H MR spectroscopic imaging

💡 This research explores techniques in machine learning.

Proton magnetic resonance spectroscopic imaging ($^1$H MRSI) enables mapping of whole-brain metabolite concentrations in vivo. Metabolite maps of human subjects acquired at 3T and 7T with isotropic resolutions of 10 mm, 3.4 mm and 2 mm were quantified with HyperFitS and compared to conventional LCModel fitting. Results show substantial agreement between the new and gold-standard methods, with significantly faster fitting times by HyperFit

Abstract ↗ PDF ↗

🔴 Theory-Heavy

Characterization of Gaussian Universality Breakdown in High-Dimensional Empirical Risk Minimization

💡 This research explores techniques in machine learning.

We study high-dimensional convex empirical risk minimization (ERM) under general non-Gaussian data designs. By extending the Convex Gaussian Min-Max Theorem, we derive an asymptotic min-max characterization of key statistics. This result clarifies the scope and limits of Gaussian universality for ERMs.

Abstract ↗ PDF ↗

🟢 Applied

A Systematic Security Evaluation of OpenClaw and Its Variants

💡 This research presents techniques for language AI.

Tool-augmented AI agents substantially extend the practical capabilities of large language models, but they also introduce security risks that cannot be identified through model-only evaluation. We present a systematic security assessment of six representative OpenClaw-series agent frameworks.

Abstract ↗ PDF ↗

🟡 Advanced

Self-Distilled RLVR

💡 This research explores techniques in language AI.

On-policy distillation (OPD) has become a popular training paradigm in the LLM community. This paper demonstrates that learning signals solely derived from the privileged teacher result in severe information leakage and unstable long-term training. We identify the optimal niche for self-distillation and propose RLSD.

Abstract ↗ PDF ↗

🟢 Applied

An Independent Safety Evaluation of Kimi K2.5

💡 This research explores techniques in language AI.

Kimi K2.5 is an open-weight LLM that rivals closed models across coding, multimodal, and agentic benchmarks, but was released without an accompanying safety evaluation. We evaluate the model for CBRNE misuse risk, cybersecurity risk, misalignment, political censorship, bias, and harmlessness, in both agentic and non-agentic settings.

Abstract ↗ PDF ↗

🟢 Applied

AlertStar: Path-Aware Alert Prediction on Hyper-Relational Knowledge Graphs

💡 This research predicts network alerts with knowledge graphs.

Cyber-attacks continue to grow in scale and sophistication, yet existing network intrusion detection approaches lack the semantic depth required for path reasoning over attacker-victim interactions. We address this by first modelling network alerts as a knowledge graph, then formulating hyper-relational alert prediction.

Abstract ↗ PDF ↗

🟢 Applied

Co-Evolution of Policy and Internal Reward for Language Agents

💡 This research tackles the problem of language AI.

Large language model (LLM) agents learn by interacting with environments. Long-horizon training remains bottlenecked by sparse and delayed rewards. We propose Self-Guide, a self-generated internal reward for language agents that supports both inference-time guidance and training-time supervision.

Abstract ↗ PDF ↗

🟢 Applied

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

💡 This research explores techniques in language AI.

LLM-based coding agents extend their capabilities via third-party agent skills distributed through open marketplaces without mandatory security review. Unlike traditional packages, these skills are executed as operational directives with system-level privileges.

Abstract ↗ PDF ↗

🟢 Applied

Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

💡 This research presents techniques for language AI.

Third-party skills extend LLM agents with powerful capabilities but often handle sensitive credentials in privileged environments. We present the first large-scale empirical study of this problem, analyzing 17,022 skills (sampled from 170,226 on SkillsMP). We identify 520 vulnerable skills with 1,708 issues and derive a taxonomy of 10 leakage patterns.
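The counts in this summary can be turned into rates with a short calculation. The sketch below uses only the four numbers quoted above; the variable names are illustrative, and the derived rates are simple ratios rather than figures reported by the authors.

```python
# Counts quoted in the summary.
total_skills = 170_226
sampled_skills = 17_022
vulnerable_skills = 520
issues = 1_708

# Share of the marketplace that was analyzed.
sampling_rate = sampled_skills / total_skills
# Share of analyzed skills flagged as vulnerable.
vulnerable_rate = vulnerable_skills / sampled_skills
# Average number of issues per vulnerable skill.
issues_per_vulnerable_skill = issues / vulnerable_skills

print(f"sampling rate: {sampling_rate:.4f}")
print(f"vulnerable rate: {vulnerable_rate:.4f}")
print(f"issues per vulnerable skill: {issues_per_vulnerable_skill:.2f}")
```

Roughly one in ten skills was sampled, about 3% of those were flagged, and each flagged skill carried a little over three issues on average.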

Abstract ↗ PDF ↗

🟢 Applied

Analyzing Healthcare Interoperability Vulnerabilities: Formal Modeling and Graph-Theoretic Approach

💡 This research explores techniques in edge computing.

HL7 FHIR allows concurrent access to a set of shared patient resources, i.e., EHR systems, pharmacy systems, lab systems, and devices. The FRAG model is implemented as a three-pass graph traversal detection algorithm and tested against a time-window-based baseline on 1,500 synthetic transaction logs.

Abstract ↗ PDF ↗

🔬

Creative AI / Emotion

🟢 Applied

MECO: A Multimodal Dataset for Emotion and Cognitive Understanding in Older Adults

💡 This research introduces a multimodal dataset for emotion and cognitive understanding in older adults.

MECO includes 42 participants and provides approximately 38 hours of multimodal signals, yielding 30,592 synchronized samples. The modalities cover video, audio, electroencephalography (EEG), and electrocardiography (ECG). In addition, the dataset offers comprehensive annotations of emotional and cognitive states.

Abstract ↗ PDF ↗

🟢 Applied

If It's Good Enough for You, It's Good Enough for Me: Transferability of Audio Sufficiencies across Models

💡 This research proposes a method for speech processing.

Transferability analysis finds that transferability rates vary depending on the task. Some models, particularly on deepfake detection, show different transferability behavior. We call these 'flat-earther' models.

Abstract ↗ PDF ↗

🟢 Applied

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

💡 This research presents techniques for language AI.

We present a method to identify a valence-arousal (VA) subspace within large language model representations. Projections along our recovered VA subspace correlate with human-crowdsourced VA ratings across 44k lexical items. Steering along these directions induces near-monotonic bidirectional control over refusal and sycophancy.

Abstract ↗ PDF ↗

🟢 Applied

Same Feedback, Different Source: How AI vs. Human Feedback Attribution and Credibility Shape Learner Behavior in Computing Education

💡 This research explores techniques in language AI.

AI systems increasingly take on instructional roles: providing feedback, guiding practice, and evaluating work. Does it matter to learners who they believe is on the other side? We investigated this using a three-condition experiment (N=148) in which participants completed a creative coding tutorial and received feedback generated by the same large language model attributed to either an AI system or a human teaching assistant.

Abstract ↗ PDF ↗

🟢 Applied

User-Aware Conditional Generative Total Correlation Learning for Multi-Modal Recommendation

💡 This research improves multi-modal recommendation.

Multi-modal recommendation (MMR) enriches item representations by introducing item content, e.g., visual and textual descriptions. Success hinges on aligning these content modalities with user preferences derived from interaction data.

Abstract ↗ PDF ↗

🟢 Applied

Split and Conquer Partial Deepfake Speech

💡 This research proposes a method for speech processing.

Partial deepfake speech detection requires identifying manipulated regions that may occur within short temporal portions of otherwise bona fide utterances. We propose a split-and-conquer framework that decomposes the problem into two stages: boundary detection and segment-level classification. This formulation simplifies the learning objective by separating temporal localization from authenticity assessment.

Abstract ↗ PDF ↗

🟢 Applied

Generative AI Use in Professional Graduate Thesis Writing: Adoption, Perceived Outcomes, and the Role of a Research-Specialized Agent

💡 This research surveys generative AI use in graduate thesis writing.

This paper reports a survey of generative AI use among 83 MBA thesis students in Japan. Of these, 95.2% reported at least some use and 77.1% reported heavy use. Students engaged AI across the full research-writing workflow: literature review, drafting, and consultation when stuck.

Abstract ↗ PDF ↗

🟢 Applied

Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models

💡 This research explores techniques in language AI.

Chart-RL is a novel reinforcement learning framework that enhances VLMs' chart understanding through feedback-driven policy optimization of visual perception and logical inference. The RL fine-tuned Qwen3-VL-4B-Instruct model achieved an answer accuracy of 0.634.

Abstract ↗ PDF ↗

🟢 Applied

Toward an Artificial General Teacher: Procedural Geometry Data Generation and Visual Grounding with Vision-Language Models

💡 This research explores techniques in language AI.

We study visual explanation in geometry education as a Referring Image Segmentation problem. We present a fully automated procedural data engine that generates over 200,000 synthetic geometry diagrams with pixel-perfect segmentation masks and linguistically diverse referring expressions. We propose domain-specific fine-tuning of vision-language models (VLMs).

Abstract ↗ PDF ↗

🟢 Applied

CharTool: Tool-Integrated Visual Reasoning for Chart Understanding

💡 This research presents techniques for language AI.

DuoChart combines synthesized charts with real-world charts to construct diverse, high-quality chart training data. CharTool-7B outperforms the base model by +8.0% on CharXiv (Reasoning) and +9.78% on ChartQAPro.

Abstract ↗ PDF ↗

🟢 Applied

Coupled Control, Structured Memory, and Verifiable Action in Agentic AI (SCRAT -- Stochastic Control with Retrieval and Auditable Trajectories): A Comparative Perspective from Squirrel Locomotion and Scatter-Hoarding

💡 This research explores techniques in machine learning.

Squirrel ecology offers a sharp comparative case because arboreal locomotion, scatter-hoarding, and audience-sensitive caching couple all three demands (control, memory, and verifiable action) in one organism. We introduce a minimal hierarchical partially observed control model with latent dynamics and structured episodic memory.

Abstract ↗ PDF ↗

🟢 Applied

InCoder-32B-Thinking: Industrial Code World Model for Thinking

💡 This research generates reasoning traces for industrial code.

Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reasoning traces showing how engineers reason about hardware constraints and timing semantics. We propose InCoder-32B-Thinking, trained on data from the Error-driven Chain-of-Thought (ECoT) synthesis framework with an industrial code world model (ICWM) to generate reasoning traces.

Abstract ↗ PDF ↗

🟢 Applied

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

💡 This research explores techniques in language AI.

Multi-speaker scenarios remain challenging due to overlapping speech, backchannels, rapid turn-taking, and context window constraints. We propose Speaker-Reasoner, an end-to-end Speech LLM with agentic multi-turn temporal reasoning.

Abstract ↗ PDF ↗

🟢 Applied

Comparing the Impact of Pedagogy-Informed Custom and General-Purpose GAI Chatbots on Students' Science Problem-Solving Processes and Performance Using Heterogeneous Interaction Network Analysis

💡 This research compares custom and general-purpose generative AI chatbots in science education.

Problem solving plays an essential role in science education. Generative AI (GAI) chatbots have emerged as a promising tool for supporting students' science problem solving. However, general-purpose chatbots (e.g., ChatGPT) often provide direct, ready-made answers, which may lead to cognitive offloading.

Abstract ↗ PDF ↗

🟢 Applied

R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning

💡 This research improves language AI.

R2-Write is an automated framework that synthesizes high-quality thinking trajectories enriched with explicit reflection and revision patterns. To prevent redundant reflections, we design a process reward mechanism that supervises reflection quality during reinforcement learning.

Abstract ↗ PDF ↗

🟢 Applied

Learning from Synthetic Data via Provenance-Based Input Gradient Guidance

💡 This research improves computer vision.

Learning methods using synthetic data have attracted attention as an effective approach for increasing the diversity of training data while reducing collection costs . Many existing methods improve robustness only indirectly through diversification of training samples and do not explicitly teach the model which regions in the input space truly contribute to discrimination .

Abstract ↗ PDF ↗

🟢 Applied

Council Mode: Mitigating Hallucination and Bias in LLMs via Multi-Agent Consensus

💡 This research achieves better language AI.

Council Mode is a novel multi-agent consensus framework. It dispatches queries to multiple heterogeneous frontier LLMs in parallel and synthesizes their outputs through a dedicated consensus model. The Council pipeline operates in three phases: (1) an intelligent triage classifier that routes queries based on complexity, (2) parallel expert generation across architecturally diverse models, and (3) structured consensus synthesis.

Abstract ↗ PDF ↗

🟢 Applied

SentiAvatar: Towards Expressive and Interactive Digital Humans

💡 This research achieves better speech processing.

We present SentiAvatar, a framework for building expressive interactive 3D digital humans, and use it to create SuSu, a virtual character that speaks, gestures, and emotes in real time . The source code, model, and dataset are available at https://sentiavatar.io .

Abstract ↗ PDF ↗

🟢 Applied

High-resolution probabilistic estimation of three-dimensional regional ocean dynamics from sparse surface observations

💡 This research presents techniques for machine learning.

The ocean interior regulates Earth's climate but remains sparsely observed due to limited in situ measurements . We present a depth-aware generative framework for reconstructing high-resolution ocean states from extremely sparse surface data . The framework accurately reconstructs subsurface temperature, salinity, and velocity fields across multiple depths .

Abstract ↗ PDF ↗

🟢 Applied

ESL-Bench: An Event-Driven Synthetic Longitudinal Benchmark for Health Agents

💡 This research explores techniques in language AI.

Longitudinal health agents must reason across multi-source trajectories that combine continuous device streams, sparse clinical exams, and episodic life events. We present ESL-Bench, an event-driven synthesis framework and benchmark providing 100 synthetic users. The users are paired with 100 evaluation queries across five dimensions (Lookup, Trend, Comparison, Anomaly, Explanation), stratified into Easy, Medium, and Hard tiers.

Abstract ↗ PDF ↗

🟢 Applied

NavCrafter: Exploring 3D Scenes from a Single Image

💡 This research introduces a new approach to computer vision.

NavCrafter explores 3D scenes from a single image by synthesizing novel-view video sequences with camera controllability and temporal-spatial consistency . The framework leverages video diffusion models to capture rich 3D priors and adopts a geometry-aware expansion strategy to progressively extend scene coverage . We further propose a collision-aware camera trajectory planner and enhanced 3D Gaussian Splatting pipeline .

Abstract ↗ PDF ↗

🟢 Applied

ChatSVA: Bridging SVA Generation for Hardware Verification via Task-Specific LLMs

💡 This research enhances language AI.

ChatSVA is an end-to-end SystemVerilog Assertion (SVA) generation system built upon a multi-agent framework. The AgentBridge platform enables this approach by systematically generating high-purity datasets, overcoming the data scarcity inherent to few-shot scenarios. The online service has been publicly released.

Abstract ↗ PDF ↗

🟢 Applied

Help Converts Newcomers, Not Veterans: Generalized Reciprocity and Platform Engagement on Stack Overflow

💡 This research studies how receiving help affects user engagement on Stack Overflow.

Generalized reciprocity -- the tendency to help others after receiving help oneself -- is widely theorized as a mechanism sustaining cooperation on online knowledge-sharing platforms . Yet robust empirical evidence from field settings remains surprisingly scarce . Using Cox proportional hazards models on over 21 million questions, we find that receiving an answer significantly increases a user's propensity to help other users . This effect is concentrated among newcomers and declines with platform experience .

Abstract ↗ PDF ↗

🟢 Applied

Domain-Adapted Retrieval for In-Context Annotation of Pedagogical Dialogue Acts

💡 This research presents techniques for language AI.

We present a domain-adapted RAG pipeline for tutoring move annotation . We adapt retrieval by fine-tuning a lightweight embedding model on tutoring corpora and indexing dialogues at the utterance level to retrieve labeled demonstrations . Retrieval corrects systematic label biases present in zero-shot prompting .

Abstract ↗ PDF ↗

🟢 Applied

A Data-Centric Vision Transformer Baseline for SAR Sea Ice Classification

💡 This research classifies sea ice in SAR imagery with a Vision Transformer.

Synthetic Aperture Radar (SAR) is the operational standard because of its all-weather capability, but it remains challenging to distinguish morphologically similar ice classes under severe class imbalance. This paper establishes a trustworthy SAR-only baseline that future fusion work can build upon.

Abstract ↗ PDF ↗

🔬

Lightweight Systems

🟢 Applied

MSAO: Adaptive Modality Sparsity-Aware Offloading with Edge-Cloud Collaboration for Efficient Multimodal LLM Inference

💡 This research proposes a method for language AI.

Multimodal large language models (MLLMs) enable powerful cross-modal reasoning capabilities but impose substantial computational and latency burdens. We propose MSAO, an adaptive modality sparsity-aware offloading framework with edge-cloud collaboration for efficient MLLM inference.

Abstract ↗ PDF ↗

🟢 Applied

Digital Twin-Assisted In-Network and Edge Collaboration for Joint User Association, Task Offloading, and Resource Allocation in the Metaverse

💡 This research makes edge computing for the metaverse more efficient.

Advancements in extended reality (XR) are driving the development of the metaverse, which demands efficient real-time transformation of 2D scenes into 3D objects. We propose a digital twin (DT)-based in-network computing (INC)-assisted multi-access edge computing (MEC) framework. In this framework, a network operator manages wireless and computational resources for XR user devices (XUDs). XUDs autonomously offload tasks to maximize their utilities.

Abstract ↗ PDF ↗

🟢 Applied

Deep Learning-Based Anomaly Detection in Spacecraft Telemetry on Edge Devices

💡 This research presents techniques for computer vision.

This paper investigates three approaches for spacecraft telemetry anomaly detection -- forecasting & threshold, direct classification, and image classification -- and optimizes them for edge deployment using multi-objective neural architecture optimization . Analysis of deployment viability shows our optimized models require just 0.36-6.25% of CubeSat RAM .

Abstract ↗ PDF ↗

🟢 Applied

Storing Less, Finding More: How Novelty Filtering Improves Cross-Modal Retrieval on Edge Cameras

💡 This research introduces a new approach to language AI.

An on-device epsilon-net filter retains only semantically novel frames, building a denoised embedding index. A single-pass streaming filter outperforms offline alternatives (k-means, farthest-point, uniform, random) across eight vision-language models (8M-632M) on two egocentric datasets (AEA, EPIC-KITCHENS). Combined, the architecture reaches 45.6% Hit@5 on held-out

Abstract ↗ PDF ↗
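
As a rough illustration of the novelty filter described in the entry above, the sketch below keeps a frame embedding only when it lies farther than epsilon from every embedding already kept, in a single streaming pass. The cosine distance, the function names, the threshold value, and the toy 2-D vectors are all assumptions made for illustration; the summary does not state the paper's metric or parameters.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def novelty_filter(embeddings, epsilon):
    """Single pass: keep a frame only if it is farther than epsilon
    from every frame kept so far (hypothetical helper, not the paper's code)."""
    kept_indices = []
    kept_vectors = []
    for index, vector in enumerate(embeddings):
        if all(cosine_distance(vector, k) > epsilon for k in kept_vectors):
            kept_indices.append(index)
            kept_vectors.append(vector)
    return kept_indices


# Toy embeddings: frames 1 and 3 are near-duplicates of frames 0 and 2.
frames = [
    [1.0, 0.0],
    [0.99, 0.05],
    [0.0, 1.0],
    [0.05, 0.99],
    [-1.0, 0.0],
]
kept = novelty_filter(frames, epsilon=0.1)
print(kept)
```

The kept list is the index that would be stored on the device; a larger epsilon stores fewer frames at the cost of coarser coverage.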

🟢 Applied

Downsides of Smartness Across Edge-Cloud Continuum in Modern Industry

💡 This research examines the downsides of AI-driven smartness across the industrial edge-cloud continuum.

The fast pace of modern AI is rapidly transforming traditional industrial systems into vast, intelligent and potentially unmanned autonomous operational environments driven by AI-based solutions . These solutions leverage various forms of machine learning, reinforcement learning, and generative AI . These smart capabilities have pushed the envelope in multiple industrial domains, enabling predictive maintenance, optimized performance, and streamlined workflows .

Abstract ↗ PDF ↗

🟢 Applied

Physical Design of UET-RVMCU: A Streamlined Open-Source RISC-V Microcontroller

💡 This research streamlines a RISC-V SoC into a microcontroller for embedded systems.

UET-RVMCU is a lightweight RISC-V microcontroller derived from the UETRV-PCore . The project demonstrates the feasibility of transforming an application-class SoC into a feature-rich microcontroller suitable for embedded systems .

Abstract ↗ PDF ↗

🟢 Applied

EEspice: A Modular Circuit Simulation Platform with Parallel Device Model Evaluation via Graph Coloring

💡 This research parallelizes device model evaluation in circuit simulation.

EEspice is an open-source circuit simulation framework whose modular architecture decouples device model evaluation into independently replaceable kernels . It partitions MOSFET instances into independent color groups, which can be processed in parallel .

Abstract ↗ PDF ↗
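
To make the color-group idea in the EEspice entry above concrete, here is a minimal greedy graph-coloring sketch. It assumes that two devices conflict when they touch a common circuit node, so devices in the same color group share no node and could be evaluated in parallel. The conflict rule, the function names, and the toy netlist are assumptions for illustration, not the simulator's actual data structures.

```python
def color_devices(device_nodes):
    """Greedy coloring: devices that share a circuit node get different colors."""
    # Map each node to the set of devices connected to it.
    by_node = {}
    for device, nodes in device_nodes.items():
        for node in nodes:
            by_node.setdefault(node, set()).add(device)

    # Two devices conflict if they appear on the same node (assumed rule).
    conflicts = {device: set() for device in device_nodes}
    for devices in by_node.values():
        for device in devices:
            conflicts[device] |= devices - {device}

    # Assign each device the smallest color unused by its colored neighbors.
    colors = {}
    for device in sorted(device_nodes):
        used = {colors[n] for n in conflicts[device] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[device] = color
    return colors


def group_by_color(colors):
    """Collect devices into lists keyed by color."""
    groups = {}
    for device, color in colors.items():
        groups.setdefault(color, []).append(device)
    return groups


# Hypothetical netlist: M1 and M2 share nodes, M3 is isolated from both.
netlist = {
    "M1": {"vdd", "out", "in"},
    "M2": {"out", "gnd", "in"},
    "M3": {"n1", "n2", "n3"},
}
colors = color_devices(netlist)
groups = group_by_color(colors)
print(groups)
```

Each group can then be handed to a parallel worker pool, one group at a time, without two workers writing to the same node.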

🟢 Applied

Taming the Exponential: A Fast Softmax Surrogate for Integer-Native Edge Inference

💡 This research speeds up inference in edge computing.

Softmax can become a computational bottleneck in the Transformer model's Multi-Head Attention (MHA) block . Head-Calibrated Clipped-Linear Softmax (HCCS) is a bounded, monotone surrogate to the exponential softmax function . HCCS uses a clipped linear mapping of the max centered attention logits .

Abstract ↗ PDF ↗
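
The entry above describes HCCS only as a clipped linear mapping of max-centered attention logits. The floating-point sketch below shows one way such a surrogate could look: subtract the row maximum, apply a linear ramp clipped to [0, 1], then normalize. The ramp form, the slope constant (standing in for a per-head calibration value), and the function name are assumptions; the paper's exact formula and its integer arithmetic are not given in the summary.

```python
def clipped_linear_softmax(logits, slope=0.25):
    """Softmax-like weights from a clipped linear map of max-centered logits.

    `slope` is a hypothetical per-head calibration constant.
    """
    top = max(logits)
    # Max-centering makes every shifted logit <= 0, so the largest maps to 1.
    scores = [min(1.0, max(0.0, 1.0 + slope * (x - top))) for x in logits]
    total = sum(scores)  # always >= 1 because the max logit scores exactly 1
    return [s / total for s in scores]


weights = clipped_linear_softmax([2.0, 1.0, -6.0])
print(weights)
```

The output is bounded and order-preserving like softmax, but logits far below the maximum receive exactly zero weight, and no exponential is evaluated.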

🟢 Applied

A Practical Two-Stage Framework for GPU Resource and Power Prediction in Heterogeneous HPC Systems

💡 This research predicts GPU resource and power use in HPC systems.

Efficient utilization of GPU resources and power has become critical with the growing demand for GPUs in high-performance computing . VASP is a widely used materials science application on Perlmutter at NERSC, an HPE Cray EX system .

Abstract ↗ PDF ↗

🟢 Applied

RePart: Efficient Hypergraph Partitioning with Logic Replication Optimization for Multi-FPGA System

💡 This research explores techniques in machine learning.

Multi-FPGA systems (MFS) are widely adopted for VLSI emulation and rapid prototyping . In an MFS, FPGAs connect only to a limited number of neighbors through bandwidth-constrained links . RePart reduces total hop distance by 52.3% on average over state-of-the-art hypergraph partitioners .

Abstract ↗ PDF ↗

🟢 Applied

Causal Inference for Quantifying Noisy Neighbor Effects in Multi-Tenant Cloud Environments

💡 This research explores techniques in machine learning.

Resource sharing in multi-tenant cloud environments enables cost efficiency but introduces the Noisy Neighbor problem . Despite extensive research on detecting such effects, there are no explainable methodologies for quantifying the severity of impact and establishing causal relationships among tenants .

Abstract ↗ PDF ↗

🟢 Applied

HistMSO: A Logic for Reasoning about Consistency Models with MONA

💡 This research explores techniques in machine learning.

Reasoning about consistency models for replicated data systems is a challenging task that requires both a deep understanding of the consistency models themselves and substantial human input in mechanized verification approaches. We introduce HistMSO, a monadic second-order logic (MSO) for histories and abstract executions.

Abstract ↗ PDF ↗

🟢 Applied

Scalable Mean-Variance Portfolio Optimization via Subspace Embeddings and GPU-Friendly Nesterov-Accelerated Projected Gradient

💡 This research scales mean-variance portfolio optimization with GPU acceleration.

We develop a sketch-based factor reduction and a Nesterov-accelerated projected gradient algorithm (NPGA) with GPU acceleration . The method combines randomized subspace embedding, spectral truncation, and ridge stabilization to construct an effective factor $L_{eff}$. It then solves the resulting constrained problem with a structured projection computed by scalar dual search and matrix-vector kernels .

Abstract ↗ PDF ↗

🟢 Applied

Communication-free Sampling and 4D Hybrid Parallelism for Scalable Mini-batch GNN Training

💡 This research explores techniques in edge computing.

ScaleGNN is a 4D parallel framework for scalable mini-batch GNN training. It combines communication-free distributed sampling, 3D parallel matrix multiplication (PMM), and data parallelism. On Perlmutter, ScaleGNN achieves 3.5x end-to-end training speedup over the SOTA baseline on ogbn-products.

Abstract ↗ PDF ↗

🟢 Applied

Fast NF4 Dequantization Kernels for Large Language Model Inference

💡 This research speeds up NF4 dequantization for LLM inference.

Large language models (LLMs) have grown beyond the memory capacity of single GPU devices . While NF4 (4-bit NormalFloat) quantization enables 4$\times$ memory reduction, inference on current NVIDIA GPUs (e.g., Ampere A100) requires expensive dequantization back to FP16 format .

Abstract ↗ PDF ↗
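
For readers unfamiliar with what the dequantization step in the entry above involves, the sketch below unpacks two 4-bit codes per byte, looks each one up in a 16-entry table, and scales by a per-block absmax value. The table holds the commonly published NF4 levels rounded to four decimals, and the nibble order and function name are assumptions; this is a plain reference loop, not the paper's GPU kernel.

```python
# 16 NF4 code points, rounded to 4 decimals (assumed from the commonly
# published table; treat the exact values as illustrative).
NF4_LEVELS = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]


def dequantize_nf4(packed, absmax):
    """Unpack two 4-bit codes per byte and scale them by the block's absmax."""
    values = []
    for byte in packed:
        high = (byte >> 4) & 0x0F  # assumed: first value sits in the high nibble
        low = byte & 0x0F
        values.append(NF4_LEVELS[high] * absmax)
        values.append(NF4_LEVELS[low] * absmax)
    return values


# Two packed bytes hold four weights: codes 0, 15, 7, 0.
block = bytes([0x0F, 0x70])
out = dequantize_nf4(block, absmax=2.0)
print(out)
```

Every weight costs a shift, a mask, a table lookup, and a multiply before it can enter a matrix product, which is why this step is worth optimizing.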

🟢 Applied

Intelligent Cloud Orchestration: A Hybrid Predictive and Heuristic Framework for Cost Optimization

💡 This research optimizes cloud orchestration costs.

Cloud computing allows scalable resource provisioning, but dynamic workload changes often lead to higher costs due to over-provisioning . Machine learning (ML) approaches are effective for predicting workload patterns, but they can introduce delays during sudden traffic spikes . In contrast, mathematical heuristics like Game Theory provide fast and reliable scheduling decisions, but do not account for future workload changes .

Abstract ↗ PDF ↗

🟢 Applied

Optimization Opportunities for Cloud-Based Data Pipeline Infrastructures

💡 This research optimizes machine learning.

Cloud infrastructure supports efficient operation of data pipelines with respect to requirements like cost, speed, and resource utilization. The study contributes a theory of optimization goals such as minimizing cost, reducing execution time, and balancing cost-makespan trade-offs.

Abstract ↗ PDF ↗

🟢 Applied

FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models

💡 This research optimizes language AI.

Parameter-efficient fine-tuning (PEFT) has emerged as a crucial paradigm for adapting large language models under constrained computational budgets. But standard PEFT methods often struggle in multi-task fine-tuning settings, where diverse optimization objectives induce task interference and limited parameter budgets lead to representational deficiency. We propose FourierMoE, which integrates the MoE architecture with the inverse discrete Fourier transform (IDFT) for frequency-aware adaptation.

Abstract ↗ PDF ↗

🟢 Applied

ModTrans: Translating Real-world Models for Distributed Training Simulator

💡 This research explores techniques in machine learning.

Large-scale distributed training has been a research hot spot in machine learning systems for industry and academia in recent years . But conducting experiments without physical machines and corresponding resources is difficult . One solution is to leverage distributed training simulators, but current ones do not support importing real-world developed models .

Abstract ↗ PDF ↗

🟢 Applied

Highly-Parallel Atom-Detection Accelerator for Tweezer-Based Neutral Atom Quantum Computers

💡 This research accelerates atom detection for neutral atom quantum computers.

Neutral atom quantum computers (NAQCs) are among the most promising computational platforms for quantum computing . Controlling and measuring individual atoms and their states is typically the most time-consuming task during computation . To resolve this challenge, we propose a highly-parallel atom-detection accelerator for tweezer-based NAQCs .

Abstract ↗ PDF ↗

🟢 Applied

Fast Deterministic Distributed Degree Splitting

💡 This research explores techniques in edge computing.

We obtain better algorithms for computing more balanced orientations and degree splits in LOCAL. Important to our result is a connection to the hypergraph sinkless orientation problem [BMNSU, SODA'25].

Abstract ↗ PDF ↗

🟢 Applied

MPI-Q: A Message Communication Library for Large-Scale Classical-Quantum Heterogeneous Hybrid Distributed Computing

💡 This research explores techniques in edge computing.

The classical-quantum system heterogeneity renders existing distributed communication mechanisms (e.g., MPI and NCCL) inadequate. This bottleneck severely impairs operational synergy and programming efficiency. To address these challenges, this paper proposes a message-passing library tailored for large-scale quantum heterogeneous distributed computing.

Abstract ↗ PDF ↗

🟢 Applied

Reclaiming Idle CPU Cycles on Kubernetes: Sparse-Domain Multiplexing for Concurrent MPI-CFD Simulations

💡 This research reclaims idle CPU cycles for MPI simulations on Kubernetes.

When MPI-parallel simulations run on shared Kubernetes clusters, conventional CPU scheduling leaves the vast majority of provisioned cycles idle . This paper presents a multiplexing framework that reclaims this idle capacity by co-locating multiple simulations .

Abstract ↗ PDF ↗

🟢 Applied

TENT: A Declarative Slice Spraying Engine for Performant and Resilient Data Movement in Disaggregated LLM Serving

💡 This research explores techniques in language AI.

Modern GPU clusters are built upon a complex hierarchy of heterogeneous interconnects, ranging from multi-rail RDMA to proprietary fabrics such as Multi-Node NVLink and Ascend UB . Orchestrating these diverse links effectively remains a critical challenge in disaggregated LLM serving . We present TENT, a data-movement engine that decouples transfer intent from physical execution . In LLM inference with SGLang HiCache, TENT achieves up to 1.36

Abstract ↗ PDF ↗

🟢 Applied

Scalable AI-assisted Workflow Management for Detector Design Optimization Using Distributed Computing

💡 This research explores techniques in machine learning.

The Production and Distributed Analysis (PanDA) system was originally developed for the ATLAS experiment at the CERN Large Hadron Collider . PanDA supports AI/ML-driven workflows through a scalable and flexible workflow engine . We present an AI-assisted framework for detector design optimization .

Abstract ↗ PDF ↗

🔬

Offline-First / Local AI

🟡 Advanced

Product-Stability: Provable Convergence for Gradient Descent on the Edge of Stability

💡 This research proves convergence of gradient descent at the Edge of Stability.

Modern deep learning training often occurs at the Edge of Stability (EoS) where the sharpness of the loss exceeds the threshold below which classical convergence analysis applies . In this work, we introduce and study a structural property of loss functions that we term product-stability . We show that for losses with product-stable minima, gradient descent applied to objectives of the form $(x,y) \mapsto l(xy)$ can provably converge to the local minimum even

Abstract ↗ PDF ↗

🟢 Applied

Reinforcement Learning-based Knowledge Distillation with LLM-as-a-Judge

💡 This research improves language AI.

Reinforcement Learning (RL) has been shown to substantially improve the reasoning capability of small and large language models . We propose an RL framework that uses rewards from an LLM that acts as a judge evaluating model outputs over large amounts of unlabeled data .

Abstract ↗ PDF ↗

🟢 Applied

Towards Near-Real-Time Telemetry-Aware Routing with Neural Routing Algorithms

💡 This research makes more efficient machine learning.

Routing algorithms must be able to react to traffic bursts within milliseconds. LogGIA is a scalable graph neural routing algorithm that predicts log-space link weights from attributed topology-and-telemetry graphs. It uses a data-driven pre-training stage followed by on-policy reinforcement learning.

Abstract ↗ PDF ↗

🟢 Applied

Towards Realistic Class-Incremental Learning with Free-Flow Increments

💡 This research explores techniques in computer vision.

Class-incremental learning (CIL) is typically evaluated under predefined schedules with equal-sized tasks. We propose a model-agnostic framework for robust CIL under free-flow arrivals. We constrain distillation to replayed data and normalize the scale of contrastive and knowledge transfer losses.

Abstract ↗ PDF ↗

🟢 Applied

Understanding Latent Diffusability via Fisher Geometry

💡 This research explores techniques in machine learning.

Diffusion models often degrade when trained in latent spaces (e.g., VAEs) yet formal causes remain poorly understood . We quantify latent-space diffusability through the rate of change of the Minimum Mean Squared Error (MMSE) along the diffusion trajectory . We demonstrate that while global isometry ensures FI alignment, FIR is governed by the encoder's local geometric properties .

Abstract ↗ PDF ↗

🟢 Applied

AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models

💡 This research presents techniques for language AI.

AutoVerifier is an LLM-based agentic framework that automates end-to-end verification of technical claims without requiring domain expertise . It decomposes every technical assertion into structured claim triples of the form (Subject, Predicate, Object) and builds knowledge graphs that enable structured reasoning across six progressively enriching layers .

Abstract ↗ PDF ↗

🟢 Applied

Complex-Valued GNNs for Distributed Basis-Invariant Control of Planar Systems

💡 This research explores techniques in machine learning.

Graph neural networks (GNNs) are a well-regarded tool for learned control of networked dynamical systems due to their ability to be deployed in a distributed manner . Current distributed GNNs assume that all nodes in the network collect geometric observations in compatible bases . This paper presents a GNN parametrization that is globally invariant to choice of local basis .

Abstract ↗ PDF ↗

🟡 Advanced

Steerable but Not Decodable: Function Vectors Operate Beyond the Logit Lens

💡 This research explores techniques in language AI.

Function vectors (FVs) can steer large language model behavior when added to the residual stream . We hypothesized that FV steering failures reflect an absence of task-relevant information: logit lens would fail alongside steering . We found that FVs succeed even when the logit lenses cannot decode the correct answer at any layer . This steerability-without-decodability pattern is universal .

Abstract ↗ PDF ↗

🟢 Applied

Communication-Efficient Distributed Learning with Differential Privacy

💡 This research reduces communication costs in privacy-preserving distributed learning.

We address nonconvex learning problems over undirected networks . We provide theoretical privacy guarantees within a differential privacy framework . We show the algorithm's superior performance on a classification task under the same privacy budget, compared with state-of-the-art methods .

Abstract ↗ PDF ↗

🟢 Applied

Financial Anomaly Detection for the Canadian Market

💡 This research explores techniques in machine learning.

In this work we evaluate the performance of three classes of methods for detecting financial anomalies . We apply these methods to the TSX-60 data to identify major financial stress events in the Canadian stock market . We show how neural network-based methods achieve the strongest performance .

Abstract ↗ PDF ↗

🟡 Advanced

Learning Contractive Integral Operators with Fredholm Integral Neural Operators

💡 This research proposes a method for machine learning.

We generalize the framework of Fredholm Neural Networks to learn non-expansive integral operators arising in Fredholm Integral Equations (FIEs) of the second kind in arbitrary dimensions. We also demonstrate how Fredholm Integral Neural Operators (FREDINOs) can be used to learn the solution operator of non-linear elliptic PDEs.

Abstract ↗ PDF ↗

🟡 Advanced

Generating DDPM-based Samples from Tilted Distributions

💡 This research explores techniques in machine learning.

Given $n$ independent samples from a $d$-dimensional probability distribution, our aim is to generate diffusion-based samples . We define a plug-in estimator and show that it is minimax-optimal . We develop Wasserstein bounds between the distribution of the model and the true distribution .

Abstract ↗ PDF ↗

🟡 Advanced

A semicontinuous relaxation of Saito's criterion and freeness as angular minimization

💡 This research applies reward-guided construction to free line arrangements.

We introduce a nonnegative functional on the space of line arrangements in $\mathbb{P}^2$ that vanishes precisely on free arrangements . The functional is obtained as a semicontinuous relaxation of Saito's criterion for freeness . Using this functional as a reward signal, we develop a sequential construction procedure in which lines are added one at a time so as to minimize the angular distance .

Abstract ↗ PDF ↗

🟢 Applied

Mitigating Reward Hacking in RLHF via Advantage Sign Robustness

💡 This research reduces reward hacking in RLHF.

Reward models (RMs) used in reinforcement learning from human feedback are vulnerable to reward hacking. We propose SignCertified Policy Optimization (SignCert-PO), which down-weights non-robust completions in the policy gradient update. We assume that reward hacking is often caused by flipped advantage signs.

Abstract ↗ PDF ↗

🟡 Advanced

Inversion-Free Natural Gradient Descent on Riemannian Manifolds

💡 This research optimizes machine learning.

The paper proposes an inversion-free stochastic natural gradient method for probability distributions whose parameters lie on a Riemannian manifold. The method maintains an online approximation of the inverse Fisher information matrix (FIM), which is efficiently updated at quadratic cost using score vectors sampled at successive iterates.

Abstract ↗ PDF ↗

🟡 Advanced

Efficient Logistic Regression with Mixture of Sigmoids

💡 This research explores techniques in machine learning.

We study the Exponential Weights (EW) algorithm with an isotropic Gaussian prior for online logistic regression. We show that EW can be both computationally tractable and geometrically adaptive in online classification.

Abstract ↗ PDF ↗

🟢 Applied

Extracting Money Laundering Transactions from Quasi-Temporal Graph Representation

💡 This research presents techniques for machine learning.

ExSTraQt (EXtract Suspicious TRAnsactions from Quasi-Temporal graph representation) is an advanced supervised learning approach to detect money laundering (or suspicious) transactions in financial datasets . We claim that our framework could seamlessly complement existing AML detection systems in banks .

Abstract ↗ PDF ↗

🟢 Applied

Rethinking Forward Processes for Score-Based Data Assimilation in High Dimensions

💡 This research improves score-based data assimilation in high dimensions.

Data assimilation is the process of estimating the time-evolving state of a dynamical system by integrating model predictions and noisy observations . Classical filters often struggle with accuracy or computational feasibility in high dimensions .

Abstract ↗ PDF ↗

🔴 Theory-Heavy

Lipschitz bounds for integral kernels

💡 This research explores techniques in machine learning.

Feature maps associated with positive definite kernels play a central role in kernel methods and learning theory . They are closely related to robustness and stability guarantees . In this paper, we study the Lipschitz regularity of feature maps under differentiability assumptions . We also study continuous and shift-invariant kernels such as Gaussian, Laplace, and Matérn kernels .

Abstract ↗ PDF ↗

🟢 Applied

Toward an Operational GNN-Based Multimesh Surrogate for Fast Flood Forecasting

💡 This research speeds up flood forecasting with a GNN surrogate.

AI-based surrogate models have shown strong potential in several areas of computational physics for accelerating otherwise expensive high-fidelity simulations . On the studied case, the learned surrogate produces 6-hour predictions in about $0.4\,\mathrm{s}$ on a single NVIDIA A100 GPU .

Abstract ↗ PDF ↗

🟢 Applied

Transfer Learning for Loan Recovery Prediction under Distribution Shifts with Heterogeneous Feature Spaces

💡 This research explores techniques in machine learning.

Transfer learning (TL) offers a promising avenue for loan recovery prediction by exploiting information from related but richer source domains. FT-MDN-Transformer is a mixture-density tabular Transformer architecture specifically designed for TL in recovery rate (RR) forecasting.

Abstract ↗ PDF ↗

🟢 Applied

Structure-Aware Commitment Reduction for Network-Constrained Unit Commitment with Solver-Preserving Guarantees

💡 This research reduces the computational burden of unit commitment in power systems.

The growing number of individual generating units, hybrid resources, and security constraints has significantly increased the computational burden of network-constrained unit commitment (UC). This paper proposes a solver-compatible dimensionality reduction framework for UC that exploits structural regularities in commitment decisions. The framework identifies a sparse subset of structurally stable commitment binaries to fix prior to optimization.

Abstract ↗ PDF ↗

🟢 Applied

Random Is Hard to Beat: Active Selection in online DPO with Modern LLMs

💡 This research optimizes language AI.

Active Preference Learning (APL) seeks to optimize query efficiency in online Direct Preference Optimization. Modern LLMs inherit strong priors from web-scale pretraining, which can limit the headroom of post-training data-selection strategies. APL fails to mitigate capability collapse or reduce variance significantly better than random sampling.

Abstract ↗ PDF ↗

🔴 Theory-Heavy

State estimations and noise identifications with intermittent corrupted observations via Bayesian variational inference

💡 This research explores techniques in machine learning.

This paper focuses on the state estimation problem in distributed sensor networks. Unlike existing adaptive Kalman filters (AKFs) that handle missing data and measurement outliers, the proposed VB-AKF adopts a dual-mask generative model with two independent Bernoulli random variables.

Abstract ↗ PDF ↗

🟡 Advanced

MOMO: Mars Orbital Model Foundation Model for Mars Orbital Applications

💡 This research presents techniques for computer vision.

MOMO uses model merging to integrate representations learned independently from three key Martian sensors (HiRISE, CTX, and THEMIS). Central to our method is our novel Equal Validation Loss (EVL) strategy.

Abstract ↗ PDF ↗