arXiv Research Digest

May 04, 2026 • 125 papers across 5 interests
🔬

Efficient ML / Edge AI

🟢 Applied

Make Your LVLM KV Cache More Lightweight

💡 This research enhances language AI.
The Key-Value cache has become a de facto component of modern Large Vision-Language Models (LVLMs) for inference. LightKV employs cross-modality message passing to aggregate informative messages across vision tokens and progressively compress them during prefill.
🟢 Applied

Quantum Gradient-Based Approach for Edge and Corner Detection Using Sobel Kernels

💡 This research applies quantum computing to edge and corner detection in images.
Edge detection refers to identifying points in digital images where intensity changes sharply, indicating object boundaries or structural features. Corners are locations where gray-level intensity changes abruptly in multiple directions and are widely used in feature extraction, object tracking and 3D modeling.
🟡 Advanced

Budget Constraints as Riemannian Manifolds

💡 This research explores techniques in language AI.
Assigning one of K options to each of N groups under a total cost budget is a recurring problem in machine learning. The objective (model loss) depends jointly on all assignments and does not decompose across groups. Evolutionary search evaluates the actual loss but lacks gradient information. We propose Riemannian Constrained Optimization (RCO), which augments a standard Adam update with tangent projection.
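The tangent-projection idea can be illustrated with a minimal sketch. It assumes a linear budget constraint (cost . x = budget) over relaxed, continuous assignment weights, which is a simplification chosen here for illustration and not necessarily the manifold used in the paper. The function names `adam_step` and `project_to_tangent`, and all numbers, are hypothetical.

```python
import math

def adam_step(grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam step; returns the raw update and the new moments."""
    m = [b1 * mi + (1 - b1) * g for mi, g in zip(m, grad)]
    v = [b2 * vi + (1 - b2) * g * g for vi, g in zip(v, grad)]
    m_hat = [mi / (1 - b1 ** t) for mi in m]
    v_hat = [vi / (1 - b2 ** t) for vi in v]
    step = [lr * mh / (math.sqrt(vh) + eps) for mh, vh in zip(m_hat, v_hat)]
    return step, m, v

def project_to_tangent(step, cost):
    """Remove the component of `step` along `cost`, so that cost . step == 0."""
    coef = sum(c * s for c, s in zip(cost, step)) / sum(c * c for c in cost)
    return [s - coef * c for s, c in zip(step, cost)]

# Illustrative values only (assumed, not from the paper).
cost = [1.0, 2.0, 4.0]
x = [0.5, 0.25, 0.25]
budget_before = sum(c * xi for c, xi in zip(cost, x))

grad = [0.3, -0.2, 0.1]
step, m, v = adam_step(grad, [0.0] * 3, [0.0] * 3, t=1)
x = [xi - s for xi, s in zip(x, project_to_tangent(step, cost))]
budget_after = sum(c * xi for c, xi in zip(cost, x))
```

Because the projected step is orthogonal to the cost vector, the update moves along the constraint surface and the total cost stays at its starting value up to floating-point error.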
🟢 Applied

Decouple before Integration: Test-time Synthesis of SFT and RLVR Task Vectors

💡 This research enhances language AI.
Decoupled Test-time Synthesis (DoTS) allows SFT and RLVR checkpoints to be trained independently and synthesizes their capabilities only at inference time via task vector arithmetic. DoTS uses selective sparsification with norm-preserving rescaling to reduce interference.
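A rough sketch of what task vector arithmetic with sparsification and norm-preserving rescaling could look like on flat parameter lists. The top-magnitude selection rule, the `keep_ratio`, `alpha`, and `beta` parameters, and all function names are assumptions made for illustration; the paper's actual selection and merging rules may differ.

```python
import math

def task_vector(finetuned, base):
    """Task vector: fine-tuned weights minus base weights."""
    return [f - b for f, b in zip(finetuned, base)]

def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))

def sparsify_keep_norm(vec, keep_ratio):
    """Keep the largest-magnitude entries, then rescale to the original L2 norm."""
    k = max(1, int(round(len(vec) * keep_ratio)))
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    sparse = [v if i in keep else 0.0 for i, v in enumerate(vec)]
    new_norm = l2_norm(sparse)
    scale = l2_norm(vec) / new_norm if new_norm > 0 else 0.0
    return [v * scale for v in sparse]

def synthesize(base, sft, rlvr, keep_ratio=0.5, alpha=1.0, beta=1.0):
    """Add both sparsified task vectors back onto the base weights."""
    tau_sft = sparsify_keep_norm(task_vector(sft, base), keep_ratio)
    tau_rlvr = sparsify_keep_norm(task_vector(rlvr, base), keep_ratio)
    return [b + alpha * s + beta * r for b, s, r in zip(base, tau_sft, tau_rlvr)]

# Toy weights (assumed values).
base = [0.0, 0.0, 0.0, 0.0]
sft = [3.0, 0.1, -4.0, 0.2]
rlvr = [0.1, 2.0, 0.2, -1.0]
merged = synthesize(base, sft, rlvr)
```

Rescaling after sparsification keeps each task vector's overall magnitude unchanged, so dropping small entries does not shrink its contribution to the merged model.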
🟡 Advanced

Randomized Subspace Nesterov Accelerated Gradient

💡 This research reduces the cost of first-order optimization.
Randomized-subspace methods reduce the cost of first-order optimization by using only low-dimensional projected-gradient information. The key technical ingredient is a three-sequence formulation tailored to matrix smoothness. The resulting theory establishes accelerated oracle-complexity guarantees.
🟢 Applied

FedKPer: Tackling Generalization and Personalization in Medical Federated Learning via Knowledge Personalization

💡 This research applies federated learning to medical applications.
Federated learning (FL) holds great potential for medical applications. However, statistical heterogeneity across healthcare institutions poses a major challenge for FL. We propose FedKPer, which introduces knowledge personalization into the training stage of each local device. Afterwards, generalization is considered via the global model aggregation process.
🟢 Applied

Learn where to Click from Yourself: On-Policy Self-Distillation for GUI Grounding

💡 This research achieves better computer vision.
Graphical User Interface (GUI) grounding maps natural language instructions to visual coordinates of target elements. Recent reinforcement learning methods (e.g., GRPO) have achieved strong performance, but they rely on expensive multiple rollouts and suffer from sparse signals on hard samples. On-policy self-distillation (OPSD) provides dense token-level supervision from a single rollout. In this paper, we present the first OPSD framework tailored for GUI grounding.
🟢 Applied

Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

💡 This research explores techniques in language AI.
Persistent Visual Memory (PVM) is a lightweight learnable module designed to ensure sustained, on-demand visual perception. PVM establishes a distance-agnostic retrieval pathway that directly provides visual embeddings for precise visual perception, thereby structurally mitigating the signal suppression inherent to deep generation.
🟢 Applied

When RAG Chatbots Expose Their Backend: An Anonymized Case Study of Privacy and Security Risks in Patient-Facing Medical AI

💡 This research examines data privacy in language AI.
Patient-facing medical chatbots based on retrieval-augmented generation (RAG) are increasingly promoted to deliver accessible, grounded health information. AI-assisted development lowers the barrier to building them, but they still demand rigorous security, privacy, and governance controls.
🟡 Advanced

Evaluating the Architectural Reasoning Capabilities of LLM Provers via the Obfuscated Natural Number Game

💡 This research achieves better language AI.
Large Language Models have achieved notable success on formal mathematics benchmarks such as MiniF2F. It remains unclear whether these results stem from genuine logical reasoning or semantic pattern matching against pre-training data. This paper identifies Architectural Reasoning as the necessary ability for future automated theorem discovery AI.
🟢 Applied

SC-Taxo: Hierarchical Taxonomy Generation under Semantic Consistency Constraints using Large Language Models

💡 This research makes language AI more efficient.
Scientific literature is expanding at an unprecedented pace, making it increasingly challenging to efficiently organize and access domain knowledge. A high-quality scientific taxonomy offers a structured and hierarchical representation of a research field, facilitating literature exploration and topic navigation. We propose a semantic-consistent taxonomy generation (SC-Taxo) framework that leverages large language models with hierarchy-aware refinement stages to ensure semantic consistency.
🟢 Applied

Faithful Extreme Image Rescaling with Learnable Reversible Transformation and Semantic Priors

💡 This research proposes a method for computer vision.
Most extreme rescaling methods struggle to preserve semantically consistent structures and produce realistic details. To address these problems, we propose FaithEIR, a diffusion-based framework for extreme image rescaling.
🟢 Applied

Quantum Interval Bound Propagation for Certified Training of Quantum Neural Networks

💡 This research makes machine learning more efficient.
Quantum machine learning is a promising field for efficiently learning features of a dataset to perform a specified task, such as classification. Interval bound propagation (IBP) is a popular certified training method in classical machine learning; quantum interval bound propagation (QIBP) carries it over to quantum neural networks.
🟢 Applied

Position: agentic AI orchestration should be Bayes-consistent

💡 This research explores techniques in language AI.
Bayesian decision theory provides a framework for agentic systems that can help to maintain beliefs over task-relevant latent quantities. This paper articulates practical properties for Bayesian control that fit modern agentic AI systems and human-AI collaboration.
🟢 Applied

EASE: Federated Multimodal Unlearning via Entanglement-Aware Anchor Closure

💡 This research addresses unlearning in federated multimodal learning.
Federated Multimodal Learning (FML) trains multimodal models across decentralized clients while keeping their image-text pairs private. But joint embedding training entangles forgotten knowledge across both modalities and client gradient subspaces, hindering federated unlearning. We present EASE, an Entanglement-Aware Subspace Excision framework that closes all three anchor channels under a unified design.
🟢 Applied

PhysEdit: Physically-Consistent Region-Aware Image Editing via Adaptive Spatio-Temporal Reasoning

💡 This research speeds up inference in computer vision.
PhysEdit introduces two inference-time modules that compose without retraining the backbone. PhysEdit delivers a 1.18x wall-clock speedup (64.3s vs. 76.1s per sample) on the full 737-case ImgEdit Basic-Edit Suite.
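The reported speedup follows directly from the two per-sample timings. The short check below uses only the figures quoted above; the total-time estimate assumes every one of the 737 cases takes the average time, which is an extrapolation made here and not a number from the paper.

```python
baseline_s = 76.1   # seconds per sample without PhysEdit (from the summary)
physedit_s = 64.3   # seconds per sample with PhysEdit (from the summary)
num_cases = 737     # size of the ImgEdit Basic-Edit Suite (from the summary)

speedup = baseline_s / physedit_s
saved_hours = (baseline_s - physedit_s) * num_cases / 3600  # assumes average timing per case

print(f"speedup: {speedup:.2f}x")
print(f"estimated time saved over the suite: {saved_hours:.1f} h")
```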
🟢 Applied

Adaptive Querying with AI Persona Priors

💡 This research explores techniques in language AI.
We study adaptive querying for learning user-dependent quantities of interest, such as responses to held-out items and psychometric indicators, within tight question budgets. We introduce a persona-induced latent variable model that represents a user's state through membership in a finite dictionary of AI personas. This yields expressive priors with closed-form posterior updates and efficient finite-mixture predictions.
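To see why a finite persona dictionary gives closed-form updates, consider a minimal sketch in which each persona assigns a probability to a "yes" answer on each item. The Bernoulli response model, the two toy personas, and the function names are assumptions for illustration only.

```python
def posterior_update(prior, likelihoods):
    """Bayes rule over a finite set of personas: posterior is prior times likelihood, normalized."""
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnorm)
    if z == 0:
        raise ValueError("observation has zero probability under every persona")
    return [u / z for u in unnorm]

def predict(posterior, persona_probs):
    """Finite-mixture prediction: posterior-weighted average of persona predictions."""
    return sum(w * p for w, p in zip(posterior, persona_probs))

# Rows: personas; columns: P(user answers "yes") for items 0..2 (toy values).
personas = [
    [0.9, 0.8, 0.7],
    [0.2, 0.3, 0.1],
]
prior = [0.5, 0.5]

# The user answers "yes" to item 0; update beliefs, then predict held-out item 2.
posterior = posterior_update(prior, [p[0] for p in personas])
pred_item2 = predict(posterior, [p[2] for p in personas])
```

Each observed answer costs one multiplication per persona, which is what makes asking the next question adaptively cheap under a tight budget.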
🟢 Applied

ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models

💡 This research explores techniques in language AI.
ML-Bench is a policy-grounded multilingual safety benchmark covering 14 languages. ML-Guard is a Diffusion Large Language Model (dLLM)-based guardrail model that supports multilingual judgment and policy-conditioned compliance assessment.
🟢 Applied

Static and Dynamic Graph Alignment Network for Temporal Video Grounding

💡 This research enhances computer vision.
Temporal Video Grounding (TVG) aims to localize temporal moments in an untrimmed video that semantically correspond to given natural language queries. Graph Convolutional Networks (GCNs) have been widely adopted in TVG to model temporal relations among video clips and enhance contextual reasoning by constructing clip-level graphs.
🟢 Applied

InpaintSLat: Inpainting Structured 3D Latents via Initial Noise Optimization

💡 This research optimizes machine learning.
We present a training-free approach for controllable 3D inpainting based on initial noise optimization. The underlying geometric structure is established during the early stages of the diffusion process and exhibits high sensitivity to the initial noise.
🟢 Applied

Affordance Agent Harness: Verification-Gated Skill Orchestration

💡 This research explores techniques in computer vision.
Affordance grounding requires identifying where and how an agent should interact in open-world scenes, where actionable regions are often small, occluded, reflective, and visually ambiguous. Recent systems combine multiple skills (e.g., detection, segmentation, interaction-imagination), yet most orchestrate them with fixed pipelines that are poorly matched to per-instance difficulty.
🟢 Applied

PEACE: Cross-modal Enhanced Pediatric-Adult ECG Alignment for Robust Pediatric Diagnosis

💡 This research enhances pediatric ECG diagnosis.
PEACE is a structured cross-modal alignment framework for adult-to-pediatric ECG transfer. PEACE integrates tri-axial clinical semantic decomposition, label-query feature extraction, and curriculum-gated optimization to align adult ECG representations with pediatric diagnostic targets. On ZZU-pECG, PEACE achieves 59.39%, 79.03%, and 90.89% AUC on the shared PTB-XL label space.
🟢 Applied

Learning Multimodal Energy-Based Model with Multimodal Variational Auto-Encoder via MCMC Revision

💡 This research generates new content with machine learning.
Energy-based models (EBMs) are well-suited to capture complex dependencies in multimodal data. Multimodal VAEs have made progress in capturing such inter-modal dependencies by introducing a shared latent generator and a joint inference model. We present a learning framework that effectively interweaves their updates with corresponding MCMC refinements in both the data and latent spaces.
🟢 Applied

Affinity Is Not Enough: Recovering the Free Energy Principle in Mixture-of-Experts

💡 This research explores techniques in machine learning.
Sparse MoE routing fails at domain transitions, where the current token belongs to one distribution and the next to another. The mechanisms draw from Friston's Free Energy Principle and use LIF dynamics from spiking neural networks. In a controlled experiment (4 experts, 5 seeds), standard affinity routing assigns only 0.006 +/- 0.001 probability to the correct expert at the transition.
🟢 Applied

Posterior Augmented Flow Matching

💡 This research explores techniques in computer vision.
Posterior-Augmented Flow Matching (PAFM) replaces single-target supervision with an expectation over an approximate posterior of valid target completions for a given intermediate state and condition. PAFM improves over FM by up to 3.4 FID50K across different model scales (SiT-B/2 and SiT-XL/2) and different architectures.
🔬

Privacy-Preserving ML

🟡 Advanced

Scaling Federated Linear Contextual Bandits via Sketching

💡 This research applies federated learning to contextual bandits.
Federated Sketch Contextual Linear Bandits (FSCLB) uses SVD to indirectly obtain the determinant required for communication. FSCLB reduces computational and communication costs by over 90% while sacrificing only a negligible amount of cumulative reward.
🟢 Applied

Federated Learning with Hypergradient-based Online Update of Aggregation Weights

💡 This research proposes a method for privacy-preserving AI.
Federated learning using mobile and Internet of Things devices requires high adaptability to varying communication environments. FedHAW (Federated Learning with Hypergradient-based update of Aggregation Weights) implements online updates of aggregation weights.
🟢 Applied

Defense against Poisoning Attacks under Shuffle-DP

💡 This research protects data privacy under poisoning attacks.
Differential Privacy (DP) has become the gold standard for protecting individual privacy in data analytics. The shuffle-DP model has attracted significant attention from both academia and industry due to its favorable balance between privacy and utility. In real-world scenarios, adversarial users can exploit this vulnerability through poisoning attacks.
🟢 Applied

Meritocratic Fairness in Budgeted Combinatorial Multi-armed Bandits via Shapley Values

💡 This research proposes a method for fair multi-armed bandits.
We propose a new framework for meritocratic fairness in budgeted combinatorial multi-armed bandits with full-bandit feedback (BCMAB-FBF). We show that $K$-Shapley value is a unique solution concept that satisfies Symmetry, Linearity, Null player, and Efficiency properties.
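The listed axioms are easiest to see on a tiny cooperative game. The sketch below computes the classical Shapley value by enumerating player orderings; it is not the paper's $K$-Shapley variant, and the three-arm toy game is an assumption made purely for illustration.

```python
from itertools import permutations

def shapley_values(players, value):
    """Classical Shapley value: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}

# Toy game: arms "a" and "b" are worth 1 each plus a bonus of 2 when played
# together; arm "c" never adds anything (a null player).
def value(coalition):
    base = sum(1.0 for p in coalition if p in ("a", "b"))
    bonus = 2.0 if {"a", "b"} <= coalition else 0.0
    return base + bonus

phi = shapley_values(["a", "b", "c"], value)
print(phi)
```

In this game the two interchangeable arms receive equal credit (Symmetry), the arm that never contributes receives zero (Null player), and the credits sum to the value of the full coalition (Efficiency). Exhaustive enumeration is only practical for a handful of players.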
🟢 Applied

Unlearning Offline Stochastic Multi-Armed Bandits

💡 This research protects data privacy through machine unlearning.
Machine unlearning aims to unlearn data points from a learned model, offering a principled way to process data-deletion requests and mitigate privacy risks without full retraining. We conduct a systematic study of both single- and multi-source unlearning scenarios.
🟡 Advanced

Zero-Knowledge Model Checking

💡 This research explores techniques in formal verification.
The method combines a deductive approach to model checking, which yields a formal certificate of correctness for the system, with zero-knowledge proofs that convince an external verifier that the system (kept secret) complies with its specification of correctness (made public).
🟢 Applied

HyCOP: Hybrid Composition Operators for Interpretable Learning of PDEs

💡 This research explores techniques in machine learning.
HyCOP learns parametric PDE solution operators by composing simple modules (advection, diffusion, learned closures) in a query-conditioned way. Modules may be numerical sub-solvers or learned components, enabling hybrid surrogates evaluated at arbitrary query times without autoregressive rollout.
🟢 Applied

Generating Statistical Charts with Validation-Driven LLM Workflows

💡 This research explores techniques in language AI.
A structured LLM-based workflow decomposes chart generation into dataset screening, plot proposal, code synthesis, rendering, validation-driven refinement, description generation, and question-answer generation. It treats chart generation as an inspectable process rather than a one-shot prompt-to-code task. The results show that chart-syntax questions are nearly saturated, while value extraction, comparison and reasoning remain more challenging.
🟢 Applied

RunAgent: Interpreting Natural-Language Plans with Constraint-Guided Execution

💡 This research proposes a method for language AI.
RunAgent is a multi-agent plan execution platform that interprets natural-language plans while enforcing stepwise execution through constraints and rubrics. RunAgent bridges the expressiveness of natural language with the determinism of programming via an agentic language. Evaluations on the Natural-plan and SciBench datasets demonstrate that RunAgent outperforms baseline LLMs.
🟢 Applied

Repurposing Image Diffusion Models for Adversarial Synthetic Structured Data: A Case Study of Ground Truth Drift

💡 This research explores techniques in computer vision.
Public image diffusion models are now powerful enough that an attacker without the resources to train a tabular-specific generator may repurpose one off the shelf. An attacker succeeds with synthetic evidence by thinking like the machine that will receive it. The more the attacker succeeds, the more they can induce ground truth drift.
🟢 Applied

SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control

💡 This research improves machine learning.
State-Action Value Geometry Optimization (SAVGO) incorporates value-based similarity into policy updates. SAVGO learns a joint state-action embedding space in which pairs with similar action-value estimates exhibit high cosine similarity.
🟢 Applied

Observable Performance Does Not Fully Reflect System Organization: A Multi-Level Analysis of Gait Dynamics Under Occlusal Constraint

💡 This research explores techniques in machine learning.
In biomechanical systems, observable performance is often used as a proxy for underlying system organization. In this study, the vertical dimension of occlusion (VDO) is considered as a constraint applied to an adaptive neuromechanical system. A single-case design in a patient with Parkinson's disease allows an intra-individual analysis across repeated conditions.
🟢 Applied

Learning the Helmholtz equation operator with DeepONet for non-parametric 2D geometries

💡 This research explores techniques in machine learning.
This paper deals with solving the Helmholtz equation on non-parametric domains. It uses a physics-informed neural operator network based on the DeepONet framework. This approach enables the encoding of arbitrary geometries, whether they are parameterized or not.
🟢 Applied

Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring

💡 This research explores techniques in language AI.
Reward models (RMs) have become an indispensable fixture of the language model (LM) post-training playbook, enabling policy alignment and test-time scaling. Research on the application of RMs to code generation has been comparatively sparse, with existing work largely focusing on execution feedback.
🟢 Applied

NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search

💡 This research proposes a method for machine learning.
Monte Carlo Tree Search scales poorly in cooperative multi-agent domains. Expansion must consider an exponentially large set of joint actions, limiting exploration under realistic search budgets. We propose NonZero, a proposal rule that keeps multi-agent MCTS tractable.
🟢 Applied

Self-Adaptive Multi-Agent LLM-Based Security Pattern Selection for IoT Systems

💡 This research applies language AI to IoT security.
ASPO is a self-adaptive multi-agent security pattern selection approach that integrates Large Language Model (LLM)-based reasoning with deterministic enforcement within a MAPE-K control loop. ASPO explicitly separates stochastic decision generation from execution: LLM agents propose candidate mitigation portfolios, while a deterministic optimisation core enforces closed-world action integrity.
🟢 Applied

Temporal Data Requirement for Predicting Unplanned Hospital Readmissions

💡 This research explores techniques in machine learning.
The optimal time window for unstructured clinical notes is significantly shorter than for structured data. Maximum predictive performance was achieved using notes from just three to six months prior to surgery. Performance using structured data improved as the time window lengthened, but strictly plateaued after twelve months.
🟢 Applied

Weisfeiler Lehman Test on Combinatorial Complexes: Generalized Expressive Power of Topological Neural Networks

💡 This research explores techniques in machine learning.
Combinatorial complexes have unified set-based (e.g., graphs, hypergraphs) and part-whole structures into a common topological framework. Existing topological neural networks and Weisfeiler-Lehman variants remain fragmented, lacking a unified theoretical foundation for topological deep learning.
🔴 Theory-Heavy

Decentralized Proximal Stochastic Gradient Langevin Dynamics

💡 This research proposes a method for machine learning.
Decentralized Proximal Stochastic Gradient Langevin Dynamics (DE-PSGLD) is a decentralized Markov chain Monte Carlo algorithm for sampling from a log-concave probability distribution constrained to a convex domain. Constraints are enforced through a shared proximal regularization based on the Moreau-Yosida envelope, enabling unconstrained updates while preserving consistency with the target constrained posterior.
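A single-node sketch can show how a Moreau-Yosida penalty turns a constrained sampler into an unconstrained one. It relies on the standard fact that the envelope of a convex set's indicator function has gradient (x - proj(x)) / lam. The box constraint, the Gaussian target, the step sizes, and the function names are assumptions for illustration; the decentralized communication and stochastic gradients of DE-PSGLD are omitted.

```python
import math
import random

def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d."""
    return [min(max(xi, lo), hi) for xi in x]

def my_penalty_grad(x, lo, hi, lam):
    """Gradient of the Moreau-Yosida envelope of the box indicator."""
    return [(xi - pi) / lam for xi, pi in zip(x, project_box(x, lo, hi))]

def langevin_step(x, grad_f, lo, hi, lam, eta, rng):
    """Unconstrained Langevin update on the smoothed (penalized) potential."""
    g = grad_f(x)
    pen = my_penalty_grad(x, lo, hi, lam)
    return [xi - eta * (gi + pi) + math.sqrt(2 * eta) * rng.gauss(0.0, 1.0)
            for xi, gi, pi in zip(x, g, pen)]

def grad_f(x):
    """Gradient of an assumed Gaussian potential f(x) = |x|^2 / 2."""
    return list(x)

rng = random.Random(0)
x = [0.0, 0.0]
samples = []
for _ in range(2000):
    x = langevin_step(x, grad_f, -1.0, 1.0, lam=0.01, eta=0.005, rng=rng)
    samples.append(x)
```

Inside the box the penalty gradient is zero, so the update reduces to plain Langevin dynamics; outside, the penalty pulls the iterate back toward the feasible set with strength 1 / lam.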
🟢 Applied

Aitchison Embeddings for Learning Compositional Graph Representations

💡 This research presents techniques for graph representation learning.
Many networks naturally admit a role-mixture view, where nodes are best described as mixtures over latent archetypal factors. We propose a compositional graph embedding framework grounded in Aitchison geometry. Nodes are represented as simplex-valued compositions and embedded via isometric log-ratio coordinates.
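For readers unfamiliar with isometric log-ratio (ilr) coordinates, the following sketch maps a composition with D strictly positive parts to D - 1 real coordinates using one common orthonormal basis (the pivot or Helmert-style basis). The choice of basis and the example compositions are assumptions; the paper may use a different basis.

```python
import math

def clr(x):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def ilr(x):
    """Isometric log-ratio coordinates (pivot basis); requires all parts > 0."""
    logs = [math.log(v) for v in x]
    coords = []
    for i in range(1, len(x)):
        mean_first_i = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_first_i - logs[i]))
    return coords

def euclid(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Two toy role mixtures over three latent factors.
x = [0.2, 0.3, 0.5]
y = [0.6, 0.3, 0.1]

aitchison_dist = euclid(clr(x), clr(y))   # distance in Aitchison geometry
ilr_dist = euclid(ilr(x), ilr(y))         # plain Euclidean distance after ilr
```

The two distances agree up to floating-point error, which is the sense in which the map is isometric: ordinary Euclidean tools can be applied to the ilr coordinates while respecting the geometry of the simplex.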
🟢 Applied

Deep Kernel Learning for Stratifying Glaucoma Trajectories

💡 This research introduces a new approach to computer vision.
Clinicians need tools to identify patients at high risk of progression from sparse and irregularly-sampled electronic health records. We propose a novel deep kernel learning (DKL) architecture that leverages a Gaussian Process (GP) backend.
🟢 Applied

STARE: Step-wise Temporal Alignment and Red-teaming Engine for Multi-modal Toxicity Attack

💡 This research explores techniques in language AI.
Red-teaming Vision-Language Models is essential for identifying vulnerabilities where adversarial image-text inputs trigger toxic outputs. Existing approaches treat image generation as a black box, leaving open the question of when and how toxic semantics emerge during multi-step synthesis. We introduce STARE, a hierarchical reinforcement learning framework that treats the denoising trajectory itself as the attack surface.
🟡 Advanced

Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning

💡 This research explores techniques in machine learning.
Safety is a primary challenge in real-world reinforcement learning (RL). Formulating safety requirements as state-wise constraints has become a prominent paradigm. Existing stabilization techniques are designed for scalar multipliers, which are inadequate for state-dependent multiplier networks. To address this challenge, we propose an augmented Lagrangian multiplier network (ALaM) framework.
🟢 Applied

Spiking Sequence Machines and Transformers

💡 This research explores techniques in machine learning.
Sequence learning reduces to similarity-based retrieval over a temporally indexed representation space. We formalise a Phase-Latency Isomorphism showing that sinusoidal positional phase and spike timing are linearly related. Time, phase, and rank are three instantiations of the same computational primitive.
🟢 Applied

Reinforcement Learning with Markov Risk Measures and Multipattern Risk Approximation

💡 This research explores techniques in machine learning.
For a risk-averse finite-horizon Markov Decision Problem, we introduce a special class of Markov coherent risk measures, called mini-batch measures. We also define the class of multipattern risk-averse problems that generalizes the classes of linear systems. We propose an economical version of the $Q$-learning method that streamlines the policy evaluation (backward) step.
🔬

Creative AI / Emotion

🟢 Applied

Prop-Chromeleon: Adaptive Haptic Props in Mixed Reality through Generative Artificial Intelligence

💡 This research applies generative AI to mixed reality.
Prop-Chromeleon is an MR system based on generative artificial intelligence (AI) that dynamically transforms everyday objects into adaptive passive haptic props through user-provided text prompts.
🟢 Applied

GaMMA: Towards Joint Global-Temporal Music Understanding in Large Multimodal Models

💡 This research improves music understanding.
GaMMA is a state-of-the-art (SoTA) large multimodal model (LMM) designed to achieve comprehensive musical content understanding. It inherits the streamlined encoder-decoder design of LLaVA, enabling effective cross-modal learning between music and language. Our approach combines carefully curated datasets at scale with a progressive training pipeline.
🟢 Applied

Towards Improving Speaker Distance Estimation through Generative Impulse Response Augmentation

💡 This research improves machine learning.
The Room Acoustics and Speaker Distance Estimation (SDE) Challenge at ICASSP 2025 explores the effectiveness of augmented room impulse response (RIR) data for improving SDE model performance.
🟢 Applied

The impact of coercive, normative, and mimetic Stress on Chinese teachers' continuance intention to use generative AI: An integrated perspective of the Expectation-Confirmation Model and Institutional Theory

💡 This research studies the adoption of generative AI.
Chinese teachers' continuance intention to use generative artificial intelligence (AI) is investigated by integrating the Expectation-Confirmation Model with Institutional Theory. Confirmation, perceived usefulness, and satisfaction play important roles in shaping teachers' continued use of AI.
🟢 Applied

Physically Native World Models: A Hamiltonian Perspective on Generative World Modeling

💡 This research explores generative world modeling.
World models have recently re-emerged as a central paradigm for embodied intelligence, robotics, autonomous driving, and model-based reinforcement learning. We propose Hamiltonian World Models as a physically grounded perspective on world modeling. The key idea is to encode observations into a structured latent phase space and evolve the latent state through Hamiltonian-inspired dynamics.
🟢 Applied

BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis

💡 This research presents techniques for language AI.
BlenderRAG is a retrieval-augmented generation system that operates on a curated multimodal dataset of 500 expert-validated examples (text, code, image) across 50 object categories. The dataset and code will be available at https://github.com/MaxRondelli/BlenderRAG.
🟢 Applied

Linking Behaviour and Perception to Evaluate Meaningful Human Control over Partially Automated Driving

💡 This research evaluates human control over automated driving.
Meaningful human control (MHC) has been proposed as a normative framework to address this tension. However, empirical methods for evaluating whether existing systems provide MHC remain underdeveloped.
🟢 Applied

MMAudio-LABEL: Audio Event Labeling via Audio Generation for Silent Video

💡 This research explores techniques in speech processing.
Recent advances in multimodal generation have enabled high-quality audio generation from silent videos. Practical applications, such as sound production, demand explicit sound event labels detailing the type and timing of sounds. We propose MMAudio-LABEL (LAtent-Based Event Labeling), an event-aware audio generation framework.
🟢 Applied

On the Role of Artificial Intelligence in Human-Machine Symbiosis

💡 This research explores techniques in language AI.
The evolution of artificial intelligence (AI) has rendered the boundary between humanity and machines increasingly ambiguous. In general, the role assumed by AI is often specified, either implicitly or explicitly, in the input prompt, yet becomes less apparent or altogether unobservable when the generated content alone is available. This study considers the problem of tracing the functional role played by AI in natural language generation.
🟢 Applied

MMAudioReverbs: Video-Guided Acoustic Modeling for Dereverberation and Room Impulse Response Estimation

💡 This research explores techniques in audio processing.
Video-to-audio (V2A) models do not explicitly model room-acoustic effects such as reverberation or room impulse responses. MMAudioReverbs is a unified framework dealing with dereverberation and room impulse response (RIR) estimation.
🟢 Applied

Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

💡 This research explores techniques in language AI.
A piece of content claims a behavior; the runtime must decide whether to believe it. Without skill verification, a human-in-the-loop gate must fire on every irreversible call. With skill verification treated as a separate, gated process, HITL fires only for what is unverified, and the system becomes sustainable.
🟢 Applied

LASE: Language-Adversarial Speaker Encoding for Indic Cross-Script Identity Preservation

💡 This research explores techniques in speech processing.
A speaker encoder used in multilingual voice cloning should treat the same speaker identically regardless of which script the audio was uttered in. Off-the-shelf encoders do not, and the failure is accent-conditional. We present LASE (Language-Adversarial Speaker Encoder), a small projection head over frozen WavLM-base-plus trained with two losses. LASE matches ECAPA-TDNN on cross-script speaker recall (0
🟢 Applied

Directed Social Regard: Surfacing Targeted Advocacy, Opposition, Aid, Harms, and Victimization in Online Media

💡 This research explores techniques in language AI.
Directed Social Regard (DSR) is an approach to multi-dimensional, multi-valence sentiment analysis. NLP tools cannot report that positive and negative sentiments coexist. The DSR approach comprises a pair of transformer-based models that (1) detect span-level targets of sentiment in a message and (2) score all spans within the message context along three (-1, 1) axes of regard.
🟢 Applied

Empowering Heterogeneous Graph Foundation Models via Decoupled Relation Alignment

💡 This research improves graph foundation models.
Decoupled Relation Subspace Alignment (DRSA) introduces a dual-relation subspace projection mechanism to explicitly coordinate cross-type interactions within a shared low-rank relation subspace. DRSA constructs a well-calibrated, structure-aware latent space.
🟢 Applied

Structure Liberates: How Constrained Sensemaking Produces More Novel Research Output

💡 This research explores techniques in language AI.
SCISENSE is a sensemaking-grounded framework that operationalizes ideation as a structured sequence of eight cognitive stages (Pirolli & Card, 2005). We construct a 100K-scale dataset of citation-conditioned research trajectories in two modes: Target and Infer. Target-trained models achieve a 2.0% improvement in trajectory quality over Infer models. This advantage propagates downstream: coding agents conditioned on Target trajectories produce research artifacts with higher
🟢 Applied

Silicon Showdown: Performance, Efficiency, and Ecosystem Barriers in Consumer-Grade LLM Inference

💡 This research presents techniques for language AI.
The operational landscape of local Large Language Model (LLM) inference has shifted from lightweight models to datacenter-class weights exceeding 70B parameters. We conclude that for consumer-grade inference, optimal hardware is defined by a complex interplay between compute density (N Nvidia) and memory capacity.
🟢 Applied

Unsupervised Denoising of Real Clinical Low Dose Liver CT with Perceptual Attention Networks

💡 This research reduces noise in medical images.
This paper focuses on the denoising problem of low-dose computed tomography using deep learning. The proposed framework combines a U-Net structure for multi-scale feature extraction and an attention mechanism for feature fusion. It also introduces perceptual loss to adapt the network to the characteristics of medical images.
🟢 Applied

To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

💡 This research explores techniques in language AI.
Effective tool use hinges on a core LLM decision: whether to call or not call a tool, when performing a task . This decision is particularly challenging for web search tools, where the benefits of external information depend on the model's internal knowledge . We introduce a principled framework inspired by decision-making theory to evaluate web search tool-use decisions .
🟢 Applied

Possibilistic Predictive Uncertainty for Deep Learning

💡 This research improves uncertainty estimation in deep learning.
Dirichlet-approximated possibilistic posterior predictions (DAPPr) is a principled framework leveraging possibility theory . It defines a possibilistic posterior over parameters, projects this posterior to the prediction space via supremum operators, and approximates the projected posterior using learnable Dirichlet possibility functions .
🟢 Applied

AI Washing Inflates Expected Performance but Not Interaction Outcomes: An AI Placebo Study Using Fitts' Law

💡 This research explores techniques in edge computing.
Expectations about the support of artificial intelligence may influence interaction outcomes similar to placebos . Such expectations may result from AI washing, a practice of overstating a system's capabilities when actual functionality is limited .
🟢 Applied

DySRec: Dynamic Context-Aware Psychometric Scale Recommendation via Multi-Agent Collaboration

💡 This research explores techniques in machine learning.
DySRec operates as an interactive chatbot that engages users in multi-turn dialogue . It models scale selection as a continuous conversational decision process, and coordinates specialized agents to maintain user context, recommend assessment scales, monitor psychological risk, and log decision trajectories .
🟢 Applied

Instance-Aware Parameter Configuration in Bilevel Late Acceptance Hill Climbing for the Electric Capacitated Vehicle Routing Problem

💡 This research optimizes machine learning.
A single globally tuned configuration often fails to exploit the heterogeneity of instances . This limitation is particularly evident in the Electric Capacitated Vehicle Routing Problem . The proposed approach achieves an average objective value reduction of 0.28% across eight held-out test instances .
🟢 Applied

Pick and Sort for Graphical Authentication

💡 This research proposes a method for computer vision.
We propose a graphical authentication scheme that follows a simple "Pick and Sort" design . Users choose visual elements and arrange them within a grid . The number of selected elements and the grid size are configurable . The scheme is easy to learn and flexible to deploy .
🟢 Applied

Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

💡 This research enhances language AI.
Ψ-RAG is a tree-RAG framework with two key components . The first is a hierarchical abstract tree index built through an iterative "merging and collapse" process . The second is a multi-granular retrieval agent that intelligently interacts with the knowledge base using reorganized queries and an agent-powered hybrid retriever .
🟢 Applied

Space Network of Experts: Architecture and Expert Placement

💡 This research explores techniques in language AI.
The Space Network of Experts (Space-XNet) framework targets distributed execution of a popular mixture-of-experts (MoE) model in space . Space data centers are envisioned as a promising platform for executing energy-intensive large language models (LLMs) .
🔬

Lightweight Systems

🟢 Applied

At the Edge of the Heart: ULP FPGA-Based CNN for On-Device Cardiac Feature Extraction in Smart Health Sensors for Astronauts

💡 This research speeds up edge computing.
The convergence of accelerating human spaceflight ambitions and critical terrestrial health monitoring demands is driving unprecedented requirements for reliable, real-time feature extraction on extremely resource-constrained wearable health sensors . The implementation achieves a validation accuracy of 98% while consuming only 8.55 mW .
🟢 Applied

Lightweight Tamper-Evident Log Integrity Verification for IoT Edge Environments: A Merkle Tree Pipeline with Adaptive Chunking

💡 This research explores techniques in edge computing.
This paper presents a lightweight, evaluated integrity verification pipeline that combines Merkle-tree commitments with resource-aware adaptive chunking to provide tamper evidence without relying on distributed ledger technologies . Tampering detection achieves perfect precision, recall, and F1-score (1.0) across corruption ratios ranging from 1% to 50% .
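As a rough illustration of the Merkle-tree commitment idea in this entry, the sketch below hashes log chunks into a single root and shows that editing one log line changes that root. It is a minimal sketch only: the helper names (`merkle_root`, `chunk_log`), the SHA-256 choice, the prefix bytes, and the fixed `chunk_size` are all assumptions made for illustration, and the paper's resource-aware adaptive chunking is not reproduced here.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).digest()


def merkle_root(chunks):
    """Return the Merkle root over a list of byte chunks (hypothetical helper)."""
    if not chunks:
        return _h(b"")
    # Distinct prefixes keep leaf hashes and inner-node hashes in separate domains.
    level = [_h(b"\x00" + c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


def chunk_log(lines, chunk_size):
    """Group log lines into fixed-size chunks (the size is an assumed constant)."""
    return [
        b"\n".join(lines[i:i + chunk_size])
        for i in range(0, len(lines), chunk_size)
    ]


log = [f"event {i}".encode() for i in range(10)]
committed = merkle_root(chunk_log(log, chunk_size=4))

tampered = list(log)
tampered[7] = b"event 7 (edited)"
root_changed = merkle_root(chunk_log(tampered, chunk_size=4)) != committed
print("tampering detected:", root_changed)
```

A verifier that stores only `committed` can recompute the root later and compare. Smaller chunks localize a mismatch more precisely but cost more hashing, which is presumably the trade-off adaptive chunking addresses.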
🟢 Applied

Progressive Semantic Communication for Efficient Edge-Cloud Vision-Language Models

💡 This research targets running language AI on low-power devices.
Deploying Vision-Language Models (VLMs) on edge devices remains challenging due to their substantial computational and memory demands . Fully offloading inference to the cloud is often impractical in bandwidth-limited environments . We propose a progressive semantic communication framework for edge-cloud VLM inference .
🟢 Applied

NVLLM: A 3D NAND-Centric Architecture Enabling Edge on-Device LLM Inference

💡 This research speeds up language AI.
NVLLM is a 3D NAND-centric inference architecture that offloads feed-forward network (FFN) computation into the Flash while executing attention on lightweight CMOS logic with external DRAM . The rapid growth of LLMs demands high-throughput, memory-capacity-intensive inference on resource-constrained edge devices .
🟢 Applied

Tempus: A Temporally Scalable Resource-Invariant GEMM Streaming Framework for Versal AI Edge

💡 This research improves language AI.
Tempus is a Resource-Invariant Temporal GEMM framework for the AMD Versal AI Edge SoC . Tempus employs a fixed compute block of 16 AIE-ML cores . The framework maintains a 0.00% utilization of URAM/DSP, yielding 22.0x core frugality .
🟢 Applied

VitaLLM: A Versatile and Tiny Accelerator for Mixed-Precision LLM Inference on Edge Devices

💡 This research speeds up language AI.
VitaLLM is a mixed precision accelerator that enables ternary weight large language models to run efficiently on edge devices . A 16 nm silicon prototype at 1 GHz/0.8 V achieves 72.46 tokens/s in decode and 0.88 s prefill (64 tokens) within 0.214 mm^2 and 120 KB on-chip memory .
🟢 Applied

VitaLLM: A Versatile, Ultra-Compact Ternary LLM Accelerator with Dependency-Aware Scheduling

💡 This research reduces the area and power cost of language AI on edge devices.
Deploying Large Language Models (LLMs) on resource-constrained edge devices faces bottlenecks in memory bandwidth and power consumption . VitaLLM achieves a decoding throughput of 70.70 tokens/s within an ultra-compact area of 0.223 mm^2 and a power consumption of 65.97 mW .
🟢 Applied

DMRlib: Easy-coding and Efficient Resource Management for Job Malleability

💡 This research explores techniques in machine learning.
Process malleability has proved to have a highly positive impact on resource utilization and global productivity in data centers compared with the conventional static resource allocation policy . However, the non-negligible additional development effort this solution imposes has constrained its adoption by the scientific programming community . We present DMRlib, a library designed to offer the global advantages of process malleability while providing a minimalist MPI-like syntax .
🟢 Applied

DUAL-BLADE: Dual-Path NVMe-Direct KV-Cache Offloading for Edge LLM Inference

💡 This research makes language AI more efficient.
The increasing deployment of Large Language Model (LLM) inference on edge AI systems demands efficient execution under tight memory budgets . A key challenge arises from Key-Value (KV) caches, which often exceed available device memory . We present DUAL-BLADE, a dual-path KV residency framework that dynamically assigns KV tensors to either a page-cache path or an NVMe-direct path based on memory availability .
🟢 Applied

AnTi-MiCS: Analytical Framework for Bounding Time in Embedded Mixed-Criticality Systems

💡 This research improves edge computing.
In Mixed-Criticality (MC) systems, the high Worst-Case Execution Time (WCET) serves as a conservative upper bound representing the task's maximum execution time under all conditions . Opting for a very low value of this WCET enhances processor utilization by scheduling more tasks in LO mode . Employing a larger WCET ensures fewer mode switches, thereby enhancing QoS .
🟢 Applied

AI Inference as Relocatable Electricity Demand: A Latency-Constrained Energy-Geography Framework

💡 This research examines where AI inference workloads can run under latency constraints.
AI inference is becoming a persistent and geographically distributed source of electricity demand . AI inference workloads can sometimes be executed away from the user-facing service location, provided that latency, state locality, capacity and regulatory constraints remain acceptable .
🟢 Applied

Towards the Democratization and Standardization of Dynamic Resources with MPI Spawning

💡 This research makes HPC resource management more efficient.
This paper presents an efficient tool for managing dynamic resources in production high-performance computing (HPC) settings . We introduce a unified dynamic resource management application programming interface (API) .
🟢 Applied

End-to-End and Phase-Level Performance Optimization for Hyperledger Fabric

💡 This research enhances edge computing.
Hyperledger Fabric (HLF) is a modular, permissioned blockchain widely adopted in enterprise settings . We present a systematic, phase-level and end-to-end study of HLF optimization along three fronts .
🟢 Applied

A Test Taxonomy and Continuous Integration Ecosystem for Dynamic Resource Management in HPC

💡 This research explores techniques in machine learning.
High-performance computing systems are increasingly exploring dynamic resource management and malleable MPI applications . The correctness of these techniques is often evaluated through ad hoc experiments that can be difficult to reproduce and maintain . The proposed methodology improves early fault detection and simplifies maintenance under evolving dependencies .
🟢 Applied

Efficient, VRAM-Constrained xLM Inference on Clients

💡 This research makes language AI more efficient.
To usher in the next round of client AI innovation, there is an urgent need to enable efficient, lossless inference of high-accuracy large language models and vision language models . To address this, we present pipelined sharding, a novel, benchmark-guided CPU-GPU hybrid scheduling technique .
🟢 Applied

A PVT-Resilient Subthreshold SRAM-Based In-Memory Computing Accelerator with In-Situ Regulation for Energy-Efficient Spiking Neural Networks

💡 This research makes edge computing more efficient.
This paper presents a PVT-resilient, subthreshold SRAM-based computing-in-memory (CIM) macro tailored for energy-efficient spiking neural networks (SNNs) . The macro integrates in-situ current sensors and distributed voltage regulators to enable robust large-scale (1024 wordlines, 1304 bitlines and 128 shared neuron cells) subthreshold current-mode CIM .
🟢 Applied

DPU or GPU for Accelerating Neural Networks Inference -- Why not both? Split CNN Inference

💡 This research speeds up computer vision.
Video and image streaming on edge devices requires low latency . To address this, Neural Networks (NNs) are widely used, and prior work mainly focuses on accelerating them with single hardware units . However, further reductions in latency can be observed by combining these units . In this paper, partitioning CNN inference across DPU and GPU is proposed .
🟢 Applied

Efficient Training on Multiple Consumer GPUs with RoundPipe

💡 This research reduces the hardware bottlenecks of fine-tuning language models on consumer GPUs.
Fine-tuning Large Language Models on consumer-grade GPUs is constrained by limited memory and slow PCIe interconnects . Pipeline parallelism combined with CPU offloading mitigates hardware bottlenecks by reducing communication overhead .
🟢 Applied

FaaSMoE: A Serverless Framework for Multi-Tenant Mixture-of-Experts Serving

💡 This research makes Mixture-of-Experts serving more efficient.
Mixture-of-Experts (MoE) models offer high capacity with efficient inference cost by activating a small subset of expert models per input . FaaSMoE decouples the control and execution planes of MoE by deploying experts as stateless FaaS functions, enabling on-demand and scale-to-zero expert invocation across tenants .
🟢 Applied

SplitFT: An Adaptive Federated Split Learning System For LLMs Fine-Tuning

💡 This research makes language AI more efficient.
Federated Split Learning has been identified as an efficient approach to address the computational resource constraints of clients in classical federated learning . However, it faces some critical challenges when such a training strategy meets large language models for fine-tuning . To bridge this gap, we propose SplitFT, an adaptive federated split learning system . SplitFT enables different clients to set different cut layers according to their resources and trained model performance .
🟢 Applied

Folding Tensor and Sequence Parallelism for Memory-Efficient Transformer Training & Inference

💡 This research presents techniques for edge computing.
TSP is a parallel execution strategy that folds tensor parallelism and sequence parallelism onto a single device axis . By sharding both weights and activations across the same devices, TSP trades additional communication volume for reduced memory overhead . We provide theoretical communication and memory analysis and describe our implementation of TSP attention and gated MLP blocks .
🟢 Applied

DAK: Direct-Access-Enabled GPU Memory Offloading with Optimal Efficiency for LLM Inference

💡 This research speeds up LLM inference.
DAK is an end-to-end direct-access memory offloading framework that repurposes the Tensor Memory Accelerator (TMA) to fetch offloaded weights and KV caches directly from remote memory into GPU shared memory . DAK achieves near-optimal bandwidth aggregation, with up to 3× performance gains on NVLink-C2C and 1.8× on PCIe systems .
🟢 Applied

Network Digital Untwinning: Towards Backward Optimization of Digital Twins

💡 This research addresses data privacy in network digital twins.
Network digital twins (NDTs) are transforming network management by offering precise virtual replicas of physical network systems . Their reliance on diverse and sensitive data introduces significant challenges related to data management, regulatory compliance, and user privacy . Traditional approaches often fall short of preserving the integrity of the twin model .
🟢 Applied

From Impermanent Loss to Sustainable Gain: Quantifying Profitability Zones for Liquidity Providers on DEX

💡 This research explores techniques in machine learning.
Decentralized Finance (DeFi) is a rapidly evolving segment of blockchain technology that enables a transformative approach to financial services through Web3 applications . By leveraging smart contracts, DeFi allows developers to build flexible and innovative financial instruments .
🟢 Applied

Distributed Santa Claus via Global Rounding

💡 This research studies a distributed algorithm for the Santa Claus problem.
In this paper, we consider the Santa Claus problem in the CONGEST model . This NP-hard problem can be modeled as a bipartite graph of children and gifts where an edge indicates that a child desires a gift . Each gift can have a different value .
🔬

Offline-First / Local AI

🟢 Applied

Revealing graph bandits for maximizing local influence

💡 This research explores techniques in edge computing.
We study a graph bandit setting where the objective of the learner is to detect the most influential node of a graph by requesting as little information from the graph as possible . We propose BARE, a bandit strategy for which we prove a regret guarantee that scales with the detectable dimension .
🟢 Applied

Scalable Context-Aware Graph Attention for Unsupervised Anomaly Detection in Large-Scale Mobile Networks

💡 This research explores techniques in machine learning.
Mobile network operators must monitor thousands of heterogeneous network elements across the radio access network and the packet core . Scale and cost of incident labelling make supervised approaches impractical, motivating unsupervised anomaly detection robust to context shifts and nonstationarity . C-MTAD-GAT is an anomaly detection framework designed to operate as a single shared model across large populations of network elements .
🟢 Applied

CleanBase: Detecting Malicious Documents in RAG Knowledge Databases

💡 This research detects malicious documents in RAG knowledge databases.
Retrieval-augmented generation (RAG) is vulnerable to prompt injection attacks, in which an adversary inserts malicious documents containing carefully crafted injected prompts into the knowledge database . When a user issues a question targeted by the attack, the RAG system may retrieve these malicious documents, whose injected prompts mislead it into generating attacker-specified answers . CleanBase constructs a similarity graph over the database, where each node represents a document and an edge connects two nodes if their semantic similarity exceeds a statistically
🟡 Advanced

SAGA: Workflow-Atomic Scheduling for AI Agent Inference on GPU Clusters

💡 This research explores techniques in language AI.
AI agents execute tens to hundreds of chained LLM calls per task . GPU schedulers treat each call as independent, discarding gigabytes of intermediate state between steps and inflating end-to-end latency by 3-8x . We present SAGA, a distributed scheduler that implements this abstraction through three mechanisms .
🟢 Applied

ControBench: An Interaction-Aware Benchmark for Controversial Discourse Analysis on Social Networks

💡 This research explores techniques in language AI.
ControBench is a benchmark for controversial discourse analysis that combines heterogeneous social interaction graphs with rich textual semantics . Built from Reddit discussions on three topics (Trump, abortion, and religion), ControBench contains 7,370 users, 1,783 posts, and 26,525 interactions .
🟢 Applied

Bridging Graph Drawing and Dimensionality Reduction with Stochastic Stress Optimization

💡 This research speeds up stress-based dimensionality reduction.
We present a scikit-learn compatible estimator that minimizes global stress through local pairwise updates, improving upon the existing implementation . Experiments on standard high-dimensional benchmarks show that our stochastic solver converges substantially faster than SMACOF .
🟢 Applied

Class Angular Distortion Index for Dimensionality Reduction

💡 This research proposes a metric for evaluating dimensionality reduction.
Dimensionality reduction (DR) techniques are often characterized by whether they preserve global, high-level structures in data or local, neighborhood structures . Existing cluster quality metrics either only measure cluster separability or assume spherical, globular clusters in the original space . We introduce the Class Angular Distortion Index (CADI), a metric that uses internal angles among point triples to determine the faithfulness of cluster organization .
🟢 Applied

Gradient Regularized Newton Boosting Trees with Global Convergence

💡 This research explores techniques in edge computing.
Restricted Newton Descent studies convex optimization with Newton's method on Hilbert spaces with inexact iterates . Modern implementations like XGBoost, LightGBM, and CatBoost are based on Newton boosting: a second-order descent step in the space of decision trees .
🟡 Advanced

Stable-GFlowNet: Toward Diverse and Robust LLM Red-Teaming via Contrastive Trajectory Balance

💡 This research improves the training stability of LLM red-teaming.
Large Language Model (LLM) Red-Teaming, which proactively identifies vulnerabilities of LLMs, is an essential process for ensuring safety . Generative Flow Networks (GFNs) that perform distribution matching are notorious for training instability and mode collapse . We propose Stable-GFN, which eliminates partition function $Z$ estimation in GFN and reduces training instability .
🟢 Applied

Scale-Aware Adversarial Analysis: A Diagnostic for Generative AI in Multiscale Complex Systems

💡 This research explores techniques in machine learning.
Complex physical systems, from supersonic turbulence to the macroscopic structure of the universe, are governed by continuous multiscale dynamics . Although modern machine learning architectures excel at mapping the high-dimensional observables of these systems, it remains unclear whether they internalize the governing physical laws or merely interpolate discrete statistical correlations .
🟢 Applied

A Comparative Study of QSPR Methods on a Unique Multitask PAMPA dataset

💡 This research presents techniques for edge computing.
We present a unique, multitask dataset comprising 143 drug and drug candidate molecules, each evaluated on in vitro parallel artificial-membrane permeability assays (PAMPA) using six different model membranes . This is the most comprehensive study on simultaneous modeling of multiple organ-specific PAMPA membranes to date .
🟢 Applied

Foresight Arena: An On-Chain Benchmark for Evaluating AI Forecasting Agents

💡 This research explores techniques in language AI.
Foresight Arena is the first permissionless, on-chain benchmark for evaluating AI forecasting agents on real-world prediction markets . Agents submit probabilistic forecasts on binary Polymarket markets via a commit-reveal protocol enforced by Solidity smart contracts . Performance is measured by the Brier Score and a novel Alpha Score .
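For readers unfamiliar with the Brier Score mentioned in this entry, the sketch below computes its standard form for binary outcomes: the mean squared difference between forecast probabilities and realized 0/1 outcomes, where lower is better. The function name and the sample values are illustrative assumptions; the benchmark's on-chain scoring and its Alpha Score are not described in this digest and are not reproduced.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities in [0, 1] and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    total = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes))
    return total / len(forecasts)


# Hypothetical forecasts on three binary markets and their resolved outcomes.
agent_forecasts = [0.9, 0.2, 0.6]
resolved = [1, 0, 0]
print(brier_score(agent_forecasts, resolved))
```

A forecaster who always answers 0.5 scores 0.25 on any set of binary markets, which makes that value a convenient uninformed baseline.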
🟡 Advanced

AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments

💡 This research reduces the memory and compute cost of LLM fine-tuning.
AdaMeZO is a zeroth-order optimizer that leverages Adam-style first- and second-moment estimates without maintaining them in memory . It can outperform MeZO while requiring up to 70% fewer forward passes .
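To make "zeroth-order" concrete, the sketch below shows a generic two-point (SPSA-style) gradient estimate that needs only two forward evaluations of the loss and no backpropagation. This is an assumed illustration of the general family that MeZO-like methods build on, not AdaMeZO itself: the Adam-style moment handling described in the entry is not reproduced, and all names and hyperparameters (`two_point_gradient`, `eps`, `lr`) are hypothetical.

```python
import random


def two_point_gradient(loss, params, eps=1e-3, seed=0):
    """Estimate the gradient from two loss evaluations along a random direction z."""
    rng = random.Random(seed)  # reseeding regenerates z instead of storing it
    z = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + eps * zi for p, zi in zip(params, z)]
    minus = [p - eps * zi for p, zi in zip(params, z)]
    # Central finite difference of the loss along z.
    slope = (loss(plus) - loss(minus)) / (2 * eps)
    return [slope * zi for zi in z]


def sgd_step(loss, params, lr=0.01, seed=0):
    """Plain SGD update using the zeroth-order estimate (no moment estimates)."""
    grad = two_point_gradient(loss, params, seed=seed)
    return [p - lr * g for p, g in zip(params, grad)]


def quadratic(x):
    """Toy loss standing in for a model's forward pass."""
    return sum(v * v for v in x)


initial = [1.0, -2.0, 0.5]
params = list(initial)
for step in range(200):
    params = sgd_step(quadratic, params, seed=step)
print(quadratic(initial), "->", quadratic(params))
```

Because the direction `z` can be regenerated from its seed, the estimator needs no extra memory proportional to the model size, which is the property that makes this family attractive for fine-tuning large models.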
🟢 Applied

From Prediction to Practice: A Task-Aware Evaluation Framework for Blood Glucose Forecasting

💡 This research explores techniques in machine learning.
We present a task-aware evaluation framework for blood glucose forecasting built around two downstream uses: hypoglycemia early warning and insulin dosing decision support . We evaluate on real data from three clinical cohorts using event-level recall and false alarms per patient-day . We show that models appearing acceptable overall, with recall above 0.9 on the full test set, can fail badly in the post-bolus slice .
🟢 Applied

Knowing when to trust machine-learned interatomic potentials

💡 This research estimates the reliability of machine-learned predictions.
Machine-learned uncertainty-quantification methods rely on ensembles of independently trained backbones . These methods scale unfavorably with foundation-scale MLIPs, and their member-disagreement signals correlate weakly with per-molecule prediction error . The resulting method, PROBE (Post-hoc Reliability frOm Backbone Embeddings), produces a per-prediction reliability probability that monotonically tracks actual error without modification to the underlying
🟢 Applied

Fairness of Classifiers in the Presence of Constraints between Features

💡 This research explores techniques in machine learning.
In Machine Learning, an accepted definition of fairness of a decision taken by a classifier is that it should not depend on protected features, such as gender . We propose that a decision be considered fair if it has a fair explanation . We identify relationships between different definitions of fairness and study the computational complexity of testing fairness of classifiers .
🟢 Applied

Jailbreaking Vision-Language Models Through the Visual Modality

💡 This research explores techniques in language AI.
The visual modality of vision-language models (VLMs) is an underexplored attack surface for bypassing safety alignment . We introduce four jailbreak attacks exploiting the vision component . They include encoding harmful instructions as visual symbol sequences with a decoding legend .
🟡 Advanced

Beyond Continuity: Simulation-free Reconstruction of Discrete Branching Dynamics from Single-cell Snapshots

💡 This research reconstructs cell trajectories from single-cell snapshots.
Inferring cellular trajectories from destructive snapshots is complicated by the challenges of stochasticity and non-conservative mass dynamics such as cell proliferation and apoptosis . Existing unbalanced Optimal Transport (OT) methods treat mass as a continuous fluid, performing inference at the population level . But this macroscopic view often fails to capture the discrete, jump-like nature of birth-death events at single-cell resolution . We present Unbalanced Schrödinger Bridge (USB
🟢 Applied

Vesselpose: Vessel Graph Reconstruction from Learned Voxel-wise Direction Vectors in 3D Vascular Images

💡 This research explores techniques in computer vision.
The prevailing segment-then-fix paradigm is fundamentally limited regarding its suitability for modeling the task of complete and topologically accurate vascular network reconstruction . We propose an approach to extract topologically more accurate vascular graphs from 3D image data . Our approach achieves state-of-the-art performance on three benchmark datasets, spanning both synthetic and real imagery .
🟢 Applied

Multi-frame Restoration for High-rate Lissajous Confocal Laser Endomicroscopy

💡 This research explores techniques in computer vision.
Lissajous confocal laser endomicroscopy (CLE) is a promising solution for high speed in vivo optical biopsy for handheld scenarios . However, at high frame rates, many pixels remain unvisited, creating structured holes . We propose MIRA, a lightweight recurrent framework for CLE restoration . MIRA outperforms both lightweight and high-complexity baselines in restoration quality .
🟢 Applied

End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer

💡 This research optimizes computer vision.
Autoregressive image modeling relies on visual tokenizers to compress images into compact latent representations . We design an end-to-end training pipeline that jointly optimizes reconstruction and generation . This contrasts with prior two-stage approaches that train tokenizers and generative models separately .
🟢 Applied

LambdaRankIC: Directly Optimizing Rank IC for Financial Prediction

💡 This research applies machine learning to financial forecasting.
In financial predictions, the performance of machine learning models is often assessed by Rank IC . Rank IC is the Spearman rank correlation between the model predictions and the realized asset returns . We propose LambdaRankIC, a novel learning-to-rank approach that directly optimizes Rank IC .
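Since the entry defines Rank IC as the Spearman rank correlation between predictions and realized returns, a small dependency-free sketch of that metric may help. It ranks both series (averaging ranks over ties) and takes the Pearson correlation of the ranks. The function names and sample numbers are assumptions for illustration; this evaluates the metric only and is not the LambdaRankIC training objective.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_ic(predictions, returns):
    """Spearman rank correlation; undefined (division by zero) if either input is constant."""
    rp, rr = average_ranks(predictions), average_ranks(returns)
    n = len(rp)
    mean_p, mean_r = sum(rp) / n, sum(rr) / n
    cov = sum((a - mean_p) * (b - mean_r) for a, b in zip(rp, rr))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_r = sum((b - mean_r) ** 2 for b in rr)
    return cov / (var_p * var_r) ** 0.5


# Hypothetical cross-section: model scores and realized returns for four assets.
scores = [0.10, 0.40, 0.30, -0.20]
realized = [0.01, 0.05, 0.02, -0.03]
print(rank_ic(scores, realized))
```

Because only the ordering matters, any monotone rescaling of the scores leaves Rank IC unchanged, which is why optimizing it directly calls for a ranking-style objective rather than a pointwise regression loss.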
🟢 Applied

Distance metric learning for conditional anomaly detection

💡 This research proposes a method for machine learning.
A recently proposed conditional anomaly detection framework extends anomaly detection to the problem of identifying anomalous patterns on a subset of attributes in the data . The anomaly always depends (is conditioned) on the value of remaining attributes . The work presented in this paper focuses on instance-based methods for detecting conditional anomalies .
🟢 Applied

Trading off rewards and errors in multi-armed bandits

💡 This research presents techniques for machine learning.
In multi-armed bandits, the most-explored arms are the most informative, while reward maximization typically pulls only the best arm . We present an algorithm with regret guarantees that interpolates between the two objectives .
🟢 Applied

Near-optimal and Efficient First-Order Algorithm for Multi-Task Learning with Shared Linear Representation

💡 This research makes machine learning more efficient.
Multi-task learning (MTL) has emerged as a pivotal paradigm in machine learning by leveraging shared structures across multiple related tasks . Despite its empirical success, the development of likelihood-based efficiently solvable algorithms remains largely underdeveloped . This paper introduces a first-order algorithm that jointly learns a shared representation and task-specific parameters .