{"papers":[{"title":"Diverse Dictionary Learning","url":"https://huggingface.co/papers/2604.17568","source":"HF Daily","date":"2026-04-18","authors":"Yujia Zheng et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.17568"},{"title":"SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution","url":"https://huggingface.co/papers/2604.18982","source":"HF Daily","date":"2026-04-20","authors":"Xiachong Feng et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.18982"},{"title":"ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis","url":"https://huggingface.co/papers/2604.19720","source":"HF Daily","date":"2026-04-20","authors":"Zhengwentai Sun et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19720"},{"title":"Visual Reasoning through Tool-supervised Reinforcement Learning","url":"https://huggingface.co/papers/2604.19945","source":"HF Daily","date":"2026-04-20","authors":"Qihua Dong et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19945"},{"title":"Scaling Test-Time Compute for Agentic Coding","url":"https://huggingface.co/papers/2604.16529","source":"HF Daily","date":"2026-04-15","authors":"Joongwon Kim et al.","is_new":false,"pwc_url":"https://paperswithcode.com/paper/2604.16529"},{"title":"Near-Future Policy Optimization","url":"https://huggingface.co/papers/2604.20733","source":"HF Daily","date":"2026-04-21","authors":"Chuanyu Qin et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20733"},{"title":"Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges","url":"https://huggingface.co/papers/2604.13602","source":"HF Daily","date":"2026-04-14","authors":"Xiaohua Wang et al.","is_new":false,"pwc_url":"https://paperswithcode.com/paper/2604.13602"},{"title":"Exploring Spatial Intelligence from a Generative Perspective","url":"https://huggingface.co/papers/2604.20570","source":"HF 
Daily","date":"2026-04-21","authors":"Muzhi Zhu et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20570"},{"title":"Convergent Evolution: How Different Language Models Learn Similar Number Representations","url":"https://huggingface.co/papers/2604.20817","source":"HF Daily","date":"2026-04-21","authors":"Deqing Fu et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20817"},{"title":"WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training","url":"https://huggingface.co/papers/2604.14932","source":"HF Daily","date":"2026-04-15","authors":"Yifu Chen et al.","is_new":false,"pwc_url":"https://paperswithcode.com/paper/2604.14932"},{"title":"Diagnosing CFG Interpretation in LLMs","url":"http://arxiv.org/abs/2604.20811v1","source":"arXiv","date":"2026-04-22","authors":"Hanqi Li et al.","abstract":"As LLMs are increasingly integrated into agentic systems, they must adhere to dynamically defined, machine-interpretable interfaces. We evaluate LLMs as in-context interpreters: given a novel context-free grammar, can LLMs generate syntactically valid, behaviorally functional, and semantically faithful outputs? We introduce RoboGrid, a framework that disentangles syntax, behavior, and semantics through controlled stress-tests of recursion depth, expression complexity, and surface styles. Our exp","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20811"},{"title":"Relative Principals, Pluralistic Alignment, and the Structural Value Alignment Problem","url":"http://arxiv.org/abs/2604.20805v1","source":"arXiv","date":"2026-04-22","authors":"Travis LaCroix et al.","abstract":"The value alignment problem for artificial intelligence (AI) is often framed as a purely technical or normative challenge, sometimes focused on hypothetical future systems. 
I argue that the problem is better understood as a structural question about governance: not whether an AI system is aligned in the abstract, but whether it is aligned enough, for whom, and at what cost. Drawing on the principal-agent framework from economics, this paper reconceptualises misalignment as arising along three in","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20805"},{"title":"Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelli","url":"http://arxiv.org/abs/2604.20795v1","source":"arXiv","date":"2026-04-22","authors":"Pavel Salovskii et al.","abstract":"This paper presents a hybrid architecture for intelligent systems in which large language models (LLMs) are extended with an external ontological memory layer. Instead of relying solely on parametric knowledge and vector-based retrieval (RAG), the proposed approach constructs and maintains a structured knowledge graph using RDF/OWL representations, enabling persistent, verifiable, and semantically grounded reasoning. The core contribution is an automated pipeline for ontology construction from h","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20795"},{"title":"SWE-chat: Coding Agent Interactions From Real Users in the Wild","url":"http://arxiv.org/abs/2604.20779v1","source":"arXiv","date":"2026-04-22","authors":"Joachim Baumann et al.","abstract":"AI coding agents are being adopted at scale, yet we lack empirical evidence on how people actually use them and how much of their output is useful in practice. We present SWE-chat, the first large-scale dataset of real coding agent sessions collected from open-source developers in the wild. The dataset currently contains 6,000 sessions, comprising more than 63,000 user prompts and 355,000 agent tool calls. 
SWE-chat is a living dataset; our collection pipeline automatically and continually discov","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20779"},{"title":"Interval POMDP Shielding for Imperfect-Perception Agents","url":"http://arxiv.org/abs/2604.20728v1","source":"arXiv","date":"2026-04-22","authors":"William Scarbro et al.","abstract":"Autonomous systems that rely on learned perception can make unsafe decisions when sensor readings are misclassified. We study shielding for this setting: given a proposed action, a shield blocks actions that could violate safety. We consider the common case where system dynamics are known but perception uncertainty must be estimated from finite labeled data. From these data we build confidence intervals for the probabilities of perception outcomes and use them to model the system as a finite Int","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20728"},{"title":"Supplement Generation Training for Enhancing Agentic Task Performance","url":"http://arxiv.org/abs/2604.20727v1","source":"arXiv","date":"2026-04-22","authors":"Young Min Cho et al.","abstract":"Training large foundation models for agentic tasks is increasingly impractical due to the high computational costs, long iteration cycles, and rapid obsolescence as new models are continuously released. Instead of post-training massive models for every new task or domain, we propose Supplement Generation Training (SGT), a more efficient and sustainable strategy. 
SGT trains a smaller LLM to generate useful supplemental text that, when appended to the original input, helps the larger LLM solve the","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20727"},{"title":"Learning to Evolve: A Self-Improving Framework for Multi-Agent Systems via Textual Parameter Graph Optimization","url":"http://arxiv.org/abs/2604.20714v1","source":"arXiv","date":"2026-04-22","authors":"Shan He et al.","abstract":"Designing and optimizing multi-agent systems (MAS) is a complex, labor-intensive process of \"Agent Engineering.\" Existing automatic optimization methods, primarily focused on flat prompt tuning, lack the structural awareness to debug the intricate web of interactions in MAS. More critically, these optimizers are static; they do not learn from experience to improve their own optimization strategies. To address these gaps, we introduce Textual Parameter Graph Optimization (TPGO), a framework that ","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20714"},{"title":"A Field Guide to Decision Making","url":"http://arxiv.org/abs/2604.20669v1","source":"arXiv","date":"2026-04-22","authors":"Richard B. Arthur et al.","abstract":"High-consequence decision making demands peak performance from individuals in positions of responsibility. Such executive authority bears the obligation to act despite uncertainty, limited resources, time constraints, and accountability risks. Tools and strategies to motivate confidence and foster risk tolerance must confront informational noise and can provide qualified accountability. 
Machine intelligence augments human cognition and perception to improve situational awareness, decision framin","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20669"}],"categorized_papers":{"agent_research":[{"title":"Diagnosing CFG Interpretation in LLMs","url":"http://arxiv.org/abs/2604.20811v1","source":"arXiv","date":"2026-04-22","authors":"Hanqi Li et al.","abstract":"As LLMs are increasingly integrated into agentic systems, they must adhere to dynamically defined, machine-interpretable interfaces. We evaluate LLMs as in-context interpreters: given a novel context-free grammar, can LLMs generate syntactically valid, behaviorally functional, and semantically faithful outputs? We introduce RoboGrid, a framework that disentangles syntax, behavior, and semantics through controlled stress-tests of recursion depth, expression complexity, and surface styles. Our exp","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20811"},{"title":"Relative Principals, Pluralistic Alignment, and the Structural Value Alignment Problem","url":"http://arxiv.org/abs/2604.20805v1","source":"arXiv","date":"2026-04-22","authors":"Travis LaCroix et al.","abstract":"The value alignment problem for artificial intelligence (AI) is often framed as a purely technical or normative challenge, sometimes focused on hypothetical future systems. I argue that the problem is better understood as a structural question about governance: not whether an AI system is aligned in the abstract, but whether it is aligned enough, for whom, and at what cost. 
Drawing on the principal-agent framework from economics, this paper reconceptualises misalignment as arising along three in","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20805"},{"title":"Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelli","url":"http://arxiv.org/abs/2604.20795v1","source":"arXiv","date":"2026-04-22","authors":"Pavel Salovskii et al.","abstract":"This paper presents a hybrid architecture for intelligent systems in which large language models (LLMs) are extended with an external ontological memory layer. Instead of relying solely on parametric knowledge and vector-based retrieval (RAG), the proposed approach constructs and maintains a structured knowledge graph using RDF/OWL representations, enabling persistent, verifiable, and semantically grounded reasoning. The core contribution is an automated pipeline for ontology construction from h","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20795"},{"title":"SWE-chat: Coding Agent Interactions From Real Users in the Wild","url":"http://arxiv.org/abs/2604.20779v1","source":"arXiv","date":"2026-04-22","authors":"Joachim Baumann et al.","abstract":"AI coding agents are being adopted at scale, yet we lack empirical evidence on how people actually use them and how much of their output is useful in practice. We present SWE-chat, the first large-scale dataset of real coding agent sessions collected from open-source developers in the wild. The dataset currently contains 6,000 sessions, comprising more than 63,000 user prompts and 355,000 agent tool calls. 
SWE-chat is a living dataset; our collection pipeline automatically and continually discov","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20779"},{"title":"Interval POMDP Shielding for Imperfect-Perception Agents","url":"http://arxiv.org/abs/2604.20728v1","source":"arXiv","date":"2026-04-22","authors":"William Scarbro et al.","abstract":"Autonomous systems that rely on learned perception can make unsafe decisions when sensor readings are misclassified. We study shielding for this setting: given a proposed action, a shield blocks actions that could violate safety. We consider the common case where system dynamics are known but perception uncertainty must be estimated from finite labeled data. From these data we build confidence intervals for the probabilities of perception outcomes and use them to model the system as a finite Int","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20728"},{"title":"Supplement Generation Training for Enhancing Agentic Task Performance","url":"http://arxiv.org/abs/2604.20727v1","source":"arXiv","date":"2026-04-22","authors":"Young Min Cho et al.","abstract":"Training large foundation models for agentic tasks is increasingly impractical due to the high computational costs, long iteration cycles, and rapid obsolescence as new models are continuously released. Instead of post-training massive models for every new task or domain, we propose Supplement Generation Training (SGT), a more efficient and sustainable strategy. 
SGT trains a smaller LLM to generate useful supplemental text that, when appended to the original input, helps the larger LLM solve the","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20727"},{"title":"Learning to Evolve: A Self-Improving Framework for Multi-Agent Systems via Textual Parameter Graph Optimization","url":"http://arxiv.org/abs/2604.20714v1","source":"arXiv","date":"2026-04-22","authors":"Shan He et al.","abstract":"Designing and optimizing multi-agent systems (MAS) is a complex, labor-intensive process of \"Agent Engineering.\" Existing automatic optimization methods, primarily focused on flat prompt tuning, lack the structural awareness to debug the intricate web of interactions in MAS. More critically, these optimizers are static; they do not learn from experience to improve their own optimization strategies. To address these gaps, we introduce Textual Parameter Graph Optimization (TPGO), a framework that ","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20714"},{"title":"A Field Guide to Decision Making","url":"http://arxiv.org/abs/2604.20669v1","source":"arXiv","date":"2026-04-22","authors":"Richard B. Arthur et al.","abstract":"High-consequence decision making demands peak performance from individuals in positions of responsibility. Such executive authority bears the obligation to act despite uncertainty, limited resources, time constraints, and accountability risks. Tools and strategies to motivate confidence and foster risk tolerance must confront informational noise and can provide qualified accountability. 
Machine intelligence augments human cognition and perception to improve situational awareness, decision framin","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20669"}],"llm_models":[{"title":"SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation","url":"http://arxiv.org/abs/2604.20842v1","source":"arXiv","date":"2026-04-22","authors":"Ruohan Liu et al.","abstract":"Paralinguistic cues are essential for natural human-computer interaction, yet their evaluation in Large Audio-Language Models (LALMs) remains limited by coarse feature coverage and the inherent subjectivity of assessment. To address these challenges, we introduce SpeechParaling-Bench, a comprehensive benchmark for paralinguistic-aware speech generation. It expands existing coverage from fewer than 50 to over 100 fine-grained features, supported by more than 1,000 English-Chinese parallel speech ","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20842"},{"title":"Parallel-SFT: Improving Zero-Shot Cross-Programming-Language Transfer for Code RL","url":"http://arxiv.org/abs/2604.20835v1","source":"arXiv","date":"2026-04-22","authors":"Zhaofeng Wu et al.","abstract":"Modern language models demonstrate impressive coding capabilities in common programming languages (PLs), such as C++ and Python, but their performance in lower-resource PLs is often limited by training data availability. In principle, however, most programming skills are universal across PLs, so the capability acquired in one PL should transfer to others. In this work, we propose the task of zero-shot cross-programming-language transfer for code RL. 
We find that, for Llama-3.1, RL training for c","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20835"},{"title":"AVISE: Framework for Evaluating the Security of AI Systems","url":"http://arxiv.org/abs/2604.20833v1","source":"arXiv","date":"2026-04-22","authors":"Mikko Lempinen et al.","abstract":"As artificial intelligence (AI) systems are increasingly deployed across critical domains, their security vulnerabilities pose growing risks of high-profile exploits and consequential system failures. Yet systematic approaches to evaluating AI security remain underdeveloped. In this paper, we introduce AVISE (AI Vulnerability Identification and Security Evaluation), a modular open-source framework for identifying vulnerabilities in and evaluating the security of AI systems and models. As a demon","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20833"},{"title":"Convergent Evolution: How Different Language Models Learn Similar Number Representations","url":"http://arxiv.org/abs/2604.20817v1","source":"arXiv","date":"2026-04-22","authors":"Deqing Fu et al.","abstract":"Language models trained on natural text learn to represent numbers using periodic features with dominant periods at $T=2, 5, 10$. In this paper, we identify a two-tiered hierarchy of these features: while Transformers, Linear RNNs, LSTMs, and classical word embeddings trained in different ways all learn features that have period-$T$ spikes in the Fourier domain, only some learn geometrically separable features that can be used to linearly classify a number mod-$T$. 
To explain this incongruity, w","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20817"},{"title":"OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model","url":"http://arxiv.org/abs/2604.20806v1","source":"arXiv","date":"2026-04-22","authors":"Qiguang Chen et al.","abstract":"Large vision-language models (LVLMs) have made substantial advances in reasoning tasks at the Olympiad level. Nevertheless, current Olympiad-level multimodal reasoning benchmarks for these models often emphasize single-image analysis and fail to exploit contextual information across multiple images. We present OMIBench, a benchmark designed to evaluate Olympiad-level reasoning when the required evidence is distributed over multiple images. It contains problems from biology, chemistry, mathematic","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20806"},{"title":"Can \"AI\" Be a Doctor? A Study of Empathy, Readability, and Alignment in Clinical LLMs","url":"http://arxiv.org/abs/2604.20791v1","source":"arXiv","date":"2026-04-22","authors":"Mariano Barone et al.","abstract":"Large Language Models (LLMs) are increasingly deployed in healthcare, yet their communicative alignment with clinical standards remains insufficiently quantified. We conduct a multidimensional evaluation of general-purpose and domain-specialized LLMs across structured medical explanations and real-world physician-patient interactions, analyzing semantic fidelity, readability, and affective resonance. 
Baseline models amplify affective polarity relative to physicians (Very Negative: 43.14-45.10% v","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20791"},{"title":"Working Memory Constraints Scaffold Learning in Transformers under Data Scarcity","url":"http://arxiv.org/abs/2604.20789v1","source":"arXiv","date":"2026-04-22","authors":"Pranava Madhyastha et al.","abstract":"We investigate the integration of human-like working memory constraints into the Transformer architecture and implement several cognitively inspired attention variants, including fixed-width windows based and temporal decay based attention mechanisms. Our modified GPT-2 models are trained from scratch on developmentally plausible datasets (10M and 100M words). Performance is evaluated on grammatical judgment tasks (BLiMP) and alignment with human reading time data. Our results indicate that thes","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20789"},{"title":"RespondeoQA: a Benchmark for Bilingual Latin-English Question Answering","url":"http://arxiv.org/abs/2604.20738v1","source":"arXiv","date":"2026-04-22","authors":"Marisa Hudspeth et al.","abstract":"We introduce a benchmark dataset for question answering and translation in bilingual Latin and English settings, containing about 7,800 question-answer pairs. The questions are drawn from Latin pedagogical sources, including exams, quizbowl-style trivia, and textbooks ranging from the 1800s to the present. 
After automated extraction, cleaning, and manual review, the dataset covers a diverse range of question types: knowledge- and skill-based, multihop reasoning, constrained translation, and mixe","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20738"}],"machine_learning":[{"title":"FedSIR: Spectral Client Identification and Relabeling for Federated Learning with Noisy Labels","url":"http://arxiv.org/abs/2604.20825v1","source":"arXiv","date":"2026-04-22","authors":"Sina Gholami et al.","abstract":"Federated learning (FL) enables collaborative model training without sharing raw data; however, the presence of noisy labels across distributed clients can severely degrade the learning performance. In this paper, we propose FedSIR, a multi-stage framework for robust FL under noisy labels. Different from existing approaches that mainly rely on designing noise-tolerant loss functions or exploiting loss dynamics during training, our method leverages the spectral structure of client feature represe","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20825"},{"title":"Closing the Domain Gap in Biomedical Imaging by In-Context Control Samples","url":"http://arxiv.org/abs/2604.20824v1","source":"arXiv","date":"2026-04-22","authors":"Ana Sanchez-Fernandez et al.","abstract":"The central problem in biomedical imaging are batch effects: systematic technical variations unrelated to the biological signal of interest. These batch effects critically undermine experimental reproducibility and are the primary cause of failure of deep learning systems on new experimental batches, preventing their practical use in the real world. Despite years of research, no method has succeeded in closing this performance gap for deep learning models. 
We propose Control-Stabilized Adaptive ","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20824"},{"title":"Global Offshore Wind Infrastructure: Deployment and Operational Dynamics from Dense Sentinel-1 Time Series","url":"http://arxiv.org/abs/2604.20822v1","source":"arXiv","date":"2026-04-22","authors":"Thorsten Hoeser et al.","abstract":"The offshore wind energy sector is expanding rapidly, increasing the need for independent, high-temporal-resolution monitoring of infrastructure deployment and operation at global scale. While Earth Observation based offshore wind infrastructure mapping has matured for spatial localization, existing open datasets lack temporally dense and semantically fine-grained information on construction and operational dynamics. We introduce a global Sentinel-1 synthetic aperture radar (SAR) time series dat","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20822"},{"title":"Stream-CQSA: Avoiding Out-of-Memory in Attention Computation via Flexible Workload Scheduling","url":"http://arxiv.org/abs/2604.20819v1","source":"arXiv","date":"2026-04-22","authors":"Yiming Bian et al.","abstract":"The scalability of long-context large language models is fundamentally limited by the quadratic memory cost of exact self-attention, which often leads to out-of-memory (OOM) failures on modern hardware. Existing methods improve memory efficiency to near-linear complexity, while assuming that the full query, key, and value tensors fit in device memory. 
In this work, we remove this assumption by introducing CQS Divide, an operation derived from cyclic quorum sets (CQS) theory that decomposes atten","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20819"},{"title":"Convergent Evolution: How Different Language Models Learn Similar Number Representations","url":"http://arxiv.org/abs/2604.20817v1","source":"arXiv","date":"2026-04-22","authors":"Deqing Fu et al.","abstract":"Language models trained on natural text learn to represent numbers using periodic features with dominant periods at $T=2, 5, 10$. In this paper, we identify a two-tiered hierarchy of these features: while Transformers, Linear RNNs, LSTMs, and classical word embeddings trained in different ways all learn features that have period-$T$ spikes in the Fourier domain, only some learn geometrically separable features that can be used to linearly classify a number mod-$T$. To explain this incongruity, w","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20817"},{"title":"ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control","url":"http://arxiv.org/abs/2604.20816v1","source":"arXiv","date":"2026-04-22","authors":"Shelly Golan et al.","abstract":"Reinforcement Learning (RL) post-training has become the standard for aligning generative models with human preferences, yet most methods rely on a single scalar reward. When multiple criteria matter, the prevailing practice of ``early scalarization'' collapses rewards into a fixed weighted sum. 
This commits the model to a single trade-off point at training time, providing no inference-time control over inherently conflicting goals -- such as prompt adherence versus source fidelity in image edit","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20816"}],"rag":[{"title":"Coverage, Not Averages: Semantic Stratification for Trustworthy Retrieval Evaluation","url":"http://arxiv.org/abs/2604.20763v1","source":"arXiv","date":"2026-04-22","authors":"Andrew Klearman et al.","abstract":"Retrieval quality is the primary bottleneck for accuracy and robustness in retrieval-augmented generation (RAG). Current evaluation relies on heuristically constructed query sets, which introduce a hidden intrinsic bias. We formalize retrieval evaluation as a statistical estimation problem, showing that metric reliability is fundamentally limited by the evaluation-set construction. We further introduce \\emph{semantic stratification}, which grounds evaluation in corpus structure by organizing doc","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20763"},{"title":"ORPHEAS: A Cross-Lingual Greek-English Embedding Model for Retrieval-Augmented Generation","url":"http://arxiv.org/abs/2604.20666v1","source":"arXiv","date":"2026-04-22","authors":"Ioannis E. Livieris et al.","abstract":"Effective retrieval-augmented generation across bilingual Greek--English applications requires embedding models capable of capturing both domain-specific semantic relationships and cross-lingual semantic alignment. Existing multilingual embedding models distribute their representational capacity across numerous languages, limiting their optimization for Greek and failing to encode the morphological complexity and domain-specific terminological structures inherent in Greek text. 
In this work, we ","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20666"},{"title":"RSRCC: A Remote Sensing Regional Change Comprehension Benchmark Constructed via Retrieval-Augmented Best-of-N Ranking","url":"http://arxiv.org/abs/2604.20623v1","source":"arXiv","date":"2026-04-22","authors":"Roie Kazoom et al.","abstract":"Traditional change detection identifies where changes occur, but does not explain what changed in natural language. Existing remote sensing change captioning datasets typically describe overall image-level differences, leaving fine-grained localized semantic reasoning largely unexplored. To close this gap, we present RSRCC, a new benchmark for remote sensing change question-answering containing 126k questions, split into 87k training, 17.1k validation, and 22k test instances. Unlike prior datase","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20623"},{"title":"Self-Aware Vector Embeddings for Retrieval-Augmented Generation: A Neuroscience-Inspired Framework for Temporal, Confide","url":"http://arxiv.org/abs/2604.20598v1","source":"arXiv","date":"2026-04-22","authors":"Naizhong Xu et al.","abstract":"Modern retrieval-augmented generation (RAG) systems treat vector embeddings as static, context-free artifacts: an embedding has no notion of when it was created, how trustworthy its source is, or which other embeddings depend on it. This flattening of knowledge has a measurable cost: recent work on VersionRAG reports that conventional RAG achieves only 58% accuracy on versioned technical queries, because retrieval returns semantically similar but temporally invalid content. 
We propose SmartVecto","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20598"},{"title":"Knowledge Capsules: Structured Nonparametric Memory Units for LLMs","url":"http://arxiv.org/abs/2604.20487v1","source":"arXiv","date":"2026-04-22","authors":"Bin Ju et al.","abstract":"Large language models (LLMs) encode knowledge in parametric weights, making it costly to update or extend without retraining. Retrieval-augmented generation (RAG) mitigates this limitation by appending retrieved text to the input, but operates purely through context expansion, where external knowledge competes as tokens within the attention mechanism. As a result, its influence is indirect and often unstable, particularly in long context and multi hop reasoning scenarios. We propose Knowledge Ca","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20487"}],"code_gen":[{"title":"WebGen-R1: Incentivizing Large Language Models to Generate Functional and Aesthetic Websites with Reinforcement Learning","url":"http://arxiv.org/abs/2604.20398v1","source":"arXiv","date":"2026-04-22","authors":"Juyong Jiang et al.","abstract":"While Large Language Models (LLMs) excel at function-level code generation, project-level tasks such as generating functional and visually aesthetic multi-page websites remain highly challenging. Existing works are often limited to single-page static websites, while agentic frameworks typically rely on multi-turn execution with proprietary models, leading to substantial token costs, high latency, and brittle integration. 
Training a small LLM end-to-end with reinforcement learning (RL) is a promi","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20398"},{"title":"CreativeGame:Toward Mechanic-Aware Creative Game Generation","url":"http://arxiv.org/abs/2604.19926v1","source":"arXiv","date":"2026-04-21","authors":"Hongnan Ma et al.","abstract":"Large language models can generate plausible game code, but turning this capability into \\emph{iterative creative improvement} remains difficult. In practice, single-shot generation often produces brittle runtime behavior, weak accumulation of experience across versions, and creativity scores that are too subjective to serve as reliable optimization signals. A further limitation is that mechanics are frequently treated only as post-hoc descriptions, rather than as explicit objects that can be pl","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19926"},{"title":"PlayCoder: Making LLM-Generated GUI Code Playable","url":"http://arxiv.org/abs/2604.19742v1","source":"arXiv","date":"2026-04-21","authors":"Zhiyuan Peng et al.","abstract":"Large language models (LLMs) have achieved strong results in code generation, but their ability to generate GUI applications, especially games, remains insufficiently studied. Existing benchmarks mainly evaluate correctness through test cases, which are inadequate for GUI applications because these systems are interactive, event-driven, and require correct state transitions across sequences of user actions. 
Their evaluation therefore should consider interaction flows and UI logic rather than onl","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19742"},{"title":"CASCADE: Detecting Inconsistencies between Code and Documentation with Automatic Test Generation","url":"http://arxiv.org/abs/2604.19400v1","source":"arXiv","date":"2026-04-21","authors":"Tobias Kiecker et al.","abstract":"Maintaining consistency between code and documentation is a crucial yet frequently overlooked aspect of software development. Even minor mismatches can confuse API users, introduce new bugs, and increase overall maintenance effort. This creates demand for automated solutions that can assist developers in identifying code-documentation inconsistencies. However, since automatic reports still require human confirmation, false positives carry serious consequences: wasting developer time and discoura","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19400"},{"title":"BONSAI: A Mixed-Initiative Workspace for Human-AI Co-Development of Visual Analytics Applications","url":"http://arxiv.org/abs/2604.19247v1","source":"arXiv","date":"2026-04-21","authors":"Thilo Spinner et al.","abstract":"Developing Visual Analytics (VA) applications requires integrating complex machine learning models with expressive interactive interfaces. Developers face a stark trade-off: building tightly-coupled monoliths plagued by fragile interdependencies, or relying on restrictive, simplistic frameworks. Meanwhile, unconstrained, single-shot AI code generation promises speed but yields unstructured, unauditable chaos. 
The core challenge is combining the control and expressiveness of custom development wi","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19247"}],"safety":[{"title":"SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation","url":"http://arxiv.org/abs/2604.20842v1","source":"arXiv","date":"2026-04-22","authors":"Ruohan Liu et al.","abstract":"Paralinguistic cues are essential for natural human-computer interaction, yet their evaluation in Large Audio-Language Models (LALMs) remains limited by coarse feature coverage and the inherent subjectivity of assessment. To address these challenges, we introduce SpeechParaling-Bench, a comprehensive benchmark for paralinguistic-aware speech generation. It expands existing coverage from fewer than 50 to over 100 fine-grained features, supported by more than 1,000 English-Chinese parallel speech ","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20842"},{"title":"Diagnosing CFG Interpretation in LLMs","url":"http://arxiv.org/abs/2604.20811v1","source":"arXiv","date":"2026-04-22","authors":"Hanqi Li et al.","abstract":"As LLMs are increasingly integrated into agentic systems, they must adhere to dynamically defined, machine-interpretable interfaces. We evaluate LLMs as in-context interpreters: given a novel context-free grammar, can LLMs generate syntactically valid, behaviorally functional, and semantically faithful outputs? We introduce RoboGrid, a framework that disentangles syntax, behavior, and semantics through controlled stress-tests of recursion depth, expression complexity, and surface styles. 
Our exp","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20811"},{"title":"Relative Principals, Pluralistic Alignment, and the Structural Value Alignment Problem","url":"http://arxiv.org/abs/2604.20805v1","source":"arXiv","date":"2026-04-22","authors":"Travis LaCroix et al.","abstract":"The value alignment problem for artificial intelligence (AI) is often framed as a purely technical or normative challenge, sometimes focused on hypothetical future systems. I argue that the problem is better understood as a structural question about governance: not whether an AI system is aligned in the abstract, but whether it is aligned enough, for whom, and at what cost. Drawing on the principal-agent framework from economics, this paper reconceptualises misalignment as arising along three in","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20805"},{"title":"Can \"AI\" Be a Doctor? A Study of Empathy, Readability, and Alignment in Clinical LLMs","url":"http://arxiv.org/abs/2604.20791v1","source":"arXiv","date":"2026-04-22","authors":"Mariano Barone et al.","abstract":"Large Language Models (LLMs) are increasingly deployed in healthcare, yet their communicative alignment with clinical standards remains insufficiently quantified. We conduct a multidimensional evaluation of general-purpose and domain-specialized LLMs across structured medical explanations and real-world physician-patient interactions, analyzing semantic fidelity, readability, and affective resonance. 
Baseline models amplify affective polarity relative to physicians (Very Negative: 43.14-45.10% v","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20791"},{"title":"Working Memory Constraints Scaffold Learning in Transformers under Data Scarcity","url":"http://arxiv.org/abs/2604.20789v1","source":"arXiv","date":"2026-04-22","authors":"Pranava Madhyastha et al.","abstract":"We investigate the integration of human-like working memory constraints into the Transformer architecture and implement several cognitively inspired attention variants, including fixed-width windows based and temporal decay based attention mechanisms. Our modified GPT-2 models are trained from scratch on developmentally plausible datasets (10M and 100M words). Performance is evaluated on grammatical judgment tasks (BLiMP) and alignment with human reading time data. Our results indicate that thes","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20789"}],"benchmarks":[{"title":"SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation","url":"http://arxiv.org/abs/2604.20842v1","source":"arXiv","date":"2026-04-22","authors":"Ruohan Liu et al.","abstract":"Paralinguistic cues are essential for natural human-computer interaction, yet their evaluation in Large Audio-Language Models (LALMs) remains limited by coarse feature coverage and the inherent subjectivity of assessment. To address these challenges, we introduce SpeechParaling-Bench, a comprehensive benchmark for paralinguistic-aware speech generation. 
It expands existing coverage from fewer than 50 to over 100 fine-grained features, supported by more than 1,000 English-Chinese parallel speech ","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20842"},{"title":"AVISE: Framework for Evaluating the Security of AI Systems","url":"http://arxiv.org/abs/2604.20833v1","source":"arXiv","date":"2026-04-22","authors":"Mikko Lempinen et al.","abstract":"As artificial intelligence (AI) systems are increasingly deployed across critical domains, their security vulnerabilities pose growing risks of high-profile exploits and consequential system failures. Yet systematic approaches to evaluating AI security remain underdeveloped. In this paper, we introduce AVISE (AI Vulnerability Identification and Security Evaluation), a modular open-source framework for identifying vulnerabilities in and evaluating the security of AI systems and models. As a demon","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20833"},{"title":"FedSIR: Spectral Client Identification and Relabeling for Federated Learning with Noisy Labels","url":"http://arxiv.org/abs/2604.20825v1","source":"arXiv","date":"2026-04-22","authors":"Sina Gholami et al.","abstract":"Federated learning (FL) enables collaborative model training without sharing raw data; however, the presence of noisy labels across distributed clients can severely degrade the learning performance. In this paper, we propose FedSIR, a multi-stage framework for robust FL under noisy labels. 
Different from existing approaches that mainly rely on designing noise-tolerant loss functions or exploiting loss dynamics during training, our method leverages the spectral structure of client feature represe","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20825"},{"title":"Diagnosing CFG Interpretation in LLMs","url":"http://arxiv.org/abs/2604.20811v1","source":"arXiv","date":"2026-04-22","authors":"Hanqi Li et al.","abstract":"As LLMs are increasingly integrated into agentic systems, they must adhere to dynamically defined, machine-interpretable interfaces. We evaluate LLMs as in-context interpreters: given a novel context-free grammar, can LLMs generate syntactically valid, behaviorally functional, and semantically faithful outputs? We introduce RoboGrid, a framework that disentangles syntax, behavior, and semantics through controlled stress-tests of recursion depth, expression complexity, and surface styles. Our exp","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20811"},{"title":"OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model","url":"http://arxiv.org/abs/2604.20806v1","source":"arXiv","date":"2026-04-22","authors":"Qiguang Chen et al.","abstract":"Large vision-language models (LVLMs) have made substantial advances in reasoning tasks at the Olympiad level. Nevertheless, current Olympiad-level multimodal reasoning benchmarks for these models often emphasize single-image analysis and fail to exploit contextual information across multiple images. We present OMIBench, a benchmark designed to evaluate Olympiad-level reasoning when the required evidence is distributed over multiple images. 
It contains problems from biology, chemistry, mathematic","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20806"}],"tool_use":[{"title":"R2IF: Aligning Reasoning with Decisions via Composite Rewards for Interpretable LLM Function Calling","url":"http://arxiv.org/abs/2604.20316v1","source":"arXiv","date":"2026-04-22","authors":"Aijia Cheng et al.","abstract":"Function calling empowers large language models (LLMs) to interface with external tools, yet existing RL-based approaches suffer from misalignment between reasoning processes and tool-call decisions. We propose R2IF, a reasoning-aware RL framework for interpretable function calling, adopting a composite reward integrating format/correctness constraints, Chain-of-Thought Effectiveness Reward (CER), and Specification-Modification-Value (SMV) reward, optimized via GRPO. Experiments on BFCL/ACEBench","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20316"},{"title":"Meta-Tool: Efficient Few-Shot Tool Adaptation for Small Language Models","url":"http://arxiv.org/abs/2604.20148v1","source":"arXiv","date":"2026-04-22","authors":"Sachin Kumar et al.","abstract":"Can small language models achieve strong tool-use performance without complex adaptation mechanisms? This paper investigates this question through Meta-Tool, a controlled empirical study comparing hypernetwork-based LoRA adaptation against carefully designed few-shot prompting. 
Using a Llama-3.2-3B-Instruct backbone, we evaluate four adaptation mechanisms--few-shot prompting, documentation encoding, hypernetwork-generated LoRA weights, and value-guided beam search--across four diverse benchmarks","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20148"},{"title":"SAKE: Self-aware Knowledge Exploitation-Exploration for Grounded Multimodal Named Entity Recognition","url":"http://arxiv.org/abs/2604.20146v1","source":"arXiv","date":"2026-04-22","authors":"Jielong Tang et al.","abstract":"Grounded Multimodal Named Entity Recognition (GMNER) aims to extract named entities and localize their visual regions within image-text pairs, serving as a pivotal capability for various downstream applications. In open-world social media platforms, GMNER remains challenging due to the prevalence of long-tailed, rapidly evolving, and unseen entities. To tackle this, existing approaches typically rely on either external knowledge exploration through heuristic retrieval or internal knowledge explo","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20146"},{"title":"Visual Reasoning through Tool-supervised Reinforcement Learning","url":"http://arxiv.org/abs/2604.19945v1","source":"arXiv","date":"2026-04-21","authors":"Qihua Dong et al.","abstract":"In this paper, we investigate the problem of how to effectively master tool-use to solve complex visual reasoning tasks for Multimodal Large Language Models. To achieve that, we propose a novel Tool-supervised Reinforcement Learning (ToolsRL) framework, with direct tool supervision for more effective tool-use learning. We focus on a series of simple, native, and interpretable visual tools, including zoom-in, rotate, flip, and draw point/line, whose tool supervision is easy to collect. 
A reinforc","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19945"},{"title":"Rethinking Reinforcement Fine-Tuning in LVLM: Convergence, Reward Decomposition, and Generalization","url":"http://arxiv.org/abs/2604.19857v1","source":"arXiv","date":"2026-04-21","authors":"Carter Adams et al.","abstract":"Reinforcement fine-tuning with verifiable rewards (RLVR) has emerged as a powerful paradigm for equipping large vision-language models (LVLMs) with agentic capabilities such as tool use and multi-step reasoning. Despite striking empirical successes, most notably Visual Agentic Reinforcement Fine-Tuning (Visual-ARFT), the theoretical underpinnings of this paradigm remain poorly understood. In particular, two critical questions lack rigorous answers: (i)~how does the composite structure of verifia","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19857"}]},"hf_papers":[{"title":"Diverse Dictionary Learning","url":"https://huggingface.co/papers/2604.17568","source":"HF Daily","date":"2026-04-18","authors":"Yujia Zheng et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.17568"},{"title":"SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution","url":"https://huggingface.co/papers/2604.18982","source":"HF Daily","date":"2026-04-20","authors":"Xiachong Feng et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.18982"},{"title":"ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis","url":"https://huggingface.co/papers/2604.19720","source":"HF Daily","date":"2026-04-20","authors":"Zhengwentai Sun et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19720"},{"title":"Visual Reasoning through Tool-supervised Reinforcement Learning","url":"https://huggingface.co/papers/2604.19945","source":"HF Daily","date":"2026-04-20","authors":"Qihua Dong et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.19945"},{"title":"Scaling 
Test-Time Compute for Agentic Coding","url":"https://huggingface.co/papers/2604.16529","source":"HF Daily","date":"2026-04-15","authors":"Joongwon Kim et al.","is_new":false,"pwc_url":"https://paperswithcode.com/paper/2604.16529"},{"title":"Near-Future Policy Optimization","url":"https://huggingface.co/papers/2604.20733","source":"HF Daily","date":"2026-04-21","authors":"Chuanyu Qin et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20733"},{"title":"Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges","url":"https://huggingface.co/papers/2604.13602","source":"HF Daily","date":"2026-04-14","authors":"Xiaohua Wang et al.","is_new":false,"pwc_url":"https://paperswithcode.com/paper/2604.13602"},{"title":"Exploring Spatial Intelligence from a Generative Perspective","url":"https://huggingface.co/papers/2604.20570","source":"HF Daily","date":"2026-04-21","authors":"Muzhi Zhu et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20570"},{"title":"Convergent Evolution: How Different Language Models Learn Similar Number Representations","url":"https://huggingface.co/papers/2604.20817","source":"HF Daily","date":"2026-04-21","authors":"Deqing Fu et al.","is_new":true,"pwc_url":"https://paperswithcode.com/paper/2604.20817"},{"title":"WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training","url":"https://huggingface.co/papers/2604.14932","source":"HF Daily","date":"2026-04-15","authors":"Yifu Chen et 
al.","is_new":false,"pwc_url":"https://paperswithcode.com/paper/2604.14932"}],"trending_models":[{"name":"Qwen/Qwen3.6-35B-A3B","url":"https://huggingface.co/Qwen/Qwen3.6-35B-A3B","downloads":582961,"task":"image-text-to-text","trendingScore":1203},{"name":"moonshotai/Kimi-K2.6","url":"https://huggingface.co/moonshotai/Kimi-K2.6","downloads":54456,"task":"image-text-to-text","trendingScore":806},{"name":"unsloth/Qwen3.6-35B-A3B-GGUF","url":"https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF","downloads":1112454,"task":"image-text-to-text","trendingScore":637},{"name":"Qwen/Qwen3.6-27B","url":"https://huggingface.co/Qwen/Qwen3.6-27B","downloads":0,"task":"image-text-to-text","trendingScore":517},{"name":"tencent/HY-World-2.0","url":"https://huggingface.co/tencent/HY-World-2.0","downloads":0,"task":"image-to-3d","trendingScore":451},{"name":"openai/privacy-filter","url":"https://huggingface.co/openai/privacy-filter","downloads":3,"task":"token-classification","trendingScore":370},{"name":"HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive","url":"https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive","downloads":312962,"task":"image-text-to-text","trendingScore":358},{"name":"OBLITERATUS/gemma-4-E4B-it-OBLITERATED","url":"https://huggingface.co/OBLITERATUS/gemma-4-E4B-it-OBLITERATED","downloads":79024,"task":"text-generation","trendingScore":299},{"name":"unsloth/Qwen3.6-27B-GGUF","url":"https://huggingface.co/unsloth/Qwen3.6-27B-GGUF","downloads":0,"task":"image-text-to-text","trendingScore":235},{"name":"google/gemma-4-31B-it","url":"https://huggingface.co/google/gemma-4-31B-it","downloads":5103971,"task":"image-text-to-text","trendingScore":213}],"trending_datasets":[{"name":"lambda/hermes-agent-reasoning-traces","url":"https://huggingface.co/datasets/lambda/hermes-agent-reasoning-traces","downloads":7289},{"name":"Roman1111111/claude-opus-4.6-10000x","url":"https://huggingface.co/datasets/Roman1111111/claude-opus-4.6-10000x","downloads":6646},
{"name":"Jackrong/GLM-5.1-Reasoning-1M-Cleaned","url":"https://huggingface.co/datasets/Jackrong/GLM-5.1-Reasoning-1M-Cleaned","downloads":1301},{"name":"Kassadin88/GLM-5.1-1000000x","url":"https://huggingface.co/datasets/Kassadin88/GLM-5.1-1000000x","downloads":923},{"name":"ShadenA/MathNet","url":"https://huggingface.co/datasets/ShadenA/MathNet","downloads":1724},{"name":"nvidia/Nemotron-Personas-Korea","url":"https://huggingface.co/datasets/nvidia/Nemotron-Personas-Korea","downloads":1044},{"name":"TeraflopAI/SEC-EDGAR","url":"https://huggingface.co/datasets/TeraflopAI/SEC-EDGAR","downloads":4674},{"name":"llamaindex/ParseBench","url":"https://huggingface.co/datasets/llamaindex/ParseBench","downloads":12577}],"trending_spaces":[{"name":"r3gm/wan2-2-fp8da-aoti-preview","url":"https://huggingface.co/spaces/r3gm/wan2-2-fp8da-aoti-preview","sdk":"gradio","likes":2306,"trendingScore":133},{"name":"k2-fsa/OmniVoice","url":"https://huggingface.co/spaces/k2-fsa/OmniVoice","sdk":"gradio","likes":663,"trendingScore":117},{"name":"webml-community/bonsai-webgpu","url":"https://huggingface.co/spaces/webml-community/bonsai-webgpu","sdk":"static","likes":156,"trendingScore":95},{"name":"smolagents/ml-intern","url":"https://huggingface.co/spaces/smolagents/ml-intern","sdk":"docker","likes":95,"trendingScore":81},{"name":"prithivMLmods/FireRed-Image-Edit-1.0-Fast","url":"https://huggingface.co/spaces/prithivMLmods/FireRed-Image-Edit-1.0-Fast","sdk":"gradio","likes":965,"trendingScore":79},{"name":"webml-community/bonsai-ternary-webgpu","url":"https://huggingface.co/spaces/webml-community/bonsai-ternary-webgpu","sdk":"static","likes":71,"trendingScore":66},{"name":"r3gm/wan2-2-fp8da-aoti-preview2","url":"https://huggingface.co/spaces/r3gm/wan2-2-fp8da-aoti-preview2","sdk":"gradio","likes":768,"trendingScore":61},{"name":"selfit-camera/Omni-Image-Editor","url":"https://huggingface.co/spaces/selfit-camera/Omni-Image-Editor","sdk":"gradio","likes":1500,"trendingScore":52},
{"name":"baidu/ERNIE-Image-Turbo","url":"https://huggingface.co/spaces/baidu/ERNIE-Image-Turbo","sdk":"gradio","likes":81,"trendingScore":51},{"name":"victor/ace-step-jam","url":"https://huggingface.co/spaces/victor/ace-step-jam","sdk":"gradio","likes":77,"trendingScore":51}],"fetched_at":"2026-04-23T08:25:12.783Z"}