Paper/NLP

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
"While reasoning models (e.g., DeepSeek R1) trained with reinforcement learning (RL) excel in textual reasoning, they struggle in scenarios requiring structured problem-solving, such as geometric reasoning, concise computation, or complex equation solving…" (arxiv.org)
2025. 7. 29.

A-MEM: Agentic Memory for LLM Agents
"While large language model (LLM) agents can effectively use external tools for complex real-world tasks, they require memory systems to leverage historical experiences. Current memory systems enable basic storage and retrieval but lack sophisticated memory…" (arxiv.org)
1. Introduction: With the progress of LLM agents, they can now interact with their environment, execute tasks, and make decisions. To improve their reasoning and planning abilities… (a minimal sketch of the basic storage-and-retrieval baseline this abstract contrasts against appears below)
2025. 3. 5.

Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
"In this work, we investigate whether small language models can determine high-quality subsets of large-scale text datasets that improve the performance of larger language models. While existing work has shown that pruning based on the perplexity of a large…" (arxiv.org)
1. Methods: Only a subset of the full dataset is used, and perplexity is… (see the pruning sketch below)
2025. 3. 5.

Data Selection for Language Models via Importance Resampling
"Selecting a suitable pretraining dataset is crucial for both general-domain (e.g., GPT-3) and domain-specific (e.g., Codex) language models (LMs). We formalize this problem as selecting a subset of a large raw unlabeled dataset to match a desired target di…" (arxiv.org)
1. Method: DSIR Framework: from a large raw dataset, select data that matches the distribution of the target data… (see the importance-resampling sketch below)
2025. 3. 5.
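The A-MEM abstract contrasts the paper's agentic memory against systems that do only "basic storage and retrieval." To make that baseline concrete, here is a minimal sketch of such a store; it is not A-MEM's design. `BasicMemory`, `hash_embed`, and the cosine-similarity retrieval are illustrative stand-ins, and a real system would use a learned embedding model.

```python
import numpy as np

def hash_embed(text: str, dim: int = 256) -> np.ndarray:
    # Hypothetical toy embedding: hashed bag-of-words vector.
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    return vec

class BasicMemory:
    # Plain storage-and-retrieval baseline: append records,
    # fetch the k most similar ones by cosine similarity.
    def __init__(self, embed=hash_embed):
        self.embed = embed
        self.keys, self.texts = [], []

    def store(self, text: str) -> None:
        self.keys.append(self.embed(text))
        self.texts.append(text)

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        q = self.embed(query)
        sims = [q @ key / (np.linalg.norm(q) * np.linalg.norm(key) + 1e-9)
                for key in self.keys]
        return [self.texts[i] for i in np.argsort(sims)[::-1][:k]]

if __name__ == "__main__":
    mem = BasicMemory()
    mem.store("User prefers concise answers.")
    mem.store("Earlier task: solved a geometry problem with code.")
    print(mem.retrieve("what does the user prefer?", k=1))
```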
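The Methods preview for Perplexed by Perplexity is cut off, but the abstract's idea, scoring documents with a small reference model's perplexity and pruning on that score, admits a short sketch. This assumes a HuggingFace causal LM as the reference model; the `gpt2` checkpoint, the `keep_fraction` value, and keeping the lowest-perplexity slice are illustrative choices, since the paper studies multiple selection criteria.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model, tokenizer, text: str) -> float:
    # Perplexity of one document under the small reference model.
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

def prune_by_perplexity(texts, model_name="gpt2", keep_fraction=0.5):
    # Score every document with the small model, then keep the fraction
    # with the lowest perplexity (one possible selection criterion).
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()
    scored = sorted(texts, key=lambda t: perplexity(model, tokenizer, t))
    return scored[: int(len(scored) * keep_fraction)]

if __name__ == "__main__":
    corpus = ["The cat sat on the mat.",
              "asdf qwer zxcv uiop",
              "Paris is the capital of France."]
    print(prune_by_perplexity(corpus, keep_fraction=0.66))
```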
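The DSIR preview is also truncated, but the abstract states the framework selects a subset of a large raw dataset to match a target distribution. Below is a minimal importance-resampling sketch under that reading: hashed unigram/bigram bag-of-words models stand in for the feature estimators, and the bucket count, smoothing constant, and Gumbel top-k sampling step are illustrative assumptions, not the paper's exact recipe.

```python
import hashlib
import numpy as np

BUCKETS = 4096  # hashed n-gram feature dimension (illustrative)

def features(text: str) -> np.ndarray:
    # Hash unigrams and bigrams into a fixed number of buckets.
    toks = text.lower().split()
    grams = toks + [" ".join(p) for p in zip(toks, toks[1:])]
    vec = np.zeros(BUCKETS)
    for g in grams:
        vec[int(hashlib.md5(g.encode()).hexdigest(), 16) % BUCKETS] += 1
    return vec

def log_prob_model(texts, alpha=1.0):
    # Smoothed bag-of-n-grams log-probabilities over hash buckets.
    counts = sum(features(t) for t in texts) + alpha
    return np.log(counts / counts.sum())

def dsir_select(raw_texts, target_texts, k, seed=0):
    # Importance weight of each raw example under the bag model:
    # log p_target(x) - log p_raw(x). Then draw k examples without
    # replacement, proportional to weight, via the Gumbel top-k trick.
    lp_target = log_prob_model(target_texts)
    lp_raw = log_prob_model(raw_texts)
    log_w = np.array([features(t) @ (lp_target - lp_raw) for t in raw_texts])
    rng = np.random.default_rng(seed)
    top = np.argsort(-(log_w + rng.gumbel(size=len(raw_texts))))[:k]
    return [raw_texts[i] for i in top]
```

The Gumbel perturbation makes the top-k selection a sample rather than a hard threshold, so near-target examples still have a chance of being picked; dropping it reduces the sketch to deterministic top-k filtering by importance weight.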