CS 329A: Self-Improving AI Agents
2 min; updated Jul 11, 2025
Stanford CS329A: Self-Improving AI Agents. Azalia Mirhoseini, Aakanksha Chowdhery, Mert Yuksekgonul, Jon Saad-Falcon. cs329a.stanford.edu. Accessed Jul 11, 2025.

Test-time Compute Scaling
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
- Archon: An Architecture Search Framework for Inference-Time Techniques
- Scaling LLM Test-Time Compute Optimally Can Be More Effective Than Scaling Model Parameters

Self-Improvement Techniques with Verifiers
- Training Verifiers to Solve Math Word Problems
- Let’s Verify Step by Step
- Math-Shepherd: Verify and Reinforce LLMs Step-by-step Without Human Annotations

Self-Improvement Techniques with RL
- Constitutional AI: Harmlessness from AI Feedback
- STaR: Bootstrapping Reasoning With Reasoning
- Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Self-Improvement Techniques with Search
- Thinking Fast and Slow with Deep Learning and Tree Search
- Competitive-level Code Generation with AlphaCode
- AlphaCode 2 Technical Report

Open-ended Agent Learning in the Era of Foundation Models
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
- Automated Design of Agentic Systems

Augmenting LLMs with Tool Use/Actions
- ReAct: Synergizing Reasoning and Acting in Language Models
- Toolformer: Language Models Can Teach Themselves to Use Tools
- RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning

Planning and Multi-Step Reasoning
- Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models
- LLMs Still Can’t Plan; Can LRMs?...