AI 29
- Do Transformers Need Three Projections? - Sharing QKV projections to halve the KV cache
- Hierarchical Reasoning Model - A brain-inspired architecture for hierarchical latent reasoning
- Hallucinations Undermine Trust; Metacognition is a Way Forward - Redefining hallucination through Faithful Uncertainty
- MCP (Model Context Protocol): Concepts and Structure
- Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets
- TurboQuant: Online vector quantization approaching the information-theoretic optimum
- Connecting structural analysis software with 270 APIs to an LLM - Building a GEN NX MCP server
- Stanford CME295: Lecture 9 - Recap & Current Trends
- Prompt Repetition Improves Non-Reasoning LLMs
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
- REFRAG: Rethinking RAG based Decoding
- Do As We Do, Not As You Think: The Conformity of Large Language Models
- Talk Isn't Always Cheap: Understanding Failure Modes in Multi-Agent Debate
- How we built our multi-agent research system
- Improving Factuality and Reasoning in Language Models through Multiagent Debate
- LLM Quantization Guide
- Stanford CME295: Lecture 8 - LLM Evaluation
- Stanford CME295: Lecture 7 - Agentic LLMs (RAG, Tool Calling, Agents)
- Visual Studio 2022 Copilot vs Copilot CLI: Architecture Comparison
- Stanford CME295: Lecture 6 - LLM Reasoning
- Stanford CME295: Lecture 5 - LLM Tuning (Preference Tuning)
- Stanford CME295: Lecture 4 - LLM Training
- Stanford CME295: Lecture 3 - LLMs & Inference Optimization
- Stanford CME295: Lecture 2 - Transformer-Based Models & Tricks
- Stanford CME295: Lecture 1 - Transformer Basics
- Stanford CME295: Lecture 0 - Transformer Overview
- AI Coding Tool Tips - Fixing the cache encryption problem
- Understanding How AI Coding Tools Work
- Exploring How LLMs Work