Sasha Rush on Building Cursor Composer and the Future of Agentic Coding

[HPP] Sasha Rush · January 12, 2026 · 12 min

Vision and Core Principles of Composer

  • 💡 The primary vision for Cursor Composer was to create a smart, fast, and agentic coding model, building on the success of Cursor Tab's snappy autocomplete feature.
  • ⚡ Speed was a critical design decision, as a fast model allows developers to stay in the flow of coding and iterate quickly on solutions.
  • ✅ The goal was to develop a model that was not only fast but also smart enough to be trusted for writing code, addressing limitations of previous fast but less capable models.

Architectural Innovations

  • 🧠 Composer uses Reinforcement Learning (RL) to specialize the model for coding tasks, moving beyond general-purpose abilities to excel in a specific domain.
  • 🧩 The model incorporates a Mixture of Experts (MoE) architecture, an extension of the core transformer design in which a set of expert sub-networks is available but only a subset is activated for each computation.
  • 🚀 MoE makes computation more efficient and easier to distribute: experts can be sharded across different GPUs, which makes training on many GPUs more effective.
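
The "only a subset is activated" idea can be illustrated with a small top-k routing sketch. This is a toy in plain Python, not Composer's architecture: the experts here are simple scaling functions standing in for feed-forward networks, and the names `make_expert`, `moe_forward`, and `router_weights` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: scales its input (stand-in for a feed-forward net)."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, router_weights, experts, top_k=2):
    """Route x to the top_k highest-scoring experts and mix their outputs."""
    # One router score per expert: dot product of that expert's row with x.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep only the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the selected scores into mixing weights (gates).
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)
        out = [o + gate * v for o, v in zip(out, y)]
    return out, chosen


if __name__ == "__main__":
    experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
    router = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
    out, chosen = moe_forward([1.0, 2.0], router, experts, top_k=2)
    print("experts used:", chosen, "output:", out)
```

Because each input touches only `top_k` experts, the remaining experts can sit on other devices without taking part in that computation, which is the property the sharding point above relies on.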

Agentic Capabilities and Tool Use

  • 🛠️ Composer is designed as an agentic system, differing from simple LLMs by its ability to persistently call numerous tools (e.g., 150+) to find answers.
  • 🔍 These tools include capabilities like searching large codebases and running terminal commands, enabling the model to perform tasks that humans might find tedious or time-consuming.
  • 🎯 The agent's power lies in its persistence in trying to find the correct answer, making it highly effective for complex coding challenges.
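
The persistent tool-calling behavior described above can be sketched as a loop that keeps asking a policy for the next action until it decides to finish. Everything here is an illustrative assumption: `scripted_policy` stands in for the model, the two tools are fakes that never touch a real codebase or shell, and the step budget is arbitrary.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (tool_name, argument, observation)


def search_codebase(query: str) -> str:
    """Fake codebase search over an in-memory dict of file contents."""
    files = {
        "utils.py": "def parse_config(path): ...",
        "main.py": "from utils import parse_config",
    }
    hits = [name for name, text in files.items() if query in text]
    return ", ".join(hits) or "no matches"


def run_terminal(command: str) -> str:
    """Fake terminal: returns a placeholder instead of executing anything."""
    return f"(pretend output of `{command}`)"


TOOLS: Dict[str, Callable[[str], str]] = {
    "search": search_codebase,
    "terminal": run_terminal,
}


def scripted_policy(history: List[Step]) -> Tuple[str, str]:
    """Hypothetical stand-in for the model: replays a fixed plan."""
    script = [
        ("search", "parse_config"),
        ("terminal", "pytest -q"),
        ("finish", "parse_config is defined in utils.py"),
    ]
    return script[min(len(history), len(script) - 1)]


def run_agent(policy, tools, max_steps: int = 150):
    """Call tools repeatedly until the policy finishes or the budget runs out."""
    history: List[Step] = []
    for _ in range(max_steps):
        name, arg = policy(history)
        if name == "finish":
            return arg, history
        # Unknown tool names become an observation rather than a crash.
        observation = tools[name](arg) if name in tools else f"unknown tool: {name}"
        history.append((name, arg, observation))
    return "gave up: step budget exhausted", history


if __name__ == "__main__":
    answer, history = run_agent(scripted_policy, TOOLS)
    print(answer)
    for step in history:
        print(step)
```

The key design point is that each observation is fed back into the next decision, so the agent can keep searching and re-running commands instead of answering in a single pass.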

Scaling Training and Infrastructure

  • ☁️ Training Composer at scale required running a full version of Cursor within virtual machines, launching hundreds of thousands of these to simulate real user experiences.
  • 📊 Distributed computing was essential, with Ray being used extensively across five different areas, including as the standard for the RL controller to manage rollouts with varying completion times.
  • 📈 Ray Data was also employed for processing and analyzing large volumes of rollout data, allowing for efficient identification of what was working or not in the training process.
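
To show why varying completion times matter for an RL controller, here is a minimal sketch that collects rollouts in the order they finish rather than the order they were launched. It deliberately uses Python's standard `concurrent.futures` instead of Ray so it runs without extra dependencies (Ray's `ray.wait` plays a broadly similar role); `run_rollout`, the fake rewards, and the sleep durations are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_rollout(rollout_id, duration_s):
    """Hypothetical rollout: sleeps to mimic an episode of variable length."""
    time.sleep(duration_s)
    reward = 1.0 if rollout_id % 2 == 0 else 0.0  # made-up reward signal
    return {"id": rollout_id, "duration_s": duration_s, "reward": reward}


def collect_rollouts(durations, batch_size):
    """Launch all rollouts, then group results into batches as they finish."""
    batches, current = [], []
    with ThreadPoolExecutor(max_workers=len(durations)) as pool:
        futures = [pool.submit(run_rollout, i, d) for i, d in enumerate(durations)]
        # as_completed yields futures in completion order, so a slow rollout
        # does not hold up a batch that faster rollouts can already fill.
        for fut in as_completed(futures):
            current.append(fut.result())
            if len(current) == batch_size:
                batches.append(current)
                current = []
    if current:  # keep any leftover partial batch
        batches.append(current)
    return batches


if __name__ == "__main__":
    for batch in collect_rollouts([0.45, 0.05, 0.3, 0.15], batch_size=2):
        mean_reward = sum(r["reward"] for r in batch) / len(batch)
        print([r["id"] for r in batch], "mean reward:", mean_reward)
```

Grouping finished rollouts into batches also gives a natural place to compute summary statistics, a small-scale analogue of the rollout analysis the summary attributes to Ray Data.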

Practical Application and Future Outlook

  • 💡 A recommended way to use Composer is to have a slower model draft a strategic plan for large changes, then use Composer to quickly implement that plan, allowing for rapid corrections and iterations.
  • 🚀 Cursor 2.0 offers features like running multiple agents in parallel and cloud agents for offline or long-term changes, providing flexibility in how developers interact with the system.
  • 🌱 The future of AI models in coding is expected to involve continued improvements in intelligence and speed, with a trend towards more specialized models and the increasing importance of open-source frameworks like PyTorch and vLLM.

Topics Discussed

Cursor Composer, Agentic Coding, Reinforcement Learning, Mixture of Experts, Transformers, Tool Use, Distributed Computing, Ray, PyTorch, Virtual Machines, Specialized Models, Open Source Libraries, Frontier Models, Parallel Agents, Cloud Agents