Getting Unstuck with Python: Modern Tools for Data Science Workflows

[HPP] Vicki Boykis · November 7, 2025 · 20 min

Initial Python Struggles

  • ⚠️ Julia Silge initially found learning Python challenging ten years ago, despite a background in scientific computing, physics, and astronomy.
  • 🧠 She felt stuck and blocked by difficulties with Python's ecosystem, despite the language's reputation for being explicit, simple, and readable.
  • 💡 Her early struggles led her to discover R, where she found a welcoming open-source community and built a successful career.

Evolving Python Tooling

  • 🧩 A major challenge was managing Python environments, often leading to "dependency hell" where conflicting package versions prevented simultaneous use.
  • 🚫 Unlike R's CRAN or JavaScript's npm, Python historically lacked a single, built-in package manager, instead offering a multitude of overlapping tools like Conda, Poetry, and pip.
  • ✅ Modern tooling has significantly improved; uv is now recommended as an all-in-one tool for fast dependency resolution, package installation, and virtual environment creation.
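
To make "dependency hell" concrete, here is a toy sketch of why two packages with conflicting requirements cannot share one environment. It is not how uv or any real resolver works internally; the package names, the `shared_lib` dependency, and the version ranges are all hypothetical, and real version specifiers cover many more cases than this half-open range check.

```python
from typing import Optional, Tuple

# A version range is half-open: (minimum inclusive, maximum exclusive).
VersionRange = Tuple[Tuple[int, int], Tuple[int, int]]


def compatible_range(a: VersionRange, b: VersionRange) -> Optional[VersionRange]:
    """Return the versions acceptable to both ranges, or None if they conflict."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low < high else None


# Hypothetical requirements that three packages place on the same library.
plotting_pkg_needs = ((1, 0), (2, 0))   # shared_lib >=1.0,<2.0
modeling_pkg_needs = ((2, 0), (3, 0))   # shared_lib >=2.0,<3.0
reporting_pkg_needs = ((1, 5), (2, 5))  # shared_lib >=1.5,<2.5

# No single version satisfies both, so they cannot be installed together.
conflict = compatible_range(plotting_pkg_needs, modeling_pkg_needs)
print(conflict)  # None

# These two overlap, so a resolver could pick any version in the result.
workable = compatible_range(plotting_pkg_needs, reporting_pkg_needs)
print(workable)  # ((1, 5), (2, 0))
```

Keeping one virtual environment per project limits how many such constraints have to be satisfied at once, which is part of why environment creation and dependency resolution are useful to have in a single tool.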

The Role of IDEs in Data Science

  • 🛠️ Another hurdle was choosing an effective IDE or code editor for data science, as general-purpose Python editors often treated data science tasks as a "second-class citizen."
  • 📉 Neither adapting VS Code with extensions nor using Jupyter Notebooks worked well: concerns about Jupyter's execution model and state management remained, and the overall experience for iterative, exploratory data science was lackluster.
  • 🚀 Positron, Posit's new IDE, is built on Code OSS but specifically designed for data science, offering a better integrated and more productive experience for Python users.

Why Re-engage with Python?

  • 🤝 External pressures, such as collaborating with Python-first colleagues or working on Python packages, often necessitate learning or re-engaging with the language.
  • ✨ Internal motivations include discovering delightful Python tools like the requests package for HTTP requests, Pydantic for data validation, and Ruff for formatting and linting.
  • 🎯 The speaker emphasizes choosing tools that align with the type of work being done, whether it's more engineering-like, statistical analysis, exploratory, or operationalized.

Positron and Workflow Integration

  • 💻 Positron offers in-IDE support for creating Python projects, including setting up virtual environments using uv, or users can opt for a terminal-driven approach.
  • 🔄 For projects involving both R and Python, Positron allows for mixed R and Python chunks in Quarto files (using reticulate) or running separate R and Python consoles simultaneously within the same workspace.
  • 📊 The speaker still prefers R (especially the tidyverse) for initial dataset exploration and statistical analysis, but favors Python for tasks like interacting with APIs and for data manipulation that is less "stats-forward."
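
Whether a project's virtual environment was created from the IDE or from the terminal, it can help to confirm that code is actually running inside it. The sketch below uses only the standard library and assumes the environment follows the standard `venv` layout, in which `sys.prefix` differs from `sys.base_prefix`; the function names are illustrative and are not part of Positron or uv.

```python
import sys


def in_virtual_env() -> bool:
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Summarize which kind of interpreter is running, and where it lives."""
    kind = "virtual environment" if in_virtual_env() else "base interpreter"
    return f"{kind}: {sys.executable}"


print(describe_interpreter())
```

Running this in each console or terminal of a workspace is a quick way to check that every session points at the project's environment.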

Addressing Jupyter Notebook Concerns

  • ⚠️ The speaker "bounced off" Jupyter Notebooks due to concerns about reproducible practices and the execution model, which can easily lead to an unknown state when cells are run out of order.
  • 💡 This concern stemmed from a background that ingrained the importance of managing state and hygiene around inputs and outputs to ensure reliable results.
  • 💬 While acknowledging Jupyter's popularity, the speaker found its approach to state management not a good fit for their preferred working habits.
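
The state concern described above can be illustrated without a notebook at all. The following is a simplified simulation, not Jupyter's actual kernel machinery: each "cell" is a string of code, and all cells share one namespace, as notebook cells do. The cell names and contents are made up for the example.

```python
# Three hypothetical notebook cells that share state through the variable x.
CELLS = {
    "load": "x = 10",
    "transform": "x = x * 2",
    "report": "result = x + 1",
}


def run_cells(order):
    """Execute the named cells in the given order in one shared namespace."""
    namespace = {}
    for name in order:
        exec(CELLS[name], namespace)
    return namespace.get("result")


# Running top to bottom gives the intended answer.
top_to_bottom = run_cells(["load", "transform", "report"])
print(top_to_bottom)  # 21

# Re-running one cell silently changes the result.
rerun_transform = run_cells(["load", "transform", "transform", "report"])
print(rerun_transform)  # 41

# Running cells out of order gives yet another result.
out_of_order = run_cells(["load", "report", "transform"])
print(out_of_order)  # 11
```

The same three cells produce three different results depending on execution history, and nothing in the code itself records which history occurred. That is the "unknown state" problem in miniature.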

Topics

Python, Data Science, Positron, Python Environment Management, IDEs, Dependency Hell, Package Managers, uv (tool), R (programming language), Jupyter Notebooks, API Interaction, Reproducible Practices, Python Tooling