
Breaking Free from the Chat Box

May 12, 2023

Chat interfaces are now a staple of the software world, and I helped pave that road during my six years at Drift. We campaigned hard to prove chat wasn’t just for customer service or casual conversation, championing chat over forms with our “no forms” campaign. Fast forward to now, and chat is more versatile than ever, breathing life into all kinds of software and user experiences.

Enter ChatGPT, Midjourney, and a slew of other chat-first AI tools. They’ve elevated chat from a simple communication tool to the control center for powerful AI capabilities. But it’s come at a cost: while chat is a solid starting point, it’s also a creative bottleneck. It’s not that it’s the wrong tool for the job; it’s that our reliance on chat stifles our imagination. We’re so accustomed to interacting with these advanced AI systems through chat that it’s hard to envision other, potentially richer, ways to engage.

Chat's Limitations in UX and Technical Scope

While chat’s ubiquity makes it an easy go-to, its simplicity is also its downfall.

We’re so used to chat that it’s like we’re wearing blinders, focusing only on a narrow pathway of interaction. This leads to a one-size-fits-all approach, where unique opportunities for richer, more engaging user experiences are overlooked. Chat has its own language, a set of rules we’ve internalized so deeply that we forget other languages exist—languages that could offer users far more nuanced interactions.

The Time-Complexity Dilemma

Chat’s format struggles with complex, structured data—think of it as the linear time complexity of UX, good for quick operations but lacking when you need to scale the conversation. While dropping in graphs or videos is possible, a series of message boxes doesn’t always cut it. Waiting for a response from GPT-4 could take 30 seconds or more, and in UX, as in algorithms, efficiency matters.

When you’re working with tools like OpenAI’s Chat Completions API, time can stretch out. A complex prompt may need tens of seconds to produce a full response. Streaming text or markdown as it is generated shortens the perceived wait, a crucial UX improvement. The hard part is combining that real-time benefit with structured JSON or richer data types, which usually can’t be used until the whole payload has arrived.

The streaming approach, while efficient, doesn’t easily support the simultaneous delivery of such data. And there’s the rub: we’ve solved one UX problem but inadvertently narrowed our options for richer, multi-layered interactions.
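To make the tension concrete, here is a minimal sketch using only Python's standard library. It does not call any real AI API: `fake_stream`, `try_parse`, and `consume` are hypothetical helpers, and the payload is invented for illustration. The stream is simulated by slicing a JSON string into small chunks. Plain text could be shown to the user chunk by chunk, but the JSON object only becomes parseable once the final chunk lands.

```python
import json


def fake_stream(payload: str, size: int = 8):
    """Yield the payload in small chunks, standing in for a streamed response."""
    for i in range(0, len(payload), size):
        yield payload[i:i + size]


def try_parse(buffer: str):
    """Return the parsed JSON value, or None while the buffer is incomplete."""
    try:
        return json.loads(buffer)
    except json.JSONDecodeError:
        return None


def consume(chunks):
    """Accumulate chunks and record, after each one, whether the buffer parses."""
    buffer = ""
    parsed_after_chunk = []
    for chunk in chunks:
        buffer += chunk  # a text UI could render `chunk` right here
        parsed_after_chunk.append(try_parse(buffer) is not None)
    return buffer, parsed_after_chunk


# Invented example payload: structured data a richer UI might want to render.
payload = json.dumps({"title": "Q2 report", "points": [1, 2, 3]})
buffer, parsed_after_chunk = consume(fake_stream(payload))
result = try_parse(buffer)
```

In this sketch every entry of `parsed_after_chunk` except the last is `False`: the interface has nothing structured to render until the stream ends, which is exactly the wait that streaming was supposed to remove.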

Time isn’t just money; it’s also user engagement.

The Cycle of Constraints and Creative Exploration

The limitations of existing AI tooling are not mere inconveniences; they set boundaries on what’s possible and, more importantly, what can be easily imagined. While workarounds are possible, they can deter creators from truly exploring the full potential of these technologies.

In constraining our tools, we may unintentionally be constraining our creative capabilities as well.

I ran into these challenges firsthand when I tried to build what I had in mind using just the OpenAI SDK or LangChain. I needed a tool that could efficiently handle complex, structured data and offer more than just string responses. It wasn't just about managing prompts and state; it was about envisioning an agent instance with a persistent identity and a dependable response model, accessible on every request.
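One way to picture that idea is the rough sketch below. It is an illustration under assumptions, not the tool described here or any library's real API: `Agent`, `Summary`, and `fake_complete` are hypothetical names, and the model call is replaced by a stub that returns a canned JSON string. The agent carries its identity and its response model with it, so every request comes back as the same typed shape or fails loudly.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Callable, Generic, Type, TypeVar

T = TypeVar("T")


@dataclass
class Summary:
    """Hypothetical response model: the shape every reply must have."""
    headline: str
    bullet_points: list


@dataclass
class Agent(Generic[T]):
    """Hypothetical agent: a fixed identity plus a fixed response model."""
    identity: str                          # persistent system-style description
    response_model: Type[T]                # dataclass every reply must match
    complete: Callable[[str, str], str]    # (identity, prompt) -> raw JSON text
    history: list = field(default_factory=list)

    def ask(self, prompt: str) -> T:
        raw = self.complete(self.identity, prompt)
        data = json.loads(raw)
        expected = {f.name for f in fields(self.response_model)}
        if not isinstance(data, dict) or set(data) != expected:
            raise ValueError(f"reply does not match {self.response_model.__name__}")
        self.history.append((prompt, data))  # state lives on the instance
        return self.response_model(**data)


def fake_complete(identity: str, prompt: str) -> str:
    """Stub standing in for a model call; returns a canned JSON reply."""
    return json.dumps({"headline": "Chat is a bottleneck",
                       "bullet_points": ["slow", "linear"]})


agent = Agent(identity="You summarize blog posts.",
              response_model=Summary,
              complete=fake_complete)
summary = agent.ask("Summarize this post.")
```

Because the caller gets a `Summary` instead of a string, the interface can lay out the headline and bullet points however it likes rather than dropping them into a message bubble.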


We should reimagine how we interact with AI systems and the tools we create for them: move beyond linear, text-based interfaces and embrace more engaging and immersive experiences.
