In Part 1, I explored how the shift from keyword search to conversational commerce represents a fundamental linguistic change - from locutionary acts (literal word matching) to illocutionary acts (understanding intent and context). We looked at three key changes: context-dependence, anaphoric resolution, and the move from lexical to compositional semantics (essentially, from SEO to AEO).
Now let’s talk about what this means for payments infrastructure and why learning to speak human is harder than it sounds.
When Saying Becomes Doing: Payments as Speech Acts
In linguistic philosophy, a speech act is when saying something is doing something. “I promise” doesn’t describe a promise, it creates one. “I apologise” doesn’t report an apology, it performs one.
In the agent era, payments could become speech acts. Saying “yes, charge me” in conversation might not initiate a separate checkout flow, it could be the transaction itself. The utterance and the action could collapse into the same moment.
This would require infrastructure that treats natural language confirmation as authorisation. Security would shift toward pragmatic authentication (verifying the speaker has authority) rather than primarily form-based authentication (requiring explicit credential entry).
Think biometric verification, voice recognition, and behavioural patterns alongside or instead of CVV codes. The question would shift from “Do you have the password?” to “Are you the person I’ve been talking to?”
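To make that concrete, here’s a minimal sketch of what pragmatic authentication might look like. Everything in it - the signal names, the threshold, the confirmation phrases - is invented for illustration, not drawn from any real payments API:

```python
from dataclasses import dataclass

# Hypothetical signals a pragmatic-authentication layer might combine.
# None of these names come from a real payments system; they are illustrative.
@dataclass
class SessionSignals:
    voice_match: float       # similarity to an enrolled voiceprint, 0..1
    behaviour_match: float   # phrasing/typing pattern similarity, 0..1
    device_known: bool       # conversation is happening on a trusted device

CONFIRMATIONS = {"yes, charge me", "yes", "go ahead", "confirm"}

def authorise(utterance: str, signals: SessionSignals,
              threshold: float = 0.8) -> bool:
    """Treat a conversational confirmation as authorisation only when the
    utterance is an explicit confirmation AND the speaker-identity signals
    clear a confidence threshold ("Are you the person I've been talking to?")."""
    is_confirmation = utterance.strip().lower() in CONFIRMATIONS
    identity_score = (signals.voice_match + signals.behaviour_match) / 2
    if signals.device_known:
        identity_score = min(1.0, identity_score + 0.1)
    return is_confirmation and identity_score >= threshold
```

The design point is that the utterance alone is never enough: `authorise("yes", …)` fails when the identity signals are weak, which is the pragmatic-authentication half of the speech act.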
And if that makes you think carefully about security implications, good. That’s appropriate. Because we’re talking about fundamentally reconsidering what “authorisation” means in a linguistic sense, not just a technical one.
Why This Transition Isn’t Straightforward (Even for Those Who Seem to Get It)
There’s a philosopher, Paul Grice, who argued that successful conversation relies on a cooperative principle with four maxims:
Quantity: be as informative as necessary
Quality: be truthful
Relation: be relevant
Manner: be clear
These aren’t abstract philosophy, they’re practical considerations for merchant-agent communication.
Merchants who game agent recommendations with misleading attributes (violating Quality) will likely face consequences, just like keyword-stuffing affected SEO. Products tagged with irrelevant attributes (violating Relation) may not get recommended as effectively. Ambiguous product data (violating Manner) might get passed over.
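You can even read Grice’s maxims as lint rules for product data. Here’s an illustrative sketch - the field names and the rules are invented for the example, not a real catalog schema:

```python
# Illustrative mapping of Grice's maxims onto product-data hygiene.
# All field names ("attributes", "use_cases", etc.) are invented.

def maxim_violations(product: dict, query_topic: str) -> list[str]:
    problems = []
    attrs = product.get("attributes", {})
    # Quantity: enough information to recommend on
    if len(attrs) < 2:
        problems.append("quantity: too few attributes to recommend on")
    # Quality: flag claims nobody has verified
    if product.get("unverified_claims"):
        problems.append("quality: unverified claims present")
    # Relation: the product should state relevance to what it's for
    if query_topic not in attrs.get("use_cases", []):
        problems.append(f"relation: no stated relevance to '{query_topic}'")
    # Manner: ambiguous free text where a typed value is expected
    if product.get("price") in (None, "call for price"):
        problems.append("manner: price is ambiguous")
    return problems
```

A clean listing passes all four checks; a keyword-stuffed one trips several at once, which is roughly how agent recommendation pressure would work.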
But here’s what strikes me: most merchants didn’t grow up in an environment where semantic clarity and conversational cooperation were technical requirements. Just like some of us didn’t grow up seeing certain workplace communication patterns modelled, most commerce platforms developed in an environment of forms and databases.
They learned the language of structured data. Now there’s a shift toward the language of discourse and meaning-making. That’s not a simple transition.
And it’s not just merchants and payment companies that need to adapt, consumers do too.
Here’s something I’ve written about before: many people have no idea how to articulate what they actually need. I’ve seen this as a writing teacher, as someone who built AI writing platforms for higher education, and as someone who’s watched the American business writing crisis unfold in the last decade.
The same clarity problem behind poor communication’s estimated $2 trillion annual cost to US businesses shows up in how people interact with AI agents. When someone can’t write a clear email, they also can’t construct a clear prompt. When they don’t understand their own thinking process well enough to articulate it to a colleague, they definitely can’t articulate it to an AI agent trying to help them buy something.
AI agents are demanding something we’re not used to providing: clear expression of unclear needs. “I need something for focus” isn’t a failure of the agent to understand, it’s the human not yet knowing whether they need noise cancellation, caffeine, ergonomic support, or time management software. The agent has to work with what linguistics calls “underspecified input.”
This creates a new kind of cooperative burden. Grice’s maxims assumed both parties knew what they were trying to communicate, but in agent-mediated commerce, humans often don’t know what they need until the agent helps them figure it out through conversation. The agent becomes a collaborative thinking partner, not just a transaction processor.
This is why the shift to conversational commerce isn’t just about better AI, it’s about helping humans develop what linguists call metalinguistic awareness: the ability to think consciously about what they’re trying to communicate. Writing is thinking made visible, as composition researchers have shown, and prompting an agent is the same skill: making your thinking visible enough for a system to work with it.
But let’s be honest about what agents can and can’t do.
In Part 1, I talked about how agents excel at understanding discourse - parsing syntax, interpreting compositional semantics, understanding context - and that’s true: AI agents are remarkably good at the receptive side, understanding what you mean from what you say.
Where they still struggle is with the deeper pragmatics. The power dynamics in conversation. The subtle social hierarchies. The cultural context that shapes what’s appropriate to say when. The emotional undercurrents that humans navigate instinctively.
I’ve written before about how different aspects of language present varying challenges for AI versus humans. Modern LLMs demonstrate impressive proficiency in syntax and semantics. They can generate grammatically correct sentences and handle complex semantic relationships with accuracy.
But pragmatics, especially the deeply social aspects like recognising authority relationships, understanding when language establishes dominance or submission, interpreting the political dimensions of discourse, remains challenging for artificial systems. The nuanced ways humans navigate complex social relationships through language, the unspoken hierarchies, the microaggressions, the cultural implications of linguistic choices - these require embodied, social, experiential knowledge.
So when we talk about agents understanding discourse, we need to be precise. They’re learning to understand the what (syntax and semantics). They’re still learning the why and the how (pragmatics and social context). They can process conversational structure, but they can’t always navigate conversational dynamics the way humans do.
This matters for payments because transactions aren’t just semantic exchanges, they’re social ones. There’s power, trust, vulnerability, cultural expectation embedded in “yes, charge me.” Agents can process the authorisation. Whether they can navigate the full social and emotional context of that authorisation is still an open question.
What This Transition Might Look Like
Look, I’m not going to pretend anyone has fully solved this, but what’s becoming clear is that payment companies are exploring something that doesn’t quite exist yet in its complete form: a semantic payments layer.
This would involve middleware that can:
Parse conversational intent into transaction parameters without requiring structured input
Maintain discourse state across multi-turn conversations
Resolve ambiguous references by accessing conversational history
Negotiate uncertainty through clarifying questions rather than error messages
Support compositional transactions where payment is part of a larger conversational goal
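The list above can be sketched, very roughly, as a toy dialogue manager: it keeps discourse state across turns, asks a clarifying question instead of returning an error, and treats an explicit “yes” as the authorisation step. The class is entirely hypothetical, and a real semantic payments layer would lean on a language model for parsing rather than regular expressions:

```python
import re

class PaymentDialogue:
    """Hypothetical sketch of a semantic payments layer's discourse state.
    Not a real API; regexes stand in for genuine intent parsing."""

    def __init__(self):
        self.state = {"item": None, "recipient": None, "confirmed": False}

    def hear(self, utterance: str) -> str:
        u = utterance.lower()
        # Parse conversational intent into transaction parameters
        m = re.search(r"buy (the )?(?P<item>[\w\s]+)", u)
        if m:
            self.state["item"] = m.group("item").strip()
        if "to my" in u:
            self.state["recipient"] = u.split("to my", 1)[1].strip()
        if u.startswith(("yes", "confirm")):
            self.state["confirmed"] = True
        # Negotiate uncertainty with a question, not an error message
        missing = [k for k, v in self.state.items()
                   if not v and k != "confirmed"]
        if missing and not self.state["confirmed"]:
            return f"Got it so far. Who or what is the {missing[0]}?"
        if not self.state["confirmed"]:
            return f"Charge you for the {self.state['item']}?"
        return "authorised"
```

Notice that state persists across `hear()` calls - that’s the multi-turn discourse maintenance, the part form-based systems never had to do.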
Some companies are experimenting with this: Stripe’s work with OpenAI, for example, explores payment confirmation embedded in chat interfaces, which is an interesting approach to the discourse challenge.
But the linguistic infrastructure goes way deeper than APIs. It involves rethinking:
How product data is structured (ontological, not just categorical)
How merchant catalogs are indexed (semantic, not just lexical)
How pricing is communicated (contextual, not static)
How fulfilment gets coordinated (conversational, not just form-based)
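To make the first two items concrete, here’s the difference between a catalog indexed lexically (keyword tags) and one indexed semantically (what the product is actually for). The catalog, field names, and matching logic are all invented for the sketch - a real system would use embeddings, not hand-written lists:

```python
# Toy catalog: "tags" is the keyword-era index, "serves" is a hand-built
# stand-in for a semantic index of what each product helps with.
CATALOG = {
    "Noise-cancelling headphones": {"tags": ["headphones", "audio"],
                                    "serves": ["focus", "travel", "calls"]},
    "Standing desk": {"tags": ["desk", "furniture"],
                      "serves": ["ergonomics", "focus"]},
}

def lexical_search(query: str) -> list[str]:
    # Keyword matching: the query word must literally appear in the tags.
    return [name for name, p in CATALOG.items() if query in p["tags"]]

def semantic_search(need: str) -> list[str]:
    # Agent-era matching: match on the need the product addresses.
    return [name for name, p in CATALOG.items() if need in p["serves"]]
```

Searching “focus” lexically finds nothing, because no product is *tagged* “focus”; searching it semantically finds both, because both *serve* focus. That gap is the indexing problem in miniature.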
This isn’t about adding a chatbot. This is about building commerce infrastructure that could understand discourse the way humans actually use it.
The Learning Curve We’re All On
Here’s what I keep coming back to: the people building payments infrastructure right now learned their craft in a world where commerce meant filling out forms. Each piece of information had its designated box. Everything was structured, predictable, defined.
That’s the paradigm they were trained in. Structured data: everything has a place, a format, and a clear relationship to everything else.
Now with AI agents, commerce happens through conversation. Information comes out of order (“send it to my mum” before you’ve even said what “it” is). Context shapes meaning (“I’ll take it” means entirely different things depending on what you just discussed). Intent is implicit - you don’t fill out a field that says “execute payment,” you just say “yes, charge me” and the system has to understand that means authorisation.
This is what linguists call unstructured discourse - natural human communication.
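That out-of-order quality can be sketched directly. In this toy discourse state, an instruction that arrives before its referent exists (“send it to my mum” before any “it”) is held open and bound when the entity is finally mentioned. The class and method names are illustrative, nothing more:

```python
class DiscourseState:
    """Toy model of unstructured discourse: references may arrive
    before their referents, so we defer instead of erroring."""

    def __init__(self):
        self.entities = []   # things mentioned so far, newest last
        self.pending = []    # instructions waiting on a referent

    def mention(self, entity: str) -> list[tuple]:
        """A new entity can resolve instructions that arrived early."""
        self.entities.append(entity)
        resolved = [(verb, entity) for verb, _ in self.pending]
        self.pending.clear()
        return resolved

    def instruct(self, verb: str, reference: str):
        """'it' with no antecedent is deferred, not rejected."""
        if reference == "it" and not self.entities:
            self.pending.append((verb, reference))
            return None
        target = self.entities[-1] if reference == "it" else reference
        return (verb, target)
```

A form would have thrown a validation error at step one; discourse just waits.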
But there’s another layer here that I hinted at in Part 1: trust. We learned not to trust technology with full sentences because it failed us so many times. Years of clunky chatbots, voice assistants that couldn’t understand basic requests, and “smart” systems that weren’t very smart taught us to dumb down our language. We trained ourselves to speak in keywords because anything else didn’t work. Now agents need to rebuild that trust, proving they can actually understand discourse, not just pretend to. That’s a psychological barrier, not just a technical one. And it’s as significant as any infrastructure challenge.
Just like I wrote about people not knowing how to negotiate or handle workplace conflict because they never saw those patterns modelled growing up, payment infrastructure developers never saw conversational commerce modelled. They learned the patterns that existed at the time. And those patterns were built around forms and structured data because that’s what the technology could handle.
Now the technology can handle more but you can’t just add a “conversational payment” button and call it done. The entire mental model is different. Instead of checking whether someone filled in all the required fields correctly, you’re trying to understand what someone meant three sentences ago. Instead of processing one transaction at a time, you’re maintaining context across an entire conversation. Instead of testing “does this data format work?”, you’re testing “does this conversation make sense to a human?”
The companies that will navigate this successfully are the ones who understand they’re exploring a linguistics challenge that happens to require engineering. You need people who understand how humans actually communicate - the pragmatics of context, the semantics of meaning-making - working alongside the people who build systems. You test for whether the conversation flows naturally, not just whether the transaction processed correctly.
This isn’t about anyone doing anything wrong. It’s about recognising that the language of commerce itself has fundamentally changed, and learning a new language takes time, especially when you’re building the infrastructure for it as you go.
Where This Leaves Us
The shift from keyword search to conversational commerce represents a significant linguistic change in digital commerce. And it’s being approached from many different angles: as a feature opportunity, as an infrastructure challenge, as a user experience evolution.
For merchants: if products are only indexed with keyword tags and not semantic attributes, discoverability through agents becomes more challenging.
For payment companies: infrastructure that can maintain discourse state, resolve pronouns, and understand pragmatic context represents a meaningful evolution from form-based systems.
And for all of us watching this unfold: this is what a linguistic transition looks like in real-time. An entire industry is shifting from one communication model to another.
The language of agents isn’t simpler. It’s more complex, more human, more dependent on context and shared understanding. It’s closer to how we naturally communicate.
I’ve spent years studying what happens when language fails us, when our systems can’t capture what we mean, when the infrastructure we carry, or don’t carry, tells stories we never meant to tell. This fascination with how language works, how it breaks, how it evolves, is probably why I find this particular moment in commerce so compelling.
Because sometimes the most important conversations happen in the spaces between languages: in the gap between what we can say and what we mean. And right now, the payments industry is learning to navigate that space. Not because anyone failed, but because the language itself has changed.
Maybe the answer isn’t about speaking the language we’re supposed to speak. Maybe it’s about learning to build infrastructure for the language humans actually use.