Eliciting product feedback elegantly is a competitive advantage for LLM software.
Over the weekend, I queried Google’s Bard & noticed the elegant feedback loop the team has built into the product.
I asked Bard to compare the 3rd-row leg room of the leading 7-passenger SUVs.
At the bottom of the response is a little G button, which double-checks the answer using Google searches.
I decided to click it. I would have been spot-checking some of the results in any case.
LLM systems aren’t deterministic. 1 can be larger than 4 for an LLM. If an LLM produces a few spurious results, the user won’t trust it.
Bard highlights confirmed data in green & potentially erroneous data in red. I confirmed the green was correct. The red was sometimes correct & sometimes wasn’t.
The feature saves me time. It also lets me use a less-than-trusted system & benefit from the accurate portion of the response, which should keep me coming back, all while improving the system for the next time.
It’s symbiotic.
I wonder if it won’t become the dominant feedback mechanism for LLM-enabled apps, replacing the now ubiquitous but deeply amorphous thumbs up/down.
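To make the contrast with a thumbs up/down concrete, here is a minimal sketch of what claim-level feedback might look like as data. It is not how Bard works internally, which I don't know. The names (`Claim`, `Status`, `check_claims`, `feedback_records`) are hypothetical, the `evidence` lookup stands in for a real search step, & the leg-room figures are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    SUPPORTED = "green"   # the check found agreeing evidence
    DISPUTED = "red"      # the check found conflicting evidence
    UNCHECKED = "none"    # no evidence either way


@dataclass
class Claim:
    text: str
    status: Status = Status.UNCHECKED
    user_verdict: Optional[bool] = None  # True if the user says the claim is correct


def check_claims(claims, evidence):
    """Label each claim using `evidence`, a dict of claim text -> bool.

    The dict is a placeholder (an assumption) for a search-backed check.
    """
    for claim in claims:
        found = evidence.get(claim.text)
        if found is None:
            claim.status = Status.UNCHECKED
        else:
            claim.status = Status.SUPPORTED if found else Status.DISPUTED
    return claims


def feedback_records(claims):
    """Return one record per claim the user actually spot-checked."""
    return [
        {"claim": c.text, "highlight": c.status.value, "user_says_correct": c.user_verdict}
        for c in claims
        if c.user_verdict is not None
    ]


# Illustrative, made-up claims from a single response.
claims = [Claim("Model A third row: 33 in"), Claim("Model B third row: 36 in")]
evidence = {"Model A third row: 33 in": True, "Model B third row: 36 in": False}
check_claims(claims, evidence)

# The user spot-checks the red claim & finds it was correct after all.
claims[1].user_verdict = True
records = feedback_records(claims)
```

A thumbs up/down yields one bit for the whole response. Here each spot-check yields a record tied to a specific claim, including the cases where the red highlight itself was wrong.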