I've gotten a lot of questions about this recently, so:
I would not recommend using neural language generation (BERT, GPT-3, etc.) to generate text you send to users.
Why?
It *will* produce plausible-sounding but factually incorrect output. Not if but when.
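To see why, it helps to remember what these models actually do: sample fluent continuations of a prompt, with no connection to any ground-truth source. Here's a minimal sketch, assuming the Hugging Face transformers library and GPT-2 (the prompt and settings are illustrative only):

```python
# Minimal sketch: sampling from a generic language model (GPT-2 here).
# Nothing ties the output to a real policy database or fact source.
# Assumes the Hugging Face `transformers` library is installed.
from transformers import pipeline, set_seed

set_seed(42)  # reproducibility for the illustration
generator = pipeline("text-generation", model="gpt2")

outputs = generator(
    "Our return policy is:",  # illustrative prompt
    max_new_tokens=40,
    do_sample=True,           # sample rather than greedily decode
    num_return_sequences=3,
)

for out in outputs:
    # Each completion is fluent, but the model is only continuing text;
    # it has no idea what the retailer's actual policy says.
    print(out["generated_text"])
```

The completions will read like plausible policy language, and that's precisely the problem.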
Just a couple hypothetical examples of things that keep me up at night.
Truth: The retailer's policy allows returns only for unopened products, within a two-week window
Assistant output when asked: You can return any product for a full refund at any time
Truth: Medicines x and y have no known interactions and are safe to take together
Assistant output when asked: Medicines x and y interact and have been shown to slow blood clotting and lead to respiratory distress (with a citation to a paper that does not exist)
And that's leaving aside the question of producing abusive/foul language, which WILL happen (and can be intentionally triggered: https://www.ericswallace.com/triggers )
More discussion on Twitter: https://twitter.com/an_open_mind/status/1285940858290933767
I'm ok with showing generated text to users IFF:
- Users expect to be served text that is not factually correct (a game setting like AI Dungeon or something similar)
- The output has been carefully vetted by a trained human first (and at that point, I'd write my own output); a sketch of that gate follows this list
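For that second case, the load-bearing property is that model output never reaches a user without human sign-off. A minimal sketch of such a gate in plain Python; every name here (Draft, ReviewQueue, etc.) is hypothetical, not a real API:

```python
# Minimal sketch of the "human vets first" gate: generated text is never
# sent to a user until a trained reviewer approves it.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Draft:
    user_id: str
    text: str
    approved: bool = False

@dataclass
class ReviewQueue:
    pending: List[Draft] = field(default_factory=list)

    def submit(self, draft: Draft) -> None:
        # Model output always lands here first; nothing is auto-sent.
        self.pending.append(draft)

    def approve_and_send(self, draft: Draft,
                         reviewer_edit: Optional[str] = None) -> str:
        # The reviewer may rewrite the text entirely, at which point
        # the model was at best a drafting aid.
        if reviewer_edit is not None:
            draft.text = reviewer_edit
        draft.approved = True
        self.pending.remove(draft)
        return draft.text  # only approved text ever reaches the user

queue = ReviewQueue()
queue.submit(Draft(user_id="u123",
                   text="You can return any product at any time."))
# A human reviewer catches the hallucinated policy and corrects it:
sent = queue.approve_and_send(
    queue.pending[0],
    reviewer_edit="Unopened products may be returned within two weeks.",
)
print(sent)
```

Note that once the reviewer is rewriting the text freely, you're back to the "I'd write my own output" point above.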
(And this list of companies using it to generate text in the legal domain is, uh, worrying. Again: these models WILL generate reasonable-sounding but factually inaccurate text. Not if but when.
https://www.artificiallawyer.com/2020/07/29/gpt-3-a-game-changer-for-legal-tech/)