Tweak the handling of invalid text so that it does not split UTF-8 characters apart