Public Git Hosting - mediawiki.git/commit

commit	64d4e75208953bda72cee850f5287a6929c119f3
author	Brion Vibber <brion@users.mediawiki.org>
	Mon, 4 Apr 2011 20:59:04 +0000
committer	Brion Vibber <brion@users.mediawiki.org>
	Mon, 4 Apr 2011 20:59:04 +0000
tree	5e3971952aa9903160de1d5a712a13d997ea7f81
parent	d3eec98cb79fbee8193e8c61b38e98d45a755325

Workaround for bug 28146: running out of memory during Unicode validation/normalization when uploading a DjVu file with lots of embedded page text

This provisional workaround runs the text through UtfNormal::cleanUp() one page at a time instead of passing in the entire file's dumped text at once. This keeps memory use from exploding during the preg_match_all() call used to divide the text into ASCII and non-ASCII runs for validation, which is very wasteful for long texts in Latin-script languages with many mixed-in non-ASCII characters (such as French and German).
This won't fix legit cases of huge texts, such as a single realllllllllly long page, which would still get run through in one giant chunk at web input time.
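
The per-page idea can be sketched roughly as follows. This is an illustration in Python, not the PHP change made in this commit: `clean_up` is a hypothetical stand-in for UtfNormal::cleanUp() (it only replaces invalid UTF-8 and applies NFC), and `clean_pages` and the sample page data are assumptions made up for the example. The point is only that each page is validated on its own, so peak memory follows the largest page rather than the whole file.

```python
import unicodedata
from typing import Iterable, Iterator


def clean_up(raw: bytes) -> str:
    """Hypothetical stand-in for UtfNormal::cleanUp(), not its real algorithm.

    Invalid UTF-8 bytes become U+FFFD, then the text is normalized to NFC.
    """
    text = raw.decode("utf-8", errors="replace")
    return unicodedata.normalize("NFC", text)


def clean_pages(pages: Iterable[bytes]) -> Iterator[str]:
    """Clean one page at a time instead of the concatenated dump.

    Only a single page's text is held by the cleaner at any moment, so the
    working set is bounded by the largest page, not by the whole document.
    """
    for raw in pages:
        yield clean_up(raw)


# Made-up sample data: one page with a decomposed accent, one with a bad byte.
pages = [
    "Cafe\u0301 page one".encode("utf-8"),
    b"Stra\xc3\x9fe \xff page two",
]
cleaned = list(clean_pages(pages))
```

As the commit message notes, this does nothing for a single enormous page: that page still goes through the cleaner in one piece.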