From c1fa441b9d6015f804ef7659bb03bf1999628f5c Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme
Date: Fri, 4 Oct 2019 07:13:46 +0000
Subject: [PATCH] [test] Remove locale dependency for mri-utf8.test

Summary:
llvm-ar's mri-utf8.test test relies on the en_US.UTF-8 locale being
installed for its last RUN line to work. If it is not installed, the
Unicode string gets encoded (interpreted) as ASCII, which fails since
the most significant bit is non-zero. This commit changes the call to
open to use a binary literal of the UTF-8 encoding of the pound sign
instead, thus bypassing the encoding step.

Note that the echo to create the £.txt file will work regardless of the
locale because both the shell and echo (in case it is not a builtin of
the shell concerned) only care about ASCII characters to operate.
Indeed, the mri-utf8.test file (and in particular the pound sign) is
encoded in UTF-8, and UTF-8 guarantees that only ASCII characters
produce bytes that can be interpreted as ASCII characters (i.e. bytes
with the most significant bit clear).
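
To illustrate the UTF-8 property relied on here, a minimal Python sketch
(not part of the patch; the variable names are illustrative) checks that
the pound sign encodes to the two bytes used in the new RUN line, that
neither byte could be taken for an ASCII character, and that an ASCII
encoding of the same character fails:

```python
# Illustration only; not part of the patch.
pound = "\u00a3"  # the pound sign, U+00A3

# UTF-8 encodes U+00A3 as the two bytes used in the new RUN line.
encoded = pound.encode("utf-8")
assert encoded == b"\xc2\xa3"

# Every byte of a non-ASCII character's UTF-8 encoding has its most
# significant bit set, so none can be mistaken for '>' or a newline.
assert all(byte & 0x80 for byte in encoded)

# Encoding the same character as ASCII raises, mirroring the failure
# described above when no UTF-8 locale is available.
try:
    pound.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
assert not ascii_ok
```
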
So the process to break down the filename in the line goes something
like this:
- find an ASCII chevron '>'
- find the beginning of the filename by removing ASCII space-like
  characters
- find the ASCII newline character indicating the end of the
  redirection (no semicolon ';', closing curly bracket '}' or
  parenthesis ')' or the like)
- create a file whose name is made of all the bytes in between the
  beginning and end of the filename *without interpreting them*

Reviewers: gbreynoo, MaskRay, rupprecht, JamesNagurne, jfb

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68418

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373700 91177308-0d34-0410-b5e6-96231b3b80d8
---
 test/tools/llvm-ar/mri-utf8.test | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/test/tools/llvm-ar/mri-utf8.test b/test/tools/llvm-ar/mri-utf8.test
index 64999960074..af2be04111b 100644
--- a/test/tools/llvm-ar/mri-utf8.test
+++ b/test/tools/llvm-ar/mri-utf8.test
@@ -12,8 +12,4 @@ RUN: echo "SAVE" >> %t/script.mri
 RUN: llvm-ar -M < %t/script.mri
 RUN: cd %t/extracted && llvm-ar x %t/mri.ar
 
-# This works around problems launching processess that
-# include arguments with non-ascii characters.
-# Python on Linux defaults to ASCII encoding unless the
-# environment specifies otherwise, so it is explicitly set.
-RUN: env LANG=en_US.UTF-8 %python -c "assert open(u'\U000000A3.txt', 'rb').read() == b'contents\n'"
+RUN: %python -c "assert open(b'\xC2\xA3.txt', 'rb').read() == b'contents\n'"
-- 
2.11.4.GIT
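
The filename-extraction steps listed in the commit message can be
sketched in Python as follows. This is only an illustration of the
byte-level reasoning, not how any particular shell is implemented; the
helper name `redirect_target` and the sample command line are invented
for the example.

```python
def redirect_target(line: bytes) -> bytes:
    """Hypothetical helper: raw filename bytes after the last '>'."""
    # Find the ASCII chevron that introduces the redirection.
    start = line.rindex(b">") + 1
    # Skip ASCII space-like characters before the filename.
    while line[start:start + 1] in (b" ", b"\t"):
        start += 1
    # The ASCII newline marks the end of the redirection.
    end = line.index(b"\n", start)
    # Return the bytes in between without decoding them.
    return line[start:end]


# Invented command line, encoded in UTF-8 as the test file is.
line = "echo contents > \u00a3.txt\n".encode("utf-8")
name = redirect_target(line)
assert name == b"\xc2\xa3.txt"
```

Because no UTF-8 continuation or lead byte has its most significant bit
clear, the searches for '>' and the newline cannot match inside the
encoded pound sign.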