//=== WebAssemblyLowerEmscriptenEHSjLj.cpp - Lower exceptions for Emscripten =//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
///
/// \file
/// This file lowers exception-related instructions and setjmp/longjmp function
/// calls to use Emscripten's library functions. The pass uses JavaScript's try
/// and catch mechanism in case of Emscripten EH/SjLj and Wasm EH intrinsics in
/// case of Wasm SjLj.
///
/// * Emscripten exception handling
/// This pass lowers invokes and landingpads into library functions in JS glue
/// code. Invokes are lowered into function wrappers called invoke wrappers that
/// exist on the JS side and wrap the original function call in a JS try-catch.
/// If an exception occurs, the cxa_throw() function on the JS side sets some
/// variables (see below) so we can check whether an exception occurred from
/// wasm code and handle it appropriately.
///
/// * Emscripten setjmp-longjmp handling
/// This pass lowers setjmp to a reasonably-performant approach for emscripten.
/// The idea is that each block with a setjmp is broken up into two parts: the
/// part containing setjmp and the part right after the setjmp. The latter part
/// is either reached from the setjmp, or later from a longjmp. To handle the
/// longjmp, all calls that might longjmp are also called using invoke wrappers
/// and thus JS try-catch. The JS longjmp() function also sets some variables
/// so we can check whether a longjmp occurred from wasm code. Each block with a
/// function call that might longjmp is also split up after the longjmp call.
/// After the longjmp call, we check whether a longjmp occurred, and if it did,
/// which setjmp it corresponds to, and jump to the right post-setjmp block.
/// We assume setjmp-longjmp handling always runs after EH handling, which means
/// we don't expect any exception-related instructions when SjLj runs.
/// FIXME Currently this scheme does not support indirect call of setjmp,
/// because of the limitation of the scheme itself. fastcomp does not support it
/// either.
///
/// In detail, this pass does the following things:
///
/// 1) Assumes the existence of global variables: __THREW__, __threwValue
///    __THREW__ and __threwValue are defined in compiler-rt in Emscripten.
///    These variables are used for both exceptions and setjmp/longjmps.
///    __THREW__ indicates whether an exception or a longjmp occurred or not. 0
///    means nothing occurred, 1 means an exception occurred, and other numbers
///    mean a longjmp occurred. In the case of longjmp, the __THREW__ variable
///    indicates the setjmp buffer the longjmp corresponds to.
///    __threwValue is 0 for exceptions, and the argument to longjmp in case of
///    longjmp.
///
/// * Emscripten exception handling
///
/// 2) We assume the existence of setThrew and setTempRet0/getTempRet0 functions
///    at link time. setThrew exists in Emscripten's compiler-rt:
///
///    void setThrew(uintptr_t threw, int value) {
///      if (__THREW__ == 0) {
///        __THREW__ = threw;
///        __threwValue = value;
///      }
///    }
///
///    setTempRet0 is called from __cxa_find_matching_catch() in JS glue code.
///    In exception handling, getTempRet0 indicates the type of an exception
///    caught, and in setjmp/longjmp, it means the second argument to longjmp
///    function.
///
/// 3) Lower
///      invoke @func(arg1, arg2) to label %invoke.cont unwind label %lpad
///    into
///      __THREW__ = 0;
///      call @__invoke_SIG(func, arg1, arg2)
///      %__THREW__.val = __THREW__;
///      __THREW__ = 0;
///      if (%__THREW__.val == 1)
///        goto %lpad
///      else
///        goto %invoke.cont
///    SIG is a mangled string generated based on the LLVM IR-level function
///    signature. After LLVM IR types are lowered to the target wasm types,
///    the names for these wrappers will change based on wasm types as well,
///    as in invoke_vi (function takes an int and returns void). The bodies of
///    these wrappers will be generated in JS glue code, and inside those
///    wrappers we use JS try-catch to generate actual exception effects. It
///    also calls the original callee function. An example wrapper in JS code
///    would look like this:
///      function invoke_vi(index,a1) {
///        try {
///          Module["dynCall_vi"](index,a1); // This calls original callee
///        } catch(e) {
///          if (typeof e !== 'number' && e !== 'longjmp') throw e;
///          _setThrew(1, 0); // setThrew is called here
///        }
///      }
///    If an exception is thrown, __THREW__ will be set to 1 in a wrapper,
///    so we can jump to the right BB based on this value.
///
/// 4) Lower
///      %val = landingpad catch c1 catch c2 catch c3 ...
///      ... use %val ...
///    into
///      %fmc = call @__cxa_find_matching_catch_N(c1, c2, c3, ...)
///      %val = {%fmc, getTempRet0()}
///      ... use %val ...
///    Here N is a number calculated based on the number of clauses.
///    setTempRet0 is called from __cxa_find_matching_catch() in JS glue code.
///
/// 5) Lower
///      resume {%a, %b}
///    into
///      call @__resumeException(%a)
///    where __resumeException() is a function in JS glue code.
///
/// 6) Lower
///      call @llvm.eh.typeid.for(type) (intrinsic)
///    into
///      call @llvm_eh_typeid_for(type)
///    llvm_eh_typeid_for function will be generated in JS glue code.
///
/// * Emscripten setjmp / longjmp handling
///
/// If there are calls to longjmp()
///
/// 1) Lower
///      longjmp(env, val)
///    into
///      emscripten_longjmp(env, val)
///
/// If there are calls to setjmp()
///
/// 2) In the function entry that calls setjmp, initialize setjmpTable and
///    setjmpTableSize as follows:
///      setjmpTableSize = 4;
///      setjmpTable = (int *) malloc(40);
///      setjmpTable[0] = 0;
///    setjmpTable and setjmpTableSize are used to call saveSetjmp() function in
///    Emscripten compiler-rt.
///
/// 3) Lower
///      setjmp(env)
///    into
///      setjmpTable = saveSetjmp(env, label, setjmpTable, setjmpTableSize);
///      setjmpTableSize = getTempRet0();
///    For each dynamic setjmp call, setjmpTable stores its ID (a number which
///    is incrementally assigned from 0) and its label (a unique number that
///    represents each callsite of setjmp). When we need more entries in
///    setjmpTable, it is reallocated in saveSetjmp() in Emscripten's
///    compiler-rt and it will return the new table address, and assign the new
///    table size in setTempRet0(). saveSetjmp also stores the setjmp's ID into
///    the buffer 'env'. A BB with setjmp is split into two after setjmp call in
///    order to make the post-setjmp BB the possible destination of longjmp BB.
///
/// 4) Lower every call that might longjmp into
///      __THREW__ = 0;
///      call @__invoke_SIG(func, arg1, arg2)
///      %__THREW__.val = __THREW__;
///      __THREW__ = 0;
///      %__threwValue.val = __threwValue;
///      if (%__THREW__.val != 0 & %__threwValue.val != 0) {
///        %label = testSetjmp(mem[%__THREW__.val], setjmpTable,
///                            setjmpTableSize);
///        if (%label == 0)
///          emscripten_longjmp(%__THREW__.val, %__threwValue.val);
///        setTempRet0(%__threwValue.val);
///      } else {
///        %label = -1;
///      }
///      longjmp_result = getTempRet0();
///      switch %label {
///        label 1: goto post-setjmp BB 1
///        label 2: goto post-setjmp BB 2
///        ...
///        default: goto split next BB
///      }
///    testSetjmp examines setjmpTable to see if there is a matching setjmp
///    call. After calling an invoke wrapper, if a longjmp occurred, __THREW__
///    will be the address of the matching jmp_buf buffer and __threwValue will
///    be the second argument to longjmp. mem[%__THREW__.val] is a setjmp ID
///    that is stored in saveSetjmp. testSetjmp returns a setjmp label, a unique
///    ID to each setjmp callsite. Label 0 means this longjmp buffer does not
///    correspond to one of the setjmp callsites in this function, so in this
///    case we just chain the longjmp to the caller. Label -1 means no longjmp
///    occurred. Otherwise we jump to the right post-setjmp BB based on the
///    label.
///
/// * Wasm setjmp / longjmp handling
/// This mode still uses some Emscripten library functions but not JavaScript's
/// try-catch mechanism. It instead uses Wasm exception handling intrinsics,
/// which will be lowered to exception handling instructions.
///
/// If there are calls to longjmp()
///
/// 1) Lower
///      longjmp(env, val)
///    into
///      __wasm_longjmp(env, val)
///
/// If there are calls to setjmp()
///
/// 2) and 3): The same as 2) and 3) in Emscripten SjLj.
/// (setjmpTable/setjmpTableSize initialization + setjmp callsite
/// transformation)
///
/// 4) Create a catchpad with a wasm.catch() intrinsic, which returns the value
/// thrown by __wasm_longjmp function. In Emscripten library, we have this
/// struct:
///
/// struct __WasmLongjmpArgs {
///   void *env;
///   int val;
/// };
/// struct __WasmLongjmpArgs __wasm_longjmp_args;
///
/// The thrown value here is a pointer to __wasm_longjmp_args struct object. We
/// use this struct to transfer two values by throwing a single value. Wasm
/// throw and catch instructions are capable of throwing and catching multiple
/// values, but it also requires multivalue support that is currently not very
/// reliable.
/// TODO Switch to throwing and catching two values without using the struct
///
/// All longjmpable function calls will be converted to an invoke that will
/// unwind to this catchpad in case a longjmp occurs. Within the catchpad, we
/// test the thrown values using testSetjmp function as we do for Emscripten
/// SjLj. The main difference is, in Emscripten SjLj, we need to transform every
/// longjmpable callsite into a sequence of code including testSetjmp() call; in
/// Wasm SjLj we do the testing in only one place, in this catchpad.
///
/// After calling testSetjmp(), if the longjmp does not correspond to one of the
/// setjmps within the current function, it rethrows the longjmp by calling
/// __wasm_longjmp(). If it corresponds to one of the setjmps in the function,
/// we jump to the beginning of the function, which contains a switch to each
/// post-setjmp BB. Again, in Emscripten SjLj, this switch is added for every
/// longjmpable callsite; in Wasm SjLj we do this only once at the top of the
/// function (after setjmpTable/setjmpTableSize initialization).
///
/// The pseudocode below summarizes what we have described:
///
/// entry:
///   Initialize setjmpTable and setjmpTableSize
///
/// setjmp.dispatch:
///   switch %label {
///     label 1: goto post-setjmp BB 1
///     label 2: goto post-setjmp BB 2
///     ...
///     default: goto split next BB
///   }
/// ...
///
/// bb:
///   invoke void @foo() ;; foo is a longjmpable function
///     to label %next unwind label %catch.dispatch.longjmp
/// ...
///
/// catch.dispatch.longjmp:
///   %0 = catchswitch within none [label %catch.longjmp] unwind to caller
///
/// catch.longjmp:
///   %longjmp.args = wasm.catch() ;; struct __WasmLongjmpArgs
///   %env = load 'env' field from __WasmLongjmpArgs
///   %val = load 'val' field from __WasmLongjmpArgs
///   %label = testSetjmp(mem[%env], setjmpTable, setjmpTableSize);
///   if (%label == 0)
///     __wasm_longjmp(%env, %val)
///   catchret to %setjmp.dispatch
///
///===----------------------------------------------------------------------===//
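
// Illustrative sketch, not part of this pass: a plain C++ model of the invoke
// lowering described in step 3) of the file comment. C++ try/catch stands in
// for the JS try-catch inside an invoke wrapper. The names THREW, ThrewValue,
// invokeV and loweredInvoke are hypothetical and exist only in this example;
// the real globals and wrappers live in Emscripten's compiler-rt and JS glue.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Stand-ins for __THREW__ and __threwValue (assumption: simplified globals).
static uintptr_t THREW = 0;
static int ThrewValue = 0;

// Same shape as the setThrew() quoted in the file comment: the first event
// recorded wins, later ones are ignored until THREW is reset.
static void setThrew(uintptr_t Threw, int Value) {
  if (THREW == 0) {
    THREW = Threw;
    ThrewValue = Value;
  }
}

// Plays the role of a JS invoke_v wrapper: call the callee inside try/catch
// and record that something was thrown instead of propagating it.
static void invokeV(void (*Callee)()) {
  try {
    Callee();
  } catch (...) {
    setThrew(1, 0);
  }
}

static void mayThrow() { throw std::runtime_error("boom"); }
static void noThrow() {}

// Shape of the lowered invoke. Returns true if control would continue at
// %lpad and false if it would continue at %invoke.cont.
static bool loweredInvoke(void (*Callee)()) {
  THREW = 0;                  // __THREW__ = 0;
  invokeV(Callee);            // call @__invoke_SIG(func)
  uintptr_t ThrewVal = THREW; // %__THREW__.val = __THREW__;
  THREW = 0;                  // __THREW__ = 0;
  return ThrewVal == 1;
}
```

// In this model, loweredInvoke(mayThrow) takes the landing-pad path and
// loweredInvoke(noThrow) takes the normal path; THREW is back to 0 either way.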

#include "Utils/WebAssemblyUtilities.h"
#include "WebAssembly.h"
#include "WebAssemblyTargetMachine.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/CodeGen/WasmEHFuncInfo.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/IntrinsicsWebAssembly.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/Local.h"
#include "llvm/Transforms/Utils/SSAUpdater.h"
#include "llvm/Transforms/Utils/SSAUpdaterBulk.h"

using namespace llvm;

#define DEBUG_TYPE "wasm-lower-em-ehsjlj"

static cl::list<std::string>
    EHAllowlist("emscripten-cxx-exceptions-allowed",
                cl::desc("The list of function names in which Emscripten-style "
                         "exception handling is enabled (see emscripten "
                         "EMSCRIPTEN_CATCHING_ALLOWED options)"),
                cl::CommaSeparated);

namespace {
class WebAssemblyLowerEmscriptenEHSjLj final : public ModulePass {
  bool EnableEmEH;     // Enable Emscripten exception handling
  bool EnableEmSjLj;   // Enable Emscripten setjmp/longjmp handling
  bool EnableWasmSjLj; // Enable Wasm setjmp/longjmp handling
  bool DoSjLj;         // Whether we actually perform setjmp/longjmp handling

  GlobalVariable *ThrewGV = nullptr;      // __THREW__ (Emscripten)
  GlobalVariable *ThrewValueGV = nullptr; // __threwValue (Emscripten)
  Function *GetTempRet0F = nullptr;       // getTempRet0() (Emscripten)
  Function *SetTempRet0F = nullptr;       // setTempRet0() (Emscripten)
  Function *ResumeF = nullptr;            // __resumeException() (Emscripten)
  Function *EHTypeIDF = nullptr;          // llvm.eh.typeid.for() (intrinsic)
  Function *EmLongjmpF = nullptr;         // emscripten_longjmp() (Emscripten)
  Function *SaveSetjmpF = nullptr;        // saveSetjmp() (Emscripten)
  Function *TestSetjmpF = nullptr;        // testSetjmp() (Emscripten)
  Function *WasmLongjmpF = nullptr;       // __wasm_longjmp() (Emscripten)
  Function *CatchF = nullptr;             // wasm.catch() (intrinsic)

  // type of 'struct __WasmLongjmpArgs' defined in emscripten
  Type *LongjmpArgsTy = nullptr;

  // __cxa_find_matching_catch_N functions.
  // Indexed by the number of clauses in an original landingpad instruction.
  DenseMap<int, Function *> FindMatchingCatches;
  // Map of <function signature string, invoke_ wrappers>
  StringMap<Function *> InvokeWrappers;
  // Set of allowed function names for exception handling
  std::set<std::string> EHAllowlistSet;
  // Functions that contain calls to setjmp
  SmallPtrSet<Function *, 8> SetjmpUsers;

  StringRef getPassName() const override {
    return "WebAssembly Lower Emscripten Exceptions";
  }

  using InstVector = SmallVectorImpl<Instruction *>;
  bool runEHOnFunction(Function &F);
  bool runSjLjOnFunction(Function &F);
  void handleLongjmpableCallsForEmscriptenSjLj(
      Function &F, InstVector &SetjmpTableInsts,
      InstVector &SetjmpTableSizeInsts,
      SmallVectorImpl<PHINode *> &SetjmpRetPHIs);
  void
  handleLongjmpableCallsForWasmSjLj(Function &F, InstVector &SetjmpTableInsts,
                                    InstVector &SetjmpTableSizeInsts,
                                    SmallVectorImpl<PHINode *> &SetjmpRetPHIs);
  Function *getFindMatchingCatch(Module &M, unsigned NumClauses);

  Value *wrapInvoke(CallBase *CI);
  void wrapTestSetjmp(BasicBlock *BB, DebugLoc DL, Value *Threw,
                      Value *SetjmpTable, Value *SetjmpTableSize, Value *&Label,
                      Value *&LongjmpResult, BasicBlock *&CallEmLongjmpBB,
                      PHINode *&CallEmLongjmpBBThrewPHI,
                      PHINode *&CallEmLongjmpBBThrewValuePHI,
                      BasicBlock *&EndBB);
  Function *getInvokeWrapper(CallBase *CI);

  bool areAllExceptionsAllowed() const { return EHAllowlistSet.empty(); }
  bool supportsException(const Function *F) const {
    return EnableEmEH && (areAllExceptionsAllowed() ||
                          EHAllowlistSet.count(std::string(F->getName())));
  }
  void replaceLongjmpWith(Function *LongjmpF, Function *NewF);

  void rebuildSSA(Function &F);

public:
  static char ID;

  WebAssemblyLowerEmscriptenEHSjLj()
      : ModulePass(ID), EnableEmEH(WebAssembly::WasmEnableEmEH),
        EnableEmSjLj(WebAssembly::WasmEnableEmSjLj),
        EnableWasmSjLj(WebAssembly::WasmEnableSjLj) {
    assert(!(EnableEmSjLj && EnableWasmSjLj) &&
           "Two SjLj modes cannot be turned on at the same time");
    assert(!(EnableEmEH && EnableWasmSjLj) &&
           "Wasm SjLj should be only used with Wasm EH");
    EHAllowlistSet.insert(EHAllowlist.begin(), EHAllowlist.end());
  }
  bool runOnModule(Module &M) override;

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.addRequired<DominatorTreeWrapperPass>();
  }
};
} // End anonymous namespace

char WebAssemblyLowerEmscriptenEHSjLj::ID = 0;
INITIALIZE_PASS(WebAssemblyLowerEmscriptenEHSjLj, DEBUG_TYPE,
                "WebAssembly Lower Emscripten Exceptions / Setjmp / Longjmp",
                false, false)

ModulePass *llvm::createWebAssemblyLowerEmscriptenEHSjLj() {
  return new WebAssemblyLowerEmscriptenEHSjLj();
}

static bool canThrow(const Value *V) {
  if (const auto *F = dyn_cast<const Function>(V)) {
    // Intrinsics cannot throw
    if (F->isIntrinsic())
      return false;
    StringRef Name = F->getName();
    // leave setjmp and longjmp (mostly) alone, we process them properly later
    if (Name == "setjmp" || Name == "longjmp" || Name == "emscripten_longjmp")
      return false;
    return !F->doesNotThrow();
  }
  // not a function, so an indirect call - can throw, we can't tell
  return true;
}

// Get a thread-local global variable with the given name. If it doesn't exist,
// declare it, which will generate an import and assume that it will exist at
// link time.
static GlobalVariable *getGlobalVariable(Module &M, Type *Ty,
                                         WebAssemblyTargetMachine &TM,
                                         const char *Name) {
  auto *GV = dyn_cast<GlobalVariable>(M.getOrInsertGlobal(Name, Ty));
  if (!GV)
    report_fatal_error(Twine("unable to create global: ") + Name);

  // Variables created by this function are thread local. If the target does not
  // support TLS, we depend on CoalesceFeaturesAndStripAtomics to downgrade them
  // to non-thread-local ones, in which case we don't allow this object to be
  // linked with other objects using shared memory.
  GV->setThreadLocalMode(GlobalValue::GeneralDynamicTLSModel);
  return GV;
}

// Simple function name mangler.
// This function simply takes LLVM's string representation of parameter types
// and concatenates them with '_'. There are non-alphanumeric characters but llc
// is ok with it, and we need to postprocess these names after the lowering
// phase anyway.
static std::string getSignature(FunctionType *FTy) {
  std::string Sig;
  raw_string_ostream OS(Sig);
  OS << *FTy->getReturnType();
  for (Type *ParamTy : FTy->params())
    OS << "_" << *ParamTy;
  if (FTy->isVarArg())
    OS << "_...";
  Sig = OS.str();
  erase_if(Sig, isSpace);
  // When s2wasm parses .s file, a comma means the end of an argument. So a
  // mangled function name can contain any character but a comma.
  std::replace(Sig.begin(), Sig.end(), ',', '.');
  return Sig;
}
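
// Illustrative sketch, not part of this pass: the same mangling steps as
// getSignature() without any LLVM dependency. Types are passed in as the
// strings LLVM would print for them. mangleSignature is a hypothetical helper
// name used only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Joins the return type and parameter types with '_', drops whitespace, and
// replaces ',' with '.', mirroring the steps in getSignature().
static std::string mangleSignature(const std::string &RetTy,
                                   const std::vector<std::string> &ParamTys,
                                   bool IsVarArg) {
  std::string Sig = RetTy;
  for (const std::string &ParamTy : ParamTys)
    Sig += "_" + ParamTy;
  if (IsVarArg)
    Sig += "_...";
  // Drop whitespace, e.g. the spaces inside "{ i32, i8* }".
  Sig.erase(std::remove_if(
                Sig.begin(), Sig.end(),
                [](unsigned char C) { return std::isspace(C) != 0; }),
            Sig.end());
  // A comma would end an argument when the .s file is parsed, so replace it.
  std::replace(Sig.begin(), Sig.end(), ',', '.');
  return Sig;
}
```

// Under these assumptions, a void function taking one i32 mangles to
// "void_i32", so its wrapper would be named "__invoke_void_i32".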

static Function *getEmscriptenFunction(FunctionType *Ty, const Twine &Name,
                                       Module *M) {
  Function *F = Function::Create(Ty, GlobalValue::ExternalLinkage, Name, M);
  // Tell the linker that this function is expected to be imported from the
  // 'env' module.
  if (!F->hasFnAttribute("wasm-import-module")) {
    llvm::AttrBuilder B(M->getContext());
    B.addAttribute("wasm-import-module", "env");
    F->addFnAttrs(B);
  }
  if (!F->hasFnAttribute("wasm-import-name")) {
    llvm::AttrBuilder B(M->getContext());
    B.addAttribute("wasm-import-name", F->getName());
    F->addFnAttrs(B);
  }
  return F;
}

// Returns an integer type for the target architecture's address space.
// i32 for wasm32 and i64 for wasm64.
static Type *getAddrIntType(Module *M) {
  IRBuilder<> IRB(M->getContext());
  return IRB.getIntNTy(M->getDataLayout().getPointerSizeInBits());
}

// Returns an integer pointer type for the target architecture's address space.
// i32* for wasm32 and i64* for wasm64.
static Type *getAddrPtrType(Module *M) {
  return Type::getIntNPtrTy(M->getContext(),
                            M->getDataLayout().getPointerSizeInBits());
}

// Returns an integer whose type is the integer type for the target's address
// space. Returns (i32 C) for wasm32 and (i64 C) for wasm64, when C is the
// integer.
static Value *getAddrSizeInt(Module *M, uint64_t C) {
  IRBuilder<> IRB(M->getContext());
  return IRB.getIntN(M->getDataLayout().getPointerSizeInBits(), C);
}

// Returns __cxa_find_matching_catch_N function, where N = NumClauses + 2.
// This is because a landingpad instruction contains two more arguments, a
// personality function and a cleanup bit, and __cxa_find_matching_catch_N
// functions are named after the number of arguments in the original landingpad
// instruction.
Function *
WebAssemblyLowerEmscriptenEHSjLj::getFindMatchingCatch(Module &M,
                                                       unsigned NumClauses) {
  if (FindMatchingCatches.count(NumClauses))
    return FindMatchingCatches[NumClauses];
  PointerType *Int8PtrTy = Type::getInt8PtrTy(M.getContext());
  SmallVector<Type *, 16> Args(NumClauses, Int8PtrTy);
  FunctionType *FTy = FunctionType::get(Int8PtrTy, Args, false);
  Function *F = getEmscriptenFunction(
      FTy, "__cxa_find_matching_catch_" + Twine(NumClauses + 2), &M);
  FindMatchingCatches[NumClauses] = F;
  return F;
}

// Generate invoke wrapper sequence with preamble and postamble
// Preamble:
//   __THREW__ = 0;
// Postamble:
//   %__THREW__.val = __THREW__; __THREW__ = 0;
// Returns %__THREW__.val, which indicates whether an exception is thrown (or
// whether longjmp occurred), for future use.
Value *WebAssemblyLowerEmscriptenEHSjLj::wrapInvoke(CallBase *CI) {
  Module *M = CI->getModule();
  LLVMContext &C = M->getContext();

  IRBuilder<> IRB(C);
  IRB.SetInsertPoint(CI);

  // Pre-invoke
  // __THREW__ = 0;
  IRB.CreateStore(getAddrSizeInt(M, 0), ThrewGV);

  // Invoke function wrapper in JavaScript
  SmallVector<Value *, 16> Args;
  // Put the pointer to the callee as first argument, so it can be called
  // within the invoke wrapper later
  Args.push_back(CI->getCalledOperand());
  Args.append(CI->arg_begin(), CI->arg_end());
  CallInst *NewCall = IRB.CreateCall(getInvokeWrapper(CI), Args);
  NewCall->takeName(CI);
  NewCall->setCallingConv(CallingConv::WASM_EmscriptenInvoke);
  NewCall->setDebugLoc(CI->getDebugLoc());

  // Because we added the pointer to the callee as first argument, all
  // argument attribute indices have to be incremented by one.
  SmallVector<AttributeSet, 8> ArgAttributes;
  const AttributeList &InvokeAL = CI->getAttributes();

  // No attributes for the callee pointer.
  ArgAttributes.push_back(AttributeSet());
  // Copy the argument attributes from the original
  for (unsigned I = 0, E = CI->arg_size(); I < E; ++I)
    ArgAttributes.push_back(InvokeAL.getParamAttrs(I));

  AttrBuilder FnAttrs(CI->getContext(), InvokeAL.getFnAttrs());
  if (FnAttrs.contains(Attribute::AllocSize)) {
    // The allocsize attribute (if any) refers to parameters by index and needs
    // to be adjusted.
    unsigned SizeArg;
    Optional<unsigned> NEltArg;
    std::tie(SizeArg, NEltArg) = FnAttrs.getAllocSizeArgs();
    SizeArg += 1;
    if (NEltArg.hasValue())
      NEltArg = NEltArg.getValue() + 1;
    FnAttrs.addAllocSizeAttr(SizeArg, NEltArg);
  }
  // In case the callee has 'noreturn' attribute, we need to remove it, because
  // we expect invoke wrappers to return.
  FnAttrs.removeAttribute(Attribute::NoReturn);

  // Reconstruct the AttributesList based on the vector we constructed.
  AttributeList NewCallAL = AttributeList::get(
      C, AttributeSet::get(C, FnAttrs), InvokeAL.getRetAttrs(), ArgAttributes);
  NewCall->setAttributes(NewCallAL);

  CI->replaceAllUsesWith(NewCall);

  // Post-invoke
  // %__THREW__.val = __THREW__; __THREW__ = 0;
  Value *Threw =
      IRB.CreateLoad(getAddrIntType(M), ThrewGV, ThrewGV->getName() + ".val");
  IRB.CreateStore(getAddrSizeInt(M, 0), ThrewGV);
  return Threw;
}
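
// Illustrative sketch, not part of this pass: why the attribute indices in
// wrapInvoke() shift by one. The model below uses plain strings instead of
// LLVM's AttributeSet; CallAttrs and shiftForInvokeWrapper are hypothetical
// names for this example, and std::optional (C++17) stands in for
// llvm::Optional.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Simplified model of a call's attributes: one string per argument plus an
// optional allocsize(SizeArg[, NumEltsArg]) pair of argument indices.
struct CallAttrs {
  std::vector<std::string> ArgAttrs;
  std::optional<std::pair<unsigned, std::optional<unsigned>>> AllocSize;
};

// Makes room for the callee pointer that the invoke wrapper takes as its
// first argument: every argument slot and every allocsize index moves by one.
static CallAttrs shiftForInvokeWrapper(const CallAttrs &Orig) {
  CallAttrs New;
  New.ArgAttrs.push_back(""); // no attributes for the callee pointer
  New.ArgAttrs.insert(New.ArgAttrs.end(), Orig.ArgAttrs.begin(),
                      Orig.ArgAttrs.end());
  if (Orig.AllocSize) {
    unsigned SizeArg = Orig.AllocSize->first + 1;
    std::optional<unsigned> NumEltsArg = Orig.AllocSize->second;
    if (NumEltsArg)
      NumEltsArg = *NumEltsArg + 1;
    New.AllocSize = std::make_pair(SizeArg, NumEltsArg);
  }
  return New;
}
```

// For a callee with allocsize(0, 1), the wrapper call in this model carries
// allocsize(1, 2), because argument 0 is now the callee pointer.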

// Get matching invoke wrapper based on callee signature
Function *WebAssemblyLowerEmscriptenEHSjLj::getInvokeWrapper(CallBase *CI) {
  Module *M = CI->getModule();
  SmallVector<Type *, 16> ArgTys;
  FunctionType *CalleeFTy = CI->getFunctionType();

  std::string Sig = getSignature(CalleeFTy);
  if (InvokeWrappers.find(Sig) != InvokeWrappers.end())
    return InvokeWrappers[Sig];

  // Put the pointer to the callee as first argument
  ArgTys.push_back(PointerType::getUnqual(CalleeFTy));
  // Add argument types
  ArgTys.append(CalleeFTy->param_begin(), CalleeFTy->param_end());

  FunctionType *FTy = FunctionType::get(CalleeFTy->getReturnType(), ArgTys,
                                        CalleeFTy->isVarArg());
  Function *F = getEmscriptenFunction(FTy, "__invoke_" + Sig, M);
  InvokeWrappers[Sig] = F;
  return F;
}

static bool canLongjmp(const Value *Callee) {
  if (auto *CalleeF = dyn_cast<Function>(Callee))
    if (CalleeF->isIntrinsic())
      return false;

  // Attempting to transform inline assembly will result in something like:
  //     call void @__invoke_void(void ()* asm ...)
  // which is invalid because inline assembly blocks do not have addresses
  // and can't be passed by pointer. The result is a crash with illegal IR.
  if (isa<InlineAsm>(Callee))
    return false;
  StringRef CalleeName = Callee->getName();

  // TODO Include more functions or consider checking with mangled prefixes

  // The reason we include malloc/free here is to exclude the malloc/free
  // calls generated in setjmp prep / cleanup routines.
  if (CalleeName == "setjmp" || CalleeName == "malloc" || CalleeName == "free")
    return false;

  // These are functions in Emscripten's JS glue code or compiler-rt
  if (CalleeName == "__resumeException" || CalleeName == "llvm_eh_typeid_for" ||
      CalleeName == "saveSetjmp" || CalleeName == "testSetjmp" ||
      CalleeName == "getTempRet0" || CalleeName == "setTempRet0")
    return false;

  // __cxa_find_matching_catch_N functions cannot longjmp
  if (Callee->getName().startswith("__cxa_find_matching_catch_"))
    return false;

  // Exception-catching related functions
  //
  // We intentionally treat __cxa_end_catch longjmpable in Wasm SjLj even though
  // it surely cannot longjmp, in order to maintain the unwind relationship from
  // all existing catchpads (and calls within them) to catch.dispatch.longjmp.
  //
  // In Wasm EH + Wasm SjLj, we
  // 1. Make all catchswitch and cleanuppad that unwind to caller unwind to
  //    catch.dispatch.longjmp instead
  // 2. Convert all longjmpable calls to invokes that unwind to
  //    catch.dispatch.longjmp
  // But catchswitch BBs are removed in isel, so if an EH catchswitch (generated
  // from an exception)'s catchpad does not contain any calls that are converted
  // into invokes unwinding to catch.dispatch.longjmp, this unwind relationship
  // (EH catchswitch BB -> catch.dispatch.longjmp BB) is lost and
  // catch.dispatch.longjmp BB can be placed before the EH catchswitch BB in
  // CFGSort.
  // int ret = setjmp(buf);
  // try {
  //   foo(); // longjmps
  // } catch (...) {
  // }
  // Then in this code, if 'foo' longjmps, it first unwinds to 'catch (...)'
  // catchswitch, and is not caught by that catchswitch because it is a longjmp,
  // then it should next unwind to catch.dispatch.longjmp BB. But if this 'catch
  // (...)' catchswitch -> catch.dispatch.longjmp unwind relationship is lost,
  // it will not unwind to catch.dispatch.longjmp, producing an incorrect
  // result.
  //
  // Every catchpad generated by Wasm C++ contains __cxa_end_catch, so we
  // intentionally treat it as longjmpable to work around this problem. This is
  // a hacky fix but an easy one.
  //
  // The comment block in findWasmUnwindDestinations() in
  // SelectionDAGBuilder.cpp is addressing a similar problem.
  if (CalleeName == "__cxa_end_catch")
    return WebAssembly::WasmEnableSjLj;
  if (CalleeName == "__cxa_begin_catch" ||
      CalleeName == "__cxa_allocate_exception" || CalleeName == "__cxa_throw" ||
      CalleeName == "__clang_call_terminate")
    return false;

  // std::terminate, which is generated when another exception occurs while
  // handling an exception, cannot longjmp.
  if (CalleeName == "_ZSt9terminatev")
    return false;

  // Otherwise we don't know
  return true;
}

static bool isEmAsmCall(const Value *Callee) {
  StringRef CalleeName = Callee->getName();
  // This is an exhaustive list from Emscripten's <emscripten/em_asm.h>.
  return CalleeName == "emscripten_asm_const_int" ||
         CalleeName == "emscripten_asm_const_double" ||
         CalleeName == "emscripten_asm_const_int_sync_on_main_thread" ||
         CalleeName == "emscripten_asm_const_double_sync_on_main_thread" ||
         CalleeName == "emscripten_asm_const_async_on_main_thread";
}

691 // Generate testSetjmp function call seqence with preamble and postamble.
692 // The code this generates is equivalent to the following JavaScript code:
693 // %__threwValue.val = __threwValue;
694 // if (%__THREW__.val != 0 & %__threwValue.val != 0) {
695 // %label = testSetjmp(mem[%__THREW__.val], setjmpTable, setjmpTableSize);
696 // if (%label == 0)
697 // emscripten_longjmp(%__THREW__.val, %__threwValue.val);
698 // setTempRet0(%__threwValue.val);
699 // } else {
700 // %label = -1;
701 // }
702 // %longjmp_result = getTempRet0();
704 // As output parameters. returns %label, %longjmp_result, and the BB the last
705 // instruction (%longjmp_result = ...) is in.
706 void WebAssemblyLowerEmscriptenEHSjLj::wrapTestSetjmp(
707 BasicBlock *BB, DebugLoc DL, Value *Threw, Value *SetjmpTable,
708 Value *SetjmpTableSize, Value *&Label, Value *&LongjmpResult,
709 BasicBlock *&CallEmLongjmpBB, PHINode *&CallEmLongjmpBBThrewPHI,
710 PHINode *&CallEmLongjmpBBThrewValuePHI, BasicBlock *&EndBB) {
711 Function *F = BB->getParent();
712 Module *M = F->getParent();
713 LLVMContext &C = M->getContext();
714 IRBuilder<> IRB(C);
715 IRB.SetCurrentDebugLocation(DL);
717 // if (%__THREW__.val != 0 & %__threwValue.val != 0)
718 IRB.SetInsertPoint(BB);
719 BasicBlock *ThenBB1 = BasicBlock::Create(C, "if.then1", F);
720 BasicBlock *ElseBB1 = BasicBlock::Create(C, "if.else1", F);
721 BasicBlock *EndBB1 = BasicBlock::Create(C, "if.end", F);
722 Value *ThrewCmp = IRB.CreateICmpNE(Threw, getAddrSizeInt(M, 0));
723 Value *ThrewValue = IRB.CreateLoad(IRB.getInt32Ty(), ThrewValueGV,
724 ThrewValueGV->getName() + ".val");
725 Value *ThrewValueCmp = IRB.CreateICmpNE(ThrewValue, IRB.getInt32(0));
726 Value *Cmp1 = IRB.CreateAnd(ThrewCmp, ThrewValueCmp, "cmp1");
727 IRB.CreateCondBr(Cmp1, ThenBB1, ElseBB1);
729 // Generate call.em.longjmp BB once and share it within the function
730 if (!CallEmLongjmpBB) {
731 // emscripten_longjmp(%__THREW__.val, %__threwValue.val);
732 CallEmLongjmpBB = BasicBlock::Create(C, "call.em.longjmp", F);
733 IRB.SetInsertPoint(CallEmLongjmpBB);
734 CallEmLongjmpBBThrewPHI = IRB.CreatePHI(getAddrIntType(M), 4, "threw.phi");
735 CallEmLongjmpBBThrewValuePHI =
736 IRB.CreatePHI(IRB.getInt32Ty(), 4, "threwvalue.phi");
737 CallEmLongjmpBBThrewPHI->addIncoming(Threw, ThenBB1);
738 CallEmLongjmpBBThrewValuePHI->addIncoming(ThrewValue, ThenBB1);
739 IRB.CreateCall(EmLongjmpF,
740 {CallEmLongjmpBBThrewPHI, CallEmLongjmpBBThrewValuePHI});
741 IRB.CreateUnreachable();
742 } else {
743 CallEmLongjmpBBThrewPHI->addIncoming(Threw, ThenBB1);
744 CallEmLongjmpBBThrewValuePHI->addIncoming(ThrewValue, ThenBB1);
747 // %label = testSetjmp(mem[%__THREW__.val], setjmpTable, setjmpTableSize);
748 // if (%label == 0)
749 IRB.SetInsertPoint(ThenBB1);
750 BasicBlock *EndBB2 = BasicBlock::Create(C, "if.end2", F);
751 Value *ThrewPtr =
752 IRB.CreateIntToPtr(Threw, getAddrPtrType(M), Threw->getName() + ".p");
753 Value *LoadedThrew = IRB.CreateLoad(getAddrIntType(M), ThrewPtr,
754 ThrewPtr->getName() + ".loaded");
755 Value *ThenLabel = IRB.CreateCall(
756 TestSetjmpF, {LoadedThrew, SetjmpTable, SetjmpTableSize}, "label");
757 Value *Cmp2 = IRB.CreateICmpEQ(ThenLabel, IRB.getInt32(0));
758 IRB.CreateCondBr(Cmp2, CallEmLongjmpBB, EndBB2);
760 // setTempRet0(%__threwValue.val);
761 IRB.SetInsertPoint(EndBB2);
762 IRB.CreateCall(SetTempRet0F, ThrewValue);
763 IRB.CreateBr(EndBB1);
765 IRB.SetInsertPoint(ElseBB1);
766 IRB.CreateBr(EndBB1);
768 // longjmp_result = getTempRet0();
769 IRB.SetInsertPoint(EndBB1);
770 PHINode *LabelPHI = IRB.CreatePHI(IRB.getInt32Ty(), 2, "label");
771 LabelPHI->addIncoming(ThenLabel, EndBB2);
773 LabelPHI->addIncoming(IRB.getInt32(-1), ElseBB1);
775 // Output parameter assignment
776 Label = LabelPHI;
777 EndBB = EndBB1;
  LongjmpResult = IRB.CreateCall(GetTempRet0F, None, "longjmp_result");
}
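
// Illustration only, not part of the pass: the branches that wrapTestSetjmp
// emits can be sketched in plain C++ as below. AfterCall, DispatchResult and
// dispatchAfterCall are hypothetical names, and the std::map stands in for
// the runtime's testSetjmp lookup. As a simplification, the sketch looks up
// the __THREW__ value directly, whereas the emitted code first loads the id
// stored at that address and passes the loaded id to testSetjmp.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical names for this sketch; none of them exist in LLVM or Emscripten.
enum class AfterCall { NoLongjmp, ForwardLongjmp, OurSetjmp };

struct DispatchResult {
  AfterCall Kind;
  int32_t Label; // -1 when no longjmp occurred, mirroring the "label" PHI
};

// Models the emitted branches. Table maps the value carried by __THREW__ to a
// setjmp label; labels start at 1, and 0 means "not ours to handle".
inline DispatchResult dispatchAfterCall(
    uintptr_t Threw, int32_t ThrewValue,
    const std::map<uintptr_t, int32_t> &Table) {
  // if (%__THREW__.val != 0 & %__threwValue.val != 0)
  if (Threw != 0 && ThrewValue != 0) {
    auto It = Table.find(Threw);
    int32_t Label = It == Table.end() ? 0 : It->second;
    // Label 0: the emitted code branches to call.em.longjmp, which calls
    // emscripten_longjmp again with the same two values.
    if (Label == 0)
      return {AfterCall::ForwardLongjmp, 0};
    return {AfterCall::OurSetjmp, Label};
  }
  // if.else1: nothing to dispatch; the label PHI receives -1.
  return {AfterCall::NoLongjmp, -1};
}
```

// The three outcomes correspond to the if.else1 path, the call.em.longjmp
// path, and the if.end2 path of the emitted code, respectively.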
void WebAssemblyLowerEmscriptenEHSjLj::rebuildSSA(Function &F) {
  DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();
  DT.recalculate(F); // CFG has been changed

  SSAUpdaterBulk SSA;
  for (BasicBlock &BB : F) {
    for (Instruction &I : BB) {
      unsigned VarID = SSA.AddVariable(I.getName(), I.getType());
      // If a value is defined by an invoke instruction, it is only available in
      // its normal destination and not in its unwind destination.
      if (auto *II = dyn_cast<InvokeInst>(&I))
        SSA.AddAvailableValue(VarID, II->getNormalDest(), II);
      else
        SSA.AddAvailableValue(VarID, &BB, &I);
      for (auto &U : I.uses()) {
        auto *User = cast<Instruction>(U.getUser());
        if (auto *UserPN = dyn_cast<PHINode>(User))
          if (UserPN->getIncomingBlock(U) == &BB)
            continue;
        if (DT.dominates(&I, User))
          continue;
        SSA.AddUse(VarID, &U);
      }
    }
  }
  SSA.RewriteAllUses(&DT);
}

// Replace uses of longjmp with a new longjmp function in Emscripten library.
// In Emscripten SjLj, the new function is
//   void emscripten_longjmp(uintptr_t, i32)
// In Wasm SjLj, the new function is
//   void __wasm_longjmp(i8*, i32)
// Because the original libc longjmp function takes (jmp_buf*, i32), we need a
// ptrtoint/bitcast instruction here to make the type match. jmp_buf* will
// eventually be lowered to i32/i64 in the wasm backend.
void WebAssemblyLowerEmscriptenEHSjLj::replaceLongjmpWith(Function *LongjmpF,
                                                          Function *NewF) {
  assert(NewF == EmLongjmpF || NewF == WasmLongjmpF);
  Module *M = LongjmpF->getParent();
  SmallVector<CallInst *, 8> ToErase;
  LLVMContext &C = LongjmpF->getParent()->getContext();
  IRBuilder<> IRB(C);

  // For calls to longjmp, replace it with emscripten_longjmp/__wasm_longjmp and
  // cast its first argument (jmp_buf*) appropriately
  for (User *U : LongjmpF->users()) {
    auto *CI = dyn_cast<CallInst>(U);
    if (CI && CI->getCalledFunction() == LongjmpF) {
      IRB.SetInsertPoint(CI);
      Value *Env = nullptr;
      if (NewF == EmLongjmpF)
        Env =
            IRB.CreatePtrToInt(CI->getArgOperand(0), getAddrIntType(M), "env");
      else // WasmLongjmpF
        Env =
            IRB.CreateBitCast(CI->getArgOperand(0), IRB.getInt8PtrTy(), "env");
      IRB.CreateCall(NewF, {Env, CI->getArgOperand(1)});
      ToErase.push_back(CI);
    }
  }
  for (auto *I : ToErase)
    I->eraseFromParent();

  // If we have any remaining uses of longjmp's function pointer, replace it
  // with (void(*)(jmp_buf*, int))emscripten_longjmp / __wasm_longjmp.
  if (!LongjmpF->uses().empty()) {
    Value *NewLongjmp =
        IRB.CreateBitCast(NewF, LongjmpF->getType(), "longjmp.cast");
    LongjmpF->replaceAllUsesWith(NewLongjmp);
  }
}

static bool containsLongjmpableCalls(const Function *F) {
  for (const auto &BB : *F)
    for (const auto &I : BB)
      if (const auto *CB = dyn_cast<CallBase>(&I))
        if (canLongjmp(CB->getCalledOperand()))
          return true;
  return false;
}

// When a function contains a setjmp call but not other calls that can longjmp,
// we don't do setjmp transformation for that setjmp. But we need to convert the
// setjmp calls into "i32 0" so they don't cause link time errors. setjmp always
// returns 0 when called directly.
static void nullifySetjmp(Function *F) {
  Module &M = *F->getParent();
  IRBuilder<> IRB(M.getContext());
  Function *SetjmpF = M.getFunction("setjmp");
  SmallVector<Instruction *, 1> ToErase;

  for (User *U : make_early_inc_range(SetjmpF->users())) {
    auto *CB = cast<CallBase>(U);
    BasicBlock *BB = CB->getParent();
    if (BB->getParent() != F) // in other function
      continue;
    CallInst *CI = nullptr;
    // setjmp cannot throw. So if it is an invoke, lower it to a call
    if (auto *II = dyn_cast<InvokeInst>(CB))
      CI = llvm::changeToCall(II);
    else
      CI = cast<CallInst>(CB);
    ToErase.push_back(CI);
    CI->replaceAllUsesWith(IRB.getInt32(0));
  }
  for (auto *I : ToErase)
    I->eraseFromParent();
}
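
// Illustration only, not part of the pass: nullifySetjmp relies on setjmp
// returning 0 when called directly. A small standalone program, using only
// the standard <csetjmp> facilities, shows the two ways the code after setjmp
// can be reached. Env and observeSetjmp are names made up for this sketch.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf Env;

// Returns 0 if the code after setjmp was reached by the direct call, or the
// value passed to longjmp if it was reached by a jump back.
int observeSetjmp(bool DoLongjmp) {
  switch (setjmp(Env)) {
  case 0:
    if (DoLongjmp)
      std::longjmp(Env, 7); // control re-enters the switch with value 7
    return 0;
  case 7:
    return 7;
  default:
    return -1; // not reached in this sketch
  }
}
```

// When nothing in the function can longjmp, only the "case 0" path is ever
// taken, which is why replacing the call with the constant 0 is safe there.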
891 bool WebAssemblyLowerEmscriptenEHSjLj::runOnModule(Module &M) {
892 LLVM_DEBUG(dbgs() << "********** Lower Emscripten EH & SjLj **********\n");
894 LLVMContext &C = M.getContext();
895 IRBuilder<> IRB(C);
897 Function *SetjmpF = M.getFunction("setjmp");
898 Function *LongjmpF = M.getFunction("longjmp");
  // On some platforms _setjmp and _longjmp are used instead. Change these to
  // use setjmp/longjmp instead, because we later detect these functions by
  // their names.
  Function *SetjmpF2 = M.getFunction("_setjmp");
  Function *LongjmpF2 = M.getFunction("_longjmp");
  if (SetjmpF2) {
    if (SetjmpF) {
      if (SetjmpF->getFunctionType() != SetjmpF2->getFunctionType())
        report_fatal_error("setjmp and _setjmp have different function types");
    } else {
      SetjmpF = Function::Create(SetjmpF2->getFunctionType(),
                                 GlobalValue::ExternalLinkage, "setjmp", M);
    }
    SetjmpF2->replaceAllUsesWith(SetjmpF);
  }
  if (LongjmpF2) {
    if (LongjmpF) {
      if (LongjmpF->getFunctionType() != LongjmpF2->getFunctionType())
        report_fatal_error(
            "longjmp and _longjmp have different function types");
    } else {
      LongjmpF = Function::Create(LongjmpF2->getFunctionType(),
                                  GlobalValue::ExternalLinkage, "longjmp", M);
    }
    LongjmpF2->replaceAllUsesWith(LongjmpF);
  }

927 auto *TPC = getAnalysisIfAvailable<TargetPassConfig>();
928 assert(TPC && "Expected a TargetPassConfig");
929 auto &TM = TPC->getTM<WebAssemblyTargetMachine>();
931 // Declare (or get) global variables __THREW__, __threwValue, and
932 // getTempRet0/setTempRet0 function which are used in common for both
933 // exception handling and setjmp/longjmp handling
934 ThrewGV = getGlobalVariable(M, getAddrIntType(&M), TM, "__THREW__");
935 ThrewValueGV = getGlobalVariable(M, IRB.getInt32Ty(), TM, "__threwValue");
936 GetTempRet0F = getEmscriptenFunction(
937 FunctionType::get(IRB.getInt32Ty(), false), "getTempRet0", &M);
938 SetTempRet0F = getEmscriptenFunction(
939 FunctionType::get(IRB.getVoidTy(), IRB.getInt32Ty(), false),
940 "setTempRet0", &M);
941 GetTempRet0F->setDoesNotThrow();
942 SetTempRet0F->setDoesNotThrow();
944 bool Changed = false;
946 // Function registration for exception handling
947 if (EnableEmEH) {
948 // Register __resumeException function
949 FunctionType *ResumeFTy =
950 FunctionType::get(IRB.getVoidTy(), IRB.getInt8PtrTy(), false);
951 ResumeF = getEmscriptenFunction(ResumeFTy, "__resumeException", &M);
952 ResumeF->addFnAttr(Attribute::NoReturn);
954 // Register llvm_eh_typeid_for function
955 FunctionType *EHTypeIDTy =
956 FunctionType::get(IRB.getInt32Ty(), IRB.getInt8PtrTy(), false);
957 EHTypeIDF = getEmscriptenFunction(EHTypeIDTy, "llvm_eh_typeid_for", &M);
  // Functions that contain calls to setjmp but don't have other longjmpable
  // calls within them.
962 SmallPtrSet<Function *, 4> SetjmpUsersToNullify;
964 if ((EnableEmSjLj || EnableWasmSjLj) && SetjmpF) {
965 // Precompute setjmp users
966 for (User *U : SetjmpF->users()) {
967 if (auto *CB = dyn_cast<CallBase>(U)) {
968 auto *UserF = CB->getFunction();
969 // If a function that calls setjmp does not contain any other calls that
970 // can longjmp, we don't need to do any transformation on that function,
971 // so can ignore it
972 if (containsLongjmpableCalls(UserF))
973 SetjmpUsers.insert(UserF);
974 else
975 SetjmpUsersToNullify.insert(UserF);
976 } else {
977 std::string S;
978 raw_string_ostream SS(S);
979 SS << *U;
980 report_fatal_error(Twine("Indirect use of setjmp is not supported: ") +
981 SS.str());
986 bool SetjmpUsed = SetjmpF && !SetjmpUsers.empty();
987 bool LongjmpUsed = LongjmpF && !LongjmpF->use_empty();
  DoSjLj = (EnableEmSjLj || EnableWasmSjLj) && (SetjmpUsed || LongjmpUsed);
990 // Function registration and data pre-gathering for setjmp/longjmp handling
991 if (DoSjLj) {
992 assert(EnableEmSjLj || EnableWasmSjLj);
993 if (EnableEmSjLj) {
994 // Register emscripten_longjmp function
995 FunctionType *FTy = FunctionType::get(
996 IRB.getVoidTy(), {getAddrIntType(&M), IRB.getInt32Ty()}, false);
997 EmLongjmpF = getEmscriptenFunction(FTy, "emscripten_longjmp", &M);
998 EmLongjmpF->addFnAttr(Attribute::NoReturn);
999 } else { // EnableWasmSjLj
1000 // Register __wasm_longjmp function, which calls __builtin_wasm_longjmp.
1001 FunctionType *FTy = FunctionType::get(
1002 IRB.getVoidTy(), {IRB.getInt8PtrTy(), IRB.getInt32Ty()}, false);
1003 WasmLongjmpF = getEmscriptenFunction(FTy, "__wasm_longjmp", &M);
1004 WasmLongjmpF->addFnAttr(Attribute::NoReturn);
1007 if (SetjmpF) {
1008 // Register saveSetjmp function
1009 FunctionType *SetjmpFTy = SetjmpF->getFunctionType();
1010 FunctionType *FTy =
1011 FunctionType::get(Type::getInt32PtrTy(C),
1012 {SetjmpFTy->getParamType(0), IRB.getInt32Ty(),
1013 Type::getInt32PtrTy(C), IRB.getInt32Ty()},
1014 false);
1015 SaveSetjmpF = getEmscriptenFunction(FTy, "saveSetjmp", &M);
1017 // Register testSetjmp function
1018 FTy = FunctionType::get(
1019 IRB.getInt32Ty(),
1020 {getAddrIntType(&M), Type::getInt32PtrTy(C), IRB.getInt32Ty()},
1021 false);
1022 TestSetjmpF = getEmscriptenFunction(FTy, "testSetjmp", &M);
1024 // wasm.catch() will be lowered down to wasm 'catch' instruction in
1025 // instruction selection.
1026 CatchF = Intrinsic::getDeclaration(&M, Intrinsic::wasm_catch);
1027 // Type for struct __WasmLongjmpArgs
1028 LongjmpArgsTy = StructType::get(IRB.getInt8PtrTy(), // env
1029 IRB.getInt32Ty() // val
1034 // Exception handling transformation
1035 if (EnableEmEH) {
1036 for (Function &F : M) {
1037 if (F.isDeclaration())
1038 continue;
1039 Changed |= runEHOnFunction(F);
1043 // Setjmp/longjmp handling transformation
1044 if (DoSjLj) {
1045 Changed = true; // We have setjmp or longjmp somewhere
1046 if (LongjmpF)
1047 replaceLongjmpWith(LongjmpF, EnableEmSjLj ? EmLongjmpF : WasmLongjmpF);
    // Only traverse functions that use setjmp in order not to insert
    // unnecessary prep / cleanup code in every function
1050 if (SetjmpF)
1051 for (Function *F : SetjmpUsers)
1052 runSjLjOnFunction(*F);
1055 // Replace unnecessary setjmp calls with 0
1056 if ((EnableEmSjLj || EnableWasmSjLj) && !SetjmpUsersToNullify.empty()) {
1057 Changed = true;
1058 assert(SetjmpF);
1059 for (Function *F : SetjmpUsersToNullify)
1060 nullifySetjmp(F);
1063 // Delete unused global variables and functions
1064 for (auto *V : {ThrewGV, ThrewValueGV})
1065 if (V && V->use_empty())
1066 V->eraseFromParent();
1067 for (auto *V : {GetTempRet0F, SetTempRet0F, ResumeF, EHTypeIDF, EmLongjmpF,
1068 SaveSetjmpF, TestSetjmpF, WasmLongjmpF, CatchF})
1069 if (V && V->use_empty())
1070 V->eraseFromParent();
1072 return Changed;
1075 bool WebAssemblyLowerEmscriptenEHSjLj::runEHOnFunction(Function &F) {
1076 Module &M = *F.getParent();
1077 LLVMContext &C = F.getContext();
1078 IRBuilder<> IRB(C);
1079 bool Changed = false;
1080 SmallVector<Instruction *, 64> ToErase;
1081 SmallPtrSet<LandingPadInst *, 32> LandingPads;
1083 // rethrow.longjmp BB that will be shared within the function.
1084 BasicBlock *RethrowLongjmpBB = nullptr;
1085 // PHI node for the loaded value of __THREW__ global variable in
1086 // rethrow.longjmp BB
1087 PHINode *RethrowLongjmpBBThrewPHI = nullptr;
1089 for (BasicBlock &BB : F) {
1090 auto *II = dyn_cast<InvokeInst>(BB.getTerminator());
1091 if (!II)
1092 continue;
1093 Changed = true;
1094 LandingPads.insert(II->getLandingPadInst());
1095 IRB.SetInsertPoint(II);
1097 const Value *Callee = II->getCalledOperand();
1098 bool NeedInvoke = supportsException(&F) && canThrow(Callee);
1099 if (NeedInvoke) {
1100 // Wrap invoke with invoke wrapper and generate preamble/postamble
1101 Value *Threw = wrapInvoke(II);
1102 ToErase.push_back(II);
      // If setjmp/longjmp handling is enabled, the thrown value may be a
      // longjmp rather than an exception. If the current function contains
      // calls to setjmp, it will be appropriately handled in
      // runSjLjOnFunction. But even if the function does not contain setjmp
      // calls, we shouldn't silently ignore longjmps; we should rethrow them
      // so they can be correctly handled somewhere up the call chain where
      // setjmp is. __THREW__'s value is 0 when nothing happened, 1 when an
      // exception is thrown, and other values when longjmp is thrown.
1113 // if (%__THREW__.val == 0 || %__THREW__.val == 1)
1114 // goto %tail
1115 // else
      //   goto %rethrow.longjmp
1118 // rethrow.longjmp: ;; This is longjmp. Rethrow it
1119 // %__threwValue.val = __threwValue
1120 // emscripten_longjmp(%__THREW__.val, %__threwValue.val);
1122 // tail: ;; Nothing happened or an exception is thrown
1123 // ... Continue exception handling ...
1124 if (DoSjLj && EnableEmSjLj && !SetjmpUsers.count(&F) &&
1125 canLongjmp(Callee)) {
        // Create rethrow.longjmp BB once and share it within the function
1127 if (!RethrowLongjmpBB) {
1128 RethrowLongjmpBB = BasicBlock::Create(C, "rethrow.longjmp", &F);
1129 IRB.SetInsertPoint(RethrowLongjmpBB);
1130 RethrowLongjmpBBThrewPHI =
1131 IRB.CreatePHI(getAddrIntType(&M), 4, "threw.phi");
1132 RethrowLongjmpBBThrewPHI->addIncoming(Threw, &BB);
1133 Value *ThrewValue = IRB.CreateLoad(IRB.getInt32Ty(), ThrewValueGV,
1134 ThrewValueGV->getName() + ".val");
1135 IRB.CreateCall(EmLongjmpF, {RethrowLongjmpBBThrewPHI, ThrewValue});
1136 IRB.CreateUnreachable();
1137 } else {
1138 RethrowLongjmpBBThrewPHI->addIncoming(Threw, &BB);
1141 IRB.SetInsertPoint(II); // Restore the insert point back
1142 BasicBlock *Tail = BasicBlock::Create(C, "tail", &F);
1143 Value *CmpEqOne =
1144 IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp.eq.one");
1145 Value *CmpEqZero =
1146 IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 0), "cmp.eq.zero");
1147 Value *Or = IRB.CreateOr(CmpEqZero, CmpEqOne, "or");
1148 IRB.CreateCondBr(Or, Tail, RethrowLongjmpBB);
1149 IRB.SetInsertPoint(Tail);
1150 BB.replaceSuccessorsPhiUsesWith(&BB, Tail);
1153 // Insert a branch based on __THREW__ variable
1154 Value *Cmp = IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp");
1155 IRB.CreateCondBr(Cmp, II->getUnwindDest(), II->getNormalDest());
1157 } else {
1158 // This can't throw, and we don't need this invoke, just replace it with a
1159 // call+branch
1160 changeToCall(II);
1164 // Process resume instructions
1165 for (BasicBlock &BB : F) {
1166 // Scan the body of the basic block for resumes
1167 for (Instruction &I : BB) {
1168 auto *RI = dyn_cast<ResumeInst>(&I);
1169 if (!RI)
1170 continue;
1171 Changed = true;
1173 // Split the input into legal values
1174 Value *Input = RI->getValue();
1175 IRB.SetInsertPoint(RI);
1176 Value *Low = IRB.CreateExtractValue(Input, 0, "low");
1177 // Create a call to __resumeException function
1178 IRB.CreateCall(ResumeF, {Low});
1179 // Add a terminator to the block
1180 IRB.CreateUnreachable();
1181 ToErase.push_back(RI);
1185 // Process llvm.eh.typeid.for intrinsics
1186 for (BasicBlock &BB : F) {
1187 for (Instruction &I : BB) {
1188 auto *CI = dyn_cast<CallInst>(&I);
1189 if (!CI)
1190 continue;
1191 const Function *Callee = CI->getCalledFunction();
1192 if (!Callee)
1193 continue;
1194 if (Callee->getIntrinsicID() != Intrinsic::eh_typeid_for)
1195 continue;
1196 Changed = true;
1198 IRB.SetInsertPoint(CI);
1199 CallInst *NewCI =
1200 IRB.CreateCall(EHTypeIDF, CI->getArgOperand(0), "typeid");
1201 CI->replaceAllUsesWith(NewCI);
1202 ToErase.push_back(CI);
  // Look for orphan landingpads, which can occur in blocks with no predecessors
1207 for (BasicBlock &BB : F) {
1208 Instruction *I = BB.getFirstNonPHI();
1209 if (auto *LPI = dyn_cast<LandingPadInst>(I))
1210 LandingPads.insert(LPI);
1212 Changed |= !LandingPads.empty();
  // Handle all the landingpads for this function together, as multiple invokes
  // may share a single lp
1216 for (LandingPadInst *LPI : LandingPads) {
1217 IRB.SetInsertPoint(LPI);
1218 SmallVector<Value *, 16> FMCArgs;
1219 for (unsigned I = 0, E = LPI->getNumClauses(); I < E; ++I) {
1220 Constant *Clause = LPI->getClause(I);
1221 // TODO Handle filters (= exception specifications).
1222 // https://bugs.llvm.org/show_bug.cgi?id=50396
1223 if (LPI->isCatch(I))
1224 FMCArgs.push_back(Clause);
1227 // Create a call to __cxa_find_matching_catch_N function
1228 Function *FMCF = getFindMatchingCatch(M, FMCArgs.size());
1229 CallInst *FMCI = IRB.CreateCall(FMCF, FMCArgs, "fmc");
1230 Value *Undef = UndefValue::get(LPI->getType());
1231 Value *Pair0 = IRB.CreateInsertValue(Undef, FMCI, 0, "pair0");
1232 Value *TempRet0 = IRB.CreateCall(GetTempRet0F, None, "tempret0");
1233 Value *Pair1 = IRB.CreateInsertValue(Pair0, TempRet0, 1, "pair1");
1235 LPI->replaceAllUsesWith(Pair1);
1236 ToErase.push_back(LPI);
1239 // Erase everything we no longer need in this function
1240 for (Instruction *I : ToErase)
1241 I->eraseFromParent();
  return Changed;
}
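
// Illustration only, not part of the pass: the three-way meaning of the
// __THREW__ value described in runEHOnFunction (0: nothing happened,
// 1: exception, anything else: longjmp) can be written as a small standalone
// classifier. ThrewKind, classifyThrew and continuesToTail are hypothetical
// names used only in this sketch.

```cpp
#include <cassert>
#include <cstdint>

enum class ThrewKind { Nothing, Exception, Longjmp };

// Interprets a loaded __THREW__ value the way the comments above describe.
inline ThrewKind classifyThrew(uintptr_t Threw) {
  if (Threw == 0)
    return ThrewKind::Nothing;
  if (Threw == 1)
    return ThrewKind::Exception;
  return ThrewKind::Longjmp;
}

// Mirrors the "or" of cmp.eq.zero and cmp.eq.one: true means the emitted
// branch goes to the tail block, false means it goes to rethrow.longjmp.
inline bool continuesToTail(uintptr_t Threw) {
  return classifyThrew(Threw) != ThrewKind::Longjmp;
}
```

// Only the Longjmp case leaves the normal exception-handling path; the other
// two continue to the branch on "__THREW__ == 1" that selects the unwind or
// normal destination.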
// This tries to get debug info from the instruction before which a new
// instruction will be inserted, and if there's no debug info in that
// instruction, tries to get the info instead from the previous instruction (if
// any). If none of these has debug info and a DISubprogram is provided, it
// creates a dummy debug info with the first line of the function, because the
// IR verifier requires all inlinable call sites to have debug info when both
// the caller and the callee have a DISubprogram. If none of these conditions
// are met, returns empty info.
static DebugLoc getOrCreateDebugLoc(const Instruction *InsertBefore,
                                    DISubprogram *SP) {
  assert(InsertBefore);
  if (InsertBefore->getDebugLoc())
    return InsertBefore->getDebugLoc();
  const Instruction *Prev = InsertBefore->getPrevNode();
  if (Prev && Prev->getDebugLoc())
    return Prev->getDebugLoc();
  if (SP)
    return DILocation::get(SP->getContext(), SP->getLine(), 1, SP);
  return DebugLoc();
}
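
// Illustration only, not part of the pass: the fallback order used by
// getOrCreateDebugLoc can be modeled without LLVM types. Loc, makeLoc and
// pickLoc are hypothetical stand-ins; in particular, treating line 0 as
// "no location" is an assumption of this sketch, not a statement about
// DebugLoc.

```cpp
#include <cassert>

// Hypothetical stand-in for DebugLoc: Line == 0 means "no location" here.
struct Loc {
  unsigned Line = 0;
  unsigned Col = 0;
  bool valid() const { return Line != 0; }
};

inline Loc makeLoc(unsigned Line, unsigned Col) {
  Loc L;
  L.Line = Line;
  L.Col = Col;
  return L;
}

// SubprogramLine == 0 models "no DISubprogram was provided".
inline Loc pickLoc(Loc InsertBefore, Loc Prev, unsigned SubprogramLine) {
  if (InsertBefore.valid())
    return InsertBefore; // 1. location of the insertion point
  if (Prev.valid())
    return Prev; // 2. location of the previous instruction
  if (SubprogramLine != 0)
    return makeLoc(SubprogramLine, 1); // 3. dummy: function's line, column 1
  return Loc(); // 4. empty location
}
```

// Each step is tried only when the previous one yields no location, matching
// the order of the early returns in the function above.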
1267 bool WebAssemblyLowerEmscriptenEHSjLj::runSjLjOnFunction(Function &F) {
1268 assert(EnableEmSjLj || EnableWasmSjLj);
1269 Module &M = *F.getParent();
1270 LLVMContext &C = F.getContext();
1271 IRBuilder<> IRB(C);
1272 SmallVector<Instruction *, 64> ToErase;
1273 // Vector of %setjmpTable values
1274 SmallVector<Instruction *, 4> SetjmpTableInsts;
1275 // Vector of %setjmpTableSize values
1276 SmallVector<Instruction *, 4> SetjmpTableSizeInsts;
1278 // Setjmp preparation
1280 // This instruction effectively means %setjmpTableSize = 4.
1281 // We create this as an instruction intentionally, and we don't want to fold
1282 // this instruction to a constant 4, because this value will be used in
1283 // SSAUpdater.AddAvailableValue(...) later.
1284 BasicBlock *Entry = &F.getEntryBlock();
1285 DebugLoc FirstDL = getOrCreateDebugLoc(&*Entry->begin(), F.getSubprogram());
1286 SplitBlock(Entry, &*Entry->getFirstInsertionPt());
1288 BinaryOperator *SetjmpTableSize =
1289 BinaryOperator::Create(Instruction::Add, IRB.getInt32(4), IRB.getInt32(0),
1290 "setjmpTableSize", Entry->getTerminator());
1291 SetjmpTableSize->setDebugLoc(FirstDL);
1292 // setjmpTable = (int *) malloc(40);
1293 Instruction *SetjmpTable = CallInst::CreateMalloc(
1294 SetjmpTableSize, IRB.getInt32Ty(), IRB.getInt32Ty(), IRB.getInt32(40),
1295 nullptr, nullptr, "setjmpTable");
1296 SetjmpTable->setDebugLoc(FirstDL);
1297 // CallInst::CreateMalloc may return a bitcast instruction if the result types
1298 // mismatch. We need to set the debug loc for the original call too.
1299 auto *MallocCall = SetjmpTable->stripPointerCasts();
1300 if (auto *MallocCallI = dyn_cast<Instruction>(MallocCall)) {
1301 MallocCallI->setDebugLoc(FirstDL);
1303 // setjmpTable[0] = 0;
1304 IRB.SetInsertPoint(SetjmpTableSize);
1305 IRB.CreateStore(IRB.getInt32(0), SetjmpTable);
1306 SetjmpTableInsts.push_back(SetjmpTable);
1307 SetjmpTableSizeInsts.push_back(SetjmpTableSize);
1309 // Setjmp transformation
1310 SmallVector<PHINode *, 4> SetjmpRetPHIs;
1311 Function *SetjmpF = M.getFunction("setjmp");
1312 for (auto *U : make_early_inc_range(SetjmpF->users())) {
1313 auto *CB = cast<CallBase>(U);
1314 BasicBlock *BB = CB->getParent();
1315 if (BB->getParent() != &F) // in other function
1316 continue;
1317 if (CB->getOperandBundle(LLVMContext::OB_funclet))
1318 report_fatal_error(
1319 "setjmp within a catch clause is not supported in Wasm EH");
1321 CallInst *CI = nullptr;
1322 // setjmp cannot throw. So if it is an invoke, lower it to a call
1323 if (auto *II = dyn_cast<InvokeInst>(CB))
1324 CI = llvm::changeToCall(II);
1325 else
1326 CI = cast<CallInst>(CB);
1328 // The tail is everything right after the call, and will be reached once
1329 // when setjmp is called, and later when longjmp returns to the setjmp
1330 BasicBlock *Tail = SplitBlock(BB, CI->getNextNode());
1331 // Add a phi to the tail, which will be the output of setjmp, which
1332 // indicates if this is the first call or a longjmp back. The phi directly
1333 // uses the right value based on where we arrive from
1334 IRB.SetInsertPoint(Tail->getFirstNonPHI());
1335 PHINode *SetjmpRet = IRB.CreatePHI(IRB.getInt32Ty(), 2, "setjmp.ret");
1337 // setjmp initial call returns 0
1338 SetjmpRet->addIncoming(IRB.getInt32(0), BB);
1339 // The proper output is now this, not the setjmp call itself
1340 CI->replaceAllUsesWith(SetjmpRet);
1341 // longjmp returns to the setjmp will add themselves to this phi
1342 SetjmpRetPHIs.push_back(SetjmpRet);
1344 // Fix call target
1345 // Our index in the function is our place in the array + 1 to avoid index
1346 // 0, because index 0 means the longjmp is not ours to handle.
1347 IRB.SetInsertPoint(CI);
1348 Value *Args[] = {CI->getArgOperand(0), IRB.getInt32(SetjmpRetPHIs.size()),
1349 SetjmpTable, SetjmpTableSize};
1350 Instruction *NewSetjmpTable =
1351 IRB.CreateCall(SaveSetjmpF, Args, "setjmpTable");
1352 Instruction *NewSetjmpTableSize =
1353 IRB.CreateCall(GetTempRet0F, None, "setjmpTableSize");
1354 SetjmpTableInsts.push_back(NewSetjmpTable);
1355 SetjmpTableSizeInsts.push_back(NewSetjmpTableSize);
1356 ToErase.push_back(CI);
1359 // Handle longjmpable calls.
1360 if (EnableEmSjLj)
1361 handleLongjmpableCallsForEmscriptenSjLj(
1362 F, SetjmpTableInsts, SetjmpTableSizeInsts, SetjmpRetPHIs);
1363 else // EnableWasmSjLj
1364 handleLongjmpableCallsForWasmSjLj(F, SetjmpTableInsts, SetjmpTableSizeInsts,
1365 SetjmpRetPHIs);
1367 // Erase everything we no longer need in this function
1368 for (Instruction *I : ToErase)
1369 I->eraseFromParent();
1371 // Free setjmpTable buffer before each return instruction + function-exiting
1372 // call
1373 SmallVector<Instruction *, 16> ExitingInsts;
1374 for (BasicBlock &BB : F) {
1375 Instruction *TI = BB.getTerminator();
1376 if (isa<ReturnInst>(TI))
1377 ExitingInsts.push_back(TI);
    // Any 'call' instruction with 'noreturn' attribute exits the function at
    // this point. If this throws but unwinds to another EH pad within this
    // function instead of exiting, this would have been an 'invoke', which
    // happens if we use Wasm EH or Wasm SjLj.
1382 for (auto &I : BB) {
1383 if (auto *CI = dyn_cast<CallInst>(&I)) {
1384 bool IsNoReturn = CI->hasFnAttr(Attribute::NoReturn);
1385 if (Function *CalleeF = CI->getCalledFunction())
1386 IsNoReturn |= CalleeF->hasFnAttribute(Attribute::NoReturn);
1387 if (IsNoReturn)
1388 ExitingInsts.push_back(&I);
1392 for (auto *I : ExitingInsts) {
1393 DebugLoc DL = getOrCreateDebugLoc(I, F.getSubprogram());
    // If this exiting instruction is a call within a catchpad, we should add
    // it as "funclet" to the operand bundle of 'free' call
1396 SmallVector<OperandBundleDef, 1> Bundles;
1397 if (auto *CB = dyn_cast<CallBase>(I))
1398 if (auto Bundle = CB->getOperandBundle(LLVMContext::OB_funclet))
1399 Bundles.push_back(OperandBundleDef(*Bundle));
1400 auto *Free = CallInst::CreateFree(SetjmpTable, Bundles, I);
1401 Free->setDebugLoc(DL);
1402 // CallInst::CreateFree may create a bitcast instruction if its argument
1403 // types mismatch. We need to set the debug loc for the bitcast too.
1404 if (auto *FreeCallI = dyn_cast<CallInst>(Free)) {
1405 if (auto *BitCastI = dyn_cast<BitCastInst>(FreeCallI->getArgOperand(0)))
1406 BitCastI->setDebugLoc(DL);
1410 // Every call to saveSetjmp can change setjmpTable and setjmpTableSize
1411 // (when buffer reallocation occurs)
1412 // entry:
1413 // setjmpTableSize = 4;
1414 // setjmpTable = (int *) malloc(40);
1415 // setjmpTable[0] = 0;
1416 // ...
1417 // somebb:
1418 // setjmpTable = saveSetjmp(env, label, setjmpTable, setjmpTableSize);
1419 // setjmpTableSize = getTempRet0();
  // So we need to make sure the SSA for these variables is valid so that all
  // saveSetjmp and testSetjmp calls have the correct arguments.
1422 SSAUpdater SetjmpTableSSA;
1423 SSAUpdater SetjmpTableSizeSSA;
1424 SetjmpTableSSA.Initialize(Type::getInt32PtrTy(C), "setjmpTable");
1425 SetjmpTableSizeSSA.Initialize(Type::getInt32Ty(C), "setjmpTableSize");
1426 for (Instruction *I : SetjmpTableInsts)
1427 SetjmpTableSSA.AddAvailableValue(I->getParent(), I);
1428 for (Instruction *I : SetjmpTableSizeInsts)
1429 SetjmpTableSizeSSA.AddAvailableValue(I->getParent(), I);
1431 for (auto &U : make_early_inc_range(SetjmpTable->uses()))
1432 if (auto *I = dyn_cast<Instruction>(U.getUser()))
1433 if (I->getParent() != Entry)
1434 SetjmpTableSSA.RewriteUse(U);
1435 for (auto &U : make_early_inc_range(SetjmpTableSize->uses()))
1436 if (auto *I = dyn_cast<Instruction>(U.getUser()))
1437 if (I->getParent() != Entry)
1438 SetjmpTableSizeSSA.RewriteUse(U);
  // Finally, our modifications to the CFG can break dominance of SSA variables.
  // For example, in this code,
  //   if (x()) { .. setjmp() .. }
  //   if (y()) { .. longjmp() .. }
  // We must split the longjmp block, and it can jump into the block split
  // from the setjmp one. But that means that when we split the setjmp block,
  // its first part no longer dominates its second part - there is a
  // theoretically possible control flow path where x() is false, then y() is
  // true and we reach the second part of the setjmp block, without ever
  // reaching the first part. So, we rebuild SSA form here.
1450 rebuildSSA(F);
  return true;
}
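
// Illustration only, not part of the pass: the comment in runSjLjOnFunction
// notes that every saveSetjmp call can change both setjmpTable and
// setjmpTableSize. The model below shows why callers must pick up both values
// after each call. SetjmpTableModel is a hypothetical name, and the entry
// layout and the doubling growth policy are assumptions of this sketch, not a
// description of the Emscripten runtime.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model; the names and the doubling policy are assumptions.
struct SetjmpTableModel {
  struct Entry {
    uint32_t Id;    // id the modeled saveSetjmp associates with a jmp_buf
    uint32_t Label; // 1-based index of the setjmp call site in the function
  };
  std::vector<Entry> Entries; // stands in for the malloc'ed setjmpTable
  uint32_t Size = 4;          // stands in for setjmpTableSize
  uint32_t NextId = 0;

  // Models saveSetjmp: records the pair and grows the table when it is full,
  // which is why callers must re-read the table and its size afterwards.
  uint32_t save(uint32_t Label) {
    if (Entries.size() == Size)
      Size *= 2;
    Entries.push_back({++NextId, Label});
    return NextId;
  }

  // Models testSetjmp: returns the label for Id, or 0 if it is not ours.
  uint32_t test(uint32_t Id) const {
    for (const Entry &E : Entries)
      if (E.Id == Id)
        return E.Label;
    return 0;
  }
};
```

// In the pass itself, the "possibly new table, possibly new size" pair is
// what SetjmpTableSSA and SetjmpTableSizeSSA thread through the function.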
1454 // Update each call that can longjmp so it can return to the corresponding
1455 // setjmp. Refer to 4) of "Emscripten setjmp/longjmp handling" section in the
1456 // comments at top of the file for details.
1457 void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForEmscriptenSjLj(
1458 Function &F, InstVector &SetjmpTableInsts, InstVector &SetjmpTableSizeInsts,
1459 SmallVectorImpl<PHINode *> &SetjmpRetPHIs) {
1460 Module &M = *F.getParent();
1461 LLVMContext &C = F.getContext();
1462 IRBuilder<> IRB(C);
1463 SmallVector<Instruction *, 64> ToErase;
  // We need to pass setjmpTable and setjmpTableSize to testSetjmp function.
  // These values are defined in the beginning of the function and also in each
  // setjmp callsite, but we don't know which values we should use at this
  // point. So here we arbitrarily use the ones defined in the beginning of the
  // function, and SSAUpdater will later update them to the correct values.
1470 Instruction *SetjmpTable = *SetjmpTableInsts.begin();
1471 Instruction *SetjmpTableSize = *SetjmpTableSizeInsts.begin();
1473 // call.em.longjmp BB that will be shared within the function.
1474 BasicBlock *CallEmLongjmpBB = nullptr;
1475 // PHI node for the loaded value of __THREW__ global variable in
1476 // call.em.longjmp BB
1477 PHINode *CallEmLongjmpBBThrewPHI = nullptr;
1478 // PHI node for the loaded value of __threwValue global variable in
1479 // call.em.longjmp BB
1480 PHINode *CallEmLongjmpBBThrewValuePHI = nullptr;
1481 // rethrow.exn BB that will be shared within the function.
1482 BasicBlock *RethrowExnBB = nullptr;
1484 // Because we are creating new BBs while processing and don't want to make
1485 // all these newly created BBs candidates again for longjmp processing, we
1486 // first make the vector of candidate BBs.
1487 std::vector<BasicBlock *> BBs;
1488 for (BasicBlock &BB : F)
1489 BBs.push_back(&BB);
1491 // BBs.size() will change within the loop, so we query it every time
1492 for (unsigned I = 0; I < BBs.size(); I++) {
1493 BasicBlock *BB = BBs[I];
1494 for (Instruction &I : *BB) {
1495 if (isa<InvokeInst>(&I))
1496 report_fatal_error("When using Wasm EH with Emscripten SjLj, there is "
1497 "a restriction that `setjmp` function call and "
1498 "exception cannot be used within the same function");
1499 auto *CI = dyn_cast<CallInst>(&I);
1500 if (!CI)
1501 continue;
1503 const Value *Callee = CI->getCalledOperand();
1504 if (!canLongjmp(Callee))
1505 continue;
1506 if (isEmAsmCall(Callee))
1507 report_fatal_error("Cannot use EM_ASM* alongside setjmp/longjmp in " +
1508 F.getName() +
                               ". Please consider using EM_JS, or move the "
                               "EM_ASM into another function.",
                           false);

      Value *Threw = nullptr;
      BasicBlock *Tail;
      if (Callee->getName().startswith("__invoke_")) {
        // If an invoke wrapper has already been generated for this call in the
        // previous EH phase, search for the load instruction
        // %__THREW__.val = __THREW__;
        // in the postamble after the invoke wrapper call
        LoadInst *ThrewLI = nullptr;
        StoreInst *ThrewResetSI = nullptr;
        for (auto I = std::next(BasicBlock::iterator(CI)), IE = BB->end();
             I != IE; ++I) {
          if (auto *LI = dyn_cast<LoadInst>(I))
            if (auto *GV = dyn_cast<GlobalVariable>(LI->getPointerOperand()))
              if (GV == ThrewGV) {
                Threw = ThrewLI = LI;
                break;
              }
        }
        // Search for the store instruction after the load above
        // __THREW__ = 0;
        for (auto I = std::next(BasicBlock::iterator(ThrewLI)), IE = BB->end();
             I != IE; ++I) {
          if (auto *SI = dyn_cast<StoreInst>(I)) {
            if (auto *GV = dyn_cast<GlobalVariable>(SI->getPointerOperand())) {
              if (GV == ThrewGV &&
                  SI->getValueOperand() == getAddrSizeInt(&M, 0)) {
                ThrewResetSI = SI;
                break;
              }
            }
          }
        }
        assert(Threw && ThrewLI && "Cannot find __THREW__ load after invoke");
        assert(ThrewResetSI && "Cannot find __THREW__ store after invoke");
        Tail = SplitBlock(BB, ThrewResetSI->getNextNode());

      } else {
        // Wrap call with invoke wrapper and generate preamble/postamble
        Threw = wrapInvoke(CI);
        ToErase.push_back(CI);
        Tail = SplitBlock(BB, CI->getNextNode());

        // If exception handling is enabled, the thrown value may be an
        // exception rather than a longjmp, in which case we shouldn't silently
        // ignore it; we should rethrow it.
        // __THREW__'s value is 0 when nothing happened, 1 when an exception is
        // thrown, other values when longjmp is thrown.
        //
        // if (%__THREW__.val == 1)
        //   goto %rethrow.exn
        // else
        //   goto %normal
        //
        // rethrow.exn: ;; Rethrow exception
        //   %exn = call @__cxa_find_matching_catch_2() ;; Retrieve thrown ptr
        //   __resumeException(%exn)
        //
        // normal:
        //   <-- Insertion point. Will insert sjlj handling code from here
        //   goto %tail
        //
        // tail:
        //   ...
        if (supportsException(&F) && canThrow(Callee)) {
          // We will add a new conditional branch. So remove the branch created
          // when we split the BB
          ToErase.push_back(BB->getTerminator());

          // Generate rethrow.exn BB once and share it within the function
          if (!RethrowExnBB) {
            RethrowExnBB = BasicBlock::Create(C, "rethrow.exn", &F);
            IRB.SetInsertPoint(RethrowExnBB);
            CallInst *Exn =
                IRB.CreateCall(getFindMatchingCatch(M, 0), {}, "exn");
            IRB.CreateCall(ResumeF, {Exn});
            IRB.CreateUnreachable();
          }

          IRB.SetInsertPoint(CI);
          BasicBlock *NormalBB = BasicBlock::Create(C, "normal", &F);
          Value *CmpEqOne =
              IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp.eq.one");
          IRB.CreateCondBr(CmpEqOne, RethrowExnBB, NormalBB);

          IRB.SetInsertPoint(NormalBB);
          IRB.CreateBr(Tail);
          BB = NormalBB; // New insertion point to insert testSetjmp()
        }
      }

      // We need to replace BB's terminator: SplitBlock makes BB go straight to
      // Tail, but we need to check whether a longjmp occurred and, if so, go
      // to the right setjmp-tail
      ToErase.push_back(BB->getTerminator());

      // Generate a function call to testSetjmp function and preamble/postamble
      // code to figure out (1) whether longjmp occurred (2) if longjmp
      // occurred, which setjmp it corresponds to
      Value *Label = nullptr;
      Value *LongjmpResult = nullptr;
      BasicBlock *EndBB = nullptr;
      wrapTestSetjmp(BB, CI->getDebugLoc(), Threw, SetjmpTable, SetjmpTableSize,
                     Label, LongjmpResult, CallEmLongjmpBB,
                     CallEmLongjmpBBThrewPHI, CallEmLongjmpBBThrewValuePHI,
                     EndBB);
      assert(Label && LongjmpResult && EndBB);

      // Create switch instruction
      IRB.SetInsertPoint(EndBB);
      IRB.SetCurrentDebugLocation(EndBB->getInstList().back().getDebugLoc());
      SwitchInst *SI = IRB.CreateSwitch(Label, Tail, SetjmpRetPHIs.size());
      // -1 means no longjmp happened, continue normally (will hit the default
      // switch case). 0 means a longjmp that is not ours to handle, needs a
      // rethrow. Otherwise the label is the index into SetjmpRetPHIs plus 1
      // (to avoid 0).
      for (unsigned I = 0; I < SetjmpRetPHIs.size(); I++) {
        SI->addCase(IRB.getInt32(I + 1), SetjmpRetPHIs[I]->getParent());
        SetjmpRetPHIs[I]->addIncoming(LongjmpResult, EndBB);
      }

      // We are splitting the block here, and must continue to find other calls
      // in the block, which is now split. So continue to traverse in the Tail.
      BBs.push_back(Tail);
    }
  }

  for (Instruction *I : ToErase)
    I->eraseFromParent();
}

static BasicBlock *getCleanupRetUnwindDest(const CleanupPadInst *CPI) {
  for (const User *U : CPI->users())
    if (const auto *CRI = dyn_cast<CleanupReturnInst>(U))
      return CRI->getUnwindDest();
  return nullptr;
}

// Create a catchpad in which we catch a longjmp's env and val arguments, test
// if the longjmp corresponds to one of the setjmps in the current function, and
// if so, jump to the setjmp dispatch BB from which we go to one of the
// post-setjmp BBs. Refer to 4) of the "Wasm setjmp/longjmp handling" section in
// the comments at the top of the file for details.
void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForWasmSjLj(
    Function &F, InstVector &SetjmpTableInsts, InstVector &SetjmpTableSizeInsts,
    SmallVectorImpl<PHINode *> &SetjmpRetPHIs) {
  Module &M = *F.getParent();
  LLVMContext &C = F.getContext();
  IRBuilder<> IRB(C);

  // A function with catchswitch/catchpad instruction should have a personality
  // function attached to it. Search for the wasm personality function, and if
  // it exists, use it, and if it doesn't, create a dummy personality function.
  // (SjLj is not going to call it anyway.)
  if (!F.hasPersonalityFn()) {
    StringRef PersName = getEHPersonalityName(EHPersonality::Wasm_CXX);
    FunctionType *PersType =
        FunctionType::get(IRB.getInt32Ty(), /* isVarArg */ true);
    Value *PersF = M.getOrInsertFunction(PersName, PersType).getCallee();
    F.setPersonalityFn(
        cast<Constant>(IRB.CreateBitCast(PersF, IRB.getInt8PtrTy())));
  }

  // Use the entry BB's debugloc as a fallback
  BasicBlock *Entry = &F.getEntryBlock();
  DebugLoc FirstDL = getOrCreateDebugLoc(&*Entry->begin(), F.getSubprogram());
  IRB.SetCurrentDebugLocation(FirstDL);

  // Arbitrarily use the ones defined in the beginning of the function.
  // SSAUpdater will later update them to the correct values.
  Instruction *SetjmpTable = *SetjmpTableInsts.begin();
  Instruction *SetjmpTableSize = *SetjmpTableSizeInsts.begin();

  // Add setjmp.dispatch BB right after the entry block. Because we have
  // initialized setjmpTable/setjmpTableSize in the entry block and split the
  // rest into another BB, here 'OrigEntry' is the function's original entry
  // block before the transformation.
  //
  // entry:
  //   setjmpTable / setjmpTableSize initialization
  // setjmp.dispatch:
  //   switch will be inserted here later
  // entry.split: (OrigEntry)
  //   the original function starts here
  BasicBlock *OrigEntry = Entry->getNextNode();
  BasicBlock *SetjmpDispatchBB =
      BasicBlock::Create(C, "setjmp.dispatch", &F, OrigEntry);
  cast<BranchInst>(Entry->getTerminator())->setSuccessor(0, SetjmpDispatchBB);

  // Create catch.dispatch.longjmp BB and a catchswitch instruction
  BasicBlock *CatchDispatchLongjmpBB =
      BasicBlock::Create(C, "catch.dispatch.longjmp", &F);
  IRB.SetInsertPoint(CatchDispatchLongjmpBB);
  CatchSwitchInst *CatchSwitchLongjmp =
      IRB.CreateCatchSwitch(ConstantTokenNone::get(C), nullptr, 1);

  // Create catch.longjmp BB and a catchpad instruction
  BasicBlock *CatchLongjmpBB = BasicBlock::Create(C, "catch.longjmp", &F);
  CatchSwitchLongjmp->addHandler(CatchLongjmpBB);
  IRB.SetInsertPoint(CatchLongjmpBB);
  CatchPadInst *CatchPad = IRB.CreateCatchPad(CatchSwitchLongjmp, {});

  // Wasm throw and catch instructions can throw and catch multiple values, but
  // that requires multivalue support in the toolchain, which is currently not
  // very reliable. We instead throw and catch a pointer to a struct value of
  // type 'struct __WasmLongjmpArgs', which is defined in Emscripten.
  Instruction *CatchCI =
      IRB.CreateCall(CatchF, {IRB.getInt32(WebAssembly::C_LONGJMP)}, "thrown");
  Value *LongjmpArgs =
      IRB.CreateBitCast(CatchCI, LongjmpArgsTy->getPointerTo(), "longjmp.args");
  Value *EnvField =
      IRB.CreateConstGEP2_32(LongjmpArgsTy, LongjmpArgs, 0, 0, "env_gep");
  Value *ValField =
      IRB.CreateConstGEP2_32(LongjmpArgsTy, LongjmpArgs, 0, 1, "val_gep");
  // void *env = __wasm_longjmp_args.env;
  Instruction *Env = IRB.CreateLoad(IRB.getInt8PtrTy(), EnvField, "env");
  // int val = __wasm_longjmp_args.val;
  Instruction *Val = IRB.CreateLoad(IRB.getInt32Ty(), ValField, "val");

  // %label = testSetjmp(mem[%env], setjmpTable, setjmpTableSize);
  // if (%label == 0)
  //   __wasm_longjmp(%env, %val)
  // catchret to %setjmp.dispatch
  BasicBlock *ThenBB = BasicBlock::Create(C, "if.then", &F);
  BasicBlock *EndBB = BasicBlock::Create(C, "if.end", &F);
  Value *EnvP = IRB.CreateBitCast(Env, getAddrPtrType(&M), "env.p");
  Value *SetjmpID = IRB.CreateLoad(getAddrIntType(&M), EnvP, "setjmp.id");
  Value *Label =
      IRB.CreateCall(TestSetjmpF, {SetjmpID, SetjmpTable, SetjmpTableSize},
                     OperandBundleDef("funclet", CatchPad), "label");
  Value *Cmp = IRB.CreateICmpEQ(Label, IRB.getInt32(0));
  IRB.CreateCondBr(Cmp, ThenBB, EndBB);

  IRB.SetInsertPoint(ThenBB);
  CallInst *WasmLongjmpCI = IRB.CreateCall(
      WasmLongjmpF, {Env, Val}, OperandBundleDef("funclet", CatchPad));
  IRB.CreateUnreachable();

  IRB.SetInsertPoint(EndBB);
  // Jump to setjmp.dispatch block
  IRB.CreateCatchRet(CatchPad, SetjmpDispatchBB);

  // Go back to setjmp.dispatch BB
  // setjmp.dispatch:
  //   switch %label {
  //     label 1: goto post-setjmp BB 1
  //     label 2: goto post-setjmp BB 2
  //     ...
  //     default: goto the split-off next BB
  //   }
  IRB.SetInsertPoint(SetjmpDispatchBB);
  PHINode *LabelPHI = IRB.CreatePHI(IRB.getInt32Ty(), 2, "label.phi");
  LabelPHI->addIncoming(Label, EndBB);
  LabelPHI->addIncoming(IRB.getInt32(-1), Entry);
  SwitchInst *SI = IRB.CreateSwitch(LabelPHI, OrigEntry, SetjmpRetPHIs.size());
  // -1 means no longjmp happened, continue normally (will hit the default
  // switch case). 0 means a longjmp that is not ours to handle, needs a
  // rethrow. Otherwise the label is the index into SetjmpRetPHIs plus 1 (to
  // avoid 0).
  for (unsigned I = 0; I < SetjmpRetPHIs.size(); I++) {
    SI->addCase(IRB.getInt32(I + 1), SetjmpRetPHIs[I]->getParent());
    SetjmpRetPHIs[I]->addIncoming(Val, SetjmpDispatchBB);
  }

  // Convert all longjmpable call instructions to invokes that unwind to the
  // newly created catch.dispatch.longjmp BB.
  SmallVector<CallInst *, 64> LongjmpableCalls;
  for (auto *BB = &*F.begin(); BB; BB = BB->getNextNode()) {
    for (auto &I : *BB) {
      auto *CI = dyn_cast<CallInst>(&I);
      if (!CI)
        continue;
      const Value *Callee = CI->getCalledOperand();
      if (!canLongjmp(Callee))
        continue;
      if (isEmAsmCall(Callee))
        report_fatal_error("Cannot use EM_ASM* alongside setjmp/longjmp in " +
                               F.getName() +
                               ". Please consider using EM_JS, or move the "
                               "EM_ASM into another function.",
                           false);
      // This is the __wasm_longjmp() call we inserted in this function, which
      // rethrows the longjmp when the longjmp does not correspond to one of
      // the setjmps in this function. We should not convert this call to an
      // invoke.
      if (CI == WasmLongjmpCI)
        continue;
      LongjmpableCalls.push_back(CI);
    }
  }

  for (auto *CI : LongjmpableCalls) {
    // Even if the callee function has attribute 'nounwind', which is true for
    // all C functions, it can longjmp, which means it can throw a Wasm
    // exception now.
    CI->removeFnAttr(Attribute::NoUnwind);
    if (Function *CalleeF = CI->getCalledFunction())
      CalleeF->removeFnAttr(Attribute::NoUnwind);

    // Change it to an invoke and make it unwind to the catch.dispatch.longjmp
    // BB. If the call is enclosed in another catchpad/cleanuppad scope, unwind
    // to its parent pad's unwind destination instead to preserve the scope
    // structure. It will eventually unwind to the catch.dispatch.longjmp.
    SmallVector<OperandBundleDef, 1> Bundles;
    BasicBlock *UnwindDest = nullptr;
    if (auto Bundle = CI->getOperandBundle(LLVMContext::OB_funclet)) {
      Instruction *FromPad = cast<Instruction>(Bundle->Inputs[0]);
      while (!UnwindDest) {
        if (auto *CPI = dyn_cast<CatchPadInst>(FromPad)) {
          UnwindDest = CPI->getCatchSwitch()->getUnwindDest();
          break;
        }
        if (auto *CPI = dyn_cast<CleanupPadInst>(FromPad)) {
          // getCleanupRetUnwindDest() can return nullptr when
          // 1. This cleanuppad's matching cleanupret unwinds to the caller
          // 2. There is no matching cleanupret because it ends with
          //    unreachable.
          // In case of 2, we need to traverse the parent pad chain.
          UnwindDest = getCleanupRetUnwindDest(CPI);
          Value *ParentPad = CPI->getParentPad();
          if (isa<ConstantTokenNone>(ParentPad))
            break;
          FromPad = cast<Instruction>(ParentPad);
        }
      }
    }
    if (!UnwindDest)
      UnwindDest = CatchDispatchLongjmpBB;
    changeToInvokeAndSplitBasicBlock(CI, UnwindDest);
  }

  // Redirect every other catchswitch and cleanupret that unwinds to the caller
  // so that it unwinds to the catch.dispatch.longjmp BB instead.
  SmallVector<Instruction *, 16> ToErase;
  for (auto &BB : F) {
    if (auto *CSI = dyn_cast<CatchSwitchInst>(BB.getFirstNonPHI())) {
      if (CSI != CatchSwitchLongjmp && CSI->unwindsToCaller()) {
        IRB.SetInsertPoint(CSI);
        ToErase.push_back(CSI);
        auto *NewCSI = IRB.CreateCatchSwitch(CSI->getParentPad(),
                                             CatchDispatchLongjmpBB, 1);
        NewCSI->addHandler(*CSI->handler_begin());
        NewCSI->takeName(CSI);
        CSI->replaceAllUsesWith(NewCSI);
      }
    }

    if (auto *CRI = dyn_cast<CleanupReturnInst>(BB.getTerminator())) {
      if (CRI->unwindsToCaller()) {
        IRB.SetInsertPoint(CRI);
        ToErase.push_back(CRI);
        IRB.CreateCleanupRet(CRI->getCleanupPad(), CatchDispatchLongjmpBB);
      }
    }
  }

  for (Instruction *I : ToErase)
    I->eraseFromParent();
}