4 LLDB has been structured from the beginning to be scriptable in two
5 ways -- a Unix Python session can initiate/run a debug session
6 non-interactively using LLDB; and within the LLDB debugger tool, Python
7 scripts can be used to help with many tasks, including inspecting
8 program data, iterating over containers and determining if a breakpoint
9 should stop execution or continue. This document will show how to do
10 some of these things by going through an example, explaining how to use
11 Python scripting to find a bug in a program that searches for text in a
14 The Test Program and Input
15 --------------------------
17 We have a simple C program (dictionary.c) that reads in a text file,
18 and stores all the words from the file in a Binary Search Tree, sorted
19 alphabetically. It then enters a loop prompting the user for a word,
20 searching for the word in the tree (using Binary Search), and reporting
21 to the user whether or not it found the word in the tree.
23 The input text file we are using to test our program contains the text
24 for William Shakespeare's famous tragedy "Romeo and Juliet".
29 When we try running our program, we find there is a problem. While it
30 successfully finds some of the words we would expect to find, such as
31 "love" or "sun", it fails to find the word "Romeo", which MUST be in
36 $ ./dictionary Romeo-and-Juliet.txt
38 Enter search word: love
40 Enter search word: sun
42 Enter search word: Romeo
47 Using Depth First Search
48 ------------------------
50 Our first job is to determine if the word "Romeo" actually got inserted
51 into the tree or not. Since "Romeo and Juliet" has thousands of words,
52 trying to examine our binary search tree by hand is completely
53 impractical. Therefore we will write a Python script to search the tree
54 for us. We will write a recursive Depth First Search function that
55 traverses the entire tree searching for a word, and maintaining
56 information about the path from the root of the tree to the current
57 node. If it finds the word in the tree, it returns the path from the
58 root to the node containing the word. This is what our DFS function in
59 Python would look like, with line numbers added for easy reference in
64 1: def DFS (root, word, cur_path):
65 2: root_word_ptr = root.GetChildMemberWithName ("word")
66 3: left_child_ptr = root.GetChildMemberWithName ("left")
67 4: right_child_ptr = root.GetChildMemberWithName ("right")
68 5: root_word = root_word_ptr.GetSummary()
69 6: end = len (root_word) - 1
70 7: if root_word[0] == '"' and root_word[end] == '"':
71 8: root_word = root_word[1:end]
72 9: end = len (root_word) - 1
73 10: if root_word[0] == '\'' and root_word[end] == '\'':
74 11: root_word = root_word[1:end]
75 12: if root_word == word:
77 14: elif word < root_word:
78 15: if left_child_ptr.GetValue() == None:
81 18: cur_path = cur_path + "L"
82 19: return DFS (left_child_ptr, word, cur_path)
84 21: if right_child_ptr.GetValue() == None:
87 24: cur_path = cur_path + "R"
88 25: return DFS (right_child_ptr, word, cur_path)
91 Accessing & Manipulating Program Variables
92 ------------------------------------------
94 Before we can call any Python function on any of our program's
95 variables, we need to get the variable into a form that Python can
96 access. To show you how to do this we will look at the parameters for
97 the DFS function. The first parameter is going to be a node in our
98 binary search tree, put into a Python variable. The second parameter is
99 the word we are searching for (a string), and the third parameter is a
100 string representing the path from the root of the tree to our current
103 The most interesting parameter is the first one, the Python variable
104 that needs to contain a node in our search tree. How can we take a
105 variable out of our program and put it into a Python variable? What
106 kind of Python variable will it be? The answers are to use the LLDB API
107 functions, provided as part of the LLDB Python module. Running Python
108 from inside LLDB, LLDB will automatically give us our current frame
109 object as a Python variable, "lldb.frame". This variable has the type
110 `SBFrame` (see the LLDB API for more information about `SBFrame`
111 objects). One of the things we can do with a frame object, is to ask it
112 to find and return its local variable. We will call the API function
113 `SBFrame.FindVariable` on the lldb.frame object to give us our dictionary
114 variable as a Python variable:
118 root = lldb.frame.FindVariable ("dictionary")
120 The line above, executed in the Python script interpreter in LLDB, asks the
121 current frame to find the variable named "dictionary" and return it. We then
122 store the returned value in the Python variable named "root". This answers the
123 question of HOW to get the variable, but it still doesn't explain WHAT actually
124 gets put into "root". If you examine the LLDB API, you will find that the
125 `SBFrame` method "FindVariable" returns an object of type `SBValue`. `SBValue`
126 objects are used, among other things, to wrap up program variables and values.
127 There are many useful methods defined in the `SBValue` class to allow you to get
128 information or children values out of SBValues. For complete information, see
129 the header file SBValue.h. The `SBValue` methods that we use in our DFS function
130 are ``GetChildMemberWithName()``, ``GetSummary()``, and ``GetValue()``.
133 Explaining DFS Script in Detail
134 -------------------------------
136 Before diving into the details of this code, it would be best to give a
137 high-level overview of what it does. The nodes in our binary search tree were
138 defined to have type ``tree_node *``, which is defined as:
142 typedef struct tree_node
145 struct tree_node *left;
146 struct tree_node *right;
149 Lines 2-11 of DFS are getting data out of the current tree node and getting
150 ready to do the actual search; lines 12-25 are the actual depth-first search.
151 Lines 2-4 of our DFS function get the word, left and right fields out of the
152 current node and store them in Python variables. Since root_word_ptr is a
153 pointer to our word, and we want the actual word, line 5 calls GetSummary() to
154 get a string containing the value out of the pointer. Since GetSummary() adds
155 quotes around its result, lines 6-11 strip surrounding quotes off the word.
157 Line 12 checks to see if the word in the current node is the one we are
158 searching for. If so, we are done, and line 13 returns the current path.
159 Otherwise, line 14 checks to see if we should go left (search word comes before
160 the current word). If we decide to go left, line 15 checks to see if the left
161 pointer child is NULL ("None" is the Python equivalent of NULL). If the left
162 pointer is NULL, then the word is not in this tree and we return an empty path
163 (line 16). Otherwise, we add an "L" to the end of our current path string, to
164 indicate we are going left (line 18), and then recurse on the left child (line
165 19). Lines 20-25 are the same as lines 14-19, except for going right rather
168 One other note: Typing something as long as our DFS function directly into the
169 interpreter can be difficult, as making a single typing mistake means having to
170 start all over. Therefore we recommend doing as we have done: Writing your
171 longer, more complicated script functions in a separate file (in this case
172 tree_utils.py) and then importing it into your LLDB Python interpreter.
175 The DFS Script in Action
176 ------------------------
178 At this point we are ready to use the DFS function to see if the word "Romeo"
179 is in our tree or not. To actually use it in LLDB on our dictionary program,
180 you would do something like this:
185 (lldb) process attach -n "dictionary"
186 Architecture set to: x86_64.
188 * thread #1: tid = 0x2c03, 0x00007fff86c8bea0 libSystem.B.dylib`read$NOCANCEL + 8, stop reason = signal SIGSTOP
189 frame #0: 0x00007fff86c8bea0 libSystem.B.dylib`read$NOCANCEL + 8
190 (lldb) breakpoint set -n find_word
191 Breakpoint created: 1: name = 'find_word', locations = 1, resolved = 1
195 * thread #1: tid = 0x2c03, 0x0000000100001830 dictionary`find_word + 16
196 at dictionary.c:105, stop reason = breakpoint 1.1
197 frame #0: 0x0000000100001830 dictionary`find_word + 16 at dictionary.c:105
199 103 find_word (tree_node *dictionary, char *word)
201 -> 105 if (!word || !dictionary)
204 108 int compare_value = strcmp (word, dictionary->word);
206 Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D.
207 >>> import tree_utils
208 >>> root = lldb.frame.FindVariable ("dictionary")
209 >>> current_path = ""
210 >>> path = tree_utils.DFS (root, "Romeo", current_path)
216 The first bit of code above shows starting lldb, attaching to the dictionary
217 program, and getting to the find_word function in LLDB. The interesting part
218 (as far as this example is concerned) begins when we enter the script command
219 and drop into the embedded interactive Python interpreter. We will go over this
220 Python code line by line. The first line
227 imports the file where we wrote our DFS function, tree_utils.py, into Python.
228 Notice that to import the file we leave off the ".py" extension. We can now
229 call any function in that file, giving it the prefix "tree_utils.", so that
230 Python knows where to look for the function. The line
234 root = lldb.frame.FindVariable ("dictionary")
237 gets our program variable "dictionary" (which contains the binary search tree)
238 and puts it into the Python variable "root". See Accessing & Manipulating
239 Program Variables in Python above for more details about how this works. The
246 This line initializes the current_path from the root of the tree to our current
247 node. Since we are starting at the root of the tree, our current path starts as
248 an empty string. As we go right and left through the tree, the DFS function
249 will append an 'R' or an 'L' to the current path, as appropriate. The line
253 path = tree_utils.DFS (root, "Romeo", current_path)
255 calls our DFS function (prefixing it with the module name so that Python can
256 find it). We pass in our binary tree stored in the variable root, the word we
257 are searching for, and our current path. We assign whatever path the DFS
258 function returns to the Python variable path.
260 Finally, we want to see if the word was found or not, and if so we want to see
261 the path through the tree to the word. So we do
267 From this we can see that the word "Romeo" was indeed found in the tree, and
268 the path from the root of the tree to the node containing "Romeo" is
269 left-left-right-right-left.
271 Using Breakpoint Command Scripts
272 --------------------------------
274 We are halfway to figuring out what the problem is. We know the word we are
275 looking for is in the binary tree, and we know exactly where it is in the
276 binary tree. Now we need to figure out why our binary search algorithm is not
277 finding the word. We will do this using breakpoint command scripts.
279 The idea is as follows. The binary search algorithm has two main decision
280 points: the decision to follow the right branch; and, the decision to follow
281 the left branch. We will set a breakpoint at each of these decision points, and
282 attach a Python breakpoint command script to each breakpoint. The breakpoint
283 commands will use the global path Python variable that we got from our DFS
284 function. Each time one of these decision breakpoints is hit, the script will
285 compare the actual decision with the decision the front of the path variable
286 says should be made (the first character of the path). If the actual decision
287 and the path agree, then the front character is stripped off the path, and
288 execution is resumed. In this case the user never even sees the breakpoint
289 being hit. But if the decision differs from what the path says it should be,
290 then the script prints out a message and does NOT resume execution, leaving the
291 user sitting at the first point where a wrong decision is being made.
293 Python Breakpoint Command Scripts Are Not What They Seem
294 --------------------------------------------------------
296 What do we mean by that? When you enter a Python breakpoint command in LLDB, it
297 appears that you are entering one or more plain lines of Python. BUT LLDB then
298 takes what you entered and wraps it into a Python FUNCTION (just like using the
299 "def" Python command). It automatically gives the function an obscure, unique,
300 hard-to-stumble-across function name, and gives it two parameters: frame and
301 bp_loc. When the breakpoint gets hit, LLDB wraps up the frame object where the
302 breakpoint was hit, and the breakpoint location object for the breakpoint that
303 was hit, and puts them into Python variables for you. It then calls the Python
304 function that was created for the breakpoint command, and passes in the frame
305 and breakpoint location objects.
307 So, being practical, what does this mean for you when you write your Python
308 breakpoint commands? It means that there are two things you need to keep in
309 mind: 1. If you want to access any Python variables created outside your
310 script, you must declare such variables to be global. If you do not declare
311 them as global, then the Python function will treat them as local variables,
312 and you will get unexpected behavior. 2. All Python breakpoint command scripts
313 automatically have a frame and a bp_loc variable. The variables are pre-loaded
314 by LLDB with the correct context for the breakpoint. You do not have to use
315 these variables, but they are there if you want them.
317 The Decision Point Breakpoint Commands
318 --------------------------------------
320 This is what the Python breakpoint command script would look like for the
321 decision to go right:
328 thread = frame.GetThread()
329 process = thread.GetProcess()
332 print "Here is the problem; going right, should go left!"
333 Just as a reminder, LLDB is going to take this script and wrap it up in a function, like this:
336 def some_unique_and_obscure_function_name (frame, bp_loc):
340 thread = frame.GetThread()
341 process = thread.GetProcess()
344 print "Here is the problem; going right, should go left!"
346 LLDB will call the function, passing in the correct frame and breakpoint
347 location whenever the breakpoint gets hit. There are several things to notice
348 about this function. The first one is that we are accessing and updating a
349 piece of state (the path variable), and actually conditioning our behavior
350 based upon this variable. Since the variable was defined outside of our script
351 (and therefore outside of the corresponding function) we need to tell Python
352 that we are accessing a global variable. That is what the first line of the
353 script does. Next we check where the path says we should go and compare it to
354 our decision (recall that we are at the breakpoint for the decision to go
355 right). If the path agrees with our decision, then we strip the first character
358 Since the decision matched the path, we want to resume execution. To do this we
359 make use of the frame parameter that LLDB guarantees will be there for us. We
360 use LLDB API functions to get the current thread from the current frame, and
361 then to get the process from the thread. Once we have the process, we tell it
362 to resume execution (using the Continue() API function).
364 If the decision to go right does not agree with the path, then we do not resume
365 execution. We allow the breakpoint to remain stopped (by doing nothing), and we
366 print an informational message telling the user we have found the problem, and
369 Actually Using The Breakpoint Commands
370 --------------------------------------
372 Now we will look at what happens when we actually use these breakpoint commands
373 on our program. Doing a source list -n find_word shows us the function
374 containing our two decision points. Looking at the code below, we see that we
375 want to set our breakpoints on lines 113 and 115:
379 (lldb) source list -n find_word
380 File: /Volumes/Data/HD2/carolinetice/Desktop/LLDB-Web-Examples/dictionary.c.
383 103 find_word (tree_node *dictionary, char *word)
385 105 if (!word || !dictionary)
388 108 int compare_value = strcmp (word, dictionary->word);
390 110 if (compare_value == 0)
392 112 else if (compare_value < 0)
393 113 return find_word (dictionary->left, word);
395 115 return find_word (dictionary->right, word);
400 So, we set our breakpoints, enter our breakpoint command scripts, and see what happens:
404 (lldb) breakpoint set -l 113
405 Breakpoint created: 2: file ='dictionary.c', line = 113, locations = 1, resolved = 1
406 (lldb) breakpoint set -l 115
407 Breakpoint created: 3: file ='dictionary.c', line = 115, locations = 1, resolved = 1
408 (lldb) breakpoint command add -s python 2
409 Enter your Python command(s). Type 'DONE' to end.
411 > if (path[0] == 'L'):
413 > thread = frame.GetThread()
414 > process = thread.GetProcess()
417 > print "Here is the problem. Going left, should go right!"
419 (lldb) breakpoint command add -s python 3
420 Enter your Python command(s). Type 'DONE' to end.
422 > if (path[0] == 'R'):
424 > thread = frame.GetThread()
425 > process = thread.GetProcess()
428 > print "Here is the problem. Going right, should go left!"
432 Here is the problem. Going right, should go left!
434 * thread #1: tid = 0x2d03, 0x000000010000189f dictionary`find_word + 127 at dictionary.c:115, stop reason = breakpoint 3.1
435 frame #0: 0x000000010000189f dictionary`find_word + 127 at dictionary.c:115
436 112 else if (compare_value < 0)
437 113 return find_word (dictionary->left, word);
439 -> 115 return find_word (dictionary->right, word);
446 After setting our breakpoints, adding our breakpoint commands and continuing,
447 we run for a little bit and then hit one of our breakpoints, printing out the
448 error message from the breakpoint command. Apparently at this point in the
449 tree, our search algorithm decided to go right, but our path says the node we
450 want is to the left. Examining the word at the node where we stopped, and our
455 (lldb) expr dictionary->word
456 (const char *) $1 = 0x0000000100100080 "dramatis"
458 (char *) $2 = 0x00007fff5fbff108 "romeo"
460 So the word at our current node is "dramatis", and the word we are searching
461 for is "romeo". "romeo" comes after "dramatis" alphabetically, so it seems like
462 going right would be the correct decision. Let's ask Python what it thinks the
463 path from the current node to our word is:
467 (lldb) script print path
470 According to Python we need to go left-left-right-right-left from our current
471 node to find the word we are looking for. Let's double check our tree, and see
472 what word it has at that node:
476 (lldb) expr dictionary->left->left->right->right->left->word
477 (const char *) $4 = 0x0000000100100880 "Romeo"
479 So the word we are searching for is "romeo" and the word at our DFS location is
480 "Romeo". Aha! One is uppercase and the other is lowercase: We seem to have a
481 case conversion problem somewhere in our program (we do).
483 This is the end of our example on how you might use Python scripting in LLDB to
484 help you find bugs in your program.
486 Source Files for The Example
487 ----------------------------
489 The complete code for the Dictionary program (with case-conversion bug), the
490 DFS function and other Python script examples (tree_utils.py) used for this
491 example are available below.
493 tree_utils.py - Example Python functions using LLDB's API, including DFS
498 # ===-- tree_utils.py ---------------------------------------*- Python -*-===//
500 # Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
501 # See https://llvm.org/LICENSE.txt for license information.
502 # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
504 # ===----------------------------------------------------------------------===//
506 tree_utils.py - A set of functions for examining binary
507 search trees, based on the example search tree defined in
508 dictionary.c. These functions contain calls to LLDB API
509 functions, and assume that the LLDB Python module has been
512 For a thorough explanation of how the DFS function works, and
513 for more information about dictionary.c go to
514 http://lldb.llvm.org/scripting.html
518 def DFS(root, word, cur_path):
520 Recursively traverse a binary search tree containing
521 words sorted alphabetically, searching for a particular
522 word in the tree. Also maintains a string representing
523 the path from the root of the tree to the current node.
524 If the word is found in the tree, return the path string.
525 Otherwise return an empty string.
527 This function assumes the binary search tree is
528 the one defined in dictionary.c It uses LLDB API
529 functions to examine and traverse the tree nodes.
532 # Get pointer field values out of node 'root'
534 root_word_ptr = root.GetChildMemberWithName("word")
535 left_child_ptr = root.GetChildMemberWithName("left")
536 right_child_ptr = root.GetChildMemberWithName("right")
538 # Get the word out of the word pointer and strip off
539 # surrounding quotes (added by call to GetSummary).
541 root_word = root_word_ptr.GetSummary()
542 end = len(root_word) - 1
543 if root_word[0] == '"' and root_word[end] == '"':
544 root_word = root_word[1:end]
545 end = len(root_word) - 1
546 if root_word[0] == '\'' and root_word[end] == '\'':
547 root_word = root_word[1:end]
549 # Main depth first search
551 if root_word == word:
553 elif word < root_word:
555 # Check to see if left child is NULL
557 if left_child_ptr.GetValue() is None:
560 cur_path = cur_path + "L"
561 return DFS(left_child_ptr, word, cur_path)
564 # Check to see if right child is NULL
566 if right_child_ptr.GetValue() is None:
569 cur_path = cur_path + "R"
570 return DFS(right_child_ptr, word, cur_path)
575 Recursively traverse a binary search tree, counting
576 the nodes in the tree. Returns the final count.
578 This function assumes the binary search tree is
579 the one defined in dictionary.c It uses LLDB API
580 functions to examine and traverse the tree nodes.
582 if (root.GetValue is None):
585 if (int(root.GetValue(), 16) == 0):
588 left_size = tree_size(root.GetChildAtIndex(1))
589 right_size = tree_size(root.GetChildAtIndex(2))
591 total_size = left_size + right_size + 1
595 def print_tree(root):
597 Recursively traverse a binary search tree, printing out
598 the words at the nodes in alphabetical order (the
599 search order for the binary tree).
601 This function assumes the binary search tree is
602 the one defined in dictionary.c It uses LLDB API
603 functions to examine and traverse the tree nodes.
605 if (root.GetChildAtIndex(1).GetValue() is not None) and (
606 int(root.GetChildAtIndex(1).GetValue(), 16) != 0):
607 print_tree(root.GetChildAtIndex(1))
609 print root.GetChildAtIndex(0).GetSummary()
611 if (root.GetChildAtIndex(2).GetValue() is not None) and (
612 int(root.GetChildAtIndex(2).GetValue(), 16) != 0):
613 print_tree(root.GetChildAtIndex(2))
616 dictionary.c - Sample dictionary program, with bug
620 //===-- dictionary.c ---------------------------------------------*- C -*-===//
622 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
623 // See https://llvm.org/LICENSE.txt for license information.
624 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
626 //===----------------------------------------------------------------------===//
632 typedef struct tree_node {
634 struct tree_node *left;
635 struct tree_node *right;
638 /* Given a char*, returns a substring that starts at the first
639 alphabet character and ends at the last alphabet character, i.e. it
640 strips off beginning or ending quotes, punctuation, etc. */
642 char *strip(char **word) {
644 int len = strlen(start);
645 char *end = start + len - 1;
647 while ((start < end) && (!isalpha(start[0])))
650 while ((end > start) && (!isalpha(end[0])))
662 /* Given a binary search tree (sorted alphabetically by the word at
663 each node), and a new word, inserts the word at the appropriate
664 place in the tree. */
666 void insert(tree_node *root, char *word) {
670 int compare_value = strcmp(word, root->word);
672 if (compare_value == 0)
675 if (compare_value < 0) {
676 if (root->left != NULL)
677 insert(root->left, word);
679 tree_node *new_node = (tree_node *)malloc(sizeof(tree_node));
680 new_node->word = strdup(word);
681 new_node->left = NULL;
682 new_node->right = NULL;
683 root->left = new_node;
686 if (root->right != NULL)
687 insert(root->right, word);
689 tree_node *new_node = (tree_node *)malloc(sizeof(tree_node));
690 new_node->word = strdup(word);
691 new_node->left = NULL;
692 new_node->right = NULL;
693 root->right = new_node;
698 /* Read in a text file and storea all the words from the file in a
699 binary search tree. */
701 void populate_dictionary(tree_node **dictionary, char *filename) {
705 in_file = fopen(filename, "r");
707 while (fscanf(in_file, "%s", word) == 1) {
708 char *new_word = (strdup(word));
709 new_word = strip(&new_word);
710 if (*dictionary == NULL) {
711 tree_node *new_node = (tree_node *)malloc(sizeof(tree_node));
712 new_node->word = new_word;
713 new_node->left = NULL;
714 new_node->right = NULL;
715 *dictionary = new_node;
717 insert(*dictionary, new_word);
722 /* Given a binary search tree and a word, search for the word
723 in the binary search tree. */
725 int find_word(tree_node *dictionary, char *word) {
726 if (!word || !dictionary)
729 int compare_value = strcmp(word, dictionary->word);
731 if (compare_value == 0)
733 else if (compare_value < 0)
734 return find_word(dictionary->left, word);
736 return find_word(dictionary->right, word);
739 /* Print out the words in the binary search tree, in sorted order. */
741 void print_tree(tree_node *dictionary) {
745 if (dictionary->left)
746 print_tree(dictionary->left);
748 printf("%s\n", dictionary->word);
750 if (dictionary->right)
751 print_tree(dictionary->right);
754 int main(int argc, char **argv) {
755 tree_node *dictionary = NULL;
766 populate_dictionary(&dictionary, filename);
767 fprintf(stdout, "Dictionary loaded.\nEnter search word: ");
768 while (!done && fgets(buffer, sizeof(buffer), stdin)) {
770 int len = strlen(word);
773 for (i = 0; i < len; ++i)
774 word[i] = tolower(word[i]);
776 if ((len > 0) && (word[len - 1] == '\n')) {
777 word[len - 1] = '\0';
781 if (find_word(dictionary, word))
782 fprintf(stdout, "Yes!\n");
784 fprintf(stdout, "No!\n");
786 fprintf(stdout, "Enter search word: ");
789 fprintf(stdout, "\n");
794 The text for "Romeo and Juliet" can be obtained from the Gutenberg Project
795 (http://www.gutenberg.org).