TODO

   1 Smaller items
   2 =============
   3
   4 * Make the 'grep' proc support a -n option that is a synonym for
   5   'line'. The 'line' option must be retained for backward
   6   compatibility.
   7
   8 Bigger items
   9 ============
  10
  11 * Internationali[sz]ation.
  12 * Use a throw-away slave interpreter for running each test case.
  13 * Transfer timeouts should be dependent on file size and link speed.
  14 * Add more support for target boards and RTOSes.
  15 * Use the new expect terminal support for an "escape codes" API.
  16 * Use expectk and write a GUI testing API, complete with record/playback.
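
For the transfer-timeout item above, one possible shape is sketched below in
Python. The function name, the fixed base allowance and the safety factor are
all assumptions made for illustration; nothing like this exists in DejaGnu
today.

```python
def transfer_timeout(size_bytes, link_bits_per_sec,
                     base_seconds=10.0, safety_factor=2.0):
    """Return a timeout in seconds for sending size_bytes over a link.

    base_seconds covers connection setup (assumed value); safety_factor
    pads the ideal transfer time for protocol overhead (assumed value).
    """
    if link_bits_per_sec <= 0:
        raise ValueError("link speed must be positive")
    ideal_seconds = (size_bytes * 8) / link_bits_per_sec
    return base_seconds + safety_factor * ideal_seconds
```

With these defaults a 1 MB file over an 8 Mbit/s link gets 12 seconds, while
an empty file still gets the 10 second base allowance.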
  17
  18
  19 \f
  20 Date: Thu, 29 Aug 2013 19:42:07 +0200
  21 From: Jan Kratochvil <jan.kratochvil@redhat.com>
  22 To: dejagnu@gnu.org
  23 Subject: dejagnu-2.0 feature wishlist (from Cauldron 2013)
  24
  25 Hi,
  26
  27 I haven't found any discussion here about the features of a hypothetical
  28 dejagnu-2.0, as presented by Rob Savoye at Cauldron 2013.
  29
  30 I wrote some scripts on top of DejaGNU, but I think at least some of the
  31 functionality could be integrated into DejaGNU itself.  It depends on whether
  32 the dejagnu-2.0 scope will remain the same, whether DejaGNU should be used
  33 together with tools like buildbot, or whether dejagnu-2.0 will integrate some
  34 of the buildbot-like functionality (multi-node continuous runs).
  35
  36 Maybe something similar already exists?  Originally I wrote it only for
  37 myself, but I now see that such a tool may be useful to more people.
  38
  39 Earlier announcement of my scripts:
  40         https://sourceware.org/ml/archer/2010-q3/msg00194.html
  41 The URLs there are no longer valid; the files can now be found at:
  42         git clone git://git.jankratochvil.net/nethome
  43         (that is my whole $HOME, not just the testsuite scripts)
  44 The primary script 'hammock' is at:
  45         http://git.jankratochvil.net/?p=nethome.git;a=blob;f=bin/hammock
  46
  47 Essential fixup of current DejaGNU:
  48 ------------------------------------------------------------------------------
  49 --orphanripper: Used by default.  Normal DejaGNU scripts do not track their
  50 spawned children, which share fds 0/1/2 (stdio).  Because of bugs in *.exp
  51 code, some of these children are occasionally left running forever.  As they
  52 still hold their fds open, a testsuite whose output is redirected somewhere
  53 will lock up at the end.  Some runaway processes also hog 100% of a CPU.  The
  54 following utility identifies runaway processes by giving them a custom pty
  55 and kills them at the end of the testsuite run:
  56         http://pkgs.fedoraproject.org/cgit/gdb.git/plain/gdb-orphanripper.c
  57 It should certainly be better integrated into DejaGNU somehow.
  58
  59 Features outside of the current scope of DejaGNU:
  60 ------------------------------------------------------------------------------
  61 --distro: Testing on various OSes.  My script implements it with chroot
  62 (Fedora/RHEL has the tool 'mock' for this).  That has some performance and
  63 management advantages, but (1) all OSes run on the same kernel, (2) mock
  64 supports only Fedora/RHEL OSes, and (3) only x86_64/i386 can be run this way.
  65 The real solution should be multi-node (so that it can also support non-x86*
  66 testing); for x86* it would commonly use VMs.  But it could still support
  67 mock/chroot as well, as that runs without the hassle of disk images.
  68
  69 --component: Pre-set remote repositories for downloading gdb/binutils/gcc etc.
  70 I want to run my patches on top of a clean tree, not in some existing directory
  71 which may contain leftover files that were never checked into the repository.
  72 Naturally it also supports local repository caches.
  73 --srcrpm is similar: it builds the tree from a prepared archive, so I can also
  74 provide a .src.rpm (or .tar.gz) to run the tests for.
  75 --branch requests a branch, for example "gdb_7_6-branch", from the repository.
  76
  77 --file: Provide custom patches for the newly built tree.
  78
  79 --target: Provide a list of custom configure --target options.  This could be
  80 generalized to accept any custom configure options.
  81
  82 --parallel: Parallelization of multiple build+testsuite runs, not just
  83 parallelization of the testsuite run part.
  84 If I ask to build binutils 40 times with 40 different targets, I may want to
  85 run the builds in parallel (like with make -j8).
  86
  87 Convenience:
  88 ------------------------------------------------------------------------------
  89 --gdbserver, --valgrind, --bfd32, --gdbindex, --dwz, --dwarf=X, --stabs:
  90 Various pre-set options.  One can configure these by hand, but that is too
  91 cumbersome for daily use; for example, for GDB --dwz means
  92         runtest CC_FOR_TARGET=/bin/sh\ $PWD/../contrib/cc-with-tweaks.sh\ -m\ gcc CXX_FOR_TARGET=/bin/sh\ $PWD/../contrib/cc-with-tweaks.sh\ -m\ g++ ...
  93         (plus also GNATMAKE_FOR_TARGET, GO_FOR_TARGET and GO_LD_FOR_TARGET)
  94 For --valgrind it means other cryptic options like:
  95         RUNTESTFLAGS=--target_board=valgrind
  96
  97 Incomplete racy reads ("read1"):
  98 ------------------------------------------------------------------------------
  99 https://sourceware.org/bugzilla/show_bug.cgi?id=12649
 100 The GDB testsuite contains (yes, it still contains them) various racy cases:
 101   gdb_test_multiple "set dprintf-style agent" $msg {
 102       -re "warning: Target cannot run dprintf commands.*" {
 103 This usually works because all the GDB output is ready by the time expect
 104 does the read() syscall.  But occasionally the next testcase FAILs.  This is
 105 because sometimes the read() syscall returns only part of the output; the regex
 106 matches but the final $gdb_prompt is not consumed, and the leftover
 107 $gdb_prompt corrupts the next testcase below.  The fix, of course, is:
 108   gdb_test_multiple "set dprintf-style agent" $msg {
 109     -re "warning: Target cannot run dprintf commands.*\r\n$gdb_prompt $" {
 110
 111 There is an LD_PRELOAD *.so file in the bug above that reproduces these cases
 112 reliably.  There is also a reproducer for a different kind of bug ("writew"),
 113 although those do not happen as often AFAIK.  This functionality could be
 114 better integrated into DejaGNU.
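
The partial-read race described above can be imitated outside of expect. The
Python sketch below is only a toy model: expect_once, the "(gdb) " prompt
string and the exact output text are assumptions made for illustration, and
real expect buffering is more involved. It feeds the output one read() at a
time, tries the pattern after every read, and discards everything up to the
end of the match, the way expect does.

```python
import re

# Assumed GDB output: the warning from the testcase followed by the prompt.
OUTPUT = "warning: Target cannot run dprintf commands\r\n(gdb) "

RACY = r"warning: Target cannot run dprintf commands.*"
FIXED = r"warning: Target cannot run dprintf commands.*\r\n\(gdb\) $"


def expect_once(chunks, pattern):
    """Return (matched, leftover) after feeding chunks one read() at a time.

    leftover is what the next testcase would find in the buffer.
    """
    # DOTALL because '.' also matches newlines in Tcl regexps by default.
    regex = re.compile(pattern, re.DOTALL)
    buf = ""
    matched = False
    for chunk in chunks:
        buf += chunk
        if not matched:
            m = regex.search(buf)
            if m:
                buf = buf[m.end():]  # discard through the end of the match
                matched = True
    return matched, buf


if __name__ == "__main__":
    # Whole output in one read(): the racy pattern happens to eat the prompt.
    print(expect_once([OUTPUT], RACY))
    # One byte per read(), as "read1" forces: the prompt is left behind.
    print(expect_once(list(OUTPUT), RACY))
    # The anchored pattern consumes the prompt in both cases.
    print(expect_once(list(OUTPUT), FIXED))
```

Splitting the output into one-byte reads is the same idea as the "read1"
preload: it makes the rare interleaving happen every time.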
 115
 116 (Of course the primary problem is that the testsuite should not use regex
 117 matching at all; it should use a generic GDB MI output parser.  The obstacle
 118 is that only a few GDB features have implemented the GDB MI interface.)
 119
 120 Diffing of results:
 121 ------------------------------------------------------------------------------
 122 http://git.jankratochvil.net/?p=nethome.git;a=blob;f=bin/diffgdb
 123  * GDB has various known FAILs.  They should be KFAILed or XFAILed but are
 124    not.  (On recent Fedora I see only 23 of them, but on CentOS-5 there are
 125    1063 of them.)
 126  * One is only interested in newly introduced regressions, so one needs to
 127    diff two *.sum files.  Looking again and again at the same known FAILing
 128    cases is not productive.
 129  * When diffing, one is not interested in, for example, newly PASSing
 130    testcases.  One is also not interested in FAIL->PASS cases.  One is
 131    definitely interested in PASS->FAIL regressions.  New FAILing testcases
 132    are also interesting.
 133 Therefore the script above filters the diff results.  It parses the
 134 DejaGNU *.sum output, although DejaGNU already knew the results internally.
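
The filtering rules listed above can be sketched in a few lines of Python.
This is not the diffgdb script; parse_sum and regressions are hypothetical
names, and the sketch assumes the usual "STATUS: test name" lines of a *.sum
file and that test names are unique within one file (if a name repeats, the
last result wins).

```python
import re

RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL|UNRESOLVED|UNTESTED|UNSUPPORTED): (.*)$")


def parse_sum(text):
    """Map test name -> status for every result line in *.sum text."""
    results = {}
    for line in text.splitlines():
        m = RESULT_RE.match(line)
        if m:
            results[m.group(2)] = m.group(1)
    return results


def regressions(old_text, new_text):
    """Report only PASS->FAIL and newly appearing FAILs.

    Known FAILs, new PASSes and FAIL->PASS changes are dropped.
    """
    old = parse_sum(old_text)
    new = parse_sum(new_text)
    report = []
    for name, status in sorted(new.items()):
        if status != "FAIL":
            continue
        before = old.get(name)
        if before == "PASS":
            report.append(("PASS->FAIL", name))
        elif before is None:
            report.append(("new FAIL", name))
    return report
```

Passing the contents of two gdb.sum files to regressions() would give a list
short enough to read daily, which is the point of the filtering.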
 135
 136 The script also filters out unstable/racy results.  This may be out of scope,
 137 but a feature not yet implemented would be to provide statistics on unstable
 138 results (so one can fix them) when the same build+testsuite is run many
 139 times.
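
Such statistics could look roughly like the Python sketch below. The name
unstable_results is hypothetical, and the input is assumed to be one
name-to-status mapping per repeated run of the same build+testsuite. A test
that is missing from some runs is not treated specially here.

```python
from collections import Counter, defaultdict


def unstable_results(runs):
    """Return {test name: {status: count}} for tests whose status varied.

    runs is a list of dicts, one per repeated run, mapping test name -> status.
    """
    seen = defaultdict(Counter)
    for run in runs:
        for name, status in run.items():
            seen[name][status] += 1
    # A stable test has exactly one distinct status across all runs.
    return {name: dict(counts)
            for name, counts in seen.items() if len(counts) > 1}
```

A test reported as 2 x PASS and 1 x FAIL over three identical runs is a
candidate for a race fix rather than a regression.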
 140
 141 Not yet implemented: Finding the common cause of a regression:
 142 ------------------------------------------------------------------------------
 143 I run 73 testsuite runs daily - primarily GDB on different OSes, for each OS
 144 its x86_64 and i686 variant, and for x86_64 OSes also in -m32 mode.
 145 If a general regression happens, I get the same PASS->FAIL result 73 times.
 146 That makes it inconvenient to pick out other changes among the 73 regressions.
 147 Moreover, sometimes the regression affects, for example, only 32-bit OSes
 148 - then there will be only about 24 PASS->FAILs and I have to figure out
 149 in which testsuite combinations they happen.
 150 In other cases the regression happens, for example, only on (older) RHELs and
 151 not on Fedoras, which again means about 6 PASS->FAIL cases.
 152 It would be nice to have a summary saying that this PASS->FAIL occurred in
 153 testsuite runs in directories rhel* and did not occur in directories fedora* etc.
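
One way such a summary might be computed is sketched below in Python. All
names are hypothetical, and grouping run directories by their leading letters
(so that "rhel6-i686" falls into "rhel*") is only an assumption about how the
directories are named; a real tool would probably also want to group by
architecture.

```python
import re
from collections import defaultdict


def group_of(run_dir):
    """Assumed grouping: leading letters of the name, 'rhel6-i686' -> 'rhel'."""
    m = re.match(r"[A-Za-z]+", run_dir)
    return m.group(0) if m else run_dir


def summarize(regressions_by_run):
    """Say per test in which directory groups the PASS->FAIL occurred.

    regressions_by_run maps run directory -> set of PASS->FAIL test names
    and must contain every run, including those without regressions.
    """
    groups = defaultdict(list)
    for run_dir in regressions_by_run:
        groups[group_of(run_dir)].append(run_dir)
    all_tests = set().union(*regressions_by_run.values())
    summary = {}
    for test in sorted(all_tests):
        entry = {"all": [], "some": [], "none": []}
        for group, dirs in sorted(groups.items()):
            hits = sum(test in regressions_by_run[d] for d in dirs)
            if hits == len(dirs):
                entry["all"].append(group + "*")
            elif hits == 0:
                entry["none"].append(group + "*")
            else:
                entry["some"].append(group + "*")
        summary[test] = entry
    return summary
```

A regression seen in every rhel* run and in no fedora* run then collapses
into a single entry instead of one PASS->FAIL line per run.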