sys/arch/hp300/DOC/TODO.hp300

   1 $NetBSD$
   2
   3 1. Create and use an interrupt stack.
   4    Well actually, use the master SP for kernel stacks instead of
   5    the interrupt SP.  Right now we use the interrupt stack for
   6    everything.  Allows for more accurate accounting of systime.
   7    In theory, could also allow for smaller kernel stacks but we
   8    only use one page anyway.
   9
  10 2. Copy/clear primitives could be tuned.
  11    What is best is highly CPU and cache dependent.  One thing to look
  12    at is the copyin/copyout primitives.  Rather than looping using
  13    MOVS instructions, you could map an entire page at a time and use
  14    bcopy, MOVE16, or whatever.  This would lose big on the VAC models
  15    however.
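
        A minimal sketch of the page-at-a-time splitting such a scheme
        would need, runnable on a host.  The names page_chunk,
        pages_touched and SKETCH_NBPG are hypothetical, and the 4096-byte
        page size is an assumption, not something this file states.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NBPG 4096u	/* assumed page size (hypothetical name) */

/*
 * Bytes that can be copied starting at user address uaddr without
 * crossing a page boundary, capped at len.
 */
size_t
page_chunk(uintptr_t uaddr, size_t len)
{
	size_t room = SKETCH_NBPG - (size_t)(uaddr & (SKETCH_NBPG - 1));

	return (len < room) ? len : room;
}

/*
 * Number of single-page mappings a copy of len bytes starting at
 * uaddr would need if each user page is mapped once and then copied
 * with an ordinary block copy.
 */
size_t
pages_touched(uintptr_t uaddr, size_t len)
{
	size_t n = 0;

	while (len > 0) {
		size_t c = page_chunk(uaddr, len);

		uaddr += c;
		len -= c;
		n++;
	}
	return n;
}
```

        The per-page mapping cost is what has to be weighed against the
        MOVS loop; short copies that straddle a page boundary still pay
        for two mappings.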
  16
  17 3. Sendsig/sigreturn are pretty bogus.
  18    Currently we can call a signal handler even if an exception
  19    occurs in the middle of an instruction.  This causes the handler
  20    to return right back to the middle of the offending instruction
  21    which will most likely lead to another exception/signal.
  22    Technically, I feel this is the correct behavior but it requires
  23    saving a lot of state on the user's stack, state that we don't
  24    really want the user messing with.  Other 68k implementations
  25    (e.g. Sun) will delay signals or abort execution of the current
  26    instruction to reduce saved state.  Even if we stick with the
  27    current philosophy, the code could be cleaned up.
  28
  29 4. Ditto for AST and software interrupt emulation.
  30    Both are possibly over-elaborate and inefficiently implemented.
  31    We could possibly handle them by using an appropriately planted
  32    PS trace bit.
  33
  34 5. Make use of transparent translation registers on 030/040 MMU.
  35    With a little rearranging of the KVA space we could use one to
  36    map the entire external IO space [ 600000 - 20000000 ).  Since
  37    the translation must be 1-1, this would limit the kernel to 6MB
  38    (some would say that is hardly a limit) or divide it into two
  39    pieces.  Another promising use would be to map physical memory
  40    within the kernel.  This allows a much simpler and more efficient
  41    implementation of /dev/mem, pmap_zero_page, pmap_copy_page and
  42    possibly even kernel-user cross address space copies.  However,
  43    it does eat up a significant piece of kernel address space.
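
        A rough model of how a transparent translation register selects
        addresses, to make the granularity visible.  It assumes the
        68030/68040 scheme of comparing only the upper eight address bits
        against a base, with mask bits marking don't-care positions.  The
        function name tt_matches is hypothetical, and function code and
        read/write qualifiers are left out.

```c
#include <stdint.h>

/*
 * Nonzero if addr would be transparently translated by a register
 * with the given address base and mask.  Only address bits 31-24
 * take part in the comparison; a 1 bit in mask means "ignore this
 * bit".
 */
int
tt_matches(uint32_t addr, uint8_t base, uint8_t mask)
{
	uint8_t top = (uint8_t)(addr >> 24);

	return ((top ^ base) & (uint8_t)~mask) == 0;
}
```

        With base 0x00 and mask 0x1f the window covers every address
        below 0x20000000.  Since matching is done in 16MB units, that
        window also includes the addresses below 0x600000.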
  44
  45 6. Create a 32-bit timer.
  46    Timers 2 and 3 on the MC6840 clock chip can be concatenated together to
  47    get a 32-bit countdown timer.  There are at least three uses for this:
  48    1. Monitoring the interval timer ("clock") to detect lost "ticks".
  49       (Idea from Scott Marovich)
  50    2. Implement the DELAY macro properly instead of approximating with
  51       the current "while (--count);" loop.  Because of caches, the current
  52       method is potentially way off.
  53    3. Export as a user-mappable timer for high-precision (4us) timing.
  54    Note that by doing this we can no longer use timer 3 as a separate
  55    statistics/profiling timer.  Should be able to compile-time (runtime?)
  56    select between the two.
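
        A sketch of reading such a concatenated countdown timer
        consistently, runnable on a host.  Everything here is
        hypothetical: struct sim_timer stands in for the two 16-bit
        counters (it steps once per low-half read so that a borrow can
        happen mid-read), and TICK_USEC assumes the tick period equals
        the 4us precision quoted above.

```c
#include <stdint.h>

#define TICK_USEC 4u	/* assumed tick period, from the 4us figure */

/* Stand-in for the hardware: one 32-bit countdown value. */
struct sim_timer {
	uint32_t count;
};

/* Reading the low half advances the simulated countdown by one. */
uint16_t
sim_read_lo(struct sim_timer *t)
{
	t->count--;
	return (uint16_t)(t->count & 0xffff);
}

uint16_t
sim_read_hi(struct sim_timer *t)
{
	return (uint16_t)(t->count >> 16);
}

/*
 * Read both halves, retrying if the high half changed while the low
 * half was being read (i.e. the low counter borrowed in between).
 */
uint32_t
timer32_read(struct sim_timer *t)
{
	uint16_t hi, lo, hi2;

	hi = sim_read_hi(t);
	for (;;) {
		lo = sim_read_lo(t);
		hi2 = sim_read_hi(t);
		if (hi2 == hi)
			break;
		hi = hi2;
	}
	return ((uint32_t)hi << 16) | lo;
}

/* Ticks elapsed on a countdown timer; unsigned math handles wrap. */
uint32_t
timer32_elapsed(uint32_t start, uint32_t now)
{
	return start - now;
}

/* Ticks to wait for at least usec microseconds (rounds up). */
uint32_t
usec_to_ticks(uint32_t usec)
{
	return (usec + TICK_USEC - 1) / TICK_USEC;
}
```

        A DELAY built on this would record timer32_read() once and spin
        until timer32_elapsed() reaches usec_to_ticks(n), which does not
        depend on cache behavior the way a counted loop does.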
  57
  58 7. Conditional MMU code should be restructured.
  59    Right now it reflects the evolutionary path of the code: 320/350 MMU
  60    was supported and PMMU support was glued on.  The latter can be ifdef'ed
  61    out when not needed, but not all of the former (e.g. ``mmutype'' tests).
  62    Also, PMMU is made to look like the HP MMU, somewhat hamstringing it.
  63    Since HP MMU models are dead, the excess baggage should fall on them (though
  64    it could be argued that they benefit more from the minor performance
  65    impact).  MMU code should probably not be ifdef'ed on model type, but
  66    rather on more relevant tags (e.g. MMU_HP, MMU_MOTO).
  67
  68 8. Redo cache handling.
  69    There are way too many routines which are specific to particular
  70    cache types.  We should be able to come up with a more coherent
  71    scheme (though HP 68k boxes have just about every caching scheme
  72    imaginable: internal/external, physical/virtual, writeback/writethrough).
  73    See, for example, Wheeler and Bershad in ASPLOS 92.
  74
  75 9. Sort the free page list.
  76    The DMA hardware on the 300 cannot do scatter/gather IO.  For example,
  77    if an 8k system buffer consists of two non-contiguous physical pages
  78    it will require two DMA transfers (and hence two interrupts) to do the
  79    operation.  It would take only one transfer if they were physically
  80    contiguous.  By keeping the free list ordered we could potentially
  81    allocate contiguous pages and reduce the number of interrupts.  We can
  82    consider doing this since pages in the free list are not reclaimed and
  83    thus we don't have to worry about distorting any LRU behavior.
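
        A small sketch of the arithmetic behind this, runnable on a
        host.  The function names and SKETCH_NBPG are hypothetical; the
        4096-byte page size is assumed from the "8k buffer, two pages"
        example above.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NBPG 4096u	/* assumed page size (hypothetical name) */

/*
 * Number of DMA transfers needed for a buffer whose pages have the
 * given physical addresses, when the hardware cannot scatter/gather:
 * one transfer per physically contiguous run.
 */
size_t
dma_transfers(const uint32_t *pa, size_t npages)
{
	size_t i, n;

	if (npages == 0)
		return 0;
	n = 1;
	for (i = 1; i < npages; i++)
		if (pa[i] != pa[i - 1] + SKETCH_NBPG)
			n++;
	return n;
}

/*
 * Given a free list sorted by physical address, return the index of
 * the first run of `want' contiguous pages, or -1 if there is none.
 */
ptrdiff_t
find_contig(const uint32_t *freelist, size_t nfree, size_t want)
{
	size_t start = 0, i;

	if (want == 0 || nfree < want)
		return -1;
	for (i = 1; i <= nfree; i++) {
		if (i - start >= want)
			return (ptrdiff_t)start;
		if (i == nfree)
			break;
		if (freelist[i] != freelist[i - 1] + SKETCH_NBPG)
			start = i;	/* run broken, restart here */
	}
	return -1;
}
```

        On a sorted list the search is a single linear pass; on an
        unsorted list adjacent entries say nothing about physical
        adjacency, which is the case for keeping the list ordered.
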
  84 ----
  85 Mike Hibler
  86 University of Utah CSS group
  87 mike@cs.utah.edu