   1 6.828 2011 Lecture 4: Process Creation
   2
   3 big picture
   4   getting xv6 as far as the first process
   5   load kernel
   6   temporary page table
   7   real page table
   8   create process
   9   switch to process
  10   exec /init
  11
  12 why do we care about VM?
  13   implements address spaces:
  14     force each process to only r/w its own memory (bugs, security)
  15     user code at predictable addresses
  16     big contiguous user address space
  17
  18 where we were on Wednesday
  19   setting up a page table for xv6 kernel
  20   diagram of virtual address space
  21      0x80000000
  22      0x00000000
  23   each process has its own page table
  24   plus one for when not running a process (e.g. early in boot)
  25
  26 quick review of x86 page directory / page table
  27   [diagram: cr3, 1024 PDEs, 1024 page table pages]
  28   see last week's handout
  29   PTE: 20 bits phys addr, 12 flag bits
  30   translation: 10, 10, 12
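
sketch (not xv6 source): the 10, 10, 12 split written as C helpers, assuming 32-bit virtual addresses and 4096-byte pages. The names pdx, ptx, and pgoff are hypothetical, chosen only for this example.

```c
#include <stdint.h>

/* Index into the page directory: top 10 bits of the virtual address. */
static inline uint32_t pdx(uint32_t va) { return (va >> 22) & 0x3ffu; }

/* Index into the page table page: middle 10 bits. */
static inline uint32_t ptx(uint32_t va) { return (va >> 12) & 0x3ffu; }

/* Byte offset within the 4096-byte page: low 12 bits. */
static inline uint32_t pgoff(uint32_t va) { return va & 0xfffu; }
```

The two index masks are 0x3ff (10 bits) because each table holds 1024 entries; only the offset uses 0xfff (12 bits).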
  31
  32 we were early in main(), in kvmalloc(), after setupkvm()
  33   sheet 17
  34
  35 let's look at the page table (kpgdir) that setupkvm produces
  36 (gdb) break kvmalloc
  37 (gdb) next
  38 (gdb) print/x kpgdir[0]
  39 why is it zero?
  40
  41 let's look up a virtual address
  42   how about the first instruction of kvmalloc
  43   (gdb) x/i kvmalloc
  44   0x80107990 <kvmalloc>:  push   %ebp
  45   how would we translate 0x80107990 to a physical address?
  46
  47 (gdb) print/x 0x80107990 >> 22
  48 $4 = 0x200
  49 (gdb) print/x kpgdir[0x200]
  50 $6 = 0x114007
  51   Q: what is this?
  52   Q: what is the PPN?
  53   Q: what does the 7 mean?
  54 (gdb) print/x (0x80107990 >> 12) & 0x3ff
  55 $6 = 0x107
  56 (gdb) print/x ((int*)0x114000)[0x107]
  57 $12 = 0x107001
  58   Q: what is this?
  59   Q: why 1 in the low bits?
  60 (gdb) print/x 0x107000 + 0x990
  61 $13 = 0x107990
  62 (gdb) x/i 0x107990
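
sketch (not xv6 source): one way to read the entries printed above, assuming the standard x86 flag bits (present = 0x1, writeable = 0x2, user = 0x4). The helper names pte_addr, pte_flags, and translate_leaf are hypothetical.

```c
#include <stdint.h>

#define PTE_P 0x001u  /* present */
#define PTE_W 0x002u  /* writeable */
#define PTE_U 0x004u  /* user-accessible */

/* Physical address of the page (or page table page) an entry points at:
   the top 20 bits, with the 12 flag bits cleared. */
static inline uint32_t pte_addr(uint32_t e) { return e & ~0xfffu; }

/* The low 12 flag bits of an entry. */
static inline uint32_t pte_flags(uint32_t e) { return e & 0xfffu; }

/* Combine a leaf PTE with the page offset of va.
   Returns 0 when the entry is not present; using 0 as "no mapping"
   is a simplification made only in this sketch. */
static inline uint32_t translate_leaf(uint32_t pte, uint32_t va) {
    if (!(pte & PTE_P))
        return 0;
    return pte_addr(pte) | (va & 0xfffu);
}
```

With these helpers, pte_flags(0x114007) is PTE_P|PTE_W|PTE_U, pte_flags(0x107001) is PTE_P alone, and translate_leaf(0x107001, 0x80107990) gives 0x107990, matching the arithmetic done by hand in gdb.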
  63
  64 wait!!! why did the physical address work in gdb?
  65
  66 back to kvmalloc
  67   it called setupkvm to create a page table
  68   now it calls switchkvm to start using it
  69   switchkvm loads kpgdir into %cr3
  70
  71 new page table
  72   much like the previous one
  73   maps more phys mem above the kernel
  74   does not have temporary mapping for low 4 MB
  75   and now 0x107990 won't work:
  76     (gdb) x/i 0x107990
  77     0x107990:       Cannot access memory at address 0x107990
  78
  79 next topic: physical memory allocation
  80   remember that setupkvm allocated phys mem for page directory
  81   how does memory allocation work?
  82   sheet 27
  83
  84 physical memory allocator interface
  85   allocates a page at a time
  86   va = kalloc()
  87   kfree(va)
  88   always allocates from phys mem above where kernel was loaded
  89   so pa is va - KERNBASE
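
sketch (not xv6 source): the va/pa relationship as two helpers, assuming KERNBASE is 0x80000000 as in the address-space diagram earlier in these notes. The names v2p and p2v are hypothetical lowercase stand-ins.

```c
#include <stdint.h>

#define KERNBASE 0x80000000u  /* assumed from the diagram in these notes */

/* Kernel virtual address -> physical address. */
static inline uint32_t v2p(uint32_t va) { return va - KERNBASE; }

/* Physical address -> kernel virtual address. */
static inline uint32_t p2v(uint32_t pa) { return pa + KERNBASE; }
```

This only holds for addresses inside the kernel's direct mapping of physical memory; it says nothing about user virtual addresses.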
  90
  91 what does kernel use phys mem allocator for?
  92   page table pages
  93   kernel data structures (pipe buffers, stacks, &c)
  94   user memory
  95
  96 how does the allocator work?
  97   data structure: a list of free physical pages
  98   allocation = remove first entry from list
  99   free = add page to head of list
 100   list's "next" pointers stored in first 4 bytes of each free page
 101     that memory is available since the pages are free
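
sketch (not xv6 source): a free list threaded through the free pages themselves, following the description above. It assumes a single thread (no lock) and uses a small static arena as a stand-in for physical memory. page_alloc, page_free, pool_init, and arena are hypothetical names.

```c
#include <stddef.h>

#define PGSIZE 4096
#define NPAGES 4

/* A free page stores the list link in its own first bytes. */
struct run { struct run *next; };

/* One page, viewable either as a list link or as raw bytes. */
union page { struct run link; unsigned char bytes[PGSIZE]; };

/* Hypothetical stand-in for physical memory: NPAGES page-aligned pages. */
static _Alignas(PGSIZE) union page arena[NPAGES];

static struct run *freelist;   /* head of the free list */

/* free = add the page to the head of the list */
void page_free(void *page) {
    struct run *r = (struct run *)page;
    r->next = freelist;
    freelist = r;
}

/* allocation = remove the first entry; NULL if the list is empty */
void *page_alloc(void) {
    struct run *r = freelist;
    if (r)
        freelist = r->next;
    return r;
}

/* Seed the list by "freeing" every page in the arena. */
void pool_init(void) {
    freelist = NULL;
    for (size_t i = 0; i < NPAGES; i++)
        page_free(&arena[i]);
}
```

Because both operations work at the head of the list, the most recently freed page is the next one handed out.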
 102
 103 allocator depends on phys pages having virtual addresses!
 104   since it must write them to manipulate free list
 105   just like VM code needed to write page table pages
 106   thus xv6 maps all phys pages into kernel address space
 107     which burns up a lot of virtual address space, limits max user mem
 108   other arrangements are possible
 109
 110 where does allocator get initial pool of free physical memory?
 111
 112 kinit
 113   called by main
 114   newend is first address beyond the end of the kernel
 115     as a virtual address
 116     memory beyond that is unused
 117   PGROUNDUP since newend may not be page-aligned
 118   Q: why must allocated pages have page-aligned addresses?
 119   assume phys mem goes up to PHYSTOP (lame)
 120   kfree each page
 121     usually called on previously-allocated memory
 122     kinit is abusing kfree a little bit
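
sketch (not xv6 source): the rounding and the page-at-a-time loop described above. PGROUNDUP here is the usual round-up-to-a-multiple-of-4096 expression; count_free_pages is a hypothetical helper that counts pages instead of calling kfree on them, and stop plays the role of PHYSTOP.

```c
#include <stdint.h>

#define PGSIZE 4096u
/* Round a up to the next multiple of PGSIZE (no change if already aligned). */
#define PGROUNDUP(a) (((a) + PGSIZE - 1) & ~(PGSIZE - 1))

/* Count the whole pages in [start, stop) that a kinit-style loop would free.
   start need not be page-aligned. Assumes stop is well below 2^32 so that
   p + PGSIZE does not wrap. */
static inline uint32_t count_free_pages(uint32_t start, uint32_t stop) {
    uint32_t n = 0;
    for (uint32_t p = PGROUNDUP(start); p + PGSIZE <= stop; p += PGSIZE)
        n++;   /* a real allocator would free page p here */
    return n;
}
```

For example, count_free_pages(0x1001, 0x4000) skips the partial page at 0x1000 and counts the two pages starting at 0x2000 and 0x3000.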
 123
 124 kfree
 125   linked list
 126   note cast at 2767
 127   2768 is where we depend on phys mem being mapped at a virt addr
 128
 129 kalloc
 130   takes the first element of the free list
 131
 132 Q: how to allocate mem for a data structure (e.g. array) > 4096 bytes?
 133
 134 ****
 135
 136 now let's talk about creating first process and its address space
 137
 138 process execution states:
 139   diagram: user/kernel, process mem, kernel thread, kern stack, pagetable
 140   process might be executing in user space
 141     with cr3 pointing to its page table
 142     user mode, so can only use PTE_U PTEs (addresses < 0x80000000)
 143   or might be in a system call in the kernel
 144     e.g. open() finding a file on disk
 145     process's "kernel thread"
 146     kernel mode, so can use non-PTE_U PTEs
 147     using kernel stack
 148   or not currently executing
 149
 150 xv6 has two kinds of transitions
 151   trap + return: user->kernel, kernel->user
 152     system calls, interrupts, divide-by-zero, &c
 153     hw+sw saves user registers on process's kernel stack
 154     save user process state ... run in kernel ... restore state
 155   process switch: between kernel threads
 156     one process is waiting for input, run another
 157       or time-slicing between compute-bound processes
 158     save p1's kernel-thread state ... restore p2's kernel-thread state
 159
 160 Q: why per-process kernel stack?
 161    what would go wrong if syscall used a single global stack?
 162
 163 how does xv6 store process state?
 164   struct proc sheet 20
 165   kernel proc[] table has an entry for each process
 166   each field ...
 167
 168 Q: you'd expect there to be an array or something of pointers
 169    to the process's user memory. where is it? how does xv6
 170    know what memory a process is using?
 171
 172 ordinarily p->pgdir and phys mem contents created by fork()
 173   we will fake them for first process
 174
 175 ordinarily p->tf and p->context created by syscall/switch
 176   we will fake them for first process
 177
 178 main calls userinit
 179
 180 userinit sheet 22
 181   only called for first process
 182     other processes created by fork
 183   mimics fork+exec
 184     create a normal-looking process
 185     ordinary scheduler will run it
 186   needs to fill in all struct proc entries
 187
 188 allocproc sheet 21
 189   used by both fork and userinit
 190   kernel stack setup:
 191     trapframe w/ "saved user registers"
 192       for us, initial user registers
 193       eax, eip, esp, &c
 194     trapret !!!!!!!!!!!!!!!
 195     context w/ "saved kernel thread registers"
 196       for us, initial kernel thread registers
 197       eip
 198   Q: where will new kernel thread start executing?
 199   doesn't set up trapframe b/c ordinarily copied
 200     from parent by fork, which calls allocproc
 201   but fork and userinit both always start thread in forkret
 202
 203 trapframe sheet 06
 204 context sheet 20
 205
 206 kernel stack diagram:
 207   top ->
 208                 esp, ss
 209                 eip, cs
 210                 ...
 211                 gs fs es ds
 212   p->tf ->      edi & 7 other registers
 213                 ---
 214                 trapret
 215                 ---
 216                 eip = forkret
 217                 ebp
 218                 ebx
 219                 esi
 220   p->context -> edi
 221                 ---
 222   p->kstack ->  ...
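
sketch (not xv6 source): carving the top of a fresh kernel stack in the order the diagram shows. Assumptions made for this example: a 4096-byte stack, a trapframe treated as 19 opaque 32-bit words, and code addresses passed in as plain uint32_t values so the sketch can run on any host. carve_kstack, kstack_new, and struct kstack_layout are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define KSTACKSIZE 4096

/* Saved kernel-thread registers, lowest address first (edi ... eip). */
struct context { uint32_t edi, esi, ebx, ebp, eip; };

/* Stand-in for the trapframe: 19 32-bit words; field layout omitted. */
struct trapframe { uint32_t words[19]; };

struct kstack_layout {
    struct trapframe *tf;     /* plays the role of p->tf */
    uint32_t *trapret_slot;   /* return address used when forkret returns */
    struct context *context;  /* plays the role of p->context */
};

/* Stand-in for allocating one kernel stack. */
unsigned char *kstack_new(void) { return malloc(KSTACKSIZE); }

/* Lay out trapframe, trapret slot, and context from the top down. */
struct kstack_layout carve_kstack(unsigned char *kstack,
                                  uint32_t trapret_addr,
                                  uint32_t forkret_addr) {
    struct kstack_layout l;
    unsigned char *sp = kstack + KSTACKSIZE;   /* top of stack */

    sp -= sizeof(struct trapframe);            /* room for the trapframe */
    l.tf = (struct trapframe *)sp;

    sp -= sizeof(uint32_t);                    /* slot holding trapret */
    l.trapret_slot = (uint32_t *)sp;
    *l.trapret_slot = trapret_addr;

    sp -= sizeof(struct context);              /* saved kernel registers */
    l.context = (struct context *)sp;
    memset(l.context, 0, sizeof(struct context));
    l.context->eip = forkret_addr;             /* new thread starts here */
    return l;
}
```

The context's eip sits directly below the trapret slot, so a switch that pops the context and returns lands in forkret, and forkret's own return then picks up trapret.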
 223
 224 Q: any guesses why there are *two* saved EIPs?
 225
 226 back to userinit
 227   we know setupkvm -- only fills in kernel mappings
 228   this is a new page table for the new process
 229     not using it yet, will switch when new kernel thread starts
 230   call inituvm w/ ptr to new process's user instructions
 231
 232 initcode sheet 75
 233   user program
 234   exec("/init", args)
 235
 236 inituvm sheet 17
 237   we know kalloc and mappages(pgdir, va, sz, pa)
 238   initcode is tiny, fits in one page
 239   diagram: new mapping
 240
 241 Q: new page is mapped at va=0; could inituvm call memmove(0, init, sz)?
 242
 243 back to userinit sheet 22
 244   tf->esp -- user stack at top of page
 245   tf->eip=0 -- first instruction at bottom of page
 246
 247 main calls scheduler()
 248
 249 scheduler sheet 24
 250   no longer initialization: kernel now fully running
 251   whenever process gives up CPU -> scheduler
 252   so kernel runs scheduler a lot
 253   look for a process that wants to run, run it
 254   p->state: SLEEPING, RUNNABLE, RUNNING
 255   scheduler looks for RUNNABLE
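
sketch (not xv6 source): one pass of "look for a process that wants to run". It leaves out locking, the endless outer loop, and the actual context switch, and lists only the states named above plus UNUSED. pick_next, ptable, and NPROC = 8 are assumptions made for this example.

```c
#include <stddef.h>

enum procstate { UNUSED, SLEEPING, RUNNABLE, RUNNING };

struct proc {
    enum procstate state;
    int pid;
};

#define NPROC 8
static struct proc ptable[NPROC];   /* zero-initialized: every slot UNUSED */

/* Scan the table once; mark the first RUNNABLE process RUNNING and
   return it. Returns NULL if nothing wants to run. */
struct proc *pick_next(void) {
    for (struct proc *p = ptable; p < &ptable[NPROC]; p++) {
        if (p->state != RUNNABLE)
            continue;
        p->state = RUNNING;   /* claimed: a second scan will skip it */
        return p;
    }
    return NULL;
}
```

Marking the entry RUNNING before returning is what keeps a second scan (for instance on another CPU, given a lock) from picking the same process.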
 256
 257 switchuvm sheet 17
 258   tell h/w to use p->kstack if process re-enters kernel
 259     sys call or interrupt
 260   load %cr3
 261
 262 let's watch switch to new process's page table:
 263   (gdb) break switchuvm
 264   (gdb) x/5i 0
 265   0x0:    Cannot access memory at address 0x0
 266   next past load %cr3
 267   (gdb) x/5i 0
 268   same as initcode sheet 75
 269   but we are still in the kernel, in scheduler
 270
 271 back to scheduler sheet 24
 272   mark RUNNING so no other CPU runs it
 273   now switch to new process's kernel stack, registers, EIP
 274   swtch(place to save current ESP, previously saved ESP to switch to)
 275
 276 swtch sheet 26
 277   save current thread's registers and stack
 278   load new thread's stack and registers
 279   saves current registers on current stack, struct context, sheet 20
 280   expects new thread's stack to have registers in that format
 281   stack diagram:
 282     eip *****
 283     ebp
 284     ebx
 285     esi
 286     edi
 287   Q: why these registers?
 288     callee saved, might have caller's live variables
 289   same format as struct context
 290
 291 let's watch:
 292   (gdb) break swtch
 293   si until esp switch...
 294   (gdb) x/6x $esp
 295   si past esp switch
 296   (gdb) x/6x $esp
 297   after: 4 regs, forkret, trapret
 298
 299 step into forkret sheet 24
 300   just returns
 301   allocproc set up stack to have it return to trapret
 302   watch out:
 303     release and initlog cause interrupts
 304     so hack source to set first=0 and pushcli
 305     si for iret &c -- si leaves interrupts off
 306   next into trapret
 307
 308 look at trapframe sheet 06
 309   (gdb) x/19x $esp
 310   0x0 0x23 are eip:cs
 311   0x1000 0x2b are esp:ss
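
sketch (not xv6 source): where 0x23 and 0x2b come from. An x86 segment selector is (GDT index << 3) | requested privilege level. The index values 4 and 5 below are assumptions inferred from the dump above; selector and cpl_of are hypothetical helper names.

```c
#include <stdint.h>

#define SEG_UCODE 4   /* assumed GDT index of the user code segment */
#define SEG_UDATA 5   /* assumed GDT index of the user data segment */
#define DPL_USER  3   /* privilege level 3 = user */

/* Build a selector for a GDT entry: index in bits 3 and up, table
   indicator bit (bit 2) left at 0, privilege level in bits 0-1. */
static inline uint16_t selector(unsigned index, unsigned rpl) {
    return (uint16_t)((index << 3) | rpl);
}

/* The low two bits of a CS value are the privilege level it runs at. */
static inline unsigned cpl_of(uint16_t cs) { return cs & 3u; }
```

selector(SEG_UCODE, DPL_USER) is 0x23 and selector(SEG_UDATA, DPL_USER) is 0x2b; cpl_of(0x23) is 3, which is why returning to this cs leaves the CPU in user mode.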
 312
 313 trapret sheet 29
 314   pops trapframe registers from stack, mostly zero
 315   popal pops 8 general-purpose registers
 316   iret pops EIP, CS, EFLAGS, ESP, SS; low bits of CS (3) put CPU in user mode
 317   x/5x $esp
 318   now we are executing at address 0x0 in initcode sheet 75
 319
 320 what does initcode do?
 321   traps back into the kernel to make exec() system call