xv6: lab3 Pgtbl


Lab3: Page tables

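  • This chapter is mainly about page tables.
  • The concrete task is to create a copy of the kernel page table for each process and also add the process's user mappings to that copy, which simplifies the functions that copy data from user space into the kernel.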

Print a page table(easy)

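  • There is not much to say here: follow the hints and walk the page table recursively.
  • pte & PTE_V means the entry is valid. A leaf entry (an entry in a third-level page table) always has at least one of PTE_R | PTE_W | PTE_X set. If all three are clear, the entry points to a lower-level page table, so after printing it we recurse into that table.
// vm.c
void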
print_pte(int index, pte_t pte, uint64 pa, int level)
{
  for (int i = 0; i < level; i++)
    printf(".. ");
  printf("..%d: pte %p pa %p\n", index, pte, pa);
}
​
void
print_pgtb(pagetable_t pagetable, int level)
{
  // there are 2^9 = 512 PTEs in a page table.
  for(int i = 0; i < 512; i++){
    pte_t pte = pagetable[i];
    if(pte & PTE_V){
      // this PTE points to a lower-level page table.
      uint64 child = PTE2PA(pte);
      print_pte(i, pte, child, level);
      if ((pte & (PTE_R|PTE_W|PTE_X)) == 0)
        print_pgtb((pagetable_t)child, level + 1);
    }
  }
}
​
void
vmprint(pagetable_t pagetable)
{
  printf("page table %p\n", pagetable);
  print_pgtb(pagetable, 0);
}
  • Add vmprint() to defs.h
// defs.h
void            vmprint(pagetable_t);
  • Finally, in exec.c, add if (p->pid == 1) vmprint(p->pagetable); just above return argc, and this part is done.
mit6s081@6336ffc64c30:~/xv6-labs-2020$ python grade-lab-pgtbl pte printout
make: 'kernel/kernel' is up to date.
== Test pte printout == pte printout: OK (1.4s)
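The leaf test used by print_pgtb can be tried outside the kernel. The following is a minimal user-space sketch, assuming the flag bits and the PA2PTE/PTE2PA macros match those in xv6's kernel/riscv.h; the helpers pte_is_leaf, pte_is_interior, and make_pte are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pte_t;

/* Sv39 PTE flag bits, assumed to match xv6's kernel/riscv.h. */
#define PTE_V (1L << 0) /* valid */
#define PTE_R (1L << 1) /* readable */
#define PTE_W (1L << 2) /* writable */
#define PTE_X (1L << 3) /* executable */
#define PTE_U (1L << 4) /* user-accessible */

/* Conversions between a physical address and the PPN field of a PTE. */
#define PA2PTE(pa) ((((uint64_t)(pa)) >> 12) << 10)
#define PTE2PA(pte) (((pte) >> 10) << 12)

/* Hypothetical helper: 1 if the PTE is valid and maps a page (leaf). */
int pte_is_leaf(pte_t pte)
{
  return (pte & PTE_V) && (pte & (PTE_R | PTE_W | PTE_X)) != 0;
}

/* Hypothetical helper: 1 if the PTE is valid and points to a
 * lower-level page table (the case where print_pgtb recurses). */
int pte_is_interior(pte_t pte)
{
  return (pte & PTE_V) && (pte & (PTE_R | PTE_W | PTE_X)) == 0;
}

/* Hypothetical helper: build a PTE from a page address and flags. */
pte_t make_pte(uint64_t pa, uint64_t flags)
{
  assert((pa & 0xFFF) == 0); /* page addresses are 4096-byte aligned */
  return PA2PTE(pa) | flags;
}
```

An entry with only PTE_V set is treated as a pointer to the next level, while any entry with a permission bit is a leaf, which is why the recursion in print_pgtb stops there.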

A kernel page table per process (hard)

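  • The task of this lab is to modify the kernel so that every process uses its own copy of the kernel page table while executing in the kernel. Modify struct proc to maintain a kernel page table for each process, and modify the scheduler to switch kernel page tables when switching processes.
  • proc.h: add a new field for the copy of the kernel page table.
// proc.h
// Per-process state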
struct proc {
  struct spinlock lock;
  ...
​
  pagetable_t kpagetable;
  
  ...
};
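  • The next step is to modify vm.c and implement a modified version of kvminit() that initializes a process's kernel page table.
  • The lab also hints that a process's user address space must not grow beyond the PLIC address, so any address below PLIC may end up with a user mapping. Looking at the kernel address space diagram below, CLINT sits below PLIC, so the per-process kernel page table must not map CLINT; otherwise a user mapping could collide with it and trigger panic: remap.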

[Figure: xv6 kernel address space layout]

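  • I implemented an init_kpagetable function that creates a new kernel page table. The global kernel page table still needs to map CLINT.
  • For convenience, kvmmap gets an extra pagetable parameter.
  • kvmpa also needs a change: replace kernel_pagetable with myproc()->kpagetable.
  • Don't forget to add init_kpagetable() to defs.h and to update the kvmmap prototype there.
// vm.c
#include "spinlock.h"
#include "proc.h"

/*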
 * create a direct-map page table for the kernel.
 */
void
kvminit()
{
  kernel_pagetable = init_kpagetable();
  // CLINT
  kvmmap(kernel_pagetable, CLINT, CLINT, 0x10000, PTE_R | PTE_W);
}
​
pagetable_t
init_kpagetable() {
  pagetable_t pagetable = (pagetable_t) kalloc();
  memset(pagetable, 0, PGSIZE);
​
  // uart registers
  kvmmap(pagetable, UART0, UART0, PGSIZE, PTE_R | PTE_W);
​
  // virtio mmio disk interface
  kvmmap(pagetable, VIRTIO0, VIRTIO0, PGSIZE, PTE_R | PTE_W);
​
  // PLIC
  kvmmap(pagetable, PLIC, PLIC, 0x400000, PTE_R | PTE_W);
​
  // map kernel text executable and read-only.
  kvmmap(pagetable, KERNBASE, KERNBASE, (uint64)etext-KERNBASE, PTE_R | PTE_X);
​
  // map kernel data and the physical RAM we'll make use of.
  kvmmap(pagetable, (uint64)etext, (uint64)etext, PHYSTOP-(uint64)etext, PTE_R | PTE_W);
​
  // map the trampoline for trap entry/exit to
  // the highest virtual address in the kernel.
  kvmmap(pagetable, TRAMPOLINE, (uint64)trampoline, PGSIZE, PTE_R | PTE_X);
  return pagetable;
}
​
void
kvmmap(pagetable_t pagetable, uint64 va, uint64 pa, uint64 sz, int perm)
{
  if(mappages(pagetable, va, sz, pa, perm) != 0)
    panic("kvmmap");
}
​
uint64
kvmpa(uint64 va)
{
  uint64 off = va % PGSIZE;
  pte_t *pte;
  uint64 pa;
  
  pte = walk(myproc()->kpagetable, va, 0);
  if(pte == 0)
    panic("kvmpa");
  if((*pte & PTE_V) == 0)
    panic("kvmpa");
  pa = PTE2PA(*pte);
  return pa+off;
}
​
// defs.h
void            kvmmap(pagetable_t, uint64, uint64, uint64, int);
pagetable_t     init_kpagetable();
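To see why CLINT is left out of the per-process table, it helps to compare the addresses. This is a small sketch, assuming the CLINT and PLIC values from xv6's kernel/memlayout.h and the 0x10000 size used in kvminit() above; CLINT_SIZE and user_overlaps_clint are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Device addresses, assumed to match xv6's kernel/memlayout.h. */
#define CLINT 0x2000000L
#define PLIC 0x0c000000L
#define CLINT_SIZE 0x10000L /* size of the CLINT mapping in kvminit() */

/* Hypothetical helper: 1 if user memory [0, sz) would overlap the
 * CLINT range that starts at CLINT. */
int user_overlaps_clint(uint64_t sz)
{
  assert(sz <= (uint64_t)PLIC); /* user memory is capped at PLIC */
  return sz > (uint64_t)CLINT;
}
```

Because CLINT lies below PLIC, a process that grows past 0x2000000 bytes would need kernel-page-table entries at the same virtual addresses as a CLINT mapping, which is the collision that mappages reports as a remap.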
  • In allocproc() in proc.c, initialize kpagetable, and move the kernel stack mapping code from procinit into allocproc so that the stack is mapped in p->kpagetable.
// proc.c
void
procinit(void)
{
  struct proc *p;
  
  initlock(&pid_lock, "nextpid");
  for(p = proc; p < &proc[NPROC]; p++) {
      initlock(&p->lock, "proc");
  }
  kvminithart();
}
​
static struct proc*
allocproc(void)
{
  struct proc *p;
​
  ...
  // An empty user page table.
  p->pagetable = proc_pagetable(p);
  if(p->pagetable == 0){
    freeproc(p);
    release(&p->lock);
    return 0;
  }
  /* start */
  p->kpagetable = init_kpagetable();
​
  // Allocate a page for the process's kernel stack.
  // Map it high in memory, followed by an invalid
  // guard page.
  char *pa = kalloc();
  if(pa == 0)
    panic("kalloc");
  uint64 va = KSTACK((int) (p - proc));
  kvmmap(p->kpagetable, va, (uint64)pa, PGSIZE, PTE_R | PTE_W);
  p->kstack = va;
  /* end */
​
  // Set up new context to start executing at forkret,
  // which returns to user space.
  memset(&p->context, 0, sizeof(p->context));
  p->context.ra = (uint64)forkret;
  p->context.sp = p->kstack + PGSIZE;
​
  return p;
}
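  • In freeproc(), free the process's kernel page table without freeing the physical memory that its leaf entries point to. I implemented free_pgtl, modeled on freewalk: if an entry is valid, first save it in pte, then clear the entry; if it is not a leaf, recurse to free the next-level page table.
  • As before, add free_pgtl() to defs.h.
  • The kernel stack (kstack) must also be freed, using uvmunmap. One page is released, so the third argument (npages) is 1, and since the stack's physical memory must be freed, the fourth argument (do_free) is 1. The kernel stack must be freed before the kernel page table: once the page table is gone its mappings no longer exist, and the stack's entry can no longer be found.
// vm.c
void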
free_pgtl(pagetable_t pagetable)
{
  // there are 2^9 = 512 PTEs in a page table.
  for(int i = 0; i < 512; i++){
    pte_t pte = pagetable[i];
    if(pte & PTE_V){
      pagetable[i] = 0;
      if ((pte & (PTE_R|PTE_W|PTE_X)) == 0)
      {
        // this PTE points to a lower-level page table.
        uint64 child = PTE2PA(pte);
        free_pgtl((pagetable_t) child);
      }
      
    }
  }
  kfree((void*)pagetable);
}
​
// defs.h
void            free_pgtl(pagetable_t);
​
// proc.c
static void
freeproc(struct proc *p)
{
  if(p->trapframe)
    kfree((void*)p->trapframe);
  p->trapframe = 0;
  if(p->pagetable)
    proc_freepagetable(p->pagetable, p->sz);
  p->pagetable = 0;
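  // start
  if(p->kpagetable){
    // Unmap and free the kernel stack first, while its mapping still exists.
    // The check also covers allocproc() failing before kpagetable is set.
    if(p->kstack)
      uvmunmap(p->kpagetable, p->kstack, 1, 1);
    free_pgtl(p->kpagetable);
  }
  p->kstack = 0;
  p->kpagetable = 0;
  // end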
  p->sz = 0;
  p->pid = 0;
  p->parent = 0;
  p->name[0] = 0;
  p->chan = 0;
  p->killed = 0;
  p->xstate = 0;
  p->state = UNUSED;
}
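The idea of freeing only the page-table pages can be checked in user space. Below is a rough simulation, not kernel code: physical memory is modeled as a small array of pages, and sim_kalloc, sim_kfree, sim_free_pgtl, and demo_remaining_pages are hypothetical stand-ins for kalloc, kfree, free_pgtl, and a test harness. The PTE macros are assumed to match xv6's kernel/riscv.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PGSIZE 4096
#define NPAGES 8 /* size of the simulated physical memory, in pages */
#define PTE_V (1L << 0)
#define PTE_R (1L << 1)
#define PTE_W (1L << 2)
#define PTE_X (1L << 3)
#define PA2PTE(pa) ((((uint64_t)(pa)) >> 12) << 10)
#define PTE2PA(pte) (((pte) >> 10) << 12)

static uint64_t phys[NPAGES][512]; /* one row of 512 PTEs per page */
static int in_use[NPAGES];

/* Stand-in for kalloc(): returns the simulated physical address of a
 * zeroed page, or 0 if none is left. Page 0 is never handed out. */
uint64_t sim_kalloc(void)
{
  for (uint64_t i = 1; i < NPAGES; i++) {
    if (!in_use[i]) {
      in_use[i] = 1;
      memset(phys[i], 0, PGSIZE);
      return i * PGSIZE;
    }
  }
  return 0;
}

/* Stand-in for kfree(). */
void sim_kfree(uint64_t pa)
{
  in_use[pa / PGSIZE] = 0;
}

/* Same shape as free_pgtl: free table pages only, never leaf targets. */
void sim_free_pgtl(uint64_t table_pa)
{
  assert(table_pa % PGSIZE == 0 && table_pa / PGSIZE < NPAGES);
  uint64_t *table = phys[table_pa / PGSIZE];
  for (int i = 0; i < 512; i++) {
    uint64_t pte = table[i];
    if ((pte & PTE_V) == 0)
      continue;
    table[i] = 0;
    if ((pte & (PTE_R | PTE_W | PTE_X)) == 0)
      sim_free_pgtl(PTE2PA(pte)); /* interior entry: recurse */
  }
  sim_kfree(table_pa);
}

/* Build root -> middle -> bottom -> data page, free the tables, and
 * return how many simulated pages are still allocated. */
int demo_remaining_pages(void)
{
  uint64_t root = sim_kalloc(), mid = sim_kalloc();
  uint64_t bottom = sim_kalloc(), data = sim_kalloc();
  phys[root / PGSIZE][0] = PA2PTE(mid) | PTE_V;
  phys[mid / PGSIZE][0] = PA2PTE(bottom) | PTE_V;
  phys[bottom / PGSIZE][0] = PA2PTE(data) | PTE_V | PTE_R | PTE_W;
  sim_free_pgtl(root);

  int count = 0;
  for (int i = 0; i < NPAGES; i++)
    count += in_use[i];
  return count;
}
```

After the first call to demo_remaining_pages, the three table pages have been released and only the data page is still marked as allocated, which mirrors how the shared kernel memory survives when a per-process kernel page table is torn down.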
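  • Finally, the scheduler must load the process's kernel page table into the core's satp register. I implemented kpgtl_inithart in vm.c, modeled on kvminithart, and the scheduler switches back to the global kernel page table when no process is running.
// vm.c
void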
kpgtl_inithart(pagetable_t pagetable)
{
  w_satp(MAKE_SATP(pagetable));
  sfence_vma();
}
​
// defs.h
void            kpgtl_inithart(pagetable_t);
​
// proc.c
void
scheduler(void)
{
  struct proc *p;
  struct cpu *c = mycpu();
  
  c->proc = 0;
  for(;;){
    // Avoid deadlock by ensuring that devices can interrupt.
    intr_on();
    
    int found = 0;
    for(p = proc; p < &proc[NPROC]; p++) {
      acquire(&p->lock);
      if(p->state == RUNNABLE) {
        // Switch to chosen process.  It is the process's job
        // to release its lock and then reacquire it
        // before jumping back to us.
        p->state = RUNNING;
        c->proc = p;
        kpgtl_inithart(p->kpagetable);              //TODO
        swtch(&c->context, &p->context);
​
        // Process is done running for now.
        // It should have changed its p->state before coming back.
        c->proc = 0;
        kvminithart();                              // TODO
        found = 1;
      }
      release(&p->lock);
    }
#if !defined (LAB_FS)
    if(found == 0) {
      intr_on();
      asm volatile("wfi");
    }
#else
    ;
#endif
  }
}
  • Finally, run the tests
mit6s081@6336ffc64c30:~/xv6-labs-2020$ python grade-lab-pgtbl usertests
make: 'kernel/kernel' is up to date.
== Test usertests == (132.7s) 
== Test   usertests: copyin == 
  usertests: copyin: OK 
== Test   usertests: copyinstr1 == 
  usertests: copyinstr1: OK 
== Test   usertests: copyinstr2 == 
  usertests: copyinstr2: OK 
== Test   usertests: copyinstr3 == 
  usertests: copyinstr3: OK 
== Test   usertests: sbrkmuch == 
  usertests: sbrkmuch: OK 
== Test   usertests: all tests == 
  usertests: all tests: OK 
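What kpgtl_inithart writes into satp can be illustrated with plain integer arithmetic. This is a sketch under the assumption that MAKE_SATP in xv6's kernel/riscv.h combines the Sv39 mode (value 8 in the top four bits) with the root page table's physical page number; make_satp, satp_mode, and satp_root_pa are hypothetical helper names, and the cast to uint64_t replaces the signed shift used in the kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Mode field of satp: 8 selects Sv39 paging. */
#define SATP_SV39 ((uint64_t)8 << 60)

/* Hypothetical helper: the value written to satp for a root page
 * table located at physical address pagetable_pa. */
uint64_t make_satp(uint64_t pagetable_pa)
{
  assert((pagetable_pa & 0xFFF) == 0); /* root table is page-aligned */
  return SATP_SV39 | (pagetable_pa >> 12);
}

/* Hypothetical helper: extract the 4-bit mode field. */
uint64_t satp_mode(uint64_t satp)
{
  return satp >> 60;
}

/* Hypothetical helper: recover the root table address from the
 * 44-bit physical page number in the low bits of satp. */
uint64_t satp_root_pa(uint64_t satp)
{
  return (satp & (((uint64_t)1 << 44) - 1)) << 12;
}
```

Switching kernel page tables in the scheduler therefore only changes the page-number field; the sfence_vma() that follows the write flushes stale TLB entries from the previous table.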

Simplify copyin/copyinstr(hard)

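  • This part adds the user-space mappings to each process's kernel page table (created in the previous part).

  • First, add the two functions from vmcopyin.c, copyin_new and copyinstr_new, to defs.h, then replace the bodies of copyin and copyinstr with calls to the new functions.

  • Then implement a copy function in vm.c that copies page-table entries or removes some of them, and add it to defs.h.

  • I also added walk to defs.h. If you don't, remember to define the copy function below walk in vm.c, or it will not compile.

  • pgbl_copy:

    • PGROUNDUP(oldsz) <= PGROUNDUP(newsz): entries need to be added to the new page table. Use walk to find each PTE in old, then extract its physical address and flags. Note that the flags in the process's kernel page table must not include PTE_U, otherwise the kernel cannot access the page in supervisor mode. Also, oldsz and newsz are in bytes, so each loop iteration advances by PGSIZE.
    • PGROUNDUP(oldsz) > PGROUNDUP(newsz): entries need to be removed, so the old page table is not involved. Compute npages, making sure to divide the byte difference by PGSIZE, because npages is the number of pages to unmap. Then call uvmunmap directly, with the last argument set to 0: we only remove the mappings and must not free the physical memory.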
// vm.c

void
pgbl_copy(pagetable_t old, pagetable_t new, uint64 oldsz, uint64 newsz) {
  pte_t *pte;
  uint64 pa, i;
  uint flags;
  if (PGROUNDUP(oldsz) <= PGROUNDUP(newsz))
  {
    for(i = PGROUNDUP(oldsz); i < newsz; i += PGSIZE){
      if((pte = walk(old, i, 0)) == 0)
        panic("pgbl_copy: pte should exist");
      if((*pte & PTE_V) == 0)
        panic("pgbl_copy: page not present");
      pa = PTE2PA(*pte);
      flags = PTE_FLAGS(*pte) & (~PTE_U);
      if(mappages(new, i, PGSIZE, pa, flags) != 0){
        // Undo the pages mapped so far and stop; continuing would leave a hole.
        uvmunmap(new, PGROUNDUP(oldsz), (i - PGROUNDUP(oldsz)) / PGSIZE, 0);
        return;
      }
    }
  } else {
    // Parenthesize the difference: division binds tighter than subtraction.
    uint64 npages = (PGROUNDUP(oldsz) - PGROUNDUP(newsz)) / PGSIZE;
    uvmunmap(new, PGROUNDUP(newsz), npages, 0);
  }
}

int
copyin(pagetable_t pagetable, char *dst, uint64 srcva, uint64 len)
{
  return copyin_new(pagetable, dst, srcva, len);
}

int
copyinstr(pagetable_t pagetable, char *dst, uint64 srcva, uint64 max)
{
  return copyinstr_new(pagetable, dst, srcva, max);
}

// defs.h
void            pgbl_copy(pagetable_t, pagetable_t, uint64, uint64);
pte_t*          walk(pagetable_t, uint64, int);
int             copyin_new(pagetable_t, char *, uint64, uint64);
int             copyinstr_new(pagetable_t, char *, uint64, uint64);
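The two calculations in pgbl_copy that are easiest to get wrong are the page count for shrinking and the flag mask. The following is a minimal sketch, assuming PGSIZE, PTE_U, and PTE_FLAGS match xv6's kernel/riscv.h; shrink_npages and kernel_flags are hypothetical helpers, not functions from the lab.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096
#define PGROUNDUP(sz) (((sz) + PGSIZE - 1) & ~(uint64_t)(PGSIZE - 1))
#define PTE_U (1L << 4)
#define PTE_FLAGS(pte) ((pte) & 0x3FF)

/* Hypothetical helper: number of pages to unmap when a process
 * shrinks from oldsz to newsz bytes. The subtraction must be
 * parenthesized before dividing by PGSIZE. */
uint64_t shrink_npages(uint64_t oldsz, uint64_t newsz)
{
  assert(PGROUNDUP(oldsz) >= PGROUNDUP(newsz));
  return (PGROUNDUP(oldsz) - PGROUNDUP(newsz)) / PGSIZE;
}

/* Hypothetical helper: flags for the kernel page table copy of a
 * user PTE, with PTE_U cleared so supervisor mode may access it. */
uint64_t kernel_flags(uint64_t pte)
{
  return PTE_FLAGS(pte) & ~(uint64_t)PTE_U;
}
```

For example, shrinking from three pages to one should unmap two pages; without the parentheses the expression would evaluate to 3 * 4096 minus 1, a page count far beyond what is mapped.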
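  • Next, three places need to change: fork(), exec(), and sbrk().
  • fork() is defined in proc.c. The copy has to be done exactly this way, copying the child process's pagetable into its kpagetable; other ways of copying produced errors.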
// proc.c

int
fork(void)
{
  int i, pid;
  struct proc *np;
  struct proc *p = myproc();

  // Allocate process.
  if((np = allocproc()) == 0){
    return -1;
  }

  // Copy user memory from parent to child.
  if(uvmcopy(p->pagetable, np->pagetable, p->sz) < 0){
    freeproc(np);
    release(&np->lock);
    return -1;
  }
  np->sz = p->sz;
  pgbl_copy(np->pagetable, np->kpagetable, 0, np->sz);		//TODO

  np->parent = p;

  // copy saved user registers.
  *(np->trapframe) = *(p->trapframe);

  // Cause fork to return 0 in the child.
  np->trapframe->a0 = 0;

  // increment reference counts on open file descriptors.
  for(i = 0; i < NOFILE; i++)
    if(p->ofile[i])
      np->ofile[i] = filedup(p->ofile[i]);
  np->cwd = idup(p->cwd);

  safestrcpy(np->name, p->name, sizeof(p->name));

  pid = np->pid;

  np->state = RUNNABLE;

  release(&np->lock);

  return pid;
}
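  • exec.c
  • When loading the program, check whether it would reach the PLIC address; if so, goto bad.
  • Then, after exec has finished working with the old pagetable, remove all user mappings from the old kpagetable and add everything in the new pagetable to kpagetable. Calling the copy function directly without unmapping first triggers panic: remap. When unmapping, PGROUNDUP(oldsz) / PGSIZE is the number of user pages mapped in the old kpagetable.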
// exec.c

int
exec(char *path, char **argv)
{
  ...

  // Load program into memory.
  for(i=0, off=elf.phoff; i<elf.phnum; i++, off+=sizeof(ph)){
    if(readi(ip, 0, (uint64)&ph, off, sizeof(ph)) != sizeof(ph))
      goto bad;
    if(ph.type != ELF_PROG_LOAD)
      continue;
    if(ph.memsz < ph.filesz)
      goto bad;
    if(ph.vaddr + ph.memsz < ph.vaddr)
      goto bad;
    if((ph.vaddr + ph.memsz) >= PLIC)						// TODO
      goto bad;												// TODO
    uint64 sz1;
    if((sz1 = uvmalloc(pagetable, sz, ph.vaddr + ph.memsz)) == 0)
      goto bad;
    sz = sz1;
    if(ph.vaddr % PGSIZE != 0)
      goto bad;
    if(loadseg(pagetable, ph.vaddr, ip, ph.off, ph.filesz) < 0)
      goto bad;
  }
  iunlockput(ip);
  end_op();
  ip = 0;

  p = myproc();
  uint64 oldsz = p->sz;

  ...

  // Save program name for debugging.
  for(last=s=path; *s; s++)
    if(*s == '/')
      last = s+1;
  safestrcpy(p->name, last, sizeof(p->name));
  uvmunmap(p->kpagetable, 0, PGROUNDUP(oldsz) / PGSIZE, 0);			// TODO
  pgbl_copy(pagetable, p->kpagetable, 0, sz);						// TODO
  ...
}
  • sbrk is a system call. The syscalls array in syscall.c leads to sys_sbrk, whose implementation is in sysproc.c; it in turn calls growproc, which is defined in proc.c. Here we also need to check whether the process would grow past PLIC.
// syscall.c
static uint64 (*syscalls[])(void) = {
	...
[SYS_sbrk]    sys_sbrk,
    ...
};

// sysproc.c

uint64
sys_sbrk(void)
{
  int addr;
  int n;

  if(argint(0, &n) < 0)
    return -1;
  addr = myproc()->sz;
  if(growproc(n) < 0)
    return -1;
  return addr;
}

// proc.c

int
growproc(int n)
{
  uint sz;
  struct proc *p = myproc();

  sz = p->sz;
  if(n > 0){
    if((sz + n) > PLIC)										// TODO
      return -1;											// TODO
    if((sz = uvmalloc(p->pagetable, sz, sz + n)) == 0) {
      return -1;
    }
  } else if(n < 0){
    sz = uvmdealloc(p->pagetable, sz, sz + n);
  }
  pgbl_copy(p->pagetable, p->kpagetable, sz - n, sz);		// TODO
  p->sz = sz;
  return 0;
}
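The bound check and the sz - n argument in growproc can be sketched on their own. This is only an illustration, assuming the PLIC value from xv6's kernel/memlayout.h; growth_allowed and size_before are hypothetical helpers that mirror the arithmetic and use 64-bit types rather than the kernel's uint.

```c
#include <assert.h>
#include <stdint.h>

#define PLIC 0x0c000000L

/* Hypothetical helper mirroring growproc's check: 1 if growing a
 * process of size sz by n bytes keeps it at or below PLIC. */
int growth_allowed(uint64_t sz, int n)
{
  assert(n > 0); /* only growth needs the check */
  return sz + (uint64_t)n <= (uint64_t)PLIC;
}

/* Hypothetical helper: the size before a change of n bytes, given
 * the size after it. This is the "sz - n" passed to pgbl_copy, and
 * it works for both growth (n > 0) and shrinking (n < 0). */
uint64_t size_before(uint64_t newsz, int n)
{
  return newsz - (uint64_t)(int64_t)n;
}
```

Passing the old size as sz - n lets a single pgbl_copy call handle both directions: when n is negative the old size is larger than the new one, so the unmap branch runs.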
  • If you run make qemu at this point, it fails, because one place was overlooked: userinit in proc.c also modifies the pagetable, so it needs the same copy. With that, the lab is finally complete.
void
userinit(void)
{
  struct proc *p;

  p = allocproc();
  initproc = p;
  
  // allocate one user page and copy init's instructions
  // and data into it.
  uvminit(p->pagetable, initcode, sizeof(initcode));
  p->sz = PGSIZE;
  pgbl_copy(p->pagetable, p->kpagetable, 0, p->sz);				// TODO

  // prepare for the very first "return" from kernel to user.
  p->trapframe->epc = 0;      // user program counter
  p->trapframe->sp = PGSIZE;  // user stack pointer

  safestrcpy(p->name, "initcode", sizeof(p->name));
  p->cwd = namei("/");

  p->state = RUNNABLE;

  release(&p->lock);
}
  • Run make grade
== Test pte printout == 
$ make qemu-gdb
pte printout: OK (2.1s) 
== Test answers-pgtbl.txt == answers-pgtbl.txt: OK 
== Test count copyin == 
$ make qemu-gdb
count copyin: OK (0.8s) 
== Test usertests == 
$ make qemu-gdb
(118.8s) 
== Test   usertests: copyin == 
  usertests: copyin: OK 
== Test   usertests: copyinstr1 == 
  usertests: copyinstr1: OK 
== Test   usertests: copyinstr2 == 
  usertests: copyinstr2: OK 
== Test   usertests: copyinstr3 == 
  usertests: copyinstr3: OK 
== Test   usertests: sbrkmuch == 
  usertests: sbrkmuch: OK 
== Test   usertests: all tests == 
  usertests: all tests: OK 
== Test time == 
time: OK 
Score: 66/66
  • Congratulations on finishing this lab! It really is demanding; I spent a long time on it.