<title>Lecture 5</title>
<html>
<head>
</head>
<body>

<h2>Address translation and sharing using page tables</h2>

<p> Reading: <a href="../readings/i386/toc.htm">80386</a> chapters 5 and 6<br>

<p> Handout: <b> x86 address translation diagram</b> -
<a href="x86_translation.ps">PS</a> -
<a href="x86_translation.eps">EPS</a> -
<a href="x86_translation.fig">xfig</a>
<br>

<p>Why do we care about x86 address translation?
<ul>
<li>It can simplify software structure by placing data at fixed, known addresses.
<li>It can implement tricks like demand paging and copy-on-write.
<li>It can isolate programs to contain bugs.
<li>It can isolate programs to increase security.
<li>JOS uses paging a lot, and segments more than you might think.
</ul>

<p>Why aren't protected-mode segments enough?
<ul>
<li>Why did the 386 add translation using page tables as well?
<li>Isn't it enough to give each process its own segments?
</ul>

<p>Translation using page tables on x86:
<ul>
<li>paging hardware maps linear address (la) to physical address (pa)
<li>(we will often interchange "linear" and "virtual")
<li>page size is 4096 bytes, so there are 1,048,576 (2^20) pages in the 2^32-byte address space
<li>why not just have a big array with each page #'s translation?
<ul>
<li>table[20-bit linear page #] => 20-bit phys page #
</ul>
<li>the 386 uses a 2-level mapping structure
<li>one page directory page, with 1024 page directory entries (PDEs)
<li>up to 1024 page table pages, each with 1024 page table entries (PTEs)
<li>so la has 10 bits of directory index, 10 bits table index, 12 bits offset
<li>What's in a PDE or PTE?
<ul>
<li>20-bit phys page number, present, read/write, user/supervisor
</ul>
<li>cr3 register holds physical address of current page directory
<li>puzzle: what do PDE read/write and user/supervisor flags mean?
<li>puzzle: can supervisor read/write user pages?

<li>Here's how the MMU translates an la to a pa:

<pre>
  uint
  translate (uint la, bool user, bool write)
  {
    uint pde, pte;
    // top 10 bits of la index the page directory
    pde = read_mem (%CR3 + 4*(la >> 22));
    access (pde, user, write);
    // middle 10 bits of la index the page table
    pte = read_mem ( (pde & 0xfffff000) + 4*((la >> 12) & 0x3ff));
    access (pte, user, write);
    // low 12 bits of la are the offset within the page
    return (pte & 0xfffff000) + (la & 0xfff);
  }

  // check protection. pxe is a pte or pde.
  // user is true if CPL==3
  void
  access (uint pxe, bool user, bool write)
  {
    if (!(pxe & PG_P))
       => page fault -- page not present
    if (!(pxe & PG_U) && user)
       => page fault -- no access for user

    if (write && !(pxe & PG_W))
      if (user)
         => page fault -- not writable
      else if (!(pxe & PG_U))
        => page fault -- not writable
      else if (%CR0 & CR0_WP)
        => page fault -- not writable
  }
</pre>

<li>CPU's TLB caches vpn => ppn mappings
<li>if you change a PDE or PTE, you must flush the TLB!
<ul>
<li>by re-loading cr3
</ul>
<li>turn on paging by setting CR0_PG bit of %cr0
</ul>

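<p>The two-level lookup described above can be tried out in user space. The
following is only a sketch under stated assumptions: "physical memory" is a
small array, and the names <code>phys</code>, <code>write_mem</code>,
<code>walk</code> and <code>map_page</code> are made up for illustration (only
<code>read_mem</code> and the bit layout come from the notes). Protection
checks other than the present bit are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulation: physical memory is a small array of 32-bit words. */
#define PGSIZE   4096u
#define NFRAMES  8u                        /* 32 KiB of simulated RAM      */
#define PG_P     0x001u                    /* present bit (bit 0 on x86)   */

#define PDX(la)   (((la) >> 22) & 0x3ffu)  /* top 10 bits: directory index */
#define PTX(la)   (((la) >> 12) & 0x3ffu)  /* middle 10 bits: table index  */
#define PGOFF(la) ((la) & 0xfffu)          /* low 12 bits: byte offset     */

static uint32_t phys[NFRAMES * PGSIZE / 4];

/* Read the 32-bit word at physical address pa. */
uint32_t read_mem(uint32_t pa) {
    assert(pa % 4 == 0 && pa < NFRAMES * PGSIZE);
    return phys[pa / 4];
}

/* Write the 32-bit word at physical address pa. */
void write_mem(uint32_t pa, uint32_t val) {
    assert(pa % 4 == 0 && pa < NFRAMES * PGSIZE);
    phys[pa / 4] = val;
}

/* Walk the two-level structure rooted at cr3.
 * Returns 0 and stores the result in *pa on success; returns -1 where
 * the real MMU would raise a page fault because an entry is not present. */
int walk(uint32_t cr3, uint32_t la, uint32_t *pa) {
    uint32_t pde = read_mem(cr3 + 4 * PDX(la));
    if (!(pde & PG_P))
        return -1;
    uint32_t pte = read_mem((pde & 0xfffff000u) + 4 * PTX(la));
    if (!(pte & PG_P))
        return -1;
    *pa = (pte & 0xfffff000u) + PGOFF(la);
    return 0;
}

/* Install la -> pa using the page table page at pt_pa.
 * Choosing free frames is the caller's job in this sketch. */
void map_page(uint32_t cr3, uint32_t pt_pa, uint32_t la, uint32_t pa) {
    write_mem(cr3 + 4 * PDX(la), pt_pa | PG_P);
    write_mem(pt_pa + 4 * PTX(la), (pa & 0xfffff000u) | PG_P);
}
```

<p>For example, la 0x00801234 splits into directory index 2, table index 1 and
offset 0x234. After <code>map_page(0x1000, 0x2000, 0x00801234, 0x5000)</code>,
<code>walk</code> with cr3 = 0x1000 yields pa 0x5234.
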
<p>Can we use paging to limit what memory an app can read/write?
<ul>
<li>user can't modify cr3 (requires privilege)
<li>is that enough?
<li>could user modify page tables? after all, they are in memory.
</ul>

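<p>One way to think about these questions is to write down what the hardware
checks for a user-mode (CPL==3) access. The sketch below covers only the user
case of the <code>access</code> pseudocode; the function names are
hypothetical, and the flag values are the x86 bit positions for present,
read/write and user/supervisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PG_P 0x001u   /* present         */
#define PG_W 0x002u   /* writable        */
#define PG_U 0x004u   /* user-accessible */

/* True if a single entry (PDE or PTE) permits a user-mode access. */
bool entry_allows_user(uint32_t pxe, bool write) {
    if (!(pxe & PG_P))
        return false;             /* not present */
    if (!(pxe & PG_U))
        return false;             /* supervisor-only */
    if (write && !(pxe & PG_W))
        return false;             /* read-only */
    return true;
}

/* Both levels are checked, so the more restrictive entry wins. */
bool user_may_access(uint32_t pde, uint32_t pte, bool write) {
    return entry_allows_user(pde, write) && entry_allows_user(pte, write);
}

/* A few illustrative cases. */
void check_examples(void) {
    uint32_t rw_user = PG_P | PG_W | PG_U;
    uint32_t ro_user = PG_P | PG_U;
    uint32_t rw_kern = PG_P | PG_W;
    assert(user_may_access(rw_user, rw_user, true));
    assert(user_may_access(rw_user, ro_user, false));
    assert(!user_may_access(rw_user, ro_user, true));   /* PTE is read-only */
    assert(!user_may_access(rw_user, rw_kern, false));  /* PTE lacks PG_U   */
}
```

<p>So if the kernel never sets PG_U (or never sets PG_W) on any mapping of the
pages that hold page tables, a user program cannot write them even though they
live in ordinary memory.
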
<p>How we will use paging (and segments) in JOS:
<ul>
<li>use segments only to switch privilege level into/out of kernel
<li>use paging to structure process address space
<li>use paging to limit process memory access to its own address space
<li>below is the JOS virtual memory map
<li>why map both kernel and current process? why not 4GB for each?
<li>why is the kernel at the top?
<li>why map all of phys mem at the top? i.e. why multiple mappings?
<li>why map page table a second time at VPT?
<li>why map page table a third time at UVPT?
<li>how do we switch mappings for a different process?
</ul>

<pre>
    4 Gig -------->  +------------------------------+
                     |                              | RW/--
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                     :              .               :
                     :              .               :
                     :              .               :
                     |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~| RW/--
                     |                              | RW/--
                     |   Remapped Physical Memory   | RW/--
                     |                              | RW/--
    KERNBASE ----->  +------------------------------+ 0xf0000000
                     |  Cur. Page Table (Kern. RW)  | RW/--  PTSIZE
    VPT,KSTACKTOP--> +------------------------------+ 0xefc00000      --+
                     |         Kernel Stack         | RW/--  KSTKSIZE   |
                     | - - - - - - - - - - - - - - -|                 PTSIZE
                     |      Invalid Memory          | --/--             |
    ULIM     ------> +------------------------------+ 0xef800000      --+
                     |  Cur. Page Table (User R-)   | R-/R-  PTSIZE
    UVPT      ---->  +------------------------------+ 0xef400000
                     |          RO PAGES            | R-/R-  PTSIZE
    UPAGES    ---->  +------------------------------+ 0xef000000
                     |           RO ENVS            | R-/R-  PTSIZE
 UTOP,UENVS ------>  +------------------------------+ 0xeec00000
 UXSTACKTOP -/       |     User Exception Stack     | RW/RW  PGSIZE
                     +------------------------------+ 0xeebff000
                     |       Empty Memory           | --/--  PGSIZE
    USTACKTOP  --->  +------------------------------+ 0xeebfe000
                     |      Normal User Stack       | RW/RW  PGSIZE
                     +------------------------------+ 0xeebfd000
                     |                              |
                     |                              |
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                     .                              .
                     .                              .
                     .                              .
                     |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
                     |     Program Data & Heap      |
    UTEXT -------->  +------------------------------+ 0x00800000
    PFTEMP ------->  |       Empty Memory           |        PTSIZE
                     |                              |
    UTEMP -------->  +------------------------------+ 0x00400000
                     |       Empty Memory           |        PTSIZE
    0 ------------>  +------------------------------+
</pre>

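<p>The addresses in the map line up with page-directory entries, which is easy
to check with a little arithmetic. The sketch below copies the constants from
the map; the helper names <code>pdes_between</code> and
<code>check_layout</code> are made up for illustration, and PTSIZE is assumed
to be the 4MB mapped by one page table (1024 entries of 4096 bytes).

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE     4096u
#define NPTENTRIES 1024u
#define PTSIZE     (PGSIZE * NPTENTRIES)  /* 4MB: bytes mapped by one PDE */

/* Addresses copied from the memory map. */
#define KERNBASE 0xf0000000u
#define VPT      0xefc00000u
#define ULIM     0xef800000u
#define UVPT     0xef400000u
#define UPAGES   0xef000000u
#define UTOP     0xeec00000u
#define UTEXT    0x00800000u

#define PDX(la) (((la) >> 22) & 0x3ffu)   /* page-directory index of la */

/* Number of page-directory entries covering [lo, hi).
 * hi == 0 stands for the 4GB boundary, which does not fit in 32 bits. */
uint32_t pdes_between(uint32_t lo, uint32_t hi) {
    uint64_t end = hi ? (uint64_t)hi : (1ull << 32);
    return (uint32_t)((end - lo) / PTSIZE);
}

/* Consistency checks on the map. */
void check_layout(void) {
    assert(KERNBASE - VPT == PTSIZE);     /* kernel's view of the page tables */
    assert(VPT - ULIM == PTSIZE);         /* kernel stack + invalid memory    */
    assert(ULIM - UVPT == PTSIZE);        /* user's read-only view            */
    assert(UVPT - UPAGES == PTSIZE);
    assert(UPAGES - UTOP == PTSIZE);
    assert(PDX(UVPT) == 0x3bd);
    assert(PDX(VPT) == 0x3bf);
    assert(PDX(KERNBASE) == 0x3c0);
    assert(pdes_between(KERNBASE, 0) == 64);  /* 64 * 4MB = 256MB remapped */
}
```

<p>Each PTSIZE region between UTOP and KERNBASE is exactly one page-directory
entry, and the remapped physical memory above KERNBASE is 64 entries, i.e.
256MB.
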
<h3>The VPT </h3>

<p>Remember how the x86 translates virtual addresses into physical ones:

<p><img src=pagetables.png>

<p>CR3 points at the page directory.  The PDX part of the address
indexes into the page directory to give you a page table.  The
PTX part indexes into the page table to give you a page, and then
you add the low bits in.

<p>But the processor has no concept of page directories, page tables,
and pages being anything other than plain memory.  So there's nothing
that says a particular page in memory can't serve as two or three of
these at once.  The processor just follows pointers:

<pre>
pd = rcr3();          // read CR3: address of the page directory
pt = *(pd+4*PDX);     // follow the PDE to a page table
page = *(pt+4*PTX);   // follow the PTE to a page
</pre>

<p>Diagrammatically, it starts at CR3, follows three arrows, and then stops.

<p>If we put a pointer into the page directory that points back to itself at
index V, as in

<p><img src=vpt.png>

<p>then when we try to translate a virtual address with PDX and PTX
equal to V, following three arrows leaves us at the page directory.
So that virtual page translates to the page holding the page directory.
In JOS, V is 0x3BD, so the virtual address of the VPD is
(0x3BD<<22)|(0x3BD<<12).


<p>Now, if we try to translate a virtual address with PDX = V but an
arbitrary PTX != V, then following three arrows from CR3 ends
one level up from usual (instead of two as in the last case),
which is to say in the page tables.  So the set of virtual pages
with PDX=V forms a 4MB region whose page contents, as far
as the processor is concerned, are the page tables themselves.
In JOS, V is 0x3BD so the virtual address of the VPT is (0x3BD<<22).
(That is 0xef400000, the address labeled UVPT in the memory map above;
the kernel-only copy at VPT, 0xefc00000, corresponds to index 0x3BF.)

<p>So because of the "no-op" arrow we've cleverly inserted into
the page directory, we've mapped the pages being used as
the page directory and page tables (which are normally virtually
invisible) into the virtual address space.
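
<p>The self-mapping trick can be checked with a small user-space simulation.
This is a sketch under stated assumptions, not JOS code: physical memory is a
small array, and <code>phys</code>, <code>walk</code>, <code>setup</code>,
<code>pte_va</code> and <code>pde_va</code> are hypothetical names. V is the
0x3BD index from the text.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE  4096u
#define NFRAMES 8u                        /* 32 KiB of simulated RAM      */
#define PG_P    0x001u                    /* present bit                  */
#define V       0x3bdu                    /* self-map index from the text */

#define PDX(la) (((la) >> 22) & 0x3ffu)
#define PTX(la) (((la) >> 12) & 0x3ffu)

static uint32_t phys[NFRAMES * PGSIZE / 4];

/* Read the 32-bit word at physical address pa. */
uint32_t read_mem(uint32_t pa) {
    assert(pa % 4 == 0 && pa < NFRAMES * PGSIZE);
    return phys[pa / 4];
}

/* Two-level walk; returns 0 and sets *pa, or -1 if an entry is not present. */
int walk(uint32_t cr3, uint32_t la, uint32_t *pa) {
    uint32_t pde = read_mem(cr3 + 4 * PDX(la));
    if (!(pde & PG_P))
        return -1;
    uint32_t pte = read_mem((pde & 0xfffff000u) + 4 * PTX(la));
    if (!(pte & PG_P))
        return -1;
    *pa = (pte & 0xfffff000u) + (la & 0xfffu);
    return 0;
}

/* Map va -> page_pa through the page table at pt_pa, and add the
 * self-referencing entry pd[V] -> pd. */
void setup(uint32_t pd_pa, uint32_t pt_pa, uint32_t va, uint32_t page_pa) {
    phys[pd_pa / 4 + PDX(va)] = pt_pa | PG_P;
    phys[pt_pa / 4 + PTX(va)] = page_pa | PG_P;
    phys[pd_pa / 4 + V] = pd_pa | PG_P;   /* the "no-op" arrow */
}

/* Virtual address at which the PTE for va appears through the self-map. */
uint32_t pte_va(uint32_t va) {
    return (V << 22) + 4 * (va >> 12);
}

/* Virtual address at which the PDE for va appears through the self-map. */
uint32_t pde_va(uint32_t va) {
    return (V << 22) | (V << 12) | (4 * PDX(va));
}
```

<p>With the page directory at physical 0x1000 and one page table at 0x2000,
translating (V<<22)|(V<<12) gives 0x1000 (the directory itself), and
translating <code>pte_va(va)</code> gives the physical address of the PTE that
maps va, so the kernel can read a PTE with an ordinary load.
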


</body>