Linux Memory / Process Management on x86



Hello

I've posted this in comp.os.linux.development.system (with a different
subject line) but I didn't get an answer. I don't know if no one was able
to answer, or if I've violated some rule of netiquette or something (if I
did, could someone tell me? I've read the netiquette, but this is
my 4th or so post to a newsgroup), or if my post is simply too long :)
Anyway, I know I should post a "follow-up", but I don't know how to do
this, so I post this instead; I don't know if it helps:
pan.2006.03.17.16.17.41.160203@xxxxxxxxxxxxxxxxxxxxxxxxxxx in
comp.os.linux.development.system
If someone could help me with my problem(s) I'd be very grateful! So
here comes my original post:


As you might have guessed from the topic "Linux Memory / Process
Management on x86" (originally "Protect one single byte on Linux/x86
..."), I have some problems fully understanding memory protection on an
x86 machine running Linux. You also might have guessed that I'm not a
native English speaker; I do apologize for any grammar or spelling
mistakes.

I did some research on the topic "memory protection" using Google,
Wikipedia and searching newsgroups, and right now I'm a little bit at a
dead end because I don't know if the way I "understood" all I read is
really correct. You could do me a great favour if you could comment on my
"thoughts" and/or correct them. I'd be really thankful :) So here it comes
;)

An OS running on the x86 architecture in protected mode has to use
segmentation of memory and can use paging.
Linux (2.6.XX) bypasses segmentation by setting up 4 segments with the
same base and limit (0 - 4GB), so basically a logical address can be
translated to a linear address by "ignoring" everything above bit 31. The
Linux kernel "knows" for every "existing" process the value to put in the
CR3 register (Page Directory Base Register). The linear address is
translated to a physical address by the MMU by
1) looking for the correct page directory pointer (CR3 + Bit 31
   and 30)
2) looking for the correct page directory entry (Pointer + Bit 29
   to 21)
3) using the PDE to look up the correct page table entry (Bit
   31 to 12 of PDE entry + Bit 20 to 12 of linear address)
4) using the PTE to generate the physical address (basically the same way
   as in 3, Bit 11 to 0 of linear address)
PDE and PTE leave some space (the LSBs) for management information, like
r/w, super user page, dirty bit and so on.
(Am I right so far? I hope so... ;)
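
To make the four steps above concrete, here is a small sketch of the bit
arithmetic only. It assumes the 2/9/9/12 bit split described above (as far
as I can tell this is the PAE layout; without PAE the split would be
10/10/12). The function and constant names are made up for illustration,
and the flag positions are the usual x86 ones (present, r/w, user,
accessed, dirty).

```python
# Sketch: split a 32-bit linear address as in steps 1) to 4) above.
# Assumes the 2/9/9/12 (PAE-style) split; all names are illustrative only.

def split_linear_pae(addr):
    """Return (pdpt_index, pd_index, pt_index, offset) for a 32-bit address."""
    addr &= 0xFFFFFFFF
    pdpt_index = (addr >> 30) & 0x3    # bits 31..30: page directory pointer
    pd_index = (addr >> 21) & 0x1FF    # bits 29..21: page directory entry
    pt_index = (addr >> 12) & 0x1FF    # bits 20..12: page table entry
    offset = addr & 0xFFF              # bits 11..0: offset inside the page
    return pdpt_index, pd_index, pt_index, offset

# Low bits of a PDE/PTE that carry management information.
PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_USER = 1 << 2
PTE_ACCESSED = 1 << 5
PTE_DIRTY = 1 << 6

def decode_flags(entry):
    """Decode the management bits stored in the LSBs of an entry."""
    return {
        "present": bool(entry & PTE_PRESENT),
        "writable": bool(entry & PTE_WRITABLE),
        "user": bool(entry & PTE_USER),
        "accessed": bool(entry & PTE_ACCESSED),
        "dirty": bool(entry & PTE_DIRTY),
    }

if __name__ == "__main__":
    print(split_linear_pae(0xC0100123))
    print(decode_flags(0x00345067))
```

The entry value 0x00345067 is an arbitrary example, not taken from a real
page table.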

So that's where, as we say in Germany, "the dog is buried" (which might
not make any sense in English ;):
The Kernel/MMU can protect __pages__ by only allowing read access or super
user access or by marking them as not available. The MMU generates an
exception and the Kernel can "react" to that (swap in the page or send a
SIGSEGV signal to the process in case of a "wrong write" and so on). Other
processes are protected because each process has a unique CR3 value which
points to its page directory, thus it's impossible to access a physical
address that does not belong to the process (the values to calculate those
are simply not in the PDEs/PTEs). Sharing memory is easy, too: Put the
correct values in the PDEs/PTEs and different processes can access the
same library at the same location in physical memory.
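
Here is a toy model of how I picture the paragraph above, not of how the
kernel really stores things: each process gets its own table from virtual
page number to (physical frame, writable), which stands in for "its own
CR3". The two process tables, the frame numbers and the PageFault class
are invented for the example.

```python
# Toy model: one page table per process (stand-in for a per-process CR3).
# Tables, frame numbers and names are hypothetical.
PAGE_SIZE = 4096

class PageFault(Exception):
    """Stands in for the exception the MMU raises."""

def translate(page_table, vaddr, write=False):
    """Translate a virtual address using one process's page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(vpn)
    if entry is None:
        raise PageFault("page not present")
    frame, writable = entry
    if write and not writable:
        raise PageFault("write to read-only page")
    return frame * PAGE_SIZE + offset

# Private data lives in different frames; frame 7 is a shared, read-only library.
proc_a = {0x10: (3, True), 0x40: (7, False)}
proc_b = {0x10: (5, True), 0x80: (7, False)}

if __name__ == "__main__":
    va = 0x10 * PAGE_SIZE + 4
    print(hex(translate(proc_a, va)), hex(translate(proc_b, va)))
    print(hex(translate(proc_a, 0x40 * PAGE_SIZE)),
          hex(translate(proc_b, 0x80 * PAGE_SIZE)))
```

The same virtual address ends up in different frames for the two
processes, and frame 3 is simply not reachable through proc_b's table,
while both can reach the shared frame 7.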


OK, so here are my conclusions, based on what I have written above:

It is not possible to protect a _single_ byte in phys. memory by means of
the MMU, this can only be done by software (on x86!).
Linux - the kernel - doesn't protect a "single byte in memory" from being
written to if that byte is located in a page that is writeable.
If I malloc(), say, one byte, I get an address "inside" the heap.
Neither the MMU, nor the Linux kernel, nor "malloc()" can/do stop me from
writing to (address + X) unless, by doing that, I generate a linear
address which "leads" to a "different" page which is marked read only. The
Linux kernel and the C Library (C++?) do not provide that strong
protection because (I guess) it's too expensive (speed, space). On the
other hand, "managed code" (Java, C#) can provide such strong protection
because the code runs "inside" a VM which implements the possibility to
protect a single byte (by creating extra management information for each
process).

Sharing libraries: "call printf" is basically a "call <address>", except
that the compiler does not put the code for "printf" somewhere in the
executable, so "printf" is not a symbolic name for an address inside the
code segment of the executable (unless the executable is linked
statically). Instead the "dynamic loader" "knows" where the code for
printf is located, sets up a PDE (in the Page Directory of the process)
which points to a PTE (probably not in the Page Table of the process)
which points to the location of the printf code, and replaces "printf" in
each "call printf" with the correct values.
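
One way to at least see that "printf" ends up as a plain address at run
time is to ask the loader from a running process. This is only a sketch:
it assumes a Linux-like system with a dynamically linked Python, where
ctypes.CDLL(None) opens the running program together with its loaded
libraries. It says nothing about how the loader patches the call sites.

```python
# Sketch: look up where the dynamic loader placed a libc symbol.
# Assumes a Linux-like system and a dynamically linked Python interpreter.
import ctypes

def symbol_address(name):
    """Return the address of a symbol as seen by the current process."""
    handle = ctypes.CDLL(None)          # the running program and its libraries
    func = getattr(handle, name)        # resolved by the loader on lookup
    return ctypes.cast(func, ctypes.c_void_p).value

if __name__ == "__main__":
    print(hex(symbol_address("printf")))
```

The printed value is a linear address inside this process only; another
process may see the same library code at a different address.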

Process memory layout: Is it (generally) correct to assume that the
address space for a process is set up like this, when you are looking at
the Page Directory Table:

PDT:                               linear address:
---------------------------------  <- XX1...1X...X (Bit 29 to 21 set)
 PDEs to stack pages, r/w
---------------------------------
 PDEs marked as
 "not valid", access
 leads to exception
---------------------------------
 PDEs for heap, r/w
---------------------------------  <- XX0..01?1X..X
 PDE for Code, Constants
---------------------------------  <- XX0..0X...X (Bit 29 to 21 not set)

Would it be possible, though, to change this? For example, put data and
stack at the bottom and code in the middle, although this might be rather
stupid? I'd only need a compiler and libraries which would generate the
linear addresses according to the scheme I've mentioned, right? The
kernel/MMU doesn't care which pages I mark as r/w, read only or
executable, right?
Or am I wrong, and the kernel (somehow) enforces a consistent layout for
all processes?
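
On Linux the layout a process actually got can be read from
/proc/<pid>/maps, so the picture above can be checked against a real
process. Below is a small parser sketch. The SAMPLE lines are made-up
values in the format of that file, not output copied from a real machine,
and the function names are my own.

```python
# Sketch: parse lines in the /proc/<pid>/maps format.
# SAMPLE is hypothetical data; function names are illustrative.

def parse_maps_line(line):
    """Parse one maps line into start, end, permissions and name."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    name = fields[5].strip() if len(fields) > 5 else ""
    return {"start": start, "end": end, "perms": fields[1], "name": name}

def own_layout(path="/proc/self/maps"):
    """Read the layout of the current process (Linux only)."""
    with open(path) as maps:
        return [parse_maps_line(line) for line in maps if line.strip()]

SAMPLE = """\
08048000-0804c000 r-xp 00000000 03:01 12345      /bin/cat
0804c000-0804d000 rw-p 00003000 03:01 12345      /bin/cat
0804d000-0806e000 rw-p 0804d000 00:00 0          [heap]
bffeb000-c0000000 rw-p bffeb000 00:00 0          [stack]
"""

if __name__ == "__main__":
    for region in (parse_maps_line(l) for l in SAMPLE.splitlines()):
        print(hex(region["start"]), hex(region["end"]),
              region["perms"], region["name"])
```

In the sample the code is at the low end (read and execute only), the heap
follows the data, and the stack sits just below 3 GB, which matches the
diagram above; whether that holds in general is exactly my question.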

And finally: when talking 'bout BSD on an Alpha, everything might just be
totally different... ?


OK, I realize this is a long post with probably too many detailed
questions. I do know one can write maybe hundreds of pages treating memory
management in general and on x86 machines. If you tell me: "you got it
all wrong, but it's impossible to explain it here; it takes too much time"
I'm happy with that, then at least I _know_ that I'm wrong. But if I'm
only a "little bit wrong" and someone can give me a short hint ... or
maybe even cut it down to the answer: "Yep, you got it, congrats!", I'd be
really happy! :)

Big thanks in advance for an answer!

Stephan Berger
