/proc/$pid/pagemap troubles



Since the pagemap code has a little header on it to help describe the
format, I wrote a little C program to parse its output. I get some
strange results. If I do this:

fd = open("/proc/1/pagemap", O_RDONLY);
count = read(fd, &endianness, 1);

count will always be 4, even though only one byte was requested.

hexdump gets similar, but even worse results:

qemu:~# strace hexdump -C /proc/self/pagemap
...
read(0, "\1\f\4\4\377\377\377\377\377\377\377\377\377\377\377\377"..., 16) = 20
read(0, 0x804d39c, 4294967292) = -1 EFAULT (Bad address)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

Note that the kernel returns 20 to the read request of 16. I think the
kernel is actually copying over something important in hexdump's memory
which is adjacent to the buffer and causing it to segfault.
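
For illustration, here is a minimal userspace sketch of a check for this
kind of overshoot. The helper name read_checked() and the OVERSHOOT_PAD
slack are hypothetical, not part of any existing tool. It gives the kernel
a padded scratch buffer, so an over-long copy of up to OVERSHOOT_PAD bytes
lands in slack space rather than in adjacent variables, and it flags any
return value larger than the request.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumed slack in bytes; chosen arbitrarily for this sketch. */
#define OVERSHOOT_PAD 64

/*
 * Read up to `want` bytes from fd into dst.
 * Returns the number of bytes copied to dst (never more than want),
 * -1 on read or allocation failure, and -2 if read() reported more
 * bytes than were requested.
 */
static long read_checked(int fd, void *dst, size_t want)
{
    unsigned char *scratch = malloc(want + OVERSHOOT_PAD);
    long result;
    ssize_t got;

    if (scratch == NULL)
        return -1;

    got = read(fd, scratch, want);
    if (got < 0) {
        result = -1;
    } else if ((size_t)got > want) {
        result = -2;            /* kernel claimed more than we asked for */
    } else {
        memcpy(dst, scratch, (size_t)got);
        result = (long)got;
    }

    free(scratch);
    return result;
}
```

With a well-behaved file, read_checked(fd, &byte, 1) returns 0 or 1. A
return of -2 would correspond to the "20 for a request of 16" behaviour
in the strace output above.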

The code is basically organized not to output the right thing for any
unaligned access, and it apparently gets confused about exactly what
userspace has asked for. I think this is largely due to its overwriting
of "count" in pagemap_read().

So, a couple of questions. Don't we need to support non-sizeof(unsigned
long)-aligned reads?

Do we _really_ need that header in each and every file?

* first byte: 0 for big endian, 1 for little

Do we ever have cases where userspace and kernel differ in their
endianness? Or, are you hoping to dump these files raw on one
architecture and parse them on another? I would think it makes more
sense to get this from elsewhere. Or, should we just output in network
byte order and be done with it?

If anything, we could put it in /proc/$pid/status.
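
As a rough sketch of the two alternatives raised here: userspace can work
out its own endianness without any help from the file, and a fixed network
byte order makes the flag unnecessary. The function names are made up for
illustration. host_is_little_endian() uses the same encoding as the header
byte (1 for little endian, 0 for big).

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    /* Look at the lowest-addressed byte of a known value. */
    memcpy(&first, &probe, 1);
    return first == 1;
}

/*
 * Network byte order is big endian. A value written with to_wire()
 * reads back unchanged through from_wire() on any host.
 */
static uint32_t to_wire(uint32_t v)
{
    return htonl(v);
}

static uint32_t from_wire(uint32_t v)
{
    return ntohl(v);
}
```

Note that this only reports the endianness of the process doing the
parsing. It does not help if the file was dumped raw on a different
architecture.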

* second byte: page shift (eg 12 for 4096 byte pages)

This might actually (in theory) change on a per-process basis, so it
makes sense. But, it seems more global to the process than just pagemap
output. Would this always be the same as getpagesize()? Or, should it
always map 1:1 with the amount of memory mapped by a kernel pte_t? I
_think_ these can be slightly different because we have 64k PAGE_SIZE on
ppc64, but allow mappings to happen in 4k.
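
For comparison, this is a small sketch of how userspace could derive a
page shift on its own from sysconf(_SC_PAGESIZE). The helper names are
hypothetical. It yields the page size as userspace sees it, which, per the
ppc64 concern above, is not necessarily the granularity of a kernel pte_t.

```c
#include <unistd.h>

/*
 * Convert a page size in bytes to a shift, e.g. 4096 -> 12.
 * Returns -1 if the size is not a positive power of two.
 */
static int page_shift_from_size(long page_size)
{
    int shift = 0;

    if (page_size <= 0 || (page_size & (page_size - 1)) != 0)
        return -1;

    while ((1L << shift) < page_size)
        shift++;

    return shift;
}

/* Page shift as reported to this process by the C library. */
static int userspace_page_shift(void)
{
    return page_shift_from_size(sysconf(_SC_PAGESIZE));
}
```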

* third byte: entry size in bytes (currently either 4 or 8)

This one really boils down to "what is the kernel's sizeof(unsigned
long)" because we'll always store pfns in those. It seems like we
should have a better way to go fetch that.
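
One reason the entry size matters to a parser is that userspace's
sizeof(unsigned long) need not match the kernel's (a 32-bit program on a
64-bit kernel, for example). The sketch below decodes one entry from raw
bytes, given the entry size and the endianness flag as described for the
header. The function name and signature are assumptions made for
illustration.

```c
#include <stdint.h>

/*
 * Decode one pagemap entry of entry_size bytes (4 or 8) starting at p.
 * little_endian uses the header's encoding: 1 for little, 0 for big.
 * Stores the value in *out and returns 0, or returns -1 for an
 * unsupported entry size.
 */
static int decode_entry(const unsigned char *p, unsigned entry_size,
                        int little_endian, uint64_t *out)
{
    uint64_t v = 0;
    unsigned i;

    if (entry_size != 4 && entry_size != 8)
        return -1;

    /* Walk from the most significant byte to the least significant. */
    for (i = 0; i < entry_size; i++) {
        unsigned idx = little_endian ? entry_size - 1 - i : i;
        v = (v << 8) | p[idx];
    }

    *out = v;
    return 0;
}
```

Because it works byte by byte, this does not depend on the parser's own
word size or endianness.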

* fourth byte: header size

If we can get rid of the other three, this, of course, goes away.
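
To make the four fields concrete, here is a minimal sketch of a parser for
the header as described above. The struct and function names are
hypothetical. The validity checks (endianness byte 0 or 1, entry size 4 or
8, header size of at least 4) are assumptions drawn from the field
descriptions, not from the kernel code.

```c
#include <stddef.h>

struct pagemap_header {
    int little_endian;      /* first byte: 1 for little, 0 for big */
    unsigned page_shift;    /* second byte: e.g. 12 for 4096-byte pages */
    unsigned entry_size;    /* third byte: 4 or 8 */
    unsigned header_size;   /* fourth byte: offset of the first entry */
};

/* Returns 0 and fills *h on success, -1 if the header looks invalid. */
static int parse_pagemap_header(const unsigned char *buf, size_t len,
                                struct pagemap_header *h)
{
    if (len < 4 || buf[0] > 1)
        return -1;

    h->little_endian = (buf[0] == 1);
    h->page_shift = buf[1];
    h->entry_size = buf[2];
    h->header_size = buf[3];

    if (h->entry_size != 4 && h->entry_size != 8)
        return -1;
    if (h->header_size < 4)
        return -1;

    return 0;
}
```

Fed the first four bytes from the strace output above ("\1\f\4\4"), this
gives little endian, a page shift of 12, 4-byte entries and a 4-byte
header.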

-- Dave
