Ralf Baechle
2014-10-22 08:34:37 UTC
This question comes up every once in a while, and during ELCE in
Düsseldorf I was also asked why there is no single MIPS kernel for all
platforms, so I thought I should post a writeup on the topic.
The primary reason is that MIPS kernels are built as non-PIC code. This
means the code is linked to a particular absolute address. The link
address depends on the memory range available on a particular system -
there is no one size that fits all systems, not even a large fraction
of the supported systems.
What does it take to make kernels relocatable? A current kernel is not
relocatable. One might do something along the lines of userland, where
the dynamic linker is in a similar situation and needs to first relocate
itself before it can perform its actual job.
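As a rough illustration of what such a self-relocation pass has to do,
here is a minimal sketch in Python rather than real kernel code. It
assumes a big-endian 32-bit image, REL-style relocations whose addend is
stored in the instruction itself, and a load offset that is a multiple
of 64 KiB so that the low 16 bits of every address stay unchanged. The
function name relocate() and the (offset, type) list are made up for the
example; only the R_MIPS_* type numbers come from the MIPS ELF ABI.

```python
import struct

# Relocation type numbers from the MIPS ELF ABI.
R_MIPS_32, R_MIPS_26, R_MIPS_HI16, R_MIPS_LO16 = 2, 4, 5, 6


def relocate(image: bytearray, relocs, delta: int) -> None:
    """Patch a big-endian image in place for a load address moved by delta.

    relocs is a list of (byte offset, relocation type) pairs.
    Assumption of this sketch: delta is a multiple of 64 KiB.
    """
    if delta % 0x10000:
        raise ValueError("this sketch only handles 64 KiB aligned deltas")
    for offset, rtype in relocs:
        (word,) = struct.unpack_from(">I", image, offset)
        if rtype == R_MIPS_32:
            # Plain 32-bit pointer: add the offset.
            word = (word + delta) & 0xFFFFFFFF
        elif rtype == R_MIPS_HI16:
            # Upper half of an address, typically in a lui instruction.
            hi = ((word & 0xFFFF) + (delta >> 16)) & 0xFFFF
            word = (word & 0xFFFF0000) | hi
        elif rtype == R_MIPS_26:
            # j/jal target field holds address bits 27..2. Valid as long
            # as the moved image does not straddle a 256 MiB boundary.
            target = ((word & 0x03FFFFFF) + (delta >> 2)) & 0x03FFFFFF
            word = (word & 0xFC000000) | target
        elif rtype == R_MIPS_LO16:
            # Low 16 bits are unaffected by a 64 KiB aligned move.
            continue
        else:
            raise ValueError(f"unhandled relocation type {rtype}")
        struct.pack_into(">I", image, offset, word)


# lui $a0, 0x8010; addiu $a0, $a0, 0x1234; followed by a data pointer.
image = bytearray(struct.pack(">III", 0x3C048010, 0x24841234, 0x80101234))
relocs = [(0, R_MIPS_HI16), (4, R_MIPS_LO16), (8, R_MIPS_32)]
relocate(image, relocs, 0x00300000)  # linked at 0x80100000, run at 0x80400000
```

In a real kernel this loop would itself have to run before any
relocated address is used, which is the same bootstrapping problem the
dynamic linker has.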
There are two approaches. The first is keeping the non-PIC code. That
requires keeping the entire relocation information. A lasat_defconfig
vmlinux is 5733098 bytes, but built with --emit-relocs to keep the reloc
information in the final binary the vmlinux file grows to 7217342 bytes!
A quick look at the reloc sections:
Section Headers:
  [Nr] Name              Type  Addr     Off    Size   ES Flg Lk Inf Al
  [ 2] .rel.text         REL   00000000 461538 0eedf8 08     34   1  4
  [ 4] .rel__ex_table    REL   00000000 550330 0040e0 08     34   3  4
  [ 8] .rel.rodata       REL   00000000 554410 0310e0 08     34   7  4
  [10] .rel.pci_fixup    REL   00000000 5854f0 000998 08     34   9  4
  [12] .rel__ksymtab     REL   00000000 585e88 00b3b0 08     34  11  4
  [14] .rel__ksymtab_gpl REL   00000000 591238 007180 08     34  13  4
  [17] .rel__param       REL   00000000 5983b8 000858 08     34  16  4
  [19] .rel__modver      REL   00000000 598c10 000038 08     34  18  4
  [21] .rel.data         REL   00000000 598c48 00a130 08     34  20  4
  [23] .rel.init.text    REL   00000000 5a2d78 00f008 08     34  22  4
  [25] .rel.init.data    REL   00000000 5b1d80 001d08 08     34  24  4
  [27] .rel.exit.text    REL   00000000 5b3a88 000b78 08     34  26  4
The approach could probably be optimized, but as a first-order
approximation this demonstrates there would be plenty of bloat in the
binary. The positive side of this approach: no runtime penalty.
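To put numbers on that bloat, the section sizes and the two file sizes
quoted above can simply be added up. The following is only a
back-of-the-envelope sketch; the variable and function names are made up
for the example, and the 8-byte entry size is the ES column of the
section headers.

```python
# Sizes of the reloc sections as listed in the section headers (hex).
REL_SECTION_SIZES = {
    ".rel.text": 0x0EEDF8,
    ".rel__ex_table": 0x0040E0,
    ".rel.rodata": 0x0310E0,
    ".rel.pci_fixup": 0x000998,
    ".rel__ksymtab": 0x00B3B0,
    ".rel__ksymtab_gpl": 0x007180,
    ".rel__param": 0x000858,
    ".rel__modver": 0x000038,
    ".rel.data": 0x00A130,
    ".rel.init.text": 0x00F008,
    ".rel.init.data": 0x001D08,
    ".rel.exit.text": 0x000B78,
}
REL_ENTRY_SIZE = 8           # ES column: one REL entry is 8 bytes
VMLINUX_PLAIN = 5733098      # lasat_defconfig vmlinux
VMLINUX_WITH_RELOCS = 7217342  # same build with --emit-relocs


def summarize():
    """Return (reloc bytes, reloc entries, file growth, growth in %)."""
    reloc_bytes = sum(REL_SECTION_SIZES.values())
    growth = VMLINUX_WITH_RELOCS - VMLINUX_PLAIN
    percent = 100.0 * growth / VMLINUX_PLAIN
    return reloc_bytes, reloc_bytes // REL_ENTRY_SIZE, growth, percent


reloc_bytes, entries, growth, percent = summarize()
print(f"reloc sections: {reloc_bytes} bytes, {entries} entries")
print(f"file growth:    {growth} bytes ({percent:.1f}%)")
```

The reloc sections add up to 1388744 bytes, or 173593 entries, out of a
file growth of 1484244 bytes, which is roughly 26% of the original
vmlinux. About 70% of the reloc data belongs to .rel.text alone.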
Alternatively: make the kernel PIC code. As a rule of thumb that's going
to inflate the kernel by 10 or 15%. That is less than the above approach,
but there'd also be significant runtime overhead. Probably not an option
in a world where benchmarks like network performance on 64 byte packets
decide on the fate of a product on the market.
Obviously there is the difference between 32 and 64 bit kernels. 64 bit
kernels use additional instructions that are not available on 32 bit
processors, and using just 32 bit instructions won't fly for 64 bit
kernels.
Hardware detection. That's all easy in a device tree world, but in
reality many of the existing systems don't support device tree yet, so a
generic kernel would have to figure out what platform it's running on,
which would end up in something like an ISA-style device probe.
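For comparison, the device tree case really is that simple. The sketch
below shows the idea only: the root node's "compatible" property lists
strings from most specific to most generic, and the kernel picks the
first one it has support for. The table contents and the name
pick_platform() are hypothetical and not taken from any real platform.

```python
from typing import Optional, Sequence

# Hypothetical table mapping compatible strings to platform setup code.
PLATFORM_SETUP = {
    "examplevendor,board-a": "board_a_setup",
    "examplevendor,soc-x": "soc_x_generic_setup",
}


def pick_platform(compatible: Sequence[str]) -> Optional[str]:
    """Return the setup routine for the first supported compatible string.

    compatible is ordered from most specific to most generic, so the
    first match is the best one. None means the platform is unknown.
    """
    for name in compatible:
        handler = PLATFORM_SETUP.get(name)
        if handler is not None:
            return handler
    return None


# A known board, and an unknown board built around a known SoC.
print(pick_platform(["examplevendor,board-a", "examplevendor,soc-x"]))
print(pick_platform(["examplevendor,board-b", "examplevendor,soc-x"]))
```

Without a device tree there is no such list to consult, and the lookup
has to be replaced by probing the hardware.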
Ralf