Star64_linux/Documentation
Hugh Dickins 1be7107fbe mm: larger stack guard gap, between vmas
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.

This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.

Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.

One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications.  For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).

Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.

Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-19 21:50:20 +08:00
..
ABI
accounting
acpi Merge branches 'acpi-button' and 'acpi-tools' 2017-05-22 20:29:06 +02:00
admin-guide mm: larger stack guard gap, between vmas 2017-06-19 21:50:20 +08:00
aoe
arm
arm64
auxdisplay
backlight
blackfin
block
blockdev
bus-devices
cdrom
cgroup-v1
cma
connector
console
core-api
cpu-freq cpufreq: intel_pstate: Document the current behavior and user interface 2017-05-14 02:06:03 +02:00
cpuidle
cris
crypto
dev-tools
device-mapper
devicetree USB fixes for 4.12-rc5 2017-06-11 11:23:10 -07:00
dmaengine
doc-guide
DocBook
driver-api
driver-model
early-userspace
EDID
extcon
fault-injection
fb
features
filesystems Tigran has moved 2017-05-12 15:57:15 -07:00
firmware_class
fmc
fpga
frv
gpio
gpu
hid
hwmon
i2c
ia64
ide
iio
infiniband
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2017-05-26 16:45:13 -07:00
ioctl
isdn
kbuild
kdump
laptops
leds
lightnvm
livepatch
locking
m68k
md
media
memory-devices
metag
mic
mips
misc-devices
mmc
mn10300
mtd
namespaces
netlabel
networking net: fix up hash documentation 2017-06-07 13:00:41 -04:00
nfc
nios2
nvdimm
nvmem
parisc
PCI
pcmcia
perf
phy
platform
power
powerpc
pps
prctl
process
pti
ptp
rapidio
RCU
s390
scheduler
scsi
security
serial
sh
sound ALSA: hda - Update the list of quirk models 2017-05-22 16:42:02 +02:00
sparc
sphinx
sphinx-static
spi
sysctl
target
thermal Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2017-05-12 11:58:45 -07:00
timers
trace
translations
usb doc-rst: fixed kernel-doc directives in usb/typec.rst 2017-05-17 11:52:44 +02:00
userspace-api
virtual
vm
w1
watchdog iTCO_wdt: all versions count down twice 2017-05-19 10:42:11 +02:00
wimax
x86
xtensa
.gitignore
00-INDEX
bcache.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
cachetlb.txt
cgroup-v2.txt mm, docs: update memory.stat description with workingset* entries 2017-05-12 15:57:16 -07:00
Changes
circular-buffers.txt
clk.txt
CodingStyle
conf.py
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt
DMA-attributes.txt
DMA-ISA-LPC.txt
docutils.conf
dontdiff
efi-stub.txt
eisa.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst
Intel-IOMMU.txt
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-doc-nano-HOWTO.txt
kernel-per-CPU-kthreads.txt
kobject.txt
kprobes.txt
kref.txt
kselftest.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lzo.txt
mailbox.txt
Makefile
Makefile.sphinx
memory-barriers.txt Connect the newly RST-formatted documentation to the rest; this had to wait 2017-05-11 11:29:52 -07:00
memory-hotplug.txt
men-chameleon-bus.txt
nommu-mmap.txt
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pinctrl.txt
pnp.txt
preempt-locking.txt
printk-formats.txt
pwm.txt
rbtree.txt
remoteproc.txt
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt
SM501.txt
smsc_ece1099.txt
static-keys.txt
SubmittingPatches
svga.txt
switchtec.txt
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt
vfio.txt
video-output.txt
xillybus.txt
xz.txt
zorro.txt