Graphics Pipelines for Next Generation Mixed Reality Systems

Lead Research Organisation: University College London
Department Name: Computer Science

Abstract

In the past twenty years display technology has moved on considerably
with improvements such as higher resolution, faster frame rates and
high-dynamic range colour. In the same period graphics processing
units (GPUs or graphics cards) have become significantly faster with
broader functionality. However, we argue that current implementations
of the traditional graphics pipeline, which is based on rasterisation
and z-buffering, are unsuited to emerging displays.

In particular, the types of near-eye displays used for augmented
reality and virtual reality provide new challenges. Since their
contexts of use are very different, it is not even clear what
properties are ideal. Should the displays be multi-focal or
vari-focal? How fast do display update rates need to be to support
registered augmented reality systems? How do we exploit the properties
of the human visual system to render more efficiently?

The traditional computer graphics pipeline puts emphasis on generating
full images at the display size and rate, where that display might be
1920x1080 at 60Hz or higher. For a near-eye display, it is clear that
such a pipeline is nowhere near suitable: already one consumer HMD is
demanding 2480x1080 at 90Hz, roughly double the bandwidth that a common
desktop display requires. At these rates, there will still be problems
with visual acuity and latency (e.g. the 90Hz frame period alone
imposes an inherent 11ms display lag). Future HMDs are touted with
resolutions of 8K pixels across, but
this will require a very significant increase in graphics compute and
thus power consumption. Not only is this expensive, it could limit the
form factor of the device and thus user-acceptance of the technology.
There is very exciting work going on in display technologies at the
moment. For example, Varjo (a project partner) are building an HMD that
has a moving display insert that tracks the eye; Facebook have built a
demonstration display that uses a focal surface to generate multiple
focal depths in the same image.
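The bandwidth claim above can be checked with simple arithmetic. The sketch below compares raw pixel throughput only (ignoring bit depth, blanking intervals, and link compression), using the figures quoted in the text:

```python
def pixel_rate(width, height, hz):
    """Raw pixels per second for a given display mode."""
    return width * height * hz

desktop = pixel_rate(1920, 1080, 60)  # common desktop monitor
hmd = pixel_rate(2480, 1080, 90)      # consumer HMD figure quoted above

print(f"desktop: {desktop / 1e6:.0f} Mpx/s")  # 124 Mpx/s
print(f"HMD:     {hmd / 1e6:.0f} Mpx/s")      # 241 Mpx/s
print(f"ratio:   {hmd / desktop:.2f}x")       # 1.94x, i.e. roughly double
```

An 8K-wide HMD panel at the same refresh rate would multiply this again by more than an order of magnitude, which is the compute and power cliff the proposal targets.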

In comparison, the graphics pipeline is taken mostly as a given.
Recently, the proposers, along with a small group of colleagues in the
field, have started to challenge the status quo. We don't propose
to ditch the highly optimised compute units in graphics cards, but
rather to study frameworks within which they can be exploited more readily.
We believe that by reformulating the graphics pipeline and paying
attention to the very specific needs of near-eye displays, we can
radically reduce the power required from GPUs, and thus make near-eye
displays more usable.

We will focus on three connected challenges that we have labelled the
latency, redundancy and bandwidth challenges. First, we will target
extremely low latency displays. We will develop systems that achieve
>1000 fps visual output, with latency under 1ms and study how these
impact visual response. Second, we will explore stronger decoupling of
frame-based rendering from display. We note that in near-eye displays
most pixels are wasted, and thus we target novel spatial and temporal
algorithms that reduce redundancy. Third, to exploit redundancy more
generally, we need to use it to reduce bandwidth between graphics card
and display. Taking inspiration from the concept of surface light
fields, our concept of ambient fields will render to buffers that are
expected to be valid for re-rendering to the user for 10-100ms.
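The latency challenge can be made concrete: the per-frame compute budget at a given refresh rate is just the frame period, and the 11ms lag quoted earlier is the 90Hz frame period. A minimal sketch:

```python
def frame_budget_ms(fps):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps

for fps in (60, 90, 1000):
    print(f"{fps:>4} fps -> {frame_budget_ms(fps):.2f} ms per frame")
# 60 fps -> 16.67 ms, 90 fps -> 11.11 ms, 1000 fps -> 1.00 ms
```

At the >1000 fps target, the entire pipeline has under 1ms per frame, which is why the proposal argues for decoupling rendering from display rather than simply running the traditional pipeline faster.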

In summary, we believe that the current graphics pipeline and its
associated implementation in GPUs is unsuited to drive near-eye
displays. We want near-eye displays to be low-cost, power efficient
and highly acceptable to users. To achieve this, we propose new
algorithms that can use GPU capabilities more effectively. We target
reducing redundant compute to enable lower latency and lower bandwidth
requirements through the graphics pipeline.

Planned Impact

Near-eye displays pose a huge challenge to computer graphics hardware.
As noted in the summary, we cannot currently drive displays with the
resolutions and the frame rates that are projected to be needed for
truly comfortable and efficient near-eye displays. Thus we expect the
main impact of our project to be a substantive piece of the technology
required to enable this class of display to flourish.


The vision of near-eye displays is compelling. Near-eye displays will
inherently be smaller and lighter than other types of display, and
thus they have the potential to require less power to run and fewer
materials to manufacture. They can achieve this because the light
generated will not be "wasted" by scattering around the environment,
as it is with current desktop monitors and mobile displays. The
display will be individual, reacting to visual focus and vergence.
One can then consider AR and VR to be applications that exploit these
displays through egocentric rendering grounded in first-person
tracking. Furthermore, near-eye displays have the potential to be
custom designed to compensate for the changing visual capabilities of
individuals. Near-eye displays can correct for a variety of visual
aberrations, such as near-sightedness, so they have the potential to
be more usable than alternative screen technologies and thus to
replace some uses of traditional eye-wear.

As we have argued, the promise of near-eye displays cannot be realised
without a corresponding revolution in image generation. Near-eye
displays will need to be extremely fast, but not waste
bandwidth. Fortunately we will know where the eyes are looking and, as
we have argued in the main proposal, we can exploit new spatial and
temporal sampling algorithms. Thus one key observable metric, and a
key impact, will be that we can generate perceptually equivalent
images at much lower power than traditional pipelines. We therefore
expect to enable near-eye displays by providing lower-power rendering
algorithms.
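To illustrate why eye-tracked spatial sampling saves so much work, the sketch below uses a toy acuity-falloff model; the constant `e2` and the assumption that sample density tracks acuity squared are ours for illustration only, not the project's algorithm:

```python
import statistics

def relative_acuity(ecc_deg, e2=2.3):
    """Toy model: visual acuity falls off roughly as e2 / (e2 + eccentricity).
    e2 (in degrees) is an illustrative constant; reported values vary."""
    return e2 / (e2 + ecc_deg)

# If pixel density tracks acuity (density ~ acuity^2), estimate the
# fraction of full-resolution pixel work needed out to 50 deg from gaze.
eccs = [i * 0.5 for i in range(101)]            # 0..50 degrees
fractions = [relative_acuity(e) ** 2 for e in eccs]
print(f"mean sample fraction: {statistics.mean(fractions):.3f}")
```

Under these toy assumptions only a few percent of the uniform-resolution pixel work is perceptually useful, which is the redundancy the proposal targets.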

Further, we can generate images that are more perceptually correct.
The key observation here is that current image generation pipelines,
even those with post-render warping, cannot compensate for display
latency (not render latency) effects. Modern displays have
persistence, and while we could compensate for this by having a very
fast refresh rate, this requires more frames and more power. If,
instead, we focus on perceptual metrics, we can exploit redundancy and
tune algorithms to fit perceptual requirements. Thus a second key part
that we enable is more perceptually correct image generation for
near-eye displays.

We will demonstrate this impact with practical implementations in the
laboratory. We will work with our project partner Varjo to integrate
some of our algorithms into their development kits so that we can
demonstrate, very practically, the reduction in rendering time and the
improvement in image quality. This type of practical demonstration is very
important to our group (e.g. see [7]).

To secure commercial impact, our pathways to impact statement includes
an intellectual property strategy, fleshes out our plans for practical
demonstration, discusses media and outreach, and plans to engage more
broadly with industry.

Overall, the impact of the project will be to improve the
acceptability of near-eye displays by enabling more efficient and
lower latency rendering. This will then impact society in a very broad
way as displays are now a key component of our everyday experience.

Publications

Description We established a highly novel algorithm for creating metamerised images in the periphery. These are not the common type of colour metamer; instead, they rely on the neural processing of image patches in the periphery, as outlined in the proposal.

We then applied this to lightfields and holograms.

We have built a prototype display that generates metamers directly.
Exploitation Route We are looking to patent the work, and have discussed starting a company to seek funding to build hardware that embodies it.
Sectors Creative Economy, Digital/Communication/Information Technologies (including Software), Electronics

URL https://vr-unity-viewer.cs.ucl.ac.uk/
 
Description We have been investigating submitting a patent on the newer materials. There has been initial interest in the IP, but we need to develop a potential customer before we submit a formal application, due to funding limits at UCL Business.
First Year Of Impact 2022
Sector Electronics
Impact Types Economic