William Wang

I'm a researcher at Arm Research in Cambridge, where I lead the systems research on non-volatile memories. My current research focus is on addressing persistent memory programming challenges with the right architectural and microarchitectural support.

For an overview of my research area, see my keynote [slides|video] at PIRL'20 in October 2020, my invited talk [slides] at VCEW'21 in June 2021, my talk [slides] at Dagstuhl Seminar 21462 in November 2021, and my invited talk [slides] at the NANDA Workshop in September 2022. I have also written a few blog posts introducing research publications on microarchitectural support, architectural support, and language support for persistent programming.

Selected Publications

HPCA'21, March 2021

The gem5 Simulator: Version 20.0+

Research Infrastructure

arXiv, September 2020

Relaxed Persist Ordering Using Strand Persistency

Addressing NVM Programming Challenges

ISCA'20, June 2020

Hardware Implementation of Strand Persistency

Addressing NVM Programming Challenges

NVMW'20, March 2020

Language support for memory persistency

Addressing NVM Programming Challenges

IEEE Micro Top Picks, May-June 2019

Strand Persistency

Addressing NVM Programming Challenges

NVMW'19, March 2019

Software Wear Management for Persistent Memories

Addressing NVM Endurance Challenges

FAST'19, February 2019

Quantifying the performance overheads of PMDK

Addressing NVM Programming Challenges

MEMSYS'18, October 2018

Persistency for synchronization-free regions

Addressing NVM Programming Challenges

PLDI'18, June 2018
MEMSYS'17, October 2017
MEMSYS'15, October 2015
IISWC'13, September 2013

Selected Patents


Processor Architecture, UK App No.: 2218617.5

Architectural support for synchronization of asynchronous instructions in a dataflow triggered instruction set architecture.

December 2022


Processor Architecture - Hybrid Dataflow/von Neumann Architecture, US17/942,554

An apparatus, method and computer program, the apparatus comprising processing circuitry to execute instructions, issue circuitry to issue the instructions for execution by the processing circuitry, and candidate instruction storage circuitry to store a plurality of condition-dependent instructions, each specifying at least one condition. The issue circuitry is configured to issue a given condition-dependent instruction in response to a determination or a prediction of the at least one condition specified by the given condition-dependent instruction being met, and when the given condition-dependent instruction is a sequence-start instruction, the issue circuitry is responsive to the determination or prediction to issue a sequence of instructions comprising the sequence-start instruction and at least one subsequent instruction.

September 2022


Processor Architecture, UK Patent No.: GB2605796

April 2021

Systems and Methods for Defining and Enforcing Ordered Constraints

Processor Architecture, US Patent No.: 11,740,908

In a particular implementation, a method includes: receiving, at a computing device, first and second instructions of a plurality of instructions obtained from a memory, where the first instruction corresponds to a preceding instruction of a second instruction, and where the second instruction corresponds to a succeeding instruction of the first instruction; determining a dependency of the first and second instructions; sending the first and second instructions to an issue queue of the computing device; executing, at the computing device, the first and second instructions; and completing, at the computing device, the first and second instructions.

December 2020

Draining Operation to Cause Store Data to be Written to Persistent Memory

Processor Microarchitecture, US Patent No.: 11,513,962

October 2020

Draining Operation for Draining Dirty Cache Lines to Persistent Memory

Processor Microarchitecture, UK Patent No.: GB2598784

September 2020

Hardware Bitvector Techniques

Processor Microarchitecture, US Patent No.: 11,409,617

May 2020

Task-Aware Checkpointing for Intermittent Compute

Processor Microarchitecture, UK Patent No.: GB2591813

February 2020

Instruction ordering

Processor Architecture, US Patent No.: 10,956,166

March 2019

Method and apparatus for implementing lock-free data structures

Processor Architecture, US Patent No.: 10,866,890

November 2018

Robust Transactional Memory

Processor Microarchitecture, US Patent No.: 10,445,238

April 2018

Energy conservation for memory applications

Memory System, US Patent No.: 11,182,106

March 2018

Multi-tier cache placement mechanism

Processor Microarchitecture, US Patent No.: 10,831,678

November 2017

Initialisation of a storage device

Memory System, US Patent No.: 11,137,919

October 2017

Apparatus and method of handling caching of persistent data

Memory System, US Patent No.: 10,642,743

June 2017

Initialisation of a storage device

Memory System, UK Patent No.: 2561011, US Patent No.: 10,964,386

March 2017

Control of refresh operation for memory regions

Memory System, UK Patent No.: 2560968, US Patent No.: 10,847,204

March 2017

Non-volatile buffer for memory operations

Processor Microarchitecture, Memory System, US Patent No.: 10,719,236

November 2015

Research Interests

In addition to architectural and microarchitectural support for persistent memory, I've looked into programming language and operating system support for persistent memory, and thought about tackling translation, protection and persistence holistically. Apart from systems research related to non-volatile memory, I've also looked into novel use cases of emerging non-volatile memories, ranging from off-chip to on-chip memory, and from use as memory to use as analog computing devices for machine learning and bioinformatics.

Outside of memory, I'm interested in pushing up the boundary between hardware and software, e.g., from RISC to EDGE, just as non-volatile memory shifts up the boundary between volatility and non-volatility from memory/storage to caches/memory. You can find some of my thoughts in this direction in the short write-up (in Chinese), and in my patents related to distributed memory spatial architectures, such as the hybrid dataflow architecture with a triggered instruction set architecture. In addition to looking up and looking forward in computer architecture, in my spare time I also enjoy looking back at (and looking down the stacks of) past computer architectures. Here's an example collection of computer architectures that I curated and researched.

In the past, I led the project on accelerating genomics on Arm, in which the bit shuffle ISA extensions were accepted into SVE2. You can find the invited talk that I gave at the Cambridge University Computer Laboratory on Genomics at Arm, as well as a short abstract on Hardware Accelerators for Genomic Data Processing. In addition, I spent a couple of years around 2010 working on the gem5 simulator, refactoring the memory subsystems and verifying the AArch64 ISA support (including getting Linux to boot on gem5 AArch64), alongside many system and CPU studies with the simulator.

Please get in touch if you are interested in exploring collaboration opportunities. At present, I'm very fortunate to work with talented PhD students Sivert Sliper and Dmitrii Ustiugov as their industrial supervisor, alongside their academic supervisors.