ICCAD 2016 TOC

Full Citation in the ACM Digital Library

Scope – quality retaining display rendering workload scaling based on user-smartphone
distance

Nixon
Kent W.

Modern smartphone display system come equipped with powerful GPU’s capable of rendering
advanced 2D and 3D graphics. These GPU’s make up a significant portion of the system
power profile due to the high resolution and framerate of smartphone display. …

NVSim-CAM: a circuit-level simulator for emerging nonvolatile memory based content-addressable
memory

Li
Shuangchen

Ternary Content-Addressable Memory (TCAM) is widely used in networking routers, fully
associative caches, search engines, etc. While the conventional SRAM-based TCAM suffers
from the poor scalability, the emerging nonvolatile memories (NVM, i.e., MRAM, …

Design technology for fault-free and maximally-parallel wavelength-routed optical
networks-on-chip

Peano
Andrea

The recent interest in emerging interconnect technologies is bringing the issue of
a proper EDA support for them to the forefront, so to tackle the design complexity.
A relevant case study is provided by wavelength-routed optical NoCs (WRONoCs), which
…

Fast generation of lexicographic satisfiable assignments: Enabling canonicity in SAT-based applications

Petkovska
Ana

Lexicographic Boolean satisfiability (LEXSAT) is a variation of the Boolean satisfiability
problem (SAT). Given a variable order, LEXSAT finds a satisfying assignment whose
integer value under the given variable order is minimum (maximum) among all …

Analytic approaches to the collapse operation and equivalence verification of threshold
logic circuits

Lee
Nian-Ze

Threshold logic circuits gain increasing attention due to their feasible realization
with emerging technologies and strong bind to neural network applications. In this
paper, for logic synthesis we formulate the fundamental operation of collapsing …

A flash-based digital circuit design flow

Abusultan
Monther

Traditionally, floating gate (flash) transistors have been used exclusively to implement
non-volatile memory in its various forms. Recently, we showed that flash transistors
can be used to implement digital circuits as well. In this paper, we present …

MrDP: multiple-row detailed placement of heterogeneous-sized
cells for advanced nodes

Lin
Yibo

As VLSI technology shrinks to fewer tracks per standard cell, e.g., from 10-track
to 7.5-track libraries (and lesser for 7nm), there has been a rapid increase in the
usage of multiple-row cells like two- and three-row flip-flops, buffers, etc., for
…

OWARU: free space-aware timing-driven incremental placement

Jung
Jinwook

This paper proposes a powerful new technique called “OWARU”¹ that re-places and re-sizes multiple gates simultaneously to improve the most critical
paths of a design. In essence, it is an incremental timing-driven placement technique
integrated with …

Detailed placement for modern FPGAs using 2D dynamic programming

Dhar
Shounak

In this paper, we propose a 2-dimensional dynamic programming (DP) based detailed
placement algorithm for modern FPGAs for wirelength and timing optimization. By tuning
a control parameter, our algorithm can perform fast heuristic or exact optimization.
…

Security and privacy threats to on-chip non-volatile memories and countermeasures

Ghosh
Swaroop

Non-volatile memories (NVMs) such as Spin-Transfer Torque RAM (STTRAM) have drawn
significant attention due to complete elimination of bitcell leakage. In addition
to the plethora of benefits such as density, non-volatility, low-power and high speed,
…

Security engineering of nanostructures and nanomaterials

Shahrjerdi
D.

Proliferation of electronics and their increasing connectivity pose formidable challenges
for information security. At the most fundamental level, nanostructures and nanomaterials
offer an unprecedented opportunity to introduce new approaches to …

Caffeine: towards uniformed representation and acceleration for deep convolutional neural networks

Zhang
Chen

With the recent advancement of multilayer convolutional neural networks (CNN), deep
learning has achieved amazing success in many areas, especially in visual content
understanding and classification. To improve the performance and energy-efficiency
of …

Re-architecting the on-chip memory sub-system of machine-learning accelerator for
embedded devices

Wang
Ying

The rapid development of deep learning are enabling a plenty of novel applications
such as image and speech recognition for embedded systems, robotics or smart wearable
devices. However, typical deep learning models like deep convolutional neural …

A data locality-aware design framework for reconfigurable sparse matrix-vector multiplication
kernel

Li
Sicheng

Sparse matrix-vector multiplication (SpMV) is an important computational kernel in
many applications. For performance improvement, software libraries designated for
SpMV computation have been introduced, e.g., MKL library for CPUs and cuSPARSE library …

Compact oscillation neuron exploiting metal-insulator-transition for neuromorphic
computing

Chen
Pai-Yu

The phenomenon of metal-insulator-transition (MIT) in strongly correlated oxides,
such as NbO₂, have shown the oscillation behavior in recent experiments. In this work, the MIT
based two-terminal device is proposed as a compact oscillation neuron for …

A new tightly-coupled transient electro-thermal simulation method for power electronics

Chen
Quan

This paper presents a new transient electro-thermal (ET) simulation method for fast
3D chip-level analysis of power electronics with field solver accuracy. The metallization
stacks are meshed and solved with 3D field solver using nonlinear temperature-…

A tensor-based volterra series black-box nonlinear system identification and simulation
framework

Batselier
Kim

Tensors are a multi-linear generalization of matrices to their d-way counterparts, and are receiving intense interest recently due to their natural
representation of high-dimensional data and the availability of fast tensor decomposition
algorithms. …

Efficient statistical analysis for correlated rare failure events via asymptotic probability
approximation

Yu
Handi

In this paper, a novel Asymptotic Probability Approximation (APA) method is proposed
to estimate the overall rare probability of correlated failure events for complex
circuits containing a large number of replicated cells (e.g., SRAM bit-cells). The
key …

Duplex: simultaneous parameter-performance exploration for optimizing analog circuits

Ahmadyan
Seyed Nematollah

We present Duplex random tree search, an algorithm to optimize performance metrics
of analog and mixed signal circuits. Duplex determines the optimal design, the Pareto
set and the sensitivity of circuit’s performance metrics to its parameters. We …

Improved flop tray-based design implementation for power reduction

Kahng
Andrew B.

Clock network power reduction is critical in modern SoC designs. Application of flop trays (i.e., multi-bit flip-flops) can significantly reduce the number of sinks in a clock
network, and thus reduce the number of clock buffers, clock wirelength, and …

RC-aware global routing

Scheifele
Rudolf

We address the problem of incorporating RC delay constraints into global routing.
In contrast to the usual global routing approach that focuses on minimizing net length
while obeying constraints given by other tools such as layer assignments, our method
…

Scalable, high-quality, SAT-based multi-layer escape routing

Bayless
Sam

Escape routing for Printed Circuit Boards (PCBs) is an important problem arising from
modern packaging with large numbers of densely spaced pins, such as BGAs. Single-layer
escape routing has been well-studied, but large, dense BGAs often require …

Redistribution layer routing for integrated fan-out wafer-level chip-scale packages

Lin
Bo-Qiao

The integrated fan-out (InFO) wafer-level chip-scale package (WLCSP) s an emerging
packaging technology, which typically consists of multiple redistribution layers (RDLs)
for signal redistributions among multiple chips. There is still no published work
…

The architecture value engine: measuring and delivering sustainable SoC improvement

Carballo
Juan-Antonio

The value of semiconductor-based systems continues to increase rapidly especially
when considering the cost associated with building it. As such, Moore’s Law has become
a law associated broadly with value growth instead of pure performance growth. While
…

Circuit valorization in the IC design ecosystem

de Gyvez
José Pineda

Staying at the forefront of research, or in the top tier product market requires circuit
innovation as a key differentiation. We are entering an era where more than Moore
is becoming increasingly evident, not only because of the physical limitations of
…

Interconnect-aware device targeting from PPA perspective

Badaroglu
Mustafa

CMOS scaling so far enabled simultaneous system throughput scaling by concurrent improvements
in delay, power, and area with thanks to Moore’s law. CMOS scaling becomes more difficult
with the limits of interconnect and increasing wafer cost. It is …

Measuring progress and value of IC implementation technology

Kahng
Andrew B.

Over the past decade, “Moore’s Law” has become increasingly well-understood as being
a law of “value scaling”: success of new electronics- and semiconductor-based products
depends on improved cost-efficiency, utility, and value. Design Automation (DA) …

Provably secure camouflaging strategy for IC protection

Li
Meng

The advancing of reverse engineering techniques has complicated the efforts in intellectual
property protection. Proactive methods have been developed recently, among which layout-level
IC camouflaging is the leading example. However, existing …

CamoPerturb: secure IC camouflaging for minterm protection

Yasin
Muhammad

Integrated circuit (IC) camouflaging is a layout-level technique that thwarts reverse
engineering attacks on ICs by introducing camouflaged cells that look alike, but can
implement one of many possible Boolean functions. Existing camouflaging techniques
…

Chip editor: leveraging circuit edit for logic obfuscation and trusted fabrication

Shakya
Bicky

The globalization of the semiconductor foundry business poses grave risks in terms
of intellectual property (IP) protection, especially for critical applications. Over
the past few years, several techniques have been proposed that allow manufacturing
of …

Arbitrary streaming permutations with minimum memory and latency

Koehn
Thaddeus

Streaming architectures are a popular choice for data intensive application due to
their high throughput requirements. When assembling components for a streaming application,
it is often necessary to build translation blocks between them to match the …

Multibank memory optimization for parallel data access in multiple data arrays

Yin
Shouyi

To realize high throughput out of a relatively low bandwidth, memory partitioning
algorithms have been proposed to separate data arrays into multiple memory banks,
from which multiple data can be accessed in parallel. However, previous partitioning
…

Allocation of multi-bit flip-flops in logic synthesis for power optimization

Yi
Dongyoun

In this paper, a new approach to the problem of allocating multi-bit flip-flops for
data storage is presented. Previous approaches divide the allocation problem into
two separate steps: (i) placing single-bit flip-flops under circuit timing constraints
…

Model-based design of resource-efficient automotive control software

Chang
Wanli

Automotive platforms today run hundreds of millions of lines of software code implementing
a large number of different control applications spanning across safety-critical functionality
to driver assistance and comfort-related functions. While such …

Testing automotive embedded systems under X-in-the-loop setups

Tibba
Ghizlane

The development of automotive electronics and software systems is often associated
with high costs due to their multi-domain nature (including control engineering, electronics,
hydraulics, mechanics, etc). The involvement of these different disciplines …

Efficient statistical validation of machine learning systems for autonomous driving

Shi
Weijing

Today’s automotive industry is making a bold move to equip vehicles with intelligent
driver assistance features. A modern automobile is now equipped with a powerful computing
platform to run multiple machine learning algorithms for environment …

CONVINCE: a cross-layer modeling, exploration and validation framework for next-generation connected
vehicles

Zheng
Bowen

Next-generation autonomous and semi-autonomous vehicles will not only precept the
environment with their own sensors, but also communicate with other vehicles and surrounding
infrastructures for vehicle safety and transportation efficiency. The design, …

Overview of the 2016 CAD contest at ICCAD

Huang
Shih-Hsu

The CAD Contest at ICCAD is a challenging, multi-month competition, focusing on advanced,
real-world problems in the field of Electronic Design Automation (EDA). In its fifth
year, the 2016 CAD Contest at ICCAD attracted 135 teams from 11 regions/…

ICCAD-2016 CAD contest in large-scale identical fault search

Wei
Tangent

Injecting faults into designs is a way to qualify a verification environment. To improve
the performance of a qualifying process, we need to remove identical faults. The problem
will provide some faulty design cases; the contestants must identify all …

ICCAD-2016 CAD contest in non-exact projective NPNP boolean matching and benchmark
suite

Wu
Chi-An (Rocky)

Boolean Matching is significant to industry applications, such as library binding,
synthesis, engineer change order, and hardware Trojan detection. Instead of basic
Boolean matching, Non-exact Projective NPNP Boolean Matching allows to match two designs
…

ICCAD-2016 CAD contest in pattern classification for integrated circuit design space
analysis and benchmark suite

Topaloglu
Rasit O.

Layout pattern classification has been utilized in recent years in integrated circuit
design towards various goals such as design space analysis, design rule generation,
and systematic yield optimization. There is a need for open source or academic …

OpenDesign flow database: the infrastructure for VLSI design and design automation
research

Jung
Jinwook

Recently, there have been a slew of design automation contests and released benchmarks.
ISPD place & route contests, DAC placement contests, timing analysis contests at TAU
and CAD contests at ICCAD are good examples in the past and more of new contests …

Malicious LUT: a stealthy FPGA trojan injected and triggered by the design flow

Krieg
Christian

We present a novel type of Trojan trigger targeted at the field-programmable gate
array (FPGA) design flow. Traditional triggers base on rare events, such as rare values
or sequences. While in most cases these trigger circuits are able to hide a Trojan
…

On detecting delay anomalies introduced by hardware trojans

Ismari
D.

A hardware Trojan (HT) detection method is presented that is based on measuring and
detecting small systematic changes in path delays introduced by capacitive loading
effects or series inserted gates of HTs. The path delays are measured using a high
…

An optimization-theoretic approach for attacking physical unclonable functions

Liu
Yuntao

Physical unclonable functions (PUFs) utilize manufacturing ariations of circuit elements
to produce unpredictable response to any challenge vector. The attack on PUF aims
to predict the PUF response to all challenge vectors while only a small number of
…

LRR-DPUF: learning resilient and reliable digital physical unclonable function

Miao
Jin

Conventional silicon physical unclonable function (PUF) extracts fingerprints from
transistor’s analog attributes, which are vulnerable to environmental and operational
variations. Recently, digitalized PUF prototypes have emerged to overcome the …

Enabling online learning in lithography hotspot detection with information-theoretic
feature optimization

Zhang
Hang

With the continuous shrinking of technology nodes, lithography hotspot detection and
elimination in the physical verification phase is of great value. Recently machine
learning and pattern matching based methods have been extensively studied to overcome
…

Incorporating cut redistribution with mask assignment to enable 1D gridded design

Kuang
Jian

1D gridded design is one of the most promising solutions that can enable the scaling
to 10nm technology node and beyond. Line-end cuts are needed to fabricate 1D layouts, where
two techniques are available to resolve the conflicts between cuts: cut …

VCR: simultaneous via-template and cut-template-aware routing for directed self-assembly
technology

Su
Yu-Hsuan

The directed self-assembly (DSA) technology for next-generation lithography has been
shown its great potential for fabricating highly dense via patterns and cut masks
in the sub-5 nm technology node and beyond. However, DSA via and cut optimizations
…

DSA-compliant routing for two-dimensional patterns using block copolymer lithography

Su
Yu-Hsuan

Two-dimensional (2D) directed self-assembly (DSA) is an emerging lithography for the
5 nm process node and beyond that can substantially increase design flexibility in
critical routing layers and reduce the number of cuts for better yield. The state-of-…

The art of semi-formal bug hunting

Nalla
Pradeep Kumar

Verification is a critical task in the development of correct computing systems. Simulation
remains the predominantly used technique to identify design flaws, due to its scalability.
However, simulation intrinsically suffers from low functional coverage,…

Compiled symbolic simulation for systemC

Herdt
Vladimir

Ensuring the correctness of SystemC virtual prototypes is indispensable. For such
models, existing symbolic simulation approaches are based on interpreting their behavior.
In this paper we propose a major enhancement called Compiled Symbolic Simulation (…

Exact diagnosis using boolean satisfiability

Riener
Heinz

We propose an exact algorithm to model-free diagnosis with an application to fault
localization in digital circuits. We assume that a faulty circuit and a correctness
specification, e.g., in terms of an un-optimized reference circuit, are available.
Our …

Efficient and accurate analysis of single event transients propagation using SMT-based
techniques

Hamad
Ghaith Bany

This paper presents a hierarchical framework to model, analyze, and estimate digital
design vulnerability to soft errors due to Single Event Transients (SETs). A new SET
propagation model is proposed. This model simultaneously includes the impact of …

Power delivery in 3D packages: current crowding effects, dynamic IR drop and compensation network using sensors (invited
paper)

Kannan
Sukeshwar

In 3D packages top-die power delivery is a not only limited by back-end of line (technology
scaling), but also by the TSV integration scheme, the stacking method and the microbump
current-carrying capability. The microbump structure and its …

Cost analysis and cost-driven IP reuse methodology for SoC design based on 2.5D/3D
integration

Stow
Dylan

Due to the increasing fabrication and design complexity with new process nodes, the
cost per transistor trend originally identified in Moore’s Law is slowing when using
traditional integration methods. However, emerging die-level integration …

Energy-efficient and reliable 3D network-on-chip (NoC): architectures and optimization
algorithms

Das
Sourav

The Network-on-Chip (NoC) paradigm has emerged as an enabler for integrating a large
number of embedded cores in a single die. Three-dimensional (3D) integration, a breakthrough
technology to achieve “More Moore and More Than Moore,” provides numerous …

The hype, myths, and realities of testing 3D integrated circuits

Wang
Ran

Three-dimensional (3D) integration using through-silicon vias (TSVs) promises higher
integration levels in a single package, keeping pace with Moore’s law. Despite the
promise and benefits offered by 3D integration, testing remains a major obstacle that
…

TASA: toolchain-agnostic static software randomisation for critical real-time systems

Kosmidis
Leonidas

Measurement-Based Probabilistic Timing Analysis (MBPTA) derives WCET estimates for
tasks running on processors comprising high-performance features such as caches. MBPTA’s
correct application requires the system to exhibit certain timing properties, …

Splitting functions in code management on scratchpad memories

Kim
Youngbin

As the number of cores increases, cache-based memory hierarchy is becoming a major
problem in terms of the scalability and energy consumption. Software-managed scratchpad
memories (SPM) is a scalable alternative to caches, but the benefit comes at the …

Adaptive performance prediction for integrated GPUs

Gupta
Ujjwal

Integrated GPUs have become an indispensable component of mobile processors due to
the increasing popularity of graphics applications. The GPU frequency is a key factor
both in application throughput and mobile processor power consumption under graphics
…

Energy-efficient fault tolerance approach for internet of things applications

Xu
Teng

Fault tolerance (FT) is essential in many Internet of Things (IoT) applications, in
particular in the domains such as medical devices and automotive systems where a single
fault in the system can lead to serious consequences. Non-volatile memory (NVM), …

Critical path isolation for time-to-failure extension and lower voltage operation

Masuda
Yutaka

Device miniaturization due to technology scaling has made manufacturing variability
and aging more significant, and lower supply voltage makes circuits sensitive to dynamic
environmental fluctuation. These may shorten the time to failure (TTF) of …

Control synthesis and delay sensor deployment for efficient ASV designs

Li
Chaofan

Adaptive Supply Voltage (ASV) is a power-efficient approach to achieving resilience
against process variation and circuit aging. Fine-grained ASV offers further power-efficiency
gains, but entails relatively complex control circuit, which has not been …

Performance driven routing for modern FPGAs

Kannan
Parivallal

FPGA routing is a well studied problem. Basic point-to-point routing of nets on FPGA
fabrics can be done optimally using well known shortest path algorithms like Dijkstra’s
and A-star. Practical rip-up and reroute algorithms like PathFinder have been …

UTPlaceF: a routability-driven FPGA placer with physical and congestion aware packing

Li
Wuxi

FPGA packing and placement without routability consideration could lead to unroutable
results for high-utilization designs. Conventional FPGA packing and placement approaches
are shown to have severe difficulties to yield good routability. In this paper,…

RippleFPGA: a routability-driven placement for large-scale heterogeneous FPGAs

Pui
Chak-Wa

As the complexity and scale of FPGA circuits grows, resolving routing congestion becomes
more important in FPGA placement. In this paper, we propose a routability-driven placement
algorithm for large-scale heterogeneous FPGAs. Our proposed algorithm …

GPlace: a congestion-aware placement tool for ultrascale FPGAs

Pattison
Ryan

Traditional FPGA flows that wait until the routing stage to tackle congestion are
quickly becoming less effective. This is due to the increasing size and complexity
of FPGA architectures and the designs targeted for them. In this paper, we present
two …

Resiliency in dynamically power managed designs

Lai
Liangzhen

Dynamic power management has become essential for low power designs and systems. Whether
intentionally or unintentionally, these power reduction techniques and corresponding
management schemes can impact the hardware reliability and system resiliency in …

Dynamic reliability management for near-threshold dark silicon processors

Kim
Taeyoung

In this article, we propose a new dynamic reliability management (DRM) techniques
at the system level for emerging low power dark silicon manycore microprocessors operating
in near-threshold region. We mainly consider the electromigration (EM) failures. …

A cross-layer approach for resiliency and energy efficiency in near threshold computing

Golanbari
M. S.

Energy constrained systems become the cornerstone of emerging energy harvested or
battery-limited applications in Internet of Thing (IoT) platforms. A promising approach
is to operate at near threshold voltage ranges, which can significantly reduce …

Design space exploration of drone infrastructure for large-scale delivery services

Park
Sangyoung

Drones, also referred to as unmanned aerial vehicles (UAVs), are recently expanding
their field of usage beyond military surveillance and tactical applications. Commercial
drone delivery service is one of the promising applications in the near future, …

Multi-objective design optimization for flexible hybrid electronics

Bhat
Ganapati

Flexible systems that can conform to any shape are desirable for wearable applications.
Over the past decade, there have been tremendous advances in the domain of flexible
electronics which enabled printing of devices, such as sensors on a flexible …

KCAD: kinetic cyber-attack detection method for cyber-physical additive manufacturing systems

Chhetri
Sujit Rokka

Additive Manufacturing (AM) uses Cyber-Physical Systems (CPS) (e.g., 3D Printers)
that are vulnerable to kinetic cyber-attacks. Kinetic cyber-attacks cause physical
damage to the system from the cyber domain. In AM, kinetic cyber-attacks are realized
by …

Autonomous sensor-context learning in dynamic human-centered internet-of-things environments

Rokni
Seyed Ali

Human-centered Internet-of-Things (IoT) applications utilize computational algorithms
such as machine learning and signal processing techniques to infer knowledge about
important events such as physical activities and medical complications. The …

Formulating customized specifications for resource allocation problem of distributed
embedded systems

Zhang
Xinhai

There are plentiful attempts for increasing the efficiency, generality and optimality
of the Design Space Exploration (DSE) algorithms for resource allocation problems
of distributed embedded systems. Most contemporary approaches formulate DSE as an
…

A polyhedral model-based framework for dataflow implementation on FPGA devices of
iterative stencil loops

Natale
Giuseppe

Iterative Stencil Loops (ISLs) are a specific class of algorithms of great importance
for their substantial presence in a lot of industrial and scientific computing applications,
such as in numerical methods for solving partial differential equation — …

Efficient memory compression in deep neural networks using coarse-grain sparsification
for speech applications

Kadetotad
Deepak

Recent breakthroughs in deep neural networks have led to the proliferation of its
use in image and speech applications. Conventional deep neural networks (DNNs) are
fully-connected multi-layer networks with hundreds or thousands of neurons in each
…

Parallel code-specific CPU simulation with dynamic phase convergence modeling for
HW/SW co-design

Kemmerer
Warren

While SystemC models provide a promising solution to the complex problem of HW/SW
co-design within the system-on-chip paradigm, such requires a detailed annotation
of transaction level energy and performance data within the model. While this data
can be …

Architectural-space exploration of approximate multipliers

Rehman
Semeen

This paper presents an architectural-space exploration methodology for designing approximate
multipliers. Unlike state-of-the-art, our methodology generates various design points
by adapting three key parameters: (1) different types of elementary …

Design of power-efficient approximate multipliers for approximate artificial neural
networks

Mrazek
Vojtech

Artificial neural networks (NN) have shown a significant promise in difficult tasks
like image classification or speech recognition. Even well-optimized hardware implementations
of digital NNs show significant power consumption. It is mainly due to non-…

Automated error prediction for approximate sequential circuits

Kapare
Amrut

Synthesis tools for approximate sequential circuits require the ability to quickly,
efficiently, and automatically characterize and bound the errors produced by the circuits.
Previous approaches to characterize errors in approximate sequential circuits …

Approximation-aware rewriting of AIGs for error tolerant applications

Chandrasekharan
Arun

Approximation circuits offer superior performance (speed and area) compared to traditional
circuits at the cost of computational accuracy. The accuracy of the results in approximation
circuits is evaluated based on several error metrics such as worst-…

Properties first? a new design methodology for hardware, and its perspectives in safety
analysis

Urdahl
Joakim

This paper discusses the possible role of formal verification techniques in system-level
design flows. It is argued that the role of formal verification techniques should
not be limited to “bug hunting” alone. Instead, formal technology should be …

Where formal verification can help in functional safety analysis

Bernardini
Alessandro

Formal techniques seem to be a way to cope with the exploding complexity of functional
safety analysis. Here, the overall fault propagation probability to a certain safety-point
in the design must be analyzed. As a consequence, the careful verification …

Formal approaches to design of active cell balancing architectures in battery management
systems

Steinhorst
Sebastian

Large battery packs composed of Lithium-Ion cells are continuously gaining in importance
due to their applications in Electric Vehicles (EVs) and smart energy grids. To ensure
maximum lifetime, safety and performance of the battery pack, complex …

How much cost reduction justifies the adoption of monolithic 3D ICs at 7nm node?

Ku
Bon Woong

In this paper we study power, performance, and cost (PPC) tradeoffs for 2-tier, gate-level,
full-chip GDS monolithic 3D ICs (M3D) built using a foundry-grade 7nm bulk FinFET
technology. We first develop highly-accurate wafer and die cost models for 2D …

A novel unified dummy fill insertion framework with SQP-based optimization method

Tao
Yudong

Dummy fill insertion is widely applied to significantly improve the planarity of topographic
patterns for chemical mechanical polishing process in VLSI manufacture. However, these
dummies will lead to additional parasitic capacitance and deteriorate the …

Efficient yield estimation through generalized importance sampling with application
to NBL-assisted SRAM bitcells

Ciampolini
Lorenzo

We consider the general problem of the efficient and accurate determination of the
yield of an integrated circuit, through electrical circuit level simulation, under
variability constraints due to the manufacturing process. We demonstrate the …

Are proximity attacks a threat to the security of split manufacturing of integrated
circuits?

Magaña
Jonathon

Split manufacturing is a technique that allows manufacturing the transistor-level
and lower metal layers of an IC at a high-end, untrusted foundry, while manufacturing
only the higher metal layers at a smaller, trusted foundry. Using split manufacturing
…

Making split-fabrication more secure

Yang
Ping-Lin

Today many design houses must outsource their design fabrication to a third party
which is often an overseas foundry. Split-fabrication is proposed for combining the
FEOL capabilities of an advanced but untrusted foundry with the BEOL capabilities
of a …

A machine learning approach to fab-of-origin attestation

Ahmadi
Ali

We introduce a machine learning approach for distinguishing between integrated circuits
fabricated in a ratified facility and circuits originating from an unknown or undesired
source based on parametric measurements. Unlike earlier approaches, which …

OpenRAM: an open-source memory compiler

Guthaus
Matthew R.

Computer systems research is often inhibited by the availability of memory designs.
Existing Process Design Kits (PDKs) frequently lack memory compilers, while expensive
commercial solutions only provide memory models with immutable cells, limited …

A hardware-based technique for efficient implicit information flow tracking

Shin
Jangseop

To access sensitive information, some recent advanced attacks have been successful
in exploiting implicit flows in a program in which sensitive data affects the control
path and in turn affects other data. To track the sensitive data through implicit
…

Imprecise security: quality and complexity tradeoffs for hardware information flow tracking

Hu
Wei

Secure hardware design is a challenging task that goes far beyond ensuring functional
correctness. Important design properties such as non-interference cannot be verified
on functional circuit models due to the lack of essential information (e.g., …

Encasing block ciphers to foil key recovery attempts via side channel

Agosta
Giovanni

Providing efficient protection against energy consumption based side channel attacks
(SCAs) for block ciphers is a relevant topic for the research community, as current
overheads are in the 100x range. Unprofiled SCAs exploit information leakage from
…

Security of neuromorphic computing: thwarting learning attacks using memristor’s obsolescence effect

Yang
Chaofei

Neuromorphic architectures are widely used in many applications for advanced data
processing, and often implements proprietary algorithms. In this work, we prevent
an attacker with physical access from learning the proprietary algorithm implemented
by …

Generation and use of statistical timing macro-models considering slew and load variability

Sinha
Debjit

Timing macro-modeling captures the timing characteristics of a circuit in a compact
form for use in a hierarchical timing environment. At the same time, statistical timing
provides coverage of the impact from variability sources with the goal of …

TinySPICE plus: scaling up statistical SPICE simulations on GPU leveraging shared-memory based sparse
matrix solution techniques

Han
Lengfei

TinySPICE was a SPICE simulator on GPU developed to achieve dramatic speedups in statistical
simulations of small nonlinear circuits, such as standard cell designs and SRAMs.
While TinySPICE can perform circuit simulations much faster than traditional …

PieceTimer: a holistic timing analysis framework considering setup/hold time interdependency using
a piecewise model

Zhang
Grace Li

In static timing analysis, clock-to-q delays of flip-flops are considered as constants.
Setup times and hold times are characterized separately and also used as constants.
The characterized delays, setup times and hold times, are applied in timing …

A fast layer elimination approach for power grid reduction

Yassine
Abdul-Amir

Simulation and verification of the on-die power delivery network (PDN) is one of the
key steps in the design of integrated circuits (ICs). With the very large sizes of
modern grids, verification of PDNs has become very expensive and a host of techniques
…

A deterministic approach to stochastic computation

Jenson
Devon

Stochastic logic performs computation on data represented by random bit streams. The
representation allows complex arithmetic to be performed with very simple logic, but
it suffers from high latency and poor precision. Furthermore, the results are …

Control-fluidic CoDesign for paper-based digital microfluidic biochips

Wang
Qin

Paper-based digital microfluidic biochips (P-DMFBs) have recently emerged as a promising
low-cost and fast-responsive platform for biochemical assays. In P-DMFBs, electrodes
and control lines are printed on a piece of photo paper using inkjet printer …

Neural networks designing neural networks: multi-objective hyper-parameter optimization

Smithson
Sean C.

Artificial neural networks have gone through a recent rise in popularity, achieving
state-of-the-art results in various fields, including image classification, speech
recognition, and automated control. Both the performance and computational complexity
…

Error recovery in a micro-electrode-dot-array digital microfluidic biochip?

Li
Zipeng

A digital microfluidic biochip (DMFB) is an attractive technology platform for automating
laboratory procedures in biochemistry. However, today’s DMFBs suffer from several
limitations: (i) constraints on droplet size and the inability to vary droplet …

Privacy protection via appliance scheduling in smart homes

Wu
Jie

Smart grid, managed by intelligent devices, have demonstrated great potentials to
help residential customers to optimally schedule and manage the appliances’ energy
consumption. Due to the fine-grained power consumption information collected by smart
…

Framework designs to enhance reliable and timely services of disaster management systems

Shih
Chi-Sheng

How to tolerate fault is a fundamental requirement to the designs of many cyber-physical
systems. Devices or sensors might have different requirements on their levels of reliability
and/or timely services in the composition of a cyber-physical system. …

Analysis of production data manipulation attacks in petroleum cyber-physical systems

Chen
Xiaodao

Petroleum Cyber-Physical System (CPS) marks the beginning of a new chapter of the
oil and gas industry. Combining vast computational power with intelligent Computer
Aided Design (CAD) algorithms, petroleum CPS is capable of precisely modeling the
flow …

Security challenges in smart surveillance systems and the solutions based on emerging
nano-devices

Yang
Chaofei

Modern smart surveillance systems can not only record the monitored environment but
also identify the targeted objects and detect anomaly activities. These advanced functions
are often facilitated by deep neural networks, achieving very high accuracy …

Fast physics-based electromigration checking for on-die power grids

Chatterjee
Sandeep

Due to technology scaling, electromigration (EM) signoff has become increasingly difficult,
mainly due to the use of inaccurate methods for EM assessment, such as the empirical
Black’s model. In this paper, we present a novel approach for EM checking …

Exploring aging deceleration in FinFET-based multi-core systems

Cai
Ermao

Power and thermal issues are the main constraints for highperformance multi-core systems.
As the current technology of choice, FinFET is observed to have lower delay under
higher temperature in super-threshold voltage region, an effect called …

An efficient and accurate algorithm for computing RC current response with applications
to EM reliability evaluation

Guan
Zhong

In this paper, we propose a current waveform estimation algorithm for signal lines
without the necessity of SPICE simulation. Unlike previous methods, we do not use
function fitting or compute the effective capacitance. Instead, the proposed algorithm
…

Voltage-based electromigration immortality check for general multi-branch interconnects

Sun
Zeyu

As VLSI technology features are pushed to the limit with every generation and with
the introduction of new materials and increased current densities to satisfy the performance
demands, Electromigration (EM) is projected to be a key reliability issue for …

Exploiting randomness in sketching for efficient hardware implementation of machine
learning applications

Wang
Ye

Energy-efficient processing of large matrices for big-data applications using hardware
acceleration is an intense area of research. Sketching of large matrices into their
lower-dimensional representations is an effective strategy. For the first time, …

Making neural encoding robust and energy efficient: an advanced analog temporal encoder for brain-inspired computing systems

Zhao
Chenyuan

Neural encoder is one of the key components in neuromorphic computing systems, whereby
sensory information is transformed into spike coded trains. The design of temporal
encoder has attracted a widespread attention in the field of neuromorphic computing
…

Statistical methodology to identify optimal placement of on-chip process monitors
for predicting fmax

Mu
Szu-Pang

In previous literatures, many approaches use ring oscillators or other process monitors
to correlate the chip’s maximum operating frequency (F_max). But none of them focus on the placement of these on-chip process monitors (OPMs)
on a chip. The placement …

BugMD: automatic mismatch diagnosis for bug triaging

Mammo
Biruk

System-level validation is the most challenging phase of design verification. A common
methodology in this context entails simulating the design under validation in lockstep
with a high-level golden model, while comparing the architectural state of the …

ODESY: a novel 3T-3MTJ cell design with optimized area DEnsity, scalability and latencY

Xue
Linuo

The STT-RAM (Spin-Transfer Torque Magnetic RAM) technology is a promising candidate
for cache memory because of its high density, low standy-power, and non-volatility.
As technology scales, especially under 40nm technology node, the read disturbance
…

Delay-optimal technology mapping for in-memory computing using ReRAM devices

Bhattacharjee
Debjyoti

Recent propositions of diverse In-Memory Computing platforms have shown a promising
alternative to classical Von Neumann computing models. Significant benefits, in terms
of energy-efficiency and performance, are reported for in-memory arithmetic …

Reconfigurable in-memory computing with resistive memory crossbar

Zha
Yue

Driven by recent advances in resistive random-access memory (RRAM), there have been
growing interests in exploring alternative computing concept, i.e., in-memory processing,
to address the classical von Neumann bottlenecks. Despite of their great …

Exploiting ferroelectric FETs for low-power non-volatile logic-in-memory circuits

Yin
Xunzhao

Numerous research efforts are targeting new devices that could continue performance
scaling trends associated with Moore’s Law and/or accomplish computational tasks with
less energy. One such device is the ferroelectric FET (FeFET), which offers the …

Approximation knob: power capping meets energy efficiency

Kanduri
Anil

Power Capping techniques are used to restrict power consumption of computer systems
to a thermally safe limit. Current many-core systems employ dynamic voltage and frequency
scaling (DVFS), power gating (PG) and scheduling methods as actuators for power …

IC thermal analyzer for versatile 3-D structures using multigrid preconditioned krylov
methods

Ladenheim
Scott

Thermal analysis is crucial for determining the propagation of heat and tracking the
formation of hot spots in advanced integrated circuit technologies. At the core of
the thermal analysis for integrated circuits is the numerical solution of the heat
…

BoostNoC: power efficient network-on-chip architecture for near threshold computing

Rajamanikkam
Chidhambaranathan

While near threshold design space provides a promising approach towards energy-efficient
computing, it is plagued by sub-optimal performance. Application characteristics and
hardware non-idealities of conventional architectures (optimized for the …

QScale: thermally-efficient QoS management on heterogeneous mobile platforms

Sahin
Onur

Single-ISA heterogeneous mobile processors integrate low-power and power-hungry CPU
cores together to combine energy efficiency with high performance. While running computationally
demanding applications, current power management and scheduling …

Synthesis of statically analyzable accelerator networks from sequential programs

Cheng
Shaoyi

This paper describes a general framework for transforming a sequential program into
a network of processes, which are then converted to hardware accelerators through
high level synthesis. Also proposed is a complementing technique for performing static
…

Joint loop mapping and data placement for coarse-grained reconfigurable architecture
with multi-bank memory

Yin
Shouyi

Coarse-Grained Reconfigurable Architecture (CGRA) is a promising architecture with
high performance, high power-efficiency and attraction of flexibility. The compute-intensive
parts of an application (e.g. loops) are often mapped onto CGRA for …

Efficient synthesis of graph methods: a dynamically scheduled architecture

Minutoli
Marco

RDF databases naturally map to a graph representation and employ languages, such as
SPARQL, that implements queries as graph pattern matching routines. Graph methods
exhibit an irregular behavior: they present unpredictable, fine-grained data accesses,
…

Tier partitioning strategy to mitigate BEOL degradation and cost issues in monolithic
3D ICs

Samal
Sandeep Kumar

In this paper, we develop tier partitioning strategy to mitigate back-end-of-line
(BEOL) interconnect delay degradation and cost issues in monolithic 3D ICs (M3D).
First, we study the routing overhead and delay degradation caused by tungsten BEOL
…

Cascade2D: A design-aware partitioning approach to monolithic 3D IC with 2D commercial tools

Chang
Kyungwook

Monolithic 3D IC (M3D) can continue to improve power, performance, area and cost beyond
traditional Moore’s law scaling limitations by leveraging the third-dimension and
fine-grained monolithic inter-tier vias (MIVs). Several recent studies present …

SAINT: handling module folding and alignment in fixed-outline floorplans for 3D ICs

Lin
Jai-Ming

Three-dimensional integrated circuits (3D ICs) offer significant improvements over
two-dimensional circuits in several aspects. Classic 3D floorplanning algorithm places
each module at one single die. However, power consumption and wirelength of a 3D IC
…

From biochips to quantum circuits: computer-aided design for emerging technologies

Wille
Robert

While previous decades have witnessed impressive accomplishments in the design and
realization of conventional computing devices, physical boundaries and cost restrictions
led to an increasing interest in alternative technologies (often referred to as …

Multilevel design understanding: from specification to logic invited paper

Ray
Sandip

We present an outline of the field of Multilevel Design Understanding by first defining
and motivating the related problems, and then describing the key issues which must
be addressed in future research.