How sensitive is processor customization to the workload’s input datasets?

Maximilien Breughe; Zheng Li; Yang Chen; Stijn Eyerman; Olivier Temam; Chengyong Wu; Lieven Eeckhout

2011 Symposium on Application Specific Processors | June 2011

Published by IEEE

Hardware customization is an effective approach for meeting application performance requirements while achieving high levels of energy efficiency. Application-specific processors achieve high performance at low energy by tailoring their designs towards a specific workload, i.e., an application or application domain of interest. A fundamental question that has so far remained unanswered, though, is to what extent processor customization is sensitive to the training workload’s input datasets. Current practice is to consider a single or only a few input datasets per workload during the processor design cycle, because simulation is prohibitively time-consuming, which rules out considering a large number of datasets. This paper addresses this question for the first time. In order to perform the large number of runs required to answer it in a reasonable amount of time, we first propose a mechanistic analytical model, built from first principles, that is accurate to within 3.6% on average across a broad design space. The analytical model is at least 4 orders of magnitude faster than detailed cycle-accurate simulation for design space exploration. Using the model, we study how sensitive the optimum customized processor architecture is to a workload’s input dataset. Considering MiBench benchmarks and 1000 datasets per benchmark, we conclude that processor customization is largely dataset-insensitive. This has an important practical implication: a single or only a few datasets are sufficient for determining the optimum processor architecture when designing application-specific processors.