|
Selected Publications
(see
my CV for a full list of publications)
|
|
Channel-wise autoregressive entropy models for learned image compression
David Minnen and Saurabh Singh
Int. Conf. on Image Processing (ICIP) 2020
Better runtime and
rate-distortion performance for learned image
compression. We improve the entropy model with
latent residual prediction and channel-wise
conditioning instead of spatial context.
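
A minimal numpy sketch of the idea (the slice
count, the statistics-based parameter predictor,
and the Gaussian codelength are illustrative
assumptions, not the paper's architecture): each
channel slice is coded conditioned on the slices
decoded before it, with quantization applied to
the residual against the predicted mean.

    import numpy as np

    rng = np.random.default_rng(0)

    def entropy_params(hyper_feat, decoded_slices):
        # Toy stand-in for the learned parameter network: in the paper
        # this is a conv net; here (mean, scale) come from simple
        # statistics of the conditioning signal so the sketch runs.
        ctx = np.concatenate([hyper_feat] + decoded_slices, axis=-1)
        mu = ctx.mean(axis=-1, keepdims=True)
        sigma = ctx.std(axis=-1, keepdims=True) + 1e-3
        return mu, sigma

    def code_slices(latent, hyper_feat, num_slices=4):
        # Channel-wise conditioning: quantize the latent one channel
        # slice at a time; slice i is conditioned on slices 0..i-1, so
        # decoding is serial over slices but parallel across space.
        decoded, bits = [], 0.0
        for y in np.split(latent, num_slices, axis=-1):
            mu, sigma = entropy_params(hyper_feat, decoded)
            y_hat = mu + np.round(y - mu)   # latent residual prediction
            # ideal codelength under a Gaussian density approximation
            z = (y_hat - mu) / sigma
            bits += np.sum(0.5 * z**2 / np.log(2)
                           + np.log2(sigma * np.sqrt(2 * np.pi)))
            decoded.append(y_hat)
        return np.concatenate(decoded, axis=-1), bits

    latent = rng.normal(size=(8, 8, 16))
    hyper_feat = rng.normal(size=(8, 8, 4))
    y_hat, bits = code_slices(latent, hyper_feat)
    print(y_hat.shape, round(float(bits), 1))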
|
|
Nonlinear Transform Coding
Johannes Ballé, Philip A. Chou, David Minnen, Saurabh Singh, Nick Johnston, Eirikur Agustsson, Sung Jin Hwang, and George Toderici
IEEE Journal of Selected Topics in Signal Processing
(STSP 2020, under review)
A review of learned image
compression framed as nonlinear transform coding
(NTC). This paper analyzes rate-distortion
performance using simple sources and natural
images, introduces a novel variant of
entropy-constrained vector quantization and a
method for learning multi-rate models, and
analyzes different stochastic optimization
techniques for compression models.
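
As a quick illustration of the NTC framing, here
is a toy rate-distortion objective with the
additive-uniform-noise relaxation commonly used to
train such models (the scalar tanh transform pair
and fixed Gaussian prior are assumptions for the
sketch, not the paper's models):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy scalar transform pair; real NTC models use deep conv nets.
    def g_a(x, a):  # analysis transform
        return np.tanh(a * x)

    def g_s(y, a):  # synthesis transform (inverse on the valid range)
        return np.arctanh(np.clip(y, -0.999, 0.999)) / a

    def ntc_loss(x, a, lam, sigma=1.0):
        # R + lam * D with the usual training relaxation: quantization
        # replaced by additive uniform noise, rate measured against a
        # fixed Gaussian prior (an assumption for this sketch).
        y_tilde = g_a(x, a) + rng.uniform(-0.5, 0.5, size=x.shape)
        rate = np.mean(0.5 * (y_tilde / sigma) ** 2 / np.log(2)
                       + np.log2(sigma * np.sqrt(2 * np.pi)))
        dist = np.mean((x - g_s(y_tilde, a)) ** 2)  # MSE distortion
        return rate + lam * dist

    x = rng.normal(size=10_000)
    for a in (0.5, 1.0, 2.0):   # steeper transform = coarser code
        print(a, round(ntc_loss(x, a, lam=0.1), 3))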
|
|
Scale-space flow for end-to-end optimized video compression
Eirikur Agustsson, David Minnen, Nick Johnston, Johannes Ballé, Sung Jin Hwang, and George Toderici
Computer Vision and Pattern Recognition (CVPR) 2020
Learns "compressible flow"
within an end-to-end optimized model for video
compression. Optical flow is typically a 2D vector
field representing motion. We generalize this to a
3D representation that holds spatial offsets plus
a scale-space parameter. Larger scale values lead
to more blurring before warping. The model learns
to predict a small scale coupled with accurate
flow and a large scale when accurate flow is not
possible (or is too expensive to code relative to
the target bit rate).
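
A toy numpy sketch of scale-space warping (the box
blur, nearest-neighbor sampling, and array shapes
are simplifications; the paper uses Gaussian
blurring and trilinear interpolation):

    import numpy as np

    def blur_pyramid(img, levels=4):
        # Scale-space volume: level 0 is the source frame, each further
        # level is progressively more blurred (repeated box blur here;
        # the paper uses Gaussian blurring).
        vol = [img]
        for _ in range(levels - 1):
            p = np.pad(vol[-1], 1, mode="edge")
            blurred = sum(p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
                          for dy in range(3) for dx in range(3)) / 9.0
            vol.append(blurred)
        return np.stack(vol)               # shape (levels, H, W)

    def scale_space_warp(vol, flow):
        # 3-D "scale-space flow": per pixel (dx, dy, scale). Confident
        # motion uses scale ~ 0 (the sharp frame); uncertain motion
        # uses a large scale to sample a blurred frame instead of
        # committing to a wrong offset.
        L, H, W = vol.shape
        ys, xs = np.mgrid[0:H, 0:W]
        sx = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
        sy = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
        ss = np.clip(np.round(flow[..., 2]).astype(int), 0, L - 1)
        return vol[ss, sy, sx]

    rng = np.random.default_rng(0)
    frame = rng.random((32, 32))
    flow = np.zeros((32, 32, 3))
    flow[..., 0] = 2       # shift everything two pixels right
    flow[16:, :, 2] = 3    # bottom half: uncertain motion, heavy blur
    pred = scale_space_warp(blur_pyramid(frame), flow)
    print(pred.shape)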
|
|
Integer networks for data compression with latent-variable models
Johannes Ballé, Nick Johnston, and David Minnen
Int. Conf. on Learning Representations (ICLR) 2019
Avoids floating point
non-determinism for entropy model parameters
predicted by deep networks. Non-determinism
typically doesn't matter for deep networks, but it
is catastrophic for entropy coding.
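
A sketch of the determinism argument, assuming a
toy integer "dense layer" (the paper's integer
networks also define integer nonlinearities and a
training procedure, omitted here): integer
accumulation plus a bit-shift rescale is bit-exact
on every platform, so encoder and decoder always
reconstruct identical entropy model parameters.

    import numpy as np

    def integer_dense(x, w, b, shift, qmax=255):
        # Purely integer layer: exact int64 accumulation, then a
        # bit-shift rescale and a clip. Unlike float math, every
        # platform computes bit-identical output, so the decoder's
        # entropy model always matches the encoder's exactly.
        acc = x.astype(np.int64) @ w.astype(np.int64) + b
        return np.clip(acc >> shift, 0, qmax).astype(np.int32)

    rng = np.random.default_rng(0)
    x = rng.integers(0, 256, size=(1, 8))
    w = rng.integers(-128, 128, size=(8, 4))
    b = rng.integers(-4096, 4096, size=4)
    print(integer_dense(x, w, b, shift=8))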
|
|
Joint autoregressive and hierarchical priors for learned image compression
David Minnen, Johannes Ballé, and George Toderici
Advances in Neural Information Processing Systems (NeurIPS) 2018
Combines a hyperprior with spatial context to
improve entropy modeling for learned image
compression. By mixing forward and
backward-adaptation, we achieve a new
state-of-the-art for rate-distortion performance
with neural compression models.
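
A toy numpy sketch of the joint prior (the
parameter fusion, context size, and Gaussian
codelength are placeholder assumptions, not the
paper's network): each latent is coded with
parameters derived from a causal spatial context
plus hyperprior side information.

    import numpy as np

    rng = np.random.default_rng(0)

    def entropy_params(ctx, hyper):
        # Toy stand-in for the learned fusion network: blends backward
        # adaptation (context) with forward adaptation (hyperprior).
        mu = 0.5 * ctx.mean() + 0.5 * hyper[0]
        sigma = np.exp(np.clip(hyper[1], -3, 3)) + 1e-3
        return mu, sigma

    def causal_neighbors(y_hat, i, j, k=2):
        # Already-decoded neighbors in raster order: the support of a
        # masked convolution providing spatial context.
        vals = []
        for di in range(-k, 1):
            for dj in range(-k, k + 1):
                if (di, dj) >= (0, 0):   # current/future: excluded
                    continue
                ii, jj = i + di, j + dj
                if 0 <= ii < y_hat.shape[0] and 0 <= jj < y_hat.shape[1]:
                    vals.append(y_hat[ii, jj])
        return np.array(vals) if vals else np.zeros(1)

    def code_joint(y, hyper):
        # Decoding is serial in raster order because of the
        # autoregressive context.
        H, W = y.shape
        y_hat = np.zeros_like(y)
        bits = 0.0
        for i in range(H):
            for j in range(W):
                mu, sigma = entropy_params(
                    causal_neighbors(y_hat, i, j), hyper[i, j])
                y_hat[i, j] = mu + np.round(y[i, j] - mu)
                z = (y_hat[i, j] - mu) / sigma
                bits += (0.5 * z * z / np.log(2)
                         + np.log2(sigma * np.sqrt(2 * np.pi)))
        return y_hat, bits

    y = rng.normal(size=(8, 8))
    hyper = rng.normal(size=(8, 8, 2))
    y_hat, bits = code_joint(y, hyper)
    print(round(float(bits), 1))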
|
|
Image-dependent local entropy models for image compression with deep networks
David Minnen, George Toderici, Saurabh Singh, Sung Jin Hwang, and Michele Covell
Int. Conf. on Image Processing (ICIP) 2018
We learn a dictionary of
entropy models and allow the encoder to select the
best distribution for each channel and each
spatial tile. If none of the distributions match
the local data, the encoder transmits a custom
histogram. This spatially local and
image-dependent modeling improves rate-distortion
performance over earlier models and avoids
floating point non-determinism that can break
entropy models predicted on the fly.
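
A sketch of the per-tile selection rule, with a
hypothetical codelength_bits helper and a made-up
histogram overhead constant (the real costs come
from the arithmetic coder and the learned models):

    import numpy as np

    rng = np.random.default_rng(0)

    def codelength_bits(tile, pmf, bins):
        # Ideal arithmetic-coding cost of a tile under a discrete model.
        idx = np.clip(np.digitize(tile, bins) - 1, 0, len(pmf) - 1)
        return -np.sum(np.log2(pmf[idx]))

    def choose_model(tile, dictionary, bins, histogram_overhead=256.0):
        # Try every distribution in the learned dictionary and keep the
        # cheapest; if coding the tile under its own (transmitted,
        # smoothed) histogram wins even after paying the overhead of
        # sending that histogram, signal a custom model instead.
        costs = [codelength_bits(tile, pmf, bins) for pmf in dictionary]
        best = int(np.argmin(costs))
        counts, _ = np.histogram(tile, bins=bins)
        own_pmf = (counts + 1) / (counts + 1).sum()
        own_cost = codelength_bits(tile, own_pmf, bins) + histogram_overhead
        if own_cost < costs[best]:
            return "custom", own_cost
        return best, costs[best]

    bins = np.linspace(-4, 4, 17)          # 16 symbol bins
    centers = 0.5 * (bins[:-1] + bins[1:])
    dictionary = [np.exp(-0.5 * (centers / s) ** 2) for s in (0.5, 1.0, 2.0)]
    dictionary = [p / p.sum() for p in dictionary]
    tile = rng.normal(scale=1.1, size=(16, 16))
    print(choose_model(tile, dictionary, bins))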
|
|
Variational image compression with a scale hyperprior
Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, and Nick Johnston
Int. Conf. on Learning Representations (ICLR) 2018
This model is the first to
introduce a hyperprior for end-to-end
optimized image compression with deep
networks. The model learns a non-linear transform
from pixels to a quantized latent space, which is
jointly optimized with a hyperprior that predicts
the parameters of the entropy model used to code
the latents.
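
A toy numpy sketch of the hyperprior mechanism
(the block-statistics hyper-encoder and decoder
below are placeholder stand-ins for the learned
transforms, not the paper's networks):

    import numpy as np

    rng = np.random.default_rng(0)

    def hyper_encode(y, block=4):
        # Toy hyper-encoder: summarize each latent block by its log
        # standard deviation; the real one is a learned conv network.
        H, W = y.shape
        s = y.reshape(H // block, block, W // block, block).std(axis=(1, 3))
        return np.round(np.log(s + 1e-3))   # quantized side information

    def hyper_decode(z_hat, block=4):
        # Toy hyper-decoder: expand the side info into a per-element
        # scale field for the Gaussian entropy model over the latents.
        return np.exp(z_hat).repeat(block, 0).repeat(block, 1)

    scales = rng.uniform(0.2, 3.0, size=(4, 4)).repeat(4, 0).repeat(4, 1)
    y = rng.normal(size=(16, 16)) * scales   # nonstationary latent
    sigma = hyper_decode(hyper_encode(y))
    y_hat = np.round(y)
    bits = np.sum(0.5 * (y_hat / sigma) ** 2 / np.log(2)
                  + np.log2(sigma * np.sqrt(2 * np.pi)))
    print(sigma.shape, round(float(bits), 1))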
|
|
Improved lossy image compression with priming and spatially adaptive bit rates for recurrent networks
Nick Johnston, Damien
Vincent, David Minnen, Michele
Covell, Saurabh Singh, Troy Chinen, Sung Jin
Hwang, Joel Shor, and George Toderici
Computer Vision and Pattern Recognition (CVPR) 2018
Our team's best compression
network based on recurrent neural networks. I
developed the spatially adaptive bit rate (SABR)
component that allowed the encoder to adapt the
local bit rate to the image content, which
improves overall rate-distortion performance.
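
A sketch of the SABR idea under simplifying
assumptions (uniform tiles, an MSE quality target,
and stand-in reconstructions in place of the
recurrent network's actual progressive outputs):

    import numpy as np

    rng = np.random.default_rng(0)

    def sabr_allocation(img, recon_per_step, tile=8, target_mse=0.002):
        # For each tile, keep only as many progressive-coding
        # iterations as needed to hit the quality target, instead of a
        # uniform count everywhere; the per-tile step count is the
        # local bit allocation.
        H, W = img.shape
        steps = np.zeros((H // tile, W // tile), dtype=int)
        for i in range(0, H, tile):
            for j in range(0, W, tile):
                ref = img[i:i + tile, j:j + tile]
                for t, rec in enumerate(recon_per_step):
                    steps[i // tile, j // tile] = t + 1
                    err = np.mean((ref - rec[i:i + tile, j:j + tile]) ** 2)
                    if err <= target_mse:
                        break
        return steps

    img = rng.random((32, 32))
    # Stand-in for the RNN's progressive outputs: each iteration
    # refines the reconstruction (noise shrinks with the step index).
    recons = [img + rng.normal(scale=0.2 / (t + 1), size=img.shape)
              for t in range(8)]
    print(sabr_allocation(img, recons))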
|
|
Spatially adaptive image compression using a tiled deep network
David Minnen, George Toderici,
Michele Covell, Troy Chinen, Nick Johnston, Joel
Shor, Sung Jin Hwang, Damien Vincent, and Saurabh
Singh
Int. Conf. on Image Processing (ICIP) 2017
Deep neural networks are
used for image intra-prediction. Each tile is
predicted from neighboring tiles in the causal
context, and then the residual is coded
separately. By using a progressive model based on
recurrent networks, the encoder can spatially
adapt the bit rate to improve the overall
rate-distortion performance.
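
A toy version of the tile-based pipeline (the
mean-of-neighbors predictor and uniform residual
quantizer are placeholders for the learned
networks in the paper):

    import numpy as np

    def predict_tile(recon, i, j, t):
        # Toy intra-predictor: average the reconstructed tiles above
        # and to the left; the paper learns this with a deep network.
        refs = []
        if i >= t:
            refs.append(recon[i - t:i, j:j + t])
        if j >= t:
            refs.append(recon[i:i + t, j - t:j])
        return np.mean(refs, axis=0) if refs else np.zeros((t, t))

    def code_image(img, t=8, q=0.05):
        # Raster-scan the tiles, predict each from its causal context,
        # then quantize and code only the residual.
        H, W = img.shape
        recon = np.zeros_like(img)
        residual_energy = 0.0
        for i in range(0, H, t):
            for j in range(0, W, t):
                pred = predict_tile(recon, i, j, t)
                res_q = np.round((img[i:i + t, j:j + t] - pred) / q) * q
                recon[i:i + t, j:j + t] = pred + res_q
                residual_energy += float(np.sum(res_q ** 2))
        return recon, residual_energy

    img = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))  # smooth test image
    recon, energy = code_image(img)
    print(round(float(np.mean((img - recon) ** 2)), 6), round(energy, 3))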
|