Table of Contents

- Next meeting:
- Paper stack:
- Past meetings:
- Spring 2021
- Fall 2020
- Summer 2020
- Spring 2020
- Fall 2019
- Summer 2019
- Spring 2019
- Fall 2018
- Summer 2018
- Spring 2018
- Fall 2017
- Summer 2017
- Spring 2017
- IAP 2017
- Fall 2016
- Summer 2016
- Spring 2016
- IAP 2016
- Fall 2015
- Summer 2015
- Spring 2015
- Fall 2014
- Summer 2014
- Spring 2014
- Fall 2013
- Spring 2013
- Fall 2012
- Summer 2012
- Spring 2012
- Fall 2011
- Summer 2011
- Spring 2011
- Fall 2010
- Summer 2010
- Spring 2010
- Fall 2009
- Summer 2009
- Spring 2009
- Fall 2008
- Summer 2008
- Spring 2008
- Fall 2007
- Jan 22 (Tuesday)
- Jan 15 (Tuesday)
- Dec 10 (Mon, 9-10am)
- Dec 5 (Wed, 2:30-3:30 since Polina is away at NIPS on Monday)
- Nov 26
- Nov 19
- Nov 14 (Wednesday, since Monday is Veterans Day)
- Oct 17 (Wednesday, to work around MMBIA)
- Oct 10 (Wednesday since Monday, Oct 8 is Columbus Day)
- Oct 1
- Sep 24
- Sep 17
- Sep 12

- Summer 2007
- Spring 2007
- Fall 2006
- Spring 2006
- Fall 2005

We meet online on Mondays from 3:45 PM to 5:00 PM ET at https://mit.zoom.us/j/911296050.

Feel free to add papers to the paper stack.

To join the reading group:

- Subscribe to the v-golland email list at CSAIL.
- To get access to the wiki, please contact Clinton Wang at clintonw at csail.mit.edu

ACORN: Adaptive Coordinate Networks for Neural Scene Representation

Julien N. P. Martel, David B. Lindell, Connor Z. Lin, Eric R. Chan, Marco Monteiro, Gordon Wetzstein

https://arxiv.org/abs/2105.02788

Deep Parametric Continuous Convolutional Neural Networks

Shenlong Wang, Simon Suo, Wei-Chiu Ma, Andrei Pokrovsky, and Raquel Urtasun

https://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Deep_Parametric_Continuous_CVPR_2018_paper.pdf

Meta-Learning with Latent Embedding Optimization

Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, Raia Hadsell

https://arxiv.org/abs/1807.05960

Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains

Matthew Tancik, Pratul P. Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, Ren Ng

https://arxiv.org/abs/2006.10739

Learning Interpretable Features via Adversarially Robust Optimization; Khakzar et al.

Graph Neural Networks for Interpreting Task-fMRI Biomarkers; Li et al.

A Surface-theoretic Approach for Statistical Shape Modeling; Ambellan et al.

Geometric deep learning: going beyond Euclidean data; Bronstein et al. (may also be an associated review article?) http://geometricdeeplearning.com/

Construction of a Spatiotemporal Statistical Shape Model of Pediatric Liver from Cross-Sectional Data; Atsushi Saito et al.

Fast CapsNet for Lung Cancer Screening; Aryan Mobiny, Hien Van Nguyen

CompNet: Complementary Segmentation Network for Brain MRI Extraction; Raunak Dey, Yi Hong

Exploring Uncertainty Measures in Deep Networks for Multiple Sclerosis Lesion Detection and Segmentation; Tanya Nair et al.

Generative discriminative models for multivariate inference and statistical mapping in medical imaging; Erdem Varol et al.

Roto-translation covariant convolutional networks for medical image analysis; Erik Bekkers et al.

Uncertainty in multitask learning: Joint representations for probabilistic MR-only radiotherapy planning; Felix Bragman et al.

Factorized spatial representation learning: Application in semi-supervised myocardial segmentation; Agisilaos Chartsias et al.

Hierarchical Spherical Deformation for Shape Correspondence; Lyu et al.

Using the Anisotropic Laplace Equation to Compute Cortical Thickness; Joshi et al.

3D Segmentation with Exponential Logarithmic Loss for Highly Unbalanced Object Sizes; Wong et al.

Deep Multi-Structural Shape Analysis: Application to Neuroanatomy; Gutierrez-Becker, B., and Wachinger, C.

rfDemons: Resting fMRI-Based Cortical Surface Registration Using the BrainSync Transform; Joshi et al.

Deformable Convolutional Networks; Dai et al.

Unsupervised domain adaptation in brain lesion segmentation with adversarial networks; Kamnitsas et al.

Spectral kernels for probabilistic analysis and clustering of shapes; Folgoc et al.

Intraoperative Organ Motion Models with an Ensemble of conditional Generative Adversarial Networks; Hu et al.

A multi-armed bandit to smartly select a training set from big medical data; Becker et al.

Skin Disease Recognition Using Deep saliency features and multimodal learning of Dermoscopy and clinical images; Ge et al.

X-Ray in-depth decomposition: revealing the latent structures; Albarqouni et al.

Deep adversarial networks for biomedical image segmentation utilizing unannotated images; Zhang et al.

Semi-supervised Deep Learning for Fully Convolutional Networks; Baur et al.

TandemNet: Distilling Knowledge from Medical Images Using Diagnostic Reports as Optional Semantic References; Zhang et al.

Towards automatic semantic segmentation in volumetric ultrasound; Yang et al.

The active atlas: combining 3D anatomical models with texture detectors; Chen et al.

Nonrigid image registration using Multi-scale 3D Convolutional Neural Networks; Sokooti et al.

Robust nonrigid registration through agent-based action learning; Krebs et al.

Online statistical inference for Large-Scale Binary Images; Chung et al.

Efficient deformable motion correction for 3-D abdominal MRI using manifold regression; Chen et al.

Learning and incorporating shape models for semantic segmentation; Ravishankar et al.

End to end unsupervised deformable image registration with a convolutional neural network; de Vos et al.

Generalised Dice Overlap as a Deep Learning Loss Function for Highly Unbalanced Segmentations; Sudre et al.

Adversarial training and dilated convolutions for Brain MRI segmentation; Moeskops et al.

Sparse Kernel Machines for Discontinuous Registration and Nonstationary Regularization, Christoph Jud (University of Basel), Nadia Möri, Philippe C. Cattin

fast implementation of registration: Fast Deformable Image Registration with Non-Smooth Dual Optimization, Martin Rajchl (Imperial College London, Robarts, Ontario); John S.H Baxter, Wu Qiu, Ali R. Khan, Aaron Fenster, Terry M. Peters, Daniel Rueckert, Jing Yuan

Image Registration for Placenta Reconstruction, Floris Gaisser, Toshio Chiba, Pieter Jonker

Tissue-Volume Preserving Deformable Image Registration for 4DCT Pulmonary Images, Bowen Zhao, Joohyun Song, Geoffrey Hugo, Yue Pan, Sarah Gerard, Kaifang Du, Taylor Patton, Joseph Reinhardt, John Bayouth, Gary Christensen

A Simple Framework for Contrastive Learning of Visual Representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton

https://arxiv.org/abs/2002.05709

AutoInt: Automatic Integration for Fast Neural Volume Rendering

David B. Lindell, Julien N. P. Martel, Gordon Wetzstein

https://arxiv.org/abs/2012.01714

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng

https://arxiv.org/abs/2003.08934

Image Registration via Stochastic Gradient Markov Chain Monte Carlo (https://doi.org/10.1007/978-3-030-60365-6_1)

Evidential Deep Learning to Quantify Classification Uncertainty (https://arxiv.org/pdf/1806.01768.pdf)

Spherical Deformable U-Net: Application to Cortical Surface Parcellation and Development Prediction

Zhao et al. spherical_unet.pdf

RARE: Image Reconstruction using Deep Priors Learned without Ground Truth

Jiaming Liu, Yu Sun, Cihat Eldeniz, Weijie Gan, Hongyu An, Ulugbek S. Kamilov

https://arxiv.org/abs/1912.05854

The little engine that could: Regularization by denoising (RED)

Yaniv Romano, Michael Elad, and Peyman Milanfar

https://epubs.siam.org/doi/10.1137/16M1102884 red.pdf

B-spline Parameterized Joint Optimization of Reconstruction and K-space Trajectories (BJORK) for Accelerated 2D MRI

Guanhua Wang, Tianrui Luo, Jon-Fredrik Nielsen, Douglas C. Noll, Jeffrey A. Fessler

https://arxiv.org/abs/2101.11369

CBAM: Convolutional Block Attention Module

Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon

ECCV 2018

https://openaccess.thecvf.com/content_ECCV_2018/html/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.html

Attention Augmented Convolutional Networks

Irwan Bello, Barret Zoph, Ashish Vaswani, Jonathon Shlens, Quoc V. Le

ICCV 2019

https://openaccess.thecvf.com/content_ICCV_2019/html/Bello_Attention_Augmented_Convolutional_Networks_ICCV_2019_paper.html

Linear Predictability in Magnetic Resonance Imaging Reconstruction: Leveraging Shift-Invariant Fourier Structure for Faster and Better Imaging

Justin P. Haldar, Kawin Setsompop https://ieeexplore.ieee.org/abstract/document/8962389

- Attention is all you need: https://arxiv.org/pdf/1706.03762.pdf (introduces the Transformer, an architecture built entirely on attention)
- On the Relationship between Self-Attention and Convolutional Layers: https://arxiv.org/pdf/1911.03584.pdf (shows that self-attention layers can express convolutions and often learn to behave like them)

Information-Theoretic Segmentation by Inpainting Error Maximization https://arxiv.org/abs/2012.07287

Dubois, Yann, Douwe Kiela, David J. Schwab, and Ramakrishna Vedantam. “Learning Optimal Representations with the Decodable Information Bottleneck.” Advances in Neural Information Processing Systems 33 (2020).

Geometric deep learning: going beyond Euclidean data

Dimensionality Reduction by Learning an Invariant Mapping

http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf

Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization

https://arxiv.org/abs/2009.12829

Invariant Risk Minimization

https://arxiv.org/abs/1907.02893

Neural Tangent Kernel: Convergence and Generalization in Neural Networks

https://arxiv.org/abs/1806.07572

Semi-Supervised Learning with Ladder Networks

https://arxiv.org/abs/1507.02672

Structural Autoencoders Improve Representations for Generation and Transfer

https://arxiv.org/abs/2006.07796

Training Generative Adversarial Networks with Limited Data, Karras et al.

https://arxiv.org/abs/2006.06676

A Fourier Perspective on Model Robustness in Computer Vision

Yin et al, 2020

[pdf] https://arxiv.org/abs/1906.08988

Elements of Causal Inference; Peters, Janzing, and Schölkopf [pdf] https://library.oapen.org/bitstream/id/056a11be-ce3a-44b9-8987-a6c68fce8d9b/11283.pdf

Chapters 6.1, 6.2, 6.3, 6.4, 6.5 of Elements of Causal Inference [pdf] https://library.oapen.org/bitstream/id/056a11be-ce3a-44b9-8987-a6c68fce8d9b/11283.pdf

Chapter 4.2 and 5 of Elements of Causal Inference [pdf] https://library.oapen.org/bitstream/id/056a11be-ce3a-44b9-8987-a6c68fce8d9b/11283.pdf

Chapter 4.1 of Elements of Causal Inference [pdf] https://library.oapen.org/bitstream/id/056a11be-ce3a-44b9-8987-a6c68fce8d9b/11283.pdf

Chapter 3 of Elements of Causal Inference [pdf] https://library.oapen.org/bitstream/id/056a11be-ce3a-44b9-8987-a6c68fce8d9b/11283.pdf

Chapters 1 and 2 of Elements of Causal Inference [pdf] https://library.oapen.org/bitstream/id/056a11be-ce3a-44b9-8987-a6c68fce8d9b/11283.pdf

Tutorial on Variational Autoencoders, Carl Doersch, 2016: https://arxiv.org/pdf/1606.05908.pdf

Auxiliary material, Kingma and Welling tutorial on VAEs, 2019: https://arxiv.org/pdf/1906.02691.pdf

Yen-Chun Chen and Linjie Li and Licheng Yu and Ahmed El Kholy and Faisal Ahmed and Zhe Gan and Yu Cheng and Jingjing Liu. “UNITER: UNiversal Image-TExt Representation Learning.” arXiv:1909.11740 (2019). https://arxiv.org/pdf/1909.11740.pdf

Oord, Aaron van den, Yazhe Li, and Oriol Vinyals. “Representation learning with contrastive predictive coding.” arXiv preprint arXiv:1807.03748 (2018). https://arxiv.org/abs/1807.03748

HOLIDAY - NO MEETING

Noise2Noise: Learning Image Restoration without Clean Data https://arxiv.org/abs/1803.04189

Extending Stein’s unbiased risk estimator to train deep denoisers with correlated pairs of noisy images https://arxiv.org/abs/1902.02452

k-Space Deep Learning for Accelerated MRI. Yoseob Han, Leonard Sunwoo, Jong Chul Ye, IEEE TMI. https://arxiv.org/abs/1805.03779

DeepSphere: A graph-based spherical CNN. Michael Defferrard, Martino Milani, and Frederick Gusset. ICLR 2020 deepsphere_cnn.pdf

Louizos, Christos, Uri Shalit, Joris M. Mooij, David Sontag, Richard Zemel, and Max Welling. “Causal effect inference with deep latent-variable models.” In Advances in Neural Information Processing Systems, pp. 6446-6456. 2017. https://arxiv.org/pdf/1705.08821.pdf

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction Leland McInnes, John Healy, James Melville https://arxiv.org/abs/1802.03426

Neural Discrete Representation Learning Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu https://arxiv.org/abs/1711.00937

Unsupervised Learning with Stein’s Unbiased Risk Estimator

How Good is the Bayes Posterior in Deep Neural Networks Really?

Invert to Learn to Invert https://papers.nips.cc/paper/8336-invert-to-learn-to-invert.pdf

Deep Complex Networks https://arxiv.org/abs/1705.09792

Learning to Explain: An Information-Theoretic Perspective on Model Interpretation https://arxiv.org/abs/1802.07814

Towards Automatic Concept-based Explanations https://arxiv.org/abs/1902.03129

Explanation by Progressive Exaggeration Sumedha Singla, Brian Pollack, Junxiang Chen, Kayhan Batmanghelich https://arxiv.org/abs/1911.00483.pdf Updated version: https://openreview.net/forum?id=H1xFWgrFPS

Noise-contrastive estimation: A new estimation principle for unnormalized statistical models

Putting An End to End-to-End: Gradient-Isolated Learning of Representations

Burda et al, Importance weighted autoencoders (ICLR 2016)

The Thermodynamic Variational Objective (NeurIPS 2019)

Domain-Adversarial Training of Neural Networks, JMLR 2016

Zhou et al, Prior-aware Neural Network for Partially-Supervised Multi-Organ Segmentation, ICCV 2019 https://arxiv.org/abs/1904.06346

Show, attend and tell: Neural image caption generation with visual attention

Compositional Attention Networks for Machine Reasoning: https://arxiv.org/abs/1803.03067

Models Genesis: Generic Autodidactic Models for 3D Medical Image Analysis https://arxiv.org/abs/1908.06912

Neural Persistence: A Complexity Measure for Deep Neural Networks Using Algebraic Topology https://arxiv.org/abs/1812.09764

A Topological Loss Function for Deep-Learning based Image Segmentation using Persistent Homology https://arxiv.org/abs/1910.01877

A Topology Layer for Machine Learning https://arxiv.org/abs/1905.12200

Predicting Slice-to-Volume Transformation in Presence of Arbitrary Subject Motion, Hou et al., MICCAI 2018 https://arxiv.org/abs/1702.08891

Fast Volume Reconstruction From Motion Corrupted Stacks of 2D Slices https://ieeexplore.ieee.org/abstract/document/7064742

High Accuracy Optical Flow Estimation Based on a Theory for Warping https://lmb.informatik.uni-freiburg.de/people/brox/pub/brox_eccv04_of.pdf

Determining Optical Flow http://image.diku.dk/imagecanon/material/HornSchunckOptical_Flow.pdf

Group Equivariant Convolutional Networks, Taco S. Cohen, Max Welling https://arxiv.org/abs/1602.07576#

Andrearczyk et al, Exploring local rotation invariance in 3D CNNs with steerable filters, MIDL 2019. http://proceedings.mlr.press/v102/andrearczyk19a/andrearczyk19a.pdf

Freeman and Adelson, The design and use of steerable filters, IEEE PAMI 1991 design_and_use_of_steerable_filters.pdf

Weiler et al, 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data, NeurIPS 2018. http://papers.nips.cc/paper/8239-3d-steerable-cnns-learning-rotationally-equivariant-features-in-volumetric-data.pdf

Dinh et al, Density estimation using Real NVP, ICLR 2017 https://arxiv.org/abs/1605.08803

Gomez et al, The reversible residual network: Backpropagation without storing activations, NeurIPS 2017 https://papers.nips.cc/paper/6816-the-reversible-residual-network-backpropagation-without-storing-activations

Diederik P. Kingma, Prafulla Dhariwal: Glow: Generative Flow with Invertible 1×1 Convolutions, NeurIPS 2018

https://papers.nips.cc/paper/8224-glow-generative-flow-with-invertible-1x1-convolutions.pdf

Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Raetsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem; Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations; Proceedings of the 36th International Conference on Machine Learning, PMLR 97:4114-4124, 2019.

Computational Optimal Transport by Gabriel Peyre and Marco Cuturi: Finish 2.4-2.5; 3.1-3.4 https://arxiv.org/pdf/1803.00567.pdf

Computational Optimal Transport by Gabriel Peyre and Marco Cuturi: Read up to the end of 2.4 https://arxiv.org/pdf/1803.00567.pdf

Justin Solomon’s slides from his tutorial talk on optimal transport http://people.csail.mit.edu/polina/papers/ot_tutorial.pptx

Automated Treatment Planning in Radiation Therapy using Generative Adversarial Networks; Mahmood et al. mahmood18a.pdf

The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision; Jiayuan Mao et al. https://openreview.net/forum?id=rJgMlhRctm

Fully Automatic 3D Reconstruction of the Placenta and its Peripheral Vasculature in Intrauterine Fetal MRI; Torrents-Barrena et al. automatic_placenta_seg.pdf

Semi-supervised learning for segmentation under semantic constraint; Pierre-Antoine Ganaye et al. https://link.springer.com/content/pdf/10.1007%2F978-3-030-00931-1_68.pdf

Improved Training of Wasserstein GANs https://arxiv.org/pdf/1704.00028.pdf

Wasserstein GAN https://arxiv.org/pdf/1701.07875.pdf

Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer https://openreview.net/pdf?id=S1fQSiCcYm https://openreview.net/forum?id=S1fQSiCcYm

Spectral Representations for Convolutional Neural Networks: https://arxiv.org/pdf/1506.03767.pdf

Greff et al., “Highway and Residual Networks learn Unrolled Iterative Estimation”. https://arxiv.org/abs/1612.07771.

Note: we’re meeting on Tue, Jan 22, 1pm for a one-time off-schedule meeting (because Mon Jan 21 is a holiday), focusing on Sec 4 and 5 of:

Neural Ordinary Differential Equations

Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud

NeurIPS 2018. pdf

Related papers for further reading:

- Rousseau et al, Residual Networks as Geodesic Flows of Diffeomorphisms, ArXiv 2018. pdf

Seems to be a closely related concurrent development by different authors.

- Grathwohl et al, FFJORD: Free-form continuous dynamics for scalable reversible generative models, ICLR 2019. pdf

Continuation of the continuous flows from the Neural-ODEs

A Deep Cascade of Convolutional Neural Networks for Dynamic MR Image Reconstruction. Jo Schlemper, Jose Caballero, Joseph V. Hajnal, Anthony Price and Daniel Rueckert. IEEE TMI 2017. //arxiv.org/pdf/1704.02422.pdf

K. Hammernik, T. Klatzer, E. Kobler, M.P. Recht, D.K. Sodickson, T. Pock, F. Knoll

Learning a Variational Network for Reconstruction of Accelerated MRI Data.

Magnetic Resonance in Medicine, 2018

hammernik_et_al-2018-magnetic_resonance_in_medicine.pdf

Bo Zhu, Jeremiah Z. Liu, Stephen F. Cauley, Bruce R. Rosen, Matthew S. Rosen

Image reconstruction by domain-transform manifold learning.

Nature Letters

zhu_nature25988.pdf

Simon A. A. Kohl, Bernardino Romera-Paredes, Clemens Meyer, Jeffrey De Fauw, Joseph R. Ledsam, Klaus H. Maier-Hein, S. M. Ali Eslami, Danilo Jimenez Rezende, Olaf Ronneberger

A Probabilistic U-Net for Segmentation of Ambiguous Images.

NIPS 2018

https://arxiv.org/abs/1806.05034

Finn, C., Abbeel, P., Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In: ICML. (2017)

Gabriel Maicas, Andrew P. Bradley, Jacinto C. Nascimento, Ian Reid, Gustavo Carneiro

“Training Medical Image Analysis Systems like Radiologists” MICCAI 2018

Martin Szummer and Tommi Jaakkola. “Information Regularization with Partially Labeled Data.” Advances in neural information processing systems. 2003. https://people.csail.mit.edu/tommi/papers/SzuJaa-nips02.pdf

Follow up paper: https://arxiv.org/abs/1212.2466

Grandvalet, Yves, and Yoshua Bengio. “Semi-supervised learning by entropy minimization.” Advances in neural information processing systems. 2005. http://papers.nips.cc/paper/2740-semi-supervised-learning-by-entropy-minimization.pdf

Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra, “Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization”, ICCV 2017, https://arxiv.org/pdf/1610.02391

Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba, “Learning Deep Features for Discriminative Localization”, CVPR 2016, https://arxiv.org/pdf/1512.04150

David Bau*, Bolei Zhou*, Aditya Khosla, Aude Oliva, Antonio Torralba. Network dissection: Quantifying interpretability of deep visual representations. CVPR 2017. http://netdissect.csail.mit.edu/final-network-dissection.pdf

Other papers of interest from Bolei Zhou’s webpage:

- Bolei Zhou*, David Bau*, Aude Oliva, Antonio Torralba. Interpreting deep visual representations via network dissection. PAMI 2018. https://arxiv.org/pdf/1711.05611
- Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba. Learning deep features for discriminative localization. CVPR 2016. http://cnnlocalization.csail.mit.edu/Zhou_Learning_Deep_Features_CVPR_2016_paper.pdf
- Bolei Zhou*, Yiyou Sun*, David Bau*, Antonio Torralba. Interpretable basis decomposition for visual explanation. ECCV 2018. http://people.csail.mit.edu/bzhou/publication/eccv18-IBD

S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, W. Samek: On Pixel-wise Explanations for Non-Linear Classifier Decisions by Layer-wise Relevance Propagation. PLOS ONE, 10(7): e0130140 (2015) http://dx.doi.org/10.1371/journal.pone.0130140

Bob D. de Vos, Floris F. Berendsen, Max A. Viergever, Marius Staring, and Ivana Isgum, “End-to-End Unsupervised Deformable Image Registration with a Convolutional Neural Network .” DLMIA-MICCAI, 2017. https://arxiv.org/pdf/1704.06065.pdf

Geoffrey E Hinton, Sara Sabour, Nicholas Frosst. Matrix capsules with EM routing. https://openreview.net/pdf?id=HJWLfGWRb

Sabour, Sara, Nicholas Frosst, and Geoffrey E. Hinton. “Dynamic routing between capsules.” Advances in Neural Information Processing Systems. 2017. https://arxiv.org/pdf/1710.09829.pdf

Additional papers for more reading:

braintumortypeclassificationviacapsulenetworks.pdf

Kingma, Diederik P., et al. “Semi-supervised learning with deep generative models.” Advances in Neural Information Processing Systems. 2014.

http://papers.nips.cc/paper/5352-semi-supervised-learning-with-deep-generative-models.pdf

Kainz et al.: “Fast Volume Reconstruction From Motion Corrupted Stacks of 2D Slices”

Mengye Ren and Richard S. Zemel. “End-to-End Instance Segmentation with Recurrent Attention.” arXiv preprint arXiv:1605.09410 (2016).

Lipton, Zachary C., et al. “Learning to diagnose with LSTM recurrent neural networks.” arXiv preprint arXiv:1511.03677 (2015).

Evangelos Kalogerakis, Siddhartha Chaudhuri, Daphne Koller, and Vladlen Koltun “A Probabilistic Model for Component-Based Shape Synthesis” SIGGRAPH / ACM Transactions on Graphics 31(4), 2012

Conjugate gradient algorithm and gradient preconditioning.

Chapters 11.2 and 11.3 in Numerical Algorithms by Justin Solomon

We will briefly discuss conjugate gradients then focus on preconditioning.

Conjugate gradient algorithm.

A (shorter) section from the Numerical Recipes book: c10-6.pdf

More background: Jonathan Richard Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain. This is a nice (although long) tutorial paper: painless-conjugate-gradient.pdf

From Label Maps to Generative Shape Models: A Variational Bayesian Learning Approach; IPMI 2017 elhabian2017.pdf

Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery; IPMI 2017 schlegl2017.pdf

Error correction boosting for learning fully Convolutional networks with limited data; MICCAI 2017, Roy et al. roy.pdf

Christoph Baur, Shadi Albarqouni, Nassir Navab. Semi-supervised Deep Learning for Fully Convolutional Networks. baur.pdf

Zhang Y., Yang L., Chen J., Fredericksen M., Hughes D.P., Chen D.Z. (2017) Deep Adversarial Networks for Biomedical Image Segmentation Utilizing Unannotated Images.

Kingma, Diederik P., and Max Welling. “Auto-encoding variational bayes.” arXiv preprint arXiv:1312.6114 (2013). vae_2013.pdf

Useful lecture notes for reviewing variational approximation: 437_approximations.pdf

Juan Eugenio Iglesias - Globally optimal coupled surfaces for semi-automatic segmentation of medical images iglesias2017.pdf

Topology-controlled Reconstruction of Multi-labelled Domains from Cross-sections; Ed Chien (Justin’s postdoc) will present. multitopo_tog.pdf

We will focus on:

- Ravishankar et al - Learning and Incorporating Shape Models for Semantic Segmentation ravishankar2017.pdf
- Milletari et al - Integrating Statistical Prior Knowledge into Convolutional Neural Networks milletari2017.pdf

This reading will be related to that of Oct 24th, when we discussed:

- Oktay et al - Anatomically Constrained Neural Networks (ACNN): Application to Cardiac Image Enhancement and Segmentation oktay2017a.pdf

Phillips Tech Talk: Computational Neurology and Computational Pathology. The talk is at 4PM in 32-G882. The link to register is here.

- Oktay et al - Anatomically Constrained Neural Networks (ACNN): Application to Cardiac Image Enhancement and Segmentation oktay2017a.pdf
- Milletari et al - Integrating Statistical Prior Knowledge into Convolutional Neural Networks milletari2017.pdf
- Ravishankar et al - Learning and Incorporating Shape Models for Semantic Segmentation ravishankar2017.pdf

Vertex Clustering model for disease progression: Application to Cortical Thickness Images; Marinescu et al. marinescu2017.pdf

We will read Sections 5.3 to 5.4 of bv_cvxbook.pdf.

We will read Sections 5.1 to 5.2.

We will read from section 4.2.3 onwards.

We reviewed the first third of Ch. 4 (read through section 4.2.3)

We will be reading Convex Optimization by Stephen Boyd this summer. bv_cvxbook.pdf

Read section 2.6 on Dual Cones

We will continue reading Ch. 2 (read through section 2.5). See also a helpful set of slides for Chapter 2.

We will start Ch. 2 on Convex Sets (read through section 2.3)

We will go back to the paper we started with:

Chen et al., Sparse Projections of Medical Images onto Manifolds [http://people.csail.mit.edu/polina/papers/Chen-IPMI-2013.pdf]

Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Bernhard Schölkopf, Alexander Smola, Klaus-Robert Müller. AKA Kernel PCA. kpca.pdf

For those who want to take a look at a more formal treatment of the Sobolev kernel and such, here is another tutorial: An Introduction to the Theory of Reproducing Kernel Hilbert Spaces, Vern I. Paulsen. rkhs.pdf

We will go over the second part of the notes.

RKHS Material: Text notes: http://www.mit.edu/~9.520/scribe-notes/class03_gdurett.pdf Slides: http://www.mit.edu/~9.520/fall14/index.html Class 4 and class 5.

Fowlkes, Charless, Serge Belongie, Fan Chung, and Jitendra Malik. “Spectral grouping using the Nystrom method.” IEEE transactions on pattern analysis and machine intelligence 26, no. 2 (2004): 214-225. fowlkes_spectralgrouping_nystrom.pdf

Chen et al., Sparse Projections of Medical Images onto Manifolds [http://people.csail.mit.edu/polina/papers/Chen-IPMI-2013.pdf]

Daniel Moyer, Boris A. Gutman, Joshua Faskowitz, Neda Jahanshad, and Paul M. Thompson. A Continuous Model of Cortical Connectivity. MICCAI 2016 [http://www-scf.usc.edu/~moyerd/pubs/continuous-connectivity.pdf]

Yang Xiao, Roland Kwitt, and Marc Niethammer. “Fast Predictive Image Registration.” International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis. Springer International Publishing, 2016. [https://arxiv.org/abs/1607.02504]

Isola et al., Image-to-Image Translation with Conditional Adversarial Networks, [https://arxiv.org/pdf/1611.07004.pdf]

Ronneberger et al. U-Net: Convolutional Networks for Biomedical Image Segmentation. MICCAI 2015. unet.pdf

Oktay et al. Multi-Input Cardiac Image Super-Resolution using Convolutional Neural Networks. MICCAI 2016. [https://www.doc.ic.ac.uk/~oo2113/publications/MICCAI2016_camera_ready.pdf]

- HeMIS: Hetero-Modal Image Segmentation [https://arxiv.org/pdf/1607.05194v1.pdf]

Barycentric Subspace Analysis: a new Symmetric Group-wise Paradigm for Cardiac Motion Tracking. [https://hal.inria.fr/hal-01373706/document]

Unsupervised Freeview Groupwise Cardiac Segmentation Using Synchronized Spectral Network unsupervised.pdf

Bilateral Weighted Adaptive Local Similarity Measure for Registration in Neurosurgery kochan16.pdf

Fast Fully Automatic Segmentation of the Human Placenta from Motion Corrupted MRI. kainz_miccai106a.pdf

SpineNet: Automatically Pinpointing Classification Evidence in Spinal MRIs. jamaludin16.pdf

MICCAI 2016 debrief.

Please come with 1-3 papers that you liked at MICCAI 2016 and want to discuss!

Aditya Khosla will talk about his work on CNNs for medical tasks.

Potentially relevant paper: http://people.csail.mit.edu/khosla/papers/arxiv2016_Wang.pdf

Antonio Torralba will discuss his experience with NNs.

We’ll continue with Goodfellow et al’s book. http://www.deeplearningbook.org/, chapter 9 (CNNs)

We’ll continue with Goodfellow et al’s book. http://www.deeplearningbook.org/, up to and including 6.5 (Backprop)

We’ll continue with Goodfellow et al’s book. http://www.deeplearningbook.org/, up to and including 6.4

We’re starting a series of discussions on Deep learning.

We’ll start with Goodfellow et al’s book. http://www.deeplearningbook.org/ (Please scan through Part I and review whatever is necessary)

**On Aug 11th, we’ll read Part II, Chapter 6.**

Adrien Depeursinge will tell us about his work on texture classification.

Vercauteren et al, Diffeomorphic Demons: Efficient Non-parametric Image Registration: diffeodemons-neuroimage08-vercauteren.pdf

For those interested in further development: symlogdemons-miccai08-vercauteren.pdf

For those who need basic background on the demons algorithm: thirion98.pdf

Avants, Brian B., Charles L. Epstein, Murray Grossman, and James C. Gee. “Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain.” Medical image analysis 12, no. 1 (2008): 26-41. MedIA version: avants-media.pdf symmetric_diffeomorphic_image_registration.pdf

C. Studholme, D.L.G. Hill, D.J. Hawkes, “An overlap invariant entropy measure of 3D medical image alignment”, Pattern Recognition 32 (1999) 71-86 normalizedmi.pdf (Discussion leader: Danielle)

Horn et al, Determining Optical Flow: opticalflow.pdf

Nonrigid Registration Using Free-Form Deformations: Application to Breast MR Images. D. Rueckert, L. I. Sonoda, C. Hayes, D. L. G. Hill, M. O. Leach, and D. J. Hawkes. IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 18, NO. 8, AUGUST 1999. rueckert-ffd.pdf

Floris F. Berendsen, Uulke A. van der Heide, Thomas R. Langerak, Alexis N.T.J. Kotte, Josien P.W. Pluim. Free-form image registration regularized by a statistical shape model: application to organ segmentation in cervical MR. CVIU 2013. plum-2013.pdf

Papież, B. W., Heinrich, M. P., Fehrenbach, J., Risser, L., & Schnabel, J. A. (2014). An implicit sliding-motion preserving regularisation via bilateral filtering for deformable image registration. Medical image analysis, 18(8), 1299-1311. 1-s2.0-s1361841514000784-main.pdf

Tanya Schmah, Laurent Risser, and François-Xavier Vialard, 2013 - Left-Invariant Metrics for Diffeomorphic Image Registration with Spatially-Varying Regularisation. leftinvariantmetrics.pdf diffeomorphic_image_matching_with_left-invariant_metrics.pdf (Discussion Leader - Miaomiao)

Tong et al, 2013 - Segmentation of MR images via discriminative dictionary learning and sparse coding: Application to hippocampus labeling. dictlearning.pdf (Discussion Leader - Danielle)

Bhatia et al, 2014 - Hierarchical Manifold Learning for Regional Image Analysis. bhatia2014.pdf - Discussion Leader: Adrian

Kim, Jaechul, Ce Liu, Fei Sha, and Kristen Grauman. “Deformable spatial pyramid matching for fast dense correspondences.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2013. 2013cvpr_dsp.pdf - Discussion Leader: Ray

One of its applications in image-guided radiotherapy:

Mazur, Thomas R., et al. “SIFT-based dense pixel tracking on 0.35 T cine-MR images acquired during image-guided radiation therapy with application to gating optimization.” Medical physics 43.1 (2016): 279-293. sift-based_dense_pixel_tracking_on_0.35_t_cine-mr_images_acquired_during_imageguided_radiation_therapy_with_application_to_gating_optimization.pdf

Matthieu Lê, Jan Unkelbach, Nicholas Ayache, Hervé Delingette. GPSSI: Gaussian Process for Sampling Segmentations of Images. MICCAI 2015. le-miccai-2015.pdf

Mapping Stacked Decision Forests to Deep and Sparse Convolutional Neural Networks for Semantic Segmentation http://arxiv.org/pdf/1507.07583v2.pdf Disc. leader: Greg, 20160209_ciccarelli_rfdnn.pdf

Conditional Regression Forests for Human Pose Estimation, Sun et al, CVPR 2012 skt_cvpr2012.pdf Discussion Leader: Danielle

If you’d like some background on Regression Forests, you can look at these chapters from “Decision Forests for Computer Vision and Medical Image Analysis”, Criminisi and Shotton, eds.: decisionforestschap3.pdf, decisionforestschap5.pdf

We will have Mert Sabuncu present his work on longitudinal analysis:

https://calendar.csail.mit.edu/events/161223

The talk is related to his MICCAI 2015 paper: sabuncu-miccai-2015.pdf

We will march off to Sarang Joshi’s tutorial on diffeomorphic registration. Here’s some holiday reading: François-Xavier Vialard, Laurent Risser, Daniel Rueckert, Colin J. Cotter. Diffeomorphic 3D Image Registration via Geodesic Shooting Using an Efficient Adjoint Calculation. IJCV 2011. vialard-ijcv.pdf

Note from the tutorial: liegroups.pdf

Matlab code used in the tutorial: gaussiansplines.txt and flow.txt

Simultaneous Longitudinal Registration with Group-Wise Similarity Prior. Greg M. Fleishman, Boris A. Gutman, P. Thomas Fletcher, Paul M. Thompson. IPMI 2015 fleishman-ipmi-2015.pdf

[MICCAI2015] Uncertainty-driven Forest Predictors for Vertebra Localization and Segmentation. David Richmond, Dagmar Kainmueller, Ben Glocker, Carsten Rother, Gene Myers.

forest.pdf

q-Space Deep Learning for Twelve-Fold Shorter and Model-Free Diffusion MRI Scans golkov15.pdf.

Neher et al. A Machine Learning Based Approach to Fiber Tractography Using Classifier Voting. MICCAI 2015. neher15.pdf

Soheil Hor, Mehdi Moradi. Scandent Tree: A Random Forest Learning Method for Incomplete Multimodal Datasets. MICCAI 2015. scandent-trees-miccai-2015.pdf

We sorted out the list to read.

We will be reading Miaomiao Zhang’s latest paper: zhang_ipmi2015.pdf

For more context, we have Ashburner and Friston’s paper: ashburner_ni11.pdf

And also the paper we read last week!

We will read Miller et al’s paper on the application of geodesic shooting in diffeomorphic image registration: miller_jmiv06.pdf

A more recent paper by Ashburner and Friston is also available for further reading: ashburner_ni11.pdf

We will read Vercauteren et al, Diffeomorphic Demons: Efficient Non-parametric Image Registration: diffeodemons-neuroimage08-vercauteren.pdf

For those interested in further development of this line of research: symlogdemons-miccai08-vercauteren.pdf

No meeting

We will continue reading Beg et al.

Additional resources:

Review of calculus of variations: calculusofvariations.pdf

Danielle’s notes on Beg et al: dfpace_beg_etal.pdf

We will read Beg et al, Computing large deformation metric mappings via geodesic flows of diffeomorphisms, IJCV 2005. beg_lddmm.pdf

We will continue going over Danial Lashkari’s notes.

We will continue going over Danial Lashkari’s notes. Additionally, this is another resource: diffgeo.pdf

We will go over Danial Lashkari’s notes from the last time we all learned about Lie algebras and manifolds.

http://groups.csail.mit.edu/vision/golland/group_meeting/doku.php?id=discussion:oct_1_2007

Durrleman, Stanley, et al. “Toward a comprehensive framework for the spatiotemporal statistical analysis of longitudinal shape data.” International journal of computer vision 103.1 (2013): 22-59. durrleman_et_al_2013.pdf

Reshef DN, Reshef YA, Finucane HK, et al. Detecting Novel Associations in Large Datasets. Science (New York, N.Y.). 2011;334(6062):1518-1524. doi:10.1126/science.1205438. mic.pdf

The supplementary material that we will discuss: som.pdf Section 3 discusses the approximation algorithm, though sections 1 and 2 are likely to also be useful for the discussion!

MICCAI: Zhang, Miaomiao, and P. Thomas Fletcher. “Bayesian Principal Geodesic Analysis in Diffeomorphic Image Registration.” Medical Image Computing and Computer-Assisted Intervention–MICCAI 2014. Springer International Publishing, 2014. 121-128. bayesian-principal-geodesic-analysis-in-diffeomorphic-image-registration.pdf

NIPS: Zhang, Miaomiao, and P. Thomas Fletcher. “Probabilistic principal geodesic analysis.” Advances in Neural Information Processing Systems. 2013. 5133-probabilistic-principal-geodesic-analysis.pdf

Sabuncu, Mert R. “A Universal and Efficient Method to Compute Maps from Image-Based Prediction Models.” Medical Image Computing and Computer-Assisted Intervention–MICCAI 2014. Springer International Publishing, 2014. 353-360.

mappingimagebasedpredictionmodels_sabuncu_miccai14_final.pdf

Maxime Taquet, Benoît Scherrer, Jurriaan M. Peters, Sanjay P. Prabhu, and Simon K. Warfield. A Fully Bayesian Inference Framework for Population Studies of the Brain Microstructure. MICCAI 2014. taquet-miccai2014.pdf

Ramesh, Adrian and Danielle will lead:

ICML tutorial on Submodularity in Machine learning - Part I http://www.cs.berkeley.edu/~stefje/submodularity_icml.html

Danielle will lead:

Zhu, Zhang, Liu and Metaxas, Scalable histopathological image analysis via active learning. zhu_miccai2014.pdf

George will lead:

Herve Lombaert, Darko Zikic, Antonio Criminisi, and Nicholas Ayache. Laplacian Forests: Semantic Image Segmentation by Guided Bagging. miccai-laplacian-forest.pdf

**(Update 10/24/2014)** Notes are now available. reading-group-random-forests-laplacian-forests-2014-10-21-draft1.pdf

We will meet in *32-D451*.

Daniel C. Alexander, Darko Zikic, Jiaying Zhang, Hui Zhang, Antonio Criminisi. Image Quality Transfer via Random Forest Regression: Applications in Diffusion MRI. alexander_miccai2014.pdf

We will meet in *32-G431*.

Xiaoxiao Liu, Marc Niethammer, Roland Kwitt, Matthew McCormick, and Stephen Aylward. Low-Rank to the Rescue – Atlas-based Analyses in the Presence of Pathologies. liu2014_miccai_low_rank_to_the_rescue.pdf

Rohlfing T, Sullivan EV, Pfefferbaum A. Regression models of atlas appearance. ipmi2009_rohlfing.pdf

O. Veksler, Star shape prior for graph-cut image segmentation. ECCV 2008, Lecture Notes in Computer Science Volume 5304, 2008, pp 454-467. starshapeprior.pdf

Since we didn’t have dessert, this is my suggestion for next week. I hope you enjoy it. SUNGKYU JUNG, IAN L. DRYDEN and J. S. MARRON. “Analysis of principal nested spheres (PNS)”, Biometrika (2012), http://www.stat.pitt.edu/sungkyu/papers/Biometrika-2012-Jung-551-68.pdf

The reading is split into an amuse-bouche and an entree. I intend on only discussing the entree at length.

Amuse-bouche: To whet your appetite, skim chapter 8 of Boyd’s ADMM monograph for a sampler of applications where ADMM is used (among these are a few I sketched last time: lasso, group lasso, SVMs): http://www.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf

Entree: Concentrate your firepower on reading the following paper by Yedidia and friends, who show how ADMM can be implemented as a message-passing algorithm and how to modify the algorithm to tackle some large nonconvex problems: http://arxiv.org/abs/1305.1961

**Update 4/30/2014 2:28am**: Notes by me (George) for last reading group are now up! reading-group-admm-intro-2014-04-15-notes.pdf (last updated 4/30 to fix some typos caught by Danielle)

We begin our series on distributed optimization, with a preliminary focus on a method called ADMM. Please read chapters 1-3 of the following: http://www.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf

A resource webpage that might be helpful: http://www.stanford.edu/~boyd/papers/admm_distr_stats.html
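For orientation before diving into chapters 1-3, here is a rough sketch of the ADMM iteration in its standard unscaled form. The symbols ($f$, $g$, $A$, $B$, $c$, $\rho$, and the dual variable $y$) follow the usual convention for this problem class, so treat the notation as an assumption and check it against the monograph as you read.

```latex
\begin{aligned}
&\text{minimize } f(x) + g(z) \quad \text{subject to } Ax + Bz = c, \\
&L_\rho(x, z, y) = f(x) + g(z) + y^\top (Ax + Bz - c) + \tfrac{\rho}{2}\,\|Ax + Bz - c\|_2^2, \\
&x^{k+1} = \arg\min_x L_\rho(x, z^k, y^k), \\
&z^{k+1} = \arg\min_z L_\rho(x^{k+1}, z, y^k), \\
&y^{k+1} = y^k + \rho\,(Ax^{k+1} + Bz^{k+1} - c).
\end{aligned}
```

The point to keep in mind is that $x$ and $z$ are updated one after the other rather than jointly, which is what lets the two objective terms be handled separately.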

Ben Glocker, A Sotiras, N Komodakis, Nikos Paragios. “Deformable Medical Image Registration: Setting the State of the Art with Discrete Methods”, Annual Review Biomedical Engineering, 2011, 13: 219-244. glocker11.pdf

Wang et al, “Markov Random Field modeling, inference & learning in computer vision & image understanding: A survey”, CVIU, 2013. Technical report (42pages) http://hal.inria.fr/docs/00/73/49/83/PDF/GraphicalModelSurvey.pdf, or CVIU paper (18 pages) wang13.pdf

Yang et al, Neighbor-Constrained Segmentation With Level Set Based 3-D Deformable Models, IEEE TMI 2004 yang_neighborconstrainedsegmentation.pdf

Firdaus Janoos, Shantanu Singh, Raghu Machiraju, William M. Wells III, István Ákos Mórocz. State-Space Models of Mental Processes from fMRI. IPMI 2011. janoos-ipmi2011.pdf

And here is the journal paper: janoos_2013.pdf

Harini Eavani, Theodore Satterthwaite, Raquel Gur, Ruben Gur, Christos Davatzikos; Unsupervised Learning of Functional Network Dynamics in Resting State fMRI http://www.rad.upenn.edu/sbia/Harini.Eavani/ipmi13/IPMI_2013_Harini_camera_ready_ver2.pdf

Ioannidis, J.P.A. Why most published research findings are false why_most_published_research_findings.pdf (see also this rebuttal paper and Andrew Gelman's counter-rebuttal).

Probabilistic inference of regularisation in non-rigid registration. Ivor J.A. Simpson, Julia A. Schnabel, Adrian R. Groves, Jesper L.R. Andersson, Mark W. Woolrich. simpson-neuroimage-2012.pdf

Simpson et al, *A Bayesian Approach for Spatially Adaptive Regularisation in Non-rigid Registration* bayes_reg.pdf (MICCAI)

We continue our patch series by moving from super-resolution to segmentation.

Wang et al, 2013, *Patch-Based Segmentation without Registration: Application to Knee MRI* wang_seg_2013.pdf (MLMI-MICCAI)

We will start a series of patch-based methods. For Nov 4th, we’ll look at:

Shi et al, 2013. *Cardiac Image Super-Resolution with Global Correspondence Using Multi-Atlas PatchMatch* shi_et_al_2013.pdf (MICCAI)

As an **optional** read, many algorithms use or refer to PatchMatch - Barnes et al., 2009.

We will discuss interesting papers from IPMI and MICCAI. Everyone should come with a couple of papers they are interested in exploring.

We will read Tree-space statistics and approximations for large-scale analysis of anatomical trees, by A. Feragen, M. Owen, J. Petersen, M.M.W. Wille, L.H. Thomsen, A. Dirksen and M. de Bruijne http://image.diku.dk/aasa/papers/tree_stats_ipmi2012_cameraready.pdf

Matched Signal Detection on Graphs: Theory and Application to Brain Network Classification, by C. Hu, L. Cheng, J. Sepulcre, G. El Fakhri, Y. M. Lu, and Q. Li.

Chapter 5 of this book: “Handbook of Markov Chain Monte Carlo”, which can be found here

Chapter 27 of this Book: “Bayesian Reasoning and Machine Learning”

Globerson and Roweis, Nightmare at Test Time: Robust Learning by Feature Deletion, ICML, 2006 [globerson_and_roweis_2006.pdf]

Kayhan and I (George) will talk about a new approach to nonnegative matrix factorization (NMF) that is NOT based on iterative hill-climbing type algorithms that may only reach a local optimum; instead, under a separability assumption that is empirically observed in real topic modeling data, NMF for learning topic models can be solved in polynomial time. The paper that we’ll look at is on arXiv, and we’ll focus on everything up to and including Section 3.1 (midway through page 8; basically we’ll focus on the case where there are true anchor words):

"Learning Topic Models -- Going Beyond SVD"

Sanjeev Arora, Rong Ge, Ankur Moitra

Foundations of Computer Science 2012

**Update (3/11/2013 5:10pm).** I just put together some preliminary notes going over some key high-level ideas; hopefully this is helpful: reading-group-learning-topic-models-2013-03-12-draft1.pdf

We met to plan what we’ll read for the spring. Here are the results of the voting:

- Non-negative matrix factorization (6)
- Learning with missing features (5)
- Tutorial on sampling methods (MCMC, HMC, etc) (5)
- Type I vs Type II sparsity (4): David Wipf and Yi Wu. “Dual-Space Analysis of the Sparse Linear Model” NIPS 2012 http://nips.cc/Conferences/2012/Program/event.php?ID=3436
- Video Magnification (4) Initial paper
- Tutorial on deep learning (4)
- Compressed sensing (3) Paper?
- Learning distributions from composition of marginals (3): F. Sanchez-Vega, J. Eisner, L. Younes, and D. Geman. “Learning Multivariate Distributions by Competitive Assembly of Marginals” IEEE PAMI, 2012.
- Network/time clustering in fmri (3): Identification of Recurrent Patterns in the Activation of Brain Networks
- ADMM for distributed optimization (3)
- sym diffeomorphism long. (3)
- significance for lasso (2)
- Tutorial on Indian Buffet processes (2)
- Non-quadratic priors (0)
- Multidimensional Spectral Hashing (0)
- Ordered based search (0)

We will read (Mert will lead): A. Criminisi, J. Shotton, and E. Konukoglu, “Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning” The file is big so here’s a link: http://research.microsoft.com/apps/pubs/default.aspx?id=158806

Depending on interest, this reading might span several weeks. In our first session, let’s plan to read up to the end of chapter 4, namely the one on regression. On an unrelated note, Ender, one of the co-authors of this paper, will be present.

We also read the following MICCAI paper: E. Konukoglu, B. Glocker, D. Zikic, A. Criminisi, “Neighbourhood Approximation Forests”, MICCAI 2012: neighbourhood_approximation_forests.pdf.

We will continue with MICCAI 2012 papers and read: “Hierarchical Manifold Learning” by Kanwal K. Bhatia, Anil Rao, Anthony N. Price, Robin Wolz, Joseph V. Hajnal, Daniel Rueckert: hierarchicalmanifoldlearning.pdf

We will read this paper: “SVM based significance maps” by Bilwaj Gaonkar, Christos Davatzikos bilwaj2012.pdf

We will read our first paper from the MICCAI 2012 series: “Evaluating segmentation error without ground truth” by Timo Kohlberger, Vivek Singh, Chris Alvino, Claus Bahlmann, Leo Grady: kohlberger2012_gt.pdf

MICCAI Discussion

Decided for next week: “Evaluating segmentation error without ground truth”

To invite for talk — Neighborhood approximation forests, 10

Later reading series - Reading up on random forests (Antonio’s book), 10

Other readings:

Evaluating segmentation error without ground truth, 9

SVM based significance maps, 8

Geodesic information flow, 5

Hierarchical manifold learning, 5

We will read Chapter 3 of Tom Minka’s thesis on Expectation Propagation minka-thesis.pdf. There is a fair amount of derivation, so hopefully, we can get a clear picture of the algorithm.

Additional references on EP and other approximate inference algorithms can be found at: http://research.microsoft.com/en-us/um/people/minka/papers/ep/roadmap.html

We’ll expand on the spatially dependent Pitman-Yor processes we covered a couple of weeks ago, and read:

Soumya Ghosh and Erik B. Sudderth, Nonparametric Learning for Layered Segmentation of Natural Images, CVPR 2012. ghoshsudderth12cvpr.pdf

The supplementary materials cover some detail that we probably won’t have time to cover, but I’ve uploaded them for completeness. ghoshsudderth12cvpr-supplement.pdf

Also relevant is a video lecture on the topic by Erik Sudderth from a NIPS 2011 workshop: http://videolectures.net/nipsworkshops2011_sudderth_segmentation

We will read the original Hierarchical Dirichlet Process paper: Teh, Jordan, Beal, Blei: Hierarchical Dirichlet Processes hierarchical_nonparametric.pdf

Here are some helpful notes written up by Danial: dp_brief_notes.pdf.

We’ll look at a paper that uses a layered approach to image segmentation:

Sudderth and Jordan, Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes, NIPS 2008. sudderth-jordan-nips08.pdf

We’ll look at two papers that use/extend SWA, and cover a little more detail on supernode/seed selection:

Alpert, S., Galun, M., Basri, R. and Brandt, A., Image segmentation by probabilistic bottom-up aggregation and cue integration, PAMI 2012. alpert_aggregating_pami_2012.pdf

Goldschmidt, Y., Galun, M., Sharon, E., Basri, R. and Brandt, A. Fast multiscale clustering by integrating collective features, NIPS 2007 goldschmidt2007nips.pdf

Note that the first paper has the same title as their 2007 CVPR submission, so be careful if searching.

We will look at a paper on tumor segmentation via superpixel hierarchy: corso_et_al.pdf

J. J. Corso, E. Sharon, S. Dube, S. El-Saden, U. Sinha, and A. Yuille. Efficient Multilevel Brain Tumor Segmentation with Integrated Bayesian Model Classification. IEEE Transactions on Medical Imaging, 27(5):629-640, 2008

If you want to read into the SWA algorithm: Hierarchy and adaptivity in segmenting visual scenes, nature04977.pdf

We will discuss S. C. Zhu’s paper on image segmentation: zhu_pami96.pdf A brief review of snakes is provided in Sec. 2.1 and in the Appendix of the paper, and active contours are explained in more detail in kass_ijcv88.pdf and caselles_ijcv97.pdf.

We will finish chapter 5 and go over chapter 6.

This week, we will go over the 5th chapter (Filtering on Graphs) of “Discrete Calculus” by Leo Grady.

We’ll start with Sebastian’s Nature paper: http://www.nature.com/nature/journal/v401/n6755/pdf/401788a0.pdf

The paper gives intuition for non-negative matrix factorization and compares it to several other approaches.

For reference, the detailed algorithms paper is here: http://hebb.mit.edu/people/seung/papers/nmfconverge.pdf
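As a rough illustration of the multiplicative update rules for the squared-error objective, here is a minimal pure-Python sketch. The function names, the toy matrix `V`, the iteration count, the random initialization, and the small `eps` guarding against division by zero are all our own assumptions for illustration, not taken from the paper.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))

def nmf(v, rank, iters=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start (an assumption; any positive init works).
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

# Toy example: an exactly rank-2 nonnegative matrix.
V = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [1.0, 3.0, 3.0], [2.0, 5.0, 3.0]]
W, H = nmf(V, rank=2)
```

Because each update only multiplies by nonnegative ratios, the factors stay nonnegative without any explicit projection step.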

To understand the algorithms: A Tutorial on MM Algorithms

On the Equivalence of NMF and Spectral Clustering

Understanding the similarities and the differences between probabilistic topic models and NMF will be useful. A potential paper is: Probabilistic Latent Variable Models as Nonnegative Factorizations

We’ll read: 4D registration of serial brain’s MR images: a robust measure of changes applied to Alzheimer’s disease. Marco Lorenzi, Nicholas Ayache, Giovanni Frisoni, and Xavier Pennec. MICCAI STIA Workshop, 2010

We’ll be reading on determinantal point processes. More precisely, we will read chapter 2 of Alex’ thesis:

A. Kulesza, Learning with Determinantal Point Processes, thesis draft. alex_thesis_draft.pdf

Additional material. Conference papers:

A. Kulesza, and B. Taskar, Structured Determinantal Point Processes, Neural Information Processing Systems Conference (NIPS), Vancouver, BC, December 2010. sdpp_nips10-1.pdf

k-DPPs: Fixed-Size Determinantal Point Processes, A. Kulesza, and B. Taskar. International Conference on Machine Learning (ICML), Bellevue, WA, June 2011. kdpps_icml11-1.pdf

Learning Determinantal Point Processes, A. Kulesza, and B. Taskar. Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, July 2011. ldpps_uai11.pdf

Math DPP surveys:

Ben Hough, Manjunath Krishnapur, Yuval Peres, Bálint Virág, Determinantal Processes and Independence, Probability Surveys, 2006. determinantal_processes_and_independence_-_hough.pdf

Alexei Borodin, Determinantal point processes, 2009. determinantal_point_processes_-_borodin.pdf

Switching gears, we’ll read - Vounou M, Nichols TE, Montana G; Alzheimer’s Disease Neuroimaging Initiative. Discovering genetic associations with high-dimensional neuroimaging phenotypes: A sparse reduced-rank regression approach. Neuroimage, 53(3):1147-59, 2010 vounou_et_al_2010_-_neuroimage.pdf.

We will read Dan Feldman’s NIPS paper. Here’s an **updated** PDF that includes supplemental material: nips11coresets-supplemental.pdf

Here’s the NIPS talk: http://videolectures.net/nips2011_faulkner_coresets/

**Update (3/19/2012, 10:17pm)** Preliminary notes that I (George) typed up for last week’s discussion are here: reading_group_2012_03_13_notes.pdf Unfortunately, from what I can tell, some constants from the lecture notes don’t quite match constants in literature cited at times. I tried to reconcile these discrepancies. The main message is *not* any different though; basically the k-center coreset construction algorithm has a horrendous running time in terms of the number of clusters k.

We will finish reading the survey paper from last week, emphasizing clustering (section 6) and the high-dimensional setting (section 7).

**Update (3/13/2012, 2:34am)** Preliminary notes that I (George) typed up for last week’s discussion are here: reading_group_2012_03_06_notes.pdf Unfortunately, these may not be terribly helpful for this week’s reading as the clustering material is quite different...

In preparation for Dan Feldman’s coresets talk in April, we’ll be reading a (dated) survey on the basics of coresets from computational geometry. Basically the idea is that we want to approximate some interesting metric of a bunch of data points by using a hopefully substantially smaller subsample (called a coreset) of the data points. The reading:

P.K. Agarwal, S. Har-Peled, and K.R. Varadarajan. “Geometric Approximation via Coresets”. (2005) coresets_survey_2005.pdf
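To make the "small subsample that preserves a metric" idea concrete, here is a toy sketch that assumes the metric of interest is the directional width of a 2D point set. Keeping only the extreme point along each of a few sampled directions is a simplified stand-in of our own, not the construction from the survey; the function names, the number of directions, and the synthetic data are hypothetical.

```python
import math
import random

def directional_width(points, angle):
    """Width of a 2D point set along the unit direction at the given angle."""
    ux, uy = math.cos(angle), math.sin(angle)
    proj = [x * ux + y * uy for x, y in points]
    return max(proj) - min(proj)

def extreme_point_subset(points, num_directions):
    """Keep, for each evenly spaced direction, the point farthest along it."""
    chosen = set()
    for i in range(num_directions):
        angle = 2 * math.pi * i / num_directions
        ux, uy = math.cos(angle), math.sin(angle)
        chosen.add(max(points, key=lambda p: p[0] * ux + p[1] * uy))
    return sorted(chosen)

# Synthetic elongated point cloud (illustrative data only).
rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 0.3)) for _ in range(2000)]
subset = extreme_point_subset(points, 16)

# Compare the width of the full set and of the subset along an unsampled direction.
angle = 0.3
full_width = directional_width(points, angle)
subset_width = directional_width(subset, angle)
```

With 16 directions the subset has at most 16 points, and its width along any direction can fall short of the full set's width by at most 4Rπ/16, where R is the largest distance of a point from the origin. That is only an additive guarantee; the survey is after relative-error guarantees, which take more work.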

As a preview for what’s to come, here’s Dan’s paper at NIPS 2011 on how to build a coreset for a (huge) training dataset of a mixture model while retaining an $\epsilon$-approximation to the data likelihood; the result is that you can run, say, the EM algorithm for Gaussian mixture models on the coreset instead (**this is NOT part of the reading for Mar 6** although I’d recommend looking over the abstract to see what coresets can achieve now):

D. Feldman, M. Faulkner, and A. Krause. “Scalable Training of Mixture Models via Coresets”. (2011) coresets_nips11.pdf

We’ll be reading a paper on a model and algorithm for Gaussian process classification:

Urtasun, R. and Darrell, T.: Discriminative Gaussian Process Latent Variable Model for Classification. ICML ‘07

We will read Mackay’s tutorial on Gaussian processes: mackay-gaussian-processes.pdf

Thanks to Iman for finding an updated version: mackay-gaussian-processes-frombook.pdf

We’ll focus on the first 6 sections.

Another useful reference is Williams’ tutorial: williams-gaussian-process.pdf (only Sections 3 and 4 are relevant to this week’s discussion).

We will read the first 4 sections of “A Riemannian Framework for Tensor Computing” riemannian_framework_for_tensor_computing_2005.pdf.

We will be reading the paper “Smooth relevance vector machine: a smoothness prior extension of the RVM” by Alexander Schmolck and Richard Everson (Mach Learn (2007) 68: 107–135) smoothrvm.pdf

We’ll discuss and compare it to papers we covered earlier, Mert’s MICCAI sabuncumiccai2011.pdf and journal paper (coming soon) as well as Tipping’s original work on RVM tippingjmlr2001_sparsebayesianlearningandtherelevancevectormachine.pdf.

Spatially regularized SVM for the detection of brain areas associated with stroke outcome. Remi Cuingnet, Charlotte Rosso, Stephane Lehericy, Didier Dormont, Habib Benali, Yves Samson, and Olivier Colliot. MICCAI 2010. miccai2010_cuingnet.pdf

UPDATE: Annotations added. I recommend reading most of the introduction for context and background, so I’ve left it without annotations. In the Methods section, I’ve highlighted titles of subsections that are pertinent to the results I intend to cover. Similarly, in the Results section, I’ve highlighted subsections I intend to cover and struck through everything we won’t discuss. Despite this, there is still a fairly large amount of text to read, some of which is somewhat technical (in a neuroscientific sense). Therefore, I recommend initially focusing on the results figures, which are highly illustrative and well described in their respective captions, then referring back to the text.

Annotated PDF: j_neurophys-2011-yeo-annot-print.pdf

We plan to read parts of Thomas’ (and others’) paper on a large scale study of resting state functional connectivity in healthy subjects. The discussion should mostly be centred around particularly interesting results that could serve as motivation for future work in our group. Once I’ve read all of the paper in detail, I’ll try to narrow down the results sections so that the amount of material is reasonable.

B. T. Thomas Yeo, Fenna M. Krienen, Jorge Sepulcre, Mert R. Sabuncu, Danial Lashkari, Marisa Hollinshead, Joshua L. Roffman, Jordan W. Smoller, Lilla Zöllei, Jonathan R. Polimeni, Bruce Fischl, Hesheng Liu and Randy L. Buckner; The organization of the human cerebral cortex estimated by intrinsic functional connectivity j_neurophys-2011-yeo.pdf

met-mlbayes.pdf We’re going back in time and covering a tutorial on Bayesian inference in machine learning by Tipping. It offers a nice build-up from least squares regression to the RVM. The models in Section 4 are closely related to the RVoxM readings from the previous 2 weeks.

rvmpath.pdf I put together a quick list of papers and references for learning the RVM and RVoxM. If you are interested in this topic, take a look at the file for more information/paths to take.

For the curious, here are some results from a quick (potentially buggy) implementation: tipping.pdf. I tried to keep the colors the same as those in Tipping’s Figure 5, although I had to guess some of the parameters, so the curves are slightly different. The new blue line shows the (-log) joint that we were wondering about, and the red is the (-log) marginal likelihood from the paper (these latter two curves were rescaled). Although they differ slightly, all three estimates of lambda (validation, joint, marginal) seem about equally decent.

** Note special time this week: Thursday, Dec 1, 4pm **

We’ll go into the hyperparameter estimation derivations from the Relevance Voxel Machine. I’ve emailed a draft of the journal RVoxM paper to V-Golland. We will concentrate on Sections II, III and VI (Theory and Appendix).

For the interested, more reading into RVM:

Michael Tipping maintains a website on “Sparse Bayesian Models (& the RVM)”: http://www.miketipping.com/index.php?page=rvm

One of the papers, “Sparse Bayesian learning and the relevance vector machine” tipping01a.pdf has relevant derivations in sec 1, 2, 5. (see http://www.miketipping.com/index.php?page=corrections for corrections)

Fletcher fletcher_-_rvm_explained.pdf also dives right into the theory (much less discussion on motivation/application/etc)

See 2004-wipf-ieeesigproc.pdf for another perspective on sparsity

** Note special time this week: Tuesday, Nov 22, 4pm **

We’ll be reading Mert’s MICCAI paper - Sabuncu and Leemput, The Relevance Voxel Machine (RVoxM): A Bayesian Method for Image-based Prediction, MICCAI, 2011 - sabuncumiccai2011.pdf

We will read “Connectivity-Informed fMRI Activation Detection” by Bernard Ng, Rafeef Abugharbieh, Gael Varoquaux, Jean Baptiste Poline and Bertrand Thirion. ng_connectivityinformedfmri_miccai11.pdf

We’ll read up on RKHSs. We’ll start with this tutorial: daume04rkhs.pdf Sections 1-5 are straightforward and should be mostly review, so we’ll just be discussing Section 6. There are a few typos:

At the end of 6.2, the first sentence of the last paragraph is clearer as “The property of reproducing kernels that we need is $\langle f, K(x,\cdot)\rangle = f(x)$.”

The right-hand side of the equation in 6.4.1 should read $\lambda\phi(x)$ (instead of $x^\prime$).

The notes that George posted are also relevant; here are the notes and the blurb: On an unrelated note, here are some notes I hacked up about basics of Reproducing Kernel Hilbert Spaces and Kernel Ridge Regression: reading_group_prep_rkhs_2011_10_14.pdf This is NOT related to the 10/19 reading; it’s related to the 10/12 reading. This could be useful for when we return to talking about kernel methods. **Update 10/13**: A few minor typos have been corrected. **Update 10/14**: The proof for the closed-form solution for kernel ridge regression has been extended to allow for possibly non-invertible kernel matrices.
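For reference, the closed-form solution for kernel ridge regression is usually written as below. This is a sketch under one common normalization of the objective (no $1/n$ factor in front of the loss); the notes may scale $\lambda$ differently, so the symbols here are assumptions rather than a transcription.

```latex
\hat f = \arg\min_{f \in \mathcal{H}} \sum_{i=1}^n \bigl(y_i - f(x_i)\bigr)^2 + \lambda \|f\|_{\mathcal{H}}^2,
\qquad
\hat f(x) = \sum_{i=1}^n \alpha_i K(x_i, x),
\qquad
\alpha = (\mathbf{K} + \lambda I)^{-1} y, \quad \mathbf{K}_{ij} = K(x_i, x_j).
```

Since $\mathbf{K}$ is positive semidefinite, $\mathbf{K} + \lambda I$ is invertible for any $\lambda > 0$, so the expression for $\alpha$ is well defined even when $\mathbf{K}$ itself is singular.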

We will go through the details of INLA and discuss applications. Please read Section 4 in addition to the first part of the paper.

Here’s a write-up on Laplace’s approximation from 6.437: laplace.pdf
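As a quick reminder, the basic form of the approximation for a $d$-dimensional integral is sketched below. It assumes $f$ is twice differentiable with a single dominant maximum; the symbols $\hat{x}$ and $H$ are our own notation and may not match the write-up.

```latex
\hat{x} = \arg\max_x f(x), \qquad H = -\nabla^2 f(\hat{x}), \qquad
\int_{\mathbb{R}^d} e^{f(x)}\,dx \;\approx\; e^{f(\hat{x})}\,(2\pi)^{d/2}\,|H|^{-1/2}.
```

Equivalently, $e^{f(x)}$ is replaced by an unnormalized Gaussian centered at $\hat{x}$ with covariance $H^{-1}$.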

We will read about integrated nested Laplace approximations (it’s a 74-page beast, but references begin on page 32 followed by discussions that happened after publication): Håvard Rue, Sara Martino, and Nicolas Chopin. “Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations.” JRSS 2009. inla.pdf

**Update 10/18**: Just read through the end of section 3; this amounts to ~13.3 pages of reading. **Update 10/19 1:38am**: Notes I (George) will use today at reading group. They’re high-level and almost surely have bugs: reading_group_inla_notes_2011_10_19_0137.pdf

Nicolas Duchateau, Mathieu De Craene, Gemma Piella, and Alejandro F. Frangi. Characterizing Pathological Deviations from Normality using Constrained Manifold-Learning. MICCAI 2011. duchateau-miccai2011.pdf

Rómer Rosales and Brendan Frey. Learning Generative Models of Affinity Matrices: UAI 2003 rosales_frey_similarity.pdf

Firdaus Janoos, Shantanu Singh, Raghu Machiraju, William M. Wells III, István Ákos Mórocz. State-Space Models of Mental Processes from fMRI. IPMI 2011. janoos-ipmi2011.pdf

Fani Deligianni, Gael Varoquaux, Bertrand Thirion, Emma Robinson, David J Sharp, A David Edwards, and Daniel Rueckert. A Probabilistic Framework to Infer Brain Functional Connectivity from Anatomical Connections. IPMI 2011. deligianni2011.pdf

Gaël Varoquaux, Alexandre Gramfort, Fabian Pedregosa, Vincent Michel, Bertrand Thirion. “Multi-subject Dictionary Learning to Segment an Atlas of Brain Spontaneous Activity.” IPMI 2011. ipmi2011-multi-subject-dict-learning-fmri.pdf

We’ll cover Stanley Durrleman, Marcel Prastawa, Guido Gerig, Sarang C. Joshi. “Optimal Data-Driven Sparse Parameterization of Diffeomorphisms for Population Analysis.” IPMI 2011. ipmi2011-sparse-diffeomorphisms.pdf

We will continue covering Wainwright’s notes on sparse linear models (i.e. same reading as July 12 and 19). In particular, we will blow up the plot on Page 92, work through the proof for Theorem 5.3 and Example 5.7, and finish discussion about pairwise incoherence and the restricted isometry property (RIP). As an illustrative example, I (George) will also discuss Exercise 5.2 at the end of the chapter (showing that pairwise incoherence implies the restricted nullspace property, without using RIP results). Non-trivial probabilistic results that are used in Example 5.7 and Exercise 5.2c will be briefly discussed.

Definition 5.3 and Theorem 5.3 are in hard-to-read shaded boxes, which are reproduced here: reading_group_july_26.pdf

Page 92 plot, blown up (1.7MB): http://people.csail.mit.edu/georgehc/pg92_plot_blown_up.jpg

**Update (July 24, 1:48pm)**: Notes from the previous reading group are now available: reading_group_2011_07_19_notes.pdf

**Update (July 25, 11:43am)**: Preliminary supplemental notes for this week’s reading group are now available: reading_group_2011_07_26_notes_draft2.pdf

We will read the notes on sparse linear models, focusing on the conditions, their relationships, and the proofs. Try to come up with or keep in mind a roadmap of where to go from here as you read.

We will cover Martin Wainwright’s notes on sparse linear models up to and including section 5.3.2.

wainwright-sparse-linear-models.pdf

The gray boxes are kind of hard to read, so they’re reproduced in this supplement:

We will continue the discussion on CCA. In addition to the June 14th reading, see

I. Rustandi Thesis, Predictive fMRI Analysis for Multiple Subjects and Multiple Studies - Section 3.3. cmu-cca-ch3.pdf (see section 3.3)

D. R. Hardoon, S. Szedmak and J. Shawe-Taylor, Canonical correlation analysis; An overview with application to learning methods, T.R. 2003. hardoon-03.pdf

Further reading

F.R. Bach and M.I. Jordan, A Probabilistic Interpretation of Canonical Correlation Analysis, T.R. 2005 bachjordan05.pdf

H. Hotelling, Relations Between Two Sets of Variates, Biometrika, 1936. hotelling36.pdf

Original Infomax paper. Bell and Sejnowski, Neural Computation, 1995. infomax.pdf

Cardoso ‘97 paper that connects Infomax to Max. Likelihood: cardoso1997_infomax_ml.pdf

We’ll talk about ICA. The paper will be:

ICA Tutorial (with sections devoted to Infomax ICA):

- Independent Component Analysis: Algorithms and Applications, Aapo Hyvärinen and Erkki Oja, 2000. icatutorial.pdf

We’ll talk about an ICA method for evaluating fMRI and genetic data. The paper will be:

- J. Liu et al., Combining fMRI and SNP data to investigate connections between brain function and genetics using parallel ICA, HBM 2009. 2009_liu-hbm.pdf

A relevant review paper (optional):

- Calhoun et al. A review of group ICA for fMRI data and ICA for joint inference of imaging, genetic, and ERP data, NeuroImage, 2009 calhoun2009.pdf

We will be discussing “Joint Bayesian Cortical Sulci Recognition and Spatial Normalization” by Perrot et al. from IPMI 2009. perrot_ipmi_09.pdf

Gael Varoquaux, Alexandre Gramfort, Jean-Baptiste Poline, Bertrand Thirion; Brain covariance selection: better individual functional connectivity models using population prior. NIPS 2010 varoquax-fmri-nips10.pdf

We’ll look at “Fitting a graph to vector data” fitting_a_graph_to_vector_data.pdf, which gives an alternate approach to K-nearest neighbors and epsilon radius for creating the initial graph.

We will look more closely at the derivation that leads to the eigenvector problem in one of the references in the paper we read last time. Specifically, we will discuss theorems 14.2.1 and 14.4.2 in Multivariate Analysis by Mardia, Kent and Bibby. multivariate_analysis.pdf

Proof that the SVD is the best low rank approximation and the exercise in Mardia svdproof.pdf

**Remark**: Given the discussion during reading group, it seems that a Cholesky decomposition would work. In particular, for real, positive semi-definite B, using Cholesky decomposition would produce lower-triangular, real-valued L such that:

B = L * L^T

As an example, if B is full-rank, then the columns of L would all be independent forming a basis (which may not be orthonormal; although we could say that the original data points also need not be orthonormal).

More information: http://en.wikipedia.org/wiki/Cholesky_decomposition#Statement
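The remark above can be checked numerically. Below is a minimal sketch in plain Python, assuming B is symmetric and strictly positive definite (the full-rank case in the remark); the function names `cholesky` and `gram` and the example matrix are illustrative, not from the reading. The rows of L can be read as one configuration of points whose pairwise inner products reproduce B.

```python
import math

def cholesky(B):
    """Return lower-triangular L with B = L L^T.

    Assumes B is symmetric and strictly positive definite; a singular
    (merely semi-definite) B would need a pivoted variant instead.
    """
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = B[i][i] - s
                if d <= 0.0:
                    raise ValueError("B is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (B[i][j] - s) / L[j][j]
    return L

def gram(L):
    """Inner products between the rows of L, i.e. the matrix L L^T."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in L] for ri in L]

# Hypothetical 3x3 inner-product matrix chosen for illustration.
B = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
L = cholesky(B)
B_rebuilt = gram(L)
```

Here `B_rebuilt` agrees with `B` up to rounding, and the nonzero diagonal of `L` is what makes its columns linearly independent.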

Actually, we will discuss Tenenbaum, J.B., de Silva, V., and Langford, J.C., A global geometric framework for nonlinear dimensionality reduction, Science, 290, p 2319-2323, 2000. isomap.pdf

Contrary to the first notice, we will NOT discuss: R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. PNAS, 2005. geometric-diffusion1.pdf

This paper demonstrates gradient-based implementation of MDS (known as Sammon mapping):

JOHN W. SAMMON, A Nonlinear Mapping for Data Structure Analysis. IEEE TRANSACTIONS ON COMPUTERS, VOL. C-18, NO. 5, MAY 1969 mds-numerical.pdf

We will discuss a view of spectral embedding via diffusion processes. Please read carefully the first paper and go over the second paper. Both present similar approaches to the problem.

A random walks view of spectral segmentation. Meila and Shi, AISTATS 2001. meila-aistats-2001.pdf.

R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. PNAS, 2005. geometric-diffusion1.pdf

U. Luxburg. A tutorial on spectral clustering. Statistics and Computing 2007. luxburg_spectralclustering_tutorial.pdf.

J. Shi, J. Malik: Normalized Cuts and Image Segmentation shimalik-normalizedcuts.pdf

We’ll mainly be focusing on section 2 - i.e. why it is that minimizing the normalized cut turns out to be a generalized eigenvalue problem.

We might also want to discuss how we can interpret this by forming a Markov chain on the graph, so I’ve added a relevant paper meila-aistats-2001.pdf.
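As a reminder of the argument in section 2, here is a condensed sketch of the relaxation, written in our own notation (W is the symmetric affinity matrix, D the diagonal degree matrix with d_i = Σ_j w_ij, and (A, B) the partition); it summarizes the standard derivation rather than reproducing the paper's steps line by line.

```latex
\begin{align*}
\mathrm{Ncut}(A,B) &= \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)}
                    + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)} \\[4pt]
\min_{(A,B)} \mathrm{Ncut}(A,B)
  &= \min_{y} \frac{y^\top (D - W)\, y}{y^\top D\, y}
  \quad \text{s.t. } y_i \in \{1, -b\},\;\; y^\top D \mathbf{1} = 0,
  \quad b = \frac{\sum_{i \in A} d_i}{\sum_{i \in B} d_i} \\[4pt]
\text{relax } y \in \mathbb{R}^n:\quad
  & (D - W)\, y = \lambda D\, y
  \;\Longleftrightarrow\;
  D^{-1/2} (D - W) D^{-1/2} z = \lambda z, \quad z = D^{1/2} y \\[4pt]
\text{with } P = D^{-1} W:\quad
  & (D - W)\, y = \lambda D\, y
  \;\Longleftrightarrow\;
  P\, y = (1 - \lambda)\, y
\end{align*}
```

The smallest generalized eigenvalue is 0 with eigenvector y = 1, which the constraint y^T D 1 = 0 rules out, so the relaxed minimizer is the eigenvector of the second-smallest eigenvalue. The last line is the link to the Markov chain view: P is the transition matrix of the random walk on the graph, and it shares eigenvectors with the generalized problem.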

We’ll be continuing Sharpnack’s paper on graph labeling, focusing on Theorem 1 and its proof:

J. Sharpnack, A. Singh; Identifying graph-structured activation patterns in networks sharpnack-graph-nips10.pdf

A video of the conference talk is here: http://videolectures.net/nips2010_sharpnack_igs/ (the talk doesn’t go into detail on the proofs, but does give a little bit of intuition for the terms in the theorem).

We’ll also look briefly at these papers on other spectral methods: shimalik-normalizedcuts.pdf ng-spectralclustering.pdf belkin-laplacianeigenmaps.pdf

We’ll be reading Sharpnack’s paper on graph labeling:

J. Sharpnack, A. Singh; Identifying graph-structured activation patterns in networks sharpnack-graph-nips10.pdf

A video of the conference talk is here: http://videolectures.net/nips2010_sharpnack_igs/

We will take another look at the paper by M. Nielsen, L. Florack and R. Deriche nielsen_regscalespace_jmathimage1997.pdf through Section 5.

We will clean up our discussion of the demons algorithms. The leftover items are the second-order optimization and the relationship between regularization and smoothing.

1) For the ESM method, we will look at section 3 of this paper:

Insight Into Efficient Image Registration Techniques and the Demons Algorithm. Tom Vercauteren, Xavier Pennec, Ezio Malis, Aymeric Perchant, and Nicholas Ayache. IPMI 2007. insighteffregdemons-ipmi07-vercauteren.pdf

2) For the regularization, we will look at these papers:

- Iconic Feature Based Non-Rigid Registration: The PASHA Algorithm. P. Cachier et al. (Appendix A only) cachier_pasha_cviu2003.pdf
- Regularization, Scale-Space and Edge Detection Filters. M. Nielsen, L. Florack, R. Deriche nielsen_regscalespace_jmathimage1997.pdf

Diffeomorphic demons: Efficient non-parametric image registration. Tom Vercauteren, Xavier Pennec, Aymeric Perchant, Nicholas Ayache. NeuroImage, 2009. vercauteren-2009.pdf

The original demons paper: J.-P. Thirion, Image matching as a diffusion process: an analogy with Maxwell’s demons. Medical Image Analysis (1998), volume 2, number 3. thirion98.pdf

From MICCAI, 2010: L. Risser, F.-X. Vialard, R. Wolz, D. Holm, D. Rueckert; Simultaneous Fine and Coarse Diffeomorphic Registration: Application to Atrophy Measurement in Alzheimer’s Disease; MICCAI 2010 miccai2010_risser.pdf

From MICCAI, 2010: P. Risholm, S. Pieper, E. Samset and W. M. Wells; Summarizing and visualizing uncertainty in non-rigid registration miccai2010_risholm.pdf

Background paper: P. Risholm et al. Bayesian Estimation of Deformation and Elastic Parameters in Non-rigid Registration wibr2010_risholm.pdf

From MICCAI, 2010: G. Varoquaux, F. Baronnet, A. Kleinschmidt, P. Fillard, and B. Thirion; Detection of Brain Functional-Connectivity Difference in Post-stroke Patients Using Group-Level Covariance Modeling varoquaux_grouplevelcovariance_miccai10.pdf

We will read Chapter 22 “Analysis of variance” (ANOVA) of Andrew Gelman’s “Data Analysis Using Regression and Multilevel/Hierarchical Models” book gelman.pdf. In particular, we will just be looking at pages 487-501 in the book (PDF pages 513-527). **Update 10/15/10 9:04PM**: The high quality PDF of just Chapter 22 is now available gelman-ch22.pdf.

Notes by George: reading-group-anova-2010-10-20-notes.pdf

We will read Bernardo and Smith’s coverage of reference priors. The relevant section from the book is here: bernardo_smith_referencepriors.pdf.

- Pages 298-316 (PDF pages 1-19) cover the basics of 1-dimensional reference priors,
- pages 316-320 (PDF pages 19-23) cover restricted reference priors (from a restricted family of distributions),
- pages 320-333 (PDF pages 23-36) cover reference priors with nuisance parameters,
- and pages 333-339 (PDF pages 36-42) cover the multidimensional case.

The next three meetings will provide an overview of Multilevel Regression and ANOVA using the book “Data analysis using regression and multilevel/hierarchical models” by Andrew Gelman. The chapters posted below come from the publisher’s website and have a few formatting issues. A short summary of these issues is available: format_errors.pdf

The material for the first 2 weeks comes from Part 2A gelman_pt2a.pdf

Aug 17th: Please read chapter 12, pgs.251-276, as an introduction to multilevel linear models. Also, please read pgs.244-247 from chapter 11, which attempts to explain the difference between “random” and “fixed” effects.

Aug 24th and 31st (pick one to attend): Please read chapter 13, pgs.279-297, which covers some more complex multilevel linear models with varying slopes and “non-nested” data.

Sep 7th: Multilevel regression applied to fMRI. Please read woolrich_behrens_multilevel.pdf

Gerber et al. Manifold modeling for brain population analysis. Medical Image Analysis 14 (2010) 643–653. gerber-media-2010.pdf

Jia et al. ABSORB: Atlas building by self-organized registration and bundling. NeuroImage 51 (2010) 1057–1070. jia_neuroimage_2010.pdf

We will start a 3-4 week overview of Neural Networks and Deep Belief Networks. The rough schedule is as follows:

July 13th: We will introduce the standard Neural Network setup, as presented in Bishop’s “Pattern Recognition and Machine Learning” bishop_patternrecognitionandmachinelearning_ch5.pdf. Please read Sections 5.1 and 5.2. In addition, please skim 5.3.1 and 5.3.2, as we may get started on Back-Propagation.

July 20th: We will talk about back-propagation. Please read Section 5.3 of Bishop and the original back-propagation paper by Rumelhart (rumelhart_backpropagatingerrors_nature86.pdf). If time permits, we may go over computing the Hessian (Section 5.4 of Bishop).

July 27th: We will discuss Hinton’s learning algorithm for Deep Belief networks hinton_fastlearningdeepbelief_neuralcomp06.pdf.

We will finish the LP paper from Mar 29.

We will discuss the meanfield solvers for MRFs.

Background from Wanmei’s thesis: meanfield.pdf

Excerpts from course notes: variational.pdf

__Graph cuts for energy minimization; Min Cut Max Flow:__

Yuri Boykov, Olga Veksler, and Ramin Zabih. Fast Approximate Energy Minimization via Graph Cuts. *IEEE PAMI* 23(11), 2001. boykov2001.pdf

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Min Cut Max Flow and the Ford-Fulkerson method. *Introduction to Algorithms, Second Edition* minmax.pdf

We will discuss primal-dual algorithms for solving linear programs. The first chapter of “Primal-Dual Interior-Point Methods” by Stephen Wright provides a good overview of the subject: wright_primaldual_1997.pdf

The tutorial on convex optimization by Haitham Hindi is a more complete overview of duality and optimality conditions: hindi_tutorialconvexopt.pdf

We will discuss primal-dual formulation in more detail. See pages 215-227 and 232-234 of Convex Optimization by Boyd and Vandenberghe: boyd_duality.pdf

A PDF of the full book is available at http://www.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf. Book pages 215-227,232-234 correspond to PDF pages 229-241,246-248.

We will also try to finish the labeling from last time.

Nikos Komodakis and Georgios Tziritas: Approximate Labeling via Graph Cuts Based on Linear Programming. in PAMI 2007 komodakis07linprog.pdf

We will stick to the PAMI paper, but a concise listing of the most important ideas is in: Nikos Komodakis et al. ICCV 2005 komodakis_iccv2005.pdf

Olivier Commowick, Simon K. Warfield, and Gregoire Malandain. Using Frankenstein’s Creature Paradigm to Build a Patient Specific Atlas; MICCAI 2009 miccai2009_commowick.pdf

We continued the discussion of the paper from the previous week.

Nematollah Batmanghelich, Ben Taskar, and Christos Davatzikos. A general and unifying framework for feature construction in image-based pattern classification; IPMI 2009 ipmi2009_batmanghelich.pdf

Davis, B.C., Fletcher, P.T., Bullitt, E., Joshi, S. Population Shape Regression From Random Design Data. ICCV 2007. davis-iccv2007.pdf

T. Rohlfing et al. Regression models for atlas appearance. IPMI 2009. ipmi2009_rohlfing.pdf

Ou; DRAMMS: Deformable registration via Attribute Matching and mutual-saliency weighting; IPMI 2009 ipmi2009_ou.pdf

Criminisi; Decision Forests with Long-Range Spatial Context for Organ Localization in CT Volumes, MICCAI/PMMIA 2009 criminisipmmia2009_decisionforests.pdf

Additional reading:

Breiman 01, Random Forests: breiman01randomforests.pdf

Murgasova, A spatio-temporal atlas of the growing brain for fMRI studies, MICCAI/IADB 2009 murgasova09development.pdf

D. R. Hardoon, S. Szedmak and J. Shawe-Taylor, Canonical correlation analysis; An overview with application to learning methods, T.R. 2003. hardoon-03.pdf

Further reading

H. Hotelling, Relations Between Two Sets of Variates, Biometrika, 1936. hotelling36.pdf

F.R. Bach and M.I. Jordan, A Probabilistic Interpretation of Canonical Correlation Analysis, T.R. 2005 bachjordan05.pdf

We will continue discussing the Elastic Net paper by Zou and Hastie: elastic_net_zou_hastie_2005.pdf

We will discuss the Elastic Net paper by Zou and Hastie: elastic_net_zou_hastie_2005.pdf

Will finish the sparse PCA paper.

~~We will discuss the Sparse PCA paper by Zou, Hastie, and Tibshirani: sparsepca.pdf~~

**The published version of the paper (which we’ll discuss) is here: spca_jcgs.pdf**

We continued discussing the IBP papers.

Indian Buffet Process: we will discuss

Griffiths and Ghahramani, Infinite latent feature models and the Indian buffet process, NIPS 2006. ibp-1.pdf

There are two other important papers:

IBP and the hierarchical Beta processes. hbp-2.pdf

Stick-breaking Construction for the IBP. stickbreakingibp.pdf

We will discuss

Discriminative Shape Alignment. Marco Loog and Marleen de Bruijne. IPMI 2009. loog-ipmi2009.pdf

Thomas will lead the discussion.

Proof of complex representations: procrusts-complex-representation.pdf

Two tutorials found by Tammy: procrustes-ch-04.pdf procrustes_tutorial.pdf

We will be discussing stochastic tractography: Friman, Bayesian Stochastic Tractography. This paper references the work by Behrens et al.: Behrens, Propagation of Uncertainty. Since both papers are fairly straightforward, if time permits, we will talk about both models.

We discussed the PAMI paper during our Reading Group; the MICCAI paper was listed for further reading.

Finsler Active Contours, J. Melonakos, E. Pichon, S. Angenent, A. Tannenbaum, IEEE PAMI 2008.

Finsler Tractography for White Matter Connectivity Analysis of the Cingulum Bundle J. Melonakos, V. Mohan, M. Niethammer, K. Smith, M. Kubicki, A. Tannenbaum, MICCAI 2007.

We used the MICCAI paper as the basis of the discussion; the thesis has much more detail.

MCMC Curve Sampling for Image Segmentation, Fan, A., J. Fisher, W. Wells, J. Levitt, A. Willsky, MICCAI 2007.

Curve Sampling and Geometric Conditional Simulation, Fan, A., MIT PhD Thesis, Nov. 2007.

Dirichlet Processes (contd.) We plan to read and understand the variational approximations from Kurihara et al.

Kurihara, Welling, Teh: Collapsed Variational Dirichlet Process Mixture Models cvdp.pdf.

Dirichlet Processes (contd.) We are continuing with the hierarchical Dirichlet processes from Teh et al.

Dirichlet Processes. We plan to read and understand Teh et al. But for the first session, we will spend time going through the basics of the nonparametric methods from the tutorial dirichlet_processes.pdf.

Here are some brief notes on DPs: dp_brief_notes.pdf.

Teh, Jordan, Beal, Blei: Hierarchical Dirichlet Processes hierarchical_nonparametric.pdf

We will finish the shapes paper from last time.

E. Klassen, A. Srivastava, W. Mio, and S. Joshi, Analysis of Planar Shapes Using Geodesic Paths in Shape Spaces, PAMI 2004 klassensrivastavamiojoshipami04.pdf

We will start by finishing the paper from last time. The homework is to derive the coefficients from the next level of splines from the coefficients of the previous level. We will then discuss the thin-plate splines.

Principal Warps: Thin-Plate Splines and the Decomposition of Deformations. FRED L. BOOKSTEIN. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 11, NO. 6, JUNE 1989. bookstein-89.pdf

Wanmei and Mert will lead the discussion.

We will finish the registration paper and discuss multi-level splines in a bit more detail. Michal will lead the discussion on the registration paper.

Scattered Data Interpolation with Multilevel B-Splines. Seungyong Lee, George Wolberg, and Sung Yong Shin. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 3, NO. 3, JULY–SEPTEMBER 1997. bsplines-2.pdf

Archana will lead the discussion on splines.

Nonrigid Registration Using Free-Form Deformations: Application to Breast MR Images. D. Rueckert, L. I. Sonoda, C. Hayes, D. L. G. Hill, M. O. Leach, and D. J. Hawkes. IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 18, NO. 8, AUGUST 1999. rueckert-ffd.pdf

These are several background papers on splines:

Surface Fitting with Hierarchical Splines DAVID R. FORSEY and RICHARD H. BARTELS. ACM Transactions on Graphics, Vol. 14, No. 2, April 1995, Pages 134-161. bsplines-1.pdf

Scattered Data Interpolation with Multilevel B-Splines. Seungyong Lee, George Wolberg, and Sung Yong Shin. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 3, NO. 3, JULY–SEPTEMBER 1997. bsplines-2.pdf

Principal Warps: Thin-Plate Splines and the Decomposition of Deformations. FRED L. BOOKSTEIN. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 11, NO. 6, JUNE 1989. bookstein-89.pdf

K. K. Bhatia, P. Aljabar, J. P. Boardman, L. Srinivasan, M. Murgasova, S. J. Counsell, M. A. Rutherford, J. Hajnal, A. D. Edwards, and D. Rueckert, Groupwise Combined Segmentation and Registration for Atlas Construction, MICCAI 2007 bahtiamiccai07.pdf

We will go over section V of the same paper.

We will work on details of geometry of the cost function and Lemma 2 in the sparse bayesian learning paper.

David P. Wipf and Bhaskar D. Rao. Sparse Bayesian Learning for Basis Selection. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2004. 2004-wipf-ieeesigproc.pdf has some theoretical explanation about why the sparse Bayesian learning is indeed sparse.

Section II shows the formulation, which is the same as Tipping’s ARD paper, and an EM algorithm to optimize the objective function.

Sections III and IV give the theoretical analysis of why the model favors sparse solutions. In particular, Section IV-A proposes a very interesting proof.

We will continue with the currents.

S. Durrleman, X. Pennec, A. Trouve, P. Thompson, N. Ayache. Inferring brain variability from diffeomorphic deformations of currents: An integrative approach durrleman_currents.pdf

M. Vaillant and J. Glaunes. Surface Matching via Currents vaillant_currents.pdf

S. Durrleman, X. Pennec, A. Trouve, P. Thompson, N. Ayache. Inferring brain variability from diffeomorphic deformations of currents: An integrative approach durrleman_currents.pdf

We are going to read Tipping’s paper on Sparse Bayesian Learning

tipping2003_fastmarginallikelihoodmaximisationforsparcemachinemodels.pdf. It is closely related to some other learning schemes.

2001-tipping-jmlr.pdf lays out the fundamental concepts for sparse Bayesian learning, and it may assist you to understand the above paper.

2004-wipf-ieeesigproc.pdf has some theoretical explanation about why the sparse Bayesian learning is indeed sparse.

Using the logarithm of odds to define a vector space on probabilistic atlases. Kilian M. Pohl , John Fisher , Sylvain Bouix , Martha Shenton , Robert W. McCarley , W. Eric L. Grimson , Ron Kikinis , William M. Wells Medical Image Analysis 2007. pohl-media-2007.pdf

Label Space: A Multi-object Shape Representation. James Malcolm, Yogesh Rathi, and Allen Tannenbaum. MICCAI 2008. malcolm-miccai08.pdf

Salima Makni, Philippe Ciuciu, Jérôme Idier, and Jean-Baptiste Poline. Joint Detection-Estimation of Brain Activity in Functional MRI: A Multichannel Deconvolution Solution. IEEE Trans. Sig. Proc., 53(9):3488-502, Sept. 2005. makni-tsp.pdf

Yonggang Shi, Zhuowen Tu, Allan L. Reiss, Rebecca A. Dutton, Agatha D. Lee, Albert M. Galaburda, Ivo D. Dinov, Paul M. Thompson, Arthur W. Toga: Joint Sulci Detection Using Graphical Models and Boosted Priors. IPMI 2007. shi-ipmi07.pdf

We will finish the GLM paper this time.

Friston, K., et al.: Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1994. friston_focalactivations.pdf

Friston, K., et al.: Statistical parametric maps in functional imaging: A general linear approach. Human Brain Mapping, 1995. friston_glm.pdf

Worsley, KJ, Friston, KJ.: Analysis of fMRI Time-Series Revisited—Again. Neuroimage, 1995. worsley_fmriagain.pdf

Myers, RH, Montgomery, DC. A tutorial on Generalized Linear Models. Journal of Quality Technology, 1997. myers_tutorialonglm.pdf

We will finish the paper from last week.

Joshua E. Cates, P. Thomas Fletcher, Martin Andreas Styner, Martha Elizabeth Shenton, Ross T. Whitaker: Shape Modeling and Analysis with Entropy-Based Particle Systems. IPMI 2007. cates-ipmi07.pdf

We will continue to discuss the LDDMM paper. The goal is to find a geometric interpretation of the first term in the sum in eq. 4, finish the theorem and talk about the metric (Sec 6).

We will discuss the computation behind LDDMM beg-lddmm.pdf. It will be an easier start than understanding the more theoretical underpinnings of the other LDDMM papers.

Note that the first lemma is the hardest part of the paper, but things get a lot easier after that.

We will discuss the Nystrom Method, which can be used to approximate the eigendecomposition of large matrices. An application of this technique to Spectral Clustering is presented in Fowlkes et al. fowlkes_spectralgrouping_nystrom.pdf
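For orientation, the core of the Nystrom extension can be written in a few lines. This is a sketch in our own notation, assuming the sampled block A is invertible; it is not meant to reproduce the exact normalization used for spectral clustering in the paper.

```latex
\begin{align*}
W &= \begin{bmatrix} A & B \\ B^\top & C \end{bmatrix},
  \qquad A \in \mathbb{R}^{n \times n} \text{ (affinities among the $n$ sampled points)},
  \quad B \in \mathbb{R}^{n \times m} \\[4pt]
A &= U \Lambda U^\top
  \qquad \text{(eigendecomposition of the small block only)} \\[4pt]
\bar{U} &= \begin{bmatrix} U \\ B^\top U \Lambda^{-1} \end{bmatrix},
  \qquad
  \hat{W} = \bar{U} \Lambda \bar{U}^\top
          = \begin{bmatrix} A & B \\ B^\top & B^\top A^{-1} B \end{bmatrix}
\end{align*}
```

Only A and B are ever formed, so the large block C is implicitly approximated by B^T A^{-1} B. The columns of the extended matrix are generally not orthogonal, so an additional orthogonalization step is needed before they can be used as approximate eigenvectors.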

We will discuss Rao’s paper on the uniqueness of the decomposition into a Gaussian and a non-Gaussian part. New paper 1966-rao-sankhyasera.pdf.

We will discuss Rao’s paper on the uniqueness of the decomposition into a Gaussian and a non-Gaussian part 1969-rao-annmathstatist.pdf.

We will continue with the ICA paper.

We will discuss Beckmann et al.’s Probabilistic ICA for fMRI paper beckmann2003.pdf

We will continue with the LDA discussion.

We will discuss Blei et al.’s Latent Dirichlet Allocation paper bleingjordan2003.pdf

We will also discuss this.

We will discuss Lilla Zollei’s paper. A Marginalized MAP Approach and EM Optimization for Pair-Wise Registration: 2007-zollei-ipmi.pdf

We will continue on the Wisharts paper; the homework is to get more comfortable with that particular distribution.

Bing Jian, Baba C. Vemuri: Multi-fiber Reconstruction from Diffusion MRI Using Mixture of Wisharts and Sparse Deconvolution. IPMI 2007. jian-vemuri-ipmi07.pdf

Anastasia will lead the discussion.

Schapire. Boosting overview schapire_msri.pdf

Additional papers:

Jerome Friedman, Trevor Hastie, Robert Tibshirani. Additive Logistic Regression: a Statistical View of Boosting (1998). friedman98additive.pdf

Torralba, Murphy and Freeman. Sharing visual features for multiclass and multiview object detection. sharing.pdf

Boosting Image Retrieval (Tieu, Viola, IJCV 2004) tieu_boostingimageretrieval.pdf

We will continue the discussion about the paper from last meeting.

Chris McIntosh and Ghassan Hamarneh. Is a Single Energy Functional Sufficient? Adaptive Energy Functionals and Automatic Initialization. MICCAI 2007. mcintosh-miccai2007.pdf

Shaohua Kevin Zhou and Dorin Comaniciu. Shape Regression Machine. IPMI 2007. zhou-ipmi2007.pdf

We will discuss the two simple examples defined in the last meeting: a closed curve and an open curve. Please work through the examples and come with beautiful MATLAB figures of the embedding.

We will also talk about wavelets. The first paper is what we already looked at; the other two are longer, more detailed versions.

R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Multiscale methods. PNAS, 2005. geometric_diffusion2.pdf

Additional papers on the topic: Diffusion wavelets and their use in spectral clustering: nadler06.pdf coifmanmaggoni06.pdf

Laplacian Eigenmaps by Belkin and Niyogi: laplacianeigenmaps.pdf

Further papers on discrete Laplace-Beltrami Operators (overview): xu-04-discretelaplace.pdf wardetzky-07-discretelbo.pdf

We will finish the diffusion map discussion and will talk about the second paper. The week after that, we will come back to the operators in the first paper.

We will start by discussing the first paper and go on to the second one. You only need to read the first paper for this meeting, but we will end up reading both by the end of this series.

R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. PNAS, 2005. geometric-diffusion1.pdf

R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Multiscale methods. PNAS, 2005. geometric_diffusion2.pdf

Additional papers on the topic: Diffusion wavelets and their use in spectral clustering: nadler06.pdf coifmanmaggoni06.pdf

We will finish the fast diff. paper. Mert and Thomas will lead.

Fast diffeomorphic registration. Thomas will lead the discussion.

We will finish Mahony’s paper.

Lie groups tutorial.

Continuing on the basic differential geometric notions required for understanding Mahony’s paper, we will review 5th chapter of:

Wolfgang Kuhnel and Bruce Hunt, *Differential Geometry: Curves - Surfaces - Manifolds*, AMS, Second Edition, 2005.

Please check the discussion for more details.

Danial will present: Mahony, R., Manton, J. H., “The Geometry of the Newton Method on Non-Compact Lie Groups,” JOURNAL OF GLOBAL OPTIMIZATION, 2002, VOL 23; NUMBER 3, pages 309–327. mahony.pdf

J.-P. Thirion, Image matching as a diffusion process: an analogy with Maxwell’s demons. Medical Image Analysis (1998), volume 2, number 3. Biz will present. thirion98.pdf

Tom Vercauteren, Xavier Pennec, Aymeric Perchant, Nicholas Ayache. Non-parametric Diffeomorphic Image Registration with the Demons Algorithm. MICCAI 2007.

Min Cut Max Flow and the Ford-Fulkerson method.

Introduction to Algorithms, Second Edition Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein.

Graph cuts for energy minimization. Koen will lead the discussion.

Yuri Boykov, Olga Veksler, and Ramin Zabih. Fast Approximate Energy Minimization via Graph Cuts. IEEE PAMI 23(11), 2001.

Finish the BP paper.

Belief propagation

Our main reference is: Constructing free-energy approximations and generalized belief propagation algorithms: J. Yedidia, et al. yedidiafreemanweiss2005.pdf

The basic mathematical motivation of belief propagation: A. M. Aji and J. McEliece gendislaw.pdf

BP is used for learning problems on different types of graphs: Bayesian Networks, Markov Random Fields, Junction graphs, and Factor graphs. To see examples: Y. Weiss and W. T. Freeman max-product_optimality.pdf and F. R. Kschischang, et al. sum-product.pdf

And a couple of other popular review papers by the same authors as our main paper: genbp.pdf and understandingbp.pdf.

William T. Freeman, Thouis R. Jones, and Egon C. Pasztor, Example-based super-resolution, IEEE Computer Graphics and Applications, March/April, 2002. cgasres.pdf

W. T. Freeman, E. C. Pasztor, O. T. Carmichael Learning Low-Level Vision International Journal of Computer Vision, 40(1), pp. 25-47, 2000. MERL-TR2000-05. tr2000-05.pdf

We will continue our discussion on projection pursuit and its connection to Infomax ICA.

Original Infomax paper:

InfoMax ICA algorithm and its connection to projection pursuit. An information-maximization approach to blind separation and blind deconvolution. Bell and Sejnowski, Neural Computation, 1995. infomax.pdf

ICA Tutorial (with sections devoted to Infomax ICA):

Independent Component Analysis: Algorithms and Applications, Aapo Hyvärinen and Erkki Oja, 2000. icatutorial.pdf

Projection Pursuit:

Please read the first 17 pages of the following paper.

M. C. Jones; Robin Sibson, *What is Projection Pursuit?,* Journal of the Royal Statistical Society. Series A (General), Vol. 150, No. 1. (1987), pp. 1-37. jones87.pdf

A much longer, more detailed review paper is below.

Peter Huber, *Projection Pursuit,* The Annals of Statistics, Vol. 13, No. 2 (June 1985), pp. 435-475. huber-1.pdf

MM algorithms: Hunter and Lange, *A Tutorial on MM Algorithms,* Am. Stat. 58(1):30-7, Feb. 2004. Clear Version mm_tutorial.pdf

Hunter and Lange, *A Tutorial on MM Algorithms,* Am. Stat. 58(1):30-7, Feb. 2004. lange_04_amstat.pdf

Jacobson and Fessler, *An Expanded Theoretical Treatment of Iteration-Dependent Majorize-Minimize Algorithms,* preprint. jacobson,tip.pdf

“Information Theoretic Coclustering,” by Inderjit S. Dhillon, Subramanyam Mallela, and Dharmendra S. Modha p89-dhillon.pdf

“A Log-Euclidean Polyaffine Framework for Locally Rigid or Affine Registration” by Vincent Arsigny, Olivier Commowick, Xavier Pennec, and Nicholas Ayache logeuclidean_wbir.pdf

Finish the CG.

Conjugate gradient algorithm.

Jonathan Richard Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain. This is a nice (although long) tutorial paper: painless-conjugate-gradient.pdf

A (shorter) section from the numerical recipes book: c10-6.pdf
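For a concrete reference point while reading, here is a minimal sketch of the linear conjugate gradient iteration in plain Python. It assumes A is symmetric positive definite and starts from x = 0; the function names and the small 2x2 test system are our own choices for illustration, and no preconditioning is included.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive-definite A (dense lists of floats)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)              # residual b - A x for the starting point x = 0
    d = list(r)              # first search direction is the residual itself
    rs_old = dot(r, r)
    for _ in range(max_iter if max_iter is not None else 10 * n):
        if rs_old ** 0.5 < tol:
            break
        Ad = matvec(A, d)
        alpha = rs_old / dot(d, Ad)          # exact line search along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        rs_new = dot(r, r)
        # New direction: residual made A-conjugate to the previous direction.
        d = [ri + (rs_new / rs_old) * di for ri, di in zip(r, d)]
        rs_old = rs_new
    return x

A = [[3.0, 2.0], [2.0, 6.0]]
b = [2.0, -8.0]
solution = conjugate_gradient(A, b)
```

The exact solution of this system is (2, -2). In exact arithmetic the method needs at most n iterations for an n x n system, so two steps suffice here.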

We will finish the paper.

We will continue with the same paper. Please read Section 4 and the appendices. It’s heavy reading; you might want to start early.

Danial: The following paper makes a nice connection between exponential-family-mixture-model and distance-measure-based clustering methods.

A. Banerjee et al. Clustering with Bregman Divergences. J Mach Learn Res 6 (2005).

Please read the first three sections for the first meeting. We will go through the key concepts, Bregman divergence and information, in more detail and try to understand the Bregman hard clustering algorithm.
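To make the hard clustering algorithm concrete, here is a minimal sketch in plain Python. It alternates between assigning each point to the representative with the smallest divergence and resetting each representative to the arithmetic mean of its cluster, which is the optimal representative for any Bregman divergence. The function names, the fixed initial representatives, and the toy data are assumptions made for illustration.

```python
import math

def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def generalized_i_divergence(x, y):
    # Bregman divergence generated by sum(x log x); needs strictly positive entries.
    return sum(a * math.log(a / b) - a + b for a, b in zip(x, y))

def bregman_hard_clustering(points, centroids, divergence, n_iter=100):
    """Alternate assignment and mean update until the labels stop changing."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest representative under d(x, mu).
        new_labels = [
            min(range(len(centroids)), key=lambda h: divergence(x, centroids[h]))
            for x in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Re-estimation step: the plain mean, whatever the divergence.
        for h in range(len(centroids)):
            members = [x for x, lab in zip(points, labels) if lab == h]
            if members:  # keep the old representative if a cluster empties
                centroids[h] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical 1-D data with two well-separated groups.
points = [[1.0], [1.2], [0.8], [10.0], [11.0], [9.0]]
labels, centroids = bregman_hard_clustering(
    points, [[1.0], [10.0]], generalized_i_divergence
)
```

Passing `squared_euclidean` instead of `generalized_i_divergence` turns the same loop into ordinary k-means, which is the sense in which the algorithm generalizes distance-based clustering.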

Mert will lead the discussion on registration of fMRI.

B. Thirion et al. Improving Sensitivity and Reliability of fMRI Group Studies through High Level Combination of Individual Subjects Results. MMBIA 2006.

Serdar will lead the discussion on the multi-modal (not in a classical sense) nature of atlases:

Daniel J. Blezek and James V. Miller. Atlas Stratification. MICCAI 2006.

More Sobolev Active Contours this week...

Sobolev Active Contours: sobolevactivecontours.pdf

–Thomas

We will continue our asymptotic quest. Let’s hope this process converges to the true value of the parameters!

Here is the latest version of my summary: summarydoob.pdf

We will continue the discussion of Wilks’ paper on the asymptotics of LR. In order to make sense of the main assumption of equation (3), we have to go through the reference to Doob’s paper which is a seminal derivation of many basic theorems.

Reading: doob1948.pdf

Danial

Reading: leemput_miccai2006.pdf

–Thomas

Papoulis (pp. 275–278) is a good introductory note on asymptotics of hypothesis testing (photocopies outside Polina’s office). See also his statistics chapter (pp. 241–282) for hypothesis testing in general. The original papers for the asymptotics of the likelihood ratio are:

We will continue the paper from last week.

We also made a note that we need to look into asymptotic statistics results (mentioned in the paper) in the future. Kinh might take the lead on that.

Here’s the original DCM paper, for people who want more detail:

Note the special time: 10:00am

Penny W.D. et al. (2004). Modelling functional integration: a comparison of structural equation and dynamic causal models.

We will focus on the structural equation model this week.

Note the special time: 4pm

We will continue our discussion on effective connectivity in neuroimaging. We will focus on the second half of the first paper from last time (Friston 1994). If time allows, we will discuss the third paper from last time as well (Friston 1997).

We will discuss functional and effective connectivity in neuroimaging. The first paper is the most general one. The second paper is a book chapter version of the first paper on functional connectivity and goes a bit deeper. The third paper deals with a more exotic topic in effective connectivity. We will mainly discuss the first paper, so if you have limited time, the first one is the paper to read.

Friston, K.J. (1994). Functional and effective connectivity in neuroimaging: A synthesis. Human Brain Mapping, 2, 56-78.

Friston, K.J, Büchel, C (1993). Functional Connectivity: Eigenimages and multivariate analyses.

Friston, K.J., Büchel, C., Fink, G.R., Morris, J., Rolls, E., and Dolan, R. (1997). Psychophysiological and modulatory interactions in Neuroimaging. NeuroImage, 6, 218-229.

We will start the meetings with two papers that Wanmei claims use the same model in two somewhat unrelated applications. Both papers consider the fundamental problem of evidence integration from independent sources.

We will discuss the details of the models and the relationship between them. If you have time to read just one paper, the first one is lighter.

Genovese, C. R., Noll, D. C. and Eddy, W. F. (1997). Estimating Test-Retest Reliability in fMRI I: Statistical Methodology, Magnetic Resonance in Medicine, 38, 497–507.

Simultaneous Truth and Performance Level Estimation (STAPLE): An Algorithm for the Validation of Image Segmentation. Warfield S, Zou KH, Wells WM. IEEE Trans Med Imag 2004; 23:903-921.

We will discuss ICA for fMRI time series analysis. The longer paper discusses the method in detail and will be the basis for our discussion. The shorter one is a nice overview.

We will discuss the Dirichlet process mixture model, as presented in Teh 04, which develops a variant of the DP mixture for grouped data. Other papers (all cited by Teh) are provided here for those interested in a deeper theoretical background: Ferguson 73 and Antoniak 74 are the seminal papers (with fairly technical measure-theoretic proofs), while Sethuraman 94, followed by Ishwaran and James 01 and Ishwaran and Zarepour 02, are more constructive. Of these, I’d recommend Sethuraman for clarity and brevity. – John
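For readers who want a concrete picture of the constructive definition before opening Sethuraman 94, here is a small sketch of the stick-breaking weights of a Dirichlet process. The truncation to a finite number of sticks, the function name, and the parameter values are assumptions made for illustration; the construction itself draws each fraction from Beta(1, alpha) and breaks it off whatever stick length remains.

```python
import numpy as np

def stick_breaking_weights(alpha, n_sticks, seed=0):
    """Truncated stick-breaking: w_k = b_k * prod_{j<k} (1 - b_j), b_k ~ Beta(1, alpha)."""
    rng = np.random.default_rng(seed)
    betas = rng.beta(1.0, alpha, size=n_sticks)
    # Length of stick left before each break (1 before the first break).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return betas * remaining

# Hypothetical settings: concentration alpha = 2, truncated at 200 sticks.
weights = stick_breaking_weights(alpha=2.0, n_sticks=200)
```

Smaller values of `alpha` put most of the mass on the first few weights, while larger values spread it over many components.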

Yee Whye Teh *et al.*

Hierarchical Dirichlet Processes

teh04hdp.pdf

Thomas Ferguson

A Bayesian Analysis of Some Nonparametric Problems

ferguson.ps.gz

Charles Antoniak

Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems

antoniak74mixtures.ps

Jayaram Sethuraman

A Constructive Definition of Dirichlet Priors

sethuraman94constructive.pdf

Hemant Ishwaran and Lancelot James

Gibbs Sampling Methods for Stick-Breaking Priors

ishwaran01gibbs.pdf

Hemant Ishwaran and Mahmoud Zarepour

Exact and approximate sum-representations for the Dirichlet process

ishwaran02exact.pdf

We will continue the discussion from the last time. I posted some questions in the Discussion section. Feel free to add comments and more questions. – Polina

We will discuss the Information Bottleneck Method and its use in determining the number of clusters in fMRI data. The second paper describes the information bottleneck approach and is background reading for the first paper.

Bertrand Thirion, Olivier Faugeras. Feature Detection in fMRI Data: The Information Bottleneck Approach. thirionmedia.pdf

Naftali Tishby, Fernando C. Pereira, William Bialek. The Information Bottleneck Method

tishby99information.pdf

Uncut Version:

allertoninfobottleneck.pdf

I (Thomas) am posting the writeups for Apr 12. Gheorghe has also kindly provided his short writeup on EM. Happy Reading!!

I (Thomas) will be presenting the papers for this week.

Andrew Ng, Michael Jordan, and Yair Weiss

On Spectral Clustering: An analysis and an algorithm

ng-spectralclustering.pdf

Optional Papers: I will talk about mixture fitting, but I will not follow Michael Collins’ derivations exactly. However, the tutorial is still useful if you have no experience with mixture fitting or EM. We will NOT go over the theorems about convergence, i.e. we will not go beyond section 3.3. The idea is to give everyone an intuitive feel for the concept of mixture fitting, without being limited to Gaussians or the EM technique.

Michael Collins

michaelcollinstutorialonem.ps

More Variants of Spectral Clustering:

Jianbo Shi and Jitendra Malik

shimalik-normalizedcuts.pdf

Mikhail Belkin and Partha Niyogi

belkin-laplacianeigenmaps.pdf

Marina Meila and Jianbo Shi

randomwalkspectralclustering.ps

We will take a two week break and resume our discussion of FDR at the next meeting.

We’ll start with Efron, B. and Tibshirani, R.

Empirical Bayes Methods and False Discovery Rates for Microarrays

Genetic Epidemiology 23:70-86, 2002.

efrontibshirani_fdr.pdf

And then move on to the Genovese et al. paper on thresholding maps in neuroimaging using FDR, nicholsfdr.pdf (listed below).

A comparison of several similar methods (FDR, pFDR, PER, PFP) can be found in

K. Manly, D. Nettleton, and J.T.G. Hwang

Genomics, Prior Probability, and Statistical Tests of Multiple Hypotheses

Genome Research 14: 997-1001, 2004 manlyetal.pdf

I might touch on some of these different options during the discussion.

The PFP paper was

Controlling the Proportion of False Positives in Multiple Dependent Tests

R. L. Fernando, D. Nettleton, B. R. Southey, J. C. M. Dekkers, M. F. Rothschild, and M. Soller

Genetics, Vol. 166, 611-619, January 2004

Benjamini, Y. and Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289-300, 1995. benjaminifdr.pdf

Christopher R. Genovese, Nicole A. Lazar, Thomas Nichols. Thresholding of Statistical Maps in Functional Neuroimaging Using the False Discovery Rate. NeuroImage 15:870-878, 2002. nicholsfdr.pdf

Yoav Benjamini and Daniel Yekutieli. The Control of the False Discovery Rate in Multiple Testing under Dependency. The Annals of Statistics 2001, Vol. 29, No. 4, 1165-1188.
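As a quick reference for the discussion, the step-up procedure from the Benjamini and Hochberg paper can be sketched as follows. This is a minimal illustration, not code from any of the papers; the function name and the example p-values are made up.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the i-th smallest p-value with (i / m) * q.
    thresholds = q * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up: take the LARGEST i that meets its bound and reject
        # every hypothesis with a p-value at or below that one.
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True
    return reject

# Made-up p-values for illustration.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(example_p, q=0.05)
```

Note that the step-up rule can reject a hypothesis whose own p-value exceeds its own threshold, as long as a larger p-value further down the sorted list meets its bound.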

I think it might be worth switching to one of the pFDR papers for the extensions of the FDR. I’m reading through some of them, and will post suggestions if one of them strikes me. In particular, I think they have a nicer treatment of independence, and it will be nice to see some other approaches to FDR. -Ray

We will continue with the permutation testing.

We also finished the proof that the sample mean and the sample variance are independent in the Gaussian i.i.d. case.

See Discussion for additional notes.

Thomas E. Nichols and Andrew P. Holmes.

Nonparametric Permutation Tests For Functional Neuroimaging: A Primer with Examples.

Human Brain Mapping 15:1-25(2001).

nichols-perm.pdf
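To make the basic idea concrete, here is a minimal sketch of a two-sample permutation test on the difference of means. It handles a single statistic with random relabelings; the maximal-statistic treatment of multiple comparisons covered in the primer is not included. The function name and the data are hypothetical.

```python
import numpy as np

def permutation_test(a, b, n_perm=999, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled = np.concatenate([a, b])
    observed = abs(a.mean() - b.mean())
    count = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled data and split it again.
        perm = rng.permutation(pooled)
        stat = abs(perm[:len(a)].mean() - perm[len(a):].mean())
        if stat >= observed:
            count += 1
    # Count the observed labeling itself as one of the permutations.
    return (count + 1) / (n_perm + 1)

# Hypothetical measurements from two conditions.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.1, 5.0]
group_b = [3.9, 4.1, 4.0, 3.8, 4.2, 4.0, 3.9, 4.1]
p_value = permutation_test(group_a, group_b)
```

With `n_perm` random relabelings the smallest attainable p-value is 1 / (n_perm + 1), which is worth keeping in mind when choosing the number of permutations.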

Wanmei Ou will define the general setup of the fMRI detection problem, preparing us for the series of papers in fMRI analysis.

Notes for the group meeting.

Michael Siracusa will lead a discussion on classical hypothesis testing, in preparation for some papers in fMRI and DTI statistics.

Notes for the group meeting. hyptest-faq.pdf

Duygu Tosun and Jerry L. Prince.

Cortical Surface Alignment Using Geometry Driven Multispectral Optical Flow,

Information Processing in Medical Imaging (IPMI), Colorado, USA, July 11-15, 2005.

tosun2005.pdf.gz

**Additional papers:** Duygu Tosun, Maryam E. Rettmann, Jerry L. Prince.

Mapping techniques for aligning sulci across multiple brains.

Medical Image Analysis 8 (2004) 295-309.

tosun2004a.pdf.gz

Xiao Han, Dzung L. Pham, Duygu Tosun, Maryam E. Rettmann, Chenyang Xu, and Jerry L. Prince.

CRUISE: Cortical reconstruction using implicit surface evolution.

NeuroImage 23 (2004) 997-1012.

han-2004.pdf.gz

Xiao Han, Chenyang Xu, and Jerry L. Prince.

A Topology Preserving Level Set Method for Geometric Deformable Models.

PAMI, VOL. 25, NO. 6, JUNE 2003.

han-2003.pdf.gz

**Also on the list for future reading:** Thompson, P., Toga, A.W.

A surface-based technique for warping three-dimensional images of the brain.

IEEE Transactions on Medical Imaging, Volume 15, Issue 4, 402 - 417, 1996.

thompson96.pdf.gz

**Additional papers:**

Xianfeng Gu; Yalin Wang; Chan, T.F.; Thompson, P.M.; Shing-Tung Yau;

Genus zero surface conformal mapping and its application to brain surface mapping.

IEEE Transactions on Medical Imaging, Volume 23, Issue 8, 949 - 958, 2004.

thompson2004.pdf.gz

Bruce Fischl, Martin I. Sereno and Anders M. Dale.

Cortical Surface-Based Analysis: II: Inflation, Flattening, and a Surface-Based Coordinate System.

Neuroimage. 9(2):195-207. 1999.

fischl99.pdf.gz

Bruce Fischl, Martin I. Sereno, Roger B.H. Tootell, Anders M. Dale.

High-resolution inter-subject averaging and a coordinate system for the cortical surface.

Human Brain Mapping, Volume 8, Issue 4, Pages 272 - 284, 1999. fischl99a.pdf.gz

**Additional papers:**

Anders M. Dale, Bruce Fischl and Martin I. Sereno.

Cortical Surface-Based Analysis: I. Segmentation and Surface Reconstruction.

Neuroimage, 9(2):179-194, 1999. dale99.pdf.gz

We will continue the discussion on the permutation tests. The future meetings will be shifted by one week.

Timothy B. Terriberry, Sarang C. Joshi, and Guido Gerig.

Hypothesis Testing with Nonlinear Shape Models.

IPMI 2005, LNCS 3565, pp. 15-26, 2005.

multperm2005.pdf.gz

**Additional papers:**

Blair, R.C., Higgins, J.J., Karniski, W., Kromrey, J.D.

A study of multivariate permutation tests which may replace Hotelling's T2 test in prescribed circumstances.

Multivariate Behavioral Research 29 (1994) 141-164.

multperm94.pdf.gz

Book:

Pesarin, Fortunato

Multivariate permutation tests : with applications in biostatistics.

I have the book.

GE Christensen, RD Rabbitt, MI Miller.

Deformable Templates Using Large Deformation Kinematics.

IEEE Transactions on Image Processing, 5(10), 1996, pp. 1435-1447.

deformabletemplatesusinglargedeformationkinematics.pdf.gz

**Additional papers:**

Ain A. Sonin.

Fundamental Laws of Motion for Particles, Material Volumes, and Control Volumes, 2001.

fundamental_laws.pdf.gz

On Choosing and Using Control Volumes: Six Ways of Applying the Integral Mass Conservation Theorem to a Simple Problem.

choosingcontrolvolumes.pdf.gz

We will also finish the discussion on using prior examples to bias registration.

D. M. Blei, A. Y. Ng, and M. I. Jordan.

Latent Dirichlet allocation.

Journal of Machine Learning Research, 3, 993-1022, 2003.

blei03a.pdf.gz

**Additional papers:** Thomas Hofmann.

Probabilistic Latent Semantic Analysis.

UAI 1999.

http://www.cs.brown.edu/people/th/papers/Hofmann-UAI99.pdf.gz

Josef Sivic, Bryan C. Russell, Alexei A. Efros, Andrew Zisserman, William T. Freeman.

Discovering objects and their location in images.

ICCV 2005.

http://people.csail.mit.edu/brussell/research/SREZF05.pdf.gz

Bryan Russell’s RQE paper. http://www.csail.mit.edu/~brussell/research/rqe.pdf.gz

Mert R. Sabuncu and Peter J. Ramadge.

Gradient based optimization of an EMST registration function.

IEEE Conference on Acoustics, Speech and Signal Processing, Philadelphia, March 2005.

sabuncu_icassp05-1.pdf.gz

Mert R. Sabuncu and Peter J. Ramadge.

Graph theoretic image registration using prior examples.

European Signal Processing Conference 2005, Antalya, Turkey, September 2005.

eusipco05_submit-1.pdf.gz

**Additional papers:**

B. Ma, A.O. Hero, J.D. Gorman and O. Michel.

Image Registration with Minimum Spanning Tree Algorithm.

IEEE International Conf. on Image Processing, vol.1, pp.481-484, Vancouver, BC, Canada, Sept. 2000.

ma_icip00.pdf.gz

A. O. Hero, B. Ma, O. Michel and J. Gorman.

Applications of entropic spanning graphs.

IEEE Signal Proc. Magazine (Special Issue on Mathematics in Imaging), Vol 19, No. 5, pp 85-95, Sept. 2002.

spmag_ma_01-1.pdf.gz

**Additional papers:**

Beirlant, J., Dudewicz, E. J., Gyorfi, L., and van der Meulen, E. C.

Nonparametric entropy estimation: An overview.

International Journal of the Mathematical Statistics Sciences, 6, 17-39, 2001.

beirlant.pdf.gz

No meeting, many of us are at ICCV.

P.D. Grünwald.

A Tutorial Introduction to the Minimum Description Length Principle.

grunwald-mdlintro.pdf.gz

Also use material from the September 27 meeting.

A minimum description length approach to statistical shape modeling.

Davies, R.H.; Twining, C.J.; Cootes, T.F.; Waterton, J.C.; Taylor, C.J.

IEEE Transactions on Medical Imaging, 21(5):525 - 537, 2002.

shape-mdl.pdf.gz

Erik Learned-Miller.

Data Driven Image Models through Continuous Joint Alignment.

to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2005.

pami_congeal.pdf.gz

**Additional papers:**

Lilla Zollei, Erik Learned-Miller, Eric Grimson and William Wells.

Efficient population registration of 3D data.

Workshop on Computer Vision for Biomedical Image Applications: Current Techniques and Future Trends, at the International Conference on Computer Vision (ICCV), 2005.

congeal_3d.pdf.gz

Connection between the code length and entropy: Section 5.4 (and around) in Cover and Thomas.
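The link between code length and entropy can be checked numerically with a small sketch. It assigns each symbol the Shannon code length ceil(-log2 p) and compares the expected length with the entropy H; the expected length then satisfies H <= L < H + 1, and the lengths satisfy the Kraft inequality. The distribution below is an arbitrary example, and the function names are ours.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def shannon_code_lengths(probs):
    """Integer code lengths ceil(-log2 p) for each symbol."""
    return [math.ceil(-math.log2(p)) for p in probs]

# Arbitrary example distribution over four symbols.
probs = [0.4, 0.3, 0.2, 0.1]
lengths = shannon_code_lengths(probs)

H = entropy_bits(probs)
expected_length = sum(p * l for p, l in zip(probs, lengths))
# Kraft sum: at most 1 whenever a prefix code with these lengths exists.
kraft_sum = sum(2.0 ** -l for l in lengths)
```

For a dyadic distribution such as [0.5, 0.25, 0.125, 0.125] the expected length equals the entropy exactly, which is the case where the bound is tight.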

J. Rissanen.

An Introduction to the MDL Principle.

rissanen-intro.pdf.gz

**Additional papers on MDL:**

Rissanen, J.

Stochastic Complexity and Modeling.

Annals of Statistics, Vol 14, 1080-1100, 1986.

rissanen86.pdf.gz

P.D. Grünwald.

A Tutorial Introduction to the Minimum Description Length Principle.

grunwald-mdlintro.pdf.gz

**Additional papers on BIC and AIC:**

Schwarz, G. (1978).

Estimating the dimension of a model. Annals of Statistics, 6, 461-464.

scwarzbic.pdf.gz

Akaike, H. (1974).

A new look at the statistical model identification.

IEEE Transactions on Automatic Control, AC-19, 716-723. akaike74.pdf.gz

A Unified Information-Theoretic Approach to Groupwise Non-rigid Registration and Model Building.

Carole J. Twining, Tim Cootes, Stephen Marsland, Vladimir Petrovic, Roy Schestowitz and Chris J. Taylor.

Information Processing in Medical Imaging: 19th International Conference, IPMI 2005, Glenwood Springs, CO, USA, July 10-15, 2005.

twining-ipmi-2005.pdf.gz

**Additional papers:**

Carole J. Twining, Stephen Marsland, and Chris Taylor.

Groupwise Non-Rigid Registration: The Minimum Description Length Approach.

BMVC 2004.

twining-bmvc-2004.pdf.gz

Carole Twining and Stephen Marsland.

A Unified Information-Theoretic Approach to the Correspondence Problem in Image Registration.

International Conference on Pattern Recognition, Cambridge, U.K. 2004.

twining-icpr-2004.pdf.gz

First meeting, general intros.