Computer Vision

Publication date: May 2012. Publisher: Publishing House of Electronics Industry (電子工業(yè)出版社). Authors: David A. Forsyth, Jean Ponce. Pages: 761. Word count: 1,268,000.

Overview

  Computer vision is the science of how artificial systems can "perceive" from images or multidimensional data. This book is a classic textbook in the field. It covers geometric camera models, light and shading, color, linear filters, local image features, texture, stereopsis, structure from motion, segmentation by clustering, grouping and model fitting, tracking, registration, smooth surfaces and their outlines, range data, image classification, object detection and recognition, image-based modeling and rendering, looking at people, image search and retrieval, and optimization techniques. Compared with the first edition, this edition streamlines some topics, adds application examples, rewrites the material on modern image features, and treats modern image editing and object recognition techniques in detail.

About the Authors

Authors: David A. Forsyth (USA) and Jean Ponce (USA)

Table of Contents

I IMAGE FORMATION
1 Geometric Camera Models
 1.1 Image Formation
  1.1.1 Pinhole Perspective
  1.1.2 Weak Perspective
  1.1.3 Cameras with Lenses
  1.1.4 The Human Eye
 1.2 Intrinsic and Extrinsic Parameters
  1.2.1 Rigid Transformations and Homogeneous Coordinates
  1.2.2 Intrinsic Parameters
  1.2.3 Extrinsic Parameters
  1.2.4 Perspective Projection Matrices
  1.2.5 Weak-Perspective Projection Matrices
 1.3 Geometric Camera Calibration
  1.3.1 A Linear Approach to Camera Calibration
  1.3.2 A Nonlinear Approach to Camera Calibration
 1.4 Notes
2 Light and Shading
 2.1 Modelling Pixel Brightness
  2.1.1 Reflection at Surfaces
  2.1.2 Sources and Their Effects
  2.1.3 The Lambertian+Specular Model
  2.1.4 Area Sources
 2.2 Inference from Shading
  2.2.1 Radiometric Calibration and High Dynamic Range Images
  2.2.2 The Shape of Specularities
  2.2.3 Inferring Lightness and Illumination
  2.2.4 Photometric Stereo: Shape from Multiple Shaded Images
 2.3 Modelling Interreflection
  2.3.1 The Illumination at a Patch Due to an Area Source
  2.3.2 Radiosity and Exitance
  2.3.3 An Interreflection Model
  2.3.4 Qualitative Properties of Interreflections
 2.4 Shape from One Shaded Image
 2.5 Notes
3 Color
 3.1 Human Color Perception
  3.1.1 Color Matching
  3.1.2 Color Receptors
 3.2 The Physics of Color
  3.2.1 The Color of Light Sources
  3.2.2 The Color of Surfaces
 3.3 Representing Color
  3.3.1 Linear Color Spaces
  3.3.2 Non-linear Color Spaces
 3.4 A Model of Image Color
  3.4.1 The Diffuse Term
  3.4.2 The Specular Term
 3.5 Inference from Color
  3.5.1 Finding Specularities Using Color
  3.5.2 Shadow Removal Using Color
  3.5.3 Color Constancy: Surface Color from Image Color
 3.6 Notes
II EARLY VISION: JUST ONE IMAGE
4 Linear Filters
 4.1 Linear Filters and Convolution
  4.1.1 Convolution
 4.2 Shift Invariant Linear Systems
  4.2.1 Discrete Convolution
  4.2.2 Continuous Convolution
  4.2.3 Edge Effects in Discrete Convolutions
 4.3 Spatial Frequency and Fourier Transforms
  4.3.1 Fourier Transforms
 4.4 Sampling and Aliasing
  4.4.1 Sampling
  4.4.2 Aliasing
  4.4.3 Smoothing and Resampling
 4.5 Filters as Templates
  4.5.1 Convolution as a Dot Product
  4.5.2 Changing Basis
 4.6 Technique: Normalized Correlation and Finding Patterns
  4.6.1 Controlling the Television by Finding Hands by Normalized Correlation
 4.7 Technique: Scale and Image Pyramids
  4.7.1 The Gaussian Pyramid
  4.7.2 Applications of Scaled Representations
 4.8 Notes
5 Local Image Features
 5.1 Computing the Image Gradient
  5.1.1 Derivative of Gaussian Filters
 5.2 Representing the Image Gradient
  5.2.1 Gradient-Based Edge Detectors
  5.2.2 Orientations
 5.3 Finding Corners and Building Neighborhoods
  5.3.1 Finding Corners
  5.3.2 Using Scale and Orientation to Build a Neighborhood
 5.4 Describing Neighborhoods with SIFT and HOG Features
  5.4.1 SIFT Features
  5.4.2 HOG Features
 5.5 Computing Local Features in Practice
 5.6 Notes
6 Texture
 6.1 Local Texture Representations Using Filters
  6.1.1 Spots and Bars
  6.1.2 From Filter Outputs to Texture Representation
  6.1.3 Local Texture Representations in Practice
 6.2 Pooled Texture Representations by Discovering Textons
  6.2.1 Vector Quantization and Textons
  6.2.2 K-means Clustering for Vector Quantization
 6.3 Synthesizing Textures and Filling Holes in Images
  6.3.1 Synthesis by Sampling Local Models
  6.3.2 Filling in Holes in Images
 6.4 Image Denoising
  6.4.1 Non-local Means
  6.4.2 Block Matching 3D (BM3D)
  6.4.3 Learned Sparse Coding
  6.4.4 Results
 6.5 Shape from Texture
  6.5.1 Shape from Texture for Planes
  6.5.2 Shape from Texture for Curved Surfaces
 6.6 Notes
III EARLY VISION: MULTIPLE IMAGES
7 Stereopsis
 7.1 Binocular Camera Geometry and the Epipolar Constraint
  7.1.1 Epipolar Geometry
  7.1.2 The Essential Matrix
  7.1.3 The Fundamental Matrix
 7.2 Binocular Reconstruction
  7.2.1 Image Rectification
 7.3 Human Stereopsis
 7.4 Local Methods for Binocular Fusion
  7.4.1 Correlation
  7.4.2 Multi-Scale Edge Matching
 7.5 Global Methods for Binocular Fusion
  7.5.1 Ordering Constraints and Dynamic Programming
  7.5.2 Smoothness and Graphs
 7.6 Using More Cameras
 7.7 Application: Robot Navigation
 7.8 Notes
8 Structure from Motion
 8.1 Internally Calibrated Perspective Cameras
  8.1.1 Natural Ambiguity of the Problem
  8.1.2 Euclidean Structure and Motion from Two Images
  8.1.3 Euclidean Structure and Motion from Multiple Images
 8.2 Uncalibrated Weak-Perspective Cameras
  8.2.1 Natural Ambiguity of the Problem
  8.2.2 Affine Structure and Motion from Two Images
  8.2.3 Affine Structure and Motion from Multiple Images
  8.2.4 From Affine to Euclidean Shape
 8.3 Uncalibrated Perspective Cameras
  8.3.1 Natural Ambiguity of the Problem
  8.3.2 Projective Structure and Motion from Two Images
  8.3.3 Projective Structure and Motion from Multiple Images
  8.3.4 From Projective to Euclidean Shape
 8.4 Notes
IV MID-LEVEL VISION
9 Segmentation by Clustering
 9.1 Human Vision: Grouping and Gestalt
 9.2 Important Applications
  9.2.1 Background Subtraction
  9.2.2 Shot Boundary Detection
  9.2.3 Interactive Segmentation
  9.2.4 Forming Image Regions
 9.3 Image Segmentation by Clustering Pixels
  9.3.1 Basic Clustering Methods
  9.3.2 The Watershed Algorithm
  9.3.3 Segmentation Using K-means
  9.3.4 Mean Shift: Finding Local Modes in Data
  9.3.5 Clustering and Segmentation with Mean Shift
 9.4 Segmentation, Clustering, and Graphs
  9.4.1 Terminology and Facts for Graphs
  9.4.2 Agglomerative Clustering with a Graph
  9.4.3 Divisive Clustering with a Graph
  9.4.4 Normalized Cuts
 9.5 Image Segmentation in Practice
  9.5.1 Evaluating Segmenters
 9.6 Notes
10 Grouping and Model Fitting
 10.1 The Hough Transform
  10.1.1 Fitting Lines with the Hough Transform
  10.1.2 Using the Hough Transform
 10.2 Fitting Lines and Planes
  10.2.1 Fitting a Single Line
  10.2.2 Fitting Planes
  10.2.3 Fitting Multiple Lines
 10.3 Fitting Curved Structures
 10.4 Robustness
  10.4.1 M-Estimators
  10.4.2 RANSAC: Searching for Good Points
 10.5 Fitting Using Probabilistic Models
  10.5.1 Missing Data Problems
  10.5.2 Mixture Models and Hidden Variables
  10.5.3 The EM Algorithm for Mixture Models
  10.5.4 Difficulties with the EM Algorithm
 10.6 Motion Segmentation by Parameter Estimation
  10.6.1 Optical Flow and Motion
  10.6.2 Flow Models
  10.6.3 Motion Segmentation with Layers
 10.7 Model Selection: Which Model Is the Best Fit?
  10.7.1 Model Selection Using Cross-Validation
 10.8 Notes
11 Tracking
 11.1 Simple Tracking Strategies
  11.1.1 Tracking by Detection
  11.1.2 Tracking Translations by Matching
  11.1.3 Using Affine Transformations to Confirm a Match
 11.2 Tracking Using Matching
  11.2.1 Matching Summary Representations
  11.2.2 Tracking Using Flow
 11.3 Tracking Linear Dynamical Models with Kalman Filters
  11.3.1 Linear Measurements and Linear Dynamics
  11.3.2 The Kalman Filter
  11.3.3 Forward-backward Smoothing
 11.4 Data Association
  11.4.1 Linking Kalman Filters with Detection Methods
  11.4.2 Key Methods of Data Association
 11.5 Particle Filtering
  11.5.1 Sampled Representations of Probability Distributions
  11.5.2 The Simplest Particle Filter
  11.5.3 The Tracking Algorithm
  11.5.4 A Workable Particle Filter
  11.5.5 Practical Issues in Particle Filters
 11.6 Notes
V HIGH-LEVEL VISION
12 Registration
 12.1 Registering Rigid Objects
  12.1.1 Iterated Closest Points
  12.1.2 Searching for Transformations via Correspondences
  12.1.3 Application: Building Image Mosaics
 12.2 Model-based Vision: Registering Rigid Objects with Projection
  12.2.1 Verification: Comparing Transformed and Rendered Source to Target
 12.3 Registering Deformable Objects
  12.3.1 Deforming Texture with Active Appearance Models
  12.3.2 Active Appearance Models in Practice
  12.3.3 Application: Registration in Medical Imaging Systems
 12.4 Notes
13 Smooth Surfaces and Their Outlines
 13.1 Elements of Differential Geometry
  13.1.1 Curves
  13.1.2 Surfaces
 13.2 Contour Geometry
  13.2.1 The Occluding Contour and the Image Contour
  13.2.2 The Cusps and Inflections of the Image Contour
  13.2.3 Koenderink’s Theorem
 13.3 Visual Events: More Differential Geometry
  13.3.1 The Geometry of the Gauss Map
  13.3.2 Asymptotic Curves
  13.3.3 The Asymptotic Spherical Map
  13.3.4 Local Visual Events
  13.3.5 The Bitangent Ray Manifold
  13.3.6 Multilocal Visual Events
  13.3.7 The Aspect Graph
 13.4 Notes
14 Range Data
 14.1 Active Range Sensors
 14.2 Range Data Segmentation
  14.2.1 Elements of Analytical Differential Geometry
  14.2.2 Finding Step and Roof Edges in Range Images
  14.2.3 Segmenting Range Images into Planar Regions
 14.3 Range Image Registration and Model Acquisition
  14.3.1 Quaternions
  14.3.2 Registering Range Images
  14.3.3 Fusing Multiple Range Images
 14.4 Object Recognition
  14.4.1 Matching Using Interpretation Trees
  14.4.2 Matching Free-Form Surfaces Using Spin Images
 14.5 Kinect
  14.5.1 Features
  14.5.2 Technique: Decision Trees and Random Forests
  14.5.3 Labeling Pixels
  14.5.4 Computing Joint Positions
 14.6 Notes
15 Learning to Classify
 15.1 Classification, Error, and Loss
  15.1.1 Using Loss to Determine Decisions
  15.1.2 Training Error, Test Error, and Overfitting
  15.1.3 Regularization
  15.1.4 Error Rate and Cross-Validation
  15.1.5 Receiver Operating Curves
 15.2 Major Classification Strategies
  15.2.1 Example: Mahalanobis Distance
  15.2.2 Example: Class-Conditional Histograms and Naive Bayes
  15.2.3 Example: Classification Using Nearest Neighbors
  15.2.4 Example: The Linear Support Vector Machine
  15.2.5 Example: Kernel Machines
  15.2.6 Example: Boosting and Adaboost
 15.3 Practical Methods for Building Classifiers
  15.3.1 Manipulating Training Data to Improve Performance
  15.3.2 Building Multi-Class Classifiers Out of Binary Classifiers
  15.3.3 Solving for SVMs and Kernel Machines
 15.4 Notes
16 Classifying Images
 16.1 Building Good Image Features
  16.1.1 Example Applications
  16.1.2 Encoding Layout with GIST Features
  16.1.3 Summarizing Images with Visual Words
  16.1.4 The Spatial Pyramid Kernel
  16.1.5 Dimension Reduction with Principal Components
  16.1.6 Dimension Reduction with Canonical Variates
  16.1.7 Example Application: Identifying Explicit Images
  16.1.8 Example Application: Classifying Materials
  16.1.9 Example Application: Classifying Scenes
 16.2 Classifying Images of Single Objects
  16.2.1 Image Classification Strategies
  16.2.2 Evaluating Image Classification Systems
  16.2.3 Fixed Sets of Classes
  16.2.4 Large Numbers of Classes
  16.2.5 Flowers, Leaves, and Birds: Some Specialized Problems
 16.3 Image Classification in Practice
  16.3.1 Codes for Image Features
  16.3.2 Image Classification Datasets
  16.3.3 Dataset Bias
  16.3.4 Crowdsourcing Dataset Collection
 16.4 Notes
17 Detecting Objects in Images
 17.1 The Sliding Window Method
  17.1.1 Face Detection
  17.1.2 Detecting Humans
  17.1.3 Detecting Boundaries
 17.2 Detecting Deformable Objects
 17.3 The State of the Art of Object Detection
  17.3.1 Datasets and Resources
 17.4 Notes
18 Topics in Object Recognition
 18.1 What Should Object Recognition Do?
  18.1.1 What Should an Object Recognition System Do?
  18.1.2 Current Strategies for Object Recognition
  18.1.3 What Is Categorization?
  18.1.4 Selection: What Should Be Described?
 18.2 Feature Questions
  18.2.1 Improving Current Image Features
  18.2.2 Other Kinds of Image Feature
 18.3 Geometric Questions
 18.4 Semantic Questions
  18.4.1 Attributes and the Unfamiliar
  18.4.2 Parts, Poselets and Consistency
  18.4.3 Chunks of Meaning
VI APPLICATIONS AND TOPICS
19 Image-Based Modeling and Rendering
 19.1 Visual Hulls
  19.1.1 Main Elements of the Visual Hull Model
  19.1.2 Tracing Intersection Curves
  19.1.3 Clipping Intersection Curves
  19.1.4 Triangulating Cone Strips
  19.1.5 Results
  19.1.6 Going Further: Carved Visual Hulls
 19.2 Patch-Based Multi-View Stereopsis
  19.2.1 Main Elements of the PMVS Model
  19.2.2 Initial Feature Matching
  19.2.3 Expansion
  19.2.4 Filtering
  19.2.5 Results
 19.3 The Light Field
 19.4 Notes
20 Looking at People
 20.1 HMM’s, Dynamic Programming, and Tree-Structured Models
  20.1.1 Hidden Markov Models
  20.1.2 Inference for an HMM
  20.1.3 Fitting an HMM with EM
  20.1.4 Tree-Structured Energy Models
 20.2 Parsing People in Images
  20.2.1 Parsing with Pictorial Structure Models
  20.2.2 Estimating the Appearance of Clothing
 20.3 Tracking People
  20.3.1 Why Human Tracking Is Hard
  20.3.2 Kinematic Tracking by Appearance
  20.3.3 Kinematic Human Tracking Using Templates
 20.4 3D from 2D: Lifting
  20.4.1 Reconstruction in an Orthographic View
  20.4.2 Exploiting Appearance for Unambiguous Reconstructions
  20.4.3 Exploiting Motion for Unambiguous Reconstructions
 20.5 Activity Recognition
  20.5.1 Background: Human Motion Data
  20.5.2 Body Configuration and Activity Recognition
  20.5.3 Recognizing Human Activities with Appearance Features
  20.5.4 Recognizing Human Activities with Compositional Models
 20.6 Resources
 20.7 Notes
21 Image Search and Retrieval
 21.1 The Application Context
  21.1.1 Applications
  21.1.2 User Needs
  21.1.3 Types of Image Query
  21.1.4 What Users Do with Image Collections
 21.2 Basic Technologies from Information Retrieval
  21.2.1 Word Counts
  21.2.2 Smoothing Word Counts
  21.2.3 Approximate Nearest Neighbors and Hashing
  21.2.4 Ranking Documents
 21.3 Images as Documents
  21.3.1 Matching Without Quantization
  21.3.2 Ranking Image Search Results
  21.3.3 Browsing and Layout
  21.3.4 Laying Out Images for Browsing
 21.4 Predicting Annotations for Pictures
  21.4.1 Annotations from Nearby Words
  21.4.2 Annotations from the Whole Image
  21.4.3 Predicting Correlated Words with Classifiers
  21.4.4 Names and Faces
  21.4.5 Generating Tags with Segments
 21.5 The State of the Art of Word Prediction
  21.5.1 Resources
  21.5.2 Comparing Methods
  21.5.3 Open Problems
 21.6 Notes
VII BACKGROUND MATERIAL
22 Optimization Techniques
 22.1 Linear Least-Squares Methods
  22.1.1 Normal Equations and the Pseudoinverse
  22.1.2 Homogeneous Systems and Eigenvalue Problems
  22.1.3 Generalized Eigenvalue Problems
  22.1.4 An Example: Fitting a Line to Points in a Plane
  22.1.5 Singular Value Decomposition
 22.2 Nonlinear Least-Squares Methods
  22.2.1 Newton's Method: Square Systems of Nonlinear Equations
  22.2.2 Newton’s Method for Overconstrained Systems
  22.2.3 The Gauss-Newton and Levenberg-Marquardt Algorithms
 22.3 Sparse Coding and Dictionary Learning
  22.3.1 Sparse Coding
  22.3.2 Dictionary Learning
  22.3.3 Supervised Dictionary Learning
 22.4 Min-Cut/Max-Flow Problems and Combinatorial Optimization
  22.4.1 Min-Cut Problems
  22.4.2 Quadratic Pseudo-Boolean Functions
  22.4.3 Generalization to Integer Variables
 22.5 Notes
  Bibliography
  Index
  List of Algorithms

Excerpt

Inference from Shading

Registered images are not essential for radiometric calibration. For example, it is sufficient to have two images where we believe the histogram of Eij values is the same (Grossberg and Nayar 2002). This occurs, for example, when the images are of the same scene, but are not precisely registered. Patterns of intensity around edges also can reveal calibration (Lin et al. 2004).

There has not been much recent study of lightness constancy algorithms. The basic idea is due to Land and McCann (1971). Their work was formalized for the computer vision community by Horn (1974). A variation on Horn's algorithm was constructed by Blake (1985). This is the lightness algorithm we describe. It appeared originally in a slightly different form, where it was called the Retinex algorithm (Land and McCann 1971). Retinex was originally intended as a color constancy algorithm. It is surprisingly difficult to analyze (Brainard and Wandell 1986). Retinex estimates the log-illumination term by subtracting the log-albedo from the log-intensity. This has the disadvantage that we do not impose any structural constraints on illumination. This point has largely been ignored, because the main focus has been on albedo estimates. However, albedo estimates are likely to be improved by balancing violations of albedo constraints with those of illumination constraints. Lightness techniques are not as widely used as they should be, particularly given that there is some evidence that they produce useful information on real images (Brelstaff and Blake 1987). Classifying illumination versus albedo simply by looking at the magnitude of the gradient is crude, and ignores important cues. Sharp shading changes occur at shadow boundaries or normal discontinuities, but using chromaticity (Funt et al. 1992) or multiple images under different lighting conditions (Weiss 2001) yields improved estimates. One can learn to distinguish illumination from albedo (Freeman et al. 2000). Discriminative methods to classify edges into albedo or shading help (Tappen et al. 2006b), and chromaticity cues can contribute (Farenzena and Fusiello 2007).
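The gradient-thresholding scheme the passage calls "crude" is simple enough to sketch. Below is a minimal Python illustration, not the authors' implementation: the function name, threshold value, and wrap-around boundary handling are all assumptions made to keep the sketch short. Small log-intensity gradients are attributed to slowly varying illumination and zeroed; the surviving gradient field is reintegrated by Jacobi iteration of a Poisson equation, giving log albedo up to an additive constant.

    import numpy as np

    def lightness_sketch(im, thresh=0.1, iters=2000):
        # Log intensity decomposes additively: log I = log albedo + log illumination.
        logI = np.log(im + 1e-6)
        # Forward differences; padding with the last row/column keeps the
        # array shape and makes the boundary gradients zero.
        gx = np.diff(logI, axis=1, append=logI[:, -1:])
        gy = np.diff(logI, axis=0, append=logI[-1:, :])
        # Crude classification: small gradients are assumed to be illumination.
        gx[np.abs(gx) < thresh] = 0.0
        gy[np.abs(gy) < thresh] = 0.0
        # Divergence of the thresholded gradient field (np.roll wraps around,
        # i.e., periodic boundaries, an assumption for brevity).
        div = (gx - np.roll(gx, 1, axis=1)) + (gy - np.roll(gy, 1, axis=0))
        # Jacobi iterations for the Poisson equation: laplacian(A) = div.
        A = np.zeros_like(logI)
        for _ in range(iters):
            A = 0.25 * (np.roll(A, 1, axis=0) + np.roll(A, -1, axis=0)
                        + np.roll(A, 1, axis=1) + np.roll(A, -1, axis=1) - div)
        return A  # log albedo estimate; logI - A estimates log illumination

As the passage notes, a single gradient-magnitude threshold ignores important cues; chromaticity or multiple images under different lighting would replace it in a more careful implementation.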

Editor's Recommendation

  This English reprint of Computer Vision: A Modern Approach, Second Edition can serve as a textbook for university students in computational geometry, computer graphics, image processing, robotics, and related fields, and as a reference for professionals in these areas.


User Reviews (21 in total)

  •   A classic work on computer vision. The second edition improves considerably on the first and reflects recent progress in the field; it is essential reading for anyone working on image processing, analysis, or recognition. Its references extend to the three major computer vision conferences of 2011.
  •   A reference book on computer vision.
  •   A very good book, worth studying and digesting carefully.
  •   Skimmed through it; a classic.
  •   The book is very detailed, and delivery was good.
  •   My first history adventure comic, Treasure Hunt, complete set volumes 1-20.
  •   Reading it now; it is well written!
  •   Backed by online resources; very good. A master's breadth of vision, a feast of ideas.
  •   A classic MIT textbook, rich in content; read alongside Image Processing, Analysis, and Machine Vision, it is very rewarding.
  •   A classic work on computer vision, comprehensive in coverage.
  •   A very classic computer vision textbook.
  •   I already have the electronic version but still wanted a paper copy; thanks to Publishing House of Electronics Industry for reprinting the book so promptly. The content goes without saying and is worth buying. The paper and ink are not quite good enough, though, so it falls short of perfect, and the price is a bit high; around 60 yuan after discount would be more reasonable. Still, the flaws do not outweigh the merits.
  •   A very good book, well suited to research.
  •   The content is somewhat dated, but it is fine for learning the basics.
  •   A good book; study it well.
  •   The book's mathematical notation is non-standard and reads oddly; for example, convolution is not written directly as a continuous integral or a discrete sum, but in a shift(x) style. Also, in places the range covered is wide but only touched on in passing; a book is not a paper, and some things need to be spelled out clearly, because one is not just buying a citation index of papers.
  •   The first edition is no longer available, so I bought this one. The quality feels very mediocre; the paper is thin and does not look like a genuine copy.
  •   The paper quality is good. Not much more to say; for study, no problem. Put another way, however good a book is, if you never read it there is no point caring about the book itself. Oh, and the price was very cheap when I bought it. And it is the English edition!
  •   This book does not feel very practical; it is just okay.
  •   The book introduces the fundamentals of various image processing techniques in detail.
  •   A big, thick, fairly heavy volume; good quality.
