Gallagher, APACT May 2-4, 2007
Multivariate Image Analysis: Past, Present and Future
A Biased View
Neal B. Gallagher
Eigenvector Research, Inc.
Images
Number of Spectral Channels: 1 10 100 1,000 10,000
grey-scale
color
multi-band
spectral
multivariate image
hyperspectral
multispectral
superspectral
omnispectral
megaspectral
gigaspectral
Multivariate Images
Spatial Information
between pixels
Spectral Information
between channels
(chemical information)
Spatial distribution of
chemical analytes, physical
features, and other
properties
Multispectral Imaging
Early applications in astronomy and remote sensing
telescopes, satellites looking down
typically had low spectral resolution
much to learn (physics, applications, implementation) and algorithms to ‘borrow’ from this community
• vast experience combining first principles and statistics
• maybe we can lend some chemistry
Grey-Scale
Spatial information
Limited (no) chemical information
Algorithms: edge detection, size distribution estimation
crystallization systems, powders and pellets
Image of Hrad Vallis, Mars: VIS instrument. Latitude 34.1N, Longitude
141.5E. 19 meter/pixel resolution.
MIA: Past
spectral information
• spectral resolution tended to be low; data tended to be full rank (or nearly)
• some chemical or temperature information available
• little use of the true spectral nature of the information
spatial information
• select a layer for a grey-scale image
• 2-3 layers for color images
• density slicing (mean of several layers assigned to a color)
spatial resolution tended to be low to high
Select Channels for RGB
Example of slicing a multivariate image for RGB visualization
color enhances interpretability
Choose 3 of 7 channels → false color
Landsat
[Figure: false-color Landsat image of Paris (NIR/blue/SWIR-1), contrast enhanced]
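Choosing 3 of N channels for a false-color display can be sketched as follows. This is a minimal illustration on synthetic data, not the processing used for the Landsat figure; the function name `false_color`, the percentile stretch, and the channel indices are all assumptions made for the example.

```python
import numpy as np

def false_color(cube, channels, low=2.0, high=98.0):
    """Map three channels of a (rows, cols, n_channels) cube to RGB.

    Each selected channel is contrast-stretched between its `low` and
    `high` percentiles and clipped to [0, 1].
    """
    rgb = np.empty(cube.shape[:2] + (3,), dtype=float)
    for k, ch in enumerate(channels):
        layer = cube[:, :, ch].astype(float)
        lo, hi = np.percentile(layer, [low, high])
        scale = hi - lo if hi > lo else 1.0   # guard against flat layers
        rgb[:, :, k] = np.clip((layer - lo) / scale, 0.0, 1.0)
    return rgb

# Synthetic stand-in for a 7-channel image (not real Landsat data).
rng = np.random.default_rng(0)
cube = rng.random((50, 60, 7))
rgb = false_color(cube, channels=(3, 0, 4))   # hypothetical channel choice
```

The result can be passed directly to an image display routine that accepts floating-point RGB in [0, 1]. Only three channels contribute, which is the limitation the slides point out.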
Slicing MIs to RGB
Color enhances interpretability and pattern recognition
Can lend chemical and physical information
E.g. temperature contrast in the Spitzer image
Does not utilize all available information in the spectra
Improvements of the sensing system tended to focus on improving spatial resolution (Hubble image inset)
Spitzer Space Telescope (2003) of the Eagle nebula, 7,000 light-yrs away.
Spitzer's infrared and multiband imaging photometer. Blue = 4.5 µm;
Green = 8 µm; Red = 24 µm.
Hyperspectral Imaging
Chemistry and chemical processes of interest to chemometrics
“remote” imaging on short distance scales: 10⁻⁹ to 10⁰ m, compared to 10¹ to 10²⁵ m
imaging cameras and microscopes
• infrared, x-ray, uv, vis, mass spec, raman, oes ...
• often use active vs passive sensing
• tend to have higher spectral resolution
• clutter and atmospheric interferences can be less of an issue
pharmaceuticals, fine chemicals, powders, crystallization, pellets, foods and beverages, films, pulp and paper, medical imaging, neuroscience, forensics, archeology and anthropology examining paintings, petroglyphs, books, pots, bones, environmental, precision agriculture, …
MIA: Present
spectral dimension
• multivariate approaches applied to the spectral mode
• PCA scores images
• MCR contribution images (chemical images)
• visualization improvements
• higher spectral resolution, more chemical selectivity
spatial information
• uses results from multivariate analysis (MVA)
• spatial resolution high
P. Geladi, H. Grahn, “Multivariate Image Analysis,” Wiley, 1996
[Diagram: Mx × My × N image unfolded to an MxMy × N matrix for MVA]
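The unfold, analyze, refold pattern behind PCA scores images can be sketched as below. It assumes a cube of shape (Mx, My, N), mean-centering as the only preprocessing, and a hypothetical helper name `pca_scores_images`; real analyses typically add further preprocessing.

```python
import numpy as np

def pca_scores_images(cube, n_pcs):
    """Unfold an (Mx, My, N) cube to (Mx*My, N), run PCA, refold scores."""
    mx, my, n = cube.shape
    X = cube.reshape(mx * my, n).astype(float)
    Xc = X - X.mean(axis=0)                  # mean-center each channel
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_pcs] * s[:n_pcs]        # (Mx*My, n_pcs)
    loadings = Vt[:n_pcs].T                  # (N, n_pcs)
    return scores.reshape(mx, my, n_pcs), loadings

# Synthetic cube used only to exercise the function.
rng = np.random.default_rng(1)
cube = rng.random((20, 30, 8))
score_imgs, loadings = pca_scores_images(cube, n_pcs=3)
```

Each slice `score_imgs[:, :, k]` is a scores image for one principal component. Note that the unfolding discards pixel adjacency, so the spatial information is used only for display.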
Multivariate Analysis for Images
Factor-based multivariate analysis methods are bilinear models that explicitly use correlation in the spectra
Principal Components Analysis
Multivariate Curve Resolution (chemical images)
Don’t typically utilize the spatial information
However, measurement artifacts adversely affect bilinear structure
• analysis can’t take advantage of the data structure
• one underlying factor must be described by several factors
e.g. camera moves before next spectral band is measured
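The bilinear model behind MCR, X ≈ C Sᵀ with non-negative contributions C and spectra S, can be illustrated with a bare-bones alternating least squares loop. This is a sketch under simplifying assumptions: non-negativity is imposed by clipping rather than by a constrained solver, the data are synthetic and noise-free, and `mcr_als` is a hypothetical name, not a function from any particular toolbox.

```python
import numpy as np

def mcr_als(X, S0, n_iter=20):
    """Alternating least squares for X ~= C @ S.T with C, S >= 0.

    Non-negativity is imposed by clipping, a simplification of the
    constrained least-squares solvers used in practice.
    """
    S = S0.copy()
    for _ in range(n_iter):
        C = np.clip(X @ np.linalg.pinv(S.T), 0.0, None)     # C given S
        S = np.clip((np.linalg.pinv(C) @ X).T, 0.0, None)   # S given C
    return C, S

rng = np.random.default_rng(2)
axis = np.linspace(0.0, 1.0, 80)
S_true = np.column_stack([
    np.exp(-((axis - 0.3) / 0.08) ** 2),
    np.exp(-((axis - 0.6) / 0.10) ** 2),
]) + 0.05                                        # strictly positive "spectra"
C_true = rng.uniform(0.1, 1.0, size=(400, 2))    # 400 pixels, 2 components
X = C_true @ S_true.T                            # unfolded 20 x 20 image

S0 = S_true + 0.01 * rng.random(S_true.shape)    # initial guess near truth
C, S = mcr_als(X, S0)
rel_err = np.linalg.norm(X - C @ S.T) / np.linalg.norm(X)
```

Reshaping a column of `C` to (20, 20) gives a contribution (chemical) image. As with PCA, nothing in the fit depends on which pixels are neighbors.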
Concentration Images from MCR
Gallagher, N.B., Shaver, J.M., Martin, E.B., Morris, J., Wise, B.M. and
Windig, W., “Curve resolution for images with applications to TOF-
SIMS and Raman”, Chemometr. Intell. Lab., 73(1), 105–117 (2003).
[Figure: Prednisolone drug bead TOF-SIMS image. Mass channel legend: 23: Red, 366: Green, 29: Blue, 59. Labels: sodium, active drug, coating]
Alignment necessary due to
Camera motion
Sample motion
Sensor Positions
Same sample, different date
tracking, pitch, roll, yaw
stretch, parallelogram, trapezoid, shifts, radial aberration, distortions ...
[Figure: Aligned Image and Distorted Image; layers at 6.8, 6.8, 7.9, 8.6, 9.4, 10.2, 11.0, 11.8, 12.6 µm]
Image Alignment
2001 MARS ODYSSEY, THEMIS, I00816001EDR
[Plot: Eigenvalue of Covariance Matrix (log scale, 10⁻¹ to 10⁴) vs. Number of PCs (1 to 10)]
Multivariate Image Alignment
[Figure: image panels for Layer 1 (6.78 µm) and Layer 9 (12.57 µm)]
Image-to-image misalignment at different channels
• results in an increase of rank of the multivariate image
• alignment makes the data more “directional”
[Figure labels: standard image, aligned image, unaligned image, image section after alignment]
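The effect of misalignment on rank can be seen in a toy example: a single spatial pattern measured in nine channels is rank one after unfolding, and shifting one channel by a few pixels adds a second independent direction. The data, the 2-pixel shift, and the rank tolerance below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
scene = rng.random((40, 40))              # one spatial pattern
gains = np.linspace(1.0, 2.0, 9)          # 9 channels, same pattern scaled
aligned = scene[:, :, None] * gains       # rank 1 after unfolding

misaligned = aligned.copy()
# Shift the last channel by 2 pixels, as if the camera moved between bands.
misaligned[:, :, -1] = np.roll(aligned[:, :, -1], shift=2, axis=1)

def singular_values(cube):
    """Singular values of the unfolded (pixels x channels) matrix."""
    X = cube.reshape(-1, cube.shape[2])
    return np.linalg.svd(X, compute_uv=False)

sv_aligned = singular_values(aligned)
sv_misaligned = singular_values(misaligned)
rank_aligned = int(np.sum(sv_aligned > 1e-8 * sv_aligned[0]))
rank_misaligned = int(np.sum(sv_misaligned > 1e-8 * sv_misaligned[0]))
```

In this sketch one underlying factor needs two components once a channel is shifted, which is the sense in which measurement artifacts break the bilinear structure.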
Missing Data
[Figure: Corn Kernel image, 64x64x366, Mid-IR 895-3705.5 cm⁻¹]
B.O. Budevska, S.T. Sum, T.J. Jones, “Fourier Transform Infrared
Spectral Imaging and Microscopy. Application of Multivariate Curve
Resolution,” Appl. Spectrosc., 57, 124-131 (2003).
MARS ODYSSEY, 6.78 µm layer
Bad line (optics?)
Bad pixels
Line drop-out
Kernels shown on the slide (rows separated by semicolons):
(1/3) × [0.5 0 0.5; 1 0 1; 0.5 0 0.5]
(1/3.5) × [0.25 0.25 0.25; 0.5 1 0.5; 0.25 0.25 0.25]
[Plot: Cross-Validation Error (RMSE) vs. Num. PCs (1 to 6) by Replacement Algorithm: Interp, Missing Data, Miss Data+Intrp]
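One way to replace missing data with a multivariate model is iterative PCA imputation: fill the gaps with a first guess, fit a low-rank PCA model, overwrite only the missing entries with the model estimate, and repeat. The sketch below assumes this generic scheme; it is not necessarily the replacement algorithm compared on the slide, and `pca_replace` is a hypothetical name.

```python
import numpy as np

def pca_replace(X, missing, n_pcs, n_iter=100):
    """Fill entries flagged in `missing` with a rank-`n_pcs` PCA estimate.

    X: (pixels, channels) unfolded image; missing: boolean mask, same shape.
    """
    Xf = X.astype(float).copy()
    col_means = np.nanmean(np.where(missing, np.nan, Xf), axis=0)
    Xf[missing] = np.take(col_means, np.nonzero(missing)[1])  # initial guess
    for _ in range(n_iter):
        mu = Xf.mean(axis=0)
        U, s, Vt = np.linalg.svd(Xf - mu, full_matrices=False)
        Xhat = (U[:, :n_pcs] * s[:n_pcs]) @ Vt[:n_pcs] + mu
        Xf[missing] = Xhat[missing]           # update only missing entries
    return Xf

# Synthetic rank-2 data with a "line drop-out" in one channel.
rng = np.random.default_rng(4)
X_true = rng.random((200, 2)) @ rng.random((10, 2)).T
missing = np.zeros(X_true.shape, dtype=bool)
missing[50:60, 3] = True
X_obs = X_true.copy()
X_obs[missing] = 0.0                          # placeholder, masked out

X_filled = pca_replace(X_obs, missing, n_pcs=2)
rmse_pca = np.sqrt(np.mean((X_filled[missing] - X_true[missing]) ** 2))
col_mean = X_obs[~missing[:, 3], 3].mean()
rmse_mean = np.sqrt(np.mean((col_mean - X_true[missing]) ** 2))
```

Spatial interpolation from neighboring pixels could supply the initial guess instead of the column mean, which would combine the two ideas.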
[Figure: MARS ODYSSEY, 6.78 µm layer; panel label: Line Replaced]
MCR of a Feed Pellet
Previous MCR example did not utilize
spatial information
500 micron feed pellet
where are the sugar and protein
in a feed pellet?
embed a pellet in epoxy,
section, and polish
scratches are evident and can
make analysis difficult
FTIR reflection image ~400
microns square
Thanks to Sean Smith and Janiece Hope of
Cargill, Inc., Global Food Research, Scientific
Resources for the image data.
Spectra for Potential Analytes
regions used with 2nd derivative spectra to estimate
spatial contributions of scratch features
[Plot: spectra of Resin, Lysine amino acid, Glucose, Bacteria vs. Wavenumber (cm⁻¹), 3000 to 1000]
[Plot: 2nd derivative spectra of Resin, Lysine, Glucose, Bacteria vs. Wavenumber (cm⁻¹), 3000 to 1000]
[Plot: Scratch Features, PC 1 and PC 2 vs. Wavenumber (cm⁻¹), 3000 to 1000]
[Images: PC 1 and PC 2, color scale -0.01 to 0.01]
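Estimating the spatial contribution of scratch features from 2nd derivative spectra over a selected region can be sketched as follows. The finite-difference derivative stands in for whatever derivative filter was actually used (a Savitzky-Golay filter is a common choice), the data are synthetic, and `second_derivative_scores` is a hypothetical name; the 1790 to 2778 cm⁻¹ window is the region quoted in the slides.

```python
import numpy as np

def second_derivative_scores(cube, wavenumbers, lo, hi, n_pcs=2):
    """PCA scores images from 2nd-derivative spectra in [lo, hi] cm^-1."""
    mx, my, n = cube.shape
    X = cube.reshape(mx * my, n).astype(float)
    d1 = np.gradient(X, wavenumbers, axis=1)
    d2 = np.gradient(d1, wavenumbers, axis=1)   # simple finite differences
    keep = (wavenumbers >= lo) & (wavenumbers <= hi)
    D = d2[:, keep]
    D = D - D.mean(axis=0)                      # mean-center before PCA
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    return (U[:, :n_pcs] * s[:n_pcs]).reshape(mx, my, n_pcs)

# Synthetic cube and axis used only to exercise the function.
rng = np.random.default_rng(5)
wn = np.linspace(1000.0, 3000.0, 120)
cube = rng.random((16, 16, wn.size))
scratch_imgs = second_derivative_scores(cube, wn, lo=1790.0, hi=2778.0)
```

The resulting score images are the kind of spatial pattern that can later be supplied to MCR as a constraint on C.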
MCR Set Up
MCR Initialization and Results
Perform EMSC with magnitude and slope correction
• reference is an estimate of the resin spectrum with robust fitting
• allow glucose, lysine, CaSO4 spectra to pass the filter
• Gallagher, Blake, Gassman, J. Chemometr., 19(5-7), 271-281 (2005).
Account for scratches using spatial constraints:
• Soft Equality Constraints on C: components 4 to 11
• Scores from a PCA of region 2778 to 1790 cm⁻¹ w/ 2nd derivative preprocessing capture variability due to scratch features
Soft Equality Constraints on S: components 1 to 3
• Factor 1: resin
• Factor 2: lysine (w/~ CaSO4)
• Factor 3: glucose
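The EMSC step listed above can be sketched in its basic form: each spectrum is fit as offset + slope + magnitude × reference, then the offset and slope are removed and the result is divided by the magnitude. This is a generic ordinary least-squares version, so the robust fitting mentioned in the slides is omitted. The `passthrough` argument is an assumed way of letting known analyte spectra "pass the filter" by including them in the fit without subtracting them; the function name `emsc` is hypothetical.

```python
import numpy as np

def emsc(X, reference, axis_values, passthrough=None):
    """Basic EMSC: remove offset and slope, normalize magnitude to reference.

    X: (n_spectra, n_channels). `passthrough` (n_channels, k) holds spectra
    included in the fit but not removed, so their signal is retained.
    """
    v = (axis_values - axis_values.mean()) / np.ptp(axis_values)
    cols = [np.ones_like(v), v, reference]
    if passthrough is not None:
        cols.extend(passthrough.T)
    B = np.column_stack(cols)                        # (n_channels, 3 + k)
    coef, *_ = np.linalg.lstsq(B, X.T, rcond=None)   # (3 + k, n_spectra)
    offset, slope, mag = coef[0], coef[1], coef[2]
    baseline = np.outer(offset, np.ones_like(v)) + np.outer(slope, v)
    return (X - baseline) / mag[:, None]

# Synthetic spectra: a reference peak with random offset, slope, magnitude.
rng = np.random.default_rng(6)
wn = np.linspace(1000.0, 3000.0, 100)
reference = np.exp(-((wn - 1700.0) / 150.0) ** 2)
a = rng.normal(0.0, 0.1, size=25)
b = rng.normal(0.0, 1e-4, size=25)
c = rng.uniform(0.5, 1.5, size=25)
X = a[:, None] + b[:, None] * wn + c[:, None] * reference
X_corr = emsc(X, reference, wn)
```

On these noise-free spectra every corrected row collapses onto the reference, which is a convenient sanity check before applying the filter to measured data.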
MCR Factor 1: Resin
[Plot: MCR Factor 1 and Resin vs. Wavenumber (cm⁻¹), 3000 to 1000]
[Image: Factor 1, color scale 0.7 to 1.3]
[Plot: MCR Factor 2, Lysine, CaSO4 vs. Wavenumber (cm⁻¹), 3000 to 1000]
[Plot: MCR Factor 3, Sucrose, Glucose vs. Wavenumber (cm⁻¹), 3000 to 1000]
[Images: panels labeled Lysine Sulfate and Sugar]
[Plots: MCR Factor 4 vs. Wavelength (nm), 4000 to 10000 and 3200 to 5000; labels 0.16 µm, 0.23 µm]
[Image: Contributions on Factor 4, Scratch Feature, color scale -0.01 to 0.01]
R = lysine, G = resin, B = sucrose
C for Factors [2 1 3] = RGB
Contributions → RGB
C for Factors 1:3: 1-Norm Preprocessing
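Mapping the contributions for factors [2 1 3] to RGB after 1-norm preprocessing can be sketched as below. It assumes that "1-norm" means scaling each pixel's three contributions to sum to one, and that negative contributions are clipped to zero before display; both are assumptions, as is the function name `contributions_to_rgb`.

```python
import numpy as np

def contributions_to_rgb(C_img, order=(1, 0, 2)):
    """Map three contribution images to RGB after 1-norm scaling per pixel.

    `order` uses zero-based indices, so (1, 0, 2) corresponds to
    factors [2 1 3] in one-based numbering.
    """
    C3 = np.clip(C_img[:, :, list(order)].astype(float), 0.0, None)
    norm = C3.sum(axis=2, keepdims=True)      # 1-norm of each pixel
    norm[norm == 0] = 1.0                     # leave all-zero pixels black
    return C3 / norm

# Synthetic contribution images for three factors.
rng = np.random.default_rng(7)
C_img = rng.random((10, 12, 3))
rgb = contributions_to_rgb(C_img)
```

With this scaling, color encodes relative composition at each pixel rather than absolute contribution.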
Sample Correlation Map (5 clusters)
KNN Cluster Analysis of the MCR Contributions
Pei, L. Guilin, J., Davis, R.C., Shaver, J.M., Smentkowski, V.S., Asplund, M.C.,
Linford, M.R., Applied Surface Science, submitted (2007).
MIA: Future
[Diagram: Mx × My × N image and MxMy × N matrix, linked by MVA]
spatial and spectral information extraction integrated
Multivariate analysis will account for spatial correlation
spatial resolution high
spectral resolution high
Maximum Autocorrelation Factors
MAF example using scores from first 3 factors
C for Factors [2 1 3]:
R = lysine, G = resin, B = sucrose
Switzer, P., in Comp. Science and Statistics, L. Billard, Ed. 1985, 13-16, Elsevier
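Maximum autocorrelation factors look for linear combinations of channels whose images vary as smoothly as possible between neighboring pixels. One common formulation, assumed here, solves a generalized eigenproblem between the covariance of one-pixel spatial differences and the covariance of the data. The sketch below uses only NumPy by whitening first; the function name `maf` and the synthetic test image are illustrative.

```python
import numpy as np

def maf(cube):
    """Maximum autocorrelation factors of an (Mx, My, N) image.

    Returns factor images ordered from highest to lowest spatial
    autocorrelation, and the generalized eigenvalues (small = smooth).
    """
    mx, my, n = cube.shape
    X = cube.reshape(-1, n).astype(float)
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)

    # Pool one-pixel horizontal and vertical differences.
    dh = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, n)
    dv = (cube[1:, :, :] - cube[:-1, :, :]).reshape(-1, n)
    D = np.vstack([dh, dv]).astype(float)
    cov_d = D.T @ D / (D.shape[0] - 1)

    # Whiten with cov, then diagonalize the whitened difference covariance.
    evals, evecs = np.linalg.eigh(cov)
    W = evecs / np.sqrt(evals)                 # assumes cov is full rank
    lam, V = np.linalg.eigh(W.T @ cov_d @ W)   # ascending: smoothest first
    weights = W @ V
    return (Xc @ weights).reshape(mx, my, n), lam

# One smooth spatial pattern spread over 3 noisy channels.
rng = np.random.default_rng(8)
smooth = np.outer(np.sin(np.linspace(0, np.pi, 64)),
                  np.cos(np.linspace(0, 2 * np.pi, 64)))
cube = smooth[:, :, None] * np.array([1.0, 0.8, 0.6])
cube = cube + 0.1 * rng.standard_normal(cube.shape)
factors, lam = maf(cube)
corr = np.corrcoef(factors[:, :, 0].ravel(), smooth.ravel())[0, 1]
```

Unlike PCA on the unfolded matrix, the result changes if the pixels are shuffled, because the difference covariance depends on which pixels are neighbors.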
Image Analysis Biases
Gray-scale Bias: “It is often wise to remove redundant bands before classification.”
+
Multivariate Bias: Utilize redundant bands to enhance signal-to-noise.
Multivariate Image Analysis Bias: Utilize redundant bands and spatial information to maximize signal-to-noise.