School of Engineering and Architecture
Department for Electrical Engineering
Quellgasse 21, CH-2501 Biel
www.hta-bi.bfh.ch
Bachelor's thesis and Preliminary studies
Smart Image System
Bettler Matthias & Zahnd Samuel
Supervisors: Dr. J. Goette & Dr. M. Jacomet
23rd January 2001
Author: Bettler Matthias & Zahnd Samuel
Supervisors: Dr. J. Goette & Dr. M. Jacomet
Web publication: www.microlab.ch/academics/r and d/diplom
Word processor: LATEX
For everyone who contributed to the success of this work.
For my parents and their support during my studies.
Matthias
For finiteness
- for eternity: Choose this day whom you will serve.
But as for me and my house, we will serve the Lord. Joshua 24:15.
Samuel
Abstract
Since Jack S. Kilby realised the first integrated circuit in 1958 in the Texas Instruments
laboratories, integrated-circuit technology has improved at an amazing pace. Today, CMOS
technology is the state of the art in semiconductor manufacturing, and various well-proven
processes are cheaply available. Nevertheless, CMOS technology has been used for the
manufacturing of image sensors for only a few years. The image-sensor market has been, and
still is, dominated by sensors known as charge-coupled devices (CCDs). CCD sensors use
special manufacturing processes that are not compatible with standard CMOS technology.
The main advantage of CMOS-based image sensors is the ability to place additional
electric circuits on the same chip, e.g. gain circuits at pixel level or circuits for signal
processing. Another advantage is the ongoing development of CMOS technology by numerous
companies.
In the first part of our work we developed a CMOS image sensor with a high dynamic range
over many orders of magnitude of illumination. By using a logarithmic pixel circuit
we could extend the dynamic range up to 6 decades. The circuit consists of a photosensitive
area and a gain circuit. The most demanding part of our image-sensor design was the
derivation of accurate dimensions for the transistors used in the pixel gain circuit. By
calculating and simulating, we finally found the optimal transistor sizes and
drew the pixel layout. The chip was manufactured in the Alcatel Mietec 0.5 µm
process.
While the chip was being manufactured, we dealt with image-processing algorithms. We
concentrated on a completely new approach, cellular neural networks (CNNs). CNNs
implemented on a chip provide an ideal structure for fast analog image processing. Image
data are processed in parallel, so that such a network outperforms digital processors in
terms of speed. Leon O. Chua developed the architecture of these networks in 1988, and many
papers have been published since. We first dealt with the functionality of CNNs and then
tried to configure the networks by using genetic algorithms. We then decided to buy a so-called
CNN Universal Machine, which allows us to verify our template designs and to process the
data from our image sensor. The CNN Universal Machine consists essentially of an integrated
circuit with numerous configurable CNNs implemented on the chip.
Another important part of our work was the measurement of our image sensor. First of
all, we focused on measuring the light-intensity range, and we successfully
verified our simulated data. Another challenge was the elimination of the so-called fixed
pattern noise. This noise is well known in connection with logarithmic sensors but can
be eliminated with suitable methods. Our image sensor was developed as a prototype and
optimized for a large dynamic range of light intensity. Within these constraints our
sensor fulfils the stated requirements.
Thesis Report { Smart Image System i
Summary
Since the first realisation of integrated circuits in 1958 by Jack S. Kilby at Texas
Instruments, integrated-circuit technology has been developed and improved at breathtaking
speed. Today, CMOS technology is the driving force in semiconductor manufacturing and is
cheaply available in numerous well-established processes. Nevertheless, this technology has
been used to manufacture image sensors for only a few years, and commercial products have
reached the market only recently. Until now, image sensing has been dominated by sensors
that work on the charge-coupling principle (CCD). Such image sensors require specially
adapted manufacturing processes.
The main advantage of CMOS-based image sensors is that additional circuit elements, for
example amplifier circuits at pixel level or image-processing algorithms, can be integrated
directly on one and the same chip. A further advantage is the continuous development of
CMOS technology by industry. In a first step of our work we developed a CMOS image sensor
with a very high light-intensity dynamic range of about 6 decades. We achieved this with a
so-called logarithmic pixel, which contains additional circuit elements besides the usual
photosensitive element. The precise dimensioning of the transistors required for the circuit
was the greatest challenge. Through calculations and accurate simulations we finally
determined the optimal transistor sizes and drew the pixel layout. The chip was realised in
the Alcatel Mietec 0.5 µm process.
While our sensor was being produced, we turned to image-processing algorithms and came
across the completely new field of cellular neural networks (CNNs). At circuit level, CNNs
are interconnected analog circuits whose structure makes them virtually predestined for
image processing. Image data are processed in parallel and in analog form, leaving digital
processors far behind in terms of speed. The architecture of these networks was developed by
Leon O. Chua in 1988, and numerous papers on the topic have been published since. We first
studied how these networks work and, in a second step, tried to configure the CNNs.
To compute the configuration data, the so-called templates, we used a genetic algorithm.
To test the CNNs in hardware as well, we decided to buy a so-called CNN Universal Machine.
The CNN Universal Machine consists essentially of a chip on which programmable CNNs are
implemented and which can be configured from a computer. With this system we were able to
verify our template design and to process images from our chip.
Another important part of our work was the characterisation of our image sensor. The
measurement of the light-intensity range was of particular interest; we successfully
verified our simulated data. A further challenge was the removal of the so-called fixed
pattern noise. This noise is well known in connection with logarithmic sensors, but it can
be eliminated with suitable methods. Our image sensor was optimised as a prototype for a
maximal light-intensity range and met the stated requirements.
iv M. Bettler & S. Zahnd
Foreword
The main goal of this work was to combine an image-sensing system with an image-processing
system into a smart image system. Figure 0.2 shows the time schedule of our work. During the
bachelor's thesis we completed and updated the documentation of all parts contained in this
report.
Figure 0.1: Smart Image System.
The project is divided into three main parts as follows.
Figure 0.2: Project time schedule.
Image Sensing (Preliminary 1) In this part we accomplished our first implementation
of a CMOS image sensor. This allows us to capture our first pictures,
verify our simulations and calculations, and finally obtain unknown physical parameters such
as the real dependence of the electrical signals on light.
Image Processing (Preliminary 2) In this part we studied the properties of cellular
neural networks. We presented a complete method for designing CNN templates in the frequency
domain, using genetic algorithms as the optimization algorithm. Finally, we introduced the
concept of the CNN Universal Machine, which we use in the bachelor's thesis for image
processing.
Smart Image System (Bachelor's Thesis) In the bachelor's thesis we measured many
of the important properties of a CMOS image sensor. To perform these measurements we built
specific hardware and software. To complete our work we combined the two systems.
Additional Information for Experts
Figure 0.3 presents the job scheduling of our bachelor's thesis. It covers the eight weeks of
the final project, which we started at the end of October and finished in the middle of
December 2000.
Each of the preliminary studies covered a period of half a year; they were carried out in
winter 1999 and summer 2000.
Figure 0.3: Job scheduling.
Contents
I Image Sensing
1 Basics of a CMOS image sensor
1.1 The perfect model
1.2 The photodiode
2 Logarithmic APS chip
2.1 The sensor core
2.1.1 Simulation of the sensor core
2.1.2 Implementation of the sensor core
2.2 Input/Output
2.2.1 Input (digital part)
2.2.2 Input (analog part)
2.2.3 Output (analog part)
2.3 The complete chip
3 Chip development at MicroLab
3.1 Process Fabrication
3.2 Design-flow for the digital part
3.3 Design-flow for the analog part
3.4 Assembling (analog & digital)
II Image Processing
1 Introduction to Cellular Neural Networks
1.1 Cellular Neural Networks
1.1.1 Network Structure
1.2 Mathematical Description
1.2.1 Cell Dynamics
1.2.2 Spatial Invariance
1.3 Electrical Description
1.3.1 Cell Schematic
1.3.2 Network operation
1.4 Types of Processing Tasks
1.5 Design of CNN Templates and their Robustness
2 CNN Template Design for Image Processing
2.1 Convolution Formulation
2.2 Spatial Frequency Formulation
2.3 Template Design Examples
2.3.1 Low-Pass Filter Design
3 CNN Optimization Techniques
3.1 Genetic Algorithms
3.1.1 How Genetic Algorithms are Different from Traditional Methods
3.1.2 Genetic Search Mechanism
3.1.3 Some Mathematical Foundations
3.1.4 Design of the Fitness Function
3.1.5 GA Based Template Learning
4 Hardware Implementation
4.1 CNN Universal Machine
III Smart Image System
1 Obscura - Image Sensing Unit
1.1 PCB Hardware
1.2 Hardware Interface Card - dSpace
1.3 Optics
1.4 Software
2 Obscura - Measurement
2.1 Introduction to Fixed Pattern Noise (FPN)
2.1.1 Transistor Mismatch in Weak Inversion
2.1.2 Fixed Pattern Noise Correction in Logarithmic Image Sensors
2.2 Photoreceptor response and fixed pattern noise
2.2.1 Response curves
2.2.2 Remaining fixed pattern noise
2.2.3 Slope variations
2.3 Complementing measurements
2.3.1 Crosstalk
2.3.2 Temporal Noise
2.4 Spectral characteristic
3 Aladdin - Image Processing Unit
4 Conclusion
4.1 Current state of the project
4.2 Post-script
A The Layout Structure
B Analysis of the pixel circuit
C Content of the CD-ROM
Bibliography
Part I
Image Sensing
Preface
The goal of the first part of our work is to develop a single-chip image sensor that provides
a high dynamic range with respect to light intensity. We use a CMOS technology in order to
achieve better results than standard CCD implementations. By using a CMOS technology
we expect to approximate well the characteristics of the human eye with respect to light
intensity.
With this project we test the current, inexpensive technology of CMOS image sensors
to gain the knowledge for further implementations in embedded systems at the MicroLab¹.
The main ideas for our work come from [23]. In [23], CMOS technology is used
to build an image sensor for a retina-implant system that will provide visual sensations to
patients suffering from photoreceptor degeneration.
¹ The MicroLab is the laboratory for microelectronics at the School of Engineering and
Architecture in Biel.
1 Basics of a CMOS image sensor
The overall aim of our work is to approximate the human eye by a microelectronic system
based on a CMOS technology.
1.1 The perfect model
The human eye is a brilliant creation. Its main part is the retina, a thin sheet of
neural tissue that partially lines the orb of the eye. This tiny outpost of the central nervous
system is responsible for collecting all the visual information about the properties of objects
in the world over many orders of magnitude of illumination.
The high degree to which a perceived image is independent of the absolute illumination
level is, in large part, the result of the initial analog stages of retinal processing, from the
photoreceptors through the outer plexiform layer. This processing relies on lateral inhibition
to adapt the system to a wide range of viewing conditions and to produce an output that
is independent of the absolute illumination level.
The major parts of the retina are shown in cross-section in Figure 1.1. Light is transduced
into an electrical potential by the photoreceptors at the top. After further processing and
combining, the signals leave the retina by way of the ganglion cells.
Figure 1.1: Cross-section of a primate retina, indicating the primary cell types and signal
pathways. The outer plexiform layer is beneath the foot of the photoreceptors. The
invagination into the foot of the photoreceptor is the site of the triad synapse. In the center
of the invagination is a bipolar-cell process, flanked by two horizontal-cell processes.
R: photoreceptor, H: horizontal cell, IB: invaginating bipolar cell, FB: flat bipolar cell,
A: amacrine cell, IP: interplexiform cell, G: ganglion cell. (Source: [17], p. 258)
The primary function of the photoreceptor is to transduce light into an electrical signal.
For intermediate levels of illumination, this signal is proportional to the logarithm of the
incoming light intensity. The logarithmic nature of the output of the biological photoreceptor
is supported by psychophysical and electrophysiological evidence. This has two important
system-level consequences:
1. An intensity range of many orders of magnitude is compressed into a manageable range
of signal levels.
2. The voltage difference between two points is proportional to the logarithm of the contrast
ratio between the corresponding points in the image.
We discuss our implementation of a CMOS image sensor in Section 2.1.1.
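The two system-level consequences listed above can be checked numerically. The following is only a minimal sketch of an idealised logarithmic photoreceptor, V = V0 + S * log10(E / E0). The function name, the offset V0 and the reference irradiance E0 are hypothetical; the slope S = 69.5 mV/decade is borrowed from Table 2.2 purely for illustration.

```python
import math

SLOPE_V_PER_DECADE = 0.0695  # illustrative value, taken from Table 2.2
V_OFFSET = 1.8               # hypothetical output at the reference irradiance, in V
E_REF = 1e-4                 # hypothetical reference irradiance, in W/m^2


def log_response(irradiance):
    """Idealised logarithmic photoreceptor output in volts."""
    return V_OFFSET + SLOPE_V_PER_DECADE * math.log10(irradiance / E_REF)


# Consequence 1: six decades of irradiance are compressed into well under one volt.
swing = log_response(1e2) - log_response(1e-4)

# Consequence 2: the voltage difference depends only on the contrast ratio,
# not on the absolute illumination level.
dim_pair = log_response(2e-3) - log_response(1e-3)
bright_pair = log_response(2e1) - log_response(1e1)

print(f"swing over 6 decades:       {swing:.3f} V")
print(f"2:1 contrast, dim scene:    {dim_pair * 1e3:.2f} mV")
print(f"2:1 contrast, bright scene: {bright_pair * 1e3:.2f} mV")
```

Both 2:1 pairs give the same voltage difference although their absolute levels differ by four decades, which is the property that makes a logarithmic sensor attractive for scenes with strongly varying illumination.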
1.2 The photodiode
To detect optical radiation (photons), a photodiode is needed. The basic detection process is
illustrated in Figure 1.2, which shows a p-n photodiode. The device is reverse biased, and the
electric field developed across the p-n junction sweeps mobile carriers (holes and electrons)
to their respective majority sides. A depletion layer is therefore created on either side of the
junction. This barrier stops the majority carriers from crossing the junction
in the direction opposite to the field. However, the field accelerates minority carriers from
both sides to the opposite side of the junction, forming the reverse leakage current of the
diode. Thus, intrinsic conditions are created in the depletion region.
A photon incident in or near the depletion region of this device that has an energy
greater than or equal to the bandgap energy E_G of the fabricating material (i.e. hf ≥ E_G)
will excite an electron from the valence band into the conduction band. This process leaves
an empty hole in the valence band and is known as the photo-generation of an electron-hole
pair, as shown in Figure 1.2(a). Carrier pairs generated near the junction are separated
and swept (drift) under the influence of the electric field to produce a displacement current
in the external circuit in excess of any reverse leakage current (Figure 1.2(b)). Figure 1.2(c)
shows the photo-generation and the separation of a carrier pair in the depletion region of
this reverse biased p-n junction.
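The condition hf ≥ E_G can be made concrete with a short calculation. The sketch below is an illustration under one assumption that is not stated in the text: the fabricating material is taken to be silicon with a bandgap of about 1.12 eV near room temperature. The function names are hypothetical.

```python
H = 6.62607015e-34    # Planck's constant in J*s
C = 299_792_458.0     # speed of light in m/s
Q = 1.602176634e-19   # elementary charge in C
E_GAP_EV = 1.12       # assumed bandgap of silicon near room temperature, in eV


def photon_energy_ev(wavelength_m):
    """Photon energy h*f = h*c/lambda, expressed in electronvolts."""
    return H * C / wavelength_m / Q


def can_generate_pair(wavelength_m, e_gap_ev=E_GAP_EV):
    """True if the photon satisfies h*f >= E_G."""
    return photon_energy_ev(wavelength_m) >= e_gap_ev


# Longest wavelength that still satisfies h*f >= E_G.
cutoff_wavelength_m = H * C / (E_GAP_EV * Q)

for wavelength_nm in (400, 700, 1200):
    energy = photon_energy_ev(wavelength_nm * 1e-9)
    ok = can_generate_pair(wavelength_nm * 1e-9)
    print(f"{wavelength_nm} nm: {energy:.2f} eV, can generate a pair: {ok}")
print(f"cutoff wavelength: {cutoff_wavelength_m * 1e9:.0f} nm")
```

Under this assumption, visible light (400 nm to 700 nm) carries more than enough energy per photon, while photons well beyond about 1.1 µm do not.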
The absorption of photons in a photodiode to produce carrier pairs, and thus a
photocurrent, depends on the quantum efficiency QE, the area of the depletion region, the
wavelength, and the power of the incident light. Formula (1.1) defines the
quantum efficiency. Formula (1.2) shows that the photocurrent is proportional to the power
of radiation for a specific wavelength. Note that one of the major factors determining
the quantum efficiency is the semiconductor material. Finally, the QE is a function of the
photon wavelength and must therefore be quoted for a specific wavelength.
\[ QE = \frac{\text{number of electrons collected}}{\text{number of incident photons}} \qquad (1.1) \]
\[ I_p = \frac{QE(\lambda)\, q\, \lambda\, P_L\, A_D}{h\, c} \qquad (1.2) \]
Figure 1.2: Operation of the p-n photodiode. (a) Photo-generation of an electron-hole pair;
(b) the structure of the reverse biased p-n junction; (c) energy band diagram of the reverse
biased p-n junction. (Source: [24], p. 423)
In formula (1.2), q is the electronic charge, h is Planck's constant, c is the speed
of light, λ is the wavelength, P_L is the irradiance of the incident light and A_D is the
area of the photodiode.
Figure 1.3 plots formula (1.2) for two given wavelengths.
[Plot: photocurrent in A (10^-14 to 10^-6) versus irradiance in W/m² (10^-4 to 10^3) for
λ = 400 nm and λ = 700 nm, with QE = 0.8 and area = 1714 µm².]
Figure 1.3: Photocurrent versus the power of the incident light, see formula (1.2).
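Formula (1.2) can be evaluated directly to reproduce the kind of curve shown in Figure 1.3. The following minimal sketch uses the parameters quoted in the figure (QE = 0.8, area = 1714 µm², wavelengths of 400 nm and 700 nm). It assumes the same QE at both wavelengths, as the figure legend suggests; the function name is hypothetical.

```python
H = 6.62607015e-34    # Planck's constant in J*s
C = 299_792_458.0     # speed of light in m/s
Q = 1.602176634e-19   # elementary charge in C


def photocurrent(irradiance_w_m2, wavelength_m, qe=0.8, area_m2=1714e-12):
    """Formula (1.2): I_p = QE * q * lambda * P_L * A_D / (h * c), in amperes."""
    return qe * Q * wavelength_m * irradiance_w_m2 * area_m2 / (H * C)


# Sweep the irradiance over the range of the horizontal axis of Figure 1.3.
for wavelength_nm in (400, 700):
    for exponent in range(-4, 4):
        irradiance = 10.0 ** exponent
        current = photocurrent(irradiance, wavelength_nm * 1e-9)
        print(f"lambda = {wavelength_nm} nm, P_L = 1e{exponent:+d} W/m^2, "
              f"I_p = {current:.3e} A")
```

Because the photocurrent is linear in the irradiance, each curve is a straight line in a log-log plot, and the two wavelengths differ only by the constant factor 700/400.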
In our image sensor the CMOS-compatible photodiode is formed between the drain diffusion
and the p⁻ substrate. For further information about our implementation see Section 2.1.2.
2 Logarithmic APS chip
The following sections present the integration of a CMOS image sensor with an active pixel
sensor (APS) structure on a chip.
2.1 The sensor core
2.1.1 Simulation of the sensor core
The chosen circuit topology is based on [23] and has to meet the following requirements:
1. Logarithmic light detection and conversion into a voltage in a useful linear range.
2. Minimal power consumption by operating the MOSFETs in the subthreshold mode.
3. Minimal power dissipation by cutting off inactive cells.
4. Maximal dynamic range over many orders of magnitude of illumination.
Figure 2.1 shows the circuit for one picture element. For a detailed mathematical
description see Appendix B. By simulating¹ the circuit, we found the optimal sizes for each
MOSFET; Table 2.1 shows the results. Figure 2.2 shows the simulated signals (Vout, Iphoto)
of the circuit versus the light irradiance, which is modeled by the photodiode area.
Transistor  Width [µm]  Length [µm]
Q1          40          1.2
Q2          1.2         1.2
Q3          1.2         1.2
Table 2.1: Optimal sizes for the MOSFETs of the pixel circuit.
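Requirements 1 and 2 above are linked: a MOSFET in the subthreshold mode has an exponential current-voltage characteristic, so a voltage that follows the photocurrent grows with its logarithm. The sketch below evaluates only the generic textbook relation for the voltage change per decade of current, n * U_T * ln(10). It is not the circuit analysis of Appendix B, and the slope factor n and the temperature are assumptions chosen for illustration.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
Q = 1.602176634e-19   # elementary charge in C


def thermal_voltage(temperature_k=300.0):
    """Thermal voltage U_T = k*T/q in volts."""
    return K_B * temperature_k / Q


def slope_per_decade(slope_factor=1.0, temperature_k=300.0):
    """Voltage change per decade of current, n * U_T * ln(10), in volts."""
    return slope_factor * thermal_voltage(temperature_k) * math.log(10.0)


ideal = slope_per_decade()
print(f"U_T at 300 K:        {thermal_voltage() * 1e3:.2f} mV")
print(f"ideal slope (n = 1): {ideal * 1e3:.1f} mV/decade")
```

The result, a few tens of millivolts per decade, is of the same order of magnitude as the slope of 69.5 mV/decade listed in Table 2.2.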
2.1.2 Implementation of the sensor core
This was one of the most important parts of our work. The goal was to implement the
simulated cell structure; see Section 2.1.1. A further aim was to optimize the fill factor, that
is, the ratio of the light-sensitive area to the total area of the pixel.
For our first chip we chose a pixel size of 50 × 50 µm. This size is very large, but useful
for a first implementation. The result of our effort is the layout of one pixel, presented in
Figure 2.3.
¹ We are not able to simulate the dependence of the circuit on light, because there are
unknown physical parameters that depend on the fabrication process. We therefore model the
influence of light by varying the area of the photoreceptor in the simulation, because,
according to formula (1.2), the photocurrent is proportional to this area in the same way as
to the light irradiance.
Figure 2.1: Schematic of the circuit for one pixel. Of all picture elements, exactly one row
is active at a time, depending on the digital Row Select signal. All the generated output
signals then flow along the Column Readout lines and are multiplexed later. By turning off
the analog reference signal Row Preselect, we minimize the power consumption.
Figure 2.2: Simulated pixel circuit, which works linearly over more than seven decades. The
influence of the light intensity was modeled by varying the area of the photodiode.
Figure 2.3: Layout of the pixel circuit.
2.2 Input/Output
The global structure of the input and output configuration is shown in Figure 2.4. For
further information on the implementation, see Appendix A.
Figure 2.4: Block diagram of the on-chip I/O system, divided into an analog and a digital
part and into an input and an output section.
2.2.1 Input (digital part)
The digital part of our chip consists of a 6-to-48 demultiplexer (Demux) and a NAND
unit. The Demux, with its active-low² outputs, switches on the p-pass-transistors
that supply each row of our chip with the analog reference voltage (VpsRef). A further
NAND combination with (VsEna) selects each row (vs<0:47>) of our 48 × 48 pixel
image sensor.
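The row addressing described above can be sketched as a small behavioural model. This is only an illustration of the logic, not the VHDL of the chip: the function names are hypothetical, and the gating polarity (a row is selected only while the enable signal is low) is an assumption taken from the pin description in Table 2.2.

```python
N_ROWS = 48


def demux_active_low(row_address):
    """6-bit address to 48 active-low lines; only the addressed line is 0."""
    if not 0 <= row_address < N_ROWS:
        raise ValueError("row address must be in 0..47")
    return [0 if row == row_address else 1 for row in range(N_ROWS)]


def row_select(row_address, ena_vs_low):
    """Row-select lines vs<0:47>; a row is selected only while EnaVs is low.

    The gating polarity is an assumption based on Table 2.2; the gate-level
    NAND wiring of the chip is not modelled here.
    """
    preselect = demux_active_low(row_address)
    return [line == 0 and bool(ena_vs_low) for line in preselect]


preselect_lines = demux_active_low(5)
selected = row_select(5, ena_vs_low=True)
print("rows supplied with VpsRef:", [i for i, v in enumerate(preselect_lines) if v == 0])
print("rows selected for readout:", [i for i, v in enumerate(selected) if v])
```

Separating the preselect lines from the gated select lines mirrors the text: the Demux alone decides which row receives the reference voltage, while the enable signal decides when that row drives the column lines.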
2.2.2 Input (analog part)
The only task of this simple circuitry is to supply the analog reference signal (VpsRef) to the
selected row. In this way we keep the power consumption low. Instead of a transmission gate
we chose a p-pass-transistor design, so the row selection is active-low.
2.2.3 Output (analog part)
With an analog multiplexer (MUX) the 48 column signals are merged into 3 lines, because
the dSpace board³ has only 4 analog input ports. The MUX is realised as a 4-level
transmission-gate structure.
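Merging 48 columns into 3 lines with a 4-level structure implies that each line is shared by 16 columns, addressed by a 4-bit select word. The mapping can be sketched as follows. The sketch assumes that each output line serves one contiguous third of the array, which the text does not state explicitly; the function names are hypothetical.

```python
N_COLUMNS = 48
N_LINES = 3
COLUMNS_PER_LINE = N_COLUMNS // N_LINES   # 16 columns share one output line
SELECT_BITS = 4                           # one bit per transmission-gate level


def column_to_mux(column):
    """Map a column index to (output line, 4-bit select word).

    Assumes each output line serves one contiguous third of the array.
    """
    if not 0 <= column < N_COLUMNS:
        raise ValueError("column must be in 0..47")
    return divmod(column, COLUMNS_PER_LINE)


def scan_order():
    """Yield (select word, columns visible on the three lines) for one row."""
    for select in range(2 ** SELECT_BITS):
        yield select, [line * COLUMNS_PER_LINE + select for line in range(N_LINES)]


for select, columns in scan_order():
    print(f"select = {select:04b}: columns {columns}")
```

With this organisation one row is read out in 16 steps, and each step delivers three column values in parallel on the three analog lines.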
2.3 The complete chip
Figure 2.5 shows a photograph of our bonded image sensor chip. The pin outline is presented
in Figure 2.6, and Table 2.2 provides an overview of the features and specifications of the chip.
Figure 2.5: The whole CMOS image sensor chip.
² The active-low output of the Demux is needed because of the p-pass-transistor structure in
the analog input part.
³ See Chapter 1.2 in Part III for an introduction to dSpace.
Figure 2.6: Bonding plan.
Parameter            Specification                Unit

Features of the single-chip CMOS image sensor
Resolution           48 × 48                      pixel
Dynamic range        130                          dB
Slope                69.5                         mV/decade

Chip and package
Name                 CH011
Die size             11.27                        mm²
Package              LCC44

Electrical properties
Supply (Vdd)         3.3                          V
Reference (VpsRef)   1.1                          V
Digital inputs       3.3 (CMOS levels)            V
Analog outputs       1.8 to 2.25 (linear range)   V

Pins and usage
Row[0..5]            Binary coded row selection. 0xb00000 means the top row.
Row[0..3]            Binary coded column selection. 0xb000 means the leftmost column
                     of one third.
VpsRef               Reference voltage for the pixel circuit. Only the selected row is
                     supplied.
EnaVs                Digital control signal for the internal row selection. Only when
                     EnaVs is low is the addressed row selected.
Vdd                  Power supply
Vss                  Ground

Table 2.2: Chip features and description.
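The figures in Table 2.2 can be cross-checked against each other. The sketch below assumes that the dynamic range is quoted as 20 * log10 of the intensity ratio, which is a common convention for image sensors but is not stated in the table.

```python
DYNAMIC_RANGE_DB = 130.0       # Table 2.2
SLOPE_MV_PER_DECADE = 69.5     # Table 2.2
LINEAR_RANGE_V = (1.8, 2.25)   # Table 2.2, analog outputs

# Assumption: dynamic range is quoted as 20*log10 of the intensity ratio.
decades = DYNAMIC_RANGE_DB / 20.0
expected_swing_v = decades * SLOPE_MV_PER_DECADE / 1000.0
listed_swing_v = LINEAR_RANGE_V[1] - LINEAR_RANGE_V[0]

print(f"decades of illumination: {decades:.1f}")
print(f"swing from slope:        {expected_swing_v:.3f} V")
print(f"swing from output range: {listed_swing_v:.3f} V")
```

Under this assumption the slope multiplied by the number of decades agrees closely with the width of the linear output range, so the three table entries are mutually consistent.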
3 Chip development at MicroLab
3.1 Process Fabrication
We make use of a multi-chip module (MCM) provided by Europractice¹. An MCM is
a complete electronic system with complex functionality that uses bare (unpackaged) ICs to
achieve a very high integration density.
The final production is done by Alcatel using the CMOS 0.5 µm Mietec C05M-D
process.
3.2 Design-flow for the digital part
We described the whole Demux/NAND design in VHDL and carried out the simulation
as well as the synthesis to the Alcatel 0.5 µm CMOS technology with Synopsys. The design-flow
of the digital part is shown in Figure 3.1 below.
Figure 3.1: Design-flow of the digital part.
The written VHDL code first needs to be checked with a simulation on the functional level.
For that reason, and for further simulations on other levels of the design-flow, a test
bench has been written. Following the verification of the Demux on the functional level,
the design has been synthesized to the Register Transfer Level (RTL). The next step is the
so-called post-synthesis simulation, which should verify the design including the cell delays.
The final task is the floor-planning with Silicon Ensemble from Cadence.
¹ Europractice is a European organisation that involves many universities and is supported by
the European Union. Europractice gives access to design tools from vendors such as Cadence
and allows microchips to be produced at affordable prices.
3.3 Design-flow for the analog part
All the analog parts of our chip are designed as full-custom parts. The layout design, the
verification of the layout design rules, the schematic entry and the layout-versus-schematic
test are all performed with the Cadence design tools.
3.4 Assembling (analog & digital)
To link all the parts of our design together, Silicon Ensemble and the library development
tool Auto Abgen as well as the Synopsys synthesis tools have been used. The synthesis
tool generates the VHDL source for simulation and the Verilog source for floor-planning.
For the analog part, a LEF file² needs to be generated by the library development tool
mentioned above. Figure 3.2 shows a very simple representation of the flow for the
floor-planning/place & route tool Silicon Ensemble.
Figure 3.2: Simplified representation of the design-flow for the place and route task of our
design.
² A Library Exchange Format (LEF) file contains library information for a class of designs.
Library data includes layer, via, placement site type, and macrocell definitions.
Part II
Image Processing
Preface
In our first preliminary study (Part I) we presented the development of a single-chip CMOS
image sensor. It enables us to capture grayscale images with a resolution of 48 × 48 pixels
and a high dynamic range. For physical reasons these raw data are noisy, so the noise
has to be removed. We therefore proposed to develop a signal-processing algorithm
that can later be implemented on the same chip. A possible structure for this could be
a pseudo-resistive diffusive averaging network based on MOS transistors, as presented
in [23].
A completely new approach to eliminating noise is the use of Cellular Neural
Networks (CNNs). Furthermore, this technique can be useful for pattern recognition and
more. In this way we are able to build an intelligent sensor system. The implementation of
the CNN and the imager on the same chip will, however, be more difficult. Finally, we
preferred CNNs to an averaging network because of their flexibility and topicality.
This minor change in the project realisation has a major effect on the contents of the
bachelor's thesis. We will therefore not develop the final chip as previously suggested. With a
CNN Universal Machine prototyping system linked to our imager, however, we can show
the way to build a powerful and intelligent sensor system.
1 Introduction to Cellular Neural Networks
In this chapter we give a brief [15] introduction to cellular neural networks, their description,
and their processing. A general introduction can be found, e.g., in [5, 2].
1.1 Cellular Neural Networks
The Cellular Neural Network (CNN) architecture was invented by Leon O. Chua¹ and his
graduate student Lin Yang in 1988 [5, 4]. Such a network consists of nonlinear,
continuous-time dynamic elements placed in a cellular array. The result is a nonlinear system
in space, which is very complex to handle. The inventors, however, showed that these networks
can be designed and used for a wide variety of engineering purposes while maintaining stability
and keeping the dynamic range within well-designed limits.
Since then many studies have been presented on this subject. There is also a biennial
conference dedicated to CNNs and their applications². CNNs are of interest in many
applications, e.g., 2-D processing and recognition (pictures, differential equations, ...) and
1-D processing and representation (audio, cryptography, data compression, ...).
Performing such processing with digital signal processors (DSPs) requires fast and/or
parallel machines, because there are usually many pixels to process. With CNNs, the processing
may be performed by analog circuits. In applications, such circuits do not require extreme
accuracy to function correctly. Furthermore, CNNs can be built with less silicon
area and less power for the same task and throughput rates.
1.1.1 Network Structure
The cells in a CNN are arranged and connected in a certain way, which can be characterised
by the following attributes. There are basically two types of dimension in CNNs,
one-dimensional and planar. Further, we distinguish between several types of
topology, e.g., square or hexagonal grids. The connections from a particular cell to the
others are defined by a set of neighbours that lie within a certain number of connection units
(Fig. 1.1). Each connection should be understood as being bi-directional, i.e., connected cells
influence each other.
Figure 1.1: Network structure of a planar, square grid CNN with nearest neighbor connections.
¹ Leon O. Chua received the M.S. degree from the Massachusetts Institute of Technology in 1961 and the
Ph.D. degree from the University of Illinois, Urbana, in 1964.
He is currently a Professor of Electrical Engineering and Computer Sciences at the University of
California, Berkeley. His research interests are in the areas of general nonlinear network and system theory.
He has been a consultant in network analysis, modeling, and computer-aided design. He is the
author of several books and papers.
Professor Chua is holder of five U.S. patents and was awarded multiple Honorary Doctorate titles.
² The IEEE International Workshop on Cellular Neural Networks and their Applications.
1.2 Mathematical Description
CNNs may be studied from a purely mathematical point of view.
1.2.1 Cell Dynamics
State Voltage and Output Function. The dynamics of the simplest CNN, as presented
in Chapter 1.3, is described by
\[
\frac{d}{dt} x_{i,j}(t) = -x_{i,j}(t) + \sum_{k,l \in N} A_{k,l}\, y_{i+k,j+l}(t) + \sum_{k,l \in N} B_{k,l}\, u_{i+k,j+l} + I \tag{1.1}
\]
with the output nonlinearity, called unity gain,
\[
y(x) = \tfrac{1}{2} \left[ \, |x + 1| - |x - 1| \, \right] \tag{1.2}
\]
as shown in Fig. 1.2. The input, state, and output, represented by $u_{i,j}$, $x_{i,j}$, and $y_{i,j}$, respectively,
are defined on $0 \le i \le N_1$ and $0 \le j \le N_2$. $N$ (or $N_r$) denotes the set of all cells
to which cell $(i, j)$ is directly connected, where $r$ is the neighborhood radius.
$A_{k,l}$, $B_{k,l}$ and $I$ are the network coefficients.
Other output functions have been proposed [2], such as rather complicated multi-breakpoint
piecewise-linear, Gaussian, or simple thresholding output functions.
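The unity gain nonlinearity can be written down in a few lines. The following is a minimal Python sketch, not part of the original Matlab work; the function names `unity_gain` and `hard_limiter` are our own choice, and the hard limiter is only one possible form of the thresholding alternative mentioned above.

```python
def unity_gain(x: float) -> float:
    """Piecewise-linear output y(x) = 0.5 * (|x + 1| - |x - 1|).

    Equal to x for |x| <= 1 and saturated at -1 or +1 outside that range.
    """
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def hard_limiter(x: float) -> float:
    """A simple thresholding output (assumed form): +1 for x >= 0, else -1."""
    return 1.0 if x >= 0.0 else -1.0


if __name__ == "__main__":
    for value in (-7.5, -0.5, 0.25, 3.0):
        print(value, unity_gain(value), hard_limiter(value))
```

In the linear region the output simply follows the state, which is the property used later when the network is linearized.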
Block diagram. The emphasis on the A template can be easily understood by writing (1.1)
in block diagram form, as shown in Fig. 1.3. From the diagram, it can be seen that the B
template forms a simple feed-forward finite impulse response (FIR) filtered version of the
input, while the A template operates in a feedback loop along with a nonlinearity.
Figure 1.2: The unity gain output function, a piecewise-linear (PWL) function.
Figure 1.3: A block diagram showing the standard CNN.
Initial State. $x_{i,j}(0)$ represents the initial state. It is convenient to restrict its range
to [-1, +1].
Boundary Values. In order to guarantee that all pixels have the same number of
neighbors, it is necessary to surround the image with a ring of boundary pixels. Their state
is fixed to a boundary value.
Stability issues. In connection with linear system theory, "stability" means that the
effect of a sufficiently small disturbance will decay in time and the network will return to
the same equilibrium state. Many reports describe stability analyses and stability
conditions.
Settling time. In the case of a nonlinear dynamical system, the processing speed is
defined by its settling time, i.e., the time it takes the system to reach its equilibrium state.
The settling time depends on the input $u$, the initial state $x_{i,j}(0)$ and, in a complex and
highly nonlinear manner, on the template set $A, B, I$.
Having an estimate of the settling time at one's disposal allows template optimization
with respect to processing speed [10]. Design rules for faster templates can be derived.
1.2.2 Spatial Invariance
Each term in (1.1) carries the $(i, j)$ index, implying that all quantities depend on the grid
position. This is usually not the case, because we work with a subclass of CNNs, the spatially
invariant networks. Spatial invariance implies that each cell has identical controlled
sources, i.e., that the nature of the source $(i, j : k, l)$ depends only on the relative position
of $(i, j)$ and $(k, l)$, and that the constant sources $I(i, j)$ are all identical. One can speak
of a cloning template that is repeated for each cell. The operation of the CNN is defined by
specifying the various controlled sources for each difference $(k - i, l - j)$ and the constant
source, i.e., by defining the feedback template A, the control template B, and the constant
source I.
1.3 Electrical Description
To apply the mathematical description of a cellular neural network to the practical
problems mentioned before, the network has to be implemented in hardware.
1.3.1 Cell Schematic
Translating the original CNN publication [5] into a circuit results in the simple CNN cell
shown in Fig. 1.4.
Figure 1.4: The CNN cell schematic. The output function is shown as a functional block
rather than a current source and a non-linear resistor.
The cell at the grid location $(i, j)$ consists of the parallel RC circuit ($R_x$, $C_x$), several
current sources, and a voltage source. The cell input is represented by the independent
voltage source $E_{ij}$ whose output voltage is $u_{ij}$. The capacitor voltage $x_{ij}$ is the state voltage
of the cell. The state voltage observed through the PWL output function is the output
voltage $y_{ij}$. The PWL block can be built with Chua's circuit [1].
The controlled current sources provide the interconnections between the current cell $(i, j)$
and the neighboring cells $(k, l)$. The voltage controlled current sources (VCCS) B take
$u_{kl}$ as their controlling voltage. The VCCS A take $y_{kl}$ as their controlling voltage. The current
source I is a constant current source.
1.3.2 Network operation
Generally, a CNN is programmed by choosing a template set $\{A, B, I\}$ and assigning the
appropriate initial data to $u_{ij}$ and $x_{ij}(0)$. The CNN circuit then operates as follows: at time
$t = 0^-$ the state voltage of each cell is set to some initial value $x_{ij}(0)$, all the current sources
A, B, and I are inhibited, i.e., output no current, and the voltage $u_{ij}$ is provided in each cell.
At $t = 0^+$ the current sources are switched on and allowed to operate. The state voltage
will evolve in time, as per (1.1). Provided that the network is stable, starting from an initial
state, an equilibrium state will eventually be reached after transients have decayed. The
evolution to an equilibrium state will be called a CNN transient. Either one or both of the
voltages $x_{ij}(0)$ and $u_{ij}$ can be considered to represent the input image(s), depending on the
desired processing task.
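A CNN transient can be imitated in software by integrating (1.1) numerically. The sketch below uses plain Python and a forward Euler scheme; the function name `cnn_transient`, the step size, the number of steps, and the example template are all assumptions made for illustration and are not taken from the hardware described here. Cells outside the grid are replaced by a fixed boundary value.

```python
def unity_gain(x):
    """Output nonlinearity (1.2)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_transient(u, x0, A, B, I, boundary=0.0, dt=0.05, steps=400):
    """Integrate (1.1) with forward Euler and return the final output image.

    u, x0 : lists of rows (input image and initial state)
    A, B  : square templates of odd size, indexed [k + r][l + r]
    """
    rows, cols = len(u), len(u[0])
    r = len(A) // 2

    def at(grid, i, j):
        # Cells outside the array take the fixed boundary value.
        if 0 <= i < rows and 0 <= j < cols:
            return grid[i][j]
        return boundary

    x = [row[:] for row in x0]
    for _ in range(steps):
        y = [[unity_gain(v) for v in row] for row in x]
        new_x = []
        for i in range(rows):
            new_row = []
            for j in range(cols):
                dx = -x[i][j] + I
                for k in range(-r, r + 1):
                    for l in range(-r, r + 1):
                        dx += A[k + r][l + r] * at(y, i + k, j + l)
                        dx += B[k + r][l + r] * at(u, i + k, j + l)
                new_row.append(x[i][j] + dt * dx)
            new_x.append(new_row)
        x = new_x
    return [[unity_gain(v) for v in row] for row in x]


if __name__ == "__main__":
    # Illustrative uncoupled template: self-feedback 2 drives each cell
    # towards the sign of its initial state (a simple threshold operation).
    A = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
    B = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
    image = [[0.3, -0.2], [-0.9, 0.1]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(cnn_transient(zeros, image, A, B, 0.0))
```

Here the image is loaded as the initial state and the input is zero, which is one of the two ways of presenting data mentioned above.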
1.4 Types of Processing Tasks
CNN processing tasks may be classified in two ways: by the type of interaction between
cells, and by the type of input data.
Coupled and Uncoupled Processing Tasks. One can divide processing tasks into
coupled and uncoupled ones, based on the type of interaction implied by the A template.
Coupled processing tasks have non-zero off-center A template entries, i.e., the interaction
between cells involves feedback. On the other hand, uncoupled tasks have, at most, the
self-feedback template entry non-zero, so the interaction between cells is feed-forward only.
Coupled templates can exhibit propagation behavior, and thus perform operations of a global
nature, in contrast to uncoupled templates, which usually have fast settling times.
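Whether a given template leads to a coupled or an uncoupled task can be read directly from the A template. A minimal Python sketch, with the helper name `is_coupled` chosen here for illustration:

```python
def is_coupled(A):
    """Return True if the A template has a non-zero off-center entry."""
    center = len(A) // 2
    return any(
        value != 0
        for k, row in enumerate(A)
        for l, value in enumerate(row)
        if (k, l) != (center, center)
    )


if __name__ == "__main__":
    print(is_coupled([[0, 0, 0], [0, 2, 0], [0, 0, 0]]))  # self-feedback only
    print(is_coupled([[0, 1, 0], [1, 2, 1], [0, 1, 0]]))  # neighbours fed back
```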
Bipolar and Gray-Scale Processing. Some tasks assume that the input pixels take on
only the values +1 and -1 (bipolar), with no intermediate values. For other tasks, particularly when
"real world" data is involved, the inputs and initial states take on gray-scale values between
-1 and +1.
1.5 Design of CNN Templates and their Robustness
CNNs are a subclass of neural networks and resemble a Hopfield network. However, they
differ from general neural networks in many respects, e.g., in implementation issues and in
the way they are programmed. CNNs do not normally go through a training phase, and the
connection weights for a certain task can be deduced or even computed in closed form. One can
therefore use different tools, such as statistical methods, genetic algorithms, experience,
linear algebra and ad hoc reasoning. Unlike conventional artificial neural networks (ANN),
CNNs are not usually used as approximators of an unknown function, but rather to perform
a well-defined mapping.
However, the absence of training and the fact that the connection weights are "prescribed"
rather than learned mean that the circuit has to realise these
weights with a high degree of precision. Considering the physical limitations that analog
implementations entail, robust operation of a CNN chip with respect to parameter variations
has to be ensured. So far, not all mathematically possible CNN tasks can be carried
out reliably on an analog chip. By applying other techniques, CNN templates can be found
which guarantee optimal robustness [10].
The design of CNN templates, and even more so the design of optimally robust templates,
is a key element in CNN research.
2 CNN Template Design for Image
Processing
Many image processing and pattern formation effects of the simple Cellular Neural Network
(CNN)¹ can be understood by means of a common approach, as shown in [6]. By examining
the dynamics in the frequency domain, when all CNN cells are in the linear region, the
mechanism for IIR spatial filtering, pattern formation, morphogenesis, and synergetics can
be shown to be present, even though each cell has only first-order dynamics. In addition,
the method allows many of the standard CNN templates, such as the nonlinear "averaging",
"halftoning" and "diffusion" templates, to be explained in a new light. With one example in
Chapter 2.3 it is shown how generalizations of these templates can be used to design linear
filters² for image processing tasks. Another, more direct approach to designing templates is
briefly introduced in Chapter 3.1.5 for completeness.
2.1 Convolution Formulation
In Chapter 1.2 the dynamics of the CNN were introduced by equation (1.1), which is repeated
here for convenience:
\[
\frac{d}{dt} x_{i,j}(t) = -x_{i,j}(t) + \sum_{k,l \in N} A_{k,l}\, y_{i+k,j+l}(t) + \sum_{k,l \in N} B_{k,l}\, u_{i+k,j+l} + I \tag{2.1}
\]
If all $|x_{i,j}(t)| < 1$ then, because of the unity gain output function (Fig. 1.2), $y_{i,j} = x_{i,j}$
and the whole system behaves according to the linear system
\[
\frac{d}{dt} x_{i,j}(t) = -x_{i,j}(t) + \sum_{k,l \in N} A_{k,l}\, x_{i+k,j+l}(t) + \sum_{k,l \in N} B_{k,l}\, u_{i+k,j+l} + I \tag{2.2}
\]
which we now assume to operate over all state space. For simplicity, define the linearized
template mask as follows:
\[
a(n_1, n_2) =
\begin{cases}
A_{0,0} - 1 & (n_1, n_2) = (0, 0) \\
A_{-n_1,-n_2} & \text{for } -n_1, -n_2 \in N \\
0 & \text{otherwise}
\end{cases}
\qquad
b(n_1, n_2) =
\begin{cases}
B_{-n_1,-n_2} & \text{for } -n_1, -n_2 \in N \\
0 & \text{otherwise}
\end{cases}
\tag{2.3}
\]
¹ Please see Chapter 1 for an introduction to CNNs.
² Linear filtering is filtering in which the value of an output pixel is a linear combination of the values of
pixels in the input pixel's neighborhood. For example, an algorithm that computes a weighted average of
the neighborhood pixels is one type of linear filtering operation.
For now, assume that the sequences $x_0(n_1, n_2)$ and $u_0(n_1, n_2)$ are defined to be zero for
all integers $n_1$ and $n_2$ where the supplied data are not defined. Then the dynamics can be
written in convolution form:
\[
\frac{d}{dt} x_t(n_1, n_2) = a(n_1, n_2) * x_t(n_1, n_2) + b(n_1, n_2) * u(n_1, n_2) + I. \tag{2.4}
\]
2.2 Spatial Frequency Formulation
We will now make use of the two-dimensional Discrete Space Fourier Transform (DSFT),
which gives the representation of a sequence on the basis of complex exponentials. Assuming
all the DSFTs exist, the dynamics can be written in the new basis by transforming (2.4)
into
\[
\frac{d}{dt} X_t(\omega_1, \omega_2) = A(\omega_1, \omega_2)\, X_t(\omega_1, \omega_2) + B(\omega_1, \omega_2)\, U(\omega_1, \omega_2) + I\, \delta(\omega_1, \omega_2) \tag{2.5}
\]
which is uncoupled in spatial frequency, i.e., for each pair $(\omega_1, \omega_2)$
this is a single linear first-order ordinary differential equation. This equation describes the
manner in which the coefficient of each basis function changes over time.
The dynamics of the modes evolve independently for each spatial frequency.
If, for all $(\omega_1, \omega_2)$, we have $A(\omega_1, \omega_2) < 0$, then the central linear system is stable, and all the
exponential terms in the time solution will tend to zero as time goes to infinity. The stable
equilibrium, which is independent of the initial conditions, can be found easily by finding
the limit of the time solution of (2.5):
\[
X_\infty(\omega_1, \omega_2) = H(\omega_1, \omega_2)\, U(\omega_1, \omega_2) \tag{2.6}
\]
\[
H(\omega_1, \omega_2) = \frac{-1}{A(\omega_1, \omega_2)}\, B(\omega_1, \omega_2). \tag{2.7}
\]
The spatial transfer function $H(\omega_1, \omega_2)$ can be shown to have two important properties:
"zero phase" and "infinite impulse response" (IIR). Because the A and B templates are
real and symmetric, their DSFTs are as well. Therefore, the phase of a spatial sinusoid in
the input image is not modified by the transfer functions $A(\omega_1, \omega_2)$ and $B(\omega_1, \omega_2)$. Since
$H(\omega_1, \omega_2)$ is a simple function of these transfer characteristics, it inherits the zero-phase
property. And, because the transfer function is obtained by inverting the FIR
filter $a(n_1, n_2)$, it is typically spatially infinite in extent. That is, due to the feedback in the
dynamics, the local connections of the A template can be used to perform non-local filtering.
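The step from the frequency-domain dynamics to the equilibrium transfer function can be made explicit. The following short derivation is a sketch under the assumption $I = 0$, so that the $\delta$ term drops out; $X_0$ denotes the DSFT of the initial state. For a fixed frequency pair, the solution of the first-order equation is

\[
X_t(\omega_1, \omega_2) = e^{A(\omega_1, \omega_2)\, t}\, X_0(\omega_1, \omega_2)
+ \left( e^{A(\omega_1, \omega_2)\, t} - 1 \right) \frac{B(\omega_1, \omega_2)}{A(\omega_1, \omega_2)}\, U(\omega_1, \omega_2).
\]

Differentiating gives $A e^{At} X_0 + e^{At} B U$, which equals $A X_t + B U$, so the expression satisfies the differential equation. With $A(\omega_1, \omega_2) < 0$ the exponentials vanish for $t \to \infty$ and

\[
X_\infty(\omega_1, \omega_2) = -\frac{B(\omega_1, \omega_2)}{A(\omega_1, \omega_2)}\, U(\omega_1, \omega_2),
\]

which is independent of $X_0$ and has the form of a transfer function applied to the input.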
2.3 Template Design Examples
Linear spatial filtering is the workhorse of image processing algorithms, and the many
applications that use linear filtering, such as interpolation, visual modeling, and image
compression, could benefit from a CNN implementation. We now give some examples of possible approaches to such templates.
2.3.1 Low-Pass Filter Design
Good quality low-pass filters have many important uses, such as image interpolation, and
therefore provide a speed/performance benchmark for any image processing hardware. It
is interesting to see how close a single CNN template pair can come to performing ideal
low-pass filtering of the input by using the equilibrium approach described above.
There are important system constraints on the design process in both the template
domain and the frequency domain. The most important one is that the gains
$A(\omega_1, \omega_2)$ must be strictly negative. Other concerns are the available range and the accuracy
of the template elements of a particular CNN implementation, and a slow convergence speed
or stability sensitivity if the eigenvalues $A(\omega_1, \omega_2)$ are too small in magnitude.
As we would like to perform the parameter minimization process in the frequency domain
while retaining control over template size, we make use of a transformation method similar
to that used in FIR filter design. In addition, the method reduces the number of parameters
to be minimized by imposing circular symmetry.
Let $C(\omega_1, \omega_2)$ be a filter with the desired contours. Then, if we specify
\[
A(\omega_1, \omega_2) = \sum_{r=0}^{R} \alpha_r\, C^r(\omega_1, \omega_2) \tag{2.8}
\]
\[
B(\omega_1, \omega_2) = \sum_{r=0}^{R} \beta_r\, C^r(\omega_1, \omega_2) \tag{2.9}
\]
both $A(\omega_1, \omega_2)$ and $B(\omega_1, \omega_2)$ are simple continuous functions of $C(\omega_1, \omega_2)$, and they will
both have constant contours of the same shape as $C(\omega_1, \omega_2)$ and therefore $H(\omega_1, \omega_2)$ will as
well. Also of importance, by choosing $c(n_1, n_2)$ to be nonzero only on a finite support, the
size of the support of the nonzero parts of $a(n_1, n_2)$ and $b(n_1, n_2)$ can be controlled.
This method will now be used to design $5 \times 5$ A and B templates with the goal of a circularly
symmetric low-pass filter at equilibrium with passband extending to $0.4\pi$ and stopband
starting at $0.5\pi$.
The sequence with the nonzero elements
\[
c(n_1, n_2) =
\begin{bmatrix}
0.25 & 0.50 & 0.25 \\
0.50 & 1.0 & 0.50 \\
0.25 & 0.50 & 0.25
\end{bmatrix}
\tag{2.10}
\]
with the center element 1.0 is known to have contours with good circular symmetry and
radial monotonicity. Because we want our templates to be $5 \times 5$, we have to perform a
frequency-weighted minimization with respect to the parameters $\alpha_0, \alpha_1, \alpha_2, \beta_0, \beta_1, \beta_2$:
\[
A(\omega_1, \omega_2) = \alpha_0 + \alpha_1 C(\omega_1, \omega_2) + \alpha_2 C^2(\omega_1, \omega_2) \tag{2.11}
\]
\[
B(\omega_1, \omega_2) = \beta_0 + \beta_1 C(\omega_1, \omega_2) + \beta_2 C^2(\omega_1, \omega_2). \tag{2.12}
\]
We eventually used genetic algorithms (GA's) to perform the minimization task.
There are a number of parameters in a genetic algorithm which have to be specified. The
following parameters were used in the simulation:
• population size = 50
• bit mutation rate = 0.05
• non-overlapping populations
• two-point crossover
• direct mapping as fitness technique
For an introduction to genetic algorithms see Chapter 3.1.
For the design of the fitness function we used filter design techniques that are very common
in image processing: the frequency transformation method and the frequency sampling
method.
The frequency transformation method transforms a one-dimensional filter into a two-dimensional
filter. It preserves most characteristics of the one-dimensional filter, particularly the
transition bandwidth and ripple characteristic. This method uses a transformation matrix (2.10),
i.e., a set of elements that define the frequency transformation.
The frequency sampling method creates a filter based on a desired frequency response, given
a matrix of points that defines its shape. This method creates a filter whose frequency
characteristic passes through those points. Frequency sampling places no constraints on the
behaviour of the frequency response between the given points; usually, the response ripples
in these areas.
For the low-pass filter design we sampled the spectrum every $0.05\pi$, as shown in Fig. 2.1.
The figure also shows a possible filter characteristic of a high-order low-pass filter.
However, another important property of the filter design is not visible in Fig. 2.1: the
weights of the individual samples. We will return to this problem later and now go on to
the design of the fitness function.
The fitness function, in fact, has to perform a least-squares minimization with respect to the
weights of the individual samples in the frequency domain. Now consider the transformation
matrix (2.10), which is to be transformed into the frequency domain to obtain $C(\omega_1, \omega_2)$:
\[
c(n_1, n_2) =
\begin{bmatrix}
0.25 & 0.50 & 0.25 \\
0.50 & 1.0 & 0.50 \\
0.25 & 0.50 & 0.25
\end{bmatrix}
\;\longleftrightarrow\;
C(\omega_1, \omega_2) = 1 + \cos(\omega_1) + \cos(\omega_2) + \cos(\omega_1) \cos(\omega_2).
\tag{2.13}
\]
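The transform pair can be checked numerically. The sketch below evaluates the DSFT sum of the $3 \times 3$ sequence directly and compares it with the closed form; it is plain Python, and the names `dsft` and `c_closed_form` as well as the test frequencies are illustrative assumptions.

```python
import cmath
import math

# Transformation sequence c(n1, n2), centred at (0, 0).
C_MATRIX = [[0.25, 0.50, 0.25],
            [0.50, 1.00, 0.50],
            [0.25, 0.50, 0.25]]
c = {(n1, n2): C_MATRIX[n1 + 1][n2 + 1]
     for n1 in (-1, 0, 1) for n2 in (-1, 0, 1)}


def dsft(seq, w1, w2):
    """Discrete Space Fourier Transform of a finite-support sequence."""
    return sum(v * cmath.exp(-1j * (w1 * n1 + w2 * n2))
               for (n1, n2), v in seq.items())


def c_closed_form(w1, w2):
    """Closed-form expression for C(w1, w2)."""
    return 1 + math.cos(w1) + math.cos(w2) + math.cos(w1) * math.cos(w2)


if __name__ == "__main__":
    for w1, w2 in [(0.0, 0.0), (0.4 * math.pi, 0.0), (1.0, 2.0), (math.pi, math.pi)]:
        print(w1, w2, dsft(c, w1, w2), c_closed_form(w1, w2))
```

Because the sequence is real and symmetric, the imaginary part of the numerical transform vanishes up to rounding, in line with the zero-phase property.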
Then, if we set the frequency $\omega_2 = 0$, we can easily find the magnitude of the
one-dimensional transfer function $|H(\omega)|$:
\[
|H(\omega)| = \frac{\left| \beta_0 + 2\beta_1 (1 + \cos(\omega)) + 4\beta_2 (1 + 2\cos(\omega) + \cos^2(\omega)) \right|}
{\left| \alpha_0 + 2\alpha_1 (1 + \cos(\omega)) + 4\alpha_2 (1 + 2\cos(\omega) + \cos^2(\omega)) \right|}. \tag{2.14}
\]
Figure 2.1: Sampled low-pass filter characteristic (magnitude versus spatial frequency).
Now we evaluate this transfer function at every frequency sample and formulate the
results as a matrix $Z_i$ multiplied by the two coefficient column vectors that represent the
parameters to be minimized, in order to form the frequency sample error vector $E_i$. For the zero
magnitude case (stopband) the fitness function turns out to be
\[
\left(
\frac{\left| Z_0 \cdot (\beta_2, \beta_1, \beta_0)^T \right|}
{\left| Z_0 \cdot (\alpha_2, \alpha_1, \alpha_0)^T \right|}
\right)^2 = E_0 \tag{2.15}
\]
with the error vector $E_0$. The unity magnitude case (passband), with the error vector $E_1$,
follows analogously:
\[
\left(
\frac{\left| Z_1 \cdot (\beta_2, \beta_1, \beta_0)^T \right|}
{\left| Z_1 \cdot (\alpha_2, \alpha_1, \alpha_0)^T \right|} - 1
\right)^2 = E_1. \tag{2.16}
\]
It is important to mention that in the two formulas above the division and the squaring
are element-by-element operations.
The two matrices $Z_0$ and $Z_1$, evaluated numerically, are
\[
Z_0 =
\begin{pmatrix}
4.0000 & 2.0000 & 1.0000 \\
2.8464 & 1.6871 & 1.0000 \\
1.9098 & 1.3820 & 1.0000 \\
1.1925 & 1.0920 & 1.0000 \\
0.6797 & 0.8244 & 1.0000 \\
0.3431 & 0.5858 & 1.0000 \\
0.1459 & 0.3820 & 1.0000 \\
0.0475 & 0.2180 & 1.0000 \\
0.0096 & 0.0979 & 1.0000 \\
0.0006 & 0.0246 & 1.0000 \\
0 & 0 & 1.0000
\end{pmatrix}
\tag{2.17}
\]
\[
Z_1 =
\begin{pmatrix}
16.0000 & 4.0000 & 1.0000 \\
15.8036 & 3.9754 & 1.0000 \\
15.2265 & 3.9021 & 1.0000 \\
14.3036 & 3.7820 & 1.0000 \\
13.0902 & 3.6180 & 1.0000 \\
11.6569 & 3.4142 & 1.0000 \\
10.0842 & 3.1756 & 1.0000 \\
8.4564 & 2.9080 & 1.0000 \\
6.8541 & 2.6180 & 1.0000
\end{pmatrix}
\tag{2.18}
\]
where each row represents one sample in the frequency domain. If we combine the two error
vectors $E_0$ and $E_1$ so that the first element of the combined vector corresponds to the frequency
$\omega = 0$ and the last element to $\omega = \pi$, forming the vector $E(\omega = [0, \pi])$, we can easily apply a
frequency weight by using the vector dot product. For example, consider the weight vector
$W(\omega = [0, \pi]) = (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)^T$. As all frequency weights
are now equal to one, the dot product
\[
\varepsilon = E \cdot W \tag{2.19}
\]
simply sums up the elements of the vector $E(\omega)$, where $\varepsilon$ is the final fitness function that has to be
minimized by a suitable algorithm³. The "equal-weight" approach leads to the
low-pass filter characteristics shown in Fig. 2.2.
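The construction of the sample matrices and of the weighted fitness value can be sketched in a few lines of Python. This is an illustration only, not the original Matlab code: the names `z_row`, `fitness`, `Z0` and `Z1` are ours, each row is assumed to be $(C^2, C, 1)$ with $C = 2(1 + \cos\omega)$ for $\omega_2 = 0$, the passband samples are assumed to run from $0$ to $0.4\pi$ and the stopband samples from $0.5\pi$ to $\pi$ in steps of $0.05\pi$.

```python
import math


def z_row(w):
    """One row (C^2, C, 1) of the sample matrix, with C = C(w, 0)."""
    C = 2.0 * (1.0 + math.cos(w))
    return [C * C, C, 1.0]


# Passband samples 0 ... 0.4*pi and stopband samples 0.5*pi ... pi.
Z1 = [z_row(k * 0.05 * math.pi) for k in range(0, 9)]
Z0 = [z_row(k * 0.05 * math.pi) for k in range(10, 21)]


def dot(row, coeffs):
    return sum(r * c for r, c in zip(row, coeffs))


def fitness(numerator, denominator, weights=None):
    """Weighted sum of squared magnitude errors.

    numerator, denominator: coefficient triples ordered (x2, x1, x0) for the
    B and A polynomials, respectively.
    """
    e1 = [(abs(dot(r, numerator)) / abs(dot(r, denominator)) - 1.0) ** 2 for r in Z1]
    e0 = [(abs(dot(r, numerator)) / abs(dot(r, denominator))) ** 2 for r in Z0]
    errors = e1 + e0                      # ordered from w = 0 to w = pi
    if weights is None:
        weights = [1.0] * len(errors)     # the "equal-weight" case
    return sum(w * e for w, e in zip(weights, errors))


if __name__ == "__main__":
    print(Z1[1], Z0[1])
    # An arbitrary, purely illustrative candidate (H close to 1 at w = 0).
    print(fitness((0.0, 0.0, 1.0), (0.0, 0.0, -1.0)))
```

A genetic algorithm would call `fitness` once per chromosome after decoding the six coefficients; a candidate whose denominator vanishes at a sample would have to be rejected beforehand, which this sketch does not handle.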
Figure 2.2: Low-pass filter (system transfer function; magnitude versus spatial frequency).
³ See Chapter 3 for further information about optimization techniques.
Although this result is already very close to our desired filter properties, we were still
able to improve the characteristic by applying the weight vector
\[
W = (1, 1.1, 1.2, 1.3, 1.1, 0.9, 0.7, 1, 1.3, 1.3, 1, 0.8, 1, 1.2, 1.1, 1, 0.8, 0.8, 0.8, 0.8)^T \tag{2.20}
\]
to our search algorithm. This weight vector emphasizes the transition region and allows the
response to ripple near the transition region⁴. Performing the minimization with respect to
$\alpha_0, \alpha_1, \alpha_2, \beta_0, \beta_1, \beta_2$ returned
\[
A(\omega_1, \omega_2) = -4.60 + 3.47\, C(\omega_1, \omega_2) - 0.74\, C^2(\omega_1, \omega_2) \tag{2.21}
\]
\[
B(\omega_1, \omega_2) = 0.30 - 0.75\, C(\omega_1, \omega_2) + 0.32\, C^2(\omega_1, \omega_2). \tag{2.22}
\]
By taking the inverse DSFT of (2.21) and (2.22), the sequences $a(n_1, n_2)$ and $b(n_1, n_2)$,
and therefore the A and B templates, can easily be found:
\[
A =
\begin{pmatrix}
-0.0462 & -0.1850 & -0.2775 & -0.1850 & -0.0462 \\
-0.1850 & 0.1275 & 0.6250 & 0.1275 & -0.1850 \\
-0.2775 & 0.6250 & -2.7950 & 0.6250 & -0.2775 \\
-0.1850 & 0.1275 & 0.6250 & 0.1275 & -0.1850 \\
-0.0462 & -0.1850 & -0.2775 & -0.1850 & -0.0462
\end{pmatrix}
\tag{2.23}
\]
\[
B =
\begin{pmatrix}
0.0200 & 0.0800 & 0.1200 & 0.0800 & 0.0200 \\
0.0800 & 0.1325 & 0.1050 & 0.1325 & 0.0800 \\
0.1200 & 0.1050 & 0.2700 & 0.1050 & 0.1200 \\
0.0800 & 0.1325 & 0.1050 & 0.1325 & 0.0800 \\
0.0200 & 0.0800 & 0.1200 & 0.0800 & 0.0200
\end{pmatrix}.
\tag{2.24}
\]
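Since powers of $C(\omega_1, \omega_2)$ correspond to repeated spatial convolution of $c(n_1, n_2)$ with itself, the inverse DSFT can be carried out entirely in the spatial domain. The following Python sketch (helper names `conv2` and `template` are assumptions) rebuilds the two $5 \times 5$ matrices from the rounded coefficients quoted above.

```python
def conv2(p, q):
    """Full 2-D convolution of two small matrices given as lists of rows."""
    rows = len(p) + len(q) - 1
    cols = len(p[0]) + len(q[0]) - 1
    out = [[0.0] * cols for _ in range(rows)]
    for i, p_row in enumerate(p):
        for j, p_val in enumerate(p_row):
            for k, q_row in enumerate(q):
                for l, q_val in enumerate(q_row):
                    out[i + k][j + l] += p_val * q_val
    return out


c = [[0.25, 0.50, 0.25],
     [0.50, 1.00, 0.50],
     [0.25, 0.50, 0.25]]
cc = conv2(c, c)  # 5x5 sequence whose DSFT is C^2


def template(x0, x1, x2):
    """Spatial sequence with DSFT x0 + x1*C + x2*C^2 on a 5x5 support."""
    t = [[x2 * cc[i][j] for j in range(5)] for i in range(5)]
    for i in range(3):
        for j in range(3):
            t[i + 1][j + 1] += x1 * c[i][j]
    t[2][2] += x0          # the constant term is a unit impulse at the centre
    return t


a = template(-4.60, 3.47, -0.74)
b = template(0.30, -0.75, 0.32)

if __name__ == "__main__":
    for row in a:
        print(["%.4f" % v for v in row])
    for row in b:
        print(["%.4f" % v for v in row])
```

The printed values agree with the matrices above to the quoted precision. Note that the centre value obtained for the first matrix is that of the linearized mask $a(0, 0)$; under the definition of the mask, the centre entry of the A template itself would be $a(0, 0) + 1$.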
Fig. 2.3 shows the characteristic of the improved low-pass filter with its very sharp cutoff
and little ripple. The final two-dimensional filter characteristic with good circular symmetry
is shown in Fig. 2.4.
Figure 2.3: Improved low-pass filter (system transfer function; magnitude versus spatial frequency).
Figure 2.4: 2-D low-pass filter characteristic (gain versus the two spatial frequencies).
⁴ A good way to find this weight vector is by trial and error. It is also possible to use frequency masks, which
are often used in classical filter design, even though they cannot easily be applied to genetic algorithms.
3 CNN Optimization Techniques
In order to perform a frequency-weighted minimization for CNN template design as described
in Chapter 2, we were faced with the problem of finding a suitable search method. The current
literature identifies three main types of search methods: calculus-based, enumerative, and
random.
We first tried to apply calculus-based and enumerative methods to our problem, but our
functions turned out to have multiple peaks that are very close to each other, and the
search space was far too big, as we have to minimize at least six parameters with reasonable
accuracy.
The combination of random and calculus-based methods led to quite good but still
insufficiently accurate results. That is why we applied a genetic algorithm to our search problem;
genetic algorithms have been shown to be very effective for CNN template learning [13]. In the following section
we show how genetic algorithms work and how they differ from other search methods.
3.1 Genetic Algorithms
Genetic algorithms (GA's) are stochastic optimization algorithms that were originally
motivated by the mechanisms of natural selection and genetics, and have proven to be effective
in a number of applications. Genetic algorithms were developed by John Holland, his
colleagues, and his students at the University of Michigan. The central theme of research on
genetic algorithms has been robustness, the balance between efficiency and efficacy necessary
for survival in many different environments.
3.1.1 How Genetic Algorithms are Different from Traditional
Methods
In order for genetic algorithms to surpass their more traditional cousins in the quest for
robustness, GA's must differ in some very fundamental ways. Genetic algorithms differ
from more conventional optimization and search procedures in four ways:
• GA's work with a coding of the parameter set, not the parameters themselves.
• GA's search from a population of points, not a single point.
• GA's use payoff information, not derivatives or other auxiliary knowledge.
• GA's use probabilistic transition rules, not deterministic rules.
What makes a genetic algorithm attractive is its simplicity and the fact that its
applicability is not limited by restrictive assumptions about the search space (continuity,
unimodality, existence of derivatives, etc.). Despite their relative simplicity, GA's outperform
a purely random search because they can exploit information accumulated during the evolution of
the search. Calculus-based methods are superior in the problem domains where
they can be applied, but GA's provide a robust search in discontinuous, multimodal, and noisy
search spaces.
3.1.2 Genetic Search Mechanism
Because genetic algorithms are rooted in both natural genetics and computer science, their
terminology mixes natural and artificial expressions. The scope of GA's is global, as they
use a population of binary strings, called chromosomes, to explore the search space. Each
chromosome encodes a point in the parameter space, i.e., a possible solution of the problem
to be solved. These binary strings are evaluated through a fitness function which contains
all the information about the problem. Evaluation means that the fitness value of the
corresponding chromosome is calculated. The better the solution encoded by
a chromosome, the higher its fitness. The genetic algorithm then tries to improve the fitness
of the population by combining information contained in high-fitness chromosomes.
A common GA implementation consists of the following four steps, which are also illustrated
in Figure 3.2:
1. Determine the Initial Population
A population of binary strings is randomly determined. We have chosen relatively
small population sizes with respect to the whole set of possible binary strings. Due to
the constraints coming from the VLSI technology, we coded the parameters with 10 bits
in the range of [-5, 5].
2. Reproduction or Selection
Reproduction is a process in which individual strings are copied according to their
objective function values (biologists call this function the fitness function). Copying
strings according to their fitness values means that strings with a higher value have a
higher probability of contributing one or more offspring to the next generation.
Reproduction can be implemented in algorithmic form in a number of ways. Perhaps
the easiest is to create a roulette wheel where each current string in the population has
a roulette wheel slot sized in proportion to its fitness. Figure 3.2 shows this approach
graphically.
3. Crossover
Crossover means the exchange of substrings between two parent chromosomes, combining
valuable information of the parents. We used the two-point crossover operator in
our GA implementation: two crossing sites are selected and the substrings
between the crossing sites are exchanged, as shown in the example in Fig. 3.1.
4. Mutation
Mutation maintains diversity in the string population by flipping an arbitrary bit in
the chromosomes with a given probability that is generally low.
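The four steps can be sketched with a handful of small functions. The following Python code is an illustration only and does not reproduce the Matlab implementation: all function names are hypothetical, the linear 10-bit decoding onto [-5, 5] is an assumed mapping, and roulette-wheel selection as written expects non-negative fitness values where larger means better.

```python
import random

BITS = 10            # bits per parameter
LO, HI = -5.0, 5.0   # parameter range


def random_population(size, n_params, rng):
    """Step 1: random binary strings, BITS bits per encoded parameter."""
    return ["".join(rng.choice("01") for _ in range(BITS * n_params))
            for _ in range(size)]


def decode(bits):
    """Map one BITS-long binary string linearly onto [LO, HI] (assumed coding)."""
    return LO + int(bits, 2) * (HI - LO) / (2 ** BITS - 1)


def roulette_select(population, fitnesses, rng):
    """Step 2: pick a string with probability proportional to its fitness."""
    pick = rng.uniform(0.0, sum(fitnesses))
    running = 0.0
    for chromosome, fit in zip(population, fitnesses):
        running += fit
        if pick <= running:
            return chromosome
    return population[-1]


def two_point_crossover(parent1, parent2, i, j):
    """Step 3: exchange the substrings between crossing sites i and j."""
    child1 = parent1[:i] + parent2[i:j] + parent1[j:]
    child2 = parent2[:i] + parent1[i:j] + parent2[j:]
    return child1, child2


def mutate(chromosome, rate, rng):
    """Step 4: flip each bit independently with probability `rate`."""
    return "".join(("1" if bit == "0" else "0") if rng.random() < rate else bit
                   for bit in chromosome)


if __name__ == "__main__":
    rng = random.Random(1)
    pop = random_population(4, 1, rng)
    print(pop, [decode(p) for p in pop])
    print(two_point_crossover("00111011011000", "10011100100101", 4, 10))
    print(mutate(pop[0], 0.05, rng))
```

For a minimization problem such as the filter design above, the error value would first have to be mapped to a fitness that grows as the error shrinks; how this mapping was done in the original implementation is not specified here.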
Our GA has been implemented in Matlab¹ and compiled to C code, which has been
shown to be an effective way to implement genetic algorithms because the fitness function to
be evaluated during the reproduction task would be difficult and therefore time-consuming
to implement in C.
parents    1: 0011 | 101101 | 1000
           2: 1001 | 110010 | 0101
offspring  1: 0011 | 110010 | 1000
           2: 1001 | 101101 | 0101
Figure 3.1: The two-point crossover operator.
Figure 3.2: Graphical representation of a genetic algorithm.
¹ Matlab handles a range of computing tasks in engineering and science, from data acquisition and analysis
to application development. The Matlab environment integrates mathematical computing, visualization,
and a powerful technical language. Built-in interfaces let you quickly access and import data from
instruments, files, and external databases and programs. In addition, Matlab lets you integrate external
routines written in C, C++, Fortran, and Java with your Matlab applications.
3.1.3 Some Mathematical Foundations
The operation of genetic algorithms is remarkably straightforward. After all, we start with a
random population of n strings, copy strings with some bias toward the best, mate and
partially swap substrings, and mutate an occasional bit value for good measure. In this section
we briefly give an overview of the mathematical background of GA's in order
to give the reader a better basis for understanding GA's and therefore our Matlab source
code. However, this description is very short; for further reading we recommend [7].
Let us consider a schema H taken from the three-letter alphabet {0, 1, *}. The asterisk
or star * is a "don't care" symbol which matches either a 0 or a 1 at a particular position.
For example, consider the length l = 7 schema H = *11*0**.
Not all schemata are created equal. Some are more specific than others. For example, the
schema 011*1** is a more definite statement about important similarity than the schema
0******. Furthermore, certain schemata span more of the total string length than
others. For example, schema 1****1* spans a larger portion of the string than schema
1*1****. To quantify these ideas, we introduce two schema properties: schema order and
defining length.
The order of a schema H, denoted by $o(H)$, is simply the number of fixed positions (in a
binary alphabet, the number of 1's and 0's) present in the template. In the example above,
the order of the schema 011*1** is 4, whereas the order of the schema 0****** is 1.
The defining length of a schema H, denoted by $\delta(H)$, is the distance between the first and
last specific string position. For example, the schema 011*1** has defining length $\delta = 4$.
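Both schema properties are easy to compute. A small Python sketch, with function names chosen here for illustration:

```python
def order(schema):
    """o(H): number of fixed (non-'*') positions."""
    return sum(ch != "*" for ch in schema)


def defining_length(schema):
    """delta(H): distance between the first and last fixed position."""
    fixed = [i for i, ch in enumerate(schema) if ch != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


def matches(schema, string):
    """True if the binary string is an instance of the schema."""
    return len(schema) == len(string) and all(
        s == "*" or s == b for s, b in zip(schema, string))


if __name__ == "__main__":
    for h in ("011*1**", "0******", "1****1*", "1*1****"):
        print(h, order(h), defining_length(h))
    print(matches("*11*0**", "0110011"))
```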
Schemata and their properties are interesting notational devices for discussing and classify-
ing string similarities rigorously. More than this, they provide the basic means for analyzing
the net e�ect of reproduction and genetic operators on building blocks contained within the
population.
The e�ect of reproduction on the expected number of schemata in the population is particu-
larly easy to determine. Suppose at a given time step t there are m examples of a particular
schema H contained within the population A(t). If we recognize that the average �tness of
the entire population may be written as f =Pfj=n then we may rewrite the reproductive
schema growth equation as follows:
m(H; t+ 1) =m(H; t)f(H)
f(3.1)
In words, a particular schema grows as the ratio of the average fitness of the schema
to the average fitness of the population. Schemata with fitness values above the population
average will receive an increasing number of samples in the next generation and vice versa.
Now suppose we assume that a particular schema H remains above average by an amount c f̄,
with c a constant. On this assumption we can rewrite the schema difference equation as
follows:
$$ m(H, t+1) = m(H, t)\, \frac{\bar{f} + c\bar{f}}{\bar{f}} = (1 + c)\, m(H, t) \tag{3.2} $$
Starting at t = 0 and assuming a stationary value of c, we obtain the equation:
$$ m(H, t) = m(H, 0)\, (1 + c)^{t} \tag{3.3} $$
We can recognize a geometric progression, the discrete analog of an exponential form.
Reproduction allocates exponentially increasing numbers of trials to above-average schemata.
In [7] it has been shown that if crossover is itself performed by random choice, say with
probability p_c at a particular mating, the survival probability of a schema in a string of
length l may be given by the expression:
$$ p_s \geq 1 - p_c\, \frac{\delta(H)}{l - 1} \tag{3.4} $$
The combined effect of reproduction and crossover may now be considered. As when we
considered reproduction alone, we are interested in calculating the number of a particular
schema H expected in the next generation. Assuming independence of the reproduction and
crossover operations, we obtain the estimate:
$$ m(H, t+1) \geq m(H, t)\, \frac{f(H)}{\bar{f}} \left[ 1 - p_c\, \frac{\delta(H)}{l - 1} \right] \tag{3.5} $$
Comparing this to the previous expression for reproduction alone, the combined effect
of crossover and reproduction is obtained by multiplying the expected number of schemata
for reproduction alone by the survival probability under crossover p_s. Schema H grows or
decays depending upon a multiplication factor. With both crossover and reproduction, that
factor depends on two things: whether the schema is above or below the population average
and whether the schema has relatively short or long defining length. Clearly, those schemata
with both above-average observed performance and short defining lengths are going to be
sampled at exponentially increasing rates.
The last operator to consider is mutation. Mutation is the random alteration of a single
position with probability p_m. It can be shown that a particular schema H receives an expected
number of copies in the next generation under reproduction, crossover, and mutation
as given by the following equation:
$$ m(H, t+1) \geq m(H, t)\, \frac{f(H)}{\bar{f}} \left[ 1 - p_c\, \frac{\delta(H)}{l - 1} - o(H)\, p_m \right] \tag{3.6} $$
The addition of mutation changes our previous conclusion little. The following important
conclusion is named the Schema Theorem, or the Fundamental Theorem of Genetic
Algorithms:
Short, low-order, above-average schemata receive exponentially
increasing trials in subsequent generations.
Although the calculations to prove the schema theorem are not too demanding, the theorem's
implications are far reaching and subtle, as shown in [7].
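To get a feeling for the bound in equation (3.6), it can be evaluated for two schemata that differ only in their defining length. This is a minimal sketch with made-up illustrative numbers (population counts, fitness values and operator probabilities are assumptions, not data from this work).

```python
def expected_schema_count(m, f_schema, f_mean, delta, order, length, p_c, p_m):
    """Lower bound of eq. (3.6) on the expected count of schema H at t + 1."""
    survival = 1.0 - p_c * delta / (length - 1) - order * p_m
    return m * (f_schema / f_mean) * survival


# Hypothetical values: 10 copies, 20 % above-average fitness, 7-bit strings.
common = dict(m=10, f_schema=1.2, f_mean=1.0, order=2, length=7, p_c=0.6, p_m=0.01)
short_schema = expected_schema_count(delta=1, **common)
long_schema = expected_schema_count(delta=6, **common)
print(short_schema, long_schema)
```

With these numbers the short schema is expected to gain copies while the long one loses them, although both have the same above-average fitness.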
3.1.4 Design of the Fitness Function
The fitness function which is used during the selection process must be adapted to the
current problem. We used the genetic algorithm to design CNN templates in general. To be
more precise, we performed a frequency-weighted minimization of some filter parameters to
achieve a certain filter characteristic. For further details see Chapter 2.3.
3.1.5 GA Based Template Learning
Originally, CNN templates were not designed in the frequency domain. In this section we
give a brief description of the GA based template learning for completeness.²
Operations performed by an asymptotically stable CNN can be described by a triplet
of signal arrays, e.g., images: the input, initial state, and settled output of the network
mapped into gray scale values of pixels. The problem of learning is to find the template
of an operation given by the image triplet. The template to be found should define the
dynamics such that the desired output is a stable equilibrium point in the state space and
the initial state is in its basin of attraction.
² We did not implement this approach. Our design was carried out in the frequency domain. Please see
Chapter 2.3 for further details.
We can meet both requirements by considering the trajectory of the transient. The simplest
way to attain this is to create a cost function which compares the desired output to the
result of the transient defined by a given template and the input and initial state from the
image triplet. The following formula gives such a function:
$$ g(p) = \sum_{i=1}^{k} \left( y_i^{d} - y_i(\infty) \right)^{2} \tag{3.7} $$
where p denotes the parameter vector, i.e., the template, k is the size of the network (the
number of cells), y_i^d is the value of the ith pixel of the desired output and y_i(∞) stands for
the corresponding pixel of the settled output. g(p) = 0 if the result of template p is identical
to the desired output, and g(p) grows quadratically with the distance otherwise. By using g(·)
as a cost function, the problem of learning can be formulated as an optimization problem.
When genetic algorithms are applied, g(·) is minimized indirectly: its value is mapped into a fitness
value f(·) which is to be maximized.
Implementations of the GA based template learning are discussed in [13] and [10].
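The cost function of equation (3.7) and its mapping to a fitness value can be sketched as follows. The text does not specify the mapping from g to f, so the reciprocal form used here is purely an assumption for illustration; any monotonically decreasing function would serve the same purpose.

```python
def template_cost(desired, settled):
    """Eq. (3.7): sum of squared pixel errors between desired and settled output."""
    if len(desired) != len(settled):
        raise ValueError("both outputs must have the same number of cells")
    return sum((yd - y) ** 2 for yd, y in zip(desired, settled))


def fitness_from_cost(cost):
    """Hypothetical cost-to-fitness mapping (an assumption, not from the thesis)."""
    return 1.0 / (1.0 + cost)


desired = [1.0, -1.0, 1.0, -1.0]
settled = [1.0, -1.0, -1.0, -1.0]   # one cell settled to the wrong value
cost = template_cost(desired, settled)
print(cost, fitness_from_cost(cost))
```

A perfect template gives cost 0 and therefore the maximum fitness of 1 under this mapping.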
4 Hardware Implementation
After all these theoretical and numerical results, we would like to implement this network in
hardware. Furthermore, we will later link this system with our imager. In this way we are
able to perform the noise cancellation in real time. This platform could later be used for
further image processing tasks, up to the recognition of objects.
4.1 CNN Universal Machine
In Chapter 1 we described the properties of a CNN and in Chapter 3 we proposed a method
to calculate coefficients for the network so that a particular spatial filter results. If we want
to implement a CNN in VLSI, we have to hard-wire a separate circuit for each template set.
The invention of the CNN universal machine (CNN-UM) [20, 19] has overcome this problem.
It is the first stored-program array computer with analog nonlinear array dynamics. One
CNN operation, for example solving thousands of nonlinear differential equations [21, 12]
in a microsecond, is just one single instruction. In image processing applications we often
need a sequence of several templates to calculate the output. One key point is that, in order
to exploit the high speed of the CNN chips, intermediate results have to be stored cell by
cell. Therefore local analog memory is needed.
The term 'universal' comes from the fact that there is a theoretical basis for the statement
that virtually any processing task can be somehow solved by a CNN-UM. In [3] it is indirectly
shown that a CNN-UM is a so-called Turing machine, a hypothetical computer capable of
solving any problem whose solution can be formulated as an algorithm.
The CNN-UM consists of a CNN array with additional data flow elements like analog
memory, switches, converters and a superposed control structure. This design
concept was defined in 1993 [20]. It is the basis for further CNN-UM chip designs. Fig. 4.1
gives a small insight into the analogic¹ computer architecture.
In 1999 a chip prototyping system for a CNN-UM was presented in [19]². It has all the
software and hardware ingredients of the stored programmable computer (high-level language,
compiler, operating system, assembly and machine code, analogic central processing unit,
analog and digital memory, peripherals, etc.). For our project we made use of this system,
called Aladdin.
Figure 4.1: The CNN universal machine: global architecture.
¹ Analogic is a contraction for "analog" and "logic" computation, two key features of the CNN universal
chip.
² This system as well as a lot of CNN applications were developed at the University of Budapest, Hungary.
The head of the Analogical & Neural Computing Laboratory is Tamás Roska.
Tamás Roska received the Diploma in Electrical Engineering from the Technical University of Budapest
in 1964 and the Ph.D. and D.Sc. degrees in Hungary in 1973 and 1982, respectively.
His main research areas in electronic circuits and systems and computing have been: active circuits,
computer-aided design, nonlinear circuits and systems, neural circuits and analogic computing systems.
He has published several papers and books. Dr. Roska is a co-inventor of the CNN (Cellular Neural
Network) Universal Machine and Supercomputer (with L.O. Chua) and the analogic CNN Bionic Eye
(with F. Werblin and L.O. Chua).
Professor Roska has been awarded several titles.
Part III
Smart Image System
Preface
In our preliminary studies we presented the development of a CMOS image-sensor (Part I)
and image processing by cellular neural networks (Part II). In this part we present how these
systems behave in practice. To get the image data from the chip to a computer we developed
a system called Obscura. After capturing, we process our images on a CNN universal
machine. The properties of this system, named Aladdin, were already mentioned in
Chapter 4.1 of Part II.
We present many measurement results that provide a valuable basis for further developments
and research in the field of smart image systems.
1 Obscura
Image Sensing
The image sensing unit is one of the main parts of a smart image system. Therefore, we built
the imager Obscura, which allows us to capture images and link them to the Matlab¹
platform. Obscura is composed of the single-chip CMOS image-sensor (Part I), a printed
circuit board (PCB), an optical lens system, and a hardware interface card. Figure 1.1 shows
the data flow between the sensor chip and a computer.
Figure 1.1: Data flow of Obscura.
1.1 PCB Hardware
Besides the sensor chip, the PCB contains input and output drivers that protect the
sensor and handle the different digital voltage levels, such as 3.3 V CMOS and 5 V TTL. The
schematic of the board is shown in Fig. 1.2.
1.2 Hardware Interface Card dSpace
The DS1102 interface card from dSpace is a single-board system which is specifically designed
for the development of high-speed multivariable digital controllers and real-time simulations in
various fields. It is also well suited for general digital signal processing tasks.
¹ Matlab handles a range of computing tasks in engineering and science, from data acquisition and analysis
to application development. The Matlab environment integrates mathematical computing, visualization,
and a powerful technical language. Built-in interfaces let you quickly access and import data from
instruments, files, and external databases and programs. In addition, Matlab lets you integrate external
routines written in C, C++, Fortran, and Java with your Matlab applications.
Figure 1.2: Electrical Schematic of the PCB for OBSCURA.
The DS1102 is based on the Texas Instruments TMS320C31 third-generation floating-point
Digital Signal Processor (DSP), which forms the main processing unit and provides a fast
instruction cycle time for numerically intensive algorithms.
The DSP has been supplemented by a set of on-board peripherals frequently used in digital
control systems. Analog-to-digital and digital-to-analog converters, a DSP-microcontroller-based
digital I/O subsystem and incremental sensor interfaces make the DS1102 an ideal
single-board solution for a broad range of digital control tasks, not least for our image
capturing system Obscura.
The TMS320C31 supports a total memory space of 16M 32-bit words including program,
data and I/O space. All off-chip memory and I/O can be accessed by the host even while
the DSP is running, thus allowing easy system setup and monitoring.
We used the three analog-to-digital converters (ADC) for our application in order to read
in the analog sensor data. Channels 1 and 2 provide a resolution of 16 bit at a sampling rate
of 250 kHz, and channel 3 provides 12 bit at 800 kHz. Further, we used one of the
digital-to-analog converters (DAC) to set the VpsRef voltage for the pixel circuit; it has a
precision of 16 bit and achieves a conversion time of 4 µs. Finally, we needed eleven digital
I/O lines to address our image-sensor.
1.3 Optics
The required optics depend mainly on the application. Obscura can easily be adapted to
microscopic or macroscopic requirements. For the measurement of the imager chip, the following
points must be taken into account. Obscura represents a sensor system converting
optical information into electrical information. Therefore, most of the sensor measurements
require a means of optically stimulating the image sensor. For this reason, the measurements
were carried out in an optical laboratory providing different kinds of light sources,
such as monochromatic laser light or white light from a xenon-arc lamp. To achieve the high
dynamic range necessary for evaluating the logarithmic photoreceptors, a number of neutral
density filters were used. Finally, a spectrograph is needed to select a color out of the white
spectrum.
In order to focus the laser beam onto a single pixel, to expand it over the whole chip, or to
get a homogeneous illumination, some lenses are needed. In the focused case we calculated a
minimal beam diameter of 14 µm. At this diameter the Gaussian beam profile has decreased to
1/e² of its peak value. In reality this diameter is larger because of non-idealities of the lens.
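For orientation, the intensity profile of an ideal Gaussian beam can be written down explicitly. The following assumes that the quoted 14 µm is the conventional 1/e² diameter 2w; the symbols w (beam radius) and I_0 (peak intensity) are introduced here for illustration only.

$$ I(r) = I_0 \exp\!\left( -\frac{2 r^2}{w^2} \right), \qquad \frac{I(w)}{I_0} = e^{-2} \approx 0.135 $$

Under this assumption, with 2w = 14 µm the intensity has dropped to about 13.5 % of its peak value at a distance of 7 µm from the beam axis.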
1.4 Software
The main software consists of two modules for our imager. The first basic module is a Simulink
model that includes some I/O blocks and an S-function, coded in C, which
performs the continuous addressing and readout of the image-sensor data as well as the storage of
the current picture on the dSpace level. It is important to mention that this application is
able to run as an independent program on dSpace and can be accessed from Matlab through the
so-called Matlab-DSP interface, which provides basic functions for reading and writing data
on the dSpace processor boards. The functions that provide this access from the Matlab
command line or from Matlab M-files are called Mlib/Mtrace functions and are an
important part of our second software module. This second module consists of several M-files
that read the image data from dSpace and visualize them with the powerful numerical
and graphical tools running under Matlab.
In the first part of our software description we will have a closer look at the syntax and the
conventions for the design of S-functions in general. We will introduce our Simulink model
and describe the state machine that is implemented in our S-function. In the second part of
the description we explain the functionality of our Matlab functions.
S-functions (system-functions) provide a powerful mechanism for extending the capabilities
of Simulink. They allow users to add their own algorithms to Simulink models. S-functions
can be coded either in Matlab or in C; however, to build models that are executable
on dSpace, they must be written in C. The most common use of S-functions is to create
custom Simulink blocks. Each block within a Simulink model has the following general
characteristics: a vector of inputs, u, a vector of outputs, y, and a vector of states, x, as
shown in Fig. 1.3.
Figure 1.3: General Simulink model.
The state vector may consist of continuous states, discrete states, or a combination of
both. The mathematical relationships between the inputs, outputs, and states are expressed
by the following equations:
$$
\begin{aligned}
y &= f_0(t, x, u), \\
\dot{x}_c &= f_d(t, x, u), \\
x_{d_{k+1}} &= f_u(t, x, u), \\
x &= x_c + x_d.
\end{aligned}
\tag{1.1}
$$
Simulink makes repeated calls during specific stages of simulation to each block in the
model, directing it to perform tasks such as computing its outputs, updating its discrete
states, or computing its derivatives. Additional calls are made at the beginning and end
of a simulation to perform initialization and termination tasks. The so-called S-function
routines listed in Table 1.1 are called by Simulink during the simulation and
must therefore be implemented with exactly these names in every C MEX S-function.²
| Simulation stage | S-function routine |
|---|---|
| Initialization | mdlInitializeSizes |
| Calculation of next sample hit (optional) | mdlGetTimeOfNextVarHit |
| Calculation of outputs | mdlOutputs |
| Update of discrete states | mdlUpdate |
| Calculation of derivatives | mdlDerivatives |
| End of simulation tasks | mdlTerminate |
Table 1.1: Simulink simulation stages
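The order in which these routines are called can be illustrated with a toy driver. This is only a Python analogy under our own assumptions: the method names mirror Table 1.1, but the class, the `run` function and their signatures are hypothetical and are not the Simulink C API (real C MEX routines receive a SimStruct pointer).

```python
class ToyBlock:
    """Discrete accumulator block; method names mirror Table 1.1 for illustration."""

    def mdlInitializeSizes(self):
        self.x = 0.0                       # one discrete state
        self.log = ["mdlInitializeSizes"]

    def mdlOutputs(self, t, u):
        self.log.append("mdlOutputs")
        return self.x                      # y = f0(t, x, u)

    def mdlUpdate(self, t, u):
        self.log.append("mdlUpdate")
        self.x = self.x + u                # x[k+1] = fu(t, x, u)

    def mdlTerminate(self):
        self.log.append("mdlTerminate")


def run(block, inputs, dt=1.0):
    """Hypothetical simulation loop: initialize, then outputs and update per step."""
    block.mdlInitializeSizes()
    outputs = []
    for k, u in enumerate(inputs):
        t = k * dt
        outputs.append(block.mdlOutputs(t, u))
        block.mdlUpdate(t, u)
    block.mdlTerminate()
    return outputs


block = ToyBlock()
outputs = run(block, [1.0, 2.0, 3.0])
print(outputs)
print(block.log)
```

The sketch has no continuous states, so no derivative routine appears; the output at each step reflects the state before the update of that step.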
Other important topics are the hardware-independent datatypes and the access to variables
from outside the system. The S-function specific C code provides the following three
datatypes: real_T, int_T and boolean_T. These datatypes have a corresponding representation
on the dSpace level, depending on the dSpace board version. In any case, int_T
corresponds to integer, boolean_T to unsigned integer and real_T to float.
To access variables from Matlab via the Mlib interface, a Variable Description file (TRC-file)
is needed. TRC-files describe the names and types of the variables that can be accessed
by ControlDesk³ and therefore also by an Mlib function. By changing these variables, the
application can easily be adapted to the given requirements. Global, non-static variables
or pointer variables in the compilation unit of an S-function or any other C-coded module
can be included in the Variable Description file, thus making them accessible from outside.
For a better understanding of the topics just mentioned, consider our Simulink model shown
in Fig. 1.4. As already described in Chapter 1.1, the chip provides three analog outputs for
the image data, one digital input for the capturing signal, one analog input for the reference
voltage, and twelve digital inputs to address and enable our image sensor.
Therefore we need analog-to-digital and digital-to-analog converters, which are located on the
dSpace board. The most important part of our model is the S-function. With the RTW⁴
from Matlab 5.3 it is possible to generate a Variable Description file (TRC-file) for the whole
Simulink model. However, in this TRC-file only the input and output ports are specified; to
access variables that are used inside the C code and are not mapped to an output,
you have to write a user TRC-file. The TRC-file generated by RTW is named <model>.trc
and the user file must be named <model>_usr.trc. In this file you can specify the variables
that you want to access while the application is running. These variables must be declared
as global and non-static in the C code that describes the S-function. Note that an empty line
has to be inserted at the end of the Variable Description file to avoid an error message
caused by the TRC-file parser.
Figure 1.4: Image sensor Simulink model.
We next describe the main functionality of our S-function. We have implemented a state
machine with six different states that control the slower column counter, the faster row
counter, the enabling and disabling of the reference voltage and the read-in of the pixel voltages
into a two-dimensional array. Figure 1.5 shows a graphical representation of the state model.
After the initialization process the first column is selected in the first state and enabled in
the second. The third state ensures that the first pixels in the column have settled before
any readout. The fourth state is a wait state. After this state the image data are in their
specific variables and are ready to be read into the memory. In the fifth state the row counter
is incremented until the whole row is read. The last state disables the current row and a new
program cycle begins. The timing diagram in Fig. 1.5 gives more detailed information about
the handshake principle used in this state machine. During each new entry
into the program one state is executed. The sample time of the program, i.e. the length of one
state, can be set as a static variable in the C code. However, the time between two states
still depends on the complexity of the S-function, and we have no control over the time
of initialization and execution in the model's overall execution order. The time overhead for
Level 2⁵ S-functions on our dSpace board DS1102 is 2.8 µs.
The second part of our software, as already mentioned above, consists of several M-functions
that are listed in Table 1.2. These functions provide a convenient software package to
collect and process image data with the numerical Matlab tools and the Mlib/Mtrace
interface. For further information please use help <function> on the Matlab command line.
² C MEX S-functions are S-functions that are coded in C and then compiled by the MEX compiler.
The MEX compiler can be, e.g., a Borland C compiler and must be installed before its first use on
the Matlab command line.
³ ControlDesk is dSpace's experiment software that provides all the tools for controlling, monitoring, and
automating real-time experiments.
⁴ RTW is the abbreviation for Real-Time Workshop. The Real-Time Workshop, for use with Matlab and
Simulink, produces code directly from Simulink models and automatically builds programs that can be
executed in a variety of environments, including real-time systems and stand-alone simulations.
⁵ Level 2 S-functions are generated by Simulink 2.2 and always have a shorter time overhead than Level 1
S-functions generated by Simulink 2.1.
Figure 1.5: State machine implemented in a Simulink S-function.
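The six-state readout cycle can be sketched as follows. This is our own interpretation in Python, not the C S-function of this work: the state names, the `read_pixel` callback (a stand-in for sampling the ADC), the loop from the read state back to the wait state, and the assumption of one pixel per read step are all hypothetical.

```python
from enum import Enum, auto

N = 48  # the sensor has 48 x 48 = 2304 pixels


class State(Enum):
    SELECT = auto()
    ENABLE = auto()
    SETTLE = auto()
    WAIT = auto()
    READ = auto()
    DISABLE = auto()


def capture_frame(read_pixel, n=N):
    """Step the state machine until one full frame is stored.

    read_pixel(col, row) is a hypothetical stand-in for sampling the ADC.
    Returns the frame (indexed frame[col][row]) and the number of steps executed.
    """
    frame = [[0.0] * n for _ in range(n)]
    state, col, row, steps = State.SELECT, 0, 0, 0
    while col < n:
        steps += 1                      # one state is executed per program entry
        if state is State.SELECT:       # select the column, reset the fast counter
            row = 0
            state = State.ENABLE
        elif state is State.ENABLE:     # enable the selected column
            state = State.SETTLE
        elif state is State.SETTLE:     # let the first pixels settle
            state = State.WAIT
        elif state is State.WAIT:       # wait until the sample is available
            state = State.READ
        elif state is State.READ:       # store the sample, advance the fast counter
            frame[col][row] = read_pixel(col, row)
            row += 1
            state = State.WAIT if row < n else State.DISABLE
        else:                           # DISABLE: release the column, go to the next
            col += 1
            state = State.SELECT
    return frame, steps


frame, steps = capture_frame(lambda c, r: 10 * c + r, n=3)
print(steps, frame)
```

Under these assumptions each column costs 4 + 2n steps, so the number of program entries per frame grows with the square of the sensor size.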
| Function name | Input parameters | Output | Short description |
|---|---|---|---|
| initImager | reference voltage and switch threshold voltage | - | sets the reference voltage and the switch threshold voltage. |
| singleShot | - | image data in a column vector | captures a single image from the sensor which is located on the carrier board. |
| startTrackingMode | - | image data in a column vector | captures a single image after the switch on the carrier board has been actuated. |
| data2img | image data generated either by singleShot or by startTrackingMode | intensity image data in a double array | converts the input data to an intensity image, calls the function preProcessImage, and displays the image. |
| preProcessImage | intensity image data in a double array | corrected intensity image in a double array | limits the image data to the linear operating range of our image-sensor. |
| videoMode | recording time | video data in a 48 × 48 × k array, where k is the number of images | captures images during the time specified in the input argument. |
| data2video | video data collected by the function videoMode | multiframe image data in a 48 × 48 × 1 × k array, where k is the number of images | displays the frames recorded by the function videoMode in one picture and in a video animation. |
Table 1.2: Overview of our Mlib M-functions
2 Obscura
Measurement
2.1 Introduction to Fixed Pattern Noise (FPN)
Many integrated circuits rely on the assumption that devices drawn identically in the layout
also behave identically in reality. If this assumption nearly holds, they are
called well-matched devices. However, transistors, resistors, or capacitors with the same
geometrical dimensions usually differ from each other due to a spatial variation of the process
parameters. These mismatches result in imperfect current mirrors, operational
amplifiers with high offset voltages and, in the case of image sensors, non-uniform output
signals of the individual sensor pixels. Since the pixel variations are fixed but randomly
distributed across the chip, the image shows the so-called fixed pattern noise. For practical
use, the FPN has to be reduced to a value that is below the minimum intensity difference
to be detected.
The distribution of the mismatches between two supposedly identical CMOS devices is primarily
the result of two factors:
1. Variations in the location of the transistor, resistor, or capacitor edges resulting from
the limited imaging quality of the photolithographic process itself. This causes mismatches
in length and width and thus different electrical behaviour.
2. Variations of the process parameters like gate oxide thickness and doping concentration
across the wafer resulting from non-uniform conditions during the redeposition and
diffusion. These variations cause the sheet resistances and the threshold voltages of
the transistors to vary with distance across the die.
Rotating or mirroring CMOS structures causes additional mismatch because some process
parameters depend on the geometrical direction.
2.1.1 Transistor Mismatch in Weak Inversion
Logarithmic photodetectors, as used in our implementation, show a very high fixed pattern
noise due to the weak inversion mismatch. The following equations show the current-voltage
law of the subthreshold region, assuming the bulk-source potential V_BS to be zero and
V_DS ≫ V_t.
$$ I_D = I_{D0}\, \frac{W}{L}\, e^{\frac{V_{GS} - V_T}{n V_t}}, \tag{2.1} $$
$$ I_{D0} \simeq \beta\, \frac{L}{W}\, \frac{2 (n V_t)^2}{e^2}. \tag{2.2} $$
Here, β is the MOS transconductance and n is a process parameter (subthreshold slope
factor) which is typically between one and two: 1 ≤ n ≤ 2. The temperature potential V_t is
equal to kT/q and must not be mistaken for the threshold voltage V_T.¹
Due to the exponential relation between drain current and gate-source voltage, I_D varies
over a large dynamic range of several decades (fA to nA) when V_GS varies from 0 V to V_T.
Therefore, a high mismatch sensitivity, especially for variations of V_T, may be expected. The
standard deviation of the I_D mismatch can be derived from (2.1):
$$
\begin{aligned}
\sigma(I_D) &= \sqrt{ \left( \frac{\partial I_D}{\partial \beta} \right)^{2} \sigma^2(\beta) + \left( \frac{\partial I_D}{\partial V_T} \right)^{2} \sigma^2(V_T) } \\
&= \sqrt{ \left( \frac{2 (n V_t)^2}{e^2}\, e^{\frac{V_{GS} - V_T}{n V_t}} \right)^{2} \sigma^2(\beta) + \left( -\frac{2 \beta n V_t}{e^2}\, e^{\frac{V_{GS} - V_T}{n V_t}} \right)^{2} \sigma^2(V_T) }.
\end{aligned}
\tag{2.3}
$$
To obtain the relative error of current mismatch, the absolute error from equation (2.3)
is divided by the expression for I_D in equation (2.1):
$$ \frac{\sigma(I_D)}{I_D} = \sqrt{ \frac{\sigma^2(\beta)}{\beta^2} + \frac{\sigma^2(V_T)}{(n V_t)^2} }. \tag{2.4} $$
From the above relation we deduce that σ(I_D)/I_D does not depend on V_GS. Hence it is
constant in the complete subthreshold region.² On the other hand, the contribution of the
V_T mismatch is obvious. Since σ(V_T) is only divided by nV_t, which is of the order of 25 mV,
common threshold voltage variations of about 20 mV can lead to a drain current mismatch
of nearly 100%. Therefore, subthreshold devices have to be designed very carefully to reduce
the mismatch itself or at least its influence on the circuit behaviour.
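The size of this effect can be checked numerically from the relative-mismatch relation (2.4). The sketch below uses the two voltages quoted in the text and, as an assumption of ours, neglects the transconductance mismatch entirely; the function name is hypothetical.

```python
import math


def relative_current_mismatch(sigma_beta_rel, sigma_vt, n_vt):
    """Eq. (2.4): sigma(I_D)/I_D from relative beta and absolute V_T mismatch."""
    return math.sqrt(sigma_beta_rel ** 2 + (sigma_vt / n_vt) ** 2)


# 20 mV threshold mismatch, nV_t = 25 mV, beta mismatch neglected (assumed zero).
mismatch = relative_current_mismatch(0.0, 20e-3, 25e-3)
print(f"relative drain current mismatch: {mismatch:.0%}")
```

With these values the threshold term alone already gives a relative mismatch of about 0.8, i.e. of the same order as the drain current itself.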
Particularly in the case of image sensors, the subthreshold devices are frequently used as
logarithmic compressors. Thus the exponential current-voltage law is reversed into a logarithmic
voltage-current law, as used in our pixel circuit. Solving equation (2.1) for the gate
voltage V_GS gives
$$ V_{GS} = V_T + 2 n V_t - n V_t \ln \frac{2 \beta (n V_t)^2}{I_D}. \tag{2.5} $$
Since V_GS is a function of I_D, current ratios are converted into voltage differences. Hence
not the relative but the absolute error σ(V_GS) is the interesting magnitude for mismatch
considerations. Using equation (2.5) and following the derivation in equation (2.3) yields
$$ \sigma(V_{GS}) = \sqrt{ \sigma^2(V_T) + \left( \frac{n V_t}{\beta} \right)^{2} \sigma^2(\beta) }. \tag{2.6} $$
Here, the mismatch of V_GS directly depends on the threshold voltage variations σ(V_T).
Besides, it is independent of the drain current I_D and therefore constant in the complete
subthreshold region.
¹ The gate-source voltage for which the concentration of electrons under the gate is equal to the concentration
of holes in the p-substrate far from the gate is known as the transistor threshold voltage V_T.
² This statement is only valid in first approximation, because the equations and assumptions used are
approximations in many respects, too. For example, the parameter n also shows a slight mismatch, destroying
the independence of V_GS.
2.1.2 Fixed Pattern Noise Correction in Logarithmic Image Sensors
As discussed in the previous section, logarithmic photodetectors show a high FPN due to
weak inversion mismatch. Figure 2.1 shows the distribution of the pixels while our chip is
exposed to homogeneous light, and Figure 2.2 shows a bar plot of the homogeneously illuminated
pixels. The peak-to-peak variations of the pixel voltages are approximately 100 mV, which
corresponds to almost 2 decades of light intensity. It turns out to be more complicated to
correct these variations than in the case of integration-based image sensors.³ The reason is the
missing reset state of the continuously working receptors. There is no defined reference state
whose corresponding pixel signal could be subtracted from the intensity-dependent output
signal. Thus we have to apply other calibration concepts.
Figure 2.1: Distribution of the pixels of the homogeneously illuminated image sensor.
Figure 2.2: Bar plot of the pixels of the homogeneously illuminated image sensor.
³ A technique commonly used to eliminate fixed pattern noise in integration-based image sensors is the
correlated double sampling (CDS) method.
The most common way to correct the non-uniformities of logarithmic image sensors is
a digital correction method. Initially, the pixel errors are measured by illuminating
the sensor array homogeneously and are stored in a digital memory. During readout operation,
the actual pixel signals are converted into digital values and then digitally corrected according
to the stored pixel errors. A one-point (offset only), two-point (offset and slope) or even
higher-order calibration algorithm can be used. The digital fixed pattern noise correction
is usually carried out outside the chip, although it could be performed directly on
the sensor itself. However, an on-chip solution would require a large area for digital memory
and additional control logic, leading to larger chips and worse yield. Therefore, in the case of
a digital correction, the off-chip method is preferred. Another concept, which is
mainly based on a self-calibrating photoreceptor, is given in [16].
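The one-point and two-point schemes can be sketched in a few lines. This is a generic illustration under our own assumptions (frames as flat lists of pixel voltages, a linear pixel model, hypothetical function names), not the correction code used for the measurements in this work.

```python
def one_point_correction(raw, offsets):
    """One-point calibration: subtract each pixel's stored offset error."""
    return [v - o for v, o in zip(raw, offsets)]


def two_point_calibrate(low_frame, high_frame, low_target, high_target):
    """Per-pixel (gain, offset) from two homogeneously illuminated frames.

    Assumes every pixel responds differently to the two illuminations
    (high != low), otherwise the gain is undefined.
    """
    params = []
    for lo, hi in zip(low_frame, high_frame):
        gain = (high_target - low_target) / (hi - lo)
        params.append((gain, low_target - gain * lo))
    return params


def two_point_correction(raw, params):
    """Apply the stored gain and offset to every pixel of a raw frame."""
    return [g * v + o for v, (g, o) in zip(raw, params)]


# Toy sensor with three mismatched pixels: v = gain * s + offset (made-up values).
gains, offsets = [1.0, 1.1, 0.9], [0.0, 0.05, -0.03]


def sense(s):
    return [a * s + b for a, b in zip(gains, offsets)]


params = two_point_calibrate(sense(1.0), sense(3.0), 1.0, 3.0)
corrected = two_point_correction(sense(2.0), params)
print(corrected)
```

In this toy model the two-point correction removes the mismatch completely because the pixels are exactly linear; real pixels deviate from that model between and beyond the calibration points.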
2.2 Photoreceptor response and fixed pattern noise
The following sections describe the measured behavior of the photoreceptors with respect to
the incident light intensity J. Alternatively, the photometric quantity lux is often used.
It is adapted to the spectral sensitivity of the human eye. The exact correlation
between physical and photometric quantities is given in [16].
2.2.1 Response curves
The measurement of the photoreceptor response as a function of the light intensity yielded
a dynamic range of 7 decades. One pixel was illuminated with the red light (wavelength
632.8 nm) of a helium-neon laser. Figure 2.3 shows the response curve of a single photoreceptor.
The cell works linearly over more than 6 decades, as expected. The slope, i.e. the
voltage increase per intensity decade, amounts to 73 mV when the reference voltage VpsRef
is 1.1 V. Both results, the linear range and the slope, fit the simulation
of Part I, Chapter 2.1, very well.
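With a constant slope per decade, an output-voltage difference translates directly into an intensity ratio. The helper below is a small sketch that assumes an ideal logarithmic response with the measured slope of 73 mV per decade; the function name is ours.

```python
SLOPE_V_PER_DECADE = 0.073  # measured slope at VpsRef = 1.1 V


def intensity_ratio(delta_v, slope=SLOPE_V_PER_DECADE):
    """Intensity ratio corresponding to an output-voltage difference in volts."""
    return 10.0 ** (delta_v / slope)


print(intensity_ratio(0.073))   # one decade
print(intensity_ratio(0.146))   # two decades
```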
2.2.2 Remaining fixed pattern noise
We made several attempts to reduce the fixed pattern noise, but without any notable
success. Figure 2.4 visualizes such an FPN correction operation. The correction data are based
on several homogeneously illuminated pictures taken at different light intensities. For each pixel its
own response curve was approximated, and the corresponding correction information (offset
and gain) was calculated. The correction failed because the homogeneous images are not
sufficiently constant over time.
2.2.3 Slope variations
Due to slight variations of the subthreshold factor n introduced in section 2.1.1, the slope
of the photoreceptor response varies from pixel to pixel. These non-uniformities can make
a considerable contribution to the total fixed pattern noise. Their influence increases
with the distance between the calibration points and the actual illumination. For detailed
information about the fixed pattern noise correction see section 2.1.2.
Figure 2.5 shows the histogram of the individual slopes averaged over a range of 2 decades.
The values correspond to the 2304 pixels of our image sensor.
Figure 2.3: Response curves of the logarithmic cell.
Figure 2.4: Fixed pattern noise correction.
2.3 Complementing measurements
2.3.1 Crosstalk
The effective resolution of an image sensor depends not only on the number of pixels but
also on the crosstalk between adjacent pixels. In order to examine the crosstalk behavior, a
single pixel of the sensor was stimulated with a bright laser spot. Ideally, the stimulated
cell would show an increased output signal whereas all others would stay at a constant low
level. The results for a laser spot intensity 4 decades higher than the background intensity
are shown in Figure 2.6. Values more than 3 decades below the maximum were set to zero. The
measured signal drops by 2 decades of light intensity between the stimulated pixel and its
neighbours.
Figure 2.5: Distribution of the individual pixel slopes of our image sensor.
Figure 2.6: The crosstalk effect shows a decrease of 2 decades of light intensity between the
stimulated and the neighboring pixels.
2.3.2 Temporal Noise
So far we have considered only the fixed pattern noise, which refers to the mean output
signals of the individual pixels. However, the pixel signals have an additional noise
component, the temporal noise. To measure this noise we illuminated one pixel with a fixed
intensity and read out 500 frames at a frame rate of about 8 frames per second. The signal
was band-limited to 48 kHz⁴ by a simple RC filter. Figure 2.7 shows the distribution of our
temporal noise with its very low standard deviation of 850 µV.
⁴ With this frequency band we would be able to read out 50 frames per second.
Figure 2.7: Temporal noise distribution.
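To put the reported standard deviation into perspective, the following sketch converts a
voltage noise into a relative intensity error of a logarithmic pixel. It assumes that the
slope of 73 mV per decade from section 2.2.1 also applies to the pixel used here; the function
names and the simulated samples are hypothetical.

```python
import numpy as np

SLOPE_V_PER_DECADE = 0.073   # measured slope from section 2.2.1 (assumed to apply here)
SIGMA_V = 850e-6             # reported temporal-noise standard deviation

def temporal_noise(samples):
    """Return mean and sample standard deviation of one pixel's readings."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(), samples.std(ddof=1)

def noise_as_relative_intensity(sigma_v, slope_v_per_decade):
    """Relative intensity change that shifts a logarithmic pixel by sigma_v."""
    decades = sigma_v / slope_v_per_decade
    return 10.0 ** decades - 1.0

# 500 simulated readings of one pixel (Gaussian noise is an assumption).
rng = np.random.default_rng(1)
readings = rng.normal(loc=1.2, scale=SIGMA_V, size=500)
mean_v, sigma_v = temporal_noise(readings)

rel_error = noise_as_relative_intensity(SIGMA_V, SLOPE_V_PER_DECADE)
print(f"estimated sigma: {sigma_v * 1e6:.0f} uV")
print(f"850 uV corresponds to about {rel_error:.1%} of intensity")
```

Under these assumptions, one standard deviation of temporal noise corresponds to an intensity
uncertainty of a few percent.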
2.4 Spectral characteristic
Generally, the inner photoelectric effect of a semiconductor describes the absorption of
electromagnetic radiation: the photon's energy is transferred to an electron, which is lifted
to the conduction band. Depending on the material, this interaction occurs in different
wavelength ranges. See section 1.2 in Part I for further details.
Assume that a semiconductor is illuminated by a light source with the intensity $J_0$. The
intensity $dJ$ absorbed along the distance $dx$ is proportional to the local intensity $J(x)$:
\[ \frac{dJ}{dx} = -\alpha J(x). \tag{2.7} \]
The proportionality constant $\alpha$ is defined as the absorption coefficient. According to
Lambert's law of absorption, the light intensity decreases exponentially along the distance $x$:
\[ J(x) = J_0 e^{-\alpha x}. \tag{2.8} \]
The absorption coefficient $\alpha$ depends on the photon energy and therefore on the
wavelength $\lambda$, because $E = hc/\lambda$. For energies below the bandgap energy $E_g$,
which is the difference between the energy levels of valence band and conduction band,
photons do not possess enough energy to lift electrons to the conduction band. Silicon
therefore has a very low $\alpha$ and is nearly transparent for wavelengths larger than
1100 nm, corresponding to the silicon bandgap of $E_g = 1.12$ eV. Nevertheless, weak
absorption still occurs above 1100 nm. The reasons are states in the bandgap due to crystal
defects and absorption by free electrons brought to the conduction band by thermal excitation.
The spectral variation of $\alpha$ has consequences for the spectral sensitivity of silicon
photodetectors. Due to weak absorption, infrared and red light penetrates deeply into the
semiconductor crystal, whereas ultraviolet radiation is absorbed directly below the surface.
Therefore, infrared receptors have to extend deep into the substrate. Ultraviolet detectors,
which are very inefficient in silicon, must be located near the surface. Table 2.1 gives three
examples of the absorption length⁵ at a temperature of 300 K [22].
wavelength (nm)    photon energy (eV)    absorption length (µm)
400                3.10                  0.76
700                1.77                  4.5
1000               1.24                  110
Table 2.1: Absorption length in silicon at three different wavelengths.
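The relation $E = hc/\lambda$ and Lambert's law (2.8) can be checked against Table 2.1 with a
few lines of code. This is only an illustrative sketch: the function names are hypothetical,
and the depth of 1 µm is an arbitrary example value, not a dimension of our photodiode.

```python
import math

HC_EV_NM = 1239.84  # h*c expressed in eV*nm

# (wavelength in nm, absorption length in um), copied from Table 2.1
TABLE_2_1 = [(400, 0.76), (700, 4.5), (1000, 110.0)]

def photon_energy_ev(wavelength_nm):
    """Photon energy E = h*c / lambda in electron volts."""
    return HC_EV_NM / wavelength_nm

def remaining_fraction(depth_um, absorption_length_um):
    """J(x)/J0 = exp(-alpha*x), with alpha = 1 / absorption length."""
    return math.exp(-depth_um / absorption_length_um)

for wavelength, length in TABLE_2_1:
    absorbed = 1.0 - remaining_fraction(1.0, length)
    print(f"{wavelength} nm: E = {photon_energy_ev(wavelength):.2f} eV, "
          f"{absorbed:.1%} absorbed within the first 1 um")
```

The loop makes the statement above quantitative: blue light is largely absorbed within the
first micrometre, whereas light near 1000 nm passes almost unattenuated through such a
shallow layer.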
Figure 2.8 shows the overall spectral characteristic of our Cmos image-sensor. As expected,
the best responsivity is found around 600 nm. The spectral characteristic is limited to
wavelengths between 500 nm and 1100 nm because of the effects discussed earlier in this section.
Figure 2.8: Spectral characteristic of our image-sensor.
However, another important property of Cmos image-sensors can be derived from the
measured spectrum: the presence of interference phenomena due to the passivation
and oxide layers⁶ shown in Figure 2.10.
⁵ After one absorption length the incident intensity has decreased by the factor $e$.
⁶ The passivation layer is a protective layer, usually made of nitride, which is located on top
of the chip's layers. The oxide layers are used as insulation, e.g. between the metal1 and
metal2 layers. See also Figure 2.10.
Due to the very low extinction coefficient $k$ of SiO$_2$ and Si$_3$N$_4$ in the visible
spectrum, the absorption can be neglected. The extinction coefficient is defined as
\[ k = \frac{\alpha}{2\pi}\,\lambda, \tag{2.9} \]
where $\alpha$ is the absorption coefficient used in equation (2.8). However, we cannot ignore
the influence of reflection and interference. We used the quantitative approach from [9] to
calculate the influence of multilayer filters. The approach is called the method of resultant
waves or the $E^+$, $E^-$ matrix method. The boundary conditions associated with Maxwell's
equations are placed into a matrix equation format. To accomplish the reformulation, the
boundary conditions are manipulated so that the information about the angle of incidence
and the polarization is placed into an effective index of refraction:⁷
\[ r_{i,j} = \frac{n_j - n_i}{n_j + n_i}, \tag{2.10} \]
\[ t_{i,j} = \frac{2 n_i}{n_j + n_i}. \tag{2.11} \]
Here, $r_{i,j}$ is the reflection coefficient and $t_{i,j}$ is the transmission coefficient
between the media $i$ and $j$. The fields on each side of the boundary can then be
represented by plane waves incident normal to the interface. Figure 2.9 shows the geometry
for waves in a dielectric film.
Figure 2.9: Geometry for waves in a dielectric film.
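Each dielectric layer has two interfaces, which can be represented by two interface matrices,
e.g. $I_{0,1}$ and $I_{1,2}$. Equation (2.12) shows the interface matrix equation
\[ \begin{pmatrix} E_i^+ \\ E_i^- \end{pmatrix} =
\begin{pmatrix} \dfrac{1}{t_{i,j}} & \dfrac{r_{i,j}}{t_{i,j}} \\[2ex]
\dfrac{r_{i,j}}{t_{i,j}} & \dfrac{1}{t_{i,j}} \end{pmatrix}
\begin{pmatrix} E_j^+ \\ E_j^- \end{pmatrix}. \tag{2.12} \]
Normally, this formula is written in the more compact notation
\[ E_i = I_{i,j} E_j, \tag{2.13} \]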
⁷ To simplify the following derivation we assume that all waves are incident orthogonally to
the chip's surface, which is also the case in reality.
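where $I_{i,j}$ is the interface matrix
\[ I_{i,j} =
\begin{pmatrix} \dfrac{1}{t_{i,j}} & \dfrac{r_{i,j}}{t_{i,j}} \\[2ex]
\dfrac{r_{i,j}}{t_{i,j}} & \dfrac{1}{t_{i,j}} \end{pmatrix}, \tag{2.14} \]
and $E_i$ and $E_j$ are the waves in the first medium and the second medium.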
The problem of finding the values of $A^+$ and $A^-$ in Figure 2.9 is a simple propagation
problem. The fields $A^+$ and $E^-$ must be modified by the phase shift they experience after
propagating through the dielectric layer, here labeled 1:
\[ A^+ = e^{i\delta_1} E_1^+, \tag{2.15} \]
\[ E_1^- = e^{i\delta_1} A^-, \tag{2.16} \]
where $\delta_i$ is given by
\[ \delta_i = \frac{2\pi n_i d_i}{\lambda}. \tag{2.17} \]
Equations (2.15) and (2.16) can be combined into a matrix equation
\[ A = T_1 E_1, \tag{2.18} \]
which can be generalized for the $i$th dielectric layer by defining a transmission matrix of
the form
\[ T_i = \begin{pmatrix} e^{i\delta_i} & 0 \\ 0 & e^{-i\delta_i} \end{pmatrix}. \tag{2.19} \]
The effect of an $m$ layer dielectric film can be described by the matrix equation
\[ \begin{pmatrix} E_0^+ \\ E_0^- \end{pmatrix} = M
\begin{pmatrix} E_f^+ \\ E_f^- \end{pmatrix}, \tag{2.20} \]
where
\[ M = I_{0,1}\, T_1\, I_{1,2}\, T_2 \cdots I_{m-2,m-1}\, T_{m-1}\, I_{m-1,m}, \tag{2.21} \]
and the $E_f$ are the fields in the final medium. If we assume that $E_f^- = 0$, the
reflection of the stack is
\[ R = \left| \frac{M_{1,0}}{M_{0,0}} \right|^2, \tag{2.22} \]
and from energy conservation it follows that the transmission of the stack,
$\tau = I_{OUT}/I_{IN}$, is
\[ \tau = 1 - R. \tag{2.23} \]
Now we apply this approach to our layer configuration. Figure 2.10 shows a simplified
cross section of our diode. The layers relevant here are the nitride passivation and the
dielectric layers⁸.
Figure 2.10: Simplified cross section of our n+-substrate-diode showing the layers that can
potentially cause interferences.
We are now interested in the overall transmission $\tau$ at one wavelength. Figure 2.8
shows that the minimum transmission loss is at the wavelength $\lambda \approx 580$ nm.
Table 2.2 shows the refraction coefficients we used for the calculation.
parameter ($\lambda = 580$ nm)     value (dimensionless)
$n_0$, $n_3$                       1
$n_1 = n_{Si_3N_4}$                2.028
$n_2 = n_{SiO_2}$                  1.544
Table 2.2: Refraction coefficients for the overall transmission calculation [18].
The thicknesses⁹ $d_1$ and $d_2$ from Figure 2.10 are $d_1 = 1.1$ µm and $d_2 = 2.8$ µm,
respectively. The results of the calculation are presented in Table 2.3.
⁸ The dielectric layers are used as insulation layers, e.g. between the metal3 and metal2
layers that are used for wiring.
⁹ The thicknesses of the oxide and protection layers are process dependent and could also be
calculated from the wavelengths of two neighbouring sensitivity maxima at $\lambda_1$ and
$\lambda_2$.
constellation                        reflection $R$    transmission $\tau$
Si$_3$N$_4$ and SiO$_2$ layer        36%               64%
SiO$_2$ layer only                   2%                98%
Table 2.3: Results of the overall transmission calculation.
The calculation shows that the transmission depends strongly on the presence of the
Si$_3$N$_4$ nitride layer. For further implementations we should either omit the nitride layer
or choose another process with a SiO$_2$ passivation only.
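The matrix method of equations (2.10) to (2.23) can be sketched in a few lines of NumPy. This
is an illustrative reimplementation, not the code used for Table 2.3. It reads the
transmission coefficient of (2.11) as $t_{i,j} = 2n_i/(n_j + n_i)$, takes the refraction
coefficients of Table 2.2 and the thicknesses $d_1$ and $d_2$ given above, and assumes
lossless layers and normal incidence; all function names are hypothetical.

```python
import numpy as np

def interface_matrix(n_i, n_j):
    """Interface matrix I_{i,j} of equation (2.14) for normal incidence."""
    r = (n_j - n_i) / (n_j + n_i)      # equation (2.10)
    t = 2.0 * n_i / (n_j + n_i)        # equation (2.11)
    return np.array([[1.0, r], [r, 1.0]], dtype=complex) / t

def layer_matrix(n, d, wavelength):
    """Matrix T_i of equation (2.19); d and wavelength must share one unit."""
    delta = 2.0 * np.pi * n * d / wavelength      # equation (2.17)
    return np.array([[np.exp(1j * delta), 0.0],
                     [0.0, np.exp(-1j * delta)]])

def stack_reflection(indices, thicknesses, wavelength):
    """Reflection R of a layer stack; indices = [n_0, n_1, ..., n_final]."""
    if len(indices) != len(thicknesses) + 2:
        raise ValueError("need one index per layer plus the two outer media")
    m = interface_matrix(indices[0], indices[1])
    for k, d in enumerate(thicknesses, start=1):
        m = m @ layer_matrix(indices[k], d, wavelength)        # cross layer k
        m = m @ interface_matrix(indices[k], indices[k + 1])   # leave layer k
    return abs(m[1, 0] / m[0, 0]) ** 2                          # equation (2.22)

WAVELENGTH_UM = 0.580
r_both = stack_reflection([1.0, 2.028, 1.544, 1.0], [1.1, 2.8], WAVELENGTH_UM)
r_oxide = stack_reflection([1.0, 1.544, 1.0], [2.8], WAVELENGTH_UM)
print(f"nitride and oxide: R = {r_both:.3f}, tau = {1.0 - r_both:.3f}")
print(f"oxide only:        R = {r_oxide:.3f}, tau = {1.0 - r_oxide:.3f}")
```

Because the phase $\delta_i$ wraps several times over these thicknesses, the result is
sensitive to small changes of $d_i$, $n_i$ and $\lambda$; the sketch should therefore not be
expected to reproduce Table 2.3 digit for digit.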
3 Aladdin - Image Processing
We have presented the concept of a cellular neural network-universal machine (Cnn-Um) in
Part II, Chapter 4.1. For our smart image system we used such a machine, called Aladdin.¹
Aladdin allows a Cnn-Um to be programmed at a high level of abstraction and is very useful
during the prototyping phase. Figure 3.1 shows an example of a CNN operation. The template
used is presented in Part II, Chapter 2.3.1.
Figure 3.1: Linear spatial filter template processing on a captured picture. The filter has
a circularly symmetric low-pass characteristic. It is defined by A and B templates with
passband extending to $0.4\pi$ and stopband starting at $0.5\pi$.
¹ We bought Aladdin from the University of Budapest, Hungary.
4 Conclusion
4.1 Current state of the project
Within the scope of this thesis, the concept of a smart image system, using a Cmos image
sensor (Part I) and a Cnn-Um (Part II), has been tested. A first image-sensor with 48 x 48
pixels has been developed and examined with regard to its optical properties. Furthermore,
we presented a complete method to design CNN templates in the frequency domain using
genetic algorithms. Finally, we could present some results of the two systems Aladdin and
Obscura working together.
Unfortunately, the working range of our sensor is shifted by 4 decades of illumination
compared with several other implementations [16, 23]. That means that our imager is less
sensitive than expected; nevertheless, we were able to capture images.
With this work we provide a valuable basis for further research and development in the
field of smart image systems.
4.2 Post-script
The first part of our whole project included a lot of interesting work. For the first time we
had the chance to bring our ideas to silicon. The chip makes technology very tangible by
generating pictures of the world.
Thanks to the second preliminary study we gained a good insight into the theory of
cellular neural networks and their applications. We analyzed a lot of reports and got
in contact with the Universities of Budapest and Berkeley.
Finally, the bachelor's thesis gave us the possibility to capture our own pictures with
Obscura and to process them on Aladdin.
Matthias Bettler & Samuel Zahnd
Biel/Bienne, School of Engineering and Architecture
Appendices
A The Layout Structure
Figure A.1: This picture shows the hierarchical decomposition of the 48 × 48 pixel image
sensor layout (analog part), in which the Analog Core represents the top level design.
B Analysis of the pixel circuit
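The function of the pixel circuit (Figure B.1) is described mathematically below, using the
following notation:
$I_{D1} \ldots I_{D3}$    drain-source currents of the transistors Q1 to Q3.
$I_D$                     photodiode current (photocurrent).
$V_D$                     reverse voltage of the photodiode.
$V_{PS}$                  RowPreselect voltage.
Figure B.1: Schematic of the pixel circuit.
device                        basic equation
Q2 at saturation              $I_{D2} = \frac{\beta_p}{2}\,(V_{DD} - V_{PS} - V_{tp})^2$
Q3 at saturation              $I_{D3} = \frac{\beta_n}{2}\,(V_D - V_{tn})^2$
Q1 at weak inversion          $I_{D1} = \frac{W_1}{L_1}\, I_{D0}\, e^{(V_{out} - V_1)/(n V_t)}$
Photodiode, reverse biased    $-I_D = I_{ph}\,(e^{-V_D/(n V_t)} - 1) \approx -I_{ph}$, with $I_{ph} \propto$ intensity
Equating the currents of Q2 and Q3 gives the photodiode voltage:
\[ I_{D2} = I_{D3} \;\Rightarrow\;
\frac{\beta_p}{2}\,(V_{DD} - V_{PS} - V_{tp})^2 = \frac{\beta_n}{2}\,(V_D - V_{tn})^2 \]
\[ V_D = \sqrt{\frac{\beta_p}{\beta_n}}\,(V_{DD} - V_{PS} - V_{tp}) + V_{tn} \]
Equating the current of Q1 and the photocurrent, with $V_1 = V_D$, gives the output voltage:
\[ I_{D1} = I_D \;\Rightarrow\;
\frac{W_1}{L_1}\, I_{D0}\, e^{(V_{out} - V_1)/(n V_t)} = I_{ph} \]
\[ \begin{aligned}
V_{out} &= n V_t \ln\!\left( \frac{I_{ph}}{\frac{W_1}{L_1} I_{D0}} \right) + V_D \\
&= n V_t \ln(I_{ph}) - n V_t \ln\!\left( \frac{W_1}{L_1} I_{D0} \right)
+ \sqrt{\frac{\beta_p}{\beta_n}}\,(V_{DD} - V_{PS} - V_{tp}) + V_{tn} \\
&= \underbrace{n V_t\, \frac{1}{\log(e)}}_{89\ \mathrm{mV/dec\ with}\ n = 1.5} \log(I_{ph}) + V_A ,
\end{aligned} \]
where $V_A$ collects all terms that do not depend on the photocurrent.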
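The slope factor $n V_t / \log(e) = n V_t \ln(10)$ in the last line can be evaluated
numerically. The sketch below assumes a temperature of 300 K, which is not stated in this
appendix; the function names are hypothetical. It also inverts the relation to show which
subthreshold factor the measured slope of 73 mV per decade from section 2.2.1 would imply
under the same assumption.

```python
import math

BOLTZMANN = 1.380649e-23             # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C

def thermal_voltage(temperature_k=300.0):
    """Thermal voltage V_t = k*T/q in volts (300 K is an assumption)."""
    return BOLTZMANN * temperature_k / ELEMENTARY_CHARGE

def slope_per_decade(n, temperature_k=300.0):
    """Output change per decade of photocurrent: n * V_t * ln(10)."""
    return n * thermal_voltage(temperature_k) * math.log(10.0)

def implied_subthreshold_factor(slope_v, temperature_k=300.0):
    """Subthreshold factor n that would produce the given slope."""
    return slope_v / (thermal_voltage(temperature_k) * math.log(10.0))

print(f"n = 1.5 gives {slope_per_decade(1.5) * 1e3:.1f} mV/decade")
print(f"73 mV/decade implies n = {implied_subthreshold_factor(0.073):.2f}")
```

With $n = 1.5$ the sketch yields a value that rounds to the 89 mV per decade quoted above.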
C Content of the CD-ROM
All files were split across two CD-ROMs and arranged in the following structure.
disk⟨1⟩  data/
    preliminary study 1/    Image sensing related documents.
    preliminary study 2/    Image processing related documents.
    bachelor's thesis/      Smart Image System related documents.
disk⟨2⟩  over head/
    documentation/          Report, poster and abstract.
    presentation/           Presentation slides used in Biel and Istanbul.
    web page/               Web page of the Smart Image Sensor project.
List of Figures
0.1 Smart Image System.
0.2 Project time schedule.
0.3 Job scheduling.
1.1 Cross-section of a primate retina.
1.2 Operation of the p-n photodiode.
1.3 Photocurrent versus Irradiance.
2.1 Schematic of the circuit for one pixel.
2.2 Simulated pixel circuit.
2.3 Layout of the pixel circuit.
2.4 Blockdiagram of the I/O system on chip.
2.5 The whole chip.
2.6 Bonding plan.
3.1 Design-flow of the Digital Part.
3.2 Design-flow for place and route of our design.
1.1 Network structure.
1.2 The unity gain output function.
1.3 A block diagram showing the standard CNN.
1.4 The CNN cell schematic.
2.1 Sampled low-pass filter characteristic.
2.2 Low-pass filter.
2.3 Improved low-pass filter.
2.4 2-D low-pass filter characteristic.
3.1 The two-point crossover operator.
3.2 Graphical representation of a genetic algorithm.
4.1 The CNN universal machine: global architecture.
1.1 Data-flow of Obscura.
1.2 Electrical Schematic of the PCB for OBSCURA.
1.3 General Simulink model.
1.4 Image sensor Simulink model.
1.5 State machine implemented in a Simulink S-function.
2.1 Distribution of the pixels of the homogeneously illuminated image sensor.
2.2 Bar plot of the pixels of the homogeneously illuminated image sensor.
2.3 Response curves of the logarithmic cell.
2.4 Fixed pattern noise correction.
2.5 Distribution of the individual pixel slopes of our image sensor.
2.6 Crosstalk.
2.7 Temporal noise distribution.
2.8 Spectral characteristic of our image-sensor.
2.9 Geometry for waves in a dielectric film.
2.10 Cross section of our photo-diode.
3.1 Spatial filter template processing.
A.1 Hierarchical tree of layout designs.
B.1 Schematic of the pixel circuit.
List of Tables
2.1 Optimal sizes of the Mos-Fet for the pixel circuit.
2.2 Chip features and description.
1.1 Simulink simulation stages.
1.2 Overview over our Mlib M-functions.
2.1 Absorption length in silicon at three different wavelengths.
2.2 Refraction coefficients for the overall transmission calculation.
2.3 Results of the overall transmission calculation.
Bibliography
[1] Leon O. Chua. Global unfolding of Chua's circuit. IEICE Trans. on Fundamentals
Electron. Commun., Comp. Sci., Vol. E76-A, May 1993.
[2] Leon O. Chua and Tamás Roska. The CNN paradigm. IEEE Transactions on Circuits
and Systems-I, Vol. 40(3):147-156, March 1993.
[3] Leon O. Chua, Tamás Roska, and Péter Venetianer. The CNN is as universal as the
Turing machine. IEEE Transactions on Circuits and Systems-I, Vol. 40(4):289-291,
April 1993.
[4] Leon O. Chua and Lin Yang. Cellular neural networks: Applications. IEEE
Transactions on Circuits and Systems-I, Vol. 35(10):1273-1290, October 1988.
[5] Leon O. Chua and Lin Yang. Cellular neural networks: Theory. IEEE Transactions on
Circuits and Systems-I, Vol. 35(10):1257-1272, October 1988.
[6] Kenneth R. Crounse and Leon O. Chua. Methods for image processing and pattern
formation in cellular neural networks: A tutorial. IEEE Transactions on Circuits and
Systems-I, Vol. 42(10):583-601, October 1995.
[7] David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine
Learning. Addison-Wesley Publishing Company, 1989.
[8] Michel Goossens, Frank Mittelbach, and Alexander Samarin. The LaTeX Companion.
Addison-Wesley Publishing Co., Reading, Mass., 1994.
[9] Robert Guenther. Modern Optics. John Wiley & Sons, Inc., 1990.
[10] Martin Hänggi. Analysis, Design, and Optimization of Cellular Neural Networks. ETH
Zürich, 1999.
[11] Donald E. Knuth. The TeXbook, volume A of Computers and Typesetting.
Addison-Wesley Publishing Co., Reading, Mass., second edition, 1984.
[12] Tibor Kozek, Leon O. Chua, Tamás Roska, Dietrich Wolf, Ronald Tetzlaff, Frank
Puffer, and Károly Lotz. Simulating nonlinear waves and partial differential equations via
CNN, Part II: Basic techniques. IEEE Transactions on Circuits and Systems-I, Vol.
42(10):816-820, October 1995.
[13] Tibor Kozek, Tamás Roska, and Leon O. Chua. Genetic algorithm for CNN template
learning. IEEE Transactions on Circuits and Systems-I, Vol. 40(6):392-402, June 1993.
[14] Leslie Lamport. LaTeX: A Document Preparation System. Addison-Wesley Publishing
Co., Reading, Mass., second edition, 1994.
[15] Drahoslav Lím. Implementation of a Programmable, Modularly Extendable
Cellular-Neural-Network Signal Processor. ETH Zürich, 1999.
[16] Markus Loose. A self-calibrating CMOS image sensor with logarithmic response.
Technical report, Institut für Hochenergiephysik, Universität Heidelberg, 1999.
[17] Carver Mead. Analog VLSI and Neural Systems. Addison-Wesley Publishing
Company, 1989.
[18] Edward D. Palik. Handbook of Optical Constants of Solids I. Academic Press, Inc.
[19] Tamás Roska, Ákos Zarándy, Sándor Zöld, Péter Földesy, and Péter Szolgay. The
computational infrastructure of analogic CNN computing, Part I: The CNN-UM chip
prototyping system. IEEE Transactions on Circuits and Systems-I, Vol. 46(2):261-268,
February 1999.
[20] Tamás Roska and Leon O. Chua. The CNN universal machine: An analogic array
computer. IEEE Transactions on Circuits and Systems-II, Vol. 40(3):163-173, March
1993.
[21] Tamás Roska, Leon O. Chua, Dietrich Wolf, Tibor Kozek, Ronald Tetzlaff, and Frank
Puffer. Simulating nonlinear waves and partial differential equations via CNN, Part I:
Basic techniques. IEEE Transactions on Circuits and Systems-I, Vol. 42(10):807-815,
October 1995.
[22] Peter Schneider. Simulation und Visualisierung elektrischer und optischer
Eigenschaften von Halbleiterbauelementen. Technical report, Institut für Physik und
Astronomie, Universität Heidelberg, 1998.
[23] Markus Schwarz, Ralf Hauschild, Bedrich J. Hosticka, Jürgen Huppertz, Thorsten
Kneip, Stephan Kolnsberg, Lutz Ewe, and Hoc Kheim Trieu. Single-chip CMOS image
sensors for a retina implant system. IEEE Transactions on Circuits and Systems-I, Vol.
46(7):870-877, July 1999.
[24] John M. Senior. Optical Fiber Communications: Principles and Practice. Prentice
Hall Europe, 1992.
[25] Olivier Vietze. Active pixel image sensors with application specific performance based
on standard silicon CMOS processes. ETH Zürich, 1997.