Saint Louis University

12 July 2007



SAINT LOUIS UNIVERSITY

SCHOOL OF BUSINESS & ADMINISTRATION

An individual project within MISB-420-0

Author:        Daniel Klerfors   

Professor:  Dr Terry L. Huston

St. Louis, November 1998

Artificial Neural Networks

What are they?

How do they work?

In what areas are they used?

Table of contents

1. Purpose

1.1 Method

2. What are Artificial Neural Networks?

2.1 The Analogy to the Brain

2.1.1 The Biological Neuron

2.1.2 The Artificial Neuron

2.2 Design

2.2.1 Layers

2.2.2 Communication and types of connections

2.2.2.1 Inter-layer connections

2.2.2.2 Intra-layer connections

2.2.3 Learning

2.2.3.1 Off-line or On-line

2.2.3.2 Learning laws

2.3 Where are Neural Networks being used?

References

1 Purpose

This report is intended to help the reader understand what Artificial Neural Networks are, how they work, and where they are currently being used. The project is the result of an assignment in AI. The report is non-technical: it does not go into depth with mathematical formulas, but tries to give a more general understanding.

1.1 Method

To achieve these objectives, the report takes a descriptive approach. It draws on secondary data gathered by studying and reviewing books and Internet publications, and on information from the AI lectures taught by Dr. Terry L. Huston.

2 What are Artificial Neural Networks?

An Artificial Neural Network is a system loosely modeled on the human brain. The field goes by many names, such as connectionism, parallel distributed processing, neuro-computing, natural intelligent systems, machine learning algorithms, and artificial neural networks. It is an attempt to simulate, within specialized hardware or sophisticated software, multiple layers of simple processing elements called neurons. Each neuron is linked to certain of its neighbors with varying coefficients of connectivity that represent the strengths of these connections. Learning is accomplished by adjusting these strengths so that the overall network outputs appropriate results.

2.1 The Analogy to the Brain

The most basic components of neural networks are modeled after the structure of the brain. Some neural network structures do not closely resemble the brain, and some have no biological counterpart at all. However, neural networks have a strong similarity to the biological brain, and therefore a great deal of the terminology is borrowed from neuroscience.

2.1.1 The Biological Neuron

The most basic element of the human brain is a specific type of cell, which provides us with the abilities to remember, think, and apply previous experiences to our every action. These cells are known as neurons; each of them can connect with up to 200,000 other neurons. The power of the brain comes from the number of these basic components and the multiple connections between them.

All natural neurons have four basic components: dendrites, soma, axon, and synapses. Basically, a biological neuron receives inputs from other sources, combines them in some way, performs a generally nonlinear operation on the result, and then outputs the final result. The figure below shows a simplified biological neuron and the relationship of its four components.

2.1.2 The Artificial Neuron

The basic unit of neural networks, the artificial neuron, simulates the four basic functions of natural neurons. Artificial neurons are much simpler than the biological neuron; the figure below shows the basics of an artificial neuron.

Note that the various inputs to the network are represented by the mathematical symbol x(n). Each of these inputs is multiplied by a connection weight; these weights are represented by w(n). In the simplest case, these products are simply summed, fed through a transfer function to generate a result, and then output.

Even though all artificial neural networks are constructed from this basic building block, the details of the building block vary from one network type to another.
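The simplest case described above can be sketched in a few lines of Python. This is only an illustration: the function name neuron_output, the choice of tanh as the transfer function, and the sample numbers are assumptions, not something taken from the report's sources.

```python
import math

def neuron_output(inputs, weights, transfer=math.tanh):
    """Multiply each input x(n) by its weight w(n), sum, apply the transfer function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights))
    return transfer(total)

# Hypothetical example values: three inputs and three connection weights.
x = [1.0, 0.5, -1.0]
w = [0.2, 0.4, 0.1]
y = neuron_output(x, w)  # tanh of the weighted sum 0.2 + 0.2 - 0.1
```

Passing a different function as transfer (for example a step or a sigmoid) changes the neuron's behavior without changing the summing step.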

2.2 Design

The developer must go through a period of trial and error in the design decisions before coming up with a satisfactory design. The design issues in neural networks are complex and are the major concerns of system developers.

Designing a neural network consists of:

Arranging neurons in various layers.

Deciding the type of connections among neurons for different layers, as well as among the neurons within a layer.

Deciding the way a neuron receives input and produces output.

Determining the strength of connections within the network by allowing the network to learn the appropriate values of the connection weights from a training data set.

The process of designing a neural network is an iterative process; the figure below describes its basic steps.

2.2.1 Layers

Biologically, neural networks are constructed in a three dimensional way from microscopic components. These neurons seem capable of nearly unrestricted interconnections. This is not true in any man-made network. Artificial neural networks are the simple clustering of the primitive artificial neurons. This clustering occurs by creating layers, which are then connected to one another. How these layers connect may also vary. Basically, all artificial neural networks have a similar structure or topology. Some of the neurons interface with the real world to receive its inputs, and other neurons provide the real world with the network's outputs. All the rest of the neurons are hidden from view.

As the figure above shows, the neurons are grouped into layers. The input layer consists of neurons that receive input from the external environment. The output layer consists of neurons that communicate the output of the system to the user or external environment. There are usually a number of hidden layers between these two layers; the figure above shows a simple structure with only one hidden layer.

When the input layer receives the input, its neurons produce output, which becomes input to the other layers of the system. The process continues until a certain condition is satisfied or until the output layer is invoked and fires its output to the external environment.

To determine the number of hidden neurons the network needs to perform at its best, one is often left with trial and error. If you increase the number of hidden neurons too much you will get an overfit; that is, the net will have problems generalizing. The training set of data will be memorized, making the network useless on new data sets.
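The flow of signals from the input layer through a hidden layer to the output layer can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the tanh transfer function, and the weight values are all hypothetical, and the input layer is assumed simply to pass its values on.

```python
import math

def layer_forward(inputs, weight_rows):
    """Compute one layer's outputs; each row holds one neuron's incoming weights."""
    return [math.tanh(sum(x * w for x, w in zip(inputs, row)))
            for row in weight_rows]

def network_forward(inputs, layers):
    """Feed the signal through the layers in order; each output becomes the next input."""
    signal = inputs
    for weight_rows in layers:
        signal = layer_forward(signal, weight_rows)
    return signal

# Hypothetical network: 2 inputs, 3 hidden neurons, 1 output neuron.
hidden = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
output = [[0.7, -0.4, 0.9]]
result = network_forward([1.0, 0.0], [hidden, output])
```

Adding more hidden layers only means adding more weight tables to the list passed to network_forward.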

2.2.2 Communication and types of connections

Neurons are connected via a network of paths carrying the output of one neuron as input to another neuron. These paths are normally unidirectional; there might, however, be a two-way connection between two neurons, because there may be another path in the reverse direction. A neuron receives input from many neurons, but produces a single output, which is communicated to other neurons.

The neurons in a layer may communicate with each other, or they may not have any connections. The neurons of one layer are always connected to the neurons of at least one other layer.

2.2.2.1 Inter-layer connections

Different types of connections are used between layers; these are called inter-layer connections.

Fully connected

Each neuron on the first layer is connected to every neuron on the second layer.

Partially connected

A neuron of the first layer does not have to be connected to all neurons on the second layer.

Feed forward

The neurons on the first layer send their output to the neurons on the second layer, but they do not receive any input back from the neurons on the second layer.

Bi-directional

There is another set of connections carrying the output of the neurons of the second layer into the neurons of the first layer.

Feed forward and bi-directional connections could be fully- or partially connected.

Hierarchical

If a neural network has a hierarchical structure, the neurons of a lower layer may only communicate with neurons on the next layer.

Resonance

The layers have bi-directional connections, and they can continue sending messages across the connections a number of times until a certain condition is achieved.
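One way to picture the difference between fully and partially connected layers is a 0/1 connection mask placed over the weight table. The sketch below is an illustration only; the mask representation, the function name masked_sum, and the numbers are assumptions chosen for the example.

```python
def masked_sum(inputs, weights, mask):
    """Weighted sums for a second layer.

    mask[j][i] == 1 means first-layer neuron i feeds second-layer neuron j;
    a 0 means that connection does not exist.
    """
    return [sum(x * w * m for x, w, m in zip(inputs, w_row, m_row))
            for w_row, m_row in zip(weights, mask)]

inputs = [1.0, 2.0, 3.0]                      # outputs of three first-layer neurons
weights = [[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]  # two second-layer neurons

full_mask = [[1, 1, 1], [1, 1, 1]]     # fully connected
partial_mask = [[1, 1, 0], [0, 1, 1]]  # partially connected

full = masked_sum(inputs, weights, full_mask)        # [3.0, 6.0]
partial = masked_sum(inputs, weights, partial_mask)  # [1.5, 5.0]
```

A feed-forward arrangement uses one such table from the first layer to the second; a bi-directional arrangement would add a second table carrying outputs in the opposite direction.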

2.2.2.2 Intra-layer connections

In more complex structures the neurons communicate among themselves within a layer; these are known as intra-layer connections. There are two types of intra-layer connections.

Recurrent

The neurons within a layer are fully or partially connected to one another. After these neurons receive input from another layer, they communicate their outputs with one another a number of times before they are allowed to send their outputs to another layer. Generally, some condition among the neurons of the layer must be achieved before they communicate their outputs to another layer.

On-center/off surround

A neuron within a layer has excitatory connections to itself and its immediate neighbors, and inhibitory connections to other neurons. One can imagine this type of connection as a competitive gang of neurons. Each gang excites itself and its gang members and inhibits all members of other gangs. After a few rounds of signal interchange, the neurons with an active output value win and are allowed to update their own and their gang members' weights. (There are two types of connections between two neurons: excitatory and inhibitory. In an excitatory connection, the output of one neuron increases the action potential of the neuron to which it is connected. When the connection between two neurons is inhibitory, the output of the neuron sending a message reduces the activity or action potential of the receiving neuron. One causes the summing mechanism of the next neuron to add while the other causes it to subtract. One excites while the other inhibits.)
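The rounds of signal interchange can be illustrated with a simplified competition in which every neuron keeps its own activity and is inhibited by all the others. This is a sketch of a winner-take-all scheme under assumptions: it leaves out the excitation of immediate neighbors, and the function name, inhibition strength, and starting activities are hypothetical.

```python
def compete(activations, inhibition=0.2, max_rounds=100):
    """Repeat mutual inhibition until at most one neuron remains active.

    The inhibition strength is assumed to be smaller than 1 / (number of neurons),
    so that the strongest neuron is not driven to zero along with the rest.
    """
    a = [max(0.0, v) for v in activations]
    for _ in range(max_rounds):
        if sum(1 for v in a if v > 0.0) <= 1:
            break
        total = sum(a)
        # Each neuron is reduced in proportion to the summed activity of the others.
        a = [max(0.0, v - inhibition * (total - v)) for v in a]
    return a

final = compete([0.2, 0.5, 0.4, 0.1])
winner = max(range(len(final)), key=lambda i: final[i])
```

In this example the neuron that started with the highest activity (index 1) is the only one left active after a few rounds.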

2.2.3 Learning

The brain basically learns from experience. Neural networks are sometimes called machine learning algorithms, because changing their connection weights (training) causes the network to learn the solution to a problem. The strength of the connection between two neurons is stored as a weight value for that specific connection. The system learns new knowledge by adjusting these connection weights.

The learning ability of a neural network is determined by its architecture and by the algorithmic method chosen for training.

The training method usually consists of one of three schemes:

Unsupervised learning

The hidden neurons must find a way to organize themselves without help from the outside. In this approach, no sample outputs are provided to the network against which it can measure its predictive performance for a given vector of inputs. This is learning by doing.

Reinforcement learning

This method works on reinforcement from the outside. The connections among the neurons in the hidden layer are randomly arranged, then reshuffled as the network is told how close it is to solving the problem. Reinforcement learning is sometimes classed as a form of supervised learning, because it requires a teacher. The teacher may be a training set of data or an observer who grades the performance of the network results.

Both unsupervised and reinforcement learning suffer from relative slowness and inefficiency, because they rely on random shuffling to find the proper connection weights.

Back propagation

This method has proven highly successful in the training of multilayered neural nets. The network is not just given reinforcement for how it is doing on a task. Information about errors is also filtered back through the system and is used to adjust the connections between the layers, thus improving performance. Back propagation is a form of supervised learning.
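To make the idea of filtering error information back through the layers concrete, here is a minimal sketch of back propagation for a small network with one hidden layer. Everything specific in it is an assumption made for illustration: the sigmoid transfer function, the XOR training data, the network size, the learning rate, and the function names.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, w_hidden, w_out):
    # Each weight row ends with a bias weight, fed by a constant input of 1.
    xb = x + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]
    out = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
    return xb, hb, out

def train_step(x, target, w_hidden, w_out, rate):
    xb, hb, out = forward(x, w_hidden, w_out)
    # Error signal at the output neuron (squared error, sigmoid derivative).
    delta_out = (target - out) * out * (1.0 - out)
    # Pass the error back to each hidden neuron through its outgoing weight.
    delta_hidden = [delta_out * w_out[j] * hb[j] * (1.0 - hb[j])
                    for j in range(len(w_hidden))]
    for j in range(len(w_out)):
        w_out[j] += rate * delta_out * hb[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += rate * delta_hidden[j] * xb[i]

def mean_squared_error(data, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[2]) ** 2 for x, t in data) / len(data)

random.seed(0)
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
w_out = [random.uniform(-1, 1) for _ in range(4)]

error_before = mean_squared_error(data, w_hidden, w_out)
for _ in range(5000):
    for x, t in data:
        train_step(x, t, w_hidden, w_out, rate=0.5)
error_after = mean_squared_error(data, w_hidden, w_out)
```

Comparing error_before with error_after shows whether training reduced the error on the training set; how low the error ends up depends on the random starting weights.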

2.2.3.1 Off-line or On-line

One can also categorize learning methods as off-line or on-line. When the system uses input data to change its weights and learn the domain knowledge, it is in training mode, also called learning mode. When the system is being used as a decision aid to make recommendations, it is in operation mode; this is also sometimes called recall.

Off-line

In off-line learning methods, once the system enters operation mode, its weights are fixed and do not change any more. Most networks are of the off-line learning type.

On-line

In on-line or real time learning, when the system is in operating mode (recall), it continues to learn while being used as a decision tool. This type of learning has a more complex design structure.
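The difference between the two modes can be sketched with a single linear neuron whose weights either stay frozen during recall or keep adapting. The class name, the online flag, the feedback argument, and the error-correcting update used here are assumptions made for the example, not a description of any particular system.

```python
class LinearNeuron:
    """A single linear neuron that can keep learning during recall if online is True."""

    def __init__(self, weights, rate=0.1, online=False):
        self.weights = list(weights)
        self.rate = rate
        self.online = online

    def recall(self, inputs, feedback=None):
        output = sum(w * x for w, x in zip(self.weights, inputs))
        # Off-line: weights are fixed. On-line: adjust them when feedback is available.
        if self.online and feedback is not None:
            error = feedback - output
            self.weights = [w + self.rate * error * x
                            for w, x in zip(self.weights, inputs)]
        return output

offline_neuron = LinearNeuron([0.5, 0.5], online=False)
online_neuron = LinearNeuron([0.5, 0.5], online=True)

offline_neuron.recall([1.0, 1.0], feedback=2.0)  # weights stay [0.5, 0.5]
online_neuron.recall([1.0, 1.0], feedback=2.0)   # weights move toward the feedback
```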

2.2.3.2 Learning laws

There are a variety of learning laws in common use. These laws are mathematical algorithms used to update the connection weights. Most of them are some sort of variation of the best known and oldest learning law, Hebb's Rule. Man's understanding of how neural processing actually works is very limited. Learning is certainly more complex than the simplification represented by the learning laws currently developed. Research into different learning functions continues, as new ideas routinely show up in trade publications. A few of the major laws are given as examples below.

Hebb's Rule

The first and best known learning rule was introduced by Donald Hebb. The description appeared in his book The Organization of Behavior in 1949. The basic rule is: if a neuron receives an input from another neuron, and if both are highly active (mathematically, have the same sign), the weight between the neurons should be strengthened.
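A common way to write this rule down is to change the weight in proportion to the product of the two activities. The sketch below uses that formulation as an assumption; the function name and learning rate are hypothetical. Note that this product form also weakens the weight when the signs differ, which goes slightly beyond the rule as stated above.

```python
def hebb_update(weight, pre, post, rate=0.1):
    """Strengthen the weight when the sending (pre) and receiving (post) activities share a sign."""
    return weight + rate * pre * post

both_active = hebb_update(0.5, 1.0, 1.0)      # same sign: weight grows
both_negative = hebb_update(0.5, -1.0, -1.0)  # same sign: weight grows
opposite = hebb_update(0.5, 1.0, -1.0)        # opposite signs: weight shrinks
```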

Hopfield Law

This law is similar to Hebb's Rule with the exception that it specifies the magnitude of the strengthening or weakening. It states, “if the desired output and the input are both active or both inactive, increment the connection weight by the learning rate, otherwise decrement the weight by the learning rate.” (Most learning functions have some provision for a learning rate, or a learning constant. Usually this term is positive and between zero and one.)
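The quoted statement translates almost directly into code. In this sketch the function name, the use of booleans for active and inactive, and the learning rate value are assumptions for illustration.

```python
def hopfield_update(weight, input_active, desired_active, rate=0.1):
    """Increment the weight by the learning rate if input and desired output agree, else decrement."""
    if input_active == desired_active:
        return weight + rate
    return weight - rate

agree = hopfield_update(0.0, True, True)      # both active: +rate
also_agree = hopfield_update(0.0, False, False)  # both inactive: +rate
disagree = hopfield_update(0.0, True, False)  # mismatch: -rate
```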

The Delta Rule

The Delta Rule is a further variation of Hebb's Rule, and it is one of the most commonly used. This rule is based on the idea of continuously modifying the strengths of the input connections to reduce the difference (the delta) between the desired output value and the actual output of a neuron. The rule changes the connection weights in the way that minimizes the mean squared error of the network. The error is back-propagated into previous layers one layer at a time. The process of back-propagating the network errors continues until the first layer is reached. The network type called feed-forward back-propagation derives its name from this method of computing the error term.

This rule is also referred to as the Widrow-Hoff Learning Rule and the Least Mean Square Learning Rule.

Kohonen's Learning Law
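For a single neuron with a linear output, the Delta Rule can be sketched as below. The function name, the learning rate, and the training samples are assumptions; the samples were generated from the hypothetical relation target = 2 * x1 - x2 so that the learned weights can be checked by eye.

```python
def delta_rule_step(weights, inputs, target, rate=0.1):
    """Move each weight in proportion to the error (the delta) and its own input."""
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    return [w + rate * error * x for w, x in zip(weights, inputs)]

# Hypothetical training samples consistent with target = 2 * x1 - x2.
samples = [([0.0, 1.0], -1.0), ([1.0, 0.0], 2.0), ([1.0, 1.0], 1.0)]

weights = [0.0, 0.0]
for _ in range(200):
    for inputs, target in samples:
        weights = delta_rule_step(weights, inputs, target)
```

After repeated passes over the samples the weights settle close to 2 and -1, the values that make the error zero on every sample.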

This procedure, developed by Teuvo Kohonen, was inspired by learning in biological systems. In this procedure, the neurons compete for the opportunity to learn, or to update their weights. The processing neuron with the largest output is declared the winner and has the capability of inhibiting its competitors as well as exciting its neighbors. Only the winner is permitted output, and only the winner plus its neighbors are allowed to update their connection weights.

The Kohonen rule does not require a desired output. Therefore it is implemented in the unsupervised methods of learning. Kohonen has used this rule combined with the on-center/off-surround intra-layer connection (discussed earlier under 2.2.2.2) to create the self-organizing neural network, which has an unsupervised learning method.

On the following Internet site by Sue Becker you may see an interactive demonstration of a Kohonen network, which may give you a better understanding.
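One competitive learning step of this kind can be sketched for neurons arranged in a line. Several details are assumptions made for the example: the winner is chosen as the neuron whose weight vector lies closest to the input (a common stand-in for "largest output"), neighbors are the neurons within a fixed index distance, and the function name and numbers are hypothetical.

```python
def kohonen_step(weight_vectors, x, rate=0.5, radius=1):
    """Pick the winning neuron and move it and its neighbors toward the input x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

    winner = min(range(len(weight_vectors)),
                 key=lambda i: sq_dist(weight_vectors[i]))
    updated = []
    for i, w in enumerate(weight_vectors):
        if abs(i - winner) <= radius:
            # Winner and neighbors learn: shift weights toward the input.
            updated.append([wi + rate * (xi - wi) for wi, xi in zip(w, x)])
        else:
            # All other neurons keep their weights.
            updated.append(list(w))
    return winner, updated

weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [0.0, 1.0]]
winner, new_weights = kohonen_step(weights, [0.9, 0.8])
```

No desired output appears anywhere in the update, which is why the rule fits unsupervised learning.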

http://www.psychology.mcmaster.ca/4i03/competitive-demo.html

2.3 Where are Neural Networks being used?

Neural networks perform successfully where other methods do not: in recognizing and matching complicated, vague, or incomplete patterns. They have been applied to a wide variety of problems.

The most common use for neural networks is to project what will most likely happen. There are many areas where prediction can help in setting priorities. For example, the emergency room at a hospital can be a hectic place; knowing who needs the most critical help can enable a more successful operation. Basically, all organizations must establish priorities, which govern the allocation of their resources. Neural networks have been used as a mechanism of knowledge acquisition for expert systems in stock market forecasting with astonishingly accurate results. Neural networks have also been used for bankruptcy prediction for credit card institutions.

Although one may apply neural network systems for interpretation, prediction, diagnosis, planning, monitoring, debugging, repair, instruction, and control, the most successful applications of neural networks are in categorization and pattern recognition. Such a system classifies the object under investigation (e.g. an illness, a pattern, a picture, a chemical compound, a word, the financial profile of a customer) as one of numerous possible categories that, in turn, may trigger the recommendation of an action (such as a treatment plan or a financial plan).

A company called Nestor has used neural networks for financial risk assessment in mortgage insurance decisions, categorizing the risk of loans as good or bad. Neural networks have also been applied to convert text to speech; NETtalk is one of the systems developed for this purpose. Image processing and pattern recognition form an important area of neural networks, probably one of the most actively researched.

Another area of research for the application of neural networks is character recognition and handwriting recognition. This area has use in banking, credit card processing and other financial services, where reading and correctly recognizing handwriting on documents is of crucial significance. The pattern recognition capability of neural networks has been used to read handwriting in check processing, where the amount must normally be entered into the system by a human. A system that could automate this task would expedite check processing and reduce errors. One such system has been developed by HNC (Hecht-Nielsen Co.) for BankTec.

One of the best known applications is the bomb detector installed in some U.S. airports. This device, called SNOOPE, determines the presence of certain compounds from the chemical configurations of their components.

In a document from the International Joint Conference, one can find reports on using neural networks in areas ranging from robotics, speech, signal processing, vision, and character recognition to musical composition, detection of heart malfunction and epilepsy, fish detection and classification, optimization, and scheduling. One should bear in mind that most of the reported applications are still in the research stage.

Basically, most applications of neural networks fall into the following five categories:

Prediction

Uses input values to predict some output, e.g. pick the best stocks in the market, predict weather, identify people with cancer risk.

Classification

Uses input values to determine the classification, e.g. is the input the letter A; is the blob of video data a plane, and what kind of plane is it.

Data association

Like classification, but it also recognizes data that contains errors, e.g. not only identify the characters that were scanned but also identify when the scanner is not working properly.

Data Conceptualization

Analyzes the inputs so that grouping relationships can be inferred, e.g. extract from a database the names of those most likely to buy a particular product.

Data Filtering

Smooths an input signal, e.g. take the noise out of a telephone signal.

References

Data & Analysis Center for Software, “Artificial Neural Networks Technology”, 1992 (http://www.dacs.dtic.mil/techs/neural/neural.title.html, printed November 1998)

Avelino J. Gonzalez & Douglas D. Dankel, “The Engineering of Knowledge-based Systems”, 1993 Prentice-Hall Inc. ISBN 0-13-334293-X.

Fatemeh Zahedi, “Intelligent Systems for Business: Expert Systems with Neural Networks”, 1993 Wadsworth Inc. ISBN 0-534-18888-5.

Simon Haykin, “Neural Networks”, 1994 Macmillan College Publishing Company Inc. ISBN 0-02-352761-7.

Also referred to as connectionist architectures, parallel distributed processing, and neuromorphic systems, an artificial neural network (ANN) is an information-processing paradigm inspired by the way the densely interconnected, parallel structure of the mammalian brain processes information. Artificial neural networks are collections of mathematical models that emulate some of the observed properties of biological nervous systems and draw on the analogies of adaptive biological learning. The key element of the ANN paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements that are analogous to neurons and are tied together with weighted connections that are analogous to synapses.

Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well. Learning typically occurs by example through training, or exposure to a truthed set of input/output data where the training algorithm iteratively adjusts the connection weights (synapses). These connection weights store the knowledge necessary to solve specific problems.

Although ANNs have been around since the late 1950’s, it wasn’t until the mid-1980’s that algorithms became sophisticated enough for general applications. Today ANNs are being applied to an increasing number of real-world problems of considerable complexity. They are good pattern recognition engines and robust classifiers, with the ability to generalize in making decisions about imprecise input data. They offer ideal solutions to a variety of classification problems such as speech, character and signal recognition, as well as functional prediction and system modeling where the physical processes are not understood or are highly complex. ANNs may also be applied to control problems, where the input variables are measurements used to drive an output actuator, and the network learns the control function. The advantage of ANNs lies in their resilience against distortions in the input data and their capability of learning. They are often good at solving problems that are too complex for conventional technologies (e.g., problems that do not have an algorithmic solution or for which an algorithmic solution is too complex to be found) and are often well suited to problems that people are good at solving, but for which traditional methods are not.

There are multitudes of different types of ANNs. Some of the more popular include the multilayer perceptron which is generally trained with the backpropagation of error algorithm, learning vector quantization, radial basis function, Hopfield, and Kohonen, to name a few. Some ANNs are classified as feedforward while others are recurrent (i.e., implement feedback) depending on how data is processed through the network. Another way of classifying ANN types is by their method of learning (or training), as some ANNs employ supervised training while others are referred to as unsupervised or self-organizing. Supervised training is analogous to a student guided by an instructor. Unsupervised algorithms essentially perform clustering of the data into similar groups based on the measured attributes or features serving as inputs to the algorithms. This is analogous to a student who derives the lesson totally on his or her own. ANNs can be implemented in software or in specialized hardware.

© Copyright 1997 Battelle Memorial Institute

Artificial neural networks (ANNs) are computational paradigms which implement simplified models of their biological counterparts, biological neural networks. Biological neural networks are the local assemblages of neurons and their dendritic connections that form the (human) brain. Accordingly, ANNs are characterized by

Local processing in artificial neurons (or processing elements, PEs),

Massively parallel processing, implemented by a rich connection pattern between PEs,

The ability to acquire knowledge via learning from experience,

Knowledge storage in distributed memory, the synaptic PE connections.

The attempt to implement neural networks for brain-like computations such as pattern recognition, decision making, motor control and many others was made possible by the advent of large scale computers in the late 1950’s. Indeed, ANNs can be viewed as a major new approach to computational methodology since the introduction of digital computers.

Although the initial intent of ANNs was to explore and reproduce human information processing tasks such as speech, vision, and knowledge processing, ANNs also demonstrated their superior capability for classification and function approximation problems. This has great potential for solving complex problems such as systems control, data compression, optimization problems, pattern recognition, and system identification.

Artificial neural networks were originally developed as tools for the exploration and reproduction of human information processing tasks such as speech, vision, olfaction, touch, knowledge processing and motor control. Today, most research is directed towards the development of artificial neural networks for applications such as data compression, optimization, pattern matching, system modeling, function approximation, and control. One of the application areas to which we apply artificial neural networks is flight control. Artificial neural networks give control systems a variety of advanced capabilities. We are currently developing a neural network control system for a wave rider shaped vehicle called LoFLYTE™. This 23 foot vehicle will demonstrate the control system at subsonic speeds. A successful flight will pave the way for supersonic and hypersonic versions of the vehicle.

Since artificial neural networks are highly parallel systems, conventional computers are unsuited for neural network algorithms. Special purpose computational hardware has been constructed to efficiently implement artificial neural networks. Accurate Automation has developed a Neural Network Processor (NNP®). This hardware will allow us to run even the most complex neural networks in real time. The NNP® is capable of multiprocessor operation in Multiple-Instruction-Multiple-Data (MIMD) fashion. It is the most advanced digital neural network hardware in existence. Each NNP® system is capable of implementing 8K neurons with 32K interconnections per processor. The computational capability of a single processor is 140M connections (8 bit multiply-accumulates) per second (35MHz). An 8 processor NNP® would be capable of over one billion connections per second. The NNP® architecture is extremely flexible and any neuron is capable of interconnecting with any other neuron in the system. The NNP® is implemented on both VME and PC compatible cards.
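The throughput figures quoted above can be checked with a little arithmetic. The sketch assumes that throughput scales linearly with the number of processors, which the text implies but does not state; the variable names are hypothetical.

```python
connections_per_second = 140e6  # one processor, as quoted
clock_hz = 35e6                 # 35 MHz clock, as quoted
processors = 8

# Connections handled per clock cycle on one processor.
per_cycle = connections_per_second / clock_hz

# Combined throughput, assuming perfect linear scaling across processors.
total = processors * connections_per_second
```

Under that assumption one processor handles 4 connections per clock cycle, and eight processors reach 1.12 billion connections per second, consistent with "over one billion".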

Learning versus A Priori Problem Solving

Conventional computers rely on programs that solve a problem using a pre-determined series of steps, called algorithms. These programs are controlled by a single, complex central processing unit, and store information at specific locations in memory. Artificial neural networks use highly distributed representations and transformations that operate in parallel, have distributed control through many highly interconnected neurons, and store their information in variable strength connections called synapses.

There are many different ways in which people refer to the same type of neural network technology. Neural networks are described as connectionist systems, because of the connections between individual processing nodes. They are sometimes called adaptive systems, because the values of these connections can change so that the neural network performs more effectively. They are also sometimes called parallel distributed processing systems, which emphasizes the way in which the many nodes or neurons in a neural network operate in parallel. The theory that inspires neural network systems is drawn from many disciplines: primarily from neuroscience, engineering, and computer science, but also from psychology, mathematics, physics, and linguistics. These sciences are working toward the common goal of building intelligent systems.

Artificial Neural Networks

Artificial neural networks (ANNs) are programs designed to simulate the way a simple biological nervous system is believed to operate. They are based on simulated nerve cells or neurons which are joined together in a variety of ways to form networks. These networks have the capacity to learn, memorize and create relationships amongst data. There are many different types of ANN but some are more popular than others. The most widely used ANN is known as the Back Propagation ANN. This type of ANN is excellent at prediction and classification tasks. Another is the Kohonen or Self Organizing Map which is excellent at finding relationships amongst complex sets of data.

What are ANNs Used For?

Their applications are almost limitless but fall into a few simple categories.

Classification: Customer/Market profiles, medical diagnosis, signature verification, loan risk evaluation, voice recognition, image recognition, spectra identification, property valuation, classification of cell types, microbes, materials, samples.

Forecasting: Future sales, production requirements, market performance, economic indicators, energy requirements, medical outcomes, chemical reaction products, weather, crop forecasts, environmental risk, horse races, jury panels.

Modeling: Process control, systems control, chemical structures, dynamic systems, signal compression, plastics moulding, welding control, robot control, and many more.

Who Needs ANNs?

People that have to work with or analyze data of any kind. People in business, finance, industry, education and science whose problems are complex, laborious, fuzzy or simply un-resolvable using present methods. People who want better solutions and wish to gain a competitive edge.

Why Are ANNs Better?

1. They deal with the non-linearities in the world in which we live.

2. They handle noisy or missing data.

3. They create their own relationship amongst information - no equations!

4. They can work with large numbers of variables or parameters.

5. They provide general solutions with good predictive accuracy.


A Sample of Further Reading:

General

Rumelhart DE and McClelland JL (1986) Parallel distributed processing: Explorations in the microstructure of cognition. MIT Press, Cambridge, Vols I and II. This is the book that started the explosion of uses of ANNs in the real world.

Eberhart RC and Dobbins RW (1990) Early neural network development history: The age of Camelot. IEEE Engineering in Medicine and Biology 9 ,15-18.

Kohonen T (1997) Self-organising Maps. Pub Springer-Verlag, Berlin. ISBN 3-540-62017-6. The definitive book about self organising maps.

Hinton G.E. (1992) How neural networks learn from experience. Scientific American 267, 144-151.

Swingler K. (1996) Applying neural networks. A Practical Guide. Pub Academic Press, NY. ISBN 0-12-679170-8.

Business/Finance

Wong, Bo K. Bodnovich, Thomas A. Selvi, Yakup. (1995) Bibliography of neural network business applications research: 1988-September 1994. Expert Systems. v 12. p 253-262.

Kaastra I. Boyd M. (1996) Designing a neural network for forecasting financial and economic time series. Neurocomputing. 10, 215-236 Apr.

Poddig T. Rehkugler H (1996) A world model of integrated financial markets using artificial neural networks. Neurocomputing. 10, 251-273.

Kathmann, Ruud M. (1993) Neural networks for the mass appraisal of real estate. Computers Environment & Urban Systems. 17, 373-384.

Science and Medicine

Burns JA, Whitesides GM (1993) Feed forward neural networks in chemistry: Mathematical systems for classification and pattern recognition. Chem Rev 93: 2583-2601.

Astion ML, Wilding PW (1992) The application of backpropagation neural networks to problems in pathology and laboratory medicine. Arch Pathol Lab Med 116: 995-1001.

Maddalena DJ. (1996) Applications of artificial neural networks in quantitative structure activity relationships. Exp Opin Ther Patents 6, 239-251.

Baxt WG. (1995) Application of artificial neural networks to clinical medicine. [Review] Lancet. 346 (8983):1135-8, Oct 28.

Engineering

Chablo A. (1994) Potential applications of artificial intelligence in telecommunications. Technovation. 14, 431-435.

Horwitz D. and El-Sibaie M. (1995) Applying neural nets to railway engineering. AI Expert, January pp 36-41.

Plummer J (1993) Tighter process control with neural networks. AI Expert October, pp 49-55.

last updated 23rd October 1997

Neural Networks Background

Many tasks which seem simple for us, such as reading a handwritten note or recognizing a face, are difficult for even the most advanced computer. In an effort to increase the computer’s ability to perform such tasks, programmers began designing software to act more like the human brain, with its neurons and synaptic connections. Thus the field of “artificial neural networks” was born. Rather than employ the traditional method of one central processor (a Pentium) to carry out many instructions one at a time, the neural network software analyzes data by passing it through several simulated processors which are interconnected with synaptic-like “weights”.

     Although the programming and mathematics behind neural network technologies are complex, using neural network software can be quite simple and the results are often quite extraordinary. Once you have collected several records of the data you wish to analyze, the network will run through them and “learn” how the inputs of each record may be related to the result. Each “record” might be a machine on an assembly line, or a particular stock, or the weather one day. If the record was a patient at a hospital, the record’s inputs (such as: age, sex, body fat, allergies, blood pressure) and its related output (such as: did the drug work in this case?) are both fed into the “neurons” of the network. The network then continually refines itself until it can produce an accurate response when given those particular inputs.

     After training on a few dozen cases, the network begins to organize itself, and refines its own architecture to fit the data, much like a human brain “learns” from example. If there is any overall pattern to the data, or some consistent relationship between the inputs and result of each record, the network should be able to eventually create an internal mapping of weights that can accurately reproduce the expected output.
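As a rough illustration of the training loop described above, the following Python sketch trains a single artificial neuron on a handful of made-up patient records. Everything in it is an assumption made for illustration: the records, the two scaled inputs, the sigmoid neuron, the learning rate, and the function names are hypothetical and are not taken from any particular neural network product. A real network would have many neurons, but the idea of nudging the weights until the outputs match the records is the same.

```python
import math

# Hypothetical records: (scaled age, scaled blood pressure) -> did the drug work? (1 = yes, 0 = no)
RECORDS = [
    ([0.2, 0.1], 1), ([0.3, 0.2], 1), ([0.4, 0.3], 1),
    ([0.7, 0.8], 0), ([0.8, 0.9], 0), ([0.9, 0.7], 0),
]

def predict(weights, bias, inputs):
    """Weighted sum of the inputs passed through a sigmoid, giving a value between 0 and 1."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))

def mean_error(weights, bias, records):
    """Average cross-entropy between the neuron's outputs and the recorded results."""
    total = 0.0
    for inputs, target in records:
        p = predict(weights, bias, inputs)
        total -= target * math.log(p) + (1 - target) * math.log(1 - p)
    return total / len(records)

def train(records, epochs=2000, rate=0.5):
    """Repeatedly nudge the weights in the direction that reduces the average error."""
    n_inputs = len(records[0][0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_inputs, 0.0
        for inputs, target in records:
            error = predict(weights, bias, inputs) - target
            for i, x in enumerate(inputs):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - rate * g / len(records) for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b / len(records)
    return weights, bias

weights, bias = train(RECORDS)
for inputs, target in RECORDS:
    print(inputs, "expected", target, "network says", round(predict(weights, bias, inputs), 2))
```

After training, the neuron's outputs should sit near 1 for the records where the drug worked and near 0 for the others, which is the "internal mapping of weights" in miniature.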

     Once you realize how powerful this type of “reverse engineering” technology can be, you begin to understand why neural networks were once regarded as the best kept secret of large corporate, government, and academic researchers. Once only available to those with the training and the computing power, this advanced intelligence technique is now available to anyone using Microsoft Excel. (see BrainSheet) Neural networks still require a lot of processing power, but they are now quite simple to use, and thanks to today’s faster generation of desktop computers, there are fewer reasons to stick with the traditional statistical methods each year.

What is a Neural Network?

written by Chris Stergiou

First of all, when we are talking about a neural network, we should more properly say “artificial neural network” (ANN), because that is what we mean most of the time. Biological neural networks are much more complicated than the mathematical models we use for ANNs. But it is customary to be lazy and drop the “A” or the “artificial”.

An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well.

Some Other Definitions of a Neural Network include:

According to the DARPA Neural Network Study (1988, AFCEA International Press, p. 60):

… a neural network is a system composed of many simple processing elements operating in parallel whose function is determined by network structure, connection strengths, and the processing performed at computing elements or nodes.

According to Haykin, S. (1994), Neural Networks: A Comprehensive Foundation, NY: Macmillan, p. 2:

A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:

1. Knowledge is acquired by the network through a learning process.

2. Interneuron connection strengths known as synaptic weights are used to store the knowledge.

ANNs have been applied to an increasing number of real-world problems of considerable complexity. Their most important advantage is in solving problems that are too complex for conventional technologies: problems that do not have an algorithmic solution or for which an algorithmic solution is too complex to be found. In general, because of their abstraction from the biological brain, ANNs are well suited to problems that people are good at solving, but for which computers are not. These problems include pattern recognition and forecasting (which requires the recognition of trends in data).

Why use a neural network?

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an “expert” in the category of information it has been given to analyze. This expert can then be used to provide projections given new situations of interest and answer “what if” questions.

Other advantages include:

Adaptive learning: An ability to learn how to do tasks based on the data given for training or initial experience.

Self-Organisation: An ANN can create its own organisation or representation of the information it receives during learning time.

Real Time Operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.

Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.

Neural Networks in Practice

Given this description of neural networks and how they work, what real world applications are they suited for? Neural networks have broad applicability to real world business problems. In fact, they have already been successfully applied in many industries.

Since neural networks are best at identifying patterns or trends in data, they are well suited for prediction or forecasting needs including:

sales forecasting

industrial process control

customer research

data validation

risk management

target marketing

To give some more specific examples, ANNs are also used in the following applications: recognition of speakers in communications; diagnosis of hepatitis; recovery of telecommunications from faulty software; interpretation of multimeaning Chinese words; undersea mine detection; texture analysis; three-dimensional object recognition; handwritten word recognition; and facial recognition.

Historical Background of Neural Networks

Neural network simulations appear to be a recent development. However, this field was established before the advent of computers, and has survived at least one major setback and several eras.

Many important advances have been boosted by the use of inexpensive computer emulations. Following an initial period of enthusiasm, the field survived a period of frustration and disrepute. During this period, when funding and professional support were minimal, important advances were made by relatively few researchers. These pioneers were able to develop convincing technology which surpassed the limitations identified by Minsky and Papert. Minsky and Papert published a book in 1969 in which they summed up a general feeling of frustration with neural networks among researchers; its conclusions were accepted by most without further analysis. Currently, the neural network field enjoys a resurgence of interest and a corresponding increase in funding.

The history of neural networks that was described above can be divided into several periods:

First Attempts: There were some initial simulations using formal logic. McCulloch and Pitts (1943) developed models of neural networks based on their understanding of neurology. These models made several assumptions about how neurons worked. Their networks were based on simple neurons which were considered to be binary devices with fixed thresholds. The results of their model were simple logic functions such as “a or b” and “a and b”. Another attempt used computer simulations, carried out by two groups (Farley and Clark, 1954; Rochester, Holland, Haibt and Duda, 1956). The latter group (IBM researchers) maintained close contact with neuroscientists at McGill University, so whenever their models did not work, they consulted the neuroscientists. This interaction established a multidisciplinary trend which continues to the present day.
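The binary, fixed-threshold neuron described above is simple enough to write out. The sketch below is only an illustration in Python, not McCulloch and Pitts's own notation: the function names are hypothetical, and it assumes that every input counts equally and that the unit fires when the number of active inputs reaches its threshold.

```python
def threshold_neuron(inputs, threshold):
    """Fire (return 1) when the number of active binary inputs reaches the fixed threshold."""
    return 1 if sum(inputs) >= threshold else 0

def logical_or(a, b):
    # One active input is enough to reach a threshold of 1.
    return threshold_neuron([a, b], threshold=1)

def logical_and(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return threshold_neuron([a, b], threshold=2)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "or:", logical_or(a, b), "and:", logical_and(a, b))
```

The only difference between the two functions is the threshold, which is why such models naturally produce simple logic functions.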

Promising & Emerging Technology: Not only was neuroscience influential in the development of neural networks, but psychologists and engineers also contributed to the progress of neural network simulations. Rosenblatt (1958) stirred considerable interest and activity in the field when he designed and developed the Perceptron. The Perceptron had three layers with the middle layer known as the association layer. This system could learn to connect or associate a given input to a random output unit.

Another system was the ADALINE (ADAptive LInear Element) which was developed in 1960 by Widrow and Hoff (of Stanford University). The ADALINE was an analogue electronic device made from simple components. The method used for learning was different to that of the Perceptron: it employed the Least-Mean-Squares (LMS) learning rule.
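The LMS rule mentioned above can be sketched in a few lines of Python. This is a software simulation under stated assumptions, not a description of the original analogue device: the training samples, the linear target function, the learning rate, and the function names are all hypothetical. The rule itself is the usual one, in which each weight is moved in proportion to the output error multiplied by the corresponding input.

```python
# Hypothetical training inputs (x1, x2).
SAMPLES = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]

def target(x1, x2):
    """Hypothetical relationship the unit should learn: a linear function of the inputs."""
    return 0.5 + 2.0 * x1 - 1.0 * x2

def adaline_output(weights, x1, x2):
    """Linear unit: bias weight plus weighted inputs, with no threshold applied."""
    return weights[0] + weights[1] * x1 + weights[2] * x2

def train_lms(samples, epochs=100, rate=0.1):
    weights = [0.0, 0.0, 0.0]  # bias weight, weight for x1, weight for x2
    for _ in range(epochs):
        for x1, x2 in samples:
            error = target(x1, x2) - adaline_output(weights, x1, x2)
            # LMS (Widrow-Hoff) update: adjust each weight by rate * error * input.
            weights[0] += rate * error
            weights[1] += rate * error * x1
            weights[2] += rate * error * x2
    return weights

print([round(w, 3) for w in train_lms(SAMPLES)])
```

Because the hypothetical target here is exactly linear, the learned weights should settle very close to the coefficients used in `target`.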

Period of Frustration & Disrepute: In 1969 Minsky and Papert wrote a book in which they generalised the limitations of single layer Perceptrons to multilayered systems. In the book they said: “…our intuitive judgment that the extension (to multilayer systems) is sterile”. The significant result of their book was to eliminate funding for research with neural network simulations. The conclusions supported the disenchantment of researchers in the field. As a result, considerable prejudice against this field was activated.

Innovation: Although public interest and available funding were minimal, several researchers continued working to develop neuromorphically based computational methods for problems such as pattern recognition.

During this period several paradigms were generated which modern work continues to enhance. Grossberg’s influence (Steve Grossberg and Gail Carpenter in 1988) founded a school of thought which explores resonating algorithms. They developed the ART (Adaptive Resonance Theory) networks based on biologically plausible models. Anderson and Kohonen developed associative techniques independent of each other. Klopf (A. Henry Klopf) in 1972 developed a basis for learning in artificial neurons based on a biological principle for neuronal learning called heterostasis.

Werbos (Paul Werbos 1974) developed and used the back-propagation learning method; however, several years passed before this approach was popularized. Back-propagation nets are probably the most well known and widely applied of the neural networks today. In essence, the back-propagation net is a Perceptron with multiple layers, a different threshold function in the artificial neuron, and a more robust and capable learning rule.
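Why multiple layers matter can be shown with a small Python sketch. The example chosen here, exclusive-or (XOR), is not named in the text; it is the standard case of a function that no single threshold unit can compute. Note the assumptions: the weights below are set by hand rather than learned by back-propagation, the units use a hard threshold rather than the smoother function a back-propagation net would use, and all names are hypothetical.

```python
def step_unit(inputs, weights, threshold):
    """Threshold unit: output 1 when the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def xor_network(a, b):
    # Hidden layer: one unit detects "a or b", the other detects "a and b".
    h_or = step_unit([a, b], weights=[1, 1], threshold=1)
    h_and = step_unit([a, b], weights=[1, 1], threshold=2)
    # Output layer: fire when "or" is on and "and" is off (the negative weight inhibits).
    return step_unit([h_or, h_and], weights=[1, -1], threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "xor:", xor_network(a, b))
```

A learning rule such as back-propagation has the job of finding weights like these automatically instead of having them chosen by hand.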

Amari (A. Shun-Ichi 1967) was involved with theoretical developments: he published a paper which established a mathematical theory for a learning basis (error-correction method) dealing with adaptive pattern classification. Fukushima (F. Kunihiko), meanwhile, developed a stepwise-trained multilayered neural network for interpretation of handwritten characters. The original network was published in 1975 and was called the Cognitron.

Re-Emergence: Progress during the late 1970s and early 1980s was important to the re-emergence of interest in the neural network field. Several factors influenced this movement. For example, comprehensive books and conferences provided a forum for people in diverse fields with specialized technical languages, and the response to conferences and publications was quite positive. The news media picked up on the increased activity and tutorials helped disseminate the technology. Academic programs appeared and courses were introduced at most major universities (in the US and Europe). Attention is now focused on funding levels throughout Europe, Japan and the US, and as this funding becomes available, several new commercial applications in industry and financial institutions are emerging.

Today: Significant progress has been made in the field of neural networks, enough to attract a great deal of attention and fund further research. Advancement beyond current commercial applications appears to be possible, and research is advancing the field on many fronts. Neurally based chips are emerging and applications to complex problems are developing. Clearly, today is a period of transition for neural network technology.

Are there any limits to Neural Networks?

The major issues of concern today are the scalability problem, testing, verification, and integration of neural network systems into the modern environment. Neural network programs sometimes become unstable when applied to larger problems. The defence, nuclear and space industries are concerned about the issue of testing and verification. The mathematical theories used to guarantee the performance of an applied neural network are still under development. The solution for the time being may be to train and test these intelligent systems much as we do for humans. Also there are some more practical problems like:

the operational problem encountered when attempting to simulate the parallelism of neural networks. Since the majority of neural networks are simulated on sequential machines, processing time requirements increase very rapidly as the size of the problem expands.

Solution: implement neural networks directly in hardware, although such implementations still need a lot of development.

the inability to explain any results that they obtain. Networks function as “black boxes” whose rules of operation are completely unknown.

The Future

Because gazing into the future is somewhat like gazing into a crystal ball, it is better to quote some “predictions”. Each prediction rests on some sort of evidence or established trend which, with extrapolation, clearly takes us into a new realm.

Prediction 1:

Neural networks will facilitate user-specific systems for education, information processing, and entertainment. “Alternative realities”, produced by comprehensive environments, are attractive in terms of their potential for systems control, education, and entertainment. This is not just a far-out research trend, but is something which is becoming an increasing part of our daily existence, as witnessed by the growing interest in comprehensive “entertainment centers” in each home.

This “programming” would require feedback from the user in order to be effective, but simple and “passive” sensors (e.g. fingertip sensors, gloves, or wristbands) could provide effective feedback into a neural control system. Such sensors would detect pulse, blood pressure, skin ionisation, and other variables which the system could learn to correlate with a person’s response state.

Prediction 2:

Neural networks, integrated with other artificial intelligence technologies, methods for direct culture of nervous tissue, and other exotic technologies such as genetic engineering, will allow us to develop radical and exotic life-forms whether man, machine, or hybrid.

Prediction 3:

Neural networks will allow us to explore new realms of human capability, realms previously available only with extensive training and personal discipline. So a specific state of consciously induced, neurophysiologically observable awareness is necessary in order to facilitate a man-machine system interface.

References:

Klimasauskas, CC. (1989). The 1989 Neuro Computing Bibliography.

Hammerstrom, D. (1986). A Connectionist/Neural Network Bibliography.

DARPA Neural Network Study (October, 1987-February, 1989). MIT Lincoln Lab.

Neural Networks, Eric Davalo and Patrick Naim.

Prof. Aleksander. Articles and books (from Imperial College).

WWW pages throughout the Internet.

Asimov, I (1984, 1950), Robot, Ballantine, New York.

Current news from multimedia services (TV).
