Negative Selection Algorithm, NSA.
The Negative Selection Algorithm belongs to the field of Artificial Immune Systems. The algorithm is related to other Artificial Immune Systems such as the Clonal Selection Algorithm and the Immune Network Algorithm.
The Negative Selection Algorithm is inspired by the self-nonself discrimination behavior observed in the mammalian acquired immune system. The clonal selection theory of acquired immunity accounts for the adaptive behavior of the immune system, including the ongoing selection and proliferation of cells that select for potentially harmful (and typically foreign) material in the body. An interesting aspect of this process is that it is responsible for managing a population of immune cells that do not select for the tissues of the body; specifically, it does not create self-reactive immune cells, a condition known as auto-immunity. This problem is known as 'self-nonself discrimination' and it involves the preparation and ongoing maintenance of a repertoire of immune cells such that none are auto-immune. This is achieved by a negative selection process that selects for and removes those cells that are self-reactive during cell creation and cell proliferation. This process has been observed in the preparation of T-lymphocytes, naive versions of which are matured using both a positive and a negative selection process in the thymus.
The self-nonself discrimination principle suggests that the anticipatory guesses made in clonal selection are filtered by regions of infeasibility (protein conformations that bind to self-tissues). Further, the self-nonself immunological paradigm proposes the modeling of the unknown domain (encountered pathogen) by modeling the complement of what is known. This is unintuitive as the natural inclination is to categorize unknown information by what is different from that which is known, rather than guessing at the unknown information and filtering those guesses by what is known.
The information processing principles of the self-nonself discrimination process via negative selection are those of anomaly and change detection systems that model the anticipation of variation from what is known. The principle is achieved by building a model of changes, anomalies, or unknown (non-normal or non-self) data by generating patterns that do not match an existing corpus of available (self or normal) patterns. The prepared non-normal model is then used to monitor either the existing normal data or streams of new data by seeking matches to the non-normal patterns.
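The principle can be illustrated with a deliberately small sketch. It assumes a toy domain of the integers 0 to 19 and an exact-equality matching rule, and the names `UNIVERSE`, `build_negative_model`, and `monitor` are hypothetical, chosen only for this illustration.

```ruby
require 'set'

# Toy domain: the integers 0..19 (an assumption made for illustration only).
UNIVERSE = (0...20).to_a

# Generate random candidates and keep only those that do NOT match
# the known (self) patterns. Matching here is exact equality.
def build_negative_model(self_patterns, num_detectors, rng)
  detectors = Set.new
  # Never ask for more detectors than the non-self region contains.
  max_available = (UNIVERSE - self_patterns.to_a).size
  target = [num_detectors, max_available].min
  while detectors.size < target
    candidate = UNIVERSE[rng.rand(UNIVERSE.size)]
    detectors << candidate unless self_patterns.include?(candidate)
  end
  detectors
end

# Return every sample in the stream that matches a non-self detector.
def monitor(stream, detectors)
  stream.select { |sample| detectors.include?(sample) }
end

self_patterns = Set.new([2, 3, 5, 7, 11, 13])
detectors = build_negative_model(self_patterns, 10, Random.new(1))
flagged = monitor([2, 4, 7, 18, 13, 9], detectors)
puts "detectors: #{detectors.to_a.sort.inspect}"
puts "flagged:   #{flagged.inspect}"
```

A sample is flagged only if some detector happens to cover it, so an incomplete detector set can miss anomalies, but by construction it never flags a known self pattern.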
The first pseudocode listing below describes the detector generation procedure for the Negative Selection Algorithm. The second describes the detector application procedure.
Input: SelfData
Output: Repertoire
Repertoire $\leftarrow \emptyset$
While ($\neg$StopCondition())
    Detectors $\leftarrow$ GenerateRandomDetectors()
    For ($Detector_{i}$ $\in$ Detectors)
        If ($\neg$Matches($Detector_{i}$, SelfData))
            Repertoire $\leftarrow$ $Detector_{i}$
        End
    End
End
Return (Repertoire)

Input: InputSamples, Repertoire
For ($Input_{i}$ $\in$ InputSamples)
    For ($Detector_{j}$ $\in$ Repertoire)
        If (Matches($Input_{i}$, $Detector_{j}$))
            Break
        End
    End
End
Listing (below) provides an example of the Negative Selection Algorithm implemented in the Ruby Programming Language. The demonstration problem is a two-class classification problem where samples are drawn from a two-dimensional domain, where $x_i \in [0,1]$. Those samples in $1.0>x_i>0.5$ are classified as self and the rest of the space belongs to the non-self class. Samples are drawn from the self class and presented to the algorithm for the preparation of pattern detectors for classifying unobserved samples from the non-self class. The algorithm creates a set of detectors that do not match the self data, which are then applied to a set of randomly generated samples from the domain. The algorithm uses a real-valued representation. The Euclidean distance function is used during matching, and a minimum distance value is specified as a user parameter for approximate matches between patterns. The algorithm includes an additional, computationally expensive check for duplicates in the preparation of the self dataset and the detector set.
# Sample a random point uniformly within the given per-dimension bounds.
def random_vector(minmax)
  return Array.new(minmax.length) do |i|
    minmax[i][0] + ((minmax[i][1] - minmax[i][0]) * rand())
  end
end

def euclidean_distance(c1, c2)
  sum = 0.0
  c1.each_index {|i| sum += (c1[i]-c2[i])**2.0}
  return Math.sqrt(sum)
end

# True if the vector lies inside the bounds of the given space.
def contains?(vector, space)
  vector.each_with_index do |v,i|
    return false if v<space[i][0] or v>space[i][1]
  end
  return true
end

# True if the vector is within min_dist of any pattern in the dataset.
def matches?(vector, dataset, min_dist)
  dataset.each do |pattern|
    dist = euclidean_distance(vector, pattern[:vector])
    return true if dist <= min_dist
  end
  return false
end

# Keep random detectors that match no self pattern and are not duplicates.
def generate_detectors(max_detectors, search_space, self_dataset, min_dist)
  detectors = []
  begin
    detector = {:vector=>random_vector(search_space)}
    if !matches?(detector[:vector], self_dataset, min_dist)
      detectors << detector if !matches?(detector[:vector], detectors, 0.0)
    end
  end while detectors.size < max_detectors
  return detectors
end

# Sample unique patterns from the search space that fall inside the self region.
def generate_self_dataset(num_records, self_space, search_space)
  self_dataset = []
  begin
    pattern = {}
    pattern[:vector] = random_vector(search_space)
    next if matches?(pattern[:vector], self_dataset, 0.0)
    if contains?(pattern[:vector], self_space)
      self_dataset << pattern
    end
  end while self_dataset.length < num_records
  return self_dataset
end

# Classify random samples as self ("S") or non-self ("N") and count agreement
# between the detectors' prediction and proximity to the self dataset.
def apply_detectors(detectors, bounds, self_dataset, min_dist, trials=50)
  correct = 0
  trials.times do |i|
    input = {:vector=>random_vector(bounds)}
    actual = matches?(input[:vector], detectors, min_dist) ? "N" : "S"
    expected = matches?(input[:vector], self_dataset, min_dist) ? "S" : "N"
    correct += 1 if actual==expected
    puts "#{i+1}/#{trials}: predicted=#{actual}, expected=#{expected}"
  end
  puts "Done. Result: #{correct}/#{trials}"
  return correct
end

def execute(bounds, self_space, max_detect, max_self, min_dist)
  self_dataset = generate_self_dataset(max_self, self_space, bounds)
  puts "Done: prepared #{self_dataset.size} self patterns."
  detectors = generate_detectors(max_detect, bounds, self_dataset, min_dist)
  puts "Done: prepared #{detectors.size} detectors."
  apply_detectors(detectors, bounds, self_dataset, min_dist)
  return detectors
end

if __FILE__ == $0
  # problem configuration
  problem_size = 2
  search_space = Array.new(problem_size) {[0.0, 1.0]}
  self_space = Array.new(problem_size) {[0.5, 1.0]}
  max_self = 150
  # algorithm configuration
  max_detectors = 300
  min_dist = 0.05
  # execute the algorithm
  execute(search_space, self_space, max_detectors, max_self, min_dist)
end
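The effect of the minimum distance parameter on matching can be checked by hand with a short worked example. The helpers `distance` and `within?` below are hypothetical stand-ins for the distance and matching functions of the listing, written so that the snippet runs on its own; the sample points are chosen arbitrarily for illustration.

```ruby
# Euclidean distance between two equal-length real-valued vectors.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# A vector matches a pattern when it lies within min_dist of it.
def within?(a, b, min_dist)
  distance(a, b) <= min_dist
end

self_pattern = [0.50, 0.50]
min_dist = 0.05

near = [0.52, 0.52]   # distance is about 0.028
far  = [0.40, 0.40]   # distance is about 0.141

puts within?(near, self_pattern, min_dist)  # prints "true"
puts within?(far, self_pattern, min_dist)   # prints "false"
```

Under these assumptions, a candidate detector generated at `near` would be discarded because it matches the self pattern, while one generated at `far` would be kept. The parameter therefore acts as a radius around each pattern: larger values let each detector cover more of the space but also reject more candidates near the self region.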
The seminal negative selection algorithm was proposed by Forrest, et al. [Forrest1994], in which a population of detectors is prepared in the presence of known information, and those randomly generated detectors that match against known data are discarded. The population of pattern guesses in the unknown space then monitors the corpus of known information for changes. The algorithm was applied to the monitoring of files for changes (corruptions and infections by computer viruses), and later formalized as a change detection algorithm [Dhaeseleer1996a] [Dhaeseleer1996].
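The change detection setting can be sketched with binary strings. The sketch below is an illustration under stated assumptions, not a reproduction of the cited work: it assumes the protected data is already split into fixed-length bit strings, it uses an r-contiguous-bits matching rule (two strings match when they agree in at least `r` adjacent positions), and all function names, the example data, and the parameter values are hypothetical.

```ruby
# True when strings a and b agree in at least r contiguous positions.
def r_contiguous_match?(a, b, r)
  run = 0
  a.each_char.with_index do |ch, i|
    run = (ch == b[i]) ? run + 1 : 0
    return true if run >= r
  end
  false
end

def random_bits(length, rng)
  Array.new(length) { rng.rand(2) }.join
end

# Keep random candidates that match no self string (negative selection).
# max_tries guards against looping forever when few candidates survive.
def generate_bit_detectors(self_strings, count, r, rng, max_tries = 10_000)
  length = self_strings.first.length
  detectors = []
  max_tries.times do
    break if detectors.size >= count
    candidate = random_bits(length, rng)
    next if self_strings.any? { |s| r_contiguous_match?(candidate, s, r) }
    detectors << candidate
  end
  detectors
end

# Return the monitored strings that some detector matches, i.e. changes.
def detect_changes(strings, detectors, r)
  strings.select { |s| detectors.any? { |d| r_contiguous_match?(d, s, r) } }
end

rng = Random.new(7)
protected_data = %w[10110010 01101100 11100001 00011011]
r = 5
bit_detectors = generate_bit_detectors(protected_data, 20, r, rng)

# Unmodified data never triggers a detector, by construction.
puts detect_changes(protected_data, bit_detectors, r).inspect  # prints "[]"

# A modified string is reported only if some detector happens to cover it.
modified = protected_data.dup
modified[2] = "00011110"
puts detect_changes(modified, bit_detectors, r).inspect
```

Detection of a change is probabilistic: it depends on how many detectors are generated and on the matching threshold `r`, which is the trade-off the change detection formalization analyzes.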
The Negative Selection Algorithm has been applied to the monitoring of changes in the execution behavior of Unix processes [Forrest1996] [Hofmeyr1998], and to monitoring changes in remote connections of a networked computer (intrusion detection) [Hofmeyr1999] [Hofmeyr1999a]. The algorithm has been applied predominantly to virus and host intrusion detection and their abstracted problems of classification (two-class) and anomaly detection. Esponda provides some interesting work showing compression and privacy benefits provided by maintaining a negative model (non-self) [Darlington2005]. Ji and Dasgupta provide a contemporary and detailed review of Negative Selection Algorithms covering topics such as data representations, matching rules, detector generation procedures, computational complexity, hybridization, and theoretical frameworks [Ji2007]. Recently, the validity of the application of negative selection algorithms in high-dimensional spaces has been questioned, specifically given the scalability of the approach in the face of the exponential increase in volume within the problem space [Stibor2006].
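The volume concern can be made concrete with a back-of-the-envelope calculation. This is a sketch of the general geometric intuition only, not a reproduction of the analysis in [Stibor2006]. It assumes real-valued detectors that each cover a ball of fixed radius inside a unit hypercube, and `ball_volume` is a hypothetical helper based on the standard formula for the volume of an n-dimensional ball.

```ruby
# Volume of an n-dimensional ball of the given radius:
#   V_n(r) = pi^(n/2) * r^n / Gamma(n/2 + 1)
def ball_volume(dimensions, radius)
  (Math::PI**(dimensions / 2.0)) * (radius**dimensions) /
    Math.gamma(dimensions / 2.0 + 1.0)
end

radius = 0.05  # assumed detector radius, for illustration
[1, 2, 3, 5, 10].each do |n|
  volume = ball_volume(n, radius)
  # Crude lower bound on the detectors needed to cover a unit hypercube,
  # ignoring overlap between detectors and edge effects.
  lower_bound = (1.0 / volume).ceil
  puts format("n=%2d  volume=%.3e  detectors>=%d", n, volume, lower_bound)
end
```

Because the volume covered by each detector shrinks geometrically with the number of dimensions, the number of detectors needed to cover a fixed fraction of the space grows exponentially under these assumptions.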
[Darlington2005] | C. F. Esponda Darlington, "Negative Representations of Information", [PhD Thesis] The University of New Mexico, 2005. |
[Dhaeseleer1996] | P. D'haeseleer, "An immunological approach to change detection: theoretical results", in Proceedings of the 9th IEEE Computer Security Foundations Workshop, 1996. |
[Dhaeseleer1996a] | P. D'haeseleer and S. Forrest and P. Helman, "An immunological approach to change detection: algorithms, analysis and implications", in Proceedings of the IEEE Symposium on Security and Privacy, 1996. |
[Forrest1994] | S. Forrest and A. S. Perelson and L. Allen and R. Cherukuri, "Self-Nonself Discrimination in a Computer", in Proceedings of the 1994 IEEE Symposium on Security and Privacy, 1994. |
[Forrest1996] | S. Forrest and S. A. Hofmeyr and A. Somayaji and T. A. Longstaff, "A Sense of Self for Unix Processes", in Proceedings of the 1996 IEEE Symposium on Security and Privacy, 1996. |
[Hofmeyr1998] | S. A. Hofmeyr and S. Forrest and A. Somayaji, "Intrusion Detection using Sequences of System Calls", Journal of Computer Security, 1998. |
[Hofmeyr1999] | S. Hofmeyr and S. Forrest, "Immunity by Design: An Artificial Immune System", in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 1999. |
[Hofmeyr1999a] | S. A. Hofmeyr, "An Immunological Model of Distributed Detection and its Application to Computer Security", [PhD Thesis] Department of Computer Sciences, University of New Mexico, 1999. |
[Ji2007] | Z. Ji and D. Dasgupta, "Revisiting Negative Selection Algorithms", Evolutionary Computation, 2007. |
[Stibor2006] | T. Stibor, "On the Appropriateness of Negative Selection for Anomaly Detection and Network Intrusion Detection", [PhD Thesis] Darmstadt University of Technology, 2006. |