Intrusion tolerance encompasses reacting to, counteracting, recovering from, and masking a wide set of faults, including intentional and malicious faults (intrusions), which may lead to failure of the system's security properties if nothing is done to counter their effect on the system state. Instead of trying to prevent every single intrusion, intrusions are allowed but tolerated.
The intrusion-tolerant system triggers mechanisms that prevent an intrusion from generating a system failure. The common approach taken today for securing critical systems is to build layers of defenses around them using security technologies such as firewalls and access control mechanisms. The machines inside the security layer are assumed (trusted) to be correct, and the goal is to protect the machines inside from attackers on the outside. Critical systems may have operated exclusively on private networks in the past, affording them some degree of protection from external attackers.
Many of them are now connected to the Internet and are exposed to a wide range of threats that may not have been considered when the systems were originally designed. Given that thousands of machines are compromised on the Internet each day, it seems likely that some attacks will be able to breach the security walls of even those critical systems specifically designed with security in mind. In addition, insider attacks, such as those from disgruntled employees who take advantage of existing security vulnerabilities, are becoming more and more common and are a growing source of machine compromise.
Such attacks do not need to breach the security perimeter at all: the attacker already has the credentials to access the system, and the power to abuse them.

Intrusion Tolerance
Making critical systems dependable in the face of such attacks requires building systems that are intrusion-tolerant. Intrusion-tolerant systems can continue functioning even if part of the system is compromised. Their design is motivated by the assumption that it is not possible to enumerate all of the potential attacks that can be mounted on a system by compromised machines.
Therefore, the fault-tolerant system is designed under a model that places few restrictions on the ways in which faulty components can fail. It assumes that systems remain to a certain extent vulnerable, that attacks on components or sub-systems can happen, and that some will be successful; it ensures that the overall system nevertheless remains secure and operational. A fault-tolerant system may be able to tolerate one or more fault types, including 1) transient, intermittent, or permanent hardware faults, 2) software and hardware design errors, 3) operator errors, or 4) externally induced upsets or physical damage.
Hardware Fault-Tolerance
The majority of fault-tolerant designs have been directed toward building computers that automatically recover from random faults occurring in hardware components. The techniques employed generally involve partitioning a computing system into modules that act as fault-containment regions. Each module is backed up with protective redundancy so that, if the module fails, others can assume its function. Special mechanisms are added to detect errors and implement recovery.
Two general approaches to hardware fault recovery have been used: 1) fault masking and 2) dynamic recovery. Fault masking is a structural redundancy technique that completely masks faults within a set of redundant modules: a number of identical modules execute the same functions, and their outputs are voted to remove errors created by a faulty module. Dynamic recovery is required when only one copy of a computation is running at a time, and it involves automated self-repair. As in fault masking, the computing system is partitioned into modules backed up by spares as redundancy.
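Fault masking by voting can be sketched as follows; this is a minimal illustration of triple modular redundancy, and the module and helper names are invented for the example:

```python
# Minimal sketch of fault masking via triple modular redundancy (TMR).
# Module names and the majority() helper are illustrative, not from the text.
from collections import Counter

def majority(outputs):
    """Return the value produced by the majority of redundant modules."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise RuntimeError("no majority: too many faulty modules")
    return value

# Three identical modules compute the same function; one is faulty.
def module_ok(x):     return x * x
def module_faulty(x): return x * x + 1   # corrupted output

outputs = [module_ok(4), module_faulty(4), module_ok(4)]
assert majority(outputs) == 16   # the faulty module's error is masked
```

The vote hides the fault entirely, which is why masking needs no interruption of service but pays for a full set of active replicas.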
In the case of dynamic recovery, special mechanisms are required to detect faults in the modules, switch out a faulty module, switch in a spare, and instigate the software actions necessary to restore and continue the computation. In single computers, special hardware along with software is required to do this, while in multiprocessors the function is often managed by the other processors. Dynamic recovery is generally more hardware-efficient than fault masking, making it the approach of choice in high-performance scalable systems.
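The detect/switch-out/switch-in cycle of dynamic recovery can be sketched as a toy loop; the `Module` class and failure model are illustrative assumptions, not part of any real architecture described in the text:

```python
# Sketch of dynamic recovery: detect a faulty module, switch in a spare,
# and restore the computation. All class/function names are illustrative.

class Module:
    def __init__(self, name, healthy=True):
        self.name, self.healthy = name, healthy
    def compute(self, x):
        if not self.healthy:
            raise RuntimeError(f"{self.name} failed")
        return x + 1

def run_with_spares(active, spares, x):
    """Run the computation, switching out faulty modules for spares."""
    while True:
        try:
            return active.compute(x)          # normal operation
        except RuntimeError:
            if not spares:
                raise                          # no redundancy left
            active = spares.pop(0)             # switch in a spare and retry

result = run_with_spares(Module("primary", healthy=False),
                         [Module("spare-1")], 41)
assert result == 42
```

Only one copy of the computation runs at a time; redundancy is consumed lazily, which is the source of both the hardware efficiency and the recovery delay noted below.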
The disadvantages of dynamic recovery are that computational delays occur during fault recovery, fault coverage is often lower, and specialized operating systems may be required.

Software Fault-Tolerance
Software that can tolerate software design faults (programming errors) uses both static and dynamic redundancy approaches similar to those used for hardware faults. One such approach, N-version programming, uses static redundancy in the form of independently written programs that perform the same functions; their outputs are voted at special checkpoints.
An alternative dynamic approach is based on the concept of recovery blocks. Programs are partitioned into blocks, and acceptance tests are executed after each block; if an acceptance test fails, a redundant code block is executed. An approach called design diversity combines hardware and software fault tolerance by implementing a fault-tolerant computer system using different hardware and software in redundant channels. Each channel is designed to provide the same function, and a method is provided to identify whether one channel deviates unacceptably from the others.
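The recovery-block scheme described above can be sketched as follows; the specific blocks and the acceptance test are invented for illustration:

```python
# Sketch of the recovery-block scheme: run the primary block, check its
# result with an acceptance test, and fall back to an alternate block on
# failure. Block and test names are illustrative.

def primary_sort(xs):
    xs = list(xs)
    xs[0], xs[-1] = xs[-1], xs[0]     # buggy "sort": just swaps the ends
    return xs

def alternate_sort(xs):
    return sorted(xs)                 # simpler, trusted alternate block

def acceptance_test(xs):
    return all(a <= b for a, b in zip(xs, xs[1:]))

def recovery_block(xs):
    for block in (primary_sort, alternate_sort):
        result = block(xs)
        if acceptance_test(result):   # checkpoint: accept or fall back
            return result
    raise RuntimeError("all blocks failed the acceptance test")

assert recovery_block([3, 1, 2]) == [1, 2, 3]
```

Unlike N-version voting, only one block runs at a time; the acceptance test, not a vote, decides whether its output is trusted.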
The goal of design diversity is to tolerate both hardware and software design faults. It is a very expensive technique, but it is used in very critical control applications. Replication is a widely used technique for improving the availability and performance of client-server systems.

Byzantine Fault Tolerance
A Byzantine fault presents different symptoms to different observers; a Byzantine failure is the loss of a system service due to a Byzantine fault. Approaches to Byzantine fault mitigation include full exchange (e.g., SPIDER), hierarchical exchange (e.g., the SAFEbus architecture), and filtering (e.g., TTP star topologies). The Scalable Processor-Independent Design for Enhanced Reliability (SPIDER) implements a classical approach to Byzantine fault mitigation using full message exchange and voting. SAFEbus uses self-checking-pair buses and self-checking-pair bus interface units to ensure that Byzantine faults are not propagated (i.e., a Byzantine input will not cause the halves of a pair to disagree). TTP star topologies use centralized filtering to remove the asymmetric manifestation of a Byzantine fault.
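The full-exchange idea can be illustrated with a toy simulation; this is a simplified model with an invented node layout and values, not the actual SPIDER protocol:

```python
# Sketch of full-exchange Byzantine mitigation: a faulty source sends
# different values to different receivers; the receivers re-exchange what
# they got and vote, so all correct nodes still agree on one value.
from collections import Counter

def byzantine_source(receiver):
    return 0 if receiver == "A" else 1     # asymmetric (Byzantine) values

# Round 1: each correct node receives a (possibly different) value.
received = {node: byzantine_source(node) for node in ("A", "B", "C")}

# Round 2: every correct node relays what it received to all the others.
relayed = {node: [received[other] for other in ("A", "B", "C")]
           for node in ("A", "B", "C")}

# Each node votes over the relayed copies. All correct nodes see the same
# multiset of values, so they all decide the same way.
decisions = {node: Counter(vals).most_common(1)[0][0]
             for node, vals in relayed.items()}
assert len(set(decisions.values())) == 1   # agreement despite the fault
```

The exchange round is what turns an asymmetric fault into a symmetric one: after relaying, every correct node votes on identical evidence.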
Centralized filtering provides the architecture with a truly independent guardian function and a mechanism to prevent systematic failure.

Design/Replica Diversity for Byzantine Fault Tolerance
Intrusion-tolerant systems are built using four or more replicated servers. The number of common vulnerabilities affecting more than one system decreases significantly as the diversity of the OS pair increases. The number of vulnerabilities that affect more than one OS depends on how diverse the configuration is: it is higher for OSes from the same family (e.g., the BSDs) but very low (and in many cases zero) for OSes from different families (e.g., a BSD and Windows). Several methodologies have been used to identify the best OS pairs, i.e., those with no vulnerabilities in common. The common vulnerability indicator (CVI) calculates the vulnerabilities for a given year y that were shared by OSes A and B over a period of preceding years, and it is built to ensure a set of desirable properties. Building replicated systems with diversity follows one of three strategies for choosing the best OS pair. The first, the Common Vulnerability Count Strategy (CVC-S), is the simplest approach; it uses raw data collected over a large interval to select OS pairs.
The second strategy, the Common Vulnerability Indicator Strategy (CVI-S), uses the CVI described above to select OS pairs, taking into account the incidence of common vulnerabilities over the years. It is indicated when one wants to give greater importance to more recent vulnerabilities, because it is a weighted sum. The third strategy, the Inter-Reporting Times Strategy (IRT-S), focuses not so much on common vulnerabilities directly, but on the frequency with which vulnerabilities appear in the two OSes.
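The simplest of the three strategies, CVC-S, can be sketched as follows; the vulnerability data and OS names below are made up for illustration:

```python
# Sketch of the Common Vulnerability Count Strategy (CVC-S): count the
# vulnerabilities shared by each OS pair over the whole data window and
# pick the pair with the fewest in common. The data is illustrative.
from itertools import combinations

vulns = {                                  # OS -> set of vulnerability IDs
    "OpenBSD": {"CVE-1", "CVE-2", "CVE-3"},
    "NetBSD":  {"CVE-2", "CVE-3", "CVE-4"},   # same family: more overlap
    "Windows": {"CVE-5", "CVE-6"},            # different family: none
}

def common_vulnerability_count(a, b):
    return len(vulns[a] & vulns[b])

best_pair = min(combinations(vulns, 2),
                key=lambda pair: common_vulnerability_count(*pair))
assert common_vulnerability_count(*best_pair) == 0   # a cross-family pair
```

CVI-S would replace the raw intersection count with a sum weighted toward recent years, and IRT-S would rank pairs by the time intervals between successive common-vulnerability reports.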
IRT-S is best when one wants to give more importance to the time interval between successive reports of common vulnerabilities.

Replication Based Fault Tolerance Technique
Replication is the process of maintaining different copies of a data item or object. In replication techniques, a request from a client is forwarded to one replica among a set of replicas; this can be used for requests that do not modify the state of the service. Replication adds redundancy to the system. A replication protocol can be described using five generic phases.
These phases are client contact, server coordination, execution, agreement coordination, and client response.

Fusion Based Technique
The fusion-based technique addresses the number of backups required by the replication method: the number of backups increases drastically as coverage against a larger number of faults increases, making replication costly. The fusion-based technique handles multiple faults with fewer backup machines. In fusion-based fault tolerance, a backup machine is used that is a cross-product of the original machines.

Proactive-Reactive Recovery
Proactive recovery can only be implemented in systems with some synchrony.
In short, in an asynchronous system a compromised replica can delay its recovery (e.g., by making its local clock slower) for a sufficient amount of time to allow more than f replicas to be attacked. To overcome this fundamental problem, a hybrid system model has been developed. The combination of proactive and reactive recovery increases the overall resilience of intrusion-tolerant systems that seek perpetual unattended correct operation.
It retains the guarantees of the periodic rejuvenations triggered by proactive recovery while ensuring that, as long as the faulty behavior exhibited by a replica is detectable, the replica will be recovered as soon as possible, and that a sufficient number of system replicas is always available to sustain the system's correct operation. Reactive and proactive recovery were combined in a single approach for the first time, and applied in a concrete scenario, in the construction of the CIS, an intrusion-tolerant firewall for critical infrastructures.
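The interplay of the two recovery triggers can be sketched as a toy scheduler; the replica names, the rejuvenation period, and the detection set are all invented for illustration:

```python
# Sketch of proactive-reactive recovery: every replica is rejuvenated
# periodically (proactive), and a replica whose faulty behavior is
# detected is recovered immediately (reactive). Toy simulation only.

PERIOD = 4                                   # proactive rejuvenation period

def recovery_schedule(detections, ticks):
    """Yield (time, replica, reason) recovery events over `ticks` steps."""
    for t in range(1, ticks + 1):
        for replica in ("R1", "R2", "R3", "R4"):
            if (t, replica) in detections:       # reactive: recover now
                yield (t, replica, "reactive")
            elif t % PERIOD == 0:                # proactive: on schedule
                yield (t, replica, "proactive")

# R3 is detected as faulty at time 2, well before its scheduled slot.
events = list(recovery_schedule(detections={(2, "R3")}, ticks=4))
assert events[0] == (2, "R3", "reactive")    # recovered as soon as detected
assert sum(1 for e in events if e[2] == "proactive") == 4
```

A real implementation must also bound how many replicas recover concurrently so that enough replicas remain available, which is the availability guarantee stated next.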
The proactive-reactive recovery approach thus guarantees the availability of the minimum number of system replicas necessary to sustain correct operation of the system.

Conclusion
Fault tolerance consists of two major components, failure detection and recovery, which must address issues such as speed, adaptivity, accuracy, completeness, confidence, and the ability to detect multiple faults. A reliable detector must detect all faults as early as possible without suspecting a working process or processor; at the same time, recovery must be fast and efficient.
Recovery time is reduced by high availability of log information and by starting recovery from the last checkpoint instead of a complete restart. Tolerating multiple faults while maintaining performance is a future trend in fault tolerance techniques: the performance of a multiple-fault tolerance algorithm depends on how well the algorithm can prevent further loss due to faults. One of many ways to decrease the probability of common vulnerabilities/faults in the replicas of intrusion-tolerant systems is to use diverse COTS software components.

References:
1) Intrusion Tolerance Via Network Layer Controls, Dick O'Brien, Rick Smith,