Grid Computing – The General View
The term “Grid” was chosen because the concept is seen as analogous to the electrical power grid, which offers consistent, dependable, and transparent access to electricity regardless of its source. Grid computing, a relatively new approach to network computing, is also known as metacomputing, scalable computing, global computing, and Internet computing (Baker, 2002).
The concept of grid computing was first explored in the 1995 I-WAY experiment, in which high-speed networks were used to temporarily connect resources from 17 sites in North America. This experiment prompted a number of new research projects on grid computing. Early examples include the US National Science Foundation’s National Technology Grid, which serves universities, and NASA’s Information Power Grid (Foster, 2000).
The Emergence of Grid Computing
Here are the principles underpinning the emergence of grid (Baker, 2002):
1. Multiple administrative domains and autonomy. Grid resources are not only distributed across different locations but also owned by different organizations. This principle simply stresses that the autonomy of each resource owner must be honored.
2. Heterogeneity. A grid comprises a multiplicity of heterogeneous resources and is expected to support applications spanning a vast range of technologies.
3. Scalability. The number of resources integrated in a grid may grow enormously, and the grid should be able to handle that growth no matter how small it started. Grid applications must also be designed to accommodate the distance factor (bandwidth and latency concerns).
4. Dynamic. Resource failure is a common scenario in a grid: with so many resources, it is likely that some of them will fail. Effective and efficient management of resources and applications must therefore be in place.
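The fourth principle can be illustrated with a minimal sketch of failover: a job is tried on one resource after another until one succeeds. The function names, the resource names, and the simulated failure set are all hypothetical and stand in for real remote execution.

```python
class ResourceFailure(Exception):
    """Raised when a (simulated) grid resource fails while running a job."""


def run_on(resource, job, failing):
    # Hypothetical stand-in for remote execution: resources listed in
    # `failing` always fail, all others return a result string.
    if resource in failing:
        raise ResourceFailure(resource)
    return f"{job} done on {resource}"


def submit_with_failover(job, resources, failing=frozenset()):
    """Try each resource in turn until one completes the job."""
    errors = []
    for resource in resources:
        try:
            return run_on(resource, job, failing)
        except ResourceFailure as exc:
            errors.append(str(exc))  # remember the failure, try the next one
    raise RuntimeError(f"all resources failed: {errors}")


if __name__ == "__main__":
    print(submit_with_failover("job-1", ["node-a", "node-b"], failing={"node-a"}))
```

A real grid scheduler would also have to detect failures that happen silently, which this sketch does not attempt.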
More than just being a computing infrastructure, grid is a technology that has the power to unify diverse and distributed resources (Baker, 2002).
Figure 1 illustrates the basic grid topology.
Figure 1. The Basic Grid Topology
Grid Computing – Its Worth
Figure 2 graphically summarizes the worth of engaging in grid computing systems.
Figure 2. Graphical Summary of Worth of Engaging Into Grid Computing Systems
Types of Problem Grid Computing Addresses
Grids are categorized by the type of solution they are intended to provide.
A computational grid is the type that relies on high-capacity servers for great computing power. It is defined as a conglomerate of computers and information located at different sites across multiple administrative domains, intended to give authorized users easy access to sharable information (Humphrey and Thompson, 2004). For desktop machines, the scavenging grid is the most widely used type: machines are scavenged for available resources, such as CPU cycles.
A data grid, on the other hand, provides access to data across a wide range of organizations, where users do not actually know the origin of the data; their main concern is access to the data pool. The data grid allows users to share and manage data across organizations and to manage security concerns such as who can access what. Most grid environments support a so-called information service that lets users locate resources (Yang, S. et al., 2006).
Grid and utility computing enable distributed computing within a pre-defined range in a grid environment. The grid community is dominated by Europe and North America. Commercial application of this technology is still very limited in Korea, but after several grid business seminars in the country, large IT companies such as Korea Telecom and Samsung Networks are now seriously considering investing in grid technology (GGF13 – The thirteenth Global Grid Forum, 2005).
Uses of Grid Computing
This section discusses the typical uses of grid computing. Grid computing links computers together, much as a power grid links generators, in order to tap the spare capacity of the linked systems. This means users can share databases and computing power over the web. Originally, grid computing aimed to improve the utilization of computing resources and their scalability and availability. It was conceived to harness the power of interconnected processing machines and their accompanying storage, without the mediation of a network. In turn, it offers capability comparable to that of a powerful computer at a far lower cost. Most grids use identical processing machines running identically configured operating systems. This uniformity helps to make security management easier.
A grid can be used to access practically any type of resource, such as computation and storage. The basic facilities needed for grid computing include: (1) security (secure data transfer, authentication, and authorization), (2) resource management (remote job allocation and submission), (3) data management (secure and robust movement of data), and (4) information services (an inventory of available resources) (Yang, S. et al., 2006).
From a layman’s view, a grid can be seen as a largely integrated computational and collaborative environment that provides seamless connection among the entities in that environment. A grid can provide the end-user with any of the following services (Baker, 2002):
1. Computational services. These services are concerned with providing ways to let the user execute application jobs on computational resources that are distributed either individually or collectively. This type of service ensures the application job is securely executed.
2. Data services. These services are about secure access to, and management of, datasets. The grid is responsible for providing scalable access to data and scalable storage of data. To do so, data may be catalogued, or different datasets may be stored in different locations. In this way, the grid can create an illusion of mass storage.
3. Application services. These cover the provision for remote and transparent access to software and libraries.
4. Information services. These cover the retrieval and presentation of meaningful data, by using all the other service types of grid.
5. Knowledge services. Covered here are knowledge acquisition, retrieval, maintenance, and usage.
Grid computing first made its name in the scientific and academic communities. It is often associated or confused with another distributed computing model called peer-to-peer computing. Grid computing drives the development of technologies like utility computing, in which applications, business processes, and infrastructure are delivered over the Internet for a fee via a secured, scalable, and shared computing environment. The main goal is to provide customers with the needed processes, infrastructure, and applications on demand, with customers automatically charged for their consumption of the utility (Marchesini, 2005).
The centralized database server model in Figure 3 illustrates how a grid can act as a central force among the different, independent systems within it.
Figure 3. Centralized Database Server Model Within the Grid
Technologies Used To Set Up a Grid
The Hardware Components
The grid computer is one of the major components behind grid computing. A grid computer is a cluster of multiple computers of the same class. It is part of a network in which it shares storage, printers, memory, and other resources, and a sophisticated operating system is in charge of load sharing and computing processes. Machines in a grid configuration cost less than 10% of the price of a supercomputer, and grid computers can be built from low-end computers. Computing power in the grid is increased by adding more members. On the other hand, using microcomputers as part of a grid lessens the computing power. A grid can be viewed as if billions of people were doing a computation together, which makes it faster than when it is done individually (Grid Computers, 2007). Figure 4 describes the different layers that compose a grid topology.
Figure 4. Different Layers That Compose a Grid Topology
So, what are the resources needed to form a grid?
1. Grid fabric. All the globally distributed resources (computers, operating systems, databases, storage disk, etc.) accessible over the Internet make up grid fabric.
2. Core Grid middleware. This is responsible for managing remote access in the grid, allocation of resources, overall aspects of Quality of Service, and storage access.
3. User-level Grid middleware. This category comprises the application development environments and programming tools.
4. Grid applications and portals. Grid portals offer Web-enabled services, where the end-user can submit information or transaction and where the results can be collected from remote terminals via the Internet.
Some software is designed to act as infrastructure supporting computational grids, with the aim of promoting easy and secure access to resources across the grid. One common example is the Globus Toolkit. Software like Globus allows users to access the targeted resources regardless of their physical location and ownership. Without a comprehensive understanding of the grid, however, both the administrator who contributes resources and the grid user may be exposed to grave security compromise. With grid computing coming into wide use in the enterprise world, understanding these repercussions is imperative, and it is high time to promote awareness among grid users as grid usage becomes commonplace. All major grid projects are built on the Globus Toolkit (Foster, 2000).
The Globus Project
The Globus project is the outcome of a joint effort by researchers, developers, analysts, and IT experts around the globe to understand and secure grid computing. The Globus Toolkit is one of the most widely used toolkits for creating, maintaining, and managing a grid setup or environment. It provides the framework needed for a secure, robust, and reliable infrastructure. The Globus Toolkit provides a security model for grid computing called the Grid Security Infrastructure (GSI). Globus focuses on single sign-on for users, protection of user identities, and interoperability with other security solutions in the local setup (Ferrari, 1999).
GSI uses SSLv3 as its primary means of authentication. The process by which SSLv3 is used in GSI is as follows (Ferrari, 1999):
1. Each entity generates a public/private key pair. The public key and other pertinent information are then combined into a certificate request.
2. At least one certificate authority (CA) is created in the network (more often, there will be more than one), each having its own private/public key pair.
3. The CA’s private key is used to sign other entities’ requests, which certifies the requesting entity. The signature vouches that all the information stored in the certificate has been validated as correct and that the public key really belongs to that entity.
4. Each entity in the grid (a computer or other type of resource, including the user himself) should have a certificate signed by a certificate authority. Before signing the certificate request, the CA requires proof of the information in it.
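The four steps above can be sketched with textbook RSA in plain Python. This is a toy illustration only: the key sizes are far too small to be secure, the certificate request is a made-up string format rather than a real X.509 structure, and none of the function names come from GSI or the Globus Toolkit.

```python
import hashlib


def make_keypair(p, q, e=65537):
    """Build a textbook RSA key pair from two primes (toy sizes only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (Python 3.8+)
    return (n, e), (n, d)  # (public key, private key)


def digest(data, n):
    # Hash the data and reduce it into the range of the RSA modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(private_key, data):
    n, d = private_key
    return pow(digest(data, n), d, n)


def verify(public_key, data, signature):
    n, e = public_key
    return pow(signature, e, n) == digest(data, n)


def certificate_request(subject, public_key):
    # Step 1: combine the public key with identifying information.
    # The string layout is an assumption made for this sketch.
    n, e = public_key
    return f"subject={subject};n={n};e={e}".encode()


# Step 2: the certificate authority has its own key pair.
ca_public, ca_private = make_keypair(2**31 - 1, 2**61 - 1)

# Step 1: an entity generates a key pair and builds a request.
user_public, user_private = make_keypair(2**17 - 1, 2**19 - 1)
request = certificate_request("/O=Grid/CN=alice", user_public)

# Steps 3 and 4: the CA signs the request; the pair forms the certificate.
certificate = (request, sign(ca_private, request))

if __name__ == "__main__":
    print("certificate valid:", verify(ca_public, *certificate))
```

Any party holding only the CA’s public key can check the certificate, which is what lets grid entities in different administrative domains trust one another without sharing secrets.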
Another standard now evolving in the world of grid computing is the Open Grid Services Architecture (OGSA), for which Globus Toolkit version 3 is the reference implementation. OGSA has gathered much industry support and is considered to have defined the standard for the overall services that need to be provided in grid environments (Finkelstein, 2004).
For largely distributed information, Public Key Infrastructure (PKI) promises secure information services. Some applications, like S/MIME, use keys for message encryption. Client-side SSL, on the other hand, uses keys for authentication. The bottom line is that all of these applications make use of key pairs (Lorch, 2004).
Some users keep their key on a hardware device such as a USB token or smart card, while others store it directly on the hard disk, as in the case of a browser or system keystore. Cryptographic Service Providers and keystores are features of modern operating systems such as Windows and Mac OS. Most cross-platform software systems nevertheless add their own keystore to enhance portability, so that they do not have to rely on a particular operating system to use a user’s key pair.
Keystores can be categorized into four types (Foster, 2002):
Software token. It stores the keys on disk in some form of encrypted format.
Hardware token. It stores the keys and performs key operations. There are cases where the application directly interacts with the device. There are also instances where the operating systems mediate between the two.
Secure coprocessor. It stores the key, uses cryptographic hardware to perform key operations internally, and in some cases, directly houses the application.
Credential repository. It stores private keys for various users.
The grid computing community came up with a credential repository called MyProxy, designed to provide both security and mobility to users. This makes use of the IBM 4758 for storing keys and carrying out cryptographic operations. MyProxy was originally designed to allow grid users to obtain their credentials, and to delegate access to them, from diverse locations in the grid. There are two main reasons for embracing MyProxy. The first is that it gets private keys completely off the desktop, which shrinks the trusted computing base. When the user or a process needs the credentials, the trusted computing base expands through delegation to include the desktop temporarily. The second is its support for mobility: with a central repository for private keys, the credentials can be accessed from anywhere within the grid without any physical transfer (Dyer, 2001).
SHEMP is a keystore solution built on MyProxy, secure hardware, and a set of policy tools. SHEMP takes advantage of Proxy Certificates for applications that require more than mere authentication. It uses the Extensible Access Control Markup Language, which allows users to define their key usage options based on the properties of both the client and the repository. This solution addresses the unsuitability of modern desktops as public key infrastructure clients, which leads to the following related issues (Marchesini, 2005):
There is a large probability that private keys will be stolen.
There is a possibility that keys will be used for transactions the user is not actually aware of.
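The delegation idea can be sketched as follows. This is not the MyProxy protocol: the real system issues X.509 proxy certificates, whereas this toy uses an HMAC tag as a stand-in, and the class and method names are hypothetical. What it does show is that the long-term secret stays in the repository while the desktop only ever receives a short-lived credential.

```python
import hashlib
import hmac
import secrets


class CredentialRepository:
    """Toy repository that keeps long-term secrets and hands out proxies."""

    def __init__(self):
        self._store = {}  # user -> (salt, passphrase hash, long-term secret)

    @staticmethod
    def _hash(passphrase, salt):
        return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 10_000)

    def _tag(self, secret, user, expires):
        message = f"{user}:{expires}".encode()
        return hmac.new(secret, message, hashlib.sha256).hexdigest()

    def deposit(self, user, passphrase, long_term_secret):
        salt = secrets.token_bytes(16)
        self._store[user] = (salt, self._hash(passphrase, salt), long_term_secret)

    def delegate(self, user, passphrase, now, lifetime_s=3600):
        """Return a short-lived proxy credential; the secret never leaves."""
        salt, stored, secret = self._store[user]
        if not hmac.compare_digest(stored, self._hash(passphrase, salt)):
            raise PermissionError("wrong passphrase")
        expires = now + lifetime_s
        return {"user": user, "expires": expires,
                "tag": self._tag(secret, user, expires)}

    def is_valid(self, proxy, now):
        entry = self._store.get(proxy["user"])
        if entry is None or now >= proxy["expires"]:
            return False
        expected = self._tag(entry[2], proxy["user"], proxy["expires"])
        return hmac.compare_digest(expected, proxy["tag"])
```

Because the proxy expires, a stolen proxy is useful to an attacker only for a limited time, which is the sense in which the trusted computing base expands to the desktop only temporarily.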
SHEMP is built on the following foundations (Marchesini, 2005):
MyProxy and Proxy Certificates. These are used to get keys away from the desktop, thus promoting mobility for users. Proxy Certificates are currently widely used in grid deployments.
Secure Hardware. This functions as a basic keystore on both the repository side and the client side. It allows the trusted computing base of each machine to shrink, thus reducing the risk of disclosing keys.
Policy Language. This is used to express key usage and delegation policies, as well as the attributes of the repository and the client.
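The role of the policy language can be sketched in a few lines. The rule layout below is a plain Python structure invented for illustration; it is not XACML syntax, and the attribute name `secure_hardware` is an assumption.

```python
def is_allowed(policy, operation, client_attrs, repo_attrs):
    """Return True if some rule permits `operation` for these attributes."""
    for rule in policy:
        if operation not in rule["operations"]:
            continue
        client_ok = all(client_attrs.get(k) == v
                        for k, v in rule["client"].items())
        repo_ok = all(repo_attrs.get(k) == v
                      for k, v in rule["repository"].items())
        if client_ok and repo_ok:
            return True
    return False  # deny by default


# Hypothetical key usage policy: authentication is allowed from any client
# as long as the repository has secure hardware; signing and decryption
# additionally require secure hardware on the client.
policy = [
    {"operations": {"authenticate"},
     "client": {},
     "repository": {"secure_hardware": True}},
    {"operations": {"sign", "decrypt"},
     "client": {"secure_hardware": True},
     "repository": {"secure_hardware": True}},
]

if __name__ == "__main__":
    plain_desktop = {"secure_hardware": False}
    repository = {"secure_hardware": True}
    print(is_allowed(policy, "authenticate", plain_desktop, repository))
    print(is_allowed(policy, "sign", plain_desktop, repository))
```

Denying by default means a key is never used for an operation the user has not explicitly permitted, which matches the second issue listed above.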
Commercial Software Used for Virtualisation Architecture
One promising solution for sharing grid compute resources on a single machine is a virtualization architecture. A number of commercial software products can be used for this purpose, such as VMware GSX Server and Microsoft Virtual PC. An open source alternative is the Xen hypervisor, developed at the University of Cambridge, United Kingdom, whose use the grid community is now exploring. Xen is capable of providing independent and secure virtual machines. With Xen, the master operating system instance is the only instance that has access to the resource provider’s hard disk, and only the system administrator has access to this master instance. Xen makes it possible for a solutions producer to deploy not only an application and its required input data to the resource provider’s node, but also a complete operating system installation along with the needed shared libraries, external software, and the like (Smith, 1999).
This shows that, with virtualization, the solutions producer takes over a number of administration and configuration tasks conventionally handled by the resource provider. The system administrator’s tasks are trimmed down to providing systems that run Xen (along with the master operating system instance) and network connectivity. Consequently, solutions producers have the administrative privileges that allow them to freely define and provide the needed software architecture, without having to find destination systems in the grid that already offer it. Providing the complete Xen operating system instance likewise gives the solutions producer the opportunity to do a complete system installation without incurring additional operating system license costs (Kim, 2004).
Setting up a Xen virtual environment involves a series of activities. First, the solutions producer needs to analyze the full requirements of his grid application, including third-party libraries, the runtime environment, and complex third-party applications. Although this may seem a complex task, most Linux distributions with package management already provide a way to resolve these requirements. A prerequisite to deploying a system on a Xen-based virtual environment is the creation of a customized, minimized Linux installation, which can be done by assembling all the required packages. Since the system will be deployed over the Internet, it should be small in size. The image needs no operating system kernel of its own, because the kernel is provided by the Xen system for reasons of compatibility and security. Requiring this predefined kernel, provided by the administrator of the master operating system, adds to the level of security, especially if the Xen hypervisor comes under attack. Next comes the customization of the Linux file system images, followed by the generation of the configuration file. The job-specific file system images are then ready for deployment. Within the grid, a system that fits the requirements stated in the configuration file is selected, and the whole file system image is transferred to this system for execution. Initialization of the grid job on the destination system follows (configuration, creation, and booting of the so-called XenU instance). The grid job is then executed and returns the calculated result via the Internet (Smith, 1999).
In the Xen setup and procedure just described, the resource provider is responsible for providing the physical components for running the job. The resource provider installs the Xen hypervisor software together with the Linux OS instance (this instance is under the total and sole control of the resource provider). This arrangement makes it easy for the resource provider to keep ultimate control over the usage of the resource, including network connectivity. The application that the solutions producer loads into this system is distributed solely to designated users. The user tests the solution on local computers, where he has the option to customize the Linux image to augment the generic Linux image supplied with the system. Of course, customization requires compliance with the dependencies of the generic Linux image and with the requirements initially set by the application (Smith, 2004).
With Xen, authorization does not really differ from how it works in traditional grid systems. Since the only resources the application needs to access are CPU cycles, main memory, and disk storage, users experience job execution as if it were running on their own systems. Xen’s sandboxing is in charge of restricting the CPU cycles and disk space consumed by a XenU instance. The firewall and packet filter in the Xen domain can constrain network connectivity, with the aim of protecting the system from illegitimate passwords and/or certificates. Since the data and resources of different users reside in separate domains, these entities are protected from one another. The resource provider can always destroy an attacking instance while keeping sole control of both the hardware and resource allocation across the whole system (Smith, 1999).
Auditing in a Xen system is easier and far more organized, since every piece of information passing through a domain passes through the hypervisor, which can record it and, if need be, intercept it. One perceived weakness is that the hypervisor can only record information about network packets, the data sent and received, and disk read/write operations.
Confidentiality is more easily attainable with Xen, since the operating system installed in each instance is exclusive to a particular user. This user has exclusive access to the corresponding virtual disk image. This ensures that each solution producer is protected from other solution producers, and that each user is likewise protected from the solutions used by different users in the grid.
Since users have practically no access to the operating system instance or to other users’ instances, the solution producer is protected from potentially malicious users.
Several past studies have also examined the performance of different virtualization solutions, including Xen itself. Drawbacks can arise in runtime performance and in the cost of deploying the sandbox image.
There is room for improvement and optimization in the Xen system. For classes of applications requiring similar configurations, the system administrator can provide pre-configured operating system images. Another improvement may be to provide generic operating system images through a peer-to-peer system, in which images are stored on grid nodes in a replicated and distributed manner, thereby helping to manage network traffic during image transfer (Volbrecht, 2000).
Given this setup, both the solution producer and the user can remotely check the integrity of the remote system, with a view to ensuring that the application has not been altered and that data is secure. One way of making system hacking difficult for attackers is to implement hardware-level security measures, since attacking hardware is more difficult than attacking software.
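The selection step, in which a system fitting the requirements in the configuration file is chosen, can be sketched as a simple match. The field names (`memory_mb`, `disk_mb`, `runs_xen`, and so on) are assumptions made for this illustration and do not correspond to actual Xen configuration keys.

```python
def select_node(job_config, nodes):
    """Return the name of the first node that meets every requirement."""
    for node in nodes:
        if (node["runs_xen"]
                and node["free_memory_mb"] >= job_config["memory_mb"]
                and node["free_disk_mb"] >= job_config["disk_mb"]):
            return node["name"]
    return None  # no suitable system in the grid right now


# Hypothetical requirements taken from a job's configuration file.
job_config = {"memory_mb": 512, "disk_mb": 2048}

# Hypothetical view of the resources currently offered in the grid.
nodes = [
    {"name": "node-a", "runs_xen": False, "free_memory_mb": 4096, "free_disk_mb": 9000},
    {"name": "node-b", "runs_xen": True, "free_memory_mb": 256, "free_disk_mb": 9000},
    {"name": "node-c", "runs_xen": True, "free_memory_mb": 1024, "free_disk_mb": 4096},
]

if __name__ == "__main__":
    print(select_node(job_config, nodes))
```

Once a node is selected, the file system image would be transferred to it and the XenU instance booted; those steps depend on the deployment tooling and are not shown.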
The Efforts of Concerning Institutions
Architectural Engines for Information Security (AEGIS) came up with a systematic approach to classifying and dealing with the security functionality of both hardware and software, applicable to a broad range of computing systems. The advance that AEGIS proposes is that it addresses hardware and software functionality at the same time. Some of the problems that AEGIS can solve are (AEGIS: Architectural EnGines for Information Security (AEGIS Publications), 2007):
Tamper-evident environment. The system provides authentication within the computing system such that software and hardware tampering is caught at once.
Private tamper-resistant environment. The system provides a private and authenticated environment in which attackers cannot obtain a single piece of information about the software they are tampering with. Cryptography is secure in this environment, and each software and hardware medium is cryptographically protected and thereby copy-protected.
To what extent, then, is AEGIS responsible for security in the computing environment? AEGIS covers the privacy and integrity of both the application and the information. It does not cover attacks that completely destroy the operating system.
What else does AEGIS offer grid computing? Both the tamper-resistant and the tamper-evident environments that AEGIS offers enable large-scale grid computing on a multitasking server setup. In this environment, computation is guaranteed to process data correctly and privately. In a private tamper-resistant environment, applications can run on a computer that is regarded as trusted third-party hardware.
There are four mechanisms in an AEGIS platform; each of which helps the designer of the system to come up with the correct partitioning (hardware and software) with assured security (AEGIS: Architectural EnGines for Information Security (AEGIS Publications), 2007).
1. Physical Random Function. This allows a secret to be shared with a hardware device in a tamper-resistant manner.
2. Certified Execution. This assures the user or client that the computation was carried out on a system architecture that guarantees correct execution.
3. Secure virtual machine. Each user is provided with a virtual machine in which all processes are secure, because each process is cryptographically protected from the other processes in the system.
4. Integrity verification. Untrusted hardware components (hard disk, memory, etc.) are subject to integrity checking, using the integrity verification algorithm.
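The source does not spell out the integrity verification algorithm. One common technique for checking untrusted storage is a hash tree (Merkle tree), sketched below under that assumption: a trusted component keeps only the root hash, and any change to a stored block changes the recomputed root. The block contents are made-up sample data.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Compute the root hash over a non-empty list of byte blocks."""
    level = [_h(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last hash on odd-sized levels
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def is_intact(blocks, trusted_root):
    """Check untrusted blocks against a root kept in trusted storage."""
    return merkle_root(blocks) == trusted_root


if __name__ == "__main__":
    memory = [b"block-0", b"block-1", b"block-2"]
    root = merkle_root(memory)   # stored inside the trusted component
    memory[1] = b"tampered"      # attacker modifies untrusted memory
    print("intact:", is_intact(memory, root))
```

A full hash-tree scheme would verify a single block using only the hashes along its path to the root; recomputing the whole tree, as done here, keeps the sketch short.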
Sun Microsystems says that grid is the future of computing in the IT world. The Sun Grid Compute Utility is described as the world’s first compute utility. It offers many features, such as accessibility, affordability, power, and ease of use. Above all these is security: Sun Grid has a multi-layered defense at every level (Sun Utility Computing, 2007).
It is important that basic principles in grid computing be followed to attain the aspired results, such as (Foster, 2002):
1. It does not interfere with the existing autonomy.
2. It does not compromise the security of any aspects of grid.
3. There is no need to discard the existing network protocols, operating systems, and services and replace them with new ones just to fit into the grid.
4. Remote sites should be free to decide when to join and when to leave the grid.
5. It should not dictate the programming language to be used, tools, or libraries.
6. The infrastructure should be designed and built with no single point of failure.
7. It should have full support to heterogeneous environment.
8. It should be able to interact with legacy systems.
9. It should use standard and existing technologies.
10. It should provide appropriate synchronization of application programs within the grid.
A grid involves the creation of large-scale infrastructure, which requires the definition and acceptance of standard protocols. Although there are no formally accepted processes yet in grid computing, there is a pattern in the way it is handled in the core technologies. In a grid environment, it is very likely that multiple users and multiple solutions producers will have access to the same resource environment at the same time. There should be provision for real-time access to resources without the resource provider having to monitor access and resources personally, and this gives rise to security threats. These threats come from the increased number of concurrent participants accessing the same resources, the differences in their usage models, and complex interactive behavior not seen in individual transactions (Finkelstein, 2004).
There are various ways in which a user can access the information shared through a computational grid. Each access type has distinct security requirements and implications for both the user of the resource and the resource provider. In general, authorization, authentication, confidentiality, and integrity are the security issues accompanying the deployment of a computational grid. In the grid environment, the user accesses a virtual computer composed of a number of diverse computing resources that are not visible to the user. This setup requires a secure, robust, and reliable infrastructure to be built. Security is therefore the major requirement for grid computing (Humphrey and Thompson, 2004).
Grids are protected from external attacks by basically the same set of tools that enterprise networks use: configuration management, firewalls, and authenticated access, among others. However, due to the nature of the interaction between hosts and the grid, a higher level of security is needed. Grid hosts must ensure that nothing is left over from previous users; that is, all code left by previous users must be removed (Humphrey and Thompson, 2004).
Gaining access to a grid system requires obtaining a certificate from a Certificate Authority and filling in request forms. The applicant must submit some form of valid identification, such as a company ID or passport. Processing of the request commences once all requirements have been submitted, and qualified applicants are given a digitally signed certificate. Lax as this process may seem in terms of security, it provides a means of controlling who can access shared resources across the grid and of establishing who is responsible for any malicious attempts. Firewalls and secure communication links form part of the standard security practices that convince customers to entrust their software and data to the grid (Marchesini, 2005).
Authentication ensures that access is granted only to valid users. This can be done through the use of passwords. In the grid computing world, however, certificates are now more widely used.
Authorization is the granting of due privileges on resources to the entities authorized to use them. The current best practices for this are secure file systems and audit trails with secure and reliable logging. Delegation is defined as allowing an entity to impersonate the real user in order to carry out operations on the user’s behalf. The current best practice here is the use of audit trails with secure and reliable logging. Secure communication pertains to protecting the content and privacy of information. The best practice for attaining this is to send encrypted content through an anonymous communications channel.
Confidentiality pertains to ensuring that information is not in any way read or accessed without due authorization. The current best practices for attaining this include strong authentication, encrypted storage, and secure communications.
Data availability means ensuring that there is a backup of the data in case of short- or long-term failure, for example on the side of the server or the network links. The current best practices are operating system versioning mechanisms, operating system journaling, advanced hardware storage solutions, availability of spare storage for job execution, and keeping users’ input/output data in archive storage. Auditing means tracking the state of the system for later analysis, usually forensic. The current best practice for this is adding confidentiality to audit logs through cryptography and adding reliability to audit logs through windowing connection-oriented protocols (Smith, et al, 2004).
Access Control Within the Grid Systems
Access to resources can be monitored by the system administrator, the grid administrator, or the stakeholder himself (for his own resource only). The information gathered from monitoring can be used for security monitoring and for up-to-date intrusion detection. Usually, both allowed and denied accesses are tracked. To track all types of access, the resource gateway must keep logs with information such as user identification and time of access. Stakeholders need access to the logs for their own resources; whether others may view the logs is up to the administrator. In addition to keeping logs, the resource gateway has to recognize questionable or troublesome access requests to a particular resource and notify the stakeholder (Zsolt, 2003).
Before running any application on a particular host or machine, the user needs assurance that his proprietary software will not be compromised or exposed to theft. The user or the grid infrastructure itself may specify which users should have access to the information. Users of the applications should be careful to check which security services are available and how to invoke them. Security is non-negotiable, and everyone involved in and affected by breaches should be aware and cautious: the grid application developer, the grid users, and the grid resource provider. Grid users should know the security implications of interacting with the components of a grid. The grid application developer should know the best practices for promoting and maintaining security in the grid. The grid resource provider needs to know the implications of running grid programs on the local machine and of the interaction among users and other computers in the grid. Both services and applications must support configuring security on a per-user basis and apply it accordingly. If security issues are considered from the design phase, one can avoid the redesign and rework that may otherwise arise midway through a large-scale implementation (Ferrari, 1999).
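A minimal sketch of such a resource gateway follows. The class, its fields, and the example user and resource names are hypothetical; the point is only that every attempt is logged with user and time, that denied attempts trigger a notification to the stakeholder, and that each stakeholder can read the log entries for his own resources.

```python
from datetime import datetime, timezone


class ResourceGateway:
    """Toy gateway that logs every access attempt, allowed or denied."""

    def __init__(self, acl, owners):
        self.acl = acl            # resource -> set of users allowed to use it
        self.owners = owners      # resource -> stakeholder who owns it
        self.log = []             # one entry per access attempt
        self.notifications = []   # (stakeholder, message) pairs

    def request(self, user, resource, when=None):
        when = when or datetime.now(timezone.utc)
        allowed = user in self.acl.get(resource, set())
        self.log.append({"user": user, "resource": resource,
                         "time": when, "allowed": allowed})
        if not allowed:
            # Treat every denied request as questionable in this sketch.
            owner = self.owners.get(resource)
            self.notifications.append((owner, f"denied: {user} -> {resource}"))
        return allowed

    def log_for(self, stakeholder):
        """Return only the log entries for the stakeholder's own resources."""
        return [entry for entry in self.log
                if self.owners.get(entry["resource"]) == stakeholder]


if __name__ == "__main__":
    gateway = ResourceGateway(acl={"cluster-1": {"alice"}},
                              owners={"cluster-1": "site-admin"})
    gateway.request("alice", "cluster-1")
    gateway.request("mallory", "cluster-1")
    print(len(gateway.log_for("site-admin")), gateway.notifications)
```

A production gateway would need a richer notion of a questionable request than a single denial, for example repeated failures from the same user.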
In general, the grid computing environment is assumed to have the following components (Humphrey and Thompson, 2004):
Each user possesses a grid-wide, unique identity that can be verified by all the other grid principals, regardless of the administrative domain.
Some local resource managers require a legacy local user identification for their resources, so the grid identification must be mapped to a local user identification under the control of the local administrator. The local resource manager enforces access control using the legacy access control mechanisms. There must be a convenient way for users to request access grants and for the stakeholder to grant them.
In a grid environment, it is unusual for all identities to be issued and verified by a single source. Each authentication source, therefore, needs to be the authentication server for the entities it is responsible for. The applications must be able to vouch for the credibility and capability of the authentication server in terms of the service it provides.
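The mapping from a grid-wide identity to a legacy local account can be pictured with a small lookup table kept by the local administrator. The sketch below is an assumption-laden illustration: the identity strings, issuer names, and local account names are all invented, and a real deployment would verify credentials cryptographically before any lookup.

```python
from typing import Dict, Optional, Set

# Hypothetical table maintained by the local administrator:
# grid-wide identity -> legacy local user name.
GRID_MAP: Dict[str, str] = {
    "/O=ExampleGrid/OU=Physics/CN=Jane Doe": "jdoe",
    "/O=ExampleGrid/OU=Biology/CN=Raj Patel": "rpatel",
}

# Hypothetical set of authentication sources this site chooses to trust.
TRUSTED_ISSUERS: Set[str] = {"ExampleGrid CA"}


def map_to_local_user(grid_identity: str, issuer: str) -> Optional[str]:
    """Return the local account for a grid identity, or None if unmapped."""
    # Identities may come from several sources; only trusted ones are mapped.
    if issuer not in TRUSTED_ISSUERS:
        return None
    # No entry means the stakeholder has not granted access yet.
    return GRID_MAP.get(grid_identity)
```

Once the local name is known, the local resource manager can apply its existing access control mechanisms unchanged.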
In grid computing, a user may wish to combine resources from multiple sites into a single coordinated job for immediate and faster execution. The user may specify the pool of resources from which to choose, or the super scheduler may do it for the user. Remote execution usually requires full access (read and write) to files at remote sites. When the user has not yet selected the set of hosts, the super scheduler has to interact with the grid’s information systems component to identify the possible hosts. The super scheduler is responsible for determining whether the user is allowed to execute the program on the target grid machine and for assessing the user’s remaining allocation. This can be done by directly querying each grid machine in question. Each remote job then requests resources from the super scheduler on behalf of the user. Before any program is executed on a host, the user and the grid gateway on that host must authenticate each other. The grid gateway on the specified host maps the grid identification to a local identification and submits the request to the resource gateway to allow the job to run locally, where the executing job may be granted read and write access to remote files in the user’s stead. The super scheduler, the remote jobs that need to read or write files, and the controlling agent should all be capable of acting as the user, since these components act on the user’s behalf. Machine availability is queried and handled by the super scheduler. Detailed information about a machine is regarded as confidential, so arbitrary entities are generally not allowed to inquire about its availability and account information.
The super scheduler is therefore granted unlimited access to such information, on the expectation that it will not divulge any of it to unauthorized users. It is the grid gateway server that authorizes the use of the target machine. It is also the grid gateway that defines the start of the job, especially where a sequence of processes is involved that all run on different hosts and domains. Each entity must present the needed passes to be granted the same rights as the user (Volbrecht, 2000).
Transactions that require a large flow of data in real time require advance reservation of network bandwidth, data storage, and other resources. This advance reservation requires granting the super scheduler and the bandwidth broker the same privileges as the user. If the reservation is granted, the user needs assurance that it will be available at the moment it is needed or claimed. For a bandwidth reservation, the bandwidth broker needs assurance that the connection is coming from an authorized site. In claiming the reservation, the user must be able to identify himself as the valid and authorized user. If the original requester is a group, the user must be able to prove that he is a legitimate member of the group. If the requester authorizes another entity to claim the reservation, there needs to be a transfer of claim tickets from the requester to the designated claimer (Volbrecht, 2000).
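One way to picture a claim ticket is as a small record, naming the reservation and the designated claimer, that the broker authenticates with a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library as a stand-in. The ticket layout, the function names, and the use of a broker-held secret are assumptions for illustration, not the mechanism described by Volbrecht.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_ticket(secret: bytes, reservation_id: str, claimer: str,
                 expires_at: float) -> dict:
    """Broker side: bind a reservation to the entity allowed to claim it."""
    body = {"reservation": reservation_id, "claimer": claimer,
            "expires_at": expires_at}
    payload = json.dumps(body, sort_keys=True).encode()
    tag = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_claim(secret: bytes, ticket: dict, presenter: str,
                 now: Optional[float] = None) -> bool:
    """Broker side: accept the claim only from the named, unexpired claimer."""
    now = time.time() if now is None else now
    payload = json.dumps(ticket["body"], sort_keys=True).encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison guards against tampered ticket contents.
    if not hmac.compare_digest(expected, ticket["tag"]):
        return False
    body = ticket["body"]
    return body["claimer"] == presenter and now <= body["expires_at"]
```

Transferring the claim to another entity would, under this sketch, mean having the broker issue a new ticket that names the designated claimer; the presenter's identity itself would still need to be authenticated separately.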
Job Control Within the Grid Systems
Two important requirements in grid computing are group membership and non-repudiation. No single strict way has been established for defining a group. Non-repudiation refers to the requirement that the resource gateway will not arbitrarily deny a reservation it has granted (Volbrecht, 2000).
Total control over the job is one of the requirements of the user. That is, the user may disconnect from a long-running remote job and then reconnect at a later time or from a different location, when the user wants to check on the status of the job or inject some steering information somewhere along its execution. Steering is the process wherein the user defines an entry point and a list of users who can access the job from that point.
Another aspect of job control arises when a job appears to the systems administrator to be out of control. In this case, the job needs to be forcefully terminated. This requires the systems administrator to detect the out-of-control process and trace its origin to the grid user concerned. Alternatively, the grid monitoring software may inform the systems administrator of the out-of-control process. Termination begins with informing the grid administrators of the planned termination. The grid administrator then coordinates the job termination across all sites in the grid. The grid administrator has two ways to terminate the out-of-control job. One is to interact directly with the job to terminate its individual components, and the other is to ask the systems administrators to terminate the particular processes residing on their machines. It is the responsibility of the grid administrator to inform the owner of the job of its termination. The job is considered a resource. Both the job owner and the systems administrator have full access rights to this resource by default. Because both local and grid users access the resource, the origin of a process is not immediately obvious. This is where the grid software comes in. The grid software is responsible for keeping track of the job and providing information that associates the job with the grid user who started it. Since the job will span different domains in the grid, more than one person will be involved in terminating the grid computation in the case of forced termination. It is imperative, and a matter of courtesy, to inform the user of the job termination (Smith, 1999).
It is typical for a grid architecture to have an information system that serves as a repository of essential information about the jobs in the grid, such as location, status, and availability. Service providers want to ensure secure access to the services they offer, and thus the following apply to users who try to access, query, or update the service information in the server (Zsolt, 2003):
Authentication of the user
Application of access control policy as may be defined by the provider
Integrity of message (in case of message publication)
Where multiple service directories provide basically the same information, the user may require server authentication to validate the integrity of the information. This form of mutual authentication serves to prevent malicious information from being treated as coming from a valid source.
The parameters set to ensure confidentiality of information involve the encryption method and a key agreement or proper key management protocol. There needs to be a mechanism in place that allows negotiation of confidentiality parameters between the user and the service. If the encrypted data need to stay longer in the server, there needs to be a provision for the encryption keys to be kept as long as needed, and the keys are expected to be protected all the while. The grid administrator is responsible for disseminating the key management requirements to users, and since the keys are expected to cross multiple domains across the grid, managing them will be a challenge for the users (Zsolt, 2003).
For the communication layer, grid systems rely on public-switched networks. The performance of the network can affect the efficiency of the applications in the grid. Virtual private network (VPN) is a generic term referring to the capability of public and private networks to support a communication infrastructure that connects physically distant locations, such that users at the various locations feel as if they are accessing a private network (Andreozzi, et al., 2004).
A more detailed view of how the grid is set up (the different layers) in relation to the local environment (local database and local application) can be seen in Figure 5.
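The negotiation of confidentiality parameters mentioned above can be sketched as picking the first option the user prefers that the service also supports, together with a check that the keys outlive the stored data. The function names are hypothetical, and the algorithm names in the usage are used purely as labels; no actual encryption is performed.

```python
from typing import Iterable, Optional, Set


def negotiate(client_preferences: Iterable[str],
              server_supported: Set[str]) -> Optional[str]:
    """Return the first client-preferred option the server supports."""
    for option in client_preferences:
        if option in server_supported:
            return option
    # No common option: the two sides cannot agree on parameters.
    return None


def key_lifetime_sufficient(key_lifetime_days: int,
                            data_retention_days: int) -> bool:
    """Keys must remain available at least as long as the encrypted data."""
    return key_lifetime_days >= data_retention_days


# Example usage with illustrative labels only.
chosen = negotiate(["AES-256-GCM", "AES-128-GCM"], {"AES-128-GCM", "3DES"})
```

Here `chosen` is `"AES-128-GCM"`, the only option common to both lists. The same pattern would apply to negotiating the key agreement protocol.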
Figure 5. The Grid Setup in Relation to the Local Environment
Attacks in Grid Computing
Internal attacks are those committed by people or software having valid access to the systems. External attacks, on the other hand, are those made by people or software that do not have legitimate access to the systems and instead break into them. Internal attacks are more common in enterprise networks than external attacks. However, with the constantly changing nature of computing in cluster environments and the emergence of the grid computing paradigm, external attacks are becoming more and more possible. Clusters that once ran in proprietary environments are now being opened up on open or standard systems, which exposes them to public networks.
External attackers can then probe the resources made public for vulnerabilities. In grid computing, these open clusters are typically connected with nodes accessible via the Internet, which exposes the clusters to the numerous attack tools available over the Internet. The number of legitimate solution producers also increases, thus increasing the possibility of internal attacks. The growing number of legitimate solution producers also contributes to an increase in external attacks, for two reasons: (1) as the number of legitimate activities or processes increases, detecting the illegitimate ones becomes a challenge, particularly where attackers impersonate a legitimate user and disguise their actions as legitimate activity; and (2) as the number of solution producers increases, so does the amount of third-party code, which carries potential attack vectors with it (Finkelstein, 2004).
Combination of Internal and External Attacks
When internal and external attackers connive, the attack is very difficult to defeat. Internal attackers have ways to give external attackers access, and they even have the power to cover the traces of these external attackers.
A large number of grids and clusters now run source code from external (third-party) software providers, and commercial and practical considerations make regular and thorough auditing of this ‘foreign’ code infeasible. As a result, assured and secure hardware and software components must interact with non-secure ones. This is a major departure from the traditional cluster, which runs source code from a controlled base (Dyer, 2001).
In one of the security studies made by IBM back in the 80s, one visible trend in the hacking world was that hackers focus on attacking the clients or the servers. This may be attributable to the misconfiguration of services that are not efficiently designed and implemented. In the years before this, attackers were focused on network attacks. But because of the presence of firewalls and intrusion detection to protect the servers, hackers changed their strategy by focusing more on the servers (Dyer, 2001).
Solution producers, logically, should have a greater range of privileges to be able to successfully install the applications on the server. It is critical to have a system in place to make sure solution producers do not misuse this wide range of privileges. The study of Smith, et al labeled this as the “privilege threats.” The proprietary grid environment no longer applies to on-demand computing. Users from various organizations rent services from the same on-demand resource provider, which calls for the enforcement of strict access separation among all participants in the resource. Smith, et al labeled this as the “shared-use threats.”
The Threats in Grid Computing
Privilege threat is driven by the need for solutions producers to be able to administer the systems without the intervention of the central administrator, who is trusted by all participants having access to the resource, and by the need for solutions producers to be able to audit the code of all applications submitted to the system. Their ability to install and uninstall legitimate software may be used to install illegitimate software as well (Smith, 1999).
Shared-use threat is further classified into three types, namely: (1) resource attacks, (2) data attacks, and (3) metadata attacks. These three types of attacks can be against the user, the solutions producer, or the resource provider (Smith, 1999).
In this context, a shared-use resource attack against the user pertains to the illegal use of user-owned resources. An example is when the attacker impersonates the user to make use of the user’s allocated CPU cycles. A shared-use resource attack against the solutions producer pertains to the illegal use of software or hardware belonging to the solutions producer. An example is when the attacker modifies the solutions producer’s programs without due authorization from the solutions producer. A shared-use resource attack against the resource provider pertains to the illegal use of CPU cycles, storage, network bandwidth, and all the other resources duly owned by the resource provider. An example is when the attacker sends bulky unsolicited emails from the network host (Smith, 1999).
A shared-use data attack against the user pertains to the illegitimate modification or viewing of the user’s data. A shared-use data attack against the solutions producer pertains to the illegal modification or viewing of data owned by the solutions producer. A shared-use data attack against the resource provider is the illegal modification or viewing of data owned by the resource provider.
A shared-use metadata attack against the user pertains to the illegal modification or viewing of metadata owned by the user. A shared-use metadata attack against the solutions producer is the illegal modification or viewing of the metadata owned by the solutions producer. A shared-use metadata attack against the resource provider is the illegal modification or viewing of the metadata owned by the resource provider.
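The three attack types and three targets described above form a three-by-three grid of shared-use threats. As a compact restatement, the short sketch below enumerates the nine combinations; the enum and function names are invented for this example and are not taken from Smith's threat model.

```python
from enum import Enum
from itertools import product
from typing import List, Tuple


class AttackType(Enum):
    RESOURCE = "resource"
    DATA = "data"
    METADATA = "metadata"


class Target(Enum):
    USER = "user"
    SOLUTIONS_PRODUCER = "solutions producer"
    RESOURCE_PROVIDER = "resource provider"


# Every attack type can be directed at every kind of participant.
SHARED_USE_THREATS: List[Tuple[AttackType, Target]] = list(
    product(AttackType, Target)
)


def describe(attack: AttackType, target: Target) -> str:
    """Render one cell of the grid as a readable label."""
    return f"shared-use {attack.value} attack against the {target.value}"


for attack, target in SHARED_USE_THREATS:
    print(describe(attack, target))
```

Listing the combinations this way makes it easy to check that a security review has considered each cell of the grid.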
Studies Regarding Threats and Attacks in Grid Computing
According to the security study conducted by IBM, 80% of Windows clients have spyware infestations and about 30% have back doors. This opens the door for identity theft. There are no external indications of an attack, such as loss of functionality. As a result, certificate theft can go undetected for a long period of time.
Past studies have examined the security protocols of standard keystores and the way they interact with the desktop. It was concluded that it is not safe to store private keys in software tokens because, in many cases, the attacker can steal the private key easily. This security issue comes in addition to the fact that software keystores are immobile. That is, the only way to transport a private key installed on a desktop to another machine is to export and import the key. Gaining mobility, in this case, comes at the expense of security, since the export and import process exposes the key to attacks. With a large part of the population becoming more and more mobile through the use of various devices, these immobility and security issues become a serious concern for this category of keystore (Dyer, 2001).
In response to this, hardware tokens promise to solve these two serious issues with software tokens. Studies, however, have shown that attackers can still use the keys at will. On the positive side, it is acknowledged that hardware tokens can address the mobility issue through devices like USB tokens, provided that appropriate software is installed on the machine and that the operating system in use supports the application (Dyer, 2001).
There have been several attempts to address the security problems accompanying software and hardware tokens. The use of a secure coprocessor is one. The concept is to create a device that has a security domain different from that of its host. Such devices are said to shrink the trusted computing base, which is basically the set of all computer components critical to the computer’s security. However, they offer weak computational power considering their very high cost in the commercial market. Many IT experts who have studied and tested secure coprocessors claim that they are difficult to use (Ateneisi, 2005).
The known major cause of problems with private keys is the complexity of most modern software, which makes it impossible to draw conclusions about the result of a computation. This software complexity leads to decreased usability, which, in turn, leads to decreased security, in addition to enlarging the set of software that must be trusted for the systems to operate as needed. This set of trusted software is referred to as the Trusted Computing Base (Ateneisi, 2005).
To better understand the security threats the grid computing community faces, Smith, et al., developed a threat model called the “threat tree for on-demand computing,” which aims to accurately describe, categorize, and organize attacks of different natures. Attackers, in general, aim to ruin the storage space or the service reliability. They can attack the system by gaining access to storage, computing procedures and cycles, and network bandwidth.
Security Issues in Grid
Security and Information Technology (IT) experts believe that sharing a great deal of information across the traditional trust boundaries tends to increase security risks. This is the concern in grid computing that most urgently needs attention. It is particularly risky for large-scale enterprises embracing grid computing systems. The sophisticated manner in which resources or information are placed on the grid can be a gateway for attackers to steal corporate data. Many grid setups are built on top of legacy components, which makes it more difficult to secure the grid setup itself (Krebs, 2004).
Although some big IT companies have invested in grid computing (as it started to gain popularity across the corporate IT world), the Computer Science and Engineering community recommends clearly identifying and understanding the risks that accompany large-scale grid computing implementation. The pros and cons of the safety of data storage in grid computing must be weighed, with close attention to the confidentiality of information passing through the grid system on a large corporate scale.
Security in grid computing is influenced by a number of parameters that affect the security of users’ interaction with grid services and grid resources. Examples of such parameters are the confidentiality, validity, and integrity of the messages crossing the network and of the stored data. Integrity, in this sense, means that the data should be read the way it was written, with no unnecessary and unauthorized additions, while confidentiality means restricting the data from exposure to unauthorized readers. The server that maintains the security policy must strictly enforce it (Kim, et al., 2004).
Enforcing security policies has always been a very complex and challenging task for system administrators. In grid computing, it is all the more challenging because the time needed to enforce the security policy may interfere with the processing time of other computing jobs in the system.
One of the known challenges in building grids is when stakeholders deploy and implement applications, each with different functionalities and applications programming interface. This is the case for majority of instances. But since restricting the user interfaces will most likely impede the utility of the stakeholders, there has to be an established way for imposing message confidentiality and integrity across the diverse applications in the grid. There are instances where different stakeholders who are having jurisdiction over varying usage rights own a single resource. This poses difficulty in supporting the access policy defined by stakeholders. To avoid conflicts, the server must strictly enforce the stakeholder-defined policy, with respect to the ability of each stakeholder to do modification of the policy.
During peak workload, computational loads are outsourced to organizations that offer computational power and the needed resources. For practical and strategic reasons, it is preferred to outsource to organizations with as small an administrative headcount as possible. Grid computing must therefore acquire and configure the required resources without manual intervention from an administrator for every transaction. This leads to exposure to security threats. Organizations typically rent computational resources that offer exclusive access to a defined number of nodes for the needed time. This forms part of the secure mechanisms for sharing resources safely and efficiently in on-demand computing. However, in the grid environment, this exclusive access complicates the way resources can be dynamically rented or outsourced (Lorch, 2004).
The Grid Users
Another concern in grid security is the authorization policies that apply to each resource. A query is usually sent to an authorization policy interpreter. Examples of such queries are when a user wants to check his access to a resource, when the stakeholder wants to see another person’s access privileges to the resource, when the stakeholder wants to change the policy of the resource, and when the stakeholder opts to revoke the access of certain users. The user’s access can be determined by either an independent policy analyzer or the resource gateway. This can be done through the assigned grid identification. A policy usually comes with a validity period. When the stakeholder wants to modify the policy, the stakeholder must be able to connect securely to the resource gateway and then modify the policy. He may also choose to authenticate himself to a server that can be used to modify the resource policies. Digital signing of the policy applies where the policy information is kept within the stakeholder’s local environment. Caching of access rights is not advisable, especially for a long time, and there has to be a mechanism to flush the cached information. Where multiple copies of policy information are kept in different locations, there must be a mechanism to link all of them in a single index so that deletion can be done easily (Dyer, 2001).
Another aspect of security parameters is the trust between the user and the host. The user may opt to declare the host he wants to use, whether a single host or multiple hosts. The grid administrator has the power to restrict the user from interacting with a user coming from a different administrative domain. In that case, the grid host should prove its membership in the particular domain. This is usually done using secure sockets layer, secure domain name server, or IP security. The server must authenticate the identity, location, and domain of the origin of the request. Implementing all these security parameters is a real challenge for grid software (Zsolt, 2003).
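The advice above, that cached access rights should be short-lived and flushable, can be sketched with a small time-limited cache. The class below is a hypothetical illustration: the names, the time-to-live parameter, and the keying by grid identification and resource are assumptions for this example.

```python
import time
from typing import Dict, Optional, Tuple


class PolicyCache:
    """Hypothetical short-lived cache of access decisions."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        # (grid id, resource) -> (decision, expiry time)
        self._entries: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def put(self, grid_id: str, resource: str, allowed: bool,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._entries[(grid_id, resource)] = (allowed, now + self.ttl)

    def get(self, grid_id: str, resource: str,
            now: Optional[float] = None) -> Optional[bool]:
        """Return the cached decision, or None if absent or expired."""
        now = time.time() if now is None else now
        entry = self._entries.get((grid_id, resource))
        if entry is None:
            return None
        allowed, expires = entry
        if now >= expires:
            # Expired entries are dropped so stale rights are never reused.
            del self._entries[(grid_id, resource)]
            return None
        return allowed

    def flush(self, resource: Optional[str] = None) -> None:
        """Flush everything, or only entries for one resource."""
        if resource is None:
            self._entries.clear()
        else:
            self._entries = {
                key: value for key, value in self._entries.items()
                if key[1] != resource
            }
```

A gateway using such a cache would call `flush(resource)` whenever the stakeholder modifies or revokes the policy of that resource, so that a revoked right does not survive in the cache until its expiry.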
Trust relationships in the grid environment are in the form of the following:
1. Resource Provider. This is the owner of the computational nodes and other physical resources.
2. Solutions Producer. This is the owner of the applications and/or databases deployed on the resource provider’s environment.
3. User. This is the owner of the input data to be used in the applications of the solutions producer.
The resource provider and the solutions producer can be the same person, as can the user and the solutions producer. The resource provider gives access to both the solutions producer and the user through user accounts, through which the nodes of the resource provider are accessed. Most custom software is installed in the home directory of the solutions producer. There are cases, however, when the solutions producer needs to install software as root, requiring the resource provider either to grant the solutions producer temporary root access or to perform the needed root operations on behalf of the solutions producer. In this setup, the resource provider must trust the solutions producer to use the provided resource effectively and correctly. In the same fashion, the solutions producer must trust the resource provider not to illegally use the software and applications hosted on the resource provider’s assets. Users must trust both the solutions producer and the resource provider with the data they enter into the solutions producer’s applications. On the other hand, there is no need for the solutions producer and the resource provider to trust the user, as the user’s access is determined by the capabilities of the solutions producer’s applications plus the access privileges from the resource provider. The known standard security mechanisms can protect both the resource provider and the solutions producer, if the two entities cooperate (Kim, 2004).
In case of account misuse, both the password and the certificate should be traceable to the real owner of the account, so that the misuse can be traced and acted upon. The current best practice for this is to monitor user behavior and identify any behavior consistent with that of attackers. Users are the ones managing the certificates granted to them. A password can be memorized. Certificates, on the other hand, must be kept in digital form. This increases the security risk, since nothing really enforces key hygiene on users. The Certificate Authority has no means of ensuring that a certificate was not in any way stolen after it was signed (Smith, et al, 2004).
Most known grid cases have multiple solutions producers and multiple users, and where there is just one resource provider in the picture, preventing the exposure of confidential information poses a challenge. There are two typical ways of dealing with this: (1) accepting the exposure where the information concerned is not confidential in nature, so that exposure to other users is not a major concern; and (2) having the resource provider grant exclusive access to a specific solutions producer or set of users, so that no resource sharing takes place. The latter approach, however, is not favored by resource providers, since it limits the use of the resources (Lorch, 2004).
Conflicting Interests of Grid Users
In a grid computing environment, there are instances of conflicting interests among the resource provider, the solutions producer, and the user. One classic scenario is as follows: users need to get resources dynamically based on some predefined rule such as priority or deadline; the solutions producer has to get resources from the resource provider on which to deploy the application specifically needed by the user; and the resource providers, for their part, try to maximize the use of their resources and generate more profit, and thus want to control the resource allocation (Krebs, 2004).
If a program contains a sensitive algorithm, running it on a foreign computer may lead to serious security risks. Current operating systems may be able to protect user programs from malicious programs, provided they are running on the same host. Even with a trusted operating system, however, physical interception of the communication between main memory and processor, with the purpose of gathering information about the executed program, may not be avoidable. Attacks from the system administrator and from malicious operating systems not running on the same host are not covered by the security offered by current operating systems. These security risks make it inadvisable to execute programs that compute sensitive data on untrustworthy remote systems (Kim, et al., 2004).
Many types of grid usage require the principal to be impersonated by an agent. When the user asks a service to act on his behalf, the user grants the service unconditional and unlimited delegation, which allows the service to imitate the user perfectly. This is acceptable in an environment where all the systems are fully trusted by the users, but that may not be the case in the general grid computing world. For all types of delegation in this context, the challenge is how to determine the rights that should be granted to the service and the specific circumstances in which those rights are considered valid. The dilemma is that delegating too many rights could lead to abuse, while delegating too few could hamper the completion of the job. Because it is difficult to design and implement, the restricted delegation approach is not currently practiced in grid computing. Restricting delegation is difficult and tedious because determining the minimum set of rights that the job execution requires is a challenge in itself. In addition, there is no uniform way in which servers name rights across the grid. Determining the number of delegation levels required is also a challenge. When the resource gateway receives a chain of delegated certificates, it must decide whether all the checkpoints through which the delegation has passed can be trusted. This entails a very large degree of trust on the side of the resource gateway. The user may not know at all about the delegation happening among the hosts in the system. Therefore, it is possible that an authorized user whose request passed through a non-trusted domain will be rejected at the destination (Finkelstein, et al., 2004).
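To make the idea of restricted delegation and chain checking concrete, the sketch below models each delegation as a record that can only narrow the rights it passes on, and lets a gateway reject the whole chain if any hop is untrusted. The text above notes that restricted delegation is not currently practiced in grids, so this is purely an illustration of the concept; all names and the intersection rule are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Delegation:
    delegator: str
    delegatee: str
    rights: FrozenSet[str]   # rights the delegator is willing to pass on


def effective_rights(user: str, user_rights: Iterable[str],
                     chain: List[Delegation],
                     trusted: Set[str]) -> Optional[FrozenSet[str]]:
    """Gateway-side check of a delegation chain.

    Returns the rights that survive every hop, or None if the chain is
    broken or passes through an entity the gateway does not trust.
    """
    rights = frozenset(user_rights)
    holder = user
    for link in chain:
        # Each link must be issued by whoever currently holds the rights.
        if link.delegator != holder:
            return None
        # One untrusted checkpoint invalidates the whole chain.
        if link.delegatee not in trusted:
            return None
        # A hop can narrow the rights but never widen them.
        rights = rights & link.rights
        holder = link.delegatee
    return rights
```

For example, if a user holding read, write, and execute rights delegates all three to a scheduler, and the scheduler passes only read and execute to a remote job, the job ends up with read and execute. This also shows the rejection scenario described above: an otherwise authorized request yields `None` as soon as one hop in the chain lies outside the gateway's trusted set.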
In a nutshell, security issues around grid computing are the following:
You give others access to your machine, and they are able to do practically anything a normal user can do.
Remote machines across the grid must trust each other.
Management of private keys
The root user on a particular machine can intercept any data available on that machine.
Unless the secure copy protocol is specifically used, log files on remote machines are readable.
There are problems associated with the placement of private keys in complicated systems. Users are exposed to attackers' tricks for obtaining their private keys, or for having the keys used without the users really knowing it.
Some IT professionals suggest taking the private keys off the desktop and storing them on another secure device, such as a USB token, to decrease the probability of theft and the amount of software that must be trusted to make the system work. Further analysis, however, showed that while this approach makes the keys physically secure, it cannot shrink the trusted computing base.
The Known Security Solutions – Are They Really Solutions?
The security issue may be addressed by secure coprocessing, but because of the high cost of installing high-end devices, it does not pass as a practical choice. Lower-priced devices cannot withstand some common hardware and root attacks (Zsolt, 2003).
As for software attacks such as computer viruses, most people would think that these can be prevented by installing anti-spyware or anti-virus software. What many of us do not know is that anti-virus and anti-spyware tools can only detect attacks that are known to them. They do not protect against unobvious viruses, and some viruses are designed to be stealthy enough that these tools do not recognize them. Having these tools does not equate to having zero attacks.
As for biometrics, they do not usually work unless, probably, layered with other measures. Biometric information, once input, is nothing more than a binary stream, which can easily be captured and replayed.
Public Key Infrastructure (PKI) is not working either. By this argument, online banking and stock trading are not safe to do, and secure networks that rely on PKI are not secure either.
What about firewalls and virtual private networks between the user and the server, or between two or more server hosts? These pose a serious problem for the grid security scheme. The security that a static firewall and a virtual private network provide is not likely to cover grids that span administrative sites and that encourage the dynamic addition of resources. Grids need to impose and enforce their own security. Ironically, there is a high probability that the firewall restricts grid-authorized access, because a firewall typically only allows access from and to specific hosts and ports. In addition, making a connection through a virtual private network requires specific authentication and authorization.
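The clash between a static firewall and a dynamically growing grid can be illustrated with a minimal sketch. The rule set, host names, and port number below are hypothetical, and a real firewall evaluates far richer rules; the point is only that a host added to the grid after the rules were written is refused even though the grid itself has authorized it.

```python
from typing import Set, Tuple

# A static rule is a (host, port) pair that the firewall lets through.
Rule = Tuple[str, int]

# Hypothetical rules written when the grid had only two nodes.
STATIC_RULES: Set[Rule] = {
    ("node1.site-a.example", 5000),
    ("node2.site-a.example", 5000),
}


def firewall_allows(rules: Set[Rule], host: str, port: int) -> bool:
    """Return True only if this exact host and port pair was configured."""
    return (host, port) in rules


# Hosts the grid middleware currently authorizes (assumed to change over time).
grid_authorized_hosts = {"node1.site-a.example", "node3.site-b.example"}

for host in sorted(grid_authorized_hosts):
    verdict = "allowed" if firewall_allows(STATIC_RULES, host, 5000) else "blocked"
    print(f"{host}: {verdict}")
```

Here the dynamically added node3.site-b.example is blocked until an administrator edits the rules by hand, which is the restriction on grid-authorized access described above.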
A credential repository that stores the certificate on behalf of the real user, who then accesses the certificate with a password, can address the issue of keeping certificates on unsecured individual user machines. This system combines the password and certificate trust approaches. But if the password for the credential repository is stored somewhere on the end user's computer, the security purpose is negated. Credential repositories can also be enticing targets for attackers. Managing these repositories well is very important, as poor management can have a grave effect on security. Many would agree that the best way to go about it is to have a centrally managed authentication server (Smith, 2004).
Revocation of access rights is another issue in grid computing. Currently, the revocation delay in grid computing is 10 to 60 minutes. Lower delays are desirable so that the resource provider can react urgently to abuses of resources. The Certificate Revocation Lists that are currently used to revoke access rights in grid environments do not meet the targeted delay (Smith, 1999).
Delegation is one of the most important security aspects in grid computing. Proxy delegation systems that are used to enable single sign-on introduce new security risks, because single sign-on requires more credentials to be stored online, which makes it easier for attackers to steal the identity of the user. Possible solutions include the use of biometric data and cryptocards (Ateniese, 2005).
Solutions to Security Problems
“Security is non-negotiable!”
Computers play a vital role in disaster response and recovery programs, so one can imagine the disaster that security breaches can bring. The malfunctioning of a transmission circuit protection device is said to be the root cause of the power failure in North America on August 14, 2003. This is said to be due to the malfunctioning of the remainder of the power grid (Smith, 2004). Mitigation and control plans should be in place alongside the deployment or use of grid computing. Encryption technologies can be one way to mitigate the threat (Krebs, 2004). In the collaborative grid-facilitated environment, using well-defined and well-accepted libraries for security solutions facilitates interoperability.
Humans are the number one problem in security, as most security is user-dependent. How do we prevent people from sharing passwords? A SonicWALL survey revealed that almost half of the user population does not memorize their passwords.
Computers are then a main consideration in large software attacks, as they can be both points of success and points of failure.
Whenever we talk of security in a general IT context, we mean several aspects. It can mean protection from hackers, protection from viruses, and protection of the physical aspect, or the hardware. Security may also mean privacy, or the enforcement of access rights to information, and the credibility or licensing of the software used. When it comes to grid computing specifically, security concerns authentication, or identifying the entities you are transacting with; authorization, or the assignment of entity entitlements; and accounting, or the tracking of all types of access made to the information.
A grid session is defined as the set of activities performed by a user in a grid computing environment. The following points should be considered and carried out prior to engaging in a grid session (Humphrey and Thompson, 2004):
Supercomputer centres usually require each individual user to have local user identification and allocation. Some sites may allow group allocation or may allow the user to use the resource in a limited manner. This all depends on the policy set by the resource owner. Usually, asking permission is done via email.
To limit long exposure of private keys, the use of short-term certificates in place of the long-term grid identification is one desirable feature of a grid system.
Most security sessions are set to last for the duration of the activity itself. Another option is for the user to build security parameters designed to exist for the life of the session. That is, the person can actually assign a role to a particular resource, such as ordinary user or system administrator.
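The idea of a short-term credential that lives only as long as the session can be sketched as follows. This is a minimal illustration, not a real certificate format: the class, the function name, and the 12-hour default lifetime are all assumptions made for the example, and no cryptographic signing is shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ShortTermCredential:
    """Hypothetical stand-in for a short-lived certificate."""
    subject: str
    not_before: datetime
    not_after: datetime

    def is_valid(self, now: datetime) -> bool:
        """A credential is usable only inside its validity window."""
        return self.not_before <= now < self.not_after


def issue_short_term(subject: str, now: datetime,
                     lifetime: timedelta = timedelta(hours=12)) -> ShortTermCredential:
    """Issue a credential that expires after `lifetime` (assumed default: 12 hours)."""
    return ShortTermCredential(subject, now, now + lifetime)


session_start = datetime(2007, 7, 26, 9, 0, tzinfo=timezone.utc)
cred = issue_short_term("alice", session_start)
print(cred.is_valid(session_start + timedelta(hours=1)))   # inside the window
print(cred.is_valid(session_start + timedelta(hours=13)))  # after expiry
```

If such a credential is stolen, it is useful to the thief only until `not_after`, whereas the long-term key never has to leave its protected storage during the session.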
Promoting Security in Mobility
The security and mobility issues with hardware and software keys are addressed by using credential repositories. A credential repository significantly shrinks the trusted computing base. The safe storage of private keys no longer relies on general-purpose desktops, because a dedicated server, maintained by IT professionals, keeps the keys. The user can access the private keys from multiple machines. While this means mobility, credential repositories are difficult to use, especially for application developers. When the user wants to use the private keys in performing operations, he may need to bring them to his desktop. An alternative is to create a protocol that allows the user to use his private keys on the repository server. For applications that are already built, this would mean rewriting them to cater to this protocol (Krebs, 2004).
The Virtual Private Network
Virtual private network technology can overcome the showstoppers in grid computing across network boundaries, such as firewalls and non-routable IP addresses. However, using open VPN software, such as NetIO and NAS, has performance implications, mainly in terms of latency and CPU load on the VPN gateways. Since the grid topology has the potential to cause traffic hotspots over the cluster of links, the more applications are run, the worse the traffic can be (Mache, et al., 2007).
A virtual private network, with the authentication and authorization required to make a connection, may be one way in which standard access control can work behind the restriction of the firewall (Zsolt, 2003).
Other Integrity Means
To address the access rights revocation problem, Online Certificate Status Protocols are seen as a way to let the system revoke access rights immediately. This again calls for collaboration among resource providers, possibly sharing authorization information so that they can respond quickly to attackers not previously known to them (Smith, 1999).
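A rough calculation shows why list-based revocation struggles to meet a tight delay target. The sketch below assumes, purely for illustration, that a resource provider refreshes its revocation list at fixed intervals starting at minute 0, and that an online status check sees a revocation as soon as it is recorded (network latency and responder caching are ignored). The function names and the 60-minute interval are hypothetical.

```python
import math


def crl_exposure_minutes(revoked_at: float, refresh_interval: float) -> float:
    """Minutes a revoked credential is still accepted before the next list refresh.

    Assumes refreshes happen at minute 0, refresh_interval, 2 * refresh_interval, ...
    """
    if refresh_interval <= 0:
        raise ValueError("refresh_interval must be positive")
    next_refresh = math.ceil(revoked_at / refresh_interval) * refresh_interval
    return next_refresh - revoked_at


def online_check_exposure_minutes(revoked_at: float) -> float:
    """Idealized online status check: the revocation is visible immediately."""
    return 0.0


# A credential revoked 5 minutes after a refresh, with hourly refreshes.
print(crl_exposure_minutes(revoked_at=5, refresh_interval=60))
print(online_check_exposure_minutes(revoked_at=5))
```

Under these assumptions the worst case for a periodically refreshed list approaches the full refresh interval, while the per-request online check removes the window at the cost of contacting a status service on every access.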
As for the issue of confidentiality, the operating system offers standard data access control that can be configured to protect the files of users from others. However, if an attacker gains legitimate access to root, this negates the purpose of secure storage systems. Techniques such as illegal traffic sniffing can facilitate the gathering of data traffic and computational information. The fact that solution producers must be granted access to install custom software on the server makes it even more difficult to protect the operational data found in standard operating systems. Sandboxing is seen as the solution to the confidentiality issue. It separates one solution producer from another and one user from another (Finkelstein, et al., 2004).
Communication over the grid can only be considered secure if it guarantees integrity, non-repudiation of packet communication between entities over the grid, and confidentiality of information. Most existing solutions do not meet the needs of the grid environment. Confidentiality can be addressed by cryptographic solutions, such as symmetric or asymmetric encryption, while non-repudiation can be addressed by digital signatures. Virtual private networks are long-lived and manually configured (Volbrecht, 2000).
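Of the three guarantees, integrity is the simplest to illustrate with standard library tools. The sketch below uses a keyed hash (HMAC with SHA-256) so that a receiver holding the same key can detect a modified message. It is an assumption-laden illustration: the key and message are made up, a shared key gives no non-repudiation (either party could have produced the tag, which is why digital signatures are needed for that), and nothing here provides confidentiality.

```python
import hashlib
import hmac


def tag_message(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with a shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_message(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag_message(key, message), tag)


shared_key = b"hypothetical-shared-key"
payload = b"job 42: output block 7"
tag = tag_message(shared_key, payload)

print(verify_message(shared_key, payload, tag))                    # unmodified
print(verify_message(shared_key, b"job 42: output block 8", tag))  # altered in transit
```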
For data availability, short-term failures such as crashed operating systems, hard disk failures, and deletion of files by attackers require backups of files kept for a short period of time. Versioning operating systems, journaling operating systems, or some sort of hardware solution can address these. Electrical outages, floods, and fires are examples of events that call for long-term backup of files. These long-term failure events can be handled with offsite backups and long-term storage such as tape libraries (Volbrecht, 2000).
Auditing makes it possible for both the resource provider and the solution producer to trace and track all actions taken in the systems. Auditing does not directly solve security threats in grid computing, but it plays a very important role by providing the historical data needed to analyze and draw conclusions about behavioral patterns. This is why attackers also focus on destroying and illegally modifying the audit logs of the system. Access to audit data is usually given to the resource provider only. However, some systems also allow the solution producer to access it (Smith, 1999).
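Because audit logs are themselves a target, one common way to make tampering detectable is to chain each entry to the one before it with a hash. The sketch below is an illustration of that general technique, not something the cited studies prescribe; the record fields and function names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder digest that precedes the first entry


def _digest(record: Dict[str, str]) -> str:
    """Hash a record's fields, including the previous entry's digest."""
    payload = json.dumps(
        {"actor": record["actor"], "action": record["action"], "prev": record["prev"]},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log: List[Dict[str, str]], actor: str, action: str) -> None:
    """Append an entry whose digest is chained to the entry before it."""
    prev = log[-1]["digest"] if log else GENESIS
    record = {"actor": actor, "action": action, "prev": prev}
    record["digest"] = _digest(record)
    log.append(record)


def verify_log(log: List[Dict[str, str]]) -> bool:
    """Return False if any entry was altered or the chain was broken."""
    prev = GENESIS
    for record in log:
        if record["prev"] != prev or record["digest"] != _digest(record):
            return False
        prev = record["digest"]
    return True


audit_log: List[Dict[str, str]] = []
append_entry(audit_log, "alice", "submit job 17")
append_entry(audit_log, "root", "read /data/results")
print(verify_log(audit_log))

audit_log[1]["action"] = "no access"  # an attacker rewrites history
print(verify_log(audit_log))
```

A chain like this detects edits and removals in the middle of the log; detecting truncation from the end would additionally require keeping the latest digest somewhere the attacker cannot reach.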
Ideally, the development of an application that will run in a grid computing environment should take into consideration awareness and mitigation of the security breach issues associated with grid computing. The programmer or software developer should design the application so that it addresses common and known security issues in grid computing (Humphrey and Thompson, 2004).
It is said that while enterprise security professionals may not feel comfortable using shared resources, psychological constraints are controllable. Having a proof of concept prior to the use of a new system, product, or approach can help promote confidence among users.
Previous studies have stated ways in which the delegation of rights can be restricted. These include specifying the rights that can actually be delegated; specifying the length of time for which the delegated right is considered valid, which means estimating as closely as possible how long it would take to finish the job; and specifying the server or user to whom the rights can be delegated, which requires knowing the servers that the job execution would invoke (Zsolt, 2003).
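The three restrictions just listed (which rights, for how long, and to whom) can be sketched as a single check. This is a minimal illustration under stated assumptions: the `Delegation` class, the right names, and the host names are hypothetical, and a real system would carry these restrictions inside signed credentials rather than plain objects.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet


@dataclass(frozen=True)
class Delegation:
    """Hypothetical restricted delegation granted by a user to a service."""
    rights: FrozenSet[str]      # which rights may be exercised
    delegatees: FrozenSet[str]  # which servers may exercise them
    expires_at: datetime        # estimated end of the job


def is_permitted(d: Delegation, server: str, right: str, now: datetime) -> bool:
    """Allow an action only if all three restrictions are satisfied."""
    return server in d.delegatees and right in d.rights and now < d.expires_at


start = datetime(2007, 7, 26, 12, 0, tzinfo=timezone.utc)
grant = Delegation(
    rights=frozenset({"read:input-data", "write:job-output"}),
    delegatees=frozenset({"compute.site-a.example"}),
    expires_at=start + timedelta(hours=2),
)

print(is_permitted(grant, "compute.site-a.example", "read:input-data", start))
print(is_permitted(grant, "compute.site-a.example", "delete:input-data", start))
print(is_permitted(grant, "storage.site-b.example", "read:input-data", start))
```

The sketch also makes the practical difficulty visible: every field has to be known before the job starts, which is exactly the estimation problem that makes restricted delegation hard to apply.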
What about identity mapping? To enable single grid sign-on and still keep control of legacy access on all sites requiring sign-on, grid entities have to be mapped to local user identifications. Put simply, the user must possess a local identification at the sites that require it. The mapping that the grid server gateway uses is something the grid administrator and the site administrator agree on. Giving the user local accounts on the machines he uses gives him more access to the host than necessary. In this case, the local site has to trust the certificate authority of the grid environment to identify valid users; the grid administrators have to trust the access control mechanism of the host; and the local site has to trust the grid software to authenticate the users (Volbrecht, 2000).
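A direct mapping of this kind amounts to a lookup table agreed between the grid and site administrators. The sketch below assumes that grid identities are certificate subject strings and that local identifications are account names; the table contents and the function name are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical mapping agreed by the grid administrator and the site administrator.
GRID_TO_LOCAL: Dict[str, str] = {
    "/O=ExampleGrid/CN=Alice Example": "alice",
    "/O=ExampleGrid/CN=Bob Example": "grid_guest",
}


def map_to_local_account(subject: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the local account for an authenticated grid identity, if any."""
    return mapping.get(subject)


for subject in ("/O=ExampleGrid/CN=Alice Example", "/O=ExampleGrid/CN=Mallory"):
    account = map_to_local_account(subject, GRID_TO_LOCAL)
    if account is None:
        print(f"{subject}: rejected, no local identification at this site")
    else:
        print(f"{subject}: runs as local user {account}")
```

Note that the lookup assumes the subject has already been authenticated by the grid software, which is one of the trust relationships the local site has to accept.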
As an alternative to this direct ID mapping approach, the site administrator may be allowed to define the trust relations among various sites and certificate authority.
The continuously increasing number of entities in the system makes it more and more necessary to enforce hygiene in password and credential repositories. For a single user, it is easy for the administrator to monitor the usual pattern, so any unusual behavior can easily be identified. This is, however, not possible in grid computing. Systems are being proposed that automatically identify the type of user by using accounting to determine the corresponding command pattern. Such a system can alert administrators when it detects a pattern similar to that of known attackers. This is not fully applicable to grid computing, however, since grid computing is capable of supporting the detection of certificate theft only within certain boundaries. Resource providers should make joint efforts to share their known attack patterns and broaden the knowledge that each individual territory holds. This would help build a wider basis for detecting questionable and malicious behavior in the system, and would then protect them all from theft (Smith, 2004).
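The idea of comparing a session's command pattern against attack patterns shared by resource providers can be sketched with a simple set similarity. The measure used here (Jaccard similarity), the 0.6 threshold, and the example commands are assumptions chosen for illustration; the proposed systems referred to above may work quite differently.

```python
from typing import Iterable, List


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Size of the intersection divided by size of the union of two command sets."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def matches_known_attack(session_commands: Iterable[str],
                         known_patterns: List[List[str]],
                         threshold: float = 0.6) -> bool:
    """Flag the session if it resembles any shared attack pattern closely enough."""
    commands = list(session_commands)
    return any(jaccard(commands, pattern) >= threshold for pattern in known_patterns)


# Patterns pooled from several resource providers (hypothetical).
shared_patterns = [["wget", "chmod", "nc"], ["scp", "cat", "shred"]]

print(matches_known_attack(["wget", "chmod", "nc", "ls"], shared_patterns))
print(matches_known_attack(["ls", "cd", "make"], shared_patterns))
```

Pooling patterns simply means a longer `shared_patterns` list, which is how shared knowledge widens the basis for detection across sites.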
Applications that transfer files across the grid are at risk when they rely on data-access protocols that do not provide integrity and confidentiality. A VPN can offer security and privacy to these applications. A VPN can support data separation, for each VPN, in the forwarding control plane and in the signaling and routing information held in the intermediate forwarding devices. By associating a priority level with the traffic exchanged in specific tunnel instances and by activating so-called differentiation mechanisms, a VPN can offer specific forwarding behavior to traffic crossing the shared public infrastructure. This has some advantages, such as efficient distribution of workload among small grid sites within the wider grid and an overall reduction of traffic bottlenecks in the grid network. A VPN can group physically distant sites that belong to the same grid. A VPN can also help grids bypass the firewall in order to avoid performance penalties in cases of data-intensive applications. This can be seen as an advantage or as a security threat, depending on how it is used and for what purpose, malicious or noble. The replication of large data files within the wide-area grid can be supported by high-speed connectivity between well-defined grid nodes. Since the number of grid users can only increase, large-scale grids can gain from a stable and scalable VPN service. In addition, dynamic provisioning of VPNs can increase the number of users a grid can serve and spare the software producer from having to support concurrent VPN services. The use of VPNs in grids is therefore strongly proposed, to meet the following objectives (Andreozzi, 2004):
1. Improved privacy and security for applications in the wide area network
2. Improved communication (because a VPN can provide on-demand traffic engineering when a particular level of Quality of Service is of great concern)
3. Reduced performance penalties caused by firewall policies
The following design features are required in the architecture of a grid environment so that users can access various resources seamlessly (Foster, 2002):
1. Administrative hierarchy. This deals with how the grid divides itself as it grows over time, and it determines how administrative information flows within the grid.
2. Communication services. The communication infrastructure needs to support the transmission of bulk data, group communications, and streaming data.
3. Information services. Legitimate requesting services should be able to obtain information easily and in a timely manner.
4. Naming services. These provide a uniform naming convention within the grid. Grid, in this context, is the term used to refer to a wide variety of objects (computers, services, etc.).
While the large majority in the IT world acknowledges the potential security risks in grid computing, there are those who are convinced that security in grid computing is no different from network security, in that both are manageable and easily controllable. There are those who say that the business gains outweigh the security issues, and that the intellectual property present in the hosted environment is a determinant of the level of risk it poses, which is a manageable aspect of risk in grid computing. Sophistication in grid computing is expected to continue despite the issues in security (Volbrecht, 2000).
Figure 6 gives a positive picture of how grid computing can improve on non-grid systems by promoting flexibility and full utilization of resources, and by eliminating excess human resources in carrying out the needed operations.
Figure 6. What Does Grid System Promote
There is an annual international conference on grid computing that brings together grid researchers, developers, users, and practitioners. At this conference, the best grid research studies and results are presented. It also serves as a venue for exploring future possibilities and assessing present endeavors in grid technology. The Institute of Electrical and Electronics Engineers (IEEE) Committee on Scalable Computing sponsors this conference (GridXY: IEEE/ACM International Conference on Grid Computing, 2007).
Security policy in grid computing comprises user identification and authentication; user registration and authorization; access control and management; integrity; trust relationship; privacy and confidentiality; service and information availability; and audit.
Trust is influenced by many factors, such as security, credentials, regulations and audit, experience, and brand value. Trust in a shared service can drive the success or failure of grid computing.
The best practices in grid computing are the following:
Using firewall protection while opening the ports needed for traffic
No sharing of private keys over the net as sniffed traffic can be easily tracked
No storing of private keys on the network file systems
No leaving of user certificates in an unencrypted format
Sharing computing power over the Internet, making CPU cycles available in a shared environment in a flexible manner, always sounds enticing. There should be a systematic architecture, conceptual design, and implementation of grid and grid services. Otherwise, the commercial world will not accept it.
Of course, there are solutions to the security challenges faced by grid computing. Some require a tradeoff in the form of additional administrative staff to do more routine work, while others require tradeoffs in terms of cost. Either way, there seems to be no perfect solution that improves one aspect without adding a burden on another. It is for the grid community to weigh. Since security is non-negotiable, there may be a need to embrace a more expensive solution just to attain the needed security.
Nowadays, large IT companies are embracing what the grid computing world has to offer. Oracle says that the Oracle Grid allows a company to do away with buying expensive computers when there is a need for higher and bigger computing capacity, using just one PC server per department for super scalability. Some of the benefits include:
1. Total flexibility to meet and beat the needs of the business
2. Quality service at far lower cost
3. Faster computing capability
4. Large monetary savings
IBM is another giant IT company that embraces grid computing, because grid computing can support the nature of development at IBM, such as e-business on demand and autonomic computing. IBM is pursuing grid computing for a number of practical benefits. According to IBM, grid can be seen as the evolution of some major developments (What is grid computing, 2007):
1. Web. Just like the web, grid computing hides complexity from the users, so that multiple users can enjoy the same sort of experience. Unlike the web, which solely enables communication among users, grid computing supports total collaboration towards the attainment of common business goals. Grids make use of middleware to communicate with heterogeneous hardware and to access and manipulate sets of data. Despite the distance factor, end users perceive the whole system as one unified system, without thought of differences in platform and physical location. Grid makes research possible despite location or distance barriers. It also addresses the shortage of computational power at some institutions.
2. Like peer-to-peer computing, users can share files with other users. Unlike peer-to-peer computing, grid has provision for many-to-many sharing of almost all sorts of resources that can be shared electronically.
3. Like cluster programming, grids can bring all the computing resources together. Unlike cluster programming, grids do not require physical proximity.
4. Like virtualization technologies, grid supports virtualization of resources. Unlike virtualization technologies, grid computing is not bound to the virtualization of a single system. It supports virtualization of a vast range of resources.
As one quote goes: “With a million people, you can create a road in one day; one worker needs a million days to do the same.”
As seen in the recommendations from previous studies, there needs to be collaboration and coordination among resource providers to handle the security risks better. Ensuring that each and every entity in the grid environment follows the best practices in place can be of great help in promoting common benefits to all parties, except, of course, for the attackers!
Figure 1 is retrieved August 10, 2007 from http://images.google.com.ph/imgres?imgurl=http://www.csa.com/discoveryguides/grid/images/gridcomp.gif&imgrefurl=http://www.csa.com/discoveryguides/grid/reviewf.php&start=1&h=566&w=801&sz=70&tbnid=7mnw_ZvM53hhIM:&tbnh=101&tbnw=143&hl=tl&prev=/images%3Fq%3Dgrid%2Bcomputing%26gbv%3D1%26svnum%3D10%26hl%3Dtl%26ie%3DUTF-8%26oe%3DISO-8859-1
Figure 2 is retrieved September 3, 2007 from http://www.bsearchtech.com/english/about/grid.html
Figure 3 is retrieved September 3, 2007 from http://www.mil-embedded.com/articles/white_papers/farabaugh/include/content/subcontent_files/2.png
Figure 4 is retrieved September 2, 2007 from http://www.sei.cmu.edu/isis/guide/gifs/gridlayer.gif
Figure 5 is retrieved September 2, 2007 from http://web.datagrid.cnr.it/grid/internal/Deliverables/D2.2/DataGrid-02-TED-0103-1.0_file/image004.gif
Figure 6 is retrieved September 3, 2007 from http://www.gridtoday.com/grid/695457.html
Andreozzi, S., Ferrari, T., & Ronchieri, E. 2004. On-Demand VPN Support for grid Applications. Accessed July 26, 2007 from http://www.cnaf.infn.it/~ferrari/papers/myslides/chep2004_network.pdf.
Ateniese, G., Fu, K., Green, M., and Hohenberger, S. 2005. Improved Proxy Re-Encryption Schemes with Applications to Secure Distributed Storage. In Network and Distributed System Security Symposium.
Dyer, J., Lindemann, M., Perez, R., Sailer, R., Smith, S., Doorn, L., and Weingart, S. 2001. Building the IBM 4758 Secure Coprocessor. IEEE Computer, 34:57–66.
Ferrari, A. et al. 1999. A Flexible Security System for Metacomputing Environments. Proc. High Performance Computing and Networking.
Finkelstein, A., Gryce, C., and Lewis-Bowen, J. 2004. Relating Requirements and Architecture. Journal of Grid Computing.
Foster, I. 2000. Internet Computing and the Emerging Grid. Retrieved July 26, 2007 from http://www.nature.com/nature/webmatters/grid/grid.html.
Foster, I., Kesselman, C., Nick, J., and Tuecke, S. 2002. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. In Open Grid Service Infrastructure WG, Global Grid Forum, 2002, pp. 1–31.
Humphrey, M. & Thompson, M.R. 2004. Security Implications of Typical Grid Computing Usage Scenarios.
Kim, Seung-Hyun, et al. 2004. Workflow-Based Authorization Service in the Grid. Journal of Grid Computing.
Krebs, B. 2004. Hackers Strike Advanced Computing Networks. Washington Post
Lorch, M., Basney, J., and Kafura, D. 2004. A Hardware-secured Credential Repository for Grid PKIs. 4th IEEE/ACM International Symposium on Cluster Computing and the Grid
Mache, J., Tyman, D., Pinter, A., & Allick, C. Performance Implications of Using VPN Technology for Cluster Integration and Grid Computing. Retrieved July 26, 2007 from http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/icns/2006/2622/00/2622toc.xml&DOI=10.1109/ICNS.2006.83
Marchesini, J. & Smith, S. 2005. SHEMP: Secure Hardware Secure MyProxy
Smith, M., Engel, M., Freise, T., Freisleben, B. 2004. Security Issues in On-Demand Grid and Cluster Computing.
Smith, S. and Weingart, S. 1999. Building a High-Performance, Programmable Secure Coprocessor. Computer Networks, 31:831–860.
Volbrecht, J. et al. 2000. AAA Authorization Application Examples.
Yang, S. et al. 2006. A Fair, Secure and Trustworthy Peer-to-Peer Based Cycle-Sharing System.
Zsolt, N. & Sunderam, V. 2003. Characterizing Grids: Attributes, Definitions, and Formalisms. Journal of Grid Computing.
AEGIS: Architectural EnGines for Information Security (AEGIS Publications). Retrieved July 26, 2007 from http://csg.csail.mit.edu/pubs/aegissecurity.html.
GGF13 – The thirteenth Global Grid Forum. March 14-17, 2005. Seoul, Korea. Retrieved July 26, 2007 from http://www.ogf.org/ggf_events_past_13.htm.
GridXY: IEEE/ACM International Conference on Grid Computing. Retrieved July 26, 2007 from http://www.gridcomputing.org/.
What is grid computing? Retrieved July 26, 2007 from http://www-03.ibm.com/grid/about_grid/what_is.shtml.
Sun Utility Computing. Retrieved July 26, 2007 from http://www.sun.com/service/sungrid/index.jsp.