Securing Organizational Data Using Data Masking Techniques

Vidhya Chaudhary, Mridul Chavan, Ketki Joshi, Ila Mandaliya
Student, BE-Information Technology, Atharva College of Engineering, Mumbai

[...] Patil, Supriya Mandhare
Assistant Professor, Information Technology, Atharva College of Engineering, Mumbai

Abstract – As the volume of personal data grows across industries and the number of data attacks on enterprises continues to increase, organizations large and small are seeking best practices for protecting their data. Security professionals and managers are increasingly concerned that the leading information security risk to an organization comes from within. Surveys that evaluate the threats to organizations conclude that even though most attacks come from outside the organization, the most serious damage is done with help from inside. Hence, there is a need to deal with the exposure of sensitive organizational data at the hands of insider threats. Our approach in this project addresses this aspect of cloud data security: preventing the exposure of sensitive production data to developers, testers and third-party vendors by employing advanced data masking techniques, so as to minimize the probability of such internal attacks affecting an organization or business.

Keywords – Organizational data security, data masking, non-production environment, data exposure.

I. INTRODUCTION

Cloud computing is a model for enabling access to a shared pool of resources (networks, servers, storage, etc.) which can be rapidly deployed over the internet. It is widely used in the corporate IT industry. In fact, the cloud’s potential has captured the attention of business leaders across every vertical industry who are looking to capitalize on its speed, scale, control and economics. All of this makes it necessary to have a secure cloud.
Various techniques have been used to achieve this, such as SSL (Secure Socket Layer), encryption, intrusion detection systems and multi-tenancy based access control.

It is a common misconception that external attacks cause the most damage in the cloud, but surveys conclude that the most serious damage is caused by insider attacks. Hence, data protection is an integral part of the cloud experience, one which is often ignored by organizations.

Data masking is one way to tackle the security issue of data protection. Data masking is the technique of obfuscating sensitive data to prevent its exposure. It is performed according to the access privileges of the user.

This project deals with the security issue of data protection in the corporate IT industry. It will help to prevent unauthorized users from accessing sensitive information while also dramatically decreasing the risk of a data breach. It also includes customizing data masking solutions for different regulatory or business requirements. The project will also enable IT organizations to apply sophisticated masking to limit sensitive data access, with flexible data masking rules based on a user’s authentication level. By blocking unauthorized users, auditing the data, and alerting the users, IT personnel and outsourced teams who access sensitive information in case of a data breach, it ensures compliance with security policies and with industry and civil privacy regulations.

Data masking replaces the existing sensitive information in test or development environments with realistic but not real information. Data masking techniques obscure specific data within a database table or a database server while ensuring that data security is maintained.

Data masking is an effective strategy for reducing the risk of data exposure and data breaches from both inside and outside an organization, and should be considered the norm for provisioning non-production databases.
Effective data masking requires data to be altered in such a way that the actual values are re-engineered while the functional and structural meaning of the data is retained, so that it can be used in a meaningful way without compromising security.

II. REVIEW OF LITERATURE

A. Security Techniques for Data Protection in Cloud Computing

In Security Techniques for Data Protection in Cloud Computing, Kire Jakimoski describes the security issues that are gaining great attention nowadays, including data protection, network security, virtualization security, application integrity and identity management. Data protection is one of the most important of these issues, because organizations will not transfer their data to remote machines if there is no guaranteed data protection from the cloud service providers. Many techniques have been suggested for data protection in cloud computing, but there are still many challenges in this area. The most popular security techniques include SSL (Secure Socket Layer) encryption, intrusion detection systems and multi-tenancy based access control. The goal of that paper is to analyze and evaluate the most important security techniques for data protection in cloud computing. Furthermore, it recommends security techniques for data protection in order to improve security in cloud computing.

B. Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing

In Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing [5], Cong Wang, Qian Wang and Kui Ren describe how cloud storage lets users store their data remotely and enjoy on-demand, high-quality applications and services from a shared pool of configurable computing resources, without the burden of local data storage and maintenance. However, the fact that users no longer have physical possession of the outsourced data makes data integrity protection in cloud computing a formidable task, especially for users with constrained computing resources. Moreover, users should be able to use cloud storage as if it were local, without having to worry about verifying its integrity. Thus, enabling public auditability for cloud storage is of critical importance, so that users can resort to a third party auditor (TPA) to check the integrity of outsourced data and be worry-free. To securely introduce an effective TPA, the auditing process should bring no additional vulnerabilities to the data privacy of users and should introduce no additional online burden for them. In this paper, the authors propose a secure cloud storage system supporting privacy-preserving public auditing. They further extend their result to enable the TPA to perform data audits for multiple users simultaneously and efficiently. Extensive security and performance analysis shows that the proposed schemes are provably secure and highly efficient.

C. Securing Cloud Data in Transit Using Masking Technique in Cloud Enabled Multi Tenant Software Service

In Securing Cloud Data in Transit Using Masking Technique in Cloud Enabled Multi Tenant Software Service, S. Selvakumar and M. Mohanapriya describe the data security issues of the cloud computing environment. Their work employs data masking to hide sensitive data from cloud services, thereby ensuring reliability and trust in the cloud environment.
Data in the cloud can be categorized into three states: at rest, in transit and in use. The main aim of that paper is to integrate security into data masking techniques. The authors apply existing mechanisms to the cloud environment to secure the data with virtual machine masking and platform masking. Findings: the masked data is transmitted to the processing environment, and the services in the cloud use this masked data for processing. This is more secure than the conventional technique. The mechanism increases trustworthiness, and data can be masked dynamically or statically in application or database based service environments. Application/Improvements: the main application of this research is to serve people with a secured cloud, thereby overcoming data security issues.

D. A Few New Approaches to Data Masking

In A Few New Approaches to Data Masking, G. Sarada, G. Manikandan and Dr. N. Sairam put forward four new approaches for masking data, using min-max normalization, fuzzy logic, rail-fence and map range. Their experimental results show that these techniques overcome the limitations of the traditional methods. The advantage of their approach is that it makes the masked data appear as original data to the end users. This work can be extended in the future with methods that can mask both categorical and numeric data.

III. PROPOSED WORK

The user provides login credentials and is accordingly allowed or denied access. Users can access or update the data as per their assigned access privileges. Whenever data needs to be accessed for a non-production environment, the user sends a query through the application. The application forwards this query to the server/database. The query is processed and the (unmasked) result is captured by the application, where it is masked dynamically to generate realistic-looking but fake data on which the tests can be carried out.
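
The query flow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the table layout, the role name `production_admin`, the rule table `MASKING_RULES` and the helper names are all hypothetical, and an in-memory SQLite database stands in for the production server. For brevity the masking rules only mask characters out; they do not generate realistic substitute values.

```python
import sqlite3

def mask_name(name: str) -> str:
    """Keep the first letter and replace the rest with '*'."""
    return name[:1] + "*" * (len(name) - 1)

def mask_card(number: str) -> str:
    """Keep the first and last space-separated groups, mask the middle ones."""
    groups = number.split()
    if len(groups) < 3:
        # Too short to keep anything visible: mask every character.
        return "X" * len(number)
    return " ".join([groups[0]] + ["XXXX"] * (len(groups) - 2) + [groups[-1]])

# Hypothetical rule table: column name -> masking function.
MASKING_RULES = {"name": mask_name, "card_number": mask_card}

# Hypothetical roles that may see unmasked production data.
PRIVILEGED_ROLES = {"production_admin"}

def run_query(conn, sql, role):
    """Forward the query to the database, capture the unmasked result,
    and mask it in the application according to the caller's role."""
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    if role in PRIVILEGED_ROLES:
        return rows
    return [
        {col: MASKING_RULES[col](val) if col in MASKING_RULES else val
         for col, val in row.items()}
        for row in rows
    ]

def demo_connection():
    """Build a small in-memory table standing in for production data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER, name TEXT, card_number TEXT)")
    conn.execute(
        "INSERT INTO customers VALUES (1, 'Asha', '5289 7895 1236 4598')")
    return conn
```

Calling `run_query(demo_connection(), "SELECT * FROM customers", "tester")` returns the row with the name and card number masked, while the same call with the role `production_admin` returns the stored values unchanged. Keeping the rules in one table keyed by column name is only one possible design; it makes it easy to vary the rules per regulatory or business requirement.
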
Masking the results in this way prevents the exposure of sensitive production data to testers, developers and other non-production users.

A comprehensive four-step approach to implementing data masking is followed: analyse, assess, secure and test. The last two steps are described together below.

A. Analyse Sensitive Data

This phase identifies sensitive or regulated data across the entire organization. The purpose is to come up with the list of sensitive data elements specific to the organization and to discover the associated tables, columns and relationships across databases that contain the sensitive data. This is usually carried out by data, security and business analysts.

B. Assess

This phase identifies the masking algorithms that will replace the original sensitive data. Developers or DBAs work with business or security analysts to come up with their own masking routines.

C. Secure and Test

This is the iterative phase. The security administrator executes the masking process to secure the sensitive data. Once the masking process has completed and has been verified, the DBA hands over the environment to the application testers. The production users execute application processes to test whether the resulting masked data can be turned over to the other non-production users. If the masking routines need to be tweaked further, the DBA restores the database to the pre-masked state, fixes the masking algorithms and re-executes the masking process.

IV. DYNAMIC MASKING TECHNIQUES

A. Fuzzy Based Approach

The concept of fuzzy sets is an extension of classical set theory. In fuzzy sets, the gradual assessment of data is done using a suitable fuzzy membership function. Fuzzification of the original data into a fuzzy set preserves privacy and the relativity between data [9]. This approach enhances the efficiency of clustering by decreasing the required number of passes. The fuzzy membership function used also influences the processing time. Thus, selecting a proper fuzzy membership function can improve the efficiency of the algorithm and also aid in overcoming most of the limitations of the traditional methods noted in Section II.
In our work, data masking is achieved using the S-shaped membership function. In its standard two-parameter form, the S-shaped fuzzy membership function is given by

$$
f(x; a, b) =
\begin{cases}
0, & x \le a \\
2\left(\dfrac{x-a}{b-a}\right)^{2}, & a \le x \le \dfrac{a+b}{2} \\
1 - 2\left(\dfrac{x-b}{b-a}\right)^{2}, & \dfrac{a+b}{2} \le x \le b \\
1, & x \ge b
\end{cases}
\tag{1}
$$

where x is the value of the sensitive attribute, and a and b are the minimum and maximum values of the sensitive attribute. The only limitation of this approach is that it can only map values into the range between 0 and 1. Still, it can be used to mask data whose domain is from 0 to 1.

B. Rail-fence Method

This technique is mostly applied to categorical data: the original data is written row-wise (or column-wise) and the transformed data is fetched by traversing column-wise (or row-wise, respectively).

C. Map Range (Rosetta Code)

The map range method of normalization maps the original data to a range given by the user (mostly for mapping large values to a small range). This method is used for numerical data. The formula is as follows: given two ranges [a1, a2] and [b1, b2], a value s in the range [a1, a2] is linearly mapped to a value t in the range [b1, b2] when

$$
t = b_1 + \frac{(s - a_1)(b_2 - b_1)}{a_2 - a_1}
\tag{2}
$$

D. Masking Out

Masking out replaces certain parts of the data with specific characters (X or *). Care should be taken to mask out only the appropriate data and not the required information; if the required information is masked, the entire field becomes useless. This technique is generally used in credit card transfers and in internet banking. For example, the credit card number 5289 7895 1236 4598 can be masked as 5289 XXXX XXXX 4598. Applying different patterns to the same field would be tedious.

V. CONCLUSION

We propose an application that takes into consideration one of the most vital aspects of cloud storage security: data protection. We plan to prevent the exposure of sensitive production data by employing advanced dynamic data masking techniques. This yields obfuscated, realistic-looking data which can then be deployed to testing and other non-production environments.
Dynamic masking is one of the measures which organizations should employ to safeguard their data from insider attacks.

REFERENCES

[1] https://en.wikipedia.org/wiki/Cloud_computing
[2] https://www.forbes.com/sites/ibm/2014/11/03/three-companies-that-transformed-their-businesses-using-cloud-computing/#1e52cf921b66
[3] Kire Jakimoski, "Security Techniques for Data Protection in Cloud Computing", International Journal of Grid and Distributed Computing, vol. 9, no. 1, 2016.
[4] https://www.isdecisions.com/insider-threat/statistics.html
[5] Cong Wang, Qian Wang, Kui Ren, "Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing", IEEE INFOCOM 2010.
[6] https://www.gartner.com/it-glossary/dynamic-data-masking-ddm
[7] http://searchsecurity.techtarget.com/definition/data-breach
[8] S. Selvakumar and M. Mohanapriya, "Securing Cloud Data in Transit using Data Masking Technique in Cloud Enabled Multi Tenant Software Service", Indian Journal of Science and Technology, vol. 9, no. 20, 2016.
[9] G. Sarada, G. Manikandan, Dr. N. Sairam, "A Few New Approaches to Data Masking", International Conference on Circuit, Power and Computing Technology, 2015.
[10] Data Masking Best Practice, Oracle White Paper, June 2013.
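
As a supplementary illustration, the four masking techniques of Section IV can be sketched in Python as follows. This is a sketch under assumptions rather than the implementation used in the project. The function names and default parameters are hypothetical. `s_shaped` assumes the standard two-parameter S-shaped membership function, and `rail_fence_mask` follows the row-wise write and column-wise read described in Section IV, not the zigzag rail-fence cipher.

```python
def s_shaped(x, a, b):
    """Standard two-parameter S-shaped membership function (assumed form).
    a and b are the minimum and maximum of the sensitive attribute;
    the result always lies between 0 and 1."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    mid = (a + b) / 2
    if x <= mid:
        return 2 * ((x - a) / (b - a)) ** 2
    return 1 - 2 * ((x - b) / (b - a)) ** 2

def map_range(s, a1, a2, b1, b2):
    """Linearly map s from the range [a1, a2] to the range [b1, b2]."""
    return b1 + (s - a1) * (b2 - b1) / (a2 - a1)

def rail_fence_mask(text, columns):
    """Write the text row-wise into a grid with the given number of
    columns, then read the grid column by column."""
    rows = [text[i:i + columns] for i in range(0, len(text), columns)]
    return "".join(
        row[c] for c in range(columns) for row in rows if c < len(row))

def mask_out(value, keep_first=4, keep_last=4, mask_char="X"):
    """Replace every letter or digit with mask_char, except the first
    keep_first and the last keep_last; separators are left in place."""
    total = sum(ch.isalnum() for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch.isalnum():
            seen += 1
            visible = seen <= keep_first or seen > total - keep_last
            out.append(ch if visible else mask_char)
        else:
            out.append(ch)
    return "".join(out)
```

For instance, `mask_out("5289 7895 1236 4598")` reproduces the credit card example of Section IV, and `map_range(50000, 0, 100000, 0, 1)` maps a large value into the range from 0 to 1. The numeric functions are deterministic and preserve the ordering of the input values, which keeps the masked data usable for testing but also means they should not be treated as encryption.
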