Abstract
In emerging information networks, it is crucial to provide efficient search
over distributed documents while preserving their owners’ privacy;
privacy-preserving indexes (PPI) present a possible solution. An understudied
problem for PPI techniques is how to provide differentiated privacy
preservation in the presence of multi-keyword document search. The
differentiation is necessary because terms and phrases bear innate differences
in their semantic meanings.
In this paper we present ε-MPPI, the first work to provide distributed
document search with quantitatively differentiated privacy preservation. In
the design of ε-MPPI, we identified a suite of challenging problems and
proposed novel solutions. For one, we formulated the quantitative privacy
computation as an optimization problem that strikes a balance between privacy
preservation and search efficiency. We also addressed the challenging problem
of secure ε-MPPI construction in a multi-domain information network that lacks
mutual trust between domains. Towards a secure ε-MPPI construction with
practically acceptable performance, we proposed to optimize the performance of
secure multi-party computations by making a novel use of secret sharing. We
implemented the ε-MPPI construction protocol in a functioning prototype and
conducted extensive experiments on a real-world dataset to evaluate the
prototype’s effectiveness and efficiency.
Aim
The aim is to provide differentiated privacy preservation in the presence of
multi-keyword document search.
Scope
To implement ε-MPPI, a new PPI abstraction that can quantitatively control the
privacy leakage for multi-keyword document search.
Existing system
Secure Indexing on Untrusted Servers
The existing system performs data indexing in P2P networks. These P2P indices
are built on top of, and distributed across, Distributed Hash Tables (DHT).
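To make the idea of a DHT-distributed index concrete, the following is a minimal sketch of how an index term could be assigned to a responsible node by hashing both onto a ring. The function names `ring_position` and `responsible_node`, the node names, and the choice of SHA-1 are illustrative assumptions, not details taken from the existing systems described here.

```python
import hashlib
from bisect import bisect_left


def ring_position(key, bits=32):
    """Hash a string key to a position on a ring of size 2**bits (assumed layout)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def responsible_node(term, node_names):
    """Return the node whose ring position is the first at or after the term's.

    Wraps around to the first node when the term hashes past the last node.
    """
    ring = sorted((ring_position(name), name) for name in node_names)
    positions = [position for position, _ in ring]
    index = bisect_left(positions, ring_position(term)) % len(ring)
    return ring[index][1]


# Hypothetical node names, for illustration only.
nodes = ["node-a", "node-b", "node-c", "node-d"]
owner_of_term = responsible_node("privacy", nodes)
```

Because the assignment depends only on hashes, every peer that knows the node list resolves the same term to the same node without central coordination.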
Privacy Definitions for Anonymization
Publishing public-use data about individuals without revealing sensitive
information has received a lot of research attention in the last decade.
Various privacy definitions have been proposed and gained popularity,
including k-anonymity, l-diversity, and differential privacy. In particular, in
a k-anonymized dataset, each record is indistinguishable from at least k−1
other records. This idea is applied in the PPI setting: most existing PPIs use
grouping to make servers k-anonymized in the public-use PPI. We propose a
non-grouping ε-MPPI, which shows promise for better quality of privacy
preservation. ε-MPPI uses a new privacy definition, ε-PHRASE-PRIVACY, to
specifically address privacy in multi-term document searches. The privacy
definition most relevant to our ε-PHRASE-PRIVACY degree is r-confidentiality,
which also addresses the privacy preservation of a PPI system for public use.
However, r-confidentiality does not specifically consider the case of
multi-term phrases.
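The k-anonymity condition mentioned above can be checked mechanically. The sketch below assumes records are dictionaries and that the caller names the quasi-identifier fields; the helper `is_k_anonymous` and the sample records are hypothetical and only illustrate the definition, not any PPI grouping scheme.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    is shared by at least k records."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already-generalized records for illustration.
records = [
    {"zip": "130**", "age": "20-29", "condition": "flu"},
    {"zip": "130**", "age": "20-29", "condition": "asthma"},
    {"zip": "148**", "age": "30-39", "condition": "flu"},
    {"zip": "148**", "age": "30-39", "condition": "diabetes"},
]

two_anonymous = is_k_anonymous(records, ["zip", "age"], 2)
three_anonymous = is_k_anonymous(records, ["zip", "age"], 3)
```

Here each (zip, age) combination covers exactly two records, so the dataset satisfies the condition for k = 2 but not for k = 3.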
Disadvantages
· Existing work focuses on single-term phrase protection.
· In the age of cloud computing, data users, while enjoying a multitude of
benefits from the cloud (e.g. cost effectiveness and data availability), are
simultaneously reluctant or even resistant to using the cloud, as they lose
control of their data.
Proposed System
Although ε-MPPI currently assumes a centralized entity for index serving, it
is straightforward to extend its architecture to a P2P network: ε-MPPI can be
served as a P2P index if a DHT structure is imposed on the information
network, which achieves better load balancing and scalability.
This project proposes ε-MPPI for multi-term phrase publication with
quantitative privacy control in emerging information networks. We propose
several practical approaches for the secure construction of an ε-MPPI system
in an environment without mutual trust, while still providing multi-term
privacy. For practical performance of secure computations, we propose an
MPC-reduction technique based on the efficient use of secret sharing schemes.
We also discovered a common-term vulnerability and proposed a term-mixing
solution. Through both
simulation-based and real experiments, we evaluate the effectiveness and
efficiency of the approach.
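As background for the secret-sharing idea, the following is a generic sketch of additive secret sharing used to compute a sum across parties without any party revealing its own input. It is not the construction protocol of the paper; the modulus `PRIME` and the helpers `share`, `reconstruct`, and `secure_sum` are assumptions made for illustration, and the example simulates all parties in one process.

```python
import secrets

PRIME = 2**61 - 1  # assumed modulus; any sufficiently large prime would do


def share(value, n):
    """Split value into n additive shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recover the shared value by adding all shares modulo PRIME."""
    return sum(shares) % PRIME


def secure_sum(private_values):
    """Simulate n parties summing their inputs via additive shares."""
    n = len(private_values)
    # Each party i splits its input into n shares, one destined for each party.
    all_shares = [share(value, n) for value in private_values]
    # Party j adds up the j-th share it received from every party.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the partial sums are combined, never the individual inputs.
    return reconstruct(partial_sums)


# Example: four owners each hold a 0/1 flag for whether they possess a term.
owners_with_term = secure_sum([1, 0, 1, 1])
```

Any proper subset of a value's shares is uniformly random on its own, and adding shares is cheap, which is why schemes of this kind are a common way to reduce the work left to heavier multi-party computation.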
Advantages
Compared to existing work on secure data serving in the cloud, the PPI scheme
is unique in the following respects:
1) Data is stored in plain text (i.e. without encryption) in the PPI server,
which makes efficient and scalable data serving with rich functionality
possible. Without the use of encryption, PPI preserves user privacy by adding
noise to obscure the sensitive ground-truth information.
2) Only coarse-grained information (e.g. the possession of a searched phrase by
an owner) is stored in the PPI server, while the original private content is
still maintained and protected on the personal servers, under user-specified
access control rules.
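The noise-based protection described in point 1 can be illustrated with a small sketch: owners who truly hold a phrase are always listed, while other owners are listed as false positives with some probability. The function `publish_index` and its `false_positive_rate` parameter are hypothetical; how the actual noise level is derived from the privacy degree is not shown here.

```python
import random


def publish_index(true_membership, false_positive_rate, rng=None):
    """Return the published owner -> bool map for one phrase.

    True holders are always published, so search keeps full recall.
    Non-holders are published with probability false_positive_rate (noise).
    """
    rng = rng or random.Random()
    published = {}
    for owner, has_phrase in true_membership.items():
        if has_phrase:
            published[owner] = True
        else:
            published[owner] = rng.random() < false_positive_rate
    return published


# Hypothetical ground truth: which owners really hold the phrase.
truth = {"owner-1": True, "owner-2": False, "owner-3": False, "owner-4": True}
noisy = publish_index(truth, false_positive_rate=0.5, rng=random.Random(7))
```

A searcher who sees the published map cannot tell true holders from injected false positives, and must contact each listed owner, who then applies its own access control rules.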
System Architecture
The PPI System
SYSTEM CONFIGURATION
Hardware Requirements
- Speed - 1.1 GHz
- Processor - Pentium IV
- RAM - 512 MB (min)
- Hard Disk - 40 GB
- Keyboard - Standard Windows Keyboard
- Mouse - Two or Three Button Mouse
- Monitor - LCD/LED
Software Requirements
- Operating System : Windows 7
- Front End : ASP.Net and C#
- Database : MSSQL
- Tool : Microsoft Visual Studio
References
Yuzhe Tang, Ling Liu, “Privacy-Preserving Multi-Keyword Search in Information
Networks”, IEEE Transactions on Knowledge and Data Engineering (Volume: PP,
Issue: 99), March 2015.