Privacy-disclosure and Privacy-Preserving for Online Data Open Access
Many web databases exist in the world, and some of them are hidden from public view. These hidden databases provide search interfaces through which users issue queries, and they present information based on each request. The information is usually a ranked list of search results produced by a ranking function (e.g., only the top-k results are returned, where k is much smaller than the size of the database). Examples include Amazon.com and expedia.com, each of which dynamically creates pages based on the user's request. Other web databases, however, are open to their users and allow them to retrieve a complete dataset. Examples include Apple Watch health data and credit card transaction histories; each such database grants users access to their own complete data.

In this dissertation, we consider two kinds of problems. The first is to infer hidden information over web databases. One such problem is to use published time-series data to infer a user's daily activities. Theoretical analysis and real-world experiments demonstrate the effectiveness of our proposed algorithms over the baseline algorithm. We also investigate a novel problem on the implications of the information asymmetry model with transparency strategies. We propose the IHF-matching algorithm, and real-world experiments demonstrate its high attack success rate over real datasets. The second is to protect privacy in web databases. We propose a novel privacy-preserving framework that protects the privacy of private attributes not only under inference attacks but also under arbitrary attack methods. We demonstrate the effectiveness and efficiency of our framework through theoretical analysis and extensive experiments over real-world datasets.