Browsing by Author "Yang, Xiaowei"
Item Embargo Characterizing and Detecting Physical Layer Issues in Cable Broadband Networks (2023) Hu, Jiyao
Cable broadband networks are one of the few "last-mile" broadband technologies widely available in the U.S. The COVID-19 pandemic has highlighted the critical role that broadband networks play in the U.S. economy and society. Unfortunately, cable broadband networks usually suffer from poor reliability. Many cable broadband networks in the United States were built in the 1990s and early 2000s, which results in poor application-layer performance, e.g., high packet loss rates, slow web responses, or low-quality video streaming. Improving the reliability of cable broadband networks benefits both customers and operators.
This work aims to improve the reliability of cable broadband networks with the help of a regional cable ISP. Our cooperating ISP provided telemetry data collected by the Proactive Network Maintenance (PNM) infrastructure in cable networks. The data was collected from 77K+ cable modems spread across 394 hybrid-fiber-coaxial (HFC) network segments during a 16-month period. First, the study investigates the degree of unreliability in cable broadband networks by examining network-layer packet loss resulting from physical-layer transmission errors. We estimate that physical-layer errors can contribute to 12% to 25% of packet loss in the cable ISPs measured by the FCC's Measuring Broadband America project. Then, we propose CableMon, the first public-domain system that applies machine learning techniques to PNM data to improve the reliability of cable broadband networks. CableMon uses statistical models to generate features from time series data and uses customer trouble tickets as hints to infer abnormal thresholds for these generated features. Our results show that 81.9% of the abnormal events detected by CableMon overlap with at least one customer trouble ticket. This ticket prediction accuracy is four times higher than that of the existing public-domain tools used by ISPs. Last, we present TelAPart, a fault diagnosis system for cable networks that can differentiate between network faults caused by a faulty component inside a cable network and those caused by a faulty component within a user's premises.
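The idea of using trouble tickets as hints for choosing an abnormality threshold can be illustrated with a minimal sketch. The abstract does not specify how CableMon selects thresholds, so everything here is an assumption made for illustration: the function names, the smoothed ticket-rate ratio used as the selection criterion, and the toy data are all hypothetical.

```python
from typing import Sequence


def ticket_rate_ratio(values: Sequence[float],
                      has_ticket: Sequence[int],
                      threshold: float) -> float:
    """Smoothed ticket rate above the threshold divided by the rate below it.

    Hypothetical criterion: a good threshold separates samples that coincide
    with trouble tickets from samples that do not.
    """
    above = [t for v, t in zip(values, has_ticket) if v > threshold]
    below = [t for v, t in zip(values, has_ticket) if v <= threshold]
    # Additive (Laplace) smoothing avoids division by zero on small samples.
    rate_above = (sum(above) + 1) / (len(above) + 2)
    rate_below = (sum(below) + 1) / (len(below) + 2)
    return rate_above / rate_below


def pick_threshold(values: Sequence[float],
                   has_ticket: Sequence[int],
                   candidates: Sequence[float]) -> float:
    """Return the candidate threshold with the highest ticket-rate ratio."""
    return max(candidates,
               key=lambda c: ticket_rate_ratio(values, has_ticket, c))


# Toy data: one feature value per time window, and whether a ticket
# was filed in that window (1) or not (0).
feature_values = [1, 2, 3, 4, 5, 6, 7, 8]
ticket_flags = [0, 0, 1, 0, 0, 1, 1, 1]
chosen = pick_threshold(feature_values, ticket_flags, candidates=[2, 5, 7])
```

In this toy data, tickets cluster in the windows with the highest feature values, so the selected threshold falls where that clustering begins.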
Item Open Access Improving Network Security with Low-Cost and Easy-to-Adopt Solutions (2020) Zheng, Shengbao
Security is always a big concern. According to statistics, there are over 80,000 cyberattacks per day, or over 30 million attacks per year. To make the Internet safe, both industry and academia have proposed many solutions. However, these security solutions mainly concentrate on being effective and ignore two other features: deployment cost and usability. Therefore, although many solutions have been proposed to improve security, attacks still happen frequently.
Our goal is to improve network security with low-cost and easy-to-adopt solutions. In this thesis, we choose Distributed Denial-of-Service (DDoS) attacks and I/O path malware attacks as two representative cases. Fueled by IoT botnets and DDoS-for-Hire services, DDoS attacks have reached a record high volume, and launching such attacks is increasingly easy and cheap. We speculate that existing solutions still leave DDoS as the top threat for two main reasons: 1) commercial DDoS protection services are costly, and 2) solutions that require upgrading the core Internet architecture have turned out to be extremely difficult to deploy. Similarly, modern operating systems enable user-level malware to log a user's keystrokes or scrape a user's screen output, which usually contains sensitive user data. Solutions based on trusted hardware, virtual machines, or mobile phone facilitation all impose high deployment and usability costs on non-expert users.
In this thesis, we present our low-cost and easy-to-adopt solutions to these two attacks. Specifically, we introduce 1) Dynashield, an on-demand DDoS defense architecture built on top of different cloud services. Dynashield incurs lower financial cost than Protection-as-a-Service products like Cloudflare, and is easier to adopt than network-architecture-based solutions. 2) Switchman, a framework that protects a user's I/O paths against user-level malware attacks that steal sensitive private data. Switchman helps non-expert users protect their sensitive data. It is easier to adopt than trusted hardware solutions like Intel SGX, and has higher usability than solutions based on VMs or additional devices.
Item Open Access Improving System Availability for Better Services (2020) Zhou, Zhenyu
As we enter the information age, the demand for online services is increasing dramatically. This demand is pushing network systems to become more complex and making system availability a crucial requirement for both service providers and clients. Service providers aim to offer an effective, efficient, and stable service: the service should be failure resilient, scale to support a large group of clients, and still maintain acceptable performance. Clients, in turn, need a “powerful” service: high performance without threats to their privacy or security. To this end, huge efforts have been made to address service availability issues, to detect failures, to overcome the failures introduced by bugs and scalability problems, and to provide strong guarantees of performance, effectiveness, and security. We explore the system availability problem in several network scenarios, including Software Defined Networks, Data Center Networks, and Cable Networks, and propose new concepts to further improve the availability of networking services.
Item Open Access Mitigating Denial-of-Service Flooding Attacks with Source Authentication (2012) Liu, Xin
Denial-of-Service (DoS) flooding attacks have become a serious threat to the reliability of the Internet. For instance, a report published by Arbor Networks reveals that the largest DoS flooding attack observed in 2010 reached 100 Gbps in attack traffic volume. The defense against DoS flooding attacks is significantly complicated by the fact that the Internet lacks accountability at the network layer: it is very difficult, if not impossible, for the receiver of an IP packet to associate the packet with its real sender, as the sender is free to craft any part of the packet.
This dissertation proposes to mitigate DoS flooding attacks with a two-step process: first to establish accountability at the network layer, and second to utilize the accountability to efficiently and scalably mitigate the attacks. It proposes Passport, a source authentication system that enables any router forwarding a packet to cryptographically verify the source Autonomous System (AS) of the packet. Passport uses symmetric key cryptography to enable high-speed verification and piggy-backs its key exchange into the inter-domain routing system for efficiency and independence from non-routing infrastructures.
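A minimal sketch of symmetric-key source authentication, in the spirit of what the paragraph above describes, may help make the idea concrete. The abstract states only that Passport uses symmetric key cryptography and exchanges keys through inter-domain routing. The choice of HMAC-SHA256, the 8-byte truncation, the message layout, and the function names below are all assumptions for illustration, and the keys are simply taken as given.

```python
import hashlib
import hmac
from typing import Dict

MAC_LEN = 8  # truncated MAC length in bytes (an assumption for this sketch)


def _mac(key: bytes, src_as: int, packet: bytes) -> bytes:
    """MAC over the source AS number and the packet bytes."""
    message = src_as.to_bytes(4, "big") + packet
    return hmac.new(key, message, hashlib.sha256).digest()[:MAC_LEN]


def stamp(packet: bytes, src_as: int,
          keys_by_as: Dict[int, bytes]) -> Dict[int, bytes]:
    """Source side: compute one MAC per AS on the path.

    keys_by_as maps each on-path AS number to the symmetric key that the
    source AS shares with it (assumed to be already established).
    """
    return {asn: _mac(key, src_as, packet) for asn, key in keys_by_as.items()}


def verify(packet: bytes, claimed_src_as: int, my_as: int,
           shared_key: bytes, macs: Dict[int, bytes]) -> bool:
    """Router side: check the MAC addressed to this AS."""
    expected = _mac(shared_key, claimed_src_as, packet)
    return hmac.compare_digest(macs.get(my_as, b""), expected)


# AS 100 sends a packet that traverses AS 200 and AS 300.
keys = {200: b"key-100-200", 300: b"key-100-300"}
macs = stamp(b"payload", 100, keys)
```

Because each check is a single keyed hash over data already in the packet, verification of this kind needs no per-packet public-key operation, which is consistent with the high-speed verification goal stated above.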
On top of Passport, this dissertation proposes NetFence, a DoS flooding attack mitigation system that provides two levels of protection against the attacks: if a victim can receive and identify the attack traffic, it can throttle the attack traffic close to the attack sources; otherwise, the attack traffic cannot be eliminated, but it would not be able to consume more than the attack sources' fair shares of the capacity of any bottleneck link. NetFence achieves its goals by putting unforgeable congestion policing feedback into each packet. The feedback allows bottleneck routers to convey congestion information back to the access routers that police the traffic accordingly. A destination host can throttle unwanted traffic by not returning the feedback to the source host.
We have implemented prototypes of Passport and NetFence in both the ns-2 simulator and Linux. We have also implemented a prototype of Passport on a NetFPGA board. Our evaluation of the prototypes, as well as our security and theoretical analysis, demonstrates that both Passport and NetFence are practical for high-speed router implementation and could mitigate a wider range of attacks in a more scalable way than previous work.
Item Open Access Predicting Application Performance in the Cloud (2011) Zong, Xuanran
Despite the prominence of cloud computing, customers lack a direct way
to select the cloud that delivers the best performance,
because of the performance heterogeneity among cloud providers. Existing solutions
either migrate the application to each cloud and evaluate the performance
individually, or benchmark each cloud along various dimensions and predict the
overall performance of the application. However, the former incurs significant
migration and configuration overhead, while the latter may suffer from coarse
prediction accuracy.
This thesis introduces two systems to address this issue. CloudProphet predicts web
application performance by tracing and replaying the on-premise resource demand
on the cloud machines. DTRCP further predicts the performance of general
applications. In particular, it addresses the execution path divergence
that arises when replaying the on-premise resource demand. Our experimental
results show that both systems can accurately predict application
performance.
Item Embargo Proactive and Passive Performance Optimization of IP Anycast (2023) Zhang, Xiao
IP anycast, a vital routing technique, can distribute user requests to different servers worldwide that share the same IP address. It can improve the performance and load balancing of large-scale distributed systems. Nonetheless, all the sites in an anycast-based system have identical IP addresses, which makes it challenging to control the system’s catchment (which site a user’s traffic goes to) and results in inefficient anycast performance.
In this thesis, we introduce two approaches to optimize the performance of IP anycast, proactively and passively. The first approach, AnyOpt, builds a model that predicts the catchment site of each user from controlled experiments and measurements involving the sites. Using AnyOpt, a network operator can find a subset of anycast sites that minimizes client latency. In an experiment using 15 sites, each peering with one of six transit providers, AnyOpt predicted site catchments of 15,300 clients with 94.7% accuracy and client RTTs with a mean error of 4.6%. AnyOpt identified a subset of 12 sites such that announcing to only those sites lowers the mean RTT to clients by 33 ms compared to a greedy approach that enables the same number of sites with the lowest average unicast latency.
The second approach, regional anycast, is one that we found two large CDNs (Edgio and Imperva) have already implemented. In regional anycast, a CDN divides its content-hosting sites into different geographic regions, announces a distinct IP anycast prefix from each region, and uses DNS and IP-geolocation to direct a client to a CDN site in the same geographic area. We aim to understand how a regional anycast CDN partitions its sites and maps its customers’ clients, and how a regional anycast CDN performs compared to its global anycast counterpart. We study the deployment strategies and the performance of two CDNs (Edgio and Imperva) that currently deploy regional IP anycast. We find that both Edgio and Imperva partition their sites and clients following continent or country borders. In addition, we compare the client latency distribution in Imperva’s regional anycast CDN with that in its similar-scale DNS global anycast network, after discounting the relevant deployment differences between the two networks. We find that regional anycast can effectively mitigate the pathology in global IP anycast where BGP routes a client’s traffic to a distant CDN site (e.g., a site in a different continent). However, DNS mapping inefficiencies, where DNS returns a sub-optimal regional IP anycast address that does not cover a client’s low-latency CDN sites, can harm regional anycast’s performance. Finally, using the Tangled testbed, we show what performance benefit regional IP anycast can achieve if we discount DNS mapping sub-optimality.
We also include a measurement study of anycast flipping. We observe an increase in flipping over the past several years, reaching 4.4% of RIPE Atlas vantage points in 2023. We present evidence that, for a small but non-negligible portion of clients, the impact on web performance is significant.
Item Open Access Solving Practical Problems in Datacenter Networks (2013) Wu, Xin
The soaring demand for always-on and fast-response online services has driven modern datacenter networks to undergo tremendous growth. These networks often rely on scale-out designs with large numbers of commodity switches to reach immense capacity while keeping capital expenses in check. Today, datacenter network operators spend tremendous time and effort on two key challenges: 1) how to efficiently utilize the bandwidth connecting host pairs and 2) how to promptly handle network failures with minimal disruption to the hosted services.
To resolve the first challenge, we propose solutions in both the network layer and the transport layer. In the network layer solution, we advocate designing practical datacenter architectures for easy operation, i.e., an architecture should be reliable, capable of improving bisection bandwidth, scalable, and debugging-friendly. By strictly following these four guidelines, we propose DARD, a Distributed Adaptive Routing architecture for Datacenter networks. DARD allows each end host to reallocate traffic from overloaded paths to underloaded paths without central coordination. We use congestion game theory to show that DARD converges to a Nash equilibrium in finite steps and its gap to the optimal flow allocation is bounded in the order of 1/log L, with L being the number of links. We use a testbed implementation and simulations to show that DARD can achieve a close-to-optimal flow allocation with small control overhead in practice.
In the transport layer solution, we propose Explicit Multipath Congestion Control Protocol (MPXCP), which achieves four desirable properties: fast convergence, efficiency, fairness to flows with different RTTs, and negligible queue size. Intensive ns-2 simulation shows that MPXCP can quickly converge to efficiency and fairness without building up queues despite different delay-bandwidth products.
To resolve the second challenge, recent research efforts have focused on automatic failure localization. Yet, resolving failures still requires significant human intervention, resulting in prolonged failure recovery time. Unlike previous work, we propose NetPilot, a system that aims to quickly mitigate rather than resolve failures. NetPilot mitigates failures in much the same way operators do: by deactivating or restarting suspected offending components. NetPilot circumvents the need for knowing the exact root cause of a failure by taking an intelligent trial-and-error approach. The core of NetPilot consists of an Impact Estimator that helps guard against overly disruptive mitigation actions and a failure-specific mitigation planner that minimizes the number of trials. We demonstrate that NetPilot can effectively mitigate several types of critical failures commonly encountered in production datacenter networks.
Item Open Access Supplemental Authentication via Internet Fingerprinting (2011) Tate, Ryan
Internet websites are a regular medium for exchanging sensitive information, for example in online banking. The security of this information is paramount. Today, one facet of this security - authenticating a website to its users - depends on the trust of a third party (i.e., a certificate authority). However, web browsers currently trust many certificate authorities from around the world. Some of them may be compromised or untrustworthy. This work explores an authentication scheme that does not require trust but instead uses unexploited network characteristics of a website to authenticate the website to users. Our preliminary evaluation shows that this scheme can reject all of over 200,000 verified online phishing website visits while recognizing more than 99% of the 7,000 legitimate websites over the course of a week. Results suggest that network characteristics can provide a supplemental website authentication scheme. It has no noticeable overhead or network footprint and is independent of any third party trust.
Item Open Access Toward Attack-Resistant Distributed Information Systems by Means of Social Trust (2010) Sirivianos, Michael
Trust has played a central role in the design of open distributed systems that span distinct administrative domains. When components of a distributed system can assess the trustworthiness of their peers, they are in a better position to interact with them. There are numerous examples of distributed systems that employ trust inference techniques to regulate the interactions of their components, including peer-to-peer file sharing systems, web site and email server reputation services, and web search engines.
The recent rise in popularity of Online Social Networking (OSN) services has made an additional dimension of trust readily available to system designers: social trust. By social trust, we refer to the trust information embedded in social links as annotated by users of an OSN. This thesis' overarching contribution is methods for employing social trust embedded in OSNs to solve two distinct and significant problems in distributed information systems.
The first system proposed in this thesis assesses the ability of OSN users to correctly classify online identity assertions. The second system assesses the ability of OSN users to correctly configure devices that classify spamming hosts. In both systems, an OSN user explicitly ascribes to his friends a value that reflects how trustworthy he considers their classifications. In addition, both solutions compare the classification input of friends to obtain a more accurate measure of their pairwise trust. Our solutions also exploit trust transitivity over the social network to assign trust values to the OSN users. These values are used to weigh the classification input by each user in order to derive an aggregate trust score for the identity assertions or the hosts.
In particular, the first problem involves the assessment of the veracity of assertions on identity attributes made by online users. Anonymity is one of the main virtues of the Internet. It protects privacy and freedom of speech, but makes it hard to assess the veracity of assertions made by online users concerning their identity attributes (e.g., age or profession). We propose FaceTrust, the first system that uses OSN services to provide lightweight identity credentials while preserving a user's anonymity. FaceTrust employs a "game with a purpose" design to elicit the opinions of the friends of a user about the user's self-claimed identity attributes, and uses attack-resistant trust inference to compute veracity scores for the attributes. FaceTrust then provides credentials, which a user can use to corroborate his online identity assertions.
We evaluated FaceTrust using a crawled social network graph as well as a real-world deployment. The results show that our veracity scores strongly correlate with the ground truth, even when a large fraction of the social network users are dishonest. For example, in our simulation over the sample social graph, when 50% of users were dishonest and each user employed 1000 Sybils, the false assertions obtained approximately only 10% of the veracity score of the true assertions. We have derived the following lessons from the design and deployment of FaceTrust: a) it is plausible to obtain a relatively reliable measure of the veracity of identity assertions by relying on the friends of the user that made the assertion to classify them, and by employing social trust to determine the trustworthiness of the classifications; b) it is plausible to employ trust inference over the social graph to effectively mitigate Sybil attacks; c) users tend to mostly correctly classify their friends' identity assertions.
The second problem in which we apply social trust involves assessing the trustworthiness of reporters (detectors) of spamming hosts in a collaborative spam mitigation system. Spam mitigation can be broadly classified into two main approaches: a) centralized security infrastructures that rely on a limited number of trusted monitors (reporters) to detect and report malicious traffic; and b) highly distributed systems that leverage the experiences of multiple nodes within distinct trust domains. The first approach offers limited threat coverage and slow response times, and it is often proprietary. The second approach is not widely adopted, partly due to the lack of assurances regarding the trustworthiness of the reporters.
Our proposal, SocialFilter, aims to achieve the trustworthiness of centralized security services and the wide coverage, responsiveness, and inexpensiveness of large-scale collaborative spam mitigation. It enables nodes with no email classification functionality to query the network on whether a host is a spammer. SocialFilter employs trust inference to weigh the reports concerning spamming hosts that collaborating reporters submit to the system. To the best of our knowledge, it is the first collaborative threat mitigation system that assesses the trustworthiness of the reporters both by auditing their reports and by leveraging the social network of the reporters' human administrators. Subsequently, SocialFilter weighs the spam reports according to the trustworthiness of their reporters to derive a measure of the system's belief that a host is a spammer.
We performed a simulation-based evaluation of SocialFilter, which indicates its potential: during a simulated spam campaign, SocialFilter correctly classified 99% of spam while yielding no false positives. The design and evaluation of SocialFilter offered us the following lessons: a) it is plausible to introduce Sybil-resilient OSN-based trust inference mechanisms to improve the reliability and the attack-resilience of collaborative spam mitigation; b) using social links to obtain the trustworthiness of reports concerning spammers (spammer reports) can result in comparable spam-blocking effectiveness with approaches that use social links to rate-limit spam (e.g., Ostra); c) unlike Ostra, SocialFilter yields no false positives. We believe that the design lessons from SocialFilter are applicable to other collaborative entity classification systems.
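The step in which spam reports are weighed by reporter trustworthiness to derive a belief that a host is a spammer can be sketched as a trust-weighted average. This is only an illustration under stated assumptions: the abstract does not give the aggregation formula, and the function name, the [0, 1] value ranges, and the sample numbers are hypothetical.

```python
from typing import Iterable, Tuple


def spammer_belief(reports: Iterable[Tuple[float, float]]) -> float:
    """Trust-weighted average of reporter confidences.

    Each report is (reporter_trust, confidence), both assumed to lie in
    [0, 1]. Returns 0.0 when no trusted report is available.
    """
    reports = list(reports)
    total_trust = sum(trust for trust, _ in reports)
    if total_trust == 0:
        return 0.0
    weighted = sum(trust * confidence for trust, confidence in reports)
    return weighted / total_trust


# Two well-trusted reporters flag the host; one barely trusted reporter
# claims that it is clean.
belief = spammer_belief([(0.9, 1.0), (0.8, 1.0), (0.1, 0.0)])
```

With these inputs, the low-trust dissenting report barely moves the aggregate, which is the intended effect of weighing reports by reporter trustworthiness.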
Item Open Access Towards Systematic and Accurate Environment Selection for Emerging Cloud Applications (2012) Li, Ang
As cloud computing is gaining popularity, many application owners are migrating their
applications into the cloud. However, because of the diversity of the cloud environments
and the complexity of the modern applications, it is very challenging to find out which
cloud environment is best fitted for one's application.
In this dissertation, we design and build systems to help application owners select the
most suitable cloud environments for their applications. The first part of this thesis focuses
on how to compare the general fitness of the cloud environments. We present CloudCmp,
a novel comparator of public cloud providers. CloudCmp measures the elastic computing,
persistent storage, and networking services offered by a cloud along metrics that directly
reflect their impact on the performance of customer applications. CloudCmp strives to
ensure fairness, representativeness, and compliance of these measurements while limiting
measurement cost. Applying CloudCmp to four cloud providers that together account
for most of the cloud customers today, we find that their offered services vary widely in
performance and costs, underscoring the need for thoughtful cloud environment selection.
From case studies on three representative cloud applications, we show that CloudCmp can
guide customers in selecting the best-performing provider for their applications.
The second part focuses on how to let customers compare cloud environments in the
context of their own applications. We describe CloudProphet, a novel system that can
accurately estimate an application's performance inside a candidate cloud environment
without the need for migration. CloudProphet generates highly portable shadow programs
to mimic the behavior of a real application, and deploys them inside the cloud to estimate
the application's performance. We use the trace-and-replay technique to automatically
generate high-fidelity shadows, and leverage the popular dispatcher-worker pattern
to accurately extract and enforce the inter-component dependencies. Our evaluation in
three popular cloud platforms shows that CloudProphet can help customers pick the best-performing
cloud environment, and can also accurately estimate the performance of a
variety of applications.
Item Open Access Understanding and Defending Against Malicious Identities in Online Social Networks (2014) Cao, Qiang
Serving more than one billion users around the world, today's online
social networks (OSNs) pervade our everyday life and change the way people
connect and communicate with each other. However, the open nature of
OSNs attracts a constant interest in attacking and exploiting them.
In particular, they are vulnerable to various attacks launched through
malicious accounts, including fake accounts and compromised real user
accounts. In those attacks, malicious accounts are used to send out
spam, spread malware, distort online voting, etc.
In this dissertation, we present practical systems that we have designed
and built to help OSNs effectively throttle malicious accounts. The overarching
contribution of this dissertation is a set of approaches that leverage the fundamental
weaknesses of attackers to defeat them. We have explored defense schemes along
two dimensions of an attacker's weaknesses: limited social relationships
and strict economic constraints.
The first part of this dissertation focuses on how to leverage social
relationship constraints to detect fake accounts. We present SybilRank, a novel
social-graph-based detection scheme that can scale up to OSNs with billions of
users. SybilRank is based on the observation that the social connections between
fake accounts and real users, called attack edges, are limited. It formulates
the detection as scalable user ranking according to the landing probability of
early-terminated random walks on the social graph. SybilRank generates an informative
user-ranked list with a substantial fraction of fake accounts at the bottom,
and bounds the number of fake accounts that are ranked higher than legitimate
users to O(log n) per attack edge, where n is the total number of users. We have
demonstrated the scalability of SybilRank via a prototype on Hadoop MapReduce,
and its effectiveness in the real world through a live deployment at Tuenti,
the largest OSN in Spain.
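The ranking by landing probability of early-terminated random walks can be sketched with a short power iteration. This is a minimal illustration under stated assumptions, not the deployed system. The seed set of known-legitimate users, the default of ceil(log2 n) iterations, the degree normalization of the final scores, and the toy graph are all choices made for this sketch.

```python
import math
from typing import Dict, List, Optional, Sequence


def rank_users(adj: Dict[str, List[str]],
               seeds: Sequence[str],
               iterations: Optional[int] = None) -> List[str]:
    """Rank users from most to least trusted.

    adj is an undirected social graph as adjacency lists. Trust starts at
    the seed users and spreads for a small number of random-walk steps, so
    it has little time to cross the few attack edges.
    """
    n = len(adj)
    if iterations is None:
        iterations = max(1, math.ceil(math.log2(n)))  # early termination
    trust = {user: 0.0 for user in adj}
    for seed in seeds:
        trust[seed] = 1.0 / len(seeds)
    for _ in range(iterations):
        nxt = {user: 0.0 for user in adj}
        for user, neighbors in adj.items():
            if not neighbors:
                continue
            share = trust[user] / len(neighbors)
            for neighbor in neighbors:
                nxt[neighbor] += share
        trust = nxt
    # Degree normalization (an assumption here) keeps high-degree users
    # from ranking high only because they have many links.
    score = {u: (trust[u] / len(adj[u]) if adj[u] else 0.0) for u in adj}
    return sorted(adj, key=lambda u: score[u], reverse=True)


# Honest users a-d form a clique; fake accounts x, y, z form a triangle
# that connects to the honest region through one attack edge (d, x).
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["a", "b", "c", "x"],
    "x": ["d", "y", "z"],
    "y": ["x", "z"],
    "z": ["x", "y"],
}
ranking = rank_users(graph, seeds=["a"])
```

On this toy graph the three fake accounts end up at the bottom of the ranking, because only a small share of trust crosses the single attack edge before the walk is cut off.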
The second part of this dissertation focuses on how to exploit an attacker's
economic constraints to uncover malicious accounts. We present SynchroTrap, a system
that uncovers large groups of active malicious accounts, including both fake
accounts and compromised accounts, by detecting their loosely synchronized actions.
The design of SynchroTrap is based on the observation that malicious accounts usually
perform loosely synchronized actions to accomplish an attack mission, due to
limited budgets, specific mission goals, etc. SynchroTrap transforms the detection
into a scalable clustering algorithm. It uncovers large groups of accounts
that act similarly at around the same time for a sustained period of time. To
handle the enormous volume of user action data in large OSNs, we designed SynchroTrap
as an incremental processing system that processes small data chunks on a daily
basis but aggregates the computational results over the continuous data stream.
We implemented SynchroTrap on Hadoop and Giraph, and we deployed it on Facebook
and Instagram. This deployment has resulted in the unveiling of millions of malicious
accounts and thousands of large attack campaigns per month.
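Grouping accounts that act similarly at around the same time can be illustrated with a small clustering sketch. The abstract does not name the similarity measure or the clustering method, so the Jaccard similarity over (target, time bucket) pairs, the connected-component grouping, the parameter values, and the function name are assumptions for illustration. The pairwise comparison is quadratic and stands in for the scalable, incremental processing described above.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, List, Set, Tuple


def synchronized_groups(actions: Iterable[Tuple[str, str, int]],
                        window: int = 3600,
                        min_similarity: float = 0.5,
                        min_size: int = 2) -> List[Set[str]]:
    """Group accounts whose actions hit the same targets in the same windows.

    actions holds (account, target, unix_timestamp) tuples.
    """
    # Represent each account as a set of (target, time bucket) pairs.
    profile = defaultdict(set)
    for account, target, timestamp in actions:
        profile[account].add((target, timestamp // window))

    # Union-find over accounts; similar pairs are merged into one group.
    parent = {account: account for account in profile}

    def find(a: str) -> str:
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for a, b in combinations(sorted(profile), 2):
        shared = len(profile[a] & profile[b])
        total = len(profile[a] | profile[b])
        if total and shared / total >= min_similarity:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for account in profile:
        groups[find(account)].add(account)
    return [group for group in groups.values() if len(group) >= min_size]


# Three accounts hit the same two pages within the same hours, while two
# ordinary users act independently.
log = [
    ("bot1", "pageP", 10), ("bot2", "pageP", 20), ("bot3", "pageP", 30),
    ("bot1", "pageQ", 4000), ("bot2", "pageQ", 4010), ("bot3", "pageQ", 4020),
    ("user1", "pageR", 50), ("user2", "pageS", 9000),
]
found = synchronized_groups(log)
```

Raising `min_similarity` or requiring matches across more time windows would correspond to demanding that the similarity be sustained for longer, at the cost of missing smaller campaigns.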