OSDI 2021 accepted papers

However, with the increasingly speedy transactions and queries thanks to large memory and fast interconnect, commodity HTAP systems have to make a tradeoff between data freshness and performance degradation. With the help of thousands of Lambda threads, Dorylus scales GNN training to billion-edge graphs. Tej Chajed, MIT CSAIL; Joseph Tassarotti, Boston College; Mark Theng, MIT CSAIL; Ralf Jung, MPI-SWS; M. Frans Kaashoek and Nickolai Zeldovich, MIT CSAIL. A glance at this year's OSDI program shows that Operating Systems are a small niche topic for this conference, not even meriting their own full session. There are two major GNN training obstacles: 1) training relies on high-end servers with many GPUs, which are expensive to purchase and maintain, and 2) limited memory on GPUs cannot scale to today's billion-edge graphs. OSDI '20: 14th USENIX Conference on Operating Systems Design and Implementation, November 4-6, 2020. ISBN: 978-1-939133-19-9. Published: 04 November 2020. Sponsors: ORACLE, VMware, Google Inc., Amazon, Microsoft. The blockchain community considers this hard fork the greatest challenge since the infamous 2016 DAO hack. Notification of conditional accept/reject for revisions: 3 March 2022. Second, GNNAdvisor implements a novel and highly-efficient 2D workload management tailored for GNN computation to improve GPU utilization and performance under different application settings. blk-switch evaluation over a variety of scenarios shows that it consistently achieves µs-scale average and tail latency (at both 99th and 99.9th percentiles), while allowing applications to near-perfectly utilize the hardware capacity. A significant obstacle to using SC for practical applications is the memory overhead of the underlying cryptography.
Although SSDs can be simplified under the current ZNS interface, its counterpart LFS must bear segment compaction overhead. The OSDI Symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. The co-chairs may then share that paper with the workshop's organizers and discuss it with them. The 15th USENIX Symposium on Operating Systems Design and Implementation seeks to present innovative, exciting research in computer systems. Extensive experiments show that GNNAdvisor outperforms the state-of-the-art GNN computing frameworks, such as Deep Graph Library (3.02× faster on average) and NeuGraph (up to 4.10× faster), on mainstream GNN architectures across various datasets. We present Storm, a web framework that allows developers to build MVC applications with compile-time enforcement of centrally specified data-dependent security policies. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast initialization. Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software. Attaching supplementary material is optional; if your paper says that you have source code or formal proofs, you need not attach them to convince the PC of their existence. As a result, data characteristics and device capabilities vary widely across clients. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14-16, 2021. Our evaluation shows that DistAI successfully verifies 13 common distributed protocols automatically and outperforms alternative methods both in the number of protocols it verifies and the speed at which it does so, in some cases by more than two orders of magnitude. We propose a new framework for computing the embeddings of large-scale graphs on a single machine.
The novel aspect of the nanoPU is the design of a fast path between the network and applications, bypassing the cache and memory hierarchy and placing arriving messages directly into the CPU register file. Conference site 49 papers accepted out of 251 submitted. Reviews will be available for response on Wednesday, March 3, 2021. A.H. Hunter, Jane Street Capital; Chris Kennelly, Paul Turner, Darryl Gove, Tipp Moseley, and Parthasarathy Ranganathan, Google. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. And yet, they continue to rely on centralized search engines and indexers to help users access the content they seek and navigate the apps. A graph embedding is a fixed length vector representation for each node (and/or edge-type) in a graph and has emerged as the de-facto approach to apply modern machine learning on graphs. This formulation of memory management, which we call memory programming, is a generalization of paging that allows MAGE to provide a highly efficient virtual memory abstraction for SC. This post records some notes on a few OSDI'21 papers that I found fun. Here, we focus on hugepage coverage. This change is receiving considerable attention in the architecture and security communities, for example, but in contrast, so-called OS researchers are mostly in denial. The NAL maintains 1) per-node partial views in PM for serving insert/update/delete operations with failure atomicity and 2) a global view in DRAM for serving lookup operations. Professor Veloso is on leave from Carnegie Mellon University as the Herbert A. Simon University Professor in the School of Computer Science, and the past Head of the Machine Learning Department. 64 papers accepted out of 341 submitted. Overall, the OSDI PC accepted 31 out of 165 submissions. Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, and Roxana Geambasu, Columbia University; Mathias Lécuyer, Microsoft Research.
Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. Lifting predicates and crash framing make the specification easy to use for developers, and logically atomic crash specifications allow for modular reasoning in GoJournal, making the proof tractable despite complex concurrency and crash interleavings. A PC member is a conflict if any of the following three circumstances applies: Institution: You are currently employed at the same institution, have been previously employed at the same institution within the past two years (not counting concluded internships), or are going to begin employment at the same institution during the review period. The file system performance of the proposed ZNS+ storage system was 1.33--2.91 times better than that of the normal ZNS-based storage system. Camera-ready submission (all accepted papers): 15 March 2022. For more details on the submission process, and for templates to use with LaTeX, Word, etc., authors should consult the detailed submission requirements. The wire-to-wire RPC response time through the nanoPU is just 69ns, an order of magnitude quicker than the best-of-breed, low latency, commercial NICs. In particular, I'll argue for re-engaging with what computer hardware really is today and give two suggestions (among many) about how the OS research community can usefully do this, and exploit what is actually a tremendous opportunity. Instead, we propose addressing the root cause of the heuristics problem by allowing software to explicitly specify to the device if submitted requests are latency-sensitive. Kernel code requires manual memory management and type-unsafe code and must efficiently handle complex, asynchronous events. This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models.
Four months after we reported the bugs to Geth developers, one of the bugs was triggered on the mainnet, and caused nodes using a stale version of Geth to hard fork the Ethereum blockchain. We demonstrate that the hardware thread scheduler is able to lower RPC tail response time by about 5× while enabling the system to sustain 20% higher load, relative to traditional thread scheduling techniques. (Jan 2019) Our REPT paper won a best paper at OSDI'18 (Oct 2018) I will serve in the SOSP'19 PC. Abstract registrations that do not provide sufficient information to understand the topic and contribution (e.g., empty abstracts, placeholder abstracts, or trivial abstracts) will be rejected, thereby precluding paper submission. Concretely, Dorylus is 1.22× faster and 4.83× cheaper than GPU servers for massive sparse graphs. She also invented the spanning tree algorithm, which transformed Ethernet from a technology that supported a few hundred nodes, to something that can support large networks. Writing a correct operating system kernel is notoriously hard. Calibrated interrupts increase throughput by up to 35%, reduce CPU consumption by as much as 30%, and achieve up to 37% lower latency when interrupts are coalesced. One classical approach is to increase the efficiency of an allocator to minimize the cycles spent in the allocator code. Secure hardware enclaves have been widely used for protecting security-critical applications in the cloud. Advisor: You have a past or present association as thesis advisor or advisee. This paper presents Zeph, a system that enables users to set privacy preferences on how their data can be shared and processed. We also verified a simple NFS server using GoJournal's specs, which confirms that they are helpful for application verification: a significant part of the proof doesn't have to consider concurrency and crashes. Just using Lambdas on top of CPU servers offers up to 2.75× more performance-per-dollar than training only with CPU servers.
We build Polyjuice based on our learning framework and evaluate it against several existing algorithms. PLDI seeks outstanding research that extends and/or applies programming-language concepts to advance the field of computing. For general conference information, see https://www.usenix.org/conference/osdi22. We implemented the ZNS+ SSD on an SSD emulator and a real SSD. Sanitizers detect unsafe actions such as invalid memory accesses by inserting checks that are validated during a program's execution. Academic and industrial participants present research and experience papers that cover the full range of theory and practice of computer . We have made Fluffy publicly available at https://github.com/snuspl/fluffy to contribute to the security of Ethereum. Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. Prepublication versions of the accepted papers from the summer submission deadline are available below. AI enables principled representation of knowledge, complex strategy optimization, learning from data, and support to human decision making. By monitoring the status of each job during training, Pollux models how their goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources.
A hardware-accelerated thread scheduler makes sub-nanosecond decisions, leading to high CPU utilization and low tail response time for RPCs. This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems. She also has made contributions in network security, including scalable data expiration, distributed algorithms despite malicious participants, and DDOS prevention techniques. For instance, FAST '21 and NSDI '21 have author-notification dates after the OSDI '21 abstract-registration deadline. Submitted November 12, 2021; accepted January 20, 2022. Researchers from the Software Systems Laboratory bagged Best Paper Awards at the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2021) and the 2021 USENIX Annual Technical Conference (USENIX ATC 2021). Jay Lepreau Best Paper Award, OSDI'21. The 20th ACM Workshop on Hot Topics in Networks (HotNets 2021) will bring together researchers in computer networks and systems to engage in a lively debate on the theory and practice of computer networking. Using this property, MAGE calculates the memory access pattern ahead of time and uses it to produce a memory management plan. This talk will discuss several examples with very different solutions. USENIX, like other scientific and technical conferences and journals, prohibits these practices and may, on the recommendation of a program chair, take action against authors who have committed them. Our approach outperforms existing file systems on a block SSD by a wide margin (6.2× on average) for metadata-intensive benchmarks. (Oct 2018) Awarded an Intel Faculty Grant for Research on automated performance optimization (Sep. 2018) Our paper on Foreshadow is accepted to appear at USENIX Security.
Acknowledgements: Paper prepared for the post-conference workshop on Food for Thought: Economic Analysis in Anticipation of the Next Farm Bill at the Agricultural and Applied Economics Association annual meeting, Austin, TX. Petuum Awarded OSDI 2021 Best Paper for Goodput-Optimized Deep Learning Research Petuum CASL research and engineering team's Pollux technical paper on adaptive scheduling for optimized. Authors are required to register abstracts by 3:00 p.m. PST on December 3, 2020, and to submit full papers by 3:00 p.m. PST on December 10, 2020. Mothy joined the Computer Science Department ETH Zurich in January 2007 and was named Fellow of the ACM in 2013 for contributions to operating systems and networking research. Manuela M. Veloso is the Head of J.P. Morgan AI Research, which pursues fundamental research in areas of core relevance to financial services, including data mining and cryptography, machine learning, explainability, and human-AI interaction. Cores can safely and concurrently read from their local kernel replica, eliminating remote NUMA accesses. Consensus bugs are bugs that make Ethereum clients transition to incorrect blockchain states and fail to reach consensus with other clients. Marius is open-sourced at www.marius-project.org. Based on the observation that real-world workloads always feature skewed access patterns, Nap introduces a NUMA-aware layer (NAL) on top of existing concurrent PM indexes, and steers accesses to hot items to this layer. Upon these two primitives, our system can scale to thousands of concurrent enclaves with high resource utilization and eliminate the high-cost initialization of secure memory using fork-style enclave creation without weakening the security guarantees. Log search and log archiving, despite being critical problems, are mutually exclusive.
will work with the steering committee to ensure that the symposium program will accommodate presentations for all accepted papers. Manuela will present examples and discuss the scope of AI in her research in the finance domain. When registering your abstract, you must provide information about conflicts with PC members. Responses should be limited to clarifying the submitted work. To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. Authors may use this for content that may be of interest to some readers but is peripheral to the main technical contributions of the paper. How can we design systems that will be reliable despite misbehaving participants? If the conference registration fee will pose a hardship for the presenter of the accepted paper, please contact conference@usenix.org. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases, runs SC computations that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation. Authors of each accepted paper must ensure that at least one author registers for the conference, and that their paper is presented in person at the conference. Horcrux-compliant web servers perform offline analysis of all the JavaScript code on any frame they serve to conservatively identify, for every JavaScript function, the union of the page state that the function could access across all loads of that page. Used Zotero to organize papers about the stress and diffusion between anode and electrolyte and made a summary. As the emerging trend of graph-based deep learning, Graph Neural Networks (GNNs) excel for their capability to generate high-quality node feature vectors (embeddings). OSDI 2021 papers summary.
Despite their extensive use for debugging and vulnerability discovery, sanitizer checks often induce a high runtime cost. Timothy Roscoe is a Full Professor in the Systems Group of the Computer Science Department at ETH Zurich, where he works on operating systems, networks, and distributed systems, and is currently head of department. SanRazor adopts a novel hybrid approach: it captures both dynamic code coverage and static data dependencies of checks, and uses the extracted information to perform a redundant check analysis. Accepted papers will be allowed 14 pages in the proceedings, plus references. She is the recipient of several best paper awards, the Einstein Chair of the Chinese Academy of Science, the ACM/SIGART Autonomous Agents Research Award, an NSF Career Award, and the Allen Newell Medal for Excellence in Research. In particular, responses must not include new experiments or data, describe additional work completed since submission, or promise additional work to follow. We focus on NVMe storage devices and show that it is natural to express these semantics in the kernel and the application and only requires a modest two-bit change to the device interface. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. They collectively make the backup fresh, columnar, and fault-tolerant, even facing millions of concurrent transactions per second. However, memory allocation decisions also impact overall application performance via data placement, offering opportunities to improve fleetwide productivity by completing more units of application work using fewer hardware resources. Such centralized engines are in a perfect position to censor content and violate users' privacy, undermining some of the key tenets behind decentralization.
In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. Prior or concurrent workshop publication does not preclude publishing a related paper in OSDI. All the times listed below are in Pacific Daylight Time (PDT). For example, talks may be shorter than in prior years, or some parts of the conference may be multi-tracked. The conference papers and full proceedings are available to registered attendees now and will be available to everyone beginning Wednesday, July 14, 2021. Session Chairs: Ryan Huang, Johns Hopkins University, and Manos Kapritsos, University of Michigan. Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh, Suman Jana, and Gabriel Ryan, Columbia University. Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl. When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments. Academic and industrial participants present research and experience papers that cover the full range of theory . Across a wide range of pages, phones, and mobile networks covering web workloads in both developed and emerging regions, Horcrux reduces median browser computation delays by 31-44% and page load times by 18-37%. J.P. Morgan AI Research partners with applied data analytics teams across the firm as well as with leading academic institutions globally. Third, GNNAdvisor capitalizes on the GPU memory hierarchy for acceleration by gracefully coordinating the execution of GNNs according to the characteristics of the GPU memory structure and GNN workloads. We also show that Marius can scale training to datasets an order of magnitude beyond a single machine's GPU and CPU memory capacity, enabling training of configurations with more than a billion edges and 550 GB of total parameters on a single machine with 16 GB of GPU memory and 64 GB of CPU memory.
Welcome to the SOSP 2021 Website. The abstractions we design for the privacy resource mirror those defined by Kubernetes for traditional resources, but there are also major differences. Zeph executes privacy-adhering data transformations in real-time and scales to thousands of data sources, allowing it to support large-scale low-latency data stream analytics. Session Chairs: Dushyanth Narayanan, Microsoft Research, and Gala Yadgar, Technion–Israel Institute of Technology. Jinhyung Koo, Junsu Im, Jooyoung Song, and Juhyung Park, DGIST; Eunji Lee, Soongsil University; Bryan S. Kim, Syracuse University; Sungjin Lee, DGIST. Of the 26 submitted artifacts: 26 artifacts received the Artifacts Available badge (100%). Foreshadow was chosen as an IEEE Micro Top Pick. Ethereum is the second-largest blockchain platform next to Bitcoin. DeSearch uses trusted hardware to build a network of workers that execute a pipeline of small search engine tasks (crawl, index, aggregate, rank, query). Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. Nico Lehmann and Rose Kunkel, UC San Diego; Jordan Brown, Independent; Jean Yang, Akita Software; Niki Vazou, IMDEA Software Institute; Nadia Polikarpova, Deian Stefan, and Ranjit Jhala, UC San Diego. Session Chairs: Deniz Altınbüken, Google, and Rashmi Vinayak, Carnegie Mellon University. Tanvir Ahmed Khan and Ian Neal, University of Michigan; Gilles Pokam, Intel Corporation; Barzan Mozafari and Baris Kasikci, University of Michigan. USENIX Security '21 has three submission deadlines. Compared to existing baselines, DPF allows training more models under the same global privacy guarantee.
Fluffy found two new consensus bugs in the most popular Geth Ethereum client which were exploitable on the live Ethereum mainnet. See www.cs.cmu.edu/~mmv/Veloso.html for her scientific publications. Although the number of submissions is lower than in the past, it's likely only due to the late announcement; being on my first OSDI PC, I think the quality of the submitted and accepted papers remains as high as ever. VLDB 2021 venue: Tivoli Hotel & Congress Center, Arni Magnussons Gade 2, 1577 Copenhagen, Denmark, +45 3268 4300. In-person attendees can purchase tickets for the park / gardens with a 15% discount, which is a special offer by Tivoli Hotel & Congress Center to VLDB 2021 attendees. This is especially true for DPF over Rényi DP, a highly composable form of DP. SOSP: ACM Symposium on Operating Systems Principles. Hence, kernel developers are constantly refining synchronization within OS kernels to improve scalability at the risk of introducing subtle bugs. Radia Perlman is a Fellow at Dell Technologies. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. If you submit a paper to either of those venues, you may not also submit it to OSDI '21. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14-16, 2021. OSDI '21 Technical Sessions All the times listed below are in Pacific Daylight Time (PDT). We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory.
USENIX discourages program co-chairs from submitting papers to the conferences they organize, although they are allowed to do so. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. Existing decentralized systems like Steemit, OpenBazaar, and the growing number of blockchain apps provide alternatives to existing services. Copyright to the individual works is retained by the author[s]. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. HotNets provides a venue for discussing innovative ideas and for debating future research agendas in networking. Grand Rapids, Michigan, United States. At a high level, Addra follows a template in which callers and callees deposit and retrieve messages from private mailboxes hosted at an untrusted server. In contrast, CLP achieves significantly higher compression ratio than all commonly used compressors, yet delivers fast search performance that is comparable or even better than Elasticsearch and Splunk Enterprise. For example, optimistic concurrency control (OCC) is better than two-phase locking (2PL) under low contention, while the converse is true under high contention. Author Response Period Welcome to the 2021 USENIX Annual Technical Conference (ATC '21) submissions site! We also propose two file system techniques for ZNS+-aware LFS. OSDI is "a premier forum for discussing the design, implementation, and implications of systems software." A total of six research papers from the department were accepted to the . We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. PLDI is a premier forum for programming language research, broadly construed, including design, implementation, theory, applications, and performance.
We have implemented a prototype of our design based on Penglai, an open-sourced enclave system for RISC-V. Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. In addition, CLP outperforms Elasticsearch and Splunk Enterprise's log ingestion performance by over 13x, and we show CLP scales to petabytes of logs. PET discovers and applies program transformations that improve computation efficiency but only maintain partial functional equivalence.