CAREER: Enabling Futuristic Distributed Applications With Integrative Multistream Networking

January 8, 2015

Principal Investigator: Ketan Mayer-Patel
Funding Agency: National Science Foundation
Agency Number: ANI-0238260

Abstract

We live in a sea of information devices. Aside from traditional computers, our work and home environments are brimming with cell phones, laptops, personal digital assistants (PDA’s), digital cameras, camcorders, digital video recorders, webcams, security sensors, and myriad other information devices. Despite their steady increase in both number and sophistication, these devices still largely exist as islands of information. Communication between devices is limited to simple transfers and/or remote access. In our mind’s eye, however, we can easily see the potential for much more. Realizing the full potential of our information infrastructure requires building distributed applications in which the capabilities of these myriad devices are harnessed together. The Internet applications of today connect devices (i.e., connecting my browser with the CNN server). The Internet applications of the future will connect environments (i.e., all of the devices in my office to the devices in the CNN newsroom).

The networking challenges presented by these applications are fundamentally different from those of traditional client/server applications. Futuristic distributed applications are difficult to build because current networking infrastructures do not support their characteristic multistream architecture. In my career, I propose to address the networking challenges presented by futuristic distributed multiflow applications. I will focus on the problems that arise when managing multiple semantically related flows of data within an application. Exploiting high-level semantic relationships between flows is important for achieving application-level goals and appropriately allocating limited network resources. These problems are particularly interesting when these ‘peer’ flows have heterogeneous transport-level requirements (i.e., latency, reliability, response to packet loss, etc.).

My research will advance the field by defining and exploring a new dimension of networking in which interstream relationships are significant. Doing so not only enables the development of exciting innovative multistream distributed applications, but in a larger sense advances our understanding of the complex relationship between the structure and meaning of information. Ultimately, the greatest impact of my research will be to enable people to communicate more effectively. The proposed mechanisms will facilitate the development of a new generation of Internet applications in which myriad information devices within the environment are employed in a coordinated fashion. These applications will create seamless information spaces that will provide people with unprecedented access to complex information.
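The abstract stays at the level of architecture, but the core idea of allocating a shared network resource across semantically related peer flows with heterogeneous requirements can be made concrete with a small sketch. Everything below is a hypothetical illustration: the Flow record, its fields, and the proportional allocator are assumptions for exposition, not mechanisms from the proposal.

```python
# Hypothetical sketch of multistream coordination: peer flows declare
# heterogeneous requirements, and an application-level allocator divides a
# shared bandwidth budget by semantic priority. Illustrative only.
from dataclasses import dataclass

@dataclass
class Flow:
    name: str
    priority: float      # application-level importance relative to peer flows
    min_kbps: float      # floor below which the flow is useless
    max_kbps: float      # ceiling beyond which extra bandwidth adds no value
    loss_tolerant: bool  # e.g., video tolerates loss; control data does not

def allocate(flows, budget_kbps):
    """Grant every flow its floor, then split the surplus by priority."""
    floors = sum(f.min_kbps for f in flows)
    if floors > budget_kbps:
        raise ValueError("budget cannot satisfy the minimum requirements")
    grants = {f.name: f.min_kbps for f in flows}
    surplus = budget_kbps - floors
    total_priority = sum(f.priority for f in flows)
    for f in flows:
        share = surplus * f.priority / total_priority
        grants[f.name] = min(f.max_kbps, grants[f.name] + share)
    return grants  # a real allocator would also redistribute clipped surplus

flows = [
    Flow("camera-video", 3.0, min_kbps=300, max_kbps=4000, loss_tolerant=True),
    Flow("sensor-telemetry", 1.0, min_kbps=16, max_kbps=64, loss_tolerant=False),
    Flow("shared-whiteboard", 2.0, min_kbps=32, max_kbps=512, loss_tolerant=False),
]
print(allocate(flows, budget_kbps=1000))
```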

CI-TEAM Implementation Project; Collaborative Research: Cyber-Infrastructure for Engineering Informatics Education

January 8, 2015

Principal Investigator: Ming Lin
Funding Agency: National Science Foundation
Agency Number: OCI-0636208

Abstract
Complex engineering systems are now routinely designed and analyzed in-silico, with minimal reliance on expensive physical prototyping. Such systems are essentially massive data sets embedded within a highly intricate semantic structure. Enabling current and future generations of engineers to perform in-silico prototyping of systems (including design, visualization, and physics-based simulation) remains a formidable task. The problems are multi-disciplinary and the tools require substantial integrative knowledge to master. This CI-TEAM Implementation proposal is designed to build on the momentum of the team’s current CI-TEAM Demonstration project, the objective of which is the creation of a comprehensive, multi-disciplinary approach to engineering modeling. For the CI-TEAM Implementation, the team will use the important emerging engineering domain of biologically-inspired robotic systems as a focal point. Building on the CI-TEAM Demonstration with snake-inspired robots, the investigators will aim to create novel ways in which to train future generations of engineering and computer science students to build physically realized systems for important applications in medicine, civil engineering (e.g., inspection), search and rescue, and homeland security. Results from the CI-TEAM Demonstration include a program of education around the creation of comprehensive engineering models. These models include semantic descriptions of robotic components, behavioral and simulation software, software for snake robot control and navigation, as well as the tools needed to perform analysis, component surrogation, and mission assessment. The team has created a project portal, a “Source Forge” focused on building shared engineering models for the domain of bio-inspired robots. As the project concludes over the coming months, the PIs will execute a coordinated set of multi-disciplinary courses concurrently taught across the partner institutions. The focus of the CI-TEAM Implementation effort will be to transition this demonstration beyond the team members and their direct collaborators. Specifically, we propose to (1) advance a general educational program in Engineering Informatics and (2) deploy a repository that contains components and tools needed to advance the state of the art in snake-inspired and bio-inspired robotic systems. This repository will be made available over the Internet and provided for use by educators and researchers around the country and the world. A major challenge for the team will be ensuring the self-sustainability of the work beyond the conclusion of the CI-TEAM Implementation program.

Intellectual Merit: The proposed CI-Team is an inter-disciplinary group from four universities consisting of computer scientists and engineers with the complementary expertise needed to create both the shared model and the educational deliverables. In doing so, the CI-Team will create a lasting piece of Cyber-Infrastructure. The technical work spans and integrates disciplines: semantics, engineering model representations, and computational tools. The efforts of this CI-Team will show how to more deeply connect different sub-fields of engineering and computer science. In addition, the resulting model will enable the rapid creation and simulation of new bio-inspired robot designs. While the targeted domain is bio-robotics, the proposed modeling effort also applies to other complex engineered artifacts such as power plants, aircraft, MEMS, and other emerging technologies in which design and simulation are important.

Broader Impact: The proposal contributes to the transformation of engineering into an “informatics” discipline and broadens the interface between computer science and engineering. Ultimately the CI-Team aims to stimulate the creation of a new engineering knowledge industry, leading to productivity gains similar to those that resulted from the introduction of CAD systems over the last twenty years. The “science of integration” that can be enabled by improving our collective ability to create shared models will be instrumental in developing the next generation of undergraduate and graduate curricula in engineering and computer science.

Education and Training Impact: Recent reports from NSF, NAE, and DoD have all lamented the fact that the “inter-disciplinary engineers” so desperately needed simply do not exist in adequate numbers to fulfill current needs. There are not enough educational initiatives and programs to produce these new engineers, nor have the standard, stove-piped curricula of engineering and computer science departments adapted to this need. This CI-Team, representing a combination of expertise not found at any one university, is uniquely positioned to address this national challenge. The team plans to creatively leverage the CI-TEAM project to inter-connect existing education and training programs on our respective campuses. The team plans an ambitious use of tele-collaboration, eLearning, and distance education technologies to propel this project and generate initial Cyber-Infrastructure content. The ultimate goal is to enable students to begin a curriculum of “Engineering Informatics” and define the content of the new Cyber-Engineering discipline that unites computer and information sciences with traditional engineering domains.

Future Analyst Workspace (A-Desk)

January 8, 2015

Principal Investigators: Henry Fuchs, Greg Welch
Funding Agency: US Air Force Office of Scientific Research
Agency Number: FA8750-08-2-0209

Abstract
UNC will develop a functional mockup of IARPA’s (Intelligence Advanced Research Projects Activity) vision for an intelligence analyst’s future workspace (A-Desk). The mockup will resemble a single-person, surround cubicle, where some portion of the interior is covered with seamless panoramic imagery of an example analyst’s application. We will mock up the analyst’s application, developing an interactive 3D application that matches the IARPA vision, with software “stubs” that can be used later to increase the functionality of the mockup. We will prototype the A-Desk workspace with the analyst application at UNC, and then deliver the system to IARPA for their use in demonstrations to analysts and other interested parties.

3D Worlds for Location based Warfighter Assistance

January 8, 2015

Principal Investigator: Jan-Michael Frahm
Funding Agency: US Defense Advanced Research Projects Agency
Agency Number: TS00003

Abstract
To summarize the scene information collected in a large number of photos or videos from body-worn cameras, we propose to use computer vision techniques to reconstruct a 3D model from these photos or videos. These models naturally combine the scene information captured in the photos in a form that is easy for humans to understand and can be used to visualize the scene as part of an aerial model. Our team has significant expertise in fast 3D reconstruction from videos and photo collections. From the DARPA UrbanScape project and the DTO VACE project “3D Context from Video” we are equipped with a fully functional system for 3D reconstruction from video. The new technical challenges here are to adapt our reconstruction techniques to handle illumination changes in the scene and the generally less constrained motion of body-worn cameras.
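As a rough illustration of the general technique (not the UrbanScape/VACE system described above), the two-view core of such a pipeline can be sketched with standard OpenCV calls: match features, estimate the essential matrix, recover relative pose, and triangulate. A full system adds many views, bundle adjustment, and dense reconstruction; the function names below are real OpenCV APIs, but the pipeline is a textbook simplification.

```python
# Minimal two-view 3D reconstruction sketch using OpenCV -- an illustration
# of the general technique only. Assumes two overlapping grayscale frames
# and a known 3x3 camera intrinsic matrix K.
import cv2
import numpy as np

def two_view_points(img1, img2, K):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)

    # Ratio-test matching keeps only distinctive correspondences.
    matcher = cv2.BFMatcher()
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2)
            if m.distance < 0.75 * n.distance]
    p1 = np.float32([k1[m.queryIdx].pt for m in good])
    p2 = np.float32([k2[m.trainIdx].pt for m in good])

    # Essential matrix with RANSAC rejects mismatches; then recover pose.
    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, mask = cv2.recoverPose(E, p1, p2, K, mask=mask)

    # Triangulate the inlier correspondences into 3D points.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    inliers = mask.ravel() > 0
    X = cv2.triangulatePoints(P1, P2, p1[inliers].T, p2[inliers].T)
    return (X[:3] / X[3]).T  # Nx3 points in the first camera's frame
```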

CSR–EHS Real-time Computing on Multicore Platforms

January 8, 2015

Principal Investigator: James H. Anderson
Funding Agency: National Science Foundation
Agency Number: CNS-0615197

Abstract
Thermal and power problems impose limits on the performance that single-processor chips can deliver. Multicore architectures (or chip multiprocessors), which include several processors on a single chip, are being widely touted as a way to circumvent this impediment. Several chip manufacturers have released dual-core chips. Such chips include Intel’s Pentium D and Pentium Extreme Edition, IBM’s PowerPC, AMD’s Opteron, and Sun’s UltraSPARC IV. A few designs with more than two cores have also been announced. For instance, Sun expects to ship its eight-core Niagara chip by early 2006. In addition, IBM recently introduced the Cell processor, which includes, on the same chip, a dual-core PowerPC plus eight “synergistic processing elements” that are optimized for single- and double-precision mathematical calculations. Intel is expected to release four-, eight-, 16-, and perhaps even 32-core chips within a decade. Special-purpose multicore systems, such as network processors, have also been available for several years now.

For software designs to take advantage of the parallelism available in these systems, careful attention must be paid to resource-allocation issues. For throughput-oriented applications, some initial work on such issues has been done. However, almost no such work has targeted real-time applications, which require very different resource-allocation methods, as they need performance guarantees. In this project, an approach will be investigated for synthesizing real-time applications on multicore systems. Both hard real-time applications, in which deadlines can never be missed, and soft real-time applications, in which some deadline misses are tolerable, will be considered. Examples of the former include control and tracking systems, and examples of the latter include multimedia and gaming systems.

In multicore systems, care must be taken when scheduling and synchronizing tasks in order to avoid thrashing shared on-chip caches. In real-time systems, of course, real-time constraints must be ensured as well. The main objective of this project is to develop an allocation framework that addresses both concerns. Our main thesis is that such a framework should be based upon global real-time scheduling algorithms. Such algorithms are more flexible than the alternative, partitioning approaches. This flexibility yields two advantages. First, global algorithms are better able to use information about cache behavior to influence co-scheduling choices. Second, such algorithms (at least, those considered in this project) are immune from the bin-packing-like problems that plague partitioning approaches. In real-time systems, such problems can result in the need to place restrictive caps on overall utilization, wasting resources. Preliminary research suggests that the proposed framework, when fully deployed, will be flexible in its ability to ensure timing constraints, while encouraging low miss rates in shared caches. However, full deployment will require further work on several topics. The proposed research agenda includes research on these topics and an associated experimental evaluation. The proposed evaluation includes experiments with synthetically-generated real-time workloads on a multicore simulator, and experiments on an actual multicore platform involving multimedia workloads and also a human-tracking system used in immersive virtual environments.

Broader impacts.

With the ongoing shift to multicore technologies, this project could have a significant, far-reaching impact: in the future, multicore platforms will be the “standard” computing platform in many settings, and real-time applications, many quite complex, will be deployed on them. Multicore platforms differ significantly from the kinds of platforms considered previously in work on real-time systems. This project, if funded, would be the first attempt within the real-time-systems research community to acknowledge these differences and to attempt to deal with them. The most direct impact of the proposed research on industry will likely be with respect to the design of operating-system components for future multicore products. Researchers at Intel, in particular, have expressed interest in our work, as evidenced by the attached supporting letter. We are hopeful that this interest will lead to collaborative efforts, summer internship possibilities for our students, etc. We also expect such interactions to lead to new collaborative research directions and to impact the content of several of our courses. Our research group has a good track record in involving graduate students from underrepresented groups, having graduated one female Ph.D. student last year, with a second expected this year. We expect this trend to continue. Public outreach will be accomplished by including the test-case systems developed in this project in our department’s long-running demo program. Any software of general utility that we produce will be made publicly available on the web.
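The bin-packing problem mentioned in the abstract can be made concrete with a toy example (illustrative, not from the proposal): three tasks of utilization 0.6 each total only 1.8, which fits within two cores’ capacity, yet no partitioning of the tasks onto the cores succeeds, because each core is capped at utilization 1.0.

```python
# Illustration of the bin-packing problem that plagues partitioned
# scheduling. Three tasks of utilization 0.6 total 1.8 <= 2 cores, yet no
# partition fits; by contrast, optimal global algorithms (e.g., Pfair
# variants) can schedule any implicit-deadline periodic task set whose
# total utilization is at most the number of cores.
def first_fit_partition(utilizations, cores):
    loads = [0.0] * cores
    for u in utilizations:
        for i in range(cores):
            if loads[i] + u <= 1.0:   # per-core utilization bound
                loads[i] += u
                break
        else:
            return None               # no core can host this task
    return loads

tasks = [0.6, 0.6, 0.6]                       # total utilization 1.8
print(first_fit_partition(tasks, cores=2))    # None: partitioning fails
print(sum(tasks) <= 2)                        # True: globally feasible
```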

Real-time Computing on Multicore Platforms

January 8, 2015

Principal Investigator: James Anderson
Funding Agency: U.S. Army Research Office
Agency Number: W911NF-06-1-0425

Abstract
Thermal and power problems impose limits on the performance that chips with a
single processing unit can deliver. Multicore architectures (or
chip multiprocessors), which include multiple processing units on a single
chip, are being widely touted as a way to circumvent this impediment. Several
chip manufacturers have released, or will soon release, dual-core chips. Such
chips include Intel’s Pentium D and Pentium Extreme Edition, IBM’s PowerPC,
AMD’s Opteron, and Sun’s UltraSPARC IV. A few designs with more than two cores
have also been announced. For instance, Sun expects to ship its eight-core
Niagara chip by early 2006. In addition, IBM recently introduced the Cell
processor, which includes, on the same chip, a dual-core PowerPC plus eight
“synergistic processing elements” that are optimized for single- and
double-precision mathematical calculations. Intel is expected to release four-,
eight-, 16-, and perhaps even 32-core chips within a decade. Special-purpose
multicore systems, such as network processors, have also been available for
several years now.

For software designs to take advantage of the parallelism available in these
systems, careful attention must be paid to resource-allocation issues. For
throughput-oriented applications, some initial work on resource-allocation
tradeoffs has been done. However, no such work has targeted
real-time applications, which require very different scheduling
methods, as they need performance guarantees. In this project, an approach will
be investigated for synthesizing real-time applications on multicore systems.
Both hard real-time applications, in which deadlines can never be missed, and
soft real-time applications, in which some deadline misses are tolerable, will
be considered. Examples of the former include control and tracking systems,
and examples of the latter include multimedia and gaming systems.

In multicore systems, care must be taken when allocating tasks in order to avoid
thrashing shared on-chip caches. In real-time systems, of course, there is an
additional objective of meeting real-time constraints. The main objective of
this project is to develop scheduling and allocation schemes that address both
concerns. This objective is made more difficult by the fact that multicore
platforms are multiprocessors. The allocation framework that is
proposed to meet this objective is based upon a novel cache-cognizant real-time
multiprocessor scheduling algorithm that is near-optimal in its ability to
schedule real-time workloads. This algorithm is superior to other known methods
in its ability to ensure timing constraints, and its use on multicore platforms
can result in substantially better performance than other approaches. However,
the development of a complete resource-allocation and scheduling framework will
require further work on several topics. The proposed research agenda includes
research on these topics and an associated experimental evaluation. The
proposed evaluation includes experiments with synthetically-generated real-time
workloads on a multicore simulator, and experiments on an actual multicore
platform involving multimedia and data-fusion applications of interest to the
Army.
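The cache-cognizant algorithm itself is not specified in this abstract. As a baseline illustration of what global multiprocessor scheduling means, the following sketch simulates plain global EDF (earliest deadline first across all cores) in unit time quanta. It is a simplified stand-in, not the proposed algorithm; its deadline misses at high utilization are exactly the gap that near-optimal algorithms aim to close.

```python
# Sketch of plain global EDF on m cores (a baseline, not the proposed
# cache-cognizant algorithm). Tasks are (period, wcet) pairs with implicit
# deadlines; time advances in unit quanta; jobs that miss are dropped.
def simulate_gedf(tasks, cores, horizon):
    jobs = []     # each job: [absolute_deadline, remaining_work, task_id]
    misses = 0
    for t in range(horizon):
        for i, (period, wcet) in enumerate(tasks):
            if t % period == 0:
                jobs.append([t + period, wcet, i])   # release a new job
        jobs.sort()                                  # earliest deadline first
        for job in jobs[:cores]:                     # run up to m jobs
            job[1] -= 1
        misses += sum(1 for d, rem, _ in jobs if rem > 0 and d <= t + 1)
        jobs = [j for j in jobs if j[1] > 0 and j[0] > t + 1]
    return misses

# Three tasks of utilization 0.6 on two cores: total utilization 1.8 <= 2,
# yet plain global EDF still misses one deadline every 5 quanta here.
print(simulate_gedf([(5, 3), (5, 3), (5, 3)], cores=2, horizon=100))   # 20
# A lightly loaded task set runs without misses.
print(simulate_gedf([(4, 1), (5, 1), (10, 2)], cores=2, horizon=100))  # 0
```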

Relevance to ARO’s objectives.

With the ongoing shift to multicore technologies, future real-time workloads
will likely be deployed on multiprocessor platforms that differ
significantly from the kinds of platforms considered in prior work. This
project, if funded, would be the first attempt within the real-time-systems
research community to acknowledge these differences and to attempt to deal with
them. This technology shift obviously will affect the Army as well: in the
future, multicore platforms will be the “standard” computing
platform in many settings, and real-time applications of relevance to the Army
will be deployed on them. In addition, the ability to process
multimedia data content and to support data-fusion functions in real time in
battlefield scenarios is an important objective in supporting battlefield
situational awareness.

National Alliance for Medical Image Computing (NAMIC) Core 1: Structural Analysis of Anatomical Shapes and of White Matter

January 8, 2015

Principal Investigator: Guido Gerig
Funding Agency: Brigham and Women’s Hospital
Agency Number: 149881

Abstract
In-vivo imaging studies of brain structures provide valuable information about the nature of neuropsychiatric disorders, including neurodegenerative diseases and disorders of abnormal neurodevelopment. Imaging can depict functional and morphologic information and has become an important component in detecting normal biological variability and change from normal. Image acquisition, in particular methods based on MRI, shows steady progress with respect to spatial resolution, contrast-to-noise ratio, high-speed imaging, and versatility of scanning sequences measuring a variety of properties of tissue and function. To keep pace with advances in imaging and new needs of clinical research, image analysis research has to develop effective processing tools suitable for exploratory studies and for testing clinical hypotheses. Advanced image analysis and statistical analysis methods, if developed on a sound mathematical foundation and on the principle of producing generic methodology integrated into a common platform, will not only be applicable to human neuroimaging research as proposed here but will also be relevant to various other imaging domains, such as animal imaging studies, confocal microscopy imaging, and even the analysis of microarrays. The objective of this project is to develop, test, and validate novel image analysis methodology driven by challenging clinical neuroimaging problems. In particular, this project focuses on the representation and analysis of anatomical shapes and on the structural analysis of white matter fiber tracts. Specific neuroscience analysis problems related to schizophrenia research will closely drive our research and development.
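As a small, purely illustrative example of quantitative shape analysis (a standard Procrustes alignment, not the specific methodology this project proposes), two anatomical shapes given as corresponding 3D landmark points can be compared after translation, scale, and rotation are factored out:

```python
# Illustrative shape comparison via Procrustes alignment -- a standard
# technique, not the project's method. The disparity that remains after
# removing pose and scale measures pure shape difference.
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(0)
shape_a = rng.normal(size=(20, 3))             # 20 landmarks of structure A
rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
shape_b = 2.5 * shape_a @ rotation + 1.0       # same shape, different pose/scale
shape_b += rng.normal(scale=0.01, size=shape_b.shape)  # slight deformation

_, _, disparity = procrustes(shape_a, shape_b)
print(f"shape disparity after alignment: {disparity:.6f}")  # near zero
```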

MolProbity Service and Related 3D-Analysis Resources

January 8, 2015

Principal Investigator: Jack Snoeyink
Funding Agency: National Institutes of Health (via Duke University)
Agency Number: 133612

Abstract
This project consists of extension, modernization, and increased interoperability for the MolProbity web service and the related suite of software for analyzing 3D macromolecular structures, to further improve its user-friendliness, generality, robustness, speed, and future maintainability for the benefit of both the many current users and the even broader community of biomedical researchers who now wish to employ it. This system has two central aspects that are so far unique. One is the kinemage graphics concept, which separates a human-editable text file of the hierarchical display list from a content-independent 3D graphics program (Mage or KiNG) that displays the kinemage with priority on smooth interactive performance, open-ended explorability, and optimized perception of 3D relationships. The second unique aspect is all-atom contact analysis, which uses the detailed sterics of all optimized hydrogens to characterize inter- and intra-molecular contacts and to diagnose and correct most errors in experimental structure models.

This system has been in wide use since 1992, growing through major enhancements. It is open-source and cross-platform (Mac/Windows/Linux/Unix/web browser), supporting structural biology, biochemistry, bioinformatics, educational, and even non-molecular uses. MolProbity had 11,500 serious working sessions this year, and the Protein Data Bank site now uses our validation and graphics tools. This project would allow the reorganization and documentation that cannot be justified as part of a laboratory research effort but which is essential to long-term viability and growth as an open resource. Support would be added for CIF-format files, nucleic acids, NMR ensembles, and structure comparisons. Modularity, code re-use, object-oriented organization, and standard web-design practice would be improved. The interface and the underlying programs of the MolProbity service would be thoroughly reworked to enhance both turnkey simplicity for study of a single protein and powerful flexibility for expert structural-biology clients. Scripted command-line use, precalculated material, and software web access would be enhanced to serve a new group of prediction and bioinformatics users analyzing thousands of files. User feedback, collaborators, and our own testing in production mode will guide this process. This project is particularly relevant to public health in its goals of helping substantially to speed the production, and improve the accuracy, of molecular models for the development of chemotherapeutics.
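The steric idea behind all-atom contact analysis can be sketched in a few lines: flag atom pairs whose van der Waals spheres overlap by more than a tolerance (MolProbity counts overlaps of at least 0.4 Å as clashes). The sketch below is a toy version; the real analysis works on optimized hydrogens and excludes bonded pairs, and the radii here are only approximate.

```python
# Toy sketch of steric clash detection, the idea underlying all-atom
# contact analysis. Not MolProbity's code: real analysis adds optimized
# hydrogens, bonding exclusions, and careful chemistry.
import numpy as np

VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "H": 1.20}  # Angstroms, approx.

def clashes(elements, coords, overlap_cutoff=0.4):
    """Return (i, j, overlap) for pairs whose vdW overlap exceeds the cutoff."""
    coords = np.asarray(coords)
    bad = []
    for i in range(len(elements)):
        for j in range(i + 1, len(elements)):
            dist = np.linalg.norm(coords[i] - coords[j])
            touch = VDW_RADII[elements[i]] + VDW_RADII[elements[j]]
            if touch - dist > overlap_cutoff:
                bad.append((i, j, touch - dist))
    return bad

atoms = ["C", "O", "H"]
xyz = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(clashes(atoms, xyz))   # the C--O pair overlaps by about 2.0 A here
```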

New Frameworks for Detecting and Minimizing Information Leakage in Anonymized Network Data

January 8, 2015

Principal Investigators: Michael Reiter and Fabian Monrose
Funding Agency: Johns Hopkins University
Agency Number: 2000457356

Abstract
The availability of realistic network data plays a significant role in fostering collaboration and ensuring U.S. technical leadership in network security research. Unfortunately, a host of technical, legal, policy, and privacy issues limit the ability of operators to produce datasets for information security testing. In an effort to help overcome these limitations, the Department of Homeland Security (DHS) has endeavored to create a national repository of network traces under the Protected Repository for the Defense of Infrastructure against Cyber Threats (PREDICT) program. A key technique used in this program to assure low-risk, high-value data is that of trace anonymization—a process of sanitizing data before release so that information of concern cannot be extracted. Indeed, many believe that proven anonymization techniques are the missing link that will enable cyber security researchers to tap real Internet traffic and develop effective solutions tailored to current risks.

Recently, however, the utility of these techniques in protecting host identities, network topologies, and network security practices within enterprise networks has come under scrutiny. Much of our own work, for example, has shown that deanonymization of public servers [2], recovery of network structure, and identification of browsing habits [1] may not be as difficult as first thought. Given the significant reliance on anonymized network traces for security research, we argue that a more exhaustive and principled analysis of the trace anonymization problem is in order.

The naive solution to this problem (i.e., new anonymization techniques that directly address the specifics of these attacks) fails to address in its entirety the underlying dynamic of dataset publication—the trade-off between dataset quality and privacy. While important, such isolated advances will simply shift the information-encoding burden to other properties of the traces, resulting in future breaches. To truly address this problem, we argue that what is needed is a framework for evaluating the risk in anonymization techniques and datasets. Based on novel information-theoretic concepts, we propose techniques that will allow the network or security practitioner to evaluate the risk inherent in their choice of policy and anonymization technique, and ways of minimizing the risks of deanonymization. We propose to implement and evaluate this framework in the context of traces, techniques, and policies from our own networks as well as those offered as part of the PREDICT project. In addition, we propose to investigate the public policy implications of this work, particularly those pertaining to types of networking research that rely on trace attributes that cannot be effectively anonymized.
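As a toy example of the kind of information-theoretic risk measure the proposal argues for (illustrative only, not the proposed framework): if a feature that anonymization leaves intact, such as a packet-size fingerprint, partitions hosts into groups, then the entropy of that partition equals the number of identity bits the feature leaks, assuming deterministic fingerprints and a uniform prior over hosts.

```python
# Toy information-theoretic leakage measure, not the authors' framework.
# With hosts equally likely and one fixed fingerprint per host, the entropy
# of the fingerprint distribution equals the identity bits it leaks.
import math
from collections import Counter

def bits_leaked(fingerprints):
    """Shannon entropy (bits) of the partition induced by a visible feature."""
    counts = Counter(fingerprints).values()
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts)

# 8 hosts: a perfectly blinding feature vs. one that isolates most hosts.
uniform = ["a"] * 8                                  # all hosts look alike
leaky = ["a", "b", "c", "d", "e", "f", "g", "g"]     # mostly unique groups
print(bits_leaked(uniform))  # 0.0  -- feature reveals nothing
print(bits_leaked(leaky))    # 2.75 -- of the log2(8) = 3 bits of identity
```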

CIFellow: Scott Coull (Mentor: Michael Reiter)

January 8, 2015

Principal Investigator: Michael Reiter
Funding Agency: Computing Research Association
Agency Number: CIF-51

Abstract
A professor plays several key roles in the academic community, including those of researcher, teacher, mentor, and project manager. Thus, in order for a student to successfully transition into the position of professor, he must gain proficiency in each of these roles. With that in mind, I would like to take the opportunity offered by the Computing Innovation Fellows program to gain the experience necessary to become a productive member of the academic community. By exploring problems in the areas of network data anonymization and adversarial machine learning, I hope to develop solutions that help direct future research efforts in those areas and provide practical methods for addressing immediate problems. Additionally, I would like to hone my skills as an educator by teaching a graduate-level course in applied cryptography. Aside from research and teaching, I would also like to gain experience as a mentor and manager by supervising graduate students on a number of smaller research projects. Finally, I hope to participate in several other professional development activities, such as serving on program committees and grant writing, which will help introduce me to aspects of academic life that I was not exposed to as a graduate student.

The primary focus of my fellowship would be in addressing two research problems in the field of information security. The first of these research projects seeks to develop the theoretical underpinnings of network data anonymization and explore its connections to the well-established field of microdata (e.g., census data) anonymization. The second research project deals with problems that arise when applying machine learning methods to security problems in adversarial environments. Beyond the two research projects, I will also be continuing work in developing practical oblivious databases that simultaneously protect user privacy and preserve the functionality of traditional databases.