Skip to main content

Following is a complete list of doctoral graduates of the Department of Computer Science, with their dissertation titles. Graduates of other departments or schools, whose primary adviser was a member of the Department of Computer Science, are also listed.


Advisors names are given in parentheses. Advisor affiliations are given if the advisor was not on the UNC Department of Computer Science faculty.


Clicking on the author’s name will take you to a separate page containing dissertation abstracts, where available.


UNC Libraries: Except where noted, all of the dissertations listed here are available from the libraries at UNC-Chapel Hill. Those from the most recent year will not immediately be available, however.

  • ProQuest database
  • Microfiche copies: Microforms Section of Davis Library (basement)
  • Bound copies: Kenan Science Library Annex
  • Non-circulating copies: North Carolina Collection, Wilson Library


M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z

Ph.D. Graduates


Maimone, Andrew (2015)

“Computational See-Through Near-Eye Displays”
Under the direction of Henry Fuchs
Electronic copy available

See-through near-eye displays with the form factor and field of view of eyeglasses are a natural choice for augmented reality systems: the non-encumbering size enables casual and extended use and large field of view enables general-purpose spatially registered applications. However, designing displays with these attributes is currently an open problem. Support for enhanced realism through mutual occlusion and the focal depth cues is also not found in eyeglasses-like displays.

This dissertation provides a new strategy for eyeglasses-like displays that follows the principles of computational displays, devices that rely on software as a fundamental part of image formation. Such devices allow more hardware simplicity and flexibility, showing greater promise of meeting form factor and field-of-view goals while enhancing realism. This computational approach is realized in two novel and complementary see-through near-eye display designs. The first subtractive approach filters omnidirectional light through a set of optimized patterns displayed on a stack of spatial light modulators, reproducing a light field corresponding to in-focus imagery. The design is thin and scales to wide fields of view; see-through operation is achieved with transparent components placed directly in front of the eye. Preliminary support for focal cues and environment occlusion is also demonstrated. The second additive approach uses structured point light illumination to form an image with a near minimal set of rays. Each of an array of defocused point light sources is modulated by a region of a spatial light modulator, essentially encoding an image in the focal blur. See-through operation is also achieved with transparent components, and thin form factors and wide fields of view >100 degrees are demonstrated.

The designs are examined in theoretical terms, in simulation, and through prototype hardware with public demonstrations. This analysis shows that the proposed computational near-eye display designs offer a significantly different set of trade-offs than conventional optical designs. Several challenges remain to make the designs practical, most notably addressing diffraction limits.

Majumder, Aditi (2003)

“A Practical Framework to Achieve Perceptually Seamless Multi-Projector Displays”
Under the direction of Gregory Welch and Rick Stevens
Electronic copy available

Arguably the most vexing problem remaining for planar multi-projector displays is that of color seamlessness between and within projectors. While researchers have explored approaches that strive for strict color uniformity, this goal typically results in severely compressed dynamic range and generally poor image quality. In this dissertation, I introduce the emineoptic function that models the color variations in multi-projector displays. I also introduce a general goal of color seamlessness that seeks to balance perceptual uniformity and display quality. These two provide a comprehensive generalized framework to study and solve for color variation in multi-projector displays. For current displays, usually built with same model projectors, the variation in chrominance (hue) is significantly less than in luminance (brightness). Further, humans are at least an order of magnitude more sensitive to variations in luminance than in chrominance. So, using this framework of the emineoptic function I develop a new approach to solve the restricted problem of luminance variation across multi projector displays. My approach reconstructs the emineoptic function efficiently and modifies it based on a perception-driven goal for luminance seamlessness. Finally I use the graphics hardware to reproject the modified function at interactive rates by manipulating only the projector inputs. This method has been successfully demonstrated on three different displays made of 5 x 3 array of fifteen projectors, 3 x 2 array of six projectors and 2 x 2 array of four projectors at the Argonne National Laboratory. My approach is efficient, accurate, automatic and scalable, requiring only a digital camera and a photometer. To the best of my knowledge, this is the first approach and system that addresses the luminance problem in such a comprehensive fashion and generates truly seamless displays with high dynamic range.

Mark, William R. (1999)

“Post-Rendering 3D Image Warping: Visibility, Reconstruction, and Performance for Depth-Image Warping”
Under the direction of Gary Bishop
Electronic copy available

The images generated by real-time 3D graphics systems exhibit enormous frame-to-frame coherence, which is not exploited by the conventional graphics pipeline. I exploit this coherence by decoupling image rendering from image display. My system renders every Nth frame in the conventional manner, and generates the in-between frames with an image warper. The image warper modifies a rendered image so that it is approximately correct for a new viewpoint and view direction. My image warper uses McMillan’s 3D image warp. Unlike perspective image warps, the 3D image warp can correct for changes in viewpoint, even for objects at different depths. As a result, my system does not require the application programmer to segment the scene into different depth layers, as is required by systems that use a perspective image warp. I attack three major challenges associated with using 3D warping for rendering acceleration: visibility, reconstruction, and performance. I describe how to choose pairs of rendered images so that most of the needed portions of the scene are visible in these images. I describe a framework for the 3D warp reconstruction problem, and develop reconstruction algorithms that produce good quality images. Finally, I describe properties of the 3D warp that could be used to build efficient 3D image warpers in hardware. My technique can also compensate for rendering-system latency and network latency. I have built a real-time system that demonstrates this capability by displaying rendered images at a remote location, with low latency.

Maascarenhas, Ajith A. (2006)

“Time-varying Reeb Graphs: A Topological Framework Supporting the Analysis of Continuous Time-varying Data”
Under the direction of Jack Snoeyink
Electronic copy available

I present time-varying Reeb graphs as a topological framework to support the analysis of continuous time-varying data. Such data is captured in many studies, including computational fluid dynamics, oceanography, medical imaging, and climate modeling, by measuring physical processes over time, or by modeling and simulating them on a computer. Analysis tools are applied to these data sets by scientists and engineers who seek to understand the underlying physical processes. A popular tool for analyzing scientific datasets is level sets, which are the points in space with a fixed data value s. Displaying level sets allows the user to study their geometry, their topological features such as connected components, handles, and voids, and to study the evolution of these features for varying s. For static data, the Reeb graph encodes the evolution of topological features and compactly represents topological information of all level sets. The Reeb graph essentially contracts each level set component to a point. It can be computed efficiently, and it has several uses: as a succinct summary of the data, as an interface to select meaningful level sets, as a data structure to accelerate level set extraction, and as a guide to remove noise. I extend these uses of Reeb graphs to time-varying data. I characterize the changes to Reeb graphs over time, and develop an algorithm that can maintain a Reeb graph data structure by tracking these changes over time. I store this sequence of Reeb graphs compactly, and call it a time-varying Reeb graph. I augment the time-varying Reeb graph with information that records the topology of level sets of all level values at all times, that maintains the correspondence of level set components over time, and that accelerates the extraction of level sets for a chosen level value and time. Scientific data sampled in space-time must be extended everywhere in this domain using an interpolant. A poor choice of interpolant can create degeneracies that are di±cult to resolve, making construction of time-varying eeb graphs impractical. I investigate piecewise-linear, piecewise-trilinear, and piecewise-prismatic interpolants, and conclude that piecewise-prismatic is the best choice for computing time-varying Reeb graphs. Large Reeb graphs must be simplified for an effective presentation in a visualization system. I extend an algorithm for simplifying static Reeb graphs to compute simplifications of time-varying Reeb graphs as a first step towards building a visualization system to support the analysis of time-varying data.

McAllister, David F. (1972)

“Algorithms for Chebychev Approximation Over Finite Sets”
Under the direction of Stephen M. Pizer

Several new algorithms and modifications to existing algorithms for linear and rational Chebychev approximations on finite point sets are developed theoretically and compared numerically. All algorithms considered employ the solution of a linear program or sequences of linear programs in their search for a best approximation. In the rational case, root finding methods were found to be superior to any existing algorithm when the denominator was bounded below by a positive constant. The Appendix discusses a method of computing min ||Az|| ||z|| =1 , where the norm is the Chebychev norm, by minimizing the norm of the pseudo-inverse of A.

McAllister, David K. (2002)

“A Generalized Surface Appearance Representation for Computer Graphics”
Under the direction of Anselmo A. Lastra
Electronic copy available

For image synthesis in computer graphics, two major approaches for representing a surface’s appearance are texture mapping, which provides spatial detail, such as wallpaper, or wood grain; and the 4D bi-directional reflectance distribution function (BRDF) which provides angular detail, telling how light reflects off surfaces. I combine these two modes of variation to form the 6D spatial bi-directional reflectance distribution function (SBRDF). My compact SBRDF representation simply stores BRDF coefficients at each pixel of a map. I propose SBRDFs as a surface appearance representation for computer graphics and present a complete system for their use. I acquire SBRDFs of real surfaces using a device that simultaneously measures the BRDF of every point on a material. The system has the novel ability to measure anisotropy (direction of threads, scratches, or grain) uniquely at each surface point. I fit BRDF parameters using an efficient nonlinear optimization approach specific to BRDFs. SBRDFs can be rendered using graphics hardware. My approach yields significantly more detailed, general surface appearance than existing techniques for a competitive rendering cost. I also propose an SBRDF rendering method for global illumination using prefiltered environment maps. This improves on existing prefiltered environment map techniques by decoupling the BRDF from the environment maps, so a single set of maps may be used to illuminate the unique BRDFs at each surface point. I demonstrate my results using measured surfaces including gilded wallpaper, plant leaves, upholstery fabrics, wrinkled gift-wrapping paper and glossy book covers.

McAnulty, Michael A. (1973)

“Computer-Aided Processing of Coronary Arterial Tree Cinearteriograms”
Under the direction of Donald F. Stanat

A coronary cinearteriogram is an x-ray motion picture of a beating heart whose coronary arteries are made selectively visible with radio-opaque dye. A careful numerical tracking of the arterial tree through time and space enables the calculation of dynamic information which is clinically useful and only sketchily available by other means. Digitizing trees by hand is possible, but tedious and error-prone. Computer-aided collection of such data can both speed up the tree definition process and make the data more readily available to subsequent numerical reduction. A program is described which drives an automatic film scanner viewing the cinearteriogram, performs basic book-keeping and tracking functions, and aids in the primary detection of the trees, taking advantage of the ability with motion pictures to predict the position of a feature on one frame from that on the previous frame. Although operator interaction with the system is necessary to correct erroneous processing, the system is competitive with a hand-entry system with respect to both speed and accuracy.

McInroy, John W. (1978)

“A Concept-Vector Representation of the Paragraphs in a Document, Applied to Automatic Extracting”
Under the direction of Stephen F. Weiss

I report here on a study of several related problems in automatic extracting. First, I present (a) a new method of automatic extracting, which uses concept vectors, along with (b) a procedure for evaluating extracts, and (c) results of experiments in which I use the evaluation procedure to compare extracts produced employing the concept-vector method with human-produced abstracts and with other automatically produced extracts. Concept-vector extracts perform better in information retrieval tests in the SMART system than a competing automatic method, but these differences are not statistically significant. Concept-vector extracts do not perform so well as human-produced abstracts, but do perform better than extracts taken at random from documents. These differences are significant according to some of the statistical tests used. Second, I examine the information-retrieval performance of extracts of various sizes, using the same evaluation procedure. Performance drops off gradually at first with decreasing size, then more and more sharply as extracts become smaller than a certain size. Third, I attempt to compute directly from the document and the extract a statistic that will predict performance in the evaluation procedure. The correlations between this statistic and actual performance are significant only in cases where the performance of extracts is very poor compared with the performance of full texts.

McKenzie Jr., Leslie E. (1988)

“An Algebraic Language for Query and Update of Temporal Databases”
Under the direction of Richard T. Snodgrass
Department of Computer Science Technical Report: TR88-050T
Electronic copy available

Although time is a property of events and objects in the real world, conventional relational database management systems (RDBMS’s) can’t model the evolution of either the objects being modeled or the database itself. Relational databases can be viewed as snapshotdatabases in that they record only the current database state, which represents the state of the enterprise being modeled at some particular time. We extend the relational algebra to support two orthogonal aspects of time: valid time, which concerns the modeling of time-varying reality, and transaction time, which concerns the recording of information in databases. In so doing, we define an algebraic language for query and update of temporal databases. The relational algebra is first extended to support valid time. Historical versions of nine relational operators (i.e., union, difference, cartesian product, selection, projection, intersection, theta-join, natural join, and quotient) are defined and three new operators (i.e., historical derivation, non-unique aggregation, and unique aggregation) are introduced. Both the relational algebra and this new historical algebra are then encapsulated within a language of commands to support transaction time. The language’s semantics is formalized using denotational semantics. Rollback operators are added to the algebras to allow relations to be rolled back in time. The language accommodates scheme and contents evolution, handles single-command and multiple-command transactions, and supports queries on valid time. The language is shown to have the expressive power of the temporal query language TQuel. The language supports both unmaterialized and materialized views and accommodates a spectrum of view maintenance strategies, including incremental, recomputed, and immediate view materialization. Incremental versions of the snapshot and historical operators are defined to support incremental view materialization. A prototype query processor was built for TQuel to study incremental view materialization in temporal databases. Problems that arise when materialized views are maintained incrementally are discussed, and solutions to those problems are proposed. Criteria for evaluating temporal algebras are presented. Incompatibilities among the criteria are identified and a maximal set of compatible evaluation criteria is proposed. Our language and other previously proposed temporal extensions of the relational algebra are evaluated against these criteria.

McMillan, Jr., Leonard (1997)

“An Image-Based Approach to Three-Dimensional Computer Graphics”
Under the direction of Gary Bishop
Department of Computer Science Technical Report TR97-013
Electronic copy available

The conventional approach to three-dimensional computer graphics produces images from geometric scene descriptions by simulating the interaction of light with matter. My research explores an alternative approach that replaces the geometric scene description with perspective images and replaces the simulation process with data interpolation. I derive an image-warping equation that maps the visible points in a reference image to their correct positions in any desired view. This mapping from reference image to desired image is determined by the center-of-projection and pinhole-camera model of the two images and by a generalized disparity value associated with each point in the reference image. This generalized disparity value, which represents the structure of the scene, can be determined from point correspondences between multiple reference images. The image-warping equation alone is insufficient to synthesize desired images because multiple reference-image points may map to a single point. I derive a new visibility algorithm that determines a drawing order for the image warp. This algorithm results in correct visibility for the desired image independent of the reference image’s contents. The utility of the image-based approach can be enhanced with a more general pinhole-camera model. I provide several generalizations of the warping equation’s pinhole-camera model and discuss how to build an image-based representation when information about the reference image’s center-of-projection and camera model is unavailable.

Meehan, Michael J. (2001)

“Physiological Reaction as an Objective Measure of Presence in Virtual Environments”
Under the direction of Frederick P. Brooks Jr.
Department of Computer Science Technical Report TR01-018
Electronic copy available

Virtual environments (VEs) are one of the most advanced human-computer interfaces to date. A common measure of the effectiveness of a VE is the amount of presence it evokes in users. Presence is commonly defined as the sense of being there in a VE. In order to study the effect that technological improvements such as higher frame rate, more visual realism, and lower lag have on presence, we must be able to measure it. There has been much debate about the best way to measure presence, and we, as presence researchers, have yearned for a measure that is

  • Reliable–produces repeatable results, both from trial to trial on the same subject and across subjects;
  • Valid–measures subjective presence, or at least correlates well with established subjective presence measures;
  • Sensitive–is capable of distinguishing multiple levels of presence; and
  • Objective–is well shielded from both subject bias and experimenter bias.

We hypothesize that to the degree that a VE seems real, it will evoke physiological responses similar to those evoked by the corresponding real environment, and that greater presence will evoke a greater response. Hence, these responses serve as reliable, valid, sensitive, and objective measures of presence. We conducted three experiments that support the use of physiological reaction as a reliable, valid, sensitive, and objective measure of presence. We found that change in heart rate was the most sensitive of the physiological measures (and was more sensitive than most of the self-reported measures) and correlated best among the physiological measures with the reported presence measures. Additionally, our findings showed that passive haptics and frame rate are important for evoking presence in VEs. Inclusions of the 1.5-inch wooden ledge into the virtual environment significantly increased presence. Also, for presence evoked: 30 FPS (frames per second) >20 FPS >15 FPS. In conclusion, physiological reaction can be used as a reliable, valid, sensitive, and objective measure of presence in stress-inducing virtual environments.

Meenakshisundaram, Gopi (2001)

“Theory and Practice of Sampling and Reconstruction of Manifolds with Boundaries”
Under the direction of Jack Snoeyink
Electronic copy available

Surface sampling and reconstruction are used in modeling objects in graphics and digital archiving of mechanical parts in Computer Aided Design and Manufacturing (CAD/CAM). Sampling involves collecting 3D points from the surface. Using these point samples, a reconstruction process rebuilds a surface that is topologically equivalent and geometrically close to the original surface. Conditions are imposed on sampling to ensure correct reconstruction. In this dissertation, I study the sampling and reconstruction of manifolds with boundaries. For surfaces without boundaries, the sampling condition presented till now in the literature prescribes minimum required sampling density, and the reconstruction algorithm takes only the positions of the sample points as input. For the class of surfaces with boundaries, I show that the conditions on minimum required sampling density is not sufficient to ensure correct reconstruction, if the input to the reconstruction algorithm is just the positions of the sample points. Hence, for reliable reconstruction of surfaces with boundaries, either the sampling conditions should be strengthened or the reconstruction algorithm has to be provided with more information. These results have impact on the computation of the medial axis from the sample points of the surfaces with boundaries also. In this dissertation, I present a sampling condition which prescribes both relative minimum and maximum sampling densities for sampling surfaces with boundaries, and based on these sampling conditions, I develop a reconstruction algorithm which takes only the positions of the sample points as input. Further, I prove that the reconstructed surface is homeomorphic to the underlying sampled surface.

Mehra, Ravish (2015)

Efficient Techniques for Wave-Based Sound Propagation in Interactive Applications

Under the direction of Dinesh Manocha

Electronic copy available

Sound propagation techniques model the effect of the environment on sound waves and predict their behavior from point of emission at the source to the final point of arrival at the listener. Sound is a pressure wave produced by mechanical vibration of a surface, and the problem of sound propagation can be formulated mathematically as a second-order partial differential equation called the wave equation. Accurate techniques based on solving the wave equation, also called the wave-based techniques, are too expensive computationally and memory-wise. Therefore, these techniques face many challenges in terms of their applicability in interactive applications including sound propagation in large environments, time-varying source and listener directivity, and high simulation cost for mid-frequencies.

In this dissertation, we propose a set of efficient wave-based sound propagation techniques that solve these three challenges and enable the use of wave-based sound propagation in interactive applications. Firstly, we propose a novel equivalent source for interactive wave-based propagation in large scenes spanning hundreds of meters. It is based on the equivalent source theory used for solving radiation and scattering problems in acoustics and electromagnetics. Instead of using a volumetric or surface-based approach, this technique takes an object-centric approach to sound propagation. The proposed equivalent source technique generates realistic acoustic effects and takes orders of magnitude less runtime memory compared to prior wave-based techniques.

Secondly, we present an efficient framework for handling time-varying source and listener directivity for interactive wave-based sound propagation.  The source directivity is represented as a linear combination of elementary spherical harmonic sources. This spherical harmonic-based representation of source directivity can support analytical, data-driven, rotating or time-varying function at runtime. Unlike previous approaches, the listener directivity approach can be used to compute spatial audio (3D audio) for a moving, rotating listener at interactive rates. Lastly, we propose an efficient GPU-based time-domain solver for the wave equation that enables wave simulation up to the mid-frequency range in tens of minutes on a desktop computer. It is demonstrated that by carefully mapping all the components of the wave simulation to match the parallel processing capabilities of the graphics processors, significant improvement in performance can be gained compared to the CPU-based solvers, while maintaining numerical accuracy. 

We validate these techniques with offline numerical simulations and measured data recorded in an outdoor scene. We present results of preliminary user evaluations conducted to study the impact of these techniques on user’s immersion in virtual environment. We have integrated these techniques with the Half-Life 2 game engine, Oculus Rift head-mounted display, and Xbox game controller to enable users to experience high quality acoustics effects and spatial audio in the virtual environment.

Meng, Yu (2017)

Computational Cortical Surface Analysis for Study of Early Brain Development

Under the direction of Dinggang Shen

Electronic copy available

The study of morphological attributes of the cerebral cortex and their development is very important in understanding the dynamic and critical early brain development. Comparing with conventional studies in the image space, cortical surface-based analysis provides a better way to display, observe, and quantify the attributes of the cerebral cortex. The goal of this dissertation is to develop novel cortical surface-based methods for better studying the attributes of the cerebral cortex during early brain development. Specifically, this dissertation aims to develop methods for 1) estimating the development of morphological attributes of the cerebral cortex and 2) discovering the major cortical folding patterns. 

Estimation of the Development of Cortical Attributes. The early development of cortical attributes is highly correlated to the brain cognitive functionality and some neurodevelopmental disorders. Hence, accurately modeling the early development of cortical attributes is crucial for better understanding the mysterious normal and abnormal brain development. This task is very challenging, because infant cortical attributes change dramatically, complicatedly and regionally-heterogeneously during the first year of life. To address these problems, this dissertation proposes a Dynamically-Assembled Regression Forest (DARF). DARF first trains a single decision tree at each vertex on the cortical surface, and then groups nearby decision trees around each vertex as a vertex-specific forest to predict the cortical attribute. Since the vertex-specific forest can better capture regional details than the conventional regression forest trained for the whole brain, the prediction result is more precise. Moreover, because nearby forests share a large portion of decision trees, the prediction result is spatially smooth. On the other hand, missing cortical attribute maps in the longitudinal datasets often lead to insufficient data for unbiased analysis or training of accurate prediction models. To address this issue, a missing data estimation strategy based on DARF is further proposed. Experiments show that DARF outperforms the existing popular regression methods, and the proposed missing data estimation strategy based on DARF can effectively recover the missing cortical attribute maps.

Discovery of Major Cortical Folding Patterns. The folding patterns of the cerebral cortex are highly variable across subjects. Exploring the major cortical folding patterns in neonates is of great importance in neuroscience. Conventional geometric measurements of the cortex have limited capability in distinguishing major folding patterns. Although the recent sulcal pits-based analysis provides a better way for comparing sulcal patterns across individuals of adults or older children, whether and how sulcal pits are suitable for discovering major sulcal patterns in infants remain unknown. This dissertation adapts the sulcal pits extraction method of adults to infants, and validates the spatial consistency of sulcal pits in infants, so that they can be used as reliable landmarks for exploring major sulcal patterns. This dissertation further proposes a sulcal graph-based method for discovering major sulcal patterns, which is then applied to studying three primary cortical regions in 677 neonatal cortical surfaces. The experiments show that the proposed method is able to identify the previously unreported major sulcal patterns. Finally, this dissertation investigates and verifies that the sulcal pattern information could be utilized to help DARF for better estimating cortical attribute maps.

Menges, John E.(2009)

“Concur: An Investigation of Lightweight Migration in Support of Centralized Synchronous Distributed Collaboration”
Under the direction of Kevin Jeffay
Electronic copy available

Synchronous distributed collaborative systems support simultaneous observation of and interaction with shared objects by multiple, dispersed participants. Both centralized and replicated systems have been built, each class having its characteristic advantages and disadvantages. Centralized systems support simpler user mental models and are easier to implement, but they suffer from greater latency and reduced scalability. Replicated systems improve latency and scalability at the expense of more complex user mental models and implementations. Attempts to resolve these conflicting characteristics have resulted in hybrid (partly centralized and partly replicated) systems. These systems apply centralized and replicated sub-architectures selectively in order to best fit the properties of the architectures to various aspects of the system. Recent research has investigated support for dynamic reconfiguration among centralized, replicated, and hybrid configurations in response to the changing properties of a collaborative session over time. Dynamic reconfiguration typically involves migration of processes or objects among physically dispersed processors. Unfortunately, hybrid and dynamic architectures further complicate the user’s mental model, because they expose the user to more of the internal structure of the system and how it changes dynamically. Process and (object-oriented) object migration are also expensive in terms of migration time and the container (runtime) requirements at various processors, and they can increase latency during migration or if objects are migrated to a non-optimal location as a result of inaccurate predictions as to where they should be located. I have instead organized collaborative applications and supporting infrastructures around migrating entities, which are not required to have full process or object semantics. These entities are defined and classified based on properties affecting migration, such as their size and their use of external references. The central thesis of this dissertation is that this organization helps us to build collaborative applications and supporting infrastructures that achieve both the simpler user mental models and implementations of centralized systems, and the superior performance characteristics of replicated systems. This is accomplished through the fast migration of lightweight entities in a multi-centered centralized system, where a multi-centered system is defined as having single physical center and multiple, independently-migrating and entity-specific logical centers. The speed of such lightweight entity migration opens up the possibility of triggering migrations based on telegraphed user intentions (user actions that hint at imminent succeeding actions). Such telegraphed intentions are likely to be more accurate predictors of future interactions than the longer-term interaction histories considered in previous systems. Identifying sub-object, easily migratable entity classifications also minimizes container requirements, facilitating widespread entity distribution.

Merck, Derek L. (2010)

“Model Guided Rendering for Medical Images”
Under the direction of Steve Pizer & Julian Rosenman
Electronic copy available

High quality 3D medical image visualization has traditionally been restricted to either particular clinical tasks that focus on easily identified or high contrast structures, such as virtual colonoscopy, or to atlas patients such as the Visible Human, which can be painstakingly micro-segmented and rendered offline. Model Guided Rendering (MGR) uses partial image segmentations as a framework for combining information from multiple data sources into a single view, which leads to a variety of methods for synthesizing high quality visualizations that require only a short setup time. Interactively presenting such scenes for particular target patients enables a variety of new clinical applications. MGR draws information about a scene not only from the target medical image but also from segmentations and object models, from medical illustrations and solid color textures, from patient photographs, from registration fields, and from other patient images or atlases with information about structures that are hidden in the base modality. These data sources are combined on a region-by-region basis to estimate context-appropriate shading models and to compose a globally useful composition (clipping) for the entire scene. Local mappings are based on segmenting a sparse set of important structures from the scene by deformable shape models with well defined volumetric coordinates, such as the discrete medial representation (m-reps). This partial segmentation provides object coordinates that can be used to guide a variety of fast techniques for oriented solid texturing, color transfer from 2D or 3D sources, volume animation, and dynamic hierarchical importance clipping. The mgrView library computes medial-to-world and world-to-medial mappings and implements many of MGR’s methods within a fast rasterize-and-blend rendering core that can render complex scenes in real time on modest hardware. Several vignette views demonstrate how MGR’s unique capabilities can lead to important new comprehensions in clinical applications. These views include an interactive anatomic atlas of the head and neck, animated display of the effects of setup error or anatomic shape change on fractionated external beam radiotherapy treatment, and a pharyngoscopic augmentation that overlays planning image guidance information onto the camera view.

Merrell, Paul (2009)

“Model Synthesis”
Under the direction of Dinesh Manocha
Electronic copy available

Most computer graphics applications extensively use three-dimensional models and the demand for 3D models is growing. However, despite extensive work in modeling for over four decades, creating models remains a labor-intensive and difficult process even with the best available tools. I will present a new procedural modeling technique called model synthesis that is designed to generate models of many different kinds of objects. Model synthesis is inspired by developments in texture synthesis and is designed to automatically generate a large model that resembles a small example model provided by the user. Every small part of the generated model is identical to a small part of the example model. By altering the example model, a wide variety of objects can be produced. I will present several different model synthesis algorithms and analyze their strengths and weaknesses. Discrete model synthesis generates models built out of small building blocks or model pieces. Continuous model synthesis generates models on set of parallel planes. I will also show how to incorporate several additional user-defined constraints to control the large-scale structure of the model, to control how the objects are distributed, and to generate symmetric models. I will demonstrate how general each approach is by showing the wide variety of models they can produce including cities, landscapes, spaceships, and castles. The models contain hundreds of thousands of model pieces and are generated in only a few minutes.

Middleton, David J. (1986)

“Alternative Program Representations in the FFP Machine”
Under the direction of Gyula A. Mago
Electronic copy available

Program representation seems to be an important factor in enabling parallel computers to provide the high performance they promise. Program representation, from that of the initial algorithm to that of the run-time machine code, must provide enough information that parallelism can be detected without imposing further constraints that prevent this parallelism from being exploited. The thesis of this research is that the efficiency of the FFP machine is significantly affected by the choice for program representation. The results support the conjecture above and suggest that greater emphasis should be placed on program representation in the design of parallel computers. This research examines run-time program representation for the FFP machine, a fine-grained language-directed parallel computer which uses a string-reduction model of computation. Several alternative program representations are proposed, and the resulting machine variants are compared. Four characteristics are then developed which accurately predict the advantages and problems that would arise with other forms of program representation. These characteristics can be derived from the basic nature of the FFP machine and its model of computation, so these characteristics may well be important in the design of other parallel computers that share the basic character of the FFP machine. The results are organized into a procedure for choosing a representation, based on the expected use of a particular instance of the FFP machine. A definition of processor granularity is proposed based on software function instead of hardware costs.

Miller, Dorian B. (2008)

“Can we work together?”
Under the direction of Dave Stotts
Electronic copy available

People have a versatility to adapt to various situations in order to communicate with each other regardless of a person’s disability. We research separate computer interfaces to support remote synchronous collaboration in two situations. First, a deaf person collaborating with a hearing person uses a shared workspace with video conferencing, such as the Facetop system. Second, a blind person collaborating with a sighted person uses our loosely coupled custom shared workspace called Deep View. The design features in the respective interfaces accommodate the disability of a deaf person or a blind person to communicate with a person without a disability, thereby expanding the ways in which people with disabilities participate in a collaborative task at a level of detail not possible without our interfaces. The design features of our user interfaces provide alternative channels for the collaborators with disabilities to communicate ideas or coordinate actions that collaborators without disabilities would otherwise do verbally or visually. We evaluate the interfaces through three user studies where collaborators complete full fledged tasks that require managing all aspects of communication to complete the task. Throughout the research we collaborated with members of the Deaf community and members of the blind community. We incorporated the feedback from members of these communities into the implementation of our interfaces. The members participated in our user studies to evaluate the interfaces.

Miller, Swaha Das (2005)

“OSHL-U: A First Order Theorem Prover Using Propositional Techniques and Semantics”
Under the direction of David A. Plaisted
Electronic copy available

Large and complex 3D models are required for computer-aided design, architectural visualizations, flight simulation, and many types of virtual Automated theorem proving is the proving of theorems with an automatic computer program. Automation of theorem proving is useful in proving repetitive theorems quickly and accurately, as well as in proving theorems that may be too large or complex for humans to handle. Automated theorem proving finds application in various fields, such as, the verification of integrated circuits, software verification, deductive databases, mathematics, and education. The early success of automated theorem provers for first-order logic based on the resolution-unification paradigm has caused theorem proving research to be directed largely towards resolution and its variants. The problems people looked at in the early days of automated deduction were mostly in the classes of problems that resolution is especially efficient at solving such as Horn problems (Every clause in such problems contains at most one positive literal) and UR resolvable problems (Problems in which the proof can be obtained purely by UR resolution, which is a restricted form of resolution). The initially good performance of resolution on such problems led to disillusionment later on. Problems that are hard to solve for humans are less and less likely to be Horn or UR resolvable, so resolution is less likely to be efficient on such problems. That few hard problems have been solved automatically by current provers further supports this. Therefore, there is a need to go beyond resolution for harder problems. Approaches based on propositional techniques applied to first-order theorem proving have great potential in this direction. Provers based on propositional techniques such as DPLL are used extensively for hardware verification but resolution is hardly ever used for solution of large propositional problems because it is far less efficient. Resolution has serious inefficiencies on non-Horn propositional problems and it is likely that these inefficiencies carry over also to first-order problems. In recent times, techniques developed for deciding propositional satisfiability perform at tremendous speeds, solving verification problems containing tens of thousands of variables and hard random 3SAT problems containing millions of variables. The desire to import such propositional efficiency into first-order theorem proving has revived an interest in propositional techniques for proving first-order logic problems. This dissertation explores propositional techniques to first-order theorem proving and how these compare in performance to resolution-unification techniques. It formulates some techniques that are shown to enhance the performance of OSHL, a theorem prover for first-order logic based on propositional techniques. The resulting implementation, OSHL-U, performs better than resolution on categories of problems that are hard for resolution. The performance of OSHL-U can further be improved by the use of semantics that provide problem-specific information to help the proof search; this is demonstrated experimentally in the dissertation. The techniques applied to enhance OSHL performance are applicable to other theorem provers, too, and contribute towards building more powerful theorem provers for first-order logic in general.

Millman, David L. (2012)

“Degree-Driven Design of Geometric Algorithms for Point Location, Proximity, and Volume Calculation”
Under the direction of Jack Snoeyink
Electronic copy available

Correct implementation of published geometric algorithms is surprisingly difficult. Geometric algorithms are often designed for Real-RAM, a computational model that provides arbitrary precision arithmetic operations at unit cost. Actual commodity hardware provides only finite precision and may result in arithmetic errors. While the errors may seem small, if ignored, they may cause incorrect branching, which may cause an implementation to reach an undefined state, produce erroneous output, or crash. In 1999 Liotta, Preparata and Tamassia proposed that in addition to considering the resources of time and space, an algorithm designer should also consider the arithmetic precision necessary to guarantee a correct implementation. They called this design technique degree-driven algorithm design. Designers who consider the time, space, and precision for a problem up-front arrive at new solutions, gain further insight, and find simpler representations. In this thesis, I show that degree-driven design supports the development of new and robust geometric algorithms. I demonstrate this claim via several new algorithms. For n point sites on a U X U grid I consider three problems. First, I show how to compute the nearest neighbor transform in O(U2) expected time, O(U2) space, and double precision. Second, I show how to create a data structure in O(n log Un) expected time, O(n) expected space, and triple precision that supports O(log n) time and double precision post-office queries. Third, I show how to compute the Gabriel graph in O(n2) time, O(n2) space and double precision. For computing volumes of CSG models, I describe a framework that uses a minimal set of predicates that use at most five-fold precision. The framework is over 500x faster and two orders of magnitude more accurate than a Monte Carlo volume calculation algorithm.

Mine, Mark R. (1997)

“Exploiting Proprioception in Virtual-Environment Interaction”
Under the direction of Frederick P. Brooks, Jr.
Department of Computer Science Technical Report TR97-014
Electronic copy available

Manipulation in immersive virtual environments is difficult partly because users must do without the haptic contact with real objects they rely on in the real world to orient themselves and the objects they are manipulating. To compensate for this lack, I propose exploiting the one real object every user has in a virtual environment, his body. I present a unified framework for virtual-environment interaction based on proprioception, a person’s sense of the position and orientation of his body and limbs. I describe three forms of body-relative interaction:

  • Direct manipulation – ways to use body sense to help control manipulation
  • Physical mnemonics – ways to store/recall information relative to the body
  • Gestural actions – ways to use body-relative actions to issue commands

Automatic scaling is a way to bring objects instantly within reach so that users can manipulate them using proprioceptive cues. Several novel virtual interaction techniques based upon automatic scaling and the proposed framework of proprioception allow a user to interact with a virtual world intuitively, efficiently, precisely, and lazily. Two formal user studies evaluate key aspects of body-relative interaction. The virtual docking study compares the manipulation of objects co-located with one’s hand and the manipulation of objects at a distance. The widget interaction experiment explores the differences between interacting with a widget held in one’s hand and interacting with a widget floating in space. Lessons learned from the integration of body-relative techniques into a real-world system, the Chapel Hill Immersive Modeling Program (CHIMP), are presented and discussed.

Mo, Qi (2015)

“Efficient Light and Sound Propagation in Refractive Media Using Analytic Ray Curve Tracing”
Under the direction of Dinesh Manocha

Refractive media, including the atmosphere and the ocean, are ubiquitous in the natural world, and light and sound propagation in refractive media follow curved trajectories. It is important to accurately simulate refractive propagation for engineering applications such as laser ranging, sonar, underwater communication, urban planning, and noise control, but many existing propagation methods tend to ignore the effects of refractive media. Moreover, the performance of the few algorithms that simulate refractive propagation is too slow and regarded as prohibitively expensive for fully general three- dimensional scenes.

In this thesis, we present efficient methods for simulating propagation of light and sound in refractive media. Our methods rely on the ray tracing model, and the key source of efficiency comes from replacing linear rays with analytic ray curves as tracing primitives. Based on the analytic ray curve formulation, we present two algorithms that accelerate the ray traversal of the media and the ray intersection with boundary surfaces, for static and dynamic scenes, respectively. For static scenes, we precompute a constrained tetrahedral mesh that decompose the media spatially; the mesh cell sizes are computed based on media variations to facilitate efficient ray traversal, and the boundary surfaces are embedded in the mesh for ray intersection with no additional computation. For dynamic scenes, we employ a mesh- free approach that samples the media properties and computes the ray curves on-the-fly; surface-only acceleration structure is built and updated interactively for fast intersection test. Our approaches  improve the state of the art by almost two orders of magnitude and are able to compute propagation paths in refractive media at almost interactive rates.

We also address the problem of efficient computation of acoustic pressure field in the scene given the propagation paths. We present an algorithm that computes on-ray pressure based on our ray curve formulation and combine it with the Gaussian beam model to compute the pressure field spanned by the rays. This algorithm and the ray-curve based path computation algorithms mentioned above constitute a complete and practical solution for acoustic propagation in refractive media for fully general large outdoor scenes. We validate the accuracy of our algorithms against published benchmarks and highlight the performance improvement for both light and sound propagation in refractive media.

Moir, Mark S. (1996)

“Efficient Object Sharing in Shared-Memory Multiprocessors”
Under the direction of James Anderson
Electronic copy available

The goal of this work is to facilitate efficient use of concurrent shared objects in asynchronous, shared-memory multiprocessors. Shared objects are usually implemented using locks in such systems. However, lock-based implementations result in substantial inefficiency inmultiprogrammed systems, where processes are frequently subject to delays due to preemption. This inefficiency arises because processes can be delayed while holding a lock, thereby delaying other processes that require the lock. In contrast, lock-free and wait-free implementations guarantee that the delay of one process will not delay another process. We show that lock-free and wait-free implementations for shared objects provide a viable alternative to lock-based ones, and that they can provide a significant performance advantage over lock-based implementations in multiprogrammed systems. Lock-free and wait-free implementations are notoriously difficult to design and to verify, as correct. Universal constructions alleviate this problem by generating lock-free and wait-free shared object implementations using sequential implementations. However, previous universal constructions either require significant creative effort on the part of the programmer for each object, or result in objects that have high space and time overhead due to excessive copying. We present lock-free and wait-free universal constructions that achieve low space and time overhead for a wide spectrum of important objects, while not placing any extra burden on the object programmer. We also show that the space and time overhead of these constructions can be further reduced by using k-exclusion algorithms to restrict access to the shared object. This approach requires a long-lived renaming algorithm that enables processes to acquire and release names repeatedly. We present several efficient k-exclusion and long-lived renaming algorithms. Our universal constructions are based on the load-linked and store-conditional synchronization instructions. We give a constant-time, wait-free implementation of load-linked and store-conditional using compare-and-swap, and an implementation of multi-word load-linked and store-conditional instructions using similar one-word instructions. These results allow our algorithms and others to be applied more widely; they can also simplify the design of new algorithms.

Molnar, Steven E. (1991)

“Image-Composition Architectures for Real-Time Image Generation”
Under the direction of Henry Fuchs
Department of Computer Science Technical Report TR91-046
Electronic copy available

This dissertation describes a new approach for high-speed image-generation based on image compositing. Application software distributes the primitives of a graphics database over a homogeneous array of processors called renderers. Each renderer transforms and rasterizes its primitives to form a full-sized image of its portion of the database. A hardware datapath, called an image-composition network, composites the partial images into a single image of the entire scene. Image-composition architectures are linearly scalable to arbitrary performance. This property arises because: 1) renderers compute their subimages independently, and 2) an image-composition network can accommodate an arbitrary number of renderers, with constant bandwidth in each link of the network. Because they are scalable, image-composition architectures promise to achieve much higher performance than existing commercial or experimental systems. They are flexible as well, supporting a variety of primitive types and rendering algorithms. Also, they are efficient, having approximately the same performance/price ratio as the underlying renderers. Antialiasing is a special challenge for image-composition architectures. The compositing method must retain primitive geometry within each pixel to avoid aliasing. Two alternatives are explored in this dissertation: simple z-buffer compositing combined with supersampling, and A-buffer compositing. This dissertation describes the properties of image-composition architectures and presents the design of a prototype z-buffer-based system called PixelFlow. The PixelFlow design, using only proven technology, is expected to render 2.5 million triangles per second and 870 thousand antialiased triangles per second in a two-card-cage system. Additional card cages can be added to achieve nearly linear increases in performance.

Morales, Nic (2017)

Efficient Wave-based Sound Propagation and Optimization for Computer-Aided Design

Under the direction of Dinesh Manocha

Electronic copy available

Acoustic phenomena have a large impact on our everyday lives, from influencing our enjoyment of music in a concert hall, to affecting our concentration at school or work, to potentially negatively impacting our health through deafening noises. The sound that reaches our ears is absorbed, reflected, and filtered by the shape, topology, and materials present in the environment. However, many computer simulation techniques for solving these sound propagation problems are either computationally expensive or inaccurate. Additionally, the costs of some methods are dramatically increased in design optimization processes in which several iterations of sound propagation evaluation are necessary.

 The primary goal of this dissertation is to present techniques for efficiently solving the sound propagation problem and related optimization problems for computer-aided design. First, we propose a parallel method for solving large acoustic propagation problems, scalable to tens of thousands of cores. Second, we present two novel techniques for optimizing certain acoustic characteristics such as reverberation time or sound clarity using wave-based sound propagation. Finally, we show how hybrid sound propagation algorithms can be used to improve the performance of acoustic optimization problems and present two algorithms for noise minimization and speech intelligibility improvement that use this hybrid approach.

All the algorithms we present are evaluated on various benchmarks that are computer models of architectural scenes. These benchmarks include challenging environments for existing sound propagation algorithms, such as large indoor or outdoor scenes, structural complex scenes, or the prevalence of difficult-to-model sound propagation phenomena. Using the techniques put forth in this dissertation, we can solve many challenging sound propagation and optimization problems on the scenes in an efficient manner. We are able to accurately model sound propagation using wave-based approaches up to 10kHz (the full range of human speech) and for the full range of human hearing (22kHz) using our hybrid approach. Our noise minimization methods show improvements of up to 13dB in noise reduction on some scenes, and we show a 71% improvement in speech intelligibility using our algorithm.

Morse, Bryan S. (1995)

“Computation of Object Cores from Grey-level Images”
Under the direction of Stephen M. Pizer

Pixels in discrete images represent not just dimensionless points but the integration of image information over spatial areas. The size of this measurement aperture (the scale of image measurements) limits our ability to discern finer-scale structure. Changing the magnification of objects in images changes the level of detail that can be described. Oversensitivity to such details causes difficulty in making shape-based comparisons for such tasks as identification or registration. To create object representations that are invariant to magnification, one must tie the measurement scale to the size of the objects involved. Research at UNC-CH has led to a representation known as a core that establishes this relationship. Cores are related to previous medial representations of shape but go farther to capture the central essence of objects. The object-relevant use of scale separates general shape properties of objects from specific detail (allowing more robust representations when comparing objects) and provides a basis for object-based tradeoff between noise and loss of detail (allowing finer, but noisier, representations for small objects while simultaneously producing less noisy representations for larger, more stable, objects). This dissertation presents four distinct algorithms that, taken together, compute cores for objects in images. They derive cores directly from the image and do not require a prior segmentation, thus producing image segmentation, as well as object representations. They include

  1. An algorithm, similar to the Hough transform, for transforming boundary-likelihood information (graded boundary values, not binary contours) to medial- likelihood information at all spatial positions and scales;
  2. An algorithm for finding curves in this scale space of medial information that are locally optimal with respect to both position and scale in directions transverse to the curve;
  3. A competitive credit attribution (feature-binding) algorithm that constrains the medial-likelihood computation to produce a more coherent interpretation of the image;
  4. A cooperative algorithm that encourages longer connected cores and completion of cores across gaps.

These algorithms are evaluated using both qualitative testing on various types of images and quantitative study of their behavior under systematically controlled variations in images. Comparisons are also made between computed cores and the results of previous and ongoing psychophysical studies.

Mudge, J. Craig (1973)

“Human Factors in the Design of a Computer-Assisted Instruction System”
Under the direction of Frederick P. Brooks, Jr.
Electronic copy available

This research is an exploratory case study of ease-of-use factors in man-computer interfaces. The approach has been to build and evaluate a real man-machine system, giving special attention to the form of all man-machine and machine-man communications. Author-controlled computer-assisted instruction was selected. Such a system has three principal interfaces: student-system, author-system, and computer programmer-system. The method was to design, build a large subset of the design, make systematic observations of the three interfaces in use, and then iterate on the design and on the observations. The system has been in regular class use for a year. Interactive development of course material by authors, execution of instructional programs by students, and the requisite administrative tasks are integrated into a single production-oriented system. The nucleus of the system is a new display-based interactive authorlanguage, DIAL. The design demands a language implementation which is systematic and which permits easy language modification. A translator writing system, based extensively on McKeeman’s, assists computer programmers in generating compilers for new versions of the language. Two of the design assumptions (that the course author is always an experienced teacher and that he does his own programming in DIAL, without an intermediary CAI language programmer) are major departures from most CAI authoring systems. Professorial-level authoring imposes stringent requirements on the ease-of-use and simplicity of the language and the operational environment in which it is embedded. A measured high level of user acceptance proved the soundness of the design and illuminated the relatively few mistakes made. A factor-of-five improvement in authoring time over data published for other systems was observed. Several improvements in DIAL over existing CAI languages were observed. The underlying machine intrudes much less than in existing languages, and there are other syntactic improvements. The provision in DIAL of a pattern matching function allowed a very general answer-evaluating technique, called the sieve, to be devised. Analysis of author use of DIAL has derived DIAL/2, which is radically different syntactically but only slightly enriched functionally. The translator writing system proved very useful in progressive implementation of DIAL and in the remediation of language weaknesses as they were discovered. Although a translator writing system was not available for the command language of the operational environment, the human-factors debugging period (necessary for all user-oriented systems) revealed the desirability of such.

Mueller, Carl A. (2000)

“The Sort-First Architecture for Real-Time Image Generation”
Under the direction of Anselmo A. Lastra
Department of Computer Science Technical Report TR01-023
Electronic copy available

The field of real-time computer graphics has been pushing hardware capability to its limits. There is demand to increase not only the complexity of the models that are displayed, but also the resolution of the images. To provide the ultimate power for interactive graphics, fully-parallelized systems have been developed which work on rendering multiple polygons at all major stages of the graphics pipeline. UNC researchers have devised a taxonomy of such architectures, naming the classes “sort first,” “sort-middle,” and “sort-last” [MOLN91, MOLN94]. While the latter two have been well explored and developed, sort-first has not, despite the fact that it offers great promise for real-time rendering. Sort-first offers an advantage over sort-middle in that it can take advantage of the frame-to-frame coherence that is inherent in interactive applications to reduce the communications bandwidth needed to send primitives among processors. Sort-first also offers an advantage over sort-last in that it does not require huge amounts of communication bandwidth to deal with pixel traffic. Based on these advantages, our thesis states that sort-first is the architecture best suited for rapidly rendering very high-resolution images of complex model datasets. However, to support this thesis, we must show how sort-first can be implemented efficiently. The main issues standing in the way of sort-first implementation are load balancing and management of a migrating primitive database. These issues are explored thoroughly in this dissertation. We demonstrate a new adaptive load-balancing method (applicable to sort-middle as well) which allows for efficient load distribution with low overhead. We also show two solution methods for managing a hierarchical database of migrating primitives. In addition we examine the communications requirements for sort-first, first by examining a design example and later by providing a comparison against sort-middle. Our research shows that sort-first is a viable and very competitive architecture. Its strengths make it the ideal choice when very high-resolution rendering is needed for complex, retained-mode datasets. While this dissertation clears the way for sort-first implementation, there are still many opportunities for further exploration in this area.

Munson, Jonathan P. (1997)

“Synchronization in Collaborative Applications.”
Under the direction of Prasun Dewan

Collaborative applications are designed to support the activities of a group of people engaged in a common task. When this task includes concurrent access to a common set of data, the application must synchronize the group’s accesses so that they do not interfere with each other and leave the data in an inconsistent state. The application may employ a synchronization system to provide this control. We distinguish between consistency requirements, which are rules defined by the application and its users that specify which concurrent accesses are allowed and which are not, and consistency criteria, which are the rules synchronization systems use to enforce consistency requirements. A synchronization system’s consistency criteria should prevent all concurrency not allowed by the application’s consistency requirements, but, ideally, should not prevent concurrency that is allowed by an application’s requirements. Since the consistency requirements of collaborative applications vary from one to another, a general-purpose synchronization system should have the flexibility to accommodate the consistency requirements of a wide range of applications. Moreover, access to this flexibility should be at as high a level as possible, to reduce the burden of synchronization policy specification that must fall upon application developers and users. Existing synchronization systems are either inflexible, offering fixed policies only, or are flexible but require extensive specification on the part of the application developers or users. Based on specific scenarios involving specific collaborative applications, we identify a comprehensive set of requirements for synchronization in collaborative systems that detail these needs for flexibility and ease of specification. We present a new synchronization model designed to fulfill these requirements better than existing systems. Our new model is based on two elements: (1) constructor-based object definition, which provides a set of high-level synchronization-aware types that may be hierarchically composed to construct new types, and (2) table-based, type-specific specification mechanisms whose defaults may be overridden on an individual-entry-basis by programmers and users. This approach gives programmers and users high flexibility in specifying the synchronization policy to be used for their application, while burdening them only incrementally with specification tasks. The model supports synchronization in both disconnected environments (logically or physically), and connected environments. An application can use both simultaneously to serve both connected and disconnected users. We describe two synchronization systems we developed based on our models, each fulfilling certain requirements not fulfilled by the other. We then describe our experiences with these systems in providing synchronization to five collaborative applications. Next we give a requirement-by-requirement evaluation of our systems and existing systems, showing that our system fulfills the flexibility and ease-of-specification requirements better than existing systems. Finally we present our conclusions and thoughts on future directions of this research.


Nackman, Lee R. (1982)

“Three-Dimensional Shape Description Using the Symmetric Axis Transform”
Under the direction of Stephen M. Pizer
Electronic copy available

Blum’s transform, variously known as the symmetric axis transform, medial axis transform, or skeleton, and his associated two-dimensional shape description methodology are generalized to three-dimensions. Bookstein’s two-dimensional algorithm for finding an approximation to the symmetric axis is also generalized. The symmetric axis (SA) of an object with a smooth boundary is the locus of points inside the object having at least two nearest neighbors on the object boundary. In three dimensions, the SA is, in general, a collection of smooth surface patches, called simplified segments, connected together in a tree-like structure. Together with the radius function, the distance from each point on the SA to a nearest boundary point, the SA forms the symmetric axis transform. The three-dimensional symmetric axis transform defines a unique, coordinate-system-independent decomposition of an object into disjoint, two-sided pieces, each with its own simplified segment and associated object boundary patches. Four principal contributions are presented. (1) A relationship among the Gaussian and mean curvatures of a simplified segment, the Gaussian and mean curvatures of the associated object boundary patches, and radius function measures is derived. (2) A further decomposition is proposed wherein each two-sided piece is partitioned into primitives drawn from three separate, but not completely independent, primitive sets: width primitives, boundary primitives, and axis primitives. Width primitives are regions derived from derivatives of the radius function; hence, they capture the behavior of the boundary patches with respect to the simplified segment. Axis and boundary primitives are regions of constant signs of Gaussian and mean curvatures of the simplified segment and boundary patches respectively. The aforementioned curvature relationship is used to derive relationships among the primitive sets. (3) In the course of studying width primitives, it is proved that, under certain non-degeneracy assumptions, the regions of the graph defined by the critical points, ridge lines, and course lines of a scalar valued function over a surface have one of three types of cycle as boundary. (4) An almost linear algorithm that takes a polyhedral approximation to a three-dimensional object and yields a polyhedral surface approximation to that object’s SA is developed.

Narain, Rahul (2011)

“Visual Modeling and Simulation of Multiscale Phenomena”
Under the direction of Ming C. Lin
Electronic copy available

Many large-scale systems such as human crowds, fluids, and granular materials exhibit complicated motion at many different scales, from a characteristic global behavior to important small-scale detail. Such multiscale systems are computationally expensive for traditional simulation techniques to capture over the full range of scales. I present novel techniques for scalable simulation of these large, complex phenomena for visual computing applications. These techniques achieve their efficiency by coupling together separate models for the large-scale and fine-scale dynamics of a complex system. In fluid simulation, it remains a challenge to efficiently simulate fine details such as foam, ripples, and turbulence without compromising the accuracy of the large-scale flow. I present two techniques for this problem that combine physically-based numerical simulation for the global flow with efficient local models for detail. For surface features, I propose the use of texture synthesis, guided by the physical characteristics of the macroscopic flow. For turbulence in the fluid motion itself, I present a technique that tracks the transfer of energy from the mean flow to the turbulent fluctuations and synthesizes these fluctuations procedurally, allowing extremely efficient visual simulation of turbulent fluids. Another large class of problems which are not easily handled by traditional approaches is the simulation of very large aggregates of discrete entities, such as dense pedestrian crowds and granular materials. I present a technique for crowd simulation that couples a discrete model of individual navigation with a novel continuum formulation for the collective motion of pedestrians, enabling simulation of extremely large crowds at nearreal-time rates on commodity hardware. I also present a technique for simulating granular materials which generalizes this model and introduces a novel computational scheme for friction, thus efficiently reproducing a wide range of granular behavior. In all these cases, the proposed techniques are typically an order of magnitude faster than comparable existing methods. Through these applications to a diverse set of challenging simulation problems, I demonstrate that the proposed approach is a powerful and versatile technique for the simulation of a broad range of large, complex systems.

Narang, Sahil (2018)

“Simulating Plausible Movement-Based Interactions Between Agents and Avatars Using Biomechanical Principles and Psychological Factors”

Under the direction of Dinesh Manocha

Electronic copy available

Many Virtual Reality-based applications enable the user to actively participate in the virtual environment (VE) by embodying a virtual avatar, and these applications rely on the presence of interactive and virtual pedestrians, or agents, to enhance the plausibility of the simulation. A key challenge is to generate plausible movements, in terms of navigation paths and full-body animation, for each virtual agent as it interacts with the user’s avatar and other agents. The overall effectiveness of the application depends on both the realism of the agent’s movements and the faithful representation of the user’s avatar.

         The primary goal of this dissertation is to present novel techniques to simulate plausible movements and behaviors for virtual agents, to synthesize personalized avatars that reflect the user’s unique motion, and to generate plausible avatar-agent interactions in immersive VE’s. First, we present a novel approach that generates trajectories for agents that reflect the speed and density relationship observed in human crowds. Our approach models the biomechanical relationship between stride length and walking speed, as well as psychological factors such as preference for personal space. Second, we propose a velocity computation algorithm that takes into account human motion constraints and generates plausible full body motion for the virtual agents and movement-based avatar-agent interactions. Third, we present an algorithm that simulates plausible gaze-based behaviors in avatar-agent interactions. Our approach relies on a Bayesian interpretation of the Theory of Mind concept to infer the user’s intentions and generates an appropriate response for the virtual agents. Finally, we study the perception of motion rendered on virtual avatars and propose a novel, data-driven approach to rapidly synthesize a photo-realistic virtual avatar that reflects the user’s unique gait. We highlight the interactive performance of our algorithms in complex virtual environments with tens of virtual agents. We also conduct extensive user studies and show that the interactions generated using our proposed algorithms are perceived as more plausible, and that they have a greater impact on the user in immersive settings.

Nashel, Andrew R. (2010)

“Rendering and Display for Multi-Viewer Tele-Immersion”
Under the direction of Henry Fuchs
Electronic copy available

Video teleconferencing systems are widely deployed for business, education and personal use to enable face-to-face communication between people at distant sites. Unfortunately, the two-dimensional video of conventional systems does not correctly convey several important non-verbal communication cues such as eye contact and gaze awareness. Tele-immersion refers to technologies aimed at providing distant users with a more compelling sense of remote presence than conventional video teleconferencing. This dissertation is concerned with the particular challenges of interaction between groups of users at remote sites. The problems of video teleconferencing are exacerbated when groups of people communicate. Ideally, a group tele-immersion system would display views of the remote site at the right size and location, from the correct viewpoint for each local user. However, is not practical to put a camera in every possible eye location, and it is not clear how to provide each viewer with correct and unique imagery. I introduce rendering techniques and multi-view display designs to support eye contact and gaze awareness between groups of viewers at two distant sites. With a shared 2D display, virtual camera views can improve local spatial cues while preserving scene continuity, by rendering the scene from novel viewpoints that may not correspond to a physical camera. I describe several techniques, including a compact light field, a plane sweeping algorithm, a depth dependent camera model, and video-quality proxies, suitable for producing useful views of a remote scene for a group local viewers. The first novel display provides simultaneous, unique monoscopic views to several users, with fewer user position restrictions than existing autostereoscopic displays. The second is a random hole barrier autostereoscopic display that eliminates the viewing zones and user position requirements of conventional autostereoscopic displays, and provides unique 3D views for multiple users in arbitrary locations.

Navon, Jaime (1976)

“Specification and Semi-Automated Verification of Coordination Protocols for Collaborative Software Systems”
Under the direction of David Stotts

Collaborative systems support groups of people working together towards a common goal. Cooperation, communication and coordination among the actors are necessary to perform collaborative work. Coordination involves the management of dependencies between activities through a set of interaction rules that control the execution of tasks over shared resources. Moreover, many times the requirements are such that coordination mechanisms must be updated at runtime. Although in the past decade many collaborative systems have been built, their coordination protocols are implicit, usually built-in and static. Not having a precise description of the collaboration protocol may produce unwanted behavior that goes undetected for a long time. In the same way that in a theater play a precise script must be written which specifies the moment and mode in which characters participate, collaborative systems (such as electronic meeting software) should have a precise description of all the coordination details: how many speak at a time, role of the moderator, etc. In order to better create computer supported collaborative systems we have developed techniques to express the associated coordination protocols in a precise manner so they can be analyzed, debugged and dynamically changed. We present a graphic language and formalism that allow us not only to describe and model coordination protocols, but also to perform automatic verification of these protocols through symbolic model checking techniques. We present examples of the use of these techniques in the context of multi-user hyperdocuments and floor control in meeting software. We also illustrate the broader applicability of the methods with an application to HTML frames.

Neumann, Ulrich (1993)

“Volume Reconstruction and Parallel Rendering Algorithms: A Comparative Analysis”
Under the direction of Henry Fuchs
Department of Computer Science Technical Report TR93-017
Electronic copy available

This work focuses on two issues of concern to designers and implementers of volume-rendering applications–finding the most efficient rendering method that provides the best image possible, and efficiently parallelizing the computation on multicomputers to render images as quickly as possible. Three volume rendering methods: ray casting, splatting, and volume shearing, are compared with respect to their reconstruction accuracy and computational expense. The computational expense of the rendering methods is modeled and measured on several workstations. The most-efficient rendering method of the three is found to be splatting. Three reconstruction-filter kernels are evaluated for their accuracy. Two error measurement methods are used. The first is image-based and uses a heuristic metric for measuring the difference between rendered images and reference images. The second method is analytical and uses a scale-space measure of feature size to compute an error bound as a function of feature size and sampling density. Of the three filter kernels tested, the separable cubic filter to found to produce the most-accurate reconstruction of the volume function. Parallel volume-rendering algorithms may be classified and described by the partitioning of tasks and data within the system. A taxonomy of algorithms is presented and the members are analyzed in terms of their communication requirements. Three optimal algorithm-classes are revealed: image partitions with both static and dynamic data distributions, and object partitions with contiguous dynamic block data distributions. Through analysis and experimental tests, a 2D mesh network-topology is shown to be sufficient for scaleable performance with an object-partition algorithm when the image size is kept constant. Furthermore, the network-channel bandwidth-requirement actually decreases as the problem is scaled to a larger system and volume data size. Implementations on Pixel-Planes 5 and the Touchstone Delta demonstrate and verify the scalability of object partitions. As part of these implementations, a new load balancing approach is demonstrated for object-partition algorithms.

Newton, Ben (2017)

Efficient Topology Management and Geographic Routing in High-Capacity Continental-Scale Airborne Networks

Under the direction of Kevin Jeffay

Electronic copy available

Large-scale high-capacity communication networks among mobile airborne platforms are quickly becoming a reality. Today, both Google and Facebook are seeking to form networks between high-flying balloons and drones in an effort to provide Internet connections from the upper atmosphere to users on the ground. This dissertation proposes an alternative, namely using the cargo and passenger aircraft already in the skies as the principal components of such a network. My work presents the design of a network architecture to overcome the challenges of managing the topology of and routing data within these continental-scale highly-dynamic networks.

The architecture relies on directional communication links, such as free-space optical communication links (FSO), to achieve high data rates over long distances. However, these state-of-the-art communication systems present new networking challenges. One such challenge is that of managing the physical topology of the network. Such a topology must be explicitly managed, ensuring that each directional data link is pointed at and connected with an appropriate neighbor (which is also pointing back) to yield an acceptable global topology. To overcome this challenge, a distributed topology management framework and associated topology generation algorithms were designed, implemented and tested via simulation. The framework is capable of managing the topology of thousands of nodes in a continental-scale airborne network and has no communication overhead except that required to exchange position information among nearby nodes.

A second component of the work concerns routing data at high data rates through a constantly changing network topology. To address this issue Topology Aware Geographic Routing (TAG), a position-based routing protocol was developed that strategically uses local topology information to make better local forwarding decisions, decreasing the number of hops required to deliver a packet, when compared with other geographic routing protocols. In addition, unlike other similar protocols, TAG is able to reliably deliver packets even when the topology changes while the packet is in flight. These protocols are tested and validated in a series of simulations where nodes trace the trajectories recorded from thousands of actual flights. This simulations indicate that the topology management framework and TAG are able to perform well in large-scale high-density conditions, over long durations, and are able to support tens of thousands of 1 Mbps flows.

Nie, Dong (2019)

Blurry Boundary Delineation and Adversarial Confidence Learning for Medical Image Analysis

Under the direction of Dinggang Shen

Electronic copy available

Low tissue contrast and fuzzy boundaries are major challenges in medical image segmentation which is a key step for various medical image analysis tasks. In particular, blurry boundary delineation is one of the most challenging problems due to low-contrast and even vanishing boundaries. Currently, encoder-decoder networks are widely adopted for medical image segmentation. With the lateral skip connection, the models can obtain and fuse both semantic and resolution information in deep layers to achieve more accurate segmentation performance. However, in many applications (e.g., images with blurry boundaries), these models often cannot precisely locate complex boundaries and segment tiny isolated parts. To solve this challenging problem, we empirically analyze why simple lateral connections in encoder-decoder architectures are not able to accurately locate indistinct boundaries. Based on the analysis, we argue learning high-resolution semantic information in the lateral connection can better delineate the blurry boundaries. Two methods have been proposed to achieve such a goal. a) A high-resolution pathway composed of dilated residual blocks has been adopted to replace the simple lateral connection for learning the high-resolution semantic features. b) A semantic-guided encoder feature learning strategy is further proposed to learn high-resolution semantic encoder features so that we can more accurately and efficiently locate the blurry boundaries. Besides, we also explore contour constraint mechanism to model the blurry boundary detection. Experimental results on real clinical datasets (infant brain MRI and pelvic organ datasets) show that our proposed methods can achieve state-of-the-art segmentation accuracy, especially for the blurry regions. Further analysis also indicates that our proposed network components indeed contribute to the performance gain. Experiments on extra dataset also validate the generalization ability of our proposed methods.


Generative adversarial networks (GANs) are widely used in medical image analysis tasks, such as medical image segmentation and synthesis. In these works, adversarial learning is usually directly applied to the original supervised segmentation (synthesis) networks. The usage of adversarial learning is effective in improving visual perception performance since adversarial learning works as realistic regularization for supervised generators. However, the quantitative performance often cannot be improved as much as the qualitative performance, and it can even become worse in some cases. In this dissertation, I explore how adversarial learning could be more useful in supervised segmentation (synthesis) models, i.e., how to synchronously improve visual and quantitative performance. I first analyze the roles of discriminator in the classic GANs and compare them with those in supervised adversarial systems. Based on this analysis, an adversarial confidence learning framework is proposed for taking better advantage of adversarial learning; that is, besides the adversarial learning for emphasizing visual perception, the confidence information provided by the adversarial network is utilized to enhance the design of supervised segmentation (synthesis) network. In particular, I propose using a fully convolutional adversarial network for confidence learning to provide voxel-wise and region-wise confidence information for the segmentation (synthesis) network. Furthermore,various loss functions of GANs are investigated and the binary cross entropy loss is finally chosen to train the proposed adversarial confidence learning system so that the unlimited modeling capacity of the discriminator is retained for confidence learning. With these settings, two machine learning algorithms are proposed to solve some specific medical image analysis problems. a) A difficulty-aware attention mechanism is proposed to properly handle hard samples or regions by taking structural information into consideration so that the irregular distribution of medical data could be appropriately dealt with. Experimental results on clinical and challenge datasets show that the proposed algorithm can achieve state-of-the-art segmentation (synthesis) accuracy. Further analysis also indicates that adversarial confidence learning can synchronously improve the visual perception and quantitative performance. b) A semi-supervised segmentation model is proposed to alleviate the everlasting challenge for medical image segmentation – lack of annotated data. The proposed method can automatically recognize the well-segmented regions (instead of the entire sample) and dynamically include them to increase the label set during training. Specifically, based on the confidence map, a region-attention based semi-supervised learning strategy is designed to further train the segmentation network. Experimental results on real clinical datasets show that the proposed approach can achieve better segmentation performance with extra unannotated data.

Nie, Xumin (1989)

“Automatic Theorem Proving In Problem Reduction Formats”
Under the direction of David A. Plaisted
Department of Computer Science Technical Report TR89-044
Electronic copy available

This thesis explores several topics concerning the sequent-style inference system – the modified problem reduction format. Chapter 1 is the introductory chapter. In Chapter 2, we will present how caching is performed with the depth first iterative deepening search to implement the modified problem reduction format, in order to avoid the repeated work involved in solving a subgoal more than once. In Chapter 3, we present the formalization of goal generalization and how it is implemented by augmenting the modified problem reduction format, where goal generalization is a special case of Explanation-Based Generalization in maching learning. In Chapter 4, we will present how subgoal reordering is performed in the modified problem reduction format and how it is implemented. In Chapter 5 and Chapter 6, we will present two refinements to the depth-first iterative deepening search strategy in the implementation. The first refinement, the priority system concerns how to incorporate the use of priority of subgoals into the depth-first iterative deepening search. We show that the time complexity of the priority systems is within a constant factor of the complexity of the depth-first iterative deepening search. The second refinement is based on a syntactic viewpoint of proof development, which views the process of finding proofs as an incremental process of constructing instances with a certain property. In Chapter 7, we present how semantics, or domain dependent knowledge, can be used in the inference system. In particular, we will present a semantic variant of the modified problem reduction format which selects its inference rules from any interpretation. This results in an inference system which is a true set-of-support strategy and allows back chaining. We will also discuss how contrapositives are used in the modified problem reduction format and its semantic variant. We will show that only some contrapositives are needed according to some interpretation.

Nomura, Kunihiko (1974)

“Stochastic Models for Systems of Multiple Integrated Processors”
Under the direction of Victor L. Wallace

Numerical solution of Markovian models provides an approach to the analysis of the stochastic behavior of computer systems. Such analysis is essential to evaluation of performance. The numerical approach frequently represents a middle ground in both cost and power between closed form solution of queueing-theoretic models and Monte Carlo simulation. A two-stage numerical method for the analysis of the equilibrium behavior of a class of closed queueing network models is developed. The class of queueing networks to which it applies frequently occurs in investigations of computer congestion problem and often have no known closed form solution. The method developed, called here a “shifted Gauss-Seidel” method, is an adaptation of the Gauss-Seidel method to homogeneous systems of equations. Convergence theory for this method and several and several alternative methods is presented, indicating the desirability of the method. The method is applied to an analysis of a model for systems of multiple integrated processors, and performance compared to a more conventional architecture. Several important conjectures about the conditions for effective use of such systems are produced.


Oguz, Ipek (2009)

“Groupwise Shape Correspondence with Local Features”
Under the direction of Martin Styner
Electronic copy available

Statistical shape analysis of anatomical structures plays an important role in many medical image analysis applications such as understanding the structural changes in anatomy in various stages of growth or disease. Establishing optimally accurate correspondence across object populations is essential for such statistical shape analysis studies. However, anatomical correspondence is rarely a direct result of spatial proximity of sample points but rather depends on many other features such as local curvature, position with respect to blood vessels, or connectivity to other parts of the anatomy. This dissertation presents a novel method for computing point-based correspondence among populations of surfaces by combining spatial location of the sample points with non-spatial local features. A framework for optimizing correspondence using arbitrary local features is developed. The performance of the correspondence algorithm is objectively assessed using a set of evaluation metrics. The main focus of this research is on correspondence across human cortical surfaces. Statistical analysis of cortical thickness, which is key to many neurological research problems, is the driving problem. I show that incorporating geometric (sulcal depth) and nongeometric (DTI connectivity) knowledge about the cortex significantly improves cortical correspondence compared to existing techniques. Furthermore, I present a framework that is the first to allow the white matter fiber connectivity to be used for improving cortical correspondence.

Ohbuchi, Ryutarou (1994)

“Incremental Acquisition and Visualization of 3D Ultrasound Images”
Under the direction of Henry Fuchs
Department of Computer Science Technical Report TR95-023
Electronic copy available

This dissertation describes work on 3D visualization of ultrasound echography data. The future goal of this research is the in-place volume visualization of medical 3D ultrasound images acquired and visualized real-time. For example, using such a system, a doctor wearing a special glasses would see a volume-visualized image of the fetus in the mother’s abdomen. This dissertation discusses two feasibility study systems that have been developed in order to push the state of the art toward this goal. The work on the first system, the static viewpoint 3D echography system, shows that it is possible with current graphics hardware to visualize, at an interactive rate, a stationary object from a series of 2D echography image slices hand-guided with 3 degrees-of-freedom. This work includes development of an incremental volume reconstruction algorithm for irregularly spaced samples and development of an efficient volume visualization algorithm based on a spatial bounding technique. The work on the second system, the dynamic viewpoint 3D echography system, shows the feasibility of a system that uses a video see-through head-mounted display to realize in-place visualization of ultrasound echography datasets.

Olano, T. Marc (1998)

“A Programmable Pipeline for Graphics Hardware”
Under the direction of Anselmo A. Lastra
Electronic copy available

This dissertation demonstrates user-written procedures on an interactive graphics machine. Procedural shading is a proven technique for off-line rendering, and has been effectively used for years in commercials and movies. During most of that time, polygon-per-second performance has been a major focus for graphics hardware development. In the last few years, we have seen increased attention on surface shading quality for graphics hardware, principally image- based texture mapping. Today, even low-end PC products include support for image textures. The PixelFlow graphics machine demonstrates that techniques like procedural shading are possible at interactive rates. PixelFlow is the first machine to run, at real-time rates of 30 frames per second, procedural shaders written in a high-level shading language. A graphics machine like PixelFlow is a large and complex device. An abstract pipeline is presented to model the operation of this and other interactive graphics machines. Each stage of the pipeline corresponds to one type of procedure that can be written by the graphics programmer. Through the abstract pipeline, the user can write shading procedures or procedures for other graphics tasks without needing to know the details of the machine architecture. We also provide a special-purpose language for writing some of these procedures. The special-purpose language hides more details of the machine implementation while enabling optimizations that make execution of the procedures possible.

Oliveira Neto, Manuel Menezes de (2000)

“Relief Texture Mapping”
Under the direction of Gary Bishop
Department of Computer Science Technical Report TR00-009
Electronic copy available

We present an extension to texture mapping that supports the representation of 3-D surface details and view motion parallax. The results are correct for viewpoints that are static or moving, far away or nearby. In this approach, a relief texture (texture extended with an orthogonal displacement per texel) is mapped onto a polygon using a two-step process. First, it is converted into an ordinary texture using a surprisingly simple 1-D forward transform. The resulting texture is then mapped onto the polygon using standard texture mapping. The 1-D warping functions work in texture coordinates to handle the parallax and visibility changes that result from the 3-D shape of the displacement surface. The subsequent texture-mapping operation handles the transformation from texture to screen coordinates. The pre-warping equations have a very simple 1-D structure that enables the pre-warp to be implemented using only 1-D image operations along rows and columns and requires interpolation between only two adjacent texels at a time. This allows efficient implementation in software and should allow a simple and efficient hardware implementation. The texture-mapping hardware already very common in graphics systems efficiently implements the final texture mapping stage of the warp. We demonstrate a software implementation of the method and show that it significantly increases the expressive power of conventional texture mapping. It also dramatically reduces the polygonal count required to model a scene, while preserving its realistic look. This new approach supports the representation and rendering of three-dimensional objects and immersive environments, and naturally integrates itself with popular graphics APIs. An important aspect of this research is to provide a framework for combining the photo-realistic promise of image-based modeling and rendering techniques with the advantages of polygonal rendering.

O’Meara, Matthew James (2013)

“A Features Analysis Tool For Assessing And Improving Computational Models In Structural Biology”
Under the direction of Jack Snoeyink and Brian Kuhlman
Electronic copy available

The protein-folding problem is to predict, from a protein’s amino acid sequence, its folded 3D conformation. State of the art computational models are complex collaboratively maintained prediction software. Like other complex software, they become brittle without support for testing and refactoring. Features analysis, a language of ‘scientific unit testing’, is the visual and quantitative comparison of distributions of features (local geometric measures) sampled from ensembles of native and predicted conformations. To support features analysis I develop a features analysis tool–a modular database framework for extracting and managing sampled feature instance and an exploratory data analysis framework for rapidly comparing feature distributions. In supporting features analysis, the tool supports the creation, tuning, and assessment of computational models, improving protein prediction and design. I demonstrate the features analysis tool through 6 case studies with the Rosetta molecular modeling suite. The first three demonstrate the tool usage mechanics through constructing and checking models. The first evaluates bond angle restraint models when used with the Backrub local sampling heuristic. The second identifies and resolves energy function derivative discontinuities that frustrate gradient-based minimization. The third constructs a model for disulfide bonds. The second three demonstrate using the tool to evaluate and improve how models represent molecular structure. I focus on modeling H-bonds because of their geometric specificity and environmental dependence lead to complex feature distributions. The fourth case study develops a novel functional form for Sp2 acceptor H-bonds. The fifth fits parameters for a refined H-bond model. The sixth combines the refined model with an electrostatics model and harmonizes them with the rest of the energy function. Next, to facilitate assessing model improvements, I develop recovery tests that measure predictive accuracy by asking models to recover native conformations that have been partially randomized. Finally, to demonstrate that the features analysis and recovery test tools support improving protein prediction and design, I evaluated the refined H-bond model and electrostatics model with additional corrections from the Rosetta community. Based on positive results, I recommend a new standard energy function, which has been accepted by the Rosetta community as the largest systematic improvement in nearly a decade.

Omojokun, Olufisayo (2006)

“Interacting with Networked Devices”
Under the direction of Prasun Dewan
Electronic copy available

Networking technology has become applicable in domains beyond the conventional computer. One such domain currently receiving a significant amount of research attention is networking arbitrary devices such as TVs, refrigerators, and sensors. In this dissertation, we focus on the following question: how does an infrastructure deploy a user-interface for a single device or a composition of several ones? We identify and evaluate several deployment approaches. The evaluation shows the approach of automatically generating device user-interfaces ‘on the fly’ as particularly promising since it offers low programming/maintenance costs and high reliability. The approach, however, has the important limitation of taking a long time to create a userinterface. It is our thesis that it is possible to overcome this limitation and build graphical and speech user-interface generators with deployment times that are as low as the inherently fastest approach of locally loading predefined code. Our approach is based on user-interface retargeting and history-based generation. User-interface retargeting involves dynamically mapping a previously generated user-interface of one device to another (target) device that can share the user-interface. History-based generation predicts and presents just the content a user needs in a device’s user-interface based on the user’s past behavior. By filtering out unneeded content from a screen, it our thesis that history-based generation can also be used to address the issue of limited screen space on mobile computers. The above ideas apply to both single and multiple device user-interfaces. The multidevice case also raises the additional issue of how devices are composed. Current infrastructures for composing devices are unsuccessful in simultaneously providing highiv level and flexible support of all existing composition semantics. It is our thesis that it is possible to build an infrastructure that: (1) includes the semantics of existing high-level infrastructures and (2) provides higher-level support than all other infrastructures that can support all of these semantics. Such an infrastructure requires a composition framework that is both data and operation oriented. Our approach is based on the idea of patternbased composition, which uses programming patterns to extract data and operation information from device objects. This idea is used to implement several abstract algorithms covering the specific semantics of existing systems.

Omondi, Amos R. (1990)

“Architecture and Implementation of a Parallel Machine for Imperative and Nondeterministic Functional Languages”
Under the direction of David A. Plaisted

This dissertation is a study in computer architecture and implementation. The driving problem is the need to implement both nondeterministic functional languages and procedural languages on a single conventional architecture. The choice of the problem was determined by the increasing importance of suitable languages for parallel symbolic computation, and the search for efficient parallel architectures for such languages. This search is currently manifested in the many proposals and implementations of “fifth generation” architectures, which are targeted primarily at functional and logic programming languages. However, in spite of much work that has been done in these areas, functional languages are still not very widely used and much work remains to be done in implementing them. The implementation of such languages on conventional machines has been rejected in many places, and many of the architectures mentioned above represent radical departures from the well-developed von Neumann style, which has served so well for so long. Such departures are frequently justified either on the grounds that the bases of the von Neumann architecture inherently impose limitations on the implementations (and hence on the attainable efficiency and performance), or on the grounds that essential aspects of these languages call for architecture and implementation features that are absent in the von Neumann model, and whose absence dooms to inefficiency of poor performance an implementation of such a language on a conventional machine. We believe that for functional languages to achieve widespread use two things will have to happen. First, these languages will have to incorporate more nondeterministic features in order to compete with logic programming languages in the areas of symbolic computation. Second, they will have to be implemented on conventional machines, either on their own or integrated with procedural languages. We believe that the requirements of functional language features can be satisfied through careful architectural design. We also believe that the techniques that have been used to enhance the performance and efficiency of conventional machines have as yet not been exploited to their full potential, and that doing so will correct many of the perceived performance difficulties. This work is the result of an investigation into suitable architectures, and corresponding high-performance implementations.

Ordonez-Roman, Vicente (2015)

Language and Perceptual Categorization in Computational Visual Recognition

Under the direction of Tamara Berg

Electronic copy available

Computational visual recognition or giving computers the ability to understand images as well as humans do is a core problem in Computer Vision. Traditional recognition systems often describe visual content by producing a set of isolated labels, object locations, or by even trying to annotate every pixel in an image with a category. People instead describe the visual world using language. The rich visually descriptive language produced by people incorporates information from human intuition, world knowledge, visual saliency, and common sense that go beyond detecting individual visual concepts like objects, attributes, or scenes. Moreover, due to the rising popularity of social media, there exist billions of images with associated text on the web, yet systems that can leverage this type of annotations or try to connect language and vision are scarce.

In this dissertation, we propose new approaches that explore the connections between language and vision at several levels of detail by combining techniques from Computer Vision and Natural Language Understanding. We first present a data-driven technique for understanding and generating image descriptions using natural language, including automatically collecting a big-scale dataset of images with visually descriptive captions. Then we introduce a system for retrieving short visually descriptive phrases for describing some part or aspect of an image, and a simple technique to generate full image descriptions by stitching short phrases. Next we introduce an approach for collecting and generating referring expressions for objects in natural scenes at a much larger scale than previous studies. Finally, we describe methods for learning how to name objects by using intuitions from perceptual categorization related to basic-level and entry-level categories.

The main contribution of this thesis is in advancing our knowledge on how to leverage language and intuitions from human perception to create visual recognition systems that can better learn from and communicate with people.

Otaduy Tristan, Miguel A. (2004)

“6-DoF Haptic Rendering Using Contact Levels of Detail and Haptic Textures”
Under the direction of Ming C. Lin
Electronic copy available

Humans use tactile and force cues to explore the environment around them and to identify and manipulate objects. An ideal human-machine interface for virtual environments should empower the user to feel and orient objects by providing force feedback. The synthesis of force and torque feedback arising from object-object interaction, commonly referred to as six-degree-of-freedom (6-DoF) haptic rendering, can greatly benefit many applications involving dexterous manipulation and complex maneuvering of virtual objects. Examples of such applications include assembly and disassembly operations in rapid prototyping, endoscopic surgical training, and virtual exploration with limited visual feedback. However, existing techniques for 6-DoF haptic rendering are applicable only to relatively simple contact configurations or low-complexity models. In this dissertation, I propose a novel approach for 6-DoF haptic rendering that combines multiresolution representations, hierarchical collision detection algorithms, and perception-based force models. This approach enables 6-DoF haptic rendering of interaction between two polygonal models of high combinatorial complexity. I introduce contact levels of detail, a collision detection algorithm based on multiresolution hierarchies for performing contact queries between complex models at force update rates, by adaptively selecting the appropriate contact resolutions. I also present a new algorithm for 6-DoF haptic rendering of intricate interaction between textured surfaces, based on a perceptually inspired force model and the representation of the objects as low-resolution models with haptic textures. Finally, I derive a novel implicit formulation for computing rigid body dynamics with haptic interaction, and I integrate all these techniques together, thereby achieving stable and responsive 6-DoF haptic rendering.

Ott, David (2005)

“An Open Architecture for Transport-Level Coordination in Distributed Multimedia Application”
Under the direction of Ketan Mayer-Patel
Electronic copy available

Complex multimedia applications of the future will employ clusters of computing hosts and devices where single endpoint hosts once sufficed. Communication between clusters will require an increasing number of data flows as new media types and sophisticated modes of interactivity continue to be developed. With an increase in the number of data flows sharing the same forwarding path comes a need for coordinated use of network resources. Unfortunately, modern transport protocols like TCP, UDP, SCTP, TFRC, DCCP, etc. are designed to operate independently and lack mechanisms for sharing information with peer flows and coordinating data transport within the same application. In this dissertation, we propose an open architecture for data transport that supports the exchange of network state information, peer flow information, and application-defined information among flows sharing the same forwarding path between clusters. Called simply the Coordination Protocol (CP) , the scheme facilitates coordination of network resource usage across flows belonging to the same application, as well as aiding other types of coordination. We demonstrate the effectiveness of our approach by applying CP to the problem of multi-streaming in 3D Tele-immersion (3DTI). Laboratory results show that CP can be used to significantly increase transport synchrony among video streams while, at the same time, minimizing buffering delay, maintaining good network utilization, and exhibiting fairness to TCP traffic competing on the same forwarding path. Experiments on the Abilene backbone network verify these results in a scaled, uncontrolled environment.

Ouh-young, Ming (1990)

“Force Display in Molecular Docking”
Under the direction of Frederick P. Brooks, Jr.
Department of Computer Science Technical Report TR90-004
Electronic copy available

We have developed a real-time computer-graphics system that uses a master station of a remote manipulator system as a 6-D force and torque display. This manipulator system, a graphics engine (E&S PS300 graphics engine, or Fuchs’ Pixel-planes), stereoscopic display, and high-speed calculation of the interaction forces between a drug and its receptor site, constitute a tool for molecular docking. When the drug molecule is maneuvered by the user’s hand, the manipulator serves both as an input device with six degrees of freedom and as an output device for displaying force and torque. My work studied the creation, analysis, and evaluation of the illusion of feel, supplemented with visual illusion. My thesis is that adding force display to an interactive computer graphics system can significantly help in molecular docking problems in terms of task completion time. To demonstrate my hypothesis, I conducted two experiments. The first experiment tied six randomly located springs to both ends of a 6″ stick. The subjects (seven graduate students) tried to find the null-force position. The experimental results corroborated my hypothesis (p < 0.01). Performance gains by about a factor of two were observed. The second experiment simulated the interaction forces between the dihydrofolate reductase enzyme (600 atoms) and six drugs (40 to 60 atoms each). Twelve biochemists tried to dock the drugs into the enzyme including deforming the drugs by bond-twisting. The experimental results corroborated (p < 0.05) my hypothesis in the 6-D rigid-body manipulation component of the task. There was, however, no significant difference in the case of the one-degree-of-freedom bond-rotations component. Overall task-completion time with force-feedback was improved, but the difference was not significant. Limited case-by-case studies show that the subjects using the current ARM system are not only docking faster but also getting more precise docking results than those using the Ellipsoid algorithm (one of the best algorithms to date), both in the number of well-formed hydrogen bonds and in displacements.


Palmer, Daniel W. (1996)

“Efficient Execution of Nested Data-Parallel Programs”
Under the direction of Jan F. Prins

Data parallelism is the independent application of an operation to all elements of a data-aggregate. Nested data parallelism, a fully general form of data parallelism, permits arbitrary functions to be applied to nested data-aggregates (whose elements can themselves be aggregates). Though nested data parallelism is recognized as an important technique for expressing complex computations, it has proven difficult to execute in parallel. Blelloch and Sabot made key progress on this problem with the development of a technique calledflattening that generates parallel vector operations for a large but restricted set of nested data parallel programs. In our work, we formalize flattening as a series of program transformations, and then use this framework to extend the flattening technique to include two important classes of previously excluded programs: those with parallel index operations and those with very large memory requirements. We accommodate the first class of programs with a new algorithm for parallel indexing, called node-extended indexing. This algorithm allows flattening of programs that contain arbitrary parallel indexing operations and helps establish uniform source-level performance metrics for such programs. We report on the performance of the technique on various parallel platforms. Flattening realizes all available parallelism in a nested data-parallel program by trading execution time for storage space independent of machine characteristics. Thus a program in the second class contains available parallelism that exceeds the memory resources of the targeted architecture and therefore cannot execute. The technique of piecewise execution partially serializes a transformed program and reduces its parallelism to match the memory capacity of a target machine without creating a load-imbalance. Taken together, these two approaches enhance the flattening technique, making it applicable to a larger class of nested data-parallel programs.

Pan, Feng (2009)

“Efficient Algorithms in Analyzing Genomic Data”
Under the direction of Wei Wang
Electronic copy available

With the development of high-throughput and low-cost genotyping technologies, immense data can be cheaply and efficiently produced for various genetic studies. A typical dataset may contain hundreds of samples with millions of genotypes/haplotypes. In order to prevent data analysis from becoming a bottleneck, there is an evident need for fast and efficient analysis methods. My thesis focuses on the following two important genetic analyzing problems: 1) Genome-wide Association mapping. The goal of genome wide association mapping is to identify genes or narrow regions in the genome which have significant statistical correlations to the given phenotypes. The discovery of these genes offers the potential for increased understanding of biological processes affecting phenotypes such as body weight and blood pressure. 2) Sample selection for maximal Genetic Diversity. Given a large set of samples, it is usually more efficient to first conduct experiments on a small subset. Then the following question arises: What subset to use? There are many experimental scenarios where the ultimate objective is to maintain, or at least maximize, the genetic diversity within relatively small breeding populations. In my thesis, I developed efficient and effective algorithms to address these problems: 1) Phylogeny-based Genom-wide association mapping, TreeQA and TreeQA+. TreeQA uses local perfect phylogeny tree in genome wide analysis for genotype/phenotype association mapping. Samples are partitioned according to the sub-trees they belong to. The association between a tree and the phenotype is measured by some statistic tests. TreeQA+ inherits all the advantages of TreeQA. Moreover, it improves TreeQA by incorporating sample correlations into the association study. 2) Sample selection for maximal genetic diversity. In biallelic SNP Data, samples are selected based on their genetic diversity among a set of SNPs. Given a set of samples, the algorithms search for the minimum subset that retains all diversity (or a high percentage of diversity). For more general data (non-biallelic), information-theoretic measurements such as entropy and mutual information are used to measure the diversity of a sample subset. Samples are selected to maximize the original information retained.

Pan, Jia (2013)

“Efficient Configuration Space Construction and Optimization”
Under the direction of Dinesh Manocha

Configuration space is an important concept widely used in algorithmic robotics. Many applications in robotics, computer-aided design and related areas can be reduced to computational problems in terms of configuration space. In this dissertation, we address three main computational challenges related to configuration spaces: 1) how to efficiently compute an approximate representation of high-dimensional configuration spaces; 2) how to efficiently perform geometric, proximity, and motion planning queries in high-dimensional configuration spaces; 3) how to model uncertainty information in configuration spaces in terms of dealing with noisy sensor data.   We present new configuration space construction algorithms based on machine learning and geometric approximation techniques. These algorithms perform collision queries on many configuration samples as input and generate approximate representations, which can quickly converge to the exact configuration space. We highlight the efficiency of our algorithms on penetration depth computation and faster instance-based motion planning. We also present parallel GPU-based algorithms to accelerate the performance of optimization or search computations in configuration spaces. In particular, we design efficient GPU-based parallel k-nearest neighbor and parallel collision detection algorithms and use them to accelerate motion planning. In order to extend configuration space algorithms to handle noisy sensor data arising from real-world robotics applications, we model the uncertainty in configuration space by formulating the collision probability between noisy data. We use these algorithms to perform reliable motion planning for the PR2 robot.

Paramasivam, Muthukrishnan (1997)

“Instance-Based First-Order Methods Using Propositional Provers”
Under the direction of David A. Plaisted

Early refutational theorem proving procedures were direct applications of Herbrand’s version of the completeness theorem for first-order logic. These instance-based theorem provers created propositional instances of the first-order clauses to be proved unsatisfiable, and tested the instances on a propositional calculus prover. This methodology was not pursued for several decades as it was thought to be too slow. Moreover, the invention of the resolution inference rule changed the direction of theorem proving forever. The success of resolution was largely due to unification. Recently, unification has been incorporated in creating instances of first-order clauses. Furthermore, high-performance propositional calculus provers have been developed in the past few years. As a result, it is possible to realize effective instance-based first-order methods for several applications. We describe the design of instance-based methods for three different purposes. First, RRTP is a theorem prover based on the idea of replacing predicates with their definitions. We compare its performance with some state-of-the-art theorem provers. Second, we describe a proof procedure for Horn theories. The proof procedure creates instances of the input clauses by backward chaining and reasons forward among the instances to find the proof. Third, we describe the construction of a finite-model finder. Finally, we describe the application of the theorem prover and the model finder on an application–description logics. We show by empirical means that, contrary to general belief, theorem provers compare well with specialized application-specific techniques for description logics.

Parente, Peter (2008)

“Clique: Perceptually Based, Task Oriented Auditory Display for GUI Applications”
Under the direction of Gary Bishop
Electronic copy available

Screen reading is the prevalent approach for presenting graphical desktop applications in audio. The primary function of a screen reader is to describe what the user encounters when interacting with a graphical user interface (GUI). This straightforward method allows people with visual impairments to hear exactly what is on the screen, but with significant usability problems in a multitasking environment. Screen reader users must infer the state of on-going tasks spanning multiple graphical windows from a single, serial stream of speech. In this dissertation, I explore a new approach to enabling auditory display of GUI programs. With this method, the display describes concurrent application tasks using a small set of simultaneous speech and sound streams. The user listens to and interacts solely with this display, never with the underlying graphical interfaces. Scripts support this level of adaption by mapping GUI components to task definitions. Evaluation of this approach shows improvements in user efficiency, satisfaction, and understanding with little development effort. To develop this method, I studied the literature on existing auditory displays, working user behavior, and theories of human auditory perception and processing. I then conducted a user study to observe problems encountered and techniques employed by users interacting with an ideal auditory display: another human being. Based on my findings, I designed and implemented a prototype auditory display, called Clique, along with scripts adapting seven GUI applications. I concluded my work by conducting a variety of evaluations on Clique. The results of these studies show the following benefits of Clique over the state of the art for users with visual impairments (1-5) and mobile sighted users (6):

  1. Faster, accurate access to speech utterances through concurrent speech streams.
  2. Better awareness of peripheral information via concurrent speech and sound streams.
  3. Increased information bandwidth through concurrent streams.
  4. More efficient information seeking enabled by ubiquitous tools for browsing and searching.
  5. Greater accuracy in describing unfamiliar applications learned using a consistent, task-based user interface.
  6. Faster completion of email tasks in a standard GUI after exposure to those tasks in audio.
Pargas, Roy P. (1982)

“Parallel Solution of Elliptic Partial Differential Equations On a Tree Machine”
Under the direction of Gyula A. Mago
Electronic copy available

The numerical solution of elliptic partial differential equations (pde’s) can often be reduced to the solution of other relatively simple problems, such as solving tridiagonal systems of equations and low-order recurrence relations. This thesis describes elegant and efficient tree machine algorithms for solving a large class of these simpler problems, and then uses these algorithms to obtain numerical solutions of elliptic partial pde’s using methods of finite differences. The tree machine model on which this work is based contains data only in the leaf cells of a complete binary tree of processors; one leaf cell typically holds all information pertinent to one point of the rectangular mesh of points used by the method of finite differences. An algorithm is described for communication among leaf cells using shortest paths; other algorithms are exhibited that find the first nterms of the solution to several classes of recurrence expressions in O(log n) time. The communication and recurrence expression tree algorithms are used to describe algorithms to solve (n x n) tridiagonal linear systems of equations. A number of direct methods are shown to require O(log n) time, whereas other direct methods require O((logn)²) time. Iterative methods are shown to require O(log n) time per iteration. The asymptotic complexity of both direct and iterative methods implemented on sequential, vector, array, and tree processors are compared. The tridiagonal linear system solvers and the communication algorithms are used to describe algorithms to solve (n² x n²) block-tridiagonal linear systems iteratively. Both point iterative and block iterative methods are shown to require O(n) time per iteration. Alternating direction implicit methods require O(n log n) time per iteration. The asymptotic complexity of methods implemented on sequential, vector, array, and tree processors are again compared.

Park, Chonhyon (2016)

High-DOF Motion Planning in Dynamic Environments using Trajectory Optimization

Under the direction of Dinesh Manocha

Electronic copy available

Motion planning is an important problem in robotics, computer-aided design, and simulated environments. Recently, robots with a high number of controllable joints are increasingly used for different applications, including in dynamic environments with humans and other moving objects. In this thesis, we address three main challenges related to motion planning algorithms for high-DOF robots in dynamic environments: 1) how to compute a feasible and constrained motion trajectory in dynamic environments; 2) how to improve the performance of realtime computations for high-DOF robots; 3) how to model the uncertainty in the environment representation and the motion of the obstacles. 

We present a novel optimization-based algorithm for motion planning in dynamic environments.

We model various constraints corresponding to smoothness, as well as kinematics and dynamics bounds as cost function, and perform stochastic trajectory optimization to compute feasible high-dimensional trajectories. In order to handle arbitrary dynamic obstacles, we use a replanning framework that interleaves planning with execution. We also parallelize our approach on multiple CPU or GPU cores to improve the performance and perform realtime computations. 

In order to deal with the uncertainty of dynamic environments, we present an efficient probabilistic collision detection algorithm that takes into account noisy sensor data.

We predict the future obstacle motion as Gaussian distributions, and compute the bounded collision probability between a high-DOF robot and obstacles. We highlight the performance of our algorithms in simulated environments as well as with a 7-DOF Fetch arm.

Park, Eunbyung (2019)

Learning to adapt from few examples

Under the direction of Alex Berg

Electronic copy available

Despite huge progress in artificial intelligence, the ability to quickly learn from few examples is still far short of that of a human. With the goal of building machines with this capability, learning-to-learn or meta-learning has begun to emerge with promising results. I present the effectiveness and techniques that improve existing meta-learning methods in the context of visual object tracking, few-shot classification, and few-shot reinforcement learning setup. 

The visual object trackers that use online adaptation are improved. The core contribution is an offline meta-learning-based method to adjust the initial deep networks used in online adaptation-based tracking. The meta learning is driven by the goal of deep networks that can quickly be adapted to robustly model a particular target in future frames. Ideally the resulting models focus on features that are useful for future frames, and avoid overfitting to background clutter, small parts of the target, or noise. Experimental results on standard benchmarks, OTB2015 and VOT2016, show that the meta-learned trackers improve speed, accuracy, and robustness.

I propose to learn curvature information that can achieve better generalization and fast model adaptation. Based on the model-agnostic meta-learner (MAML), learning to transform the gradients in the inner optimization such that the transformed gradients achieve better generalization performance to a new task. For training large scale neural networks, the decomposition of the curvature matrix into smaller matrices are proposed and this capture the dependencies of the model’s parameters with a series of tensor products. Experimental results show significant improvements on classification tasks and promising results on reinforcement learning tasks. Finally, I present an analysis that explains better generalization performance with the meta-trained curvature.

Parker, Erin (2004)

“Analyzing the Behavior of Loop Nests in the Memory Hierarchy: Methods, Tools and Applications”
Under the direction of Siddhartha Chatterjee
Electronic copy available

Processor speeds are improving at a much faster rate than the speeds of accessing main memory. As a result, data access time dominates the execution times of many programs. Understanding the behavior of programs executing in a memory hierarchy is therefore an important part of improving program performance. This dissertation describes an analytical framework for understanding the behavior of loop-oriented programs executing in a memory hierarchy. The framework has three components: 1) an alternative classification of cache misses that makes it possible to obtain the exact cache behavior of a sequence of program fragments by combining the cache behavior of the individual fragments; 2) the use of Presburger arithmetic to model data access patterns and describe events such as cache misses; and 3) algorithms exploiting the connection among Presburger arithmetic, automata theory, and graph theory to produce exact cache miss counts. The analytical framework presented in this dissertation goes beyond existing analytical frameworks for modeling cache behavior: it handles set-associative caches, data cache and translation lookaside buffer (TLB) misses, imperfect loop nests, and nonlinear array layouts in an exact manner. Experiments show both the framework’s value in the exploration of new memory system designs and its usefulness in guiding code and data transformations for improved program performance.

Parris, Mark A. (2001)

“Class-Based Thresholds: Lightweight Active Router-Queue Management for Multimedia Networking”
Under the direction of Kevin Jeffay
Electronic copy available

In the best-effort Internet, congestion can cause severe degradations in the performance of both reliable data transfer flows and multimedia flows. These reliable data transfers are typically based on TCP, a responsive protocol. Responsive protocols are those that respond to congestion by reducing the rate at which they transmit data. This behavior is desirable because when all traffic is responsive, the traffic sources cooperate to ameliorate congestion by arriving at an aggregate operating point where the generated load matches the available capacity. In contrast, multimedia applications are typically based on unresponsive protocols, such as UDP, where the transmission rate is determined by the application, independent of network conditions. There exists a fundamental tension between these responsive and unresponsive protocols. When both traffic types are present during periods of congestion, unresponsive traffic maintains its load, forcing responsive traffic to reduce its load. Consequently, unresponsive traffic may benefit from this behavior and consume more than its fair share of the available bandwidth while responsive flows receive less than their fair share and suffer poor throughput. In the extreme, congestion collapse is possible. This offers a disincentive for using responsive protocols. Recent proposals have attempted to address this problem by isolating responsive traffic from the effects of unresponsive traffic. They achieve this goal by identifying and severely constraining all unresponsive traffic. We take the position that some unresponsive traffic, specifically multimedia, is necessarily unresponsive. Maintaining the fidelity of the media stream places bounds on minimum levels of throughput and maximum tolerable latency. Since responsive protocols focus on reducing the transmission rate to match the available capacity independent of application-level concerns, these bounds may be difficult or impossible to meet with responsive protocols. As such, we argue that in addition to isolating responsive traffic, multimedia should not be unduly constrained and must also be isolated from the effects of other unresponsive traffic. In this dissertation we propose and evaluate a novel algorithm to allocate network bandwidth by allocating buffer space in a router’s queue. This algorithm is called Class-Based Thresholds (CBT). We define a set of traffic classes: responsive (i.e., TCP), multimedia, and other. For each class a threshold is specified that limits the average queue occupancy by that class. In CBT, when a packet arrives at the router it is classified into one of these classes. Next, the packet is enqueued or discarded depending on that class’s recent average queue occupancy relative to the class’s threshold value. The ratio between the thresholds determines the ratio between the bandwidth available to each class on the outbound link. CBT and other router queue management algorithms from the literature (FIFO, RED, and FRED) are implemented in a FreeBSD router and evaluated in a laboratory network using several combinations of TCP, multimedia, and generic unresponsive traffic. We show that CBT can be tuned to offer prescribed levels of performance for each traffic class. We present analysis that predicts network performance when using CBT based on initial configuration parameters, explain how this analysis can be reversed to derive optimal parameter settings given desired network performance, and empirically demonstrate the accuracy of this analysis. We present experimental data that contributes to our understanding of how existing queue management schemes perform, articulate relationships between parameters and performance metrics, and determine optimal parameter settings for each algorithm under various traffic mixes. We empirically demonstrate that CBT effectively isolates TCP while providing better-than-best-effort service for multimedia by comparing CBT’s performance to the optimal performance for other algorithms. Finally, we show CBT provides better protection for TCP than RED and FIFO and better multi-media performance than RED, FIFO, and FRED.

Partain, William D. (1989)

“Graph Reduction Without Pointers”
Under the direction of Gyula A. Mago
Department of Computer Science Technical Report TR89-045
Electronic copy available

Graph reduction is one way to overcome the exponential space blow-ups that simple normal-order evaluation of the lambda-calculus is likely to suffer. The lambda-calculus underlies lazy functional programming languages, which offer hope for improved programmer productivity based on stronger mathematical underpinnings. Because functional languages seem well-suited to highly-parallel machine implementations, graph reduction is often chosen as the basis for these machines’ designs. Inherent to graph reduction is a commonly-accessible store holding nodes referenced through “pointers,” unique global identifiers; graph operations cannot guarantee that nodes directly connected in the graph will be in nearby store locations. This absence of locality is inimical to parallel computers, which prefer isolated pieces of hardware working on self-contained parts of a program. In this dissertation, I develop an alternate reduction system using “suspensions” (delayed substitutions), with terms represented as trees and variables by their binding indices (de Bruijn numbers). Global pointers do not exist and all operations, except searching for redexes, are entirely local. The system is provably equivalent to graph reduction, step for step. I show that if this kind of interpreter is implemented on a highly-parallel machine with a locality-preserving, linear program representation and fast scan primitives (an FFP Machine is an appropriate architecture) then the interpreter’s worst-case space complexity is the same as that of a graph reducer (that is, equivalent sharing), and its time complexity falls short on only one unimportant case. On the other side of the ledger, graph operations that involve chaining through many pointers are often replaced with a single associative-matching operation. What is more, this system has no difficulty with free variables in redexes and is good for reduction to full beta-normal form. These results suggest that non-naive tree reduction is an approach to support functional programming that a parallel-computer architect should not overlook.

Patil, Sachin (2013)

“Closed-Loop Planning and Control of Steerable Medical Needles”
Under the direction of Ron Alterovitz

Steerable needles have the potential to increase the effectiveness of needle-based clinical procedures such as biopsy, drug delivery, and radioactive seed implantation for cancer treatment. These needles can trace curved paths when inserted into tissue, thereby increasing maneuverability and targeting accuracy while reaching previously inaccessible targets that are behind sensitive or impenetrable anatomical regions. Guiding these flexible needles along an intended path requires continuously inserting and twisting the needle at its base, which is not intuitive for a human operator. In addition, the needle often deviates from its intended trajectory due to factors such as tissue deformation, needle-tissue interaction, noisy actuation and sensing, modeling errors, and involuntary patient motions. These challenges can be addressed with the assistance of robotic systems that automatically compensate for these perturbations by performing motion planning and feedback control of the needle in a closed-loop fashion under sensory feedback. We present two approaches for efficient closed-loop guidance of steerable needles to targets within clinically acceptable accuracy while safely avoiding sensitive or impenetrable anatomical structures. The first approach uses a fast motion planning algorithm that unifies planning and control by continuously replanning, enabling correction for perturbations as they occur. We evaluate our method using a needle steering system in phantom and ex vivo animal tissues. The second approach integrates motion planning and feedback control of steerable needles in highly deformable environments. We demonstrate that this approach significantly improves the probability of success compared to prior approaches that either consider uncertainty or deformations but not both simultaneously. We also propose a data-driven method to estimate parameters of stochastic models of steerable needle motion. These models can be used to create realistic medical simulators for clinicians wanting to train for steerable needle procedures and to improve the effectiveness of existing planning and control methods. This dissertation advances the state of the art in planning and control of steerable needles and is an important step towards realizing needle steering in clinical practice. The methods developed in this dissertation also generalize to important applications beyond medical needle steering, such as manipulating deformable objects and control of mobile robots.

Peck, Tabitha (2010)

“Redirected Free Exploration with Distractors: A Large-Scale Real-Walking Locomotion Interface”
Under the direction of Mary Whitton & Henry Fuchs
Electronic copy available

Immersive VEs enable user controlled interactions within the environment such as head-controlled point-of-view and user-controlled locomotion. In the real world people usually locomote by walking; walking is simple and natural, and enables people not only to move between locations, but also to develop cognitive maps, or mental representations, of environments. People navigate every day in the real world without problem, however users navigating VEs often become disoriented and frustrated, and find it challenging to transfer spatial knowledge acquired in the VE to the real world. In this dissertation I develop and demonstrate the effectiveness of a new locomotion interface, Redirected Free Exploration with Distractors (RFED) that enables people to freely walk in large scale VEs. RFED is the combination of distractors, objects, sounds, or combinations of objects and sounds in the VE that encourage people to turn their heads, and redirection, making the user turn herself by interactively and imperceptibly rotating the virtual scene about her while she is turning her head. I demonstrate through user studies that compare RFED to a real-walking locomotion interface that RFED does not diminish user ability to navigate. I further demonstrate that users navigate better in RFED than with joystick and walking-in-place locomotion interfaces. Additionally, RFED does not significantly increase simulator sickness when compared to real walking, walking-in-place, and joystick interfaces

Pfannenstiel, Wolf (2000)

“Piecewise Execution of Nested Data Parallel Programs” (Technische Universitat Berlin)
Under the direction of S. Jaehnichen, Technische UniversitäBerlin, and Jan F. Prins
(No UNC-Chapel Hill library copy)

Nested data parallelism is an expressive high-level parallel programming model. Nested data-parallel languages are characterized by nested aggregate types, and the ability to specify nested data- parallel operations. The flattening transformation is used to compile nested data-parallel programs to use a library of simple vector operations. However, since all parallelism is preserved, the vector operations lead to high space requirements as well as high bandwidth requirements. In this thesis, we propose using piecewise execution as a way to reduce space and bandwidth requirements. Piecewise execution breaks down the monolithic view of vector operations. Instead, each operation processes pieces of only constant size at any one time. Operations must therefore be activated multiple times in order to consume their complete input and produce all their output. In the optimal case, the whole computation can work on pieces such that, for each vector, only a piece of constant size is stored in the memory at any one time. We propose two piecewise evaluation strategies featuring opposite ways of scheduling the activations of operations. We develop a cost- effective parallel implementation for piecewise execution based on multiple threads. The implementation targets distributed-memory architectures but can easily take advantage of shared-memory architectures. Piecewise execution can be seamlessly combined with other execution modes such as full vector operations. We conduct an analysis of the space bounds of piecewise execution and show that it can actually increase the space requirements of programs. It is, in general, impossible to predict the space behavior of piecewise execution statically. However, we identify critical program structures and conditions for the use of piecewise execution. The results can be exploited for optimizations in the compilation process. We show by case studies that piecewise execution can improve both space and time performance of typical programs. Improved time performance is achieved by choosing a piece size that matches the processors’ data-cache size. We show how it can be combined with loop-fusion techniques, which are probably the most successful optimizations for collection-based programs. Unlike local optimizations like loop fusion, piecewise execution can be applied across boundaries such as (recursive) function calls and communication operations. We suggest a number of optimizations for piecewise execution and demonstrate their usefulness in the case studies. Finally, we identify and discuss a number of potential application areas other than nested data parallelism for piecewise execution.

Pool, Jeff (2012)

“Energy-Precision Tradeoffs in the Graphics Pipeline”
Under the direction of Anselmo Lastra and Montek Singh
Electronic copy available

The energy consumption of a graphics processing unit (GPU) is an important factor in its design, whether for a server, desktop, or mobile device. Mobile products, such as smart phones, tablets, and laptop computers, rely on batteries to function; the less the demand for power is on these batteries, the longer they will last before needing to be recharged. GPUs used in servers and desktops, while not dependent on a battery for operation, are still limited by the eciency of power supplies and heat dissipation techniques. In this dissertation, I propose to lower the energy consumption of GPUs by reducing the precision of oating-point arithmetic in the graphics pipeline and the data sent and stored on- and o-chip.

The key idea behind this work is twofold: energy can be saved through a systematic and targeted reduction in the number of bits 1) computed and 2) communicated. Reducing the number of bits computed will necessarily reduce either the precision or range of a oating point number. I focus on saving energy by way of reducing precision, which can exploit the over-provisioning of bits in many stages of the graphics pipeline. Reducing the number of bits communicated takes several forms. First, I propose enhancements to existing compression schemes for o-chip buers to save bandwidth. I also suggest a simple extension that exploits unused bits in reduced-precision data undergoing compression. Finally, I present techniques for saving energy in on-chip communication of reduced-precision data.

By designing and simulating variable-precision arithmetic circuits with promising energy versus precision characteristics and tradeos, I have developed an energy model for GPUs. Using this model and my techniques, I have shown that signicant savings (up to 70% in computation in the vertex and pixel shader stages) are possible by reducing the precision of the arithmetic. Further, my compression approaches have enabled improvements of 1.26x over past work, and a general-purpose compressor design has achieved bandwidth savings of 34%, 87%, and 65% for color, depth, and geometry data, respectively, which is competitive with past work. Lastly, an initial exploration in signal gating unused lines in on-chip buses has suggested savings of 13{48% for the tested applications’ trac from a multiprocessor’s register le to its L1 cache.

Popelas, Judy M. (1983)

“A Case Grammar Approach to Error Correction in Computer Programs”
Under the direction of Peter Calingaert

This dissertation describes a programming language grammar and a corresponding analysis method based on the use of attribute and significant keyword information. The grammar is referred to as a case grammar, and the analysis method as case analysis. Their use results in a significant ability to correct syntactic errors in computer programs. A case grammar defines a programming language in terms of its objects, and its objects in terms of their attributes. It consists of two parts. The first part, the access form dictionary, defines ways of accessing objects in the language in terms of other objects in the language. For example, a Pascal set literal can be accessed by specifying the zero or more set elements that compose it. The second part of a case grammar, the verb dictionary, specifies the objects required by each verb of the language, and the functional or case relationship of each object to its verb. The Pascal assignment verb, for example, requires two objects, the first a recipient and the second a donor. The donor case relationship is explicitly indicated by the terminal symbol ‘:=’ that precedes the donor object. Entries in the verb and access form dictionaries are referred to as case and form frames, respectively. Case grammars are capable of a complete syntactic definition of any context-free programming language. The case analysis algorithm retrieves appropriate case or form frames, puts them on a stack, and matches the next object in the program text to an object requirement in the topmost frame. Frames are often established absolutely, on the basis of significant programming language tokens (keywords) in the program text. In matching an object to a particular object requirement in a frame, the analysis algorithm considers the object’s attributes, as well as its position, punctuation context, and preceding keyword, it any. The case analysis algorithm is shown to work correctly for correct program text. The error correction performance of the case analysis algorithm compares favorably with the performance of other recently proposed error correction schemes.

Popescu, Voicu S. (2001)

“Forward Rasterization: A Reconstruction Algorithm for Image-Based Rendering”
Under the direction of Anselmo A. Lastra
Department of Computer Science Technical Report TR96-019
Electronic copy available

In a recent alternative research path for interactive 3D graphics, the scene to be rendered is described with images. Image-based rendering (IBR) is appealing since natural scenes, which are very difficult to model conventionally, can be acquired semi-automatically using cameras and other devices. Also IBR has the potential to create high-quality output images if the rendering stage can preserve the quality of the reference images (photographs). One promising IBR approach enhances the images with per-pixel depth. This allows reprojecting or warping the color-and-depth samples from the reference image to the desired image. Image-based rendering by warping (IBRW) is the focus of this dissertation. In IBRW, simply warping the samples does not suffice for high-quality results because one must reconstruct the final image from the warped samples. This dissertation introduces a new reconstruction algorithm for IBRW. This new approach overcomes some of the major disadvantages of previous reconstruction methods. Unlike the splatting methods, it guarantees surface continuity and it also controls aliasing using a novel filtering method adapted to forward mapping. Unlike the triangle-mesh method, it requires considerably less computation per input sample, reducing the rasterization-setup cost by a factor of four. The algorithm can be efficiently implemented in hardware and this dissertation presents the WarpEngine, a hardware architecture for IBRW. The WarpEngine consists of several identical nodes that render the frame in a sort-first fashion. Sub-regions of the depth images are processed in parallel in SIMD fashion. The WarpEngine architecture promises to produce high-quality high-resolution images at interactive rates. This dissertation also analyzes the suitability of our approach for polygon rendering and proposes a possible algorithm for triangles. Finding the samples that must be warped to generate the current frame is another very challenging problem in IBRW. The dissertation introduces the vacuum-buffer sample-selection method, designed to ensure that the set of selected samples covers every visible surface.

Pozefsky, Diane P. (1979)

“Building Efficient Pass-Oriented Attribute Grammar Evaluators”
Under the direction of Mehdi Jazayeri

This dissertation explores various avenues toward constructing efficient attribute grammar evaluators automatically. It begins by exploring operations applicable to all attribute grammars: the identification and elimination of useless and local attributes, the removal of erasing productions from an attribute grammar, and methods of improving the set of semantic rules associated with a particular production. The work then turns explicitly to the family of pass-oriented attribute grammar evaluators, of which Eochmann’s left-to-right evaluator and Jazayeri’s Alternating Semantic Evaluator are the most familiar members. The family is studied in its most general form and a subfamily is identified whose evaluators can be produced easily by a single evaluator generator. Along the way, a procedure is developed for ordering the attributes of the same nonterminal. Methods are then explored to improve the time and space requirements of the member evaluators. Improvements are made in the speed of the evaluators by eliminating the parse tree and controlling evaluation by files of calls. This reduces execution time to its minimum, the amount of time required to evaluate all attribute occurrences. Storage improvements are made by identifying the useful life span of attributes, storing on a stack those that are only needed for a single pass, and dynamically allocating and freeing others. Further, multiple attributes with identical values can share copies of the data; this occurs when attributes are assigned values in transfer rules. Finally, profiles of attribute grammars for real languages are given to support the value of many of the features presented.

Pozefsky, Mark (1977)

“Programming in Reduction Languages”
Under the direction of Gyula A. Mago

Reduction languages are a complete, closed applicative language with several elegant mathematical properties. This dissertation demonstrates that Reduction languages can be viewed as practical user languages with respect to a particular machine model. The expression orientation and absence of variables force a structured, top-down, modularized approach to algorithm design. Functional, recursive, and combinatory programming techniques make proper program decomposition natural. Reduction languages can range from very low to very high level languages depending on the set of primitives used. No control structures are built-in so the programmer has the obligation and the opportunity to define the control flows most beneficial to his algorithm. A single syntactic entity, the sequence, simulates most existing information structures (e.g., arrays, trees), and obviates some other structures (e.g., linked lists). Unbounded parallelism, associative referencing, and implicit storage management decrease the bookkeeping burdens on the programmer, allowing him to concentrate on the high level structure of his program. Testing and debugging are simplified because all data are literal, language primitives are so powerful, and sequencing control is simplified. Independence of expressions makes isolation and correction of errors easier.

Prastawa, Marcelinus (2007)

“An MRI Segmentation Framework for Brains with Anatomical Deviations”
Under the direction of Guido Gerig
Electronic copy available

The segmentation of brain Magnetic Resonance (MR) images, where the brain is partitioned into anatomical regions of interest, is a notoriously difficult problem when the underlying brain structures are influenced by pathology or are undergoing rapid development. This dissertation proposes a new automatic segmentation method for brain MRI that makes use of a model of a homogeneous population to detect anatomical deviations. The chosen population model is a brain atlas created by averaging a set of MR images and the corresponding segmentations. The segmentation method is an integration of robust parameter estimation techniques and the Expectation-Maximization algorithm. In clinical applications, the segmentation of brains with anatomical deviations from those commonly observed within a homogeneous population is of particular interest. One example is provided by brain tumors, since delineation of the tumor and of any surrounding edema is often critical for treatment planning. A second example is provided by the dynamic brain changes that occur in newborns, since study of these changes may generate insights into regional growth trajectories and maturation patterns. Brain tumor and edema can be considered as anatomical deviations from a healthy adult population, whereas the rapid growth of newborn brains can be considered as an anatomical deviation from a population of fully developed infant brains. A fundamental task associated with image segmentation is the validation of segmentation accuracy. In cases in which the brain deviates from standard anatomy, validation is often an ill-defined task since there is no knowledge of the ground truth (information about the actual structures observed through MRI). This dissertation presents a new method of simulating ground truth with pathology that facilitates objective validation of brain tumor segmentations. The simulation method generates realistic-appearing tumors within the MRI of a healthy subject. Since the location, shape, and volume of the synthetic tumors are known with certainty, the simulated MRI can be used to objectively evaluate the accuracy of any brain tumor segmentation method.

Pratapa, Srihari (2019)

Efficient Random Access Compression Schemes for Image-Based Representations

Under the direction of Dinesh Manocha

Electronic copy available

In the last two decades, a new class of rendering methods called image-based rendering has emerged as an alternative to traditional geometry-based rendering methods. Image-based rep- representations require an intensive amount of image data, making compression a critical issue. In particular, we need efficient compression schemes that map well to the current graphics hardware and provide random access capability for interactive applications. 

In this dissertation, we present new, efficient random-access compression schemes for three different image-based representations. Our first compression scheme, MPTC, introduces the idea of transcoding compressed video data directly to compressed textures such that the decoded output can be rendered using commodity texture mapping hardware. Our method improves rendering speeds up to an order of magnitude on mobile devices. Next, we propose two novel compression schemes, RLFC and HMLFC, for encoding light field images. In the RLFC algorithm, we describe a new hierarchical approach that provides progressive streaming and decoding capabilities that enable interactive rendering of light fields. HMLFC combines two different approaches for encoding light fields, motion compensation and hierarchical approaches to capture redundancies in a hybrid fashion; the compression rates with the hybrid approach improve by a factor of 2 − 5× over the underlying motion and hierarchical schemes. Our third compression method, RANDM, deals with compressing depth maps using a range-partitioning and a global dictionary. RANDM is the first compression scheme that enables random access decoding for depth maps and achieves compression rates similar to the existing schemes. Each of these three methods allow compressed representation of image data in memory while rendering and selectively decoding required data quickly during run-time. 

We also present a novel, deep learning-based approach (neural networks), TexNN, for predicting the right set of encoding parameters for fast encoding of textures into compressed texture formats. Our method offers up to a magnitude of speed-up compared to prior compression algorithms while maintaining similar compression quality.

Price, True (2019)

Scene Reconstruction Beyond Structure-from-Motion and Multi-view Stereo

Under the direction of Jan Michael-Frahm

Electronic copy available

 Image-based 3D reconstruction has become a robust technology for recovering accurate and realistic models of real-world objects and scenes. A common pipeline for 3D reconstruction is to first apply Structure-from-Motion (SfM), which recovers relative poses for the input images and sparse geometry for the scene, and then apply Multi-view Stereo (MVS), which estimates a dense depth map for each image. While this two-stage process is quite effective in many 3D modeling scenarios, assumptions about the scene and the input imagery limit what can be reconstructed. This dissertation focuses on three particular scenarios where the SfM+MVS pipeline fails and introduces new approaches to accomplish each reconstruction task.

First, I introduce a novel method to recover dense surface reconstructions of endoscopic video. In this setting, SfM can generally provide sparse surface structure, but the lack of surface texture as well as complex, changing illumination often causes MVS to fail. To overcome these difficulties, I introduce a method that utilizes SfM both to guide surface reflectance estimation and to regularize shading-based depth reconstruction. I also introduce models of reflectance and illumination that improve the final result.

Second, I introduce an approach for augmenting 3D reconstructions from large-scale Internet photo-collections by recovering the 3D position of transient objects — specifically, people — in the input imagery. Since no two images can be assumed to capture the same person in the same location, the typical triangulation constraints enjoyed by SfM and MVS cannot be directly applied. I introduce an alternative method to approximately triangulate people who stood in similar locations, aided by a height distribution prior and visibility constraints provided by SfM. The scale of the scene, gravity direction, and per-person ground-surface normals are also recovered.

Finally, I introduce the concept of using crowd-sourced imagery to create living 3D reconstructions — visualizations of real places that include dynamic representations of transient objects. A key difficulty here is that SfM+MVS pipelines often poorly reconstruct ground surfaces given Internet images, which typically capture the ground at very oblique angles, if at all. To address this, I introduce a volumetric reconstruction approach that leverages scene scale and person placements. Crowd simulation is then employed to add virtual pedestrians to the space and bring the reconstruction “to life.”

Prokop, Jan S. (1969)

“An Investigation of the Effects of Computer Graphics on Executive Decision Making in an Inventory Control Environment”
Under the direction of Frederick P. Brooks, Jr.

The central question of this dissertation is whether an objective ranking of utility to the decision maker can be assigned to the form of the presentation to him. In this instance we are interested in whether an executive decision can be reached earlier, faster or more consistently with a computer-driven display device than with the more customary printed material. An experimental group of eighteen management-level subjects with extensive experience in inventory control was assembled for a two-week short course in advanced inventory management techniques. During the short course, the subjects were exposed to simulated results from computer application of certain inventory control policies to a hypothetical inventory system handling n items. This system is faced with a certain randomly derived set of orders, price changes, replenishments of stock and other transactions. The results of the simulation were presented on both printer paper and on a cathode ray tube display device. For each method of presentation, results were represented in both tabular and graphical form. When a substantial part of the course had transpired, the subjects were asked to evaluate the results of simulating two inventory systems, using printer output for one evaluation and the cathode ray tube display device for the other evaluation. By means of a Latin square design and rank order statistics these evaluations were inspected to determine if experienced decision-makers using a display device could reach a decision that was earlier, faster or more consistent than a decision reached by means of printer output. The results indicated, with very high statistical significance, that a decision could be made faster with a display device, and with high significance that a decision could be made on the basis of less information by means of the display device. Other results pointed to decisions that were more consistent for the individual and which tended more to agree with the rest of the group when display device techniques were used for the evaluation of the inventory systems.

Puff, Derek T. (1995)

“Human vs. Vision Model Performance for Two Medical Image Estimation Tasks” (Biomedical Engineering, UNC-Chapel Hill)
Under the direction of Stephen M. Pizer
Electronic copy available

A computed method for measuring medical image quality would allow fast and economical evaluation of image acquisition and display systems without the need for arduous, expensive human observer experiments. It would help such a method to be predictive of human assessment if it reflected the principles thought to govern the operation of the visual system. This dissertation research implemented and tested the accuracy of a measure of medical image quality that incorporates a model of human vision in a simulation of human image interpretation. It was hypothesized that the model, by performing in a way that reflected the inherent capabilities and limitations of a human, would be predictive of human performance as physical properties of the image varied. The core model of shape perception, a theory for the representation of objects that may serve as a fundamental perceptual basis for a number of medical image interpretation tasks, was applied in computing estimates of the depth of a vessel stenosis in an angiogram and the position of a radiation treatment field relative to the spinal cord in a portal image. Parameters of those imaging systems that have significant effects on the physical characteristics of the images, such as the amount of imaging system blur or the extent of contrast-enhancing processing, were systematically varied. Model and human task performance was studied as a function of the parameters in order to assess the extent to which the model predicted the human results. In most instances, the analysis suggested that the conformance of the model and human data was not sufficient to allow use of the visual model as proposed. The conclusion explores the potential factors in those discrepancies and reiterates the claim that image quality assessments based upon fundamental principles of visual perception might eventually be utilized successfully for medical image interpretation tasks.


Rademacher, Pablo M. (2003)

“Measuring the Perceived Visual Realism of Images”
Under the direction of Gary Bishop
Electronic copy available

One of the main goals of computer graphics research is to develop techniques for creating images that look real – i.e., indistinguishable from photographs. Most existing work on this problem has focused on image synthesis methods, such as the simulation of the physics of light transport and the reprojection of photographic samples. However, the existing research has been conducted without a clear understanding of how it is that people determine whether an image looks real or not real. There has never been an objectively tested, operational definition of realism for images, in terms of the visual factors that comprise them. If the perceptual cues behind the determination of realism were understood, then rendering algorithms could be developed to directly target these cues. This work introduces an experimental method for measuring the perceived visual realism of images, and presents the results of a series of controlled human participant experiments. These experiments investigate the following visual factors: shadow softness, surface smoothness, number of objects, mix of object shapes, and number of light sources. The experiments yield qualitative and quantitative results, confirm some common assertions about realism, and contradict others. They demonstrate that participants untrained in computer graphics converge upon a common interpretation of the term real, with regard to images. The experimental method can be performed using either photographs or computer-generated images, which enables the future investigation of a wide range of visual factors.

Raguram, Rahul (2013)

“Efficient Algorithms for Robust Estimation”
Under the direction of Jan-Michael Frahm and Marc Pollefeys

One of the most commonly encountered tasks in computer vision is the estimation of model parameters from image measurements. This scenario arises in a variety of applications — for instance, in the estimation of geometric entities, such as camera pose parameters, from feature matches between images. The main challenge in this task is to handle the problem of outliers — in other words, data points that do not conform to the model being estimated. It is well known that if these outliers are not properly accounted for, even a single outlier in the data can result in arbitrarily bad model estimates. Due to the widespread prevalence of problems of this nature, the field of robust estimation has been well studied over the years, both in the statistics community as well as in computer vision, leading to the development of popular algorithms like Random Sample Consensus (RANSAC). While recent years have seen exciting advances in this area, a number of important issues still remain open. In this dissertation, we aim to address some of these challenges. The main goal of this dissertation is to advance the state of the art in robust estimation techniques by developing algorithms capable ofefficiently and accurately delivering model parameter estimates in the face of noise and outliers. To this end, the first contribution of this work is in the development of a coherent framework for the analysis of RANSAC-based robust estimators, which consolidates various improvements made over the years. In turn, this analysis leads naturally to the development of new techniques that combine the strengths of existing methods, and yields high-performance robust estimation algorithms, including for real-time applications. A second contribution of this dissertation is the development of an algorithm that explicitly characterizes the e?ects of estimation uncertainty in RANSAC. This uncertainty arises from smallscale measurement noise that a?ects the data points, and consequently, impacts the accuracy of model parameters. We show that knowledge of this measurement noise can be leveraged to develop an inlier classification scheme that is dependent on the model uncertainty, as opposed to a fixed inlier threshold, as in RANSAC. This has the advantage that, given a model with associated uncertainty, we can immediately identify a set of points that support this solution, which in turn leads to an improvement in computational efficiency. Finally, we have also developed an approach to addresses the issue of the inlier threshold, which is a user-supplied parameter that can vary depending on the estimation problem and the data being processed. Our technique is based on the intuition that the residual errors for “good” models are in some way consistent with each other, while “bad” models do not exhibit this consistency. In other words, looking at the relationship between subsets of models can reveal useful information about the validity of the models themselves. We show that it is possible to efficiently identify this consistent behaviour by exploiting residual ordering information coupled with simple non-parametric statistical tests, which leads to an e?ective algorithm for threshold-free robust estimation.

Raghuvanshi, Nikunj (2010)

“Interactive Physically-based Sound Simulation”
Under the direction of Ming C. Lin
Electronic copy available

Hearing is one of our principal senses which complements, and in some cases, supplements sight. The realization of interactive, immersive virtual worlds thus requires the ability to present a realistic aural experience that convincingly reflects the environment presented visually. Physical simulation is a natural way to achieve such realism, enabling deeply immersive virtual worlds — the range of sounds generated is limited only by the number of physical scenarios realized at runtime, rather than the size of a library of pre-recorded sounds/filters. However, physically-based sound simulation is a very computationally expensive problem with its unique challenges, owing to the physics behind sound production and propagation. Past attempts at developing physically-based techniques for sound have suffered from the large gap between required and available computation. The increased computational power of desktop systems today has served to reduce this gap, and it has become possible to bridge this gap to some degree using a combination of algorithmic, implementation and perceptual optimizations. In this talk, I will present my work on developing interactive techniques for sound simulation. The talk will focus on both aspects of sound simulation: Synthesis and Propagation. The problem of Sound Synthesis is concerned with generating the sounds produced by objects due to interaction with the environment, such as collisions. Specifically, I will discuss techniques that I have developed that exploit human auditory perception to simulate scenes with hundreds of solid objects undergoing impact and rolling in real time. Sound Propagation is the complementary problem of modeling the reflection and diffraction of sound in an environment as it travels from a source to the listener. However, numerical acoustic simulation on a large scene can easily require computation in the petaFLOPS range. I will discuss my work on a novel numerical simulator that is 100 times faster and consumes 10 times less memory than the current state of the art in room acoustics. Lastly, I will discuss my work that utilizes simulation results from this fast numerical simulator to render realistic, wave-based acoustics of arbitrary static scenes for multiple moving sources and moving listener in real time, while correctly accounting for physical effects such as low-pass filtering around walls due to diffraction, sound diffusion and realistic reverberation.

Rajgopal, Suresh (1992)

“Spatial Entropy–A Unified Attribute to Model Dynamic Communication in VLSI Circuits”
Under the direction of Kye S. Hedlund
Department of Computer Science Technical Report TR92-041
Electronic copy available

This dissertation addresses the problem of capturing the dynamic communication in VLSI circuits. There are several CAD problems where attributes that combine behavior and structure are needed, or when function behavior is too complex and is best captured through some attribute in the implementation. Examples include, timing analysis, logic synthesis, dynamic power estimation, and variable ordering for binary decision diagrams (BDDs). In such a situation, using static attributes computed from the structure of the implementation is not always helpful. Firstly, they do not provide sufficient usage information, and secondly they tend to exhibit variances with implementations which is not desirable while capturing function behavior. The contribution of this research is a new circuit attribute called spatial entropy. It models the dynamic communication effort in the circuit by unifying the static structure and the dynamic data usage. Quantitatively, spatial entropy measures the switching energy in a physical (CMOS) implementation. A minimum spatial entropy implementation is a minimum energy implementation. For the purposes of this dissertation we restrict our scope to combinational circuits. We propose a simple procedure to estimate spatial entropy in a gate level circuit. It is characterized in extensive detail and we describe why it is difficult to compute spatial entropy accurately. We show how it can also be defined at other levels of abstraction. We illustrate applications of spatial entropy in BDD variable ordering, a problem that has traditionally relied on static attribute based solutions. We also show empirically that spatial entropy can track function behavior through implementations, by using it to measure gate-count complexity in Boolean functions.

Ramamurthy, Srikanth (1997)

“A Lock-Free Approach to Object Sharing in Real-Time Systems.”
Under the direction of James Anderson
Electronic copy available

This work aims to establish the viability of lock-free object sharing in uniprocessor real-time systems. Naive usage of conventional lock-based object-sharing schemes in real-time systems leads to unbounded priority inversion. A priority inversion occurs when a task is blocked by a lower-priority task that is inside a critical section. Mechanisms that bound priority inversion usually entail kernel overhead that is sometimes excessive. We propose that lock-free objects offer an attractive alternative to lock-based schemes because they eliminate priority inversion and its associated problems. On the surface, lock-free objects may seem to be unsuitable for hard real-time systems because accesses to such objects are not guaranteed to complete in bounded time. Nonetheless, we present scheduling conditions that demonstrate the applicability of lock-free objects in hard real-time systems. Our scheduling conditions are applicable to schemes such as rate-monotonic scheduling and earliest-deadline- first scheduling. Previously known lock-free constructions are targeted towards asynchronous systems; such constructions require hardware support for strong synchronization primitives such as compare-and-swap. We show that constructions for uniprocessor real-time systems can be significantly simplified–and the need for strong primitives eliminated–by exploiting certain characteristics of real-time scheduling schemes. Under lock-based schemes, a task can perform operations on many shared objects simultaneously via nested critical sections. For example, using nested critical sections, a task can atomically dequeue an element from one shared queue and enqueue that element in another shared queue. In order to achieve the level of functionality provided by nested critical sections, we provide a lock-free framework that is based on a multi-word compare-and-swap primitive and that supports multi-object accesses | the lock-free counterpart to nested critical sections. Because multi-word primitives are not provided in hardware, they have to be implemented in software. We provide a time-optimal implementation of the multi-word compare-and-swap primitive. Finally, we present a formal comparison of the various object-sharing schemes based on scheduling conditions, followed by results from a set of simulation experiments that we conducted. Also, as empirical proof of the viability of lock-free objects in practical systems, we present results from a set of experiments conducted on a desktop videoconferencing system.

Randhavane, Tanmay (2019)

Social Perception of Pedestrians and Virtual Agents Using Movement Features

Under the direction of Dinesh Manocha

Electronic copy available

In many tasks such as navigation in a shared space, humans explicitly or implicitly estimate social information related to the emotions, dominance, and friendliness of other humans around them. This social perception is critical in predicting others’ motions or actions and deciding how to interact with them. Therefore, modeling social perception is an important problem for robotics, autonomous vehicle navigation, and VR and AR applications. 

In this thesis, we present novel, data-driven models for the social perception of pedestrians and virtual agents based on their movement cues, including gaits, gestures, gazing, and trajectories. We use deep learning techniques (e.g., LSTMs) along with biomechanics to compute the gait features and combine them with local motion models to compute the trajectory features. Furthermore, we compute the gesture and gaze representations using psychological characteristics. We describe novel mappings between these computed gaits, gestures, gazing, and trajectory features and the various components (emotions, dominance, friendliness, approachability, and deception) of social perception. Our resulting data-driven models can identify the dominance, deception, and emotion of pedestrians from videos with an accuracy of more than 80%. We also release new datasets to evaluate these methods. 

We apply our data-driven models to socially-aware robot navigation and the navigation of autonomous vehicles among pedestrians. Our method generates robot movement based on pedestrians’ dominance levels, resulting in higher rapport and comfort. We also apply our data-driven models to simulate virtual agents with desired emotions, dominance, and friendliness. We perform user studies and show that our data-driven models significantly increase the user’s sense of social presence in VR and AR environments compared to the baseline methods.

Raskar, Ramesh (2002)

“Projector-Based Three Dimensional Graphics”
Under the direction of Henry Fuchs and Gregory Welch
Department of Computer Science Technical Report TR02-046
Electronic copy available

Light projectors can be arranged into electronic displays that offer large, bright, and high resolution images. However, despite their unique characteristics, projectors have been treated like any other two-dimensional display devices, e.g. CRT monitors or LCD panels, to create flat and usually rectangular images. Even the growth of three dimensional computer graphics has followed this limitation. To improve and widen the range of applications of projectors, in this dissertation I present a single unified geometric framework for projector-based graphics. It is based on the notion of the projector as the dual of a camera. The geometric framework is based on (i) the analytical projection model, (ii) the geometric representation of the display surface and (iii) the viewer location. The framework can be used for practically all the projector-based applications. For classic projector-based systems, such as tiled displays or immersive panoramic displays, the framework provides a fundamentally different approach that allows greater freedom and flexibility. In addition, it enables a new class of projector-based visualization methods.

Rathinavel, Kishore (2019)

“Volumetric and Varifocal Occlusion Augmented Reality Displays”

Under the direction of Henry Fuchs

Electronic copy available

Augmented Reality displays are a next-generation computing platform that offer unprecedented user experience by seamlessly combining physical and digital content, and could revolutionize the way we communicate, visualize, and interact with digital information.

However, providing a seamless and perceptually realistic experience requires displays capable of presenting photorealistic imagery, and especially, perceptually realistic depth cues, resulting in virtual imagery being presented at any depth and of any opacity. Today’s commercial augmented reality displays are far from perceptually realistic because they do not support important depth cues such as accommodation, defocus-blur, and mutual occlusion, resulting in a transparent image overlaid onto the real-world at a fixed depth. Previous research prototypes fall short by presenting occlusion only for a fixed depth, and by presenting accommodation and defocus-blur only for a narrow depth-range, or with poor depth or spatial resolution. 

To address these challenges, this thesis explores a computational display approach, where the display’s optics, electronics, and algorithms are co-designed to improve performance or enable new capabilities. A volumetric near-eye augmented reality display was developed to simultaneously present many virtual objects at different depths across a large depth range (15 – 400 cm) without sacrificing spatial resolution, framerate, or bitdepth. This was accomplished by (1) synchronizing a high-speed Digital Micro-Mirror (DMD) projector and a focus-tunable lens to periodically sweep out a volume composed of 280 single-color binary images in-front of the user’s eye, (2) a new voxel-oriented decomposition algorithm that enables 280 unique depth positions for each color voxel, and (3) per-depth-plane illumination control. In a separate design, we enable depth-dependent occlusion in optical see-through augmented reality displays for the first time. This was accomplished by (1) dynamically moving the virtual and occlusion image planes using focus-tunable lenses to the assumed eye’s focal distance, (2) rendering out-of-focus objects with synthetic blur, and (3) designing the optics of the display to ensure unit magnification for the see-through real world irrespective of the virtual and occlusion image plane distances. The prototype display supports depth-dependent occlusion across a large depth range (30 – 300 cm).

Contributions of this thesis include new optical designs, new rendering algorithms, and prototype displays that demonstrate accommodation, defocus-blur, and occlusion depth cues over an extended depth-range.

Razzaque, Sharif (2005)

“Redirected Walking”
Under the direction of Frederick P. Brooks, Jr.
Electronic copy available

Locomotion in human-scale, immersive virtual environments can be specified by flying with a hand-controller, using a treadmill, walking-in-place, etc. Real walking, where the user actually and physically walks in the lab, and moves the same distance in the virtual scene, is better than flying. It is more input-natural, does not require learning a new interface, results in a greater sense of presence, and theoretically results is less simulator sickness. With real walking, however, the size of the virtual scene is limited by the size of tracked area. For example, for an architect to really walk in a virtual prototype of a house, the tracked area must be as large as the house. This requirement makes real walking infeasible for many facilities and virtual scenes. To address this limitation, I have developed Redirected Walking, which makes the user turn herself by interactively and imperceptibly rotating the virtual scene about her. Under the right conditions, the user would unknowingly walk in circles in the lab, thinking she is walking in a straight and arbitrarily long path in the virtual scene. In this dissertation I develop Redirection, discuss its theoretical underpinnings, and argue that it can be used: 1) to make the user turn themselves 2) without causing the user to be aware of Redirection or modify her conscious behavior 3) without unacceptably increasing the level of simulator sickness, most importantly, 4) to useful effect: A) In head-mounted display systems, the user can experience a virtual scene larger then the lab while also having the benefits of real walking. B) In an open-backed, three-walled CAVE, users can have the increased presence and input-naturalness normally of a fully enclosed CAVE.

Rewaskar, Sushant (2007)

“Real-world Evaluation of Techniques to Mitigate Impact of Losses on TCP Performance”
Under the direction of Jasleen Kaur
Electronic copy available

Transmission Control Protocol (TCP) is the dominant transport protocol used for Internet data transfers. While it is generally accepted that network losses can significantly impact TCP performance, the extent to which they do so in the real world is not well understood. In this dissertation, we study the interaction between TCP and losses and evaluate techniques for reducing the impact of losses on TCP performance. Specifically, we make three main contributions: (i) A methodology for in-depth and accurate passive analysis of TCP traces: We develop a passive analysis tool, TCPdebug, for accurate and in-depth analysis of traces of TCP connections. TCPdebug is designed to accurately track TCP sender state for several prominent OSes (Windows, Linux, Solaris, and FreeBSD/MacOS) and accurately identify and classify segments that appear out-of-sequence in a TCP trace. This tool has been extensively validated using controlled lab experiments and as well as against real Internet connections—it is accurate in more than 99% of the cases. TCPdebug improves upon the classification accuracy of existing tools by almost 100%. (ii) Systematic evaluation of the configuration of loss detection/recovery mechanisms in TCP: Using TCPdebug, we analyze traces of more than 2.8 million connections collected from 5 different vantage points across the globe to study the efficiency of current TCP loss detection/recovery mechanisms. We developed models to capture the impact of configuration of these mechanisms on the durations of TCP connections. We find that the recommended as well as widely implemented configurations for these mechanisms are fairly sub-optimal for a significant fraction of Internet connections. Our analysis suggests that the duration of up to 40% of Internet connections can be reduced by more that 10% by reconfiguring loss detection in prominent TCP stacks. (iii) Systematic evaluation of Delay-based Congestion Estimators (DBCEs): Finally, we investigate the ability of several popular Delay Based Connection Estimators (DBCEs) to predict (and help avoid) losses using estimates of network queuing delay. We incorporate several prominent DBCEs in TCPdebug and evaluate their efficacy using our connection traces. We find that the most popular Vegas estimator is fairly conservative and has minimal overall impact on connection durations. More aggressive predictors like CIM improve the performance for a large fraction of connections. We also cluster our connections according to their similarity of characteristics and study the per-cluster performance of DBCEs. We find that connections with high throughput benefit the most from any DBCE. These findings suggest that DBCEs hold significant promise for future high speed networks.

Rhee, Injong (1994)

“Efficiency of Partial Synchrony and Resource Allocation in Distributed Systems”
Under the direction of Jennifer Welch
Department of Computer Science Technical Report TR94-071
Electronic copy available

This dissertation is in two parts, covering two distinct areas of distributed computing. The first part concerns timing models in distributed systems that lie between the synchronous and asynchronous models in terms of their assumption on synchrony. We study their time complexity for solving distributed computing problems in shared memory (SM) and message-passing (MP) systems. We consider four timing parameters: the upper and lower bounds on process step time and message delay. Timing models are obtained by considering independently whether each parameter is known or unknown, giving rise to four SM models and 16MP models. We also study other timing models that are not covered by this framework. We show a general time complexity hierarchy indicating inherent time complexity gaps among the models. The time complexity gaps are introduced by the time complexity of the session problem, an abstraction of fundamental synchronization problems in distributed computing. The hierarchy can be valuable information for system designers in evaluating various timing models to build efficient, yet cost-effective, distributed systems. The second part concerns resource allocation in distributed systems, i.e., the scheduling of accesses to resources shared by concurrent process. Three different resource allocation problems with varying degrees of generality are considered including the dining philosophers problem. A tight bound on the response time for the dining philosophers problem is obtained. We also prove that any (dining philosophers) algorithm with the optimal response is transformable in polynomial time into a sequential algorithm for an NP-complete problem. It suggests that an execution of any dining philosophers algorithm with the optimal response time may require a large (perhaps exponential) amount of local computation in acquiring resources. We also present a modular resource allocation algorithm that uses any resource allocation algorithm as a subroutine. For appropriate choices of the subroutine we obtain the fastest known algorithms in the worst case. Although they do not give the optimal performance, the algorithms require only a polynomial amount of computation in acquiring resources. We did simulation studies indicating that our algorithm performs faster than other known algorithms on average.

Rheingans, Penny L. (1993)

“Dynamic Explorations of Multiple Variables in a 2D Space”
Under the direction of Frederick P. Brooks, Jr.

Color is used widely and reliably to display the value of a single scalar variable. It is more rarely, and far less reliably, used to display multivariate data. This research adds the element of dynamic control over the color mapping to that of color itself for the more effective display and exploration of multivariate spatial data. My thesis is that dynamic manipulation of representation parameters is qualitatively different and quantitatively more powerful than viewing static images. In order to explore the power of dynamic representation, I constructed a dynamic tool for the creation and manipulation of color mappings. Using Calico, a one- or two-variable color mapping can be created using parametric equations in a variety of color models. This mapping can be manipulated by moving input devices referenced in the parametric expressions, by applying affine transforms, or by performing free-form deformations. As the user changes the mapping, an image showing the data displayed using the current mapping is updated in real time, as are geometric objects which describe the mapping. To support my thesis, I conducted two empirical studies comparing static and dynamic color mapping for the display of bivariate spatial data. The first experiment investigated the effects of user control and smooth change in the display of quantitative data on user accuracy, confidence, and preference. Subjects gave answers which were an average of thirty-nine percent more accurate when they had control over the representation. This difference was almost statistically significant (0.05 < p < 0. 10). User control produced significant increases in user preference and confidence. The second experiment compared static and dynamic representations for qualitative judgments about spatial data. Subjects made significantly more correct judgments (p < 0.001) about feature shape and relative positions, on average forty-five percent more, using the dynamic representations. Subjects also expressed a greater confidence in and preference for dynamic representations. The differences between static and dynamic representations were greater in the presence of noise.

Rhoades, John S. (1993)

“Shaping Curved Surfaces”
Under the direction of Stephen M. Pizer
Department of Computer Science Technical Report TR93-066
Electronic copy available

This dissertation presents a new tool for shaping curved surfaces, the bending operator. The bending operator is an interactive tool intended for use in 3-D sketching. It is based on the idea that bending a surface is equivalent to changing its normal vector field while perturbing its metric as little as possible. The user of this tool specifies a bending operator, which is a surface that indicates how the normals of a target surface should change. The bending algorithm adds the derivatives of the normal vector fields of the bending and target surfaces and integrates this sum to produce a desired normal vector field. The target surface is then reshaped using a variational technique that moves the underlying surface control points to minimize a penalty function of the target surface. After bending, the resulting surface acquires the features of the bending surface while maintaining the general shape of the original target surface. The bending algorithm can perform a wide variety of surface shaping tasks, including bending about a cylinder axis, indenting, twisting, and embossing. The algorithm includes a positioning control used to specify the correspondence between points of the bending operator surface and target surface and a range of action selector used to restrict the bending action to a part of the target surface. The bending operator is intuitive in that a user can easily learn to predict the approximate result of a bending operation without needing a detailed understanding of the algorithm. The algorithm can be applied to any patch type that is based on control points and that is piecewise twice differentiable, including Bezier patches, B-spline patches, and NURBS. The algorithm can also be applied to a non-branching mesh of patches with smoothness constraints. The bending algorithm was implemented in an interactive prototype program using X-windows. This program performs a bending operation in seconds to minutes on a HP-730 workstation depending on the complexity of the target and bending surfaces. The dissertation also includes an outline for a joining algorithm based on variational techniques similar to those used in bending.

Riely, James (1999)

“Applications of Abstraction for Concurrent Programs”
Under the direction of Jan F. Prins

We study the use of abstraction to reason operationally about concurrent programs. Our thesis is that abstraction can profitably be combined with operational semantics to produce new proof techniques. We study two very different applications:

  • the implementation of nested data-parallelism, and
  • the verification of value-passing processes.

In the first case, we develop a typing system for a nested data-parallel programming language and use it to prove the correctness of flattening, an important compilation technique. In the second, we demonstrate that abstract interpretations of values domains can be applied to process description languages, extending the applicability of finite-state methods to infinite-state processes.

Raghuvanshi, Nikunj (2010)

“Interactive Physically-based Sound Simulation”
Under the direction of Ming C. Lin
Electronic copy available

Hearing is one of our principal senses which complements, and in some cases, supplements sight. The realization of interactive, immersive virtual worlds thus requires the ability to present a realistic aural experience that convincingly reflects the environment presented visually. Physical simulation is a natural way to achieve such realism, enabling deeply immersive virtual worlds — the range of sounds generated is limited only by the number of physical scenarios realized at runtime, rather than the size of a library of pre-recorded sounds/filters. However, physically-based sound simulation is a very computationally expensive problem with its unique challenges, owing to the physics behind sound production and propagation.

Ren, Zhimin (2014)

“Real-Time Physically BAsed Sound Synthesis and Application in Multimodal Interaction”
Under the direction of Ming Lin

An immersive experience in virtual environments requires realistic auditory feedback that is closely coupled with other modalities, such as vision and touch. This is particularly challenging for real-time applications due to its stringent computational requirement. In this dissertation, I present and evaluate effective real-time physically based sound synthesis models that integrate visual and touch data and apply them to create richly varying multimodal interaction. I first propose an efficient contact sound synthesis technique that accounts for texture information used for visual rendering and greatly reinforces cross-modal perception. Secondly, I present both empirical and psychoacoustic approaches that formally study the geometry-invariant property of the commonly used material model in real-time sound synthesis. Based on this property, I design a novel example-based material parameter estimation framework that automatically creates synthetic sound effects naturally controlled by complex geometry and dynamics in visual simulation. Lastly, I translate user touch input captured on commodity multi-touch devices to physical performance models that drive both visual and auditory rendering. This novel multimodal interaction is demonstrated in a virtual musical instrument application on both a large-size tabletop and mobile tablet devices, and evaluated through pilot studies. Such an application offers capabilities for intuitive and expressive music playing, rapid prototyping of virtual instruments, and active exploration of sound effects determined by various physical parameters.

Rosenthal, Michael Hayden (2005)

“Automatically Reducing and Bounding Geometric Complexity by Using Images”
Under the direction of Henry Fuchs
Electronic copy available

Medicine is rapidly adopting new minimally invasive techniques, especially in the realm of vascular procedures. These new techniques require spatially complex operations while providing little or no hand-eye coordination. Image-based tracking, registration, and visualization may make it easier to understand and perform these procedures, possibly shortening procedures, improving outcomes, and allowing new, more complex techniques to be pursued. This dissertation will explore the calibration, tracking, and registration issues that need to be understood to develop a viable intraoperative guidance system. To develop an understanding of the likely benefits of a complete system for intraoperative tracking and visualization, this dissertation will consider the methods, accuracy and effectiveness of several major components in such a system. The first part of this work presents a novel phantom for simultaneous calibration of two fluoroscopes, a set of methods for accurate calibration of such systems based on bundle adjustment, and the integration of high-order distortion correction and calibration into a single optimization method. Existing methods generally address single-view calibration and distortion correction as separate steps, with the best fluoroscopic methods producing mean errors of at least 1-2mm. The methods presented herein yield a mean reconstruction error of 0.44 mm from a single calibration process. The second part describes a real-time anatomical feature tracker based on maximum-likelihood feature selection. This method is then used to determine the motion of the liver in fluoroscopic image sets from real patients. The final innovation of this dissertation is a method for registering a 3D vascular model to two calibrated angiographic views simultaneously. Overall, this work advances our understanding of the ways in which real-time fluoroscopy can be used to precisely track and model interventional procedures interactively. These results enable sub-millimeter reconstruction of anatomical features, instruments, or other features of interest in a practical and robust manner.

Roussev, Vassil (2003)

“Flexible Sharing of Distrubuted Objects Based On Programming Patterns”
Under the direction of Prasun Dewan
Electronic copy available

Distributed collaborative applications allow a group of physically dispersed users to work on a common task. To simplify and lower the cost of developing such applications, a large number of collaborative sharing infrastructures have been built. A common approach employed by sharing infrastructures is to present the programmer with shared programming abstractions as a basis for implementing multi- user applications. A shared abstraction extends a traditional single-user abstraction, such as an object, with automatic collaboration support. In this thesis, we summarize the achievements and limitations of current approaches to implementing shared abstractions and argue that they have not been successful in simultaneously addressing the issues of automation, abstraction flexibility, sharing flexibility, extensibility, and legacy code reuse. We present a new infrastructure model designed to fulfill these requirements better than existing systems. At the core of the model is the notion of a programming pattern based on which we define a new shared abstraction model. A programming pattern is a consistent naming convention that allows the logical structure of an object to be implicitly derived from its observable state. We define a pattern specification language that allows programmers to formally express the patterns they use and show that the set of supported pattern-based abstractions subsumes the abstraction models of existing systems. We complement our abstraction model with an architectural model that allows the practical use of patterns in the application development. Furthermore, we introduce a sharing model that subsumes and provides a descriptive mechanism for specifying application layering. We describe a prototype implementation of our conceptual model, as well as our experience in developing collaborative applications with it. We also discuss several problems in which we have successfully emp loyed a pattern-based approach outside the domain of collaboration. Next, we present a requirement-by-requirement evaluation of our work relative to existing systems showing that, overall, our system performs better in satisfying the requirements. Finally, we present our conclusions and outline the directions in which we plan to extend this work in the future.

Rudolph, David J. (1995)

“Automatic Landmark Identification in Orthodontic Cephalometric Radiographs (Machine Vision)” (Biomedical Engineering and Mathematics, UNC-Chapel Hill)
Under the direction of James Coggins

The goal of orthodontic and orthognathic therapy is to improve the interrelationships among craniofacial tissues, which determine form, function, aesthetics, and relative stability. A two-dimensional x-ray image of the sagital skull projection, called a cephalometric radiograph, can be used to evaluate these relationships. In orthodontics, distances and angles among cephalometric image landmarks are compared with normative values to diagnose a patient’s deviation from ideal form and prescribe treatment. This process is often extremely time consuming for the orthodontist. A computer-based system which automatically performs these visual tasks has enormous potential in diagnostic medicine, particularly in orthodontics. One task which may benefit from such a system is landmark identification in cephalometric radiographs. Automatic landmark identification and analysis could simultaneously save time, mathematically define landmarks, improve the repeatability of landmark identification, and support alternative methods of form analysis. Computer vision methods can be heuristic (based on a set of rules usually applicable to only one image type) or can be broad-based (applicable to many tasks). Previous attempts to automatically locate landmarks have been heuristic. These attempts have been only partially successful. The use of a unified broad based approach to image analysis may solve the automatic landmarking problem and may also apply to a wider range of computer vision tasks. This dissertation substantiates and tests Spatial Spectroscopy, a unified broad based approach to vision that makes decisions about image structure based on a convolution of the image with a set of filters followed by a nonlinear decision method using statistical pattern recognition techniques. This study tested two filter sets for comparison: a multiscale Gaussian derivative filter set and an offset Gaussian set. Furthermore, a statistical decision process is proposed and tested that includes probability measures and outlier removal. This study tested both high and low resolution images. These results show: (1) There is no difference in landmark identification errors between human identification on the computer display at low resolution (4 mm$sp2$ per pixel) and automatic identification at low resolution. Mean errors in this study are defined as mean magnitude in distance between correct and selected landmark. (2) There is no difference in landmark identification errors between human identification on the computer display at a resolution of 1 mm$sp2$ per pixel vs. direct identification on the radiograph as reported by other investigators. (3) There is a correlation between distance in feature space and distance in image space for the features tested. Locations which are close in feature space are close in image space. (4) There was no difference between the offset Gaussian and the Multiscale Gaussian feature set at low resolution. (5) The outlier removal did improve the results. However, the mean improvement in accuracy was not large. (6) High resolution significantly improved the standard deviations of errors, but did not improve the mean errors.

Rungta, Atul (2018)

Perceptually Driven Interactive Sound Propagation for Virtual Environments

Under the direction of Jan-Michael Frahm

Electronic copy available

Sound simulation and rendering can significantly augment a user’s sense of presence in virtual environments. Many techniques for sound propagation have been proposed that predict the behavior of sound as it interacts with the environment and is received by the user. At a broad level, the propagation algorithms can be classified into reverberation filters, geometric methods, and wave-based methods. In practice, heuristic methods based on reverberation filters are simple to implement and have a low computational overhead, while wave-based algorithms are limited to static scenes and involve extensive precomputation. However, relatively little work has been done on the psychoacoustic characterization of different propagation algorithms, and evaluating the relationship between scientific accuracy and perceptual benefits.

In this dissertation, we present perceptual evaluations of sound propagation methods and their ability to model complex acoustic effects for virtual environments. Our results indicate that scientifically accurate methods for reverberation and diffraction do result in increased perceptual differentiation.

Based on these evaluations, we present two novel hybrid sound propagation methods that combine the accuracy of wave-based methods with the speed of geometric methods for interactive sound propagation in dynamic scenes. Our first algorithm couples modal sound synthesis with geometric sound propagation using wave-based sound radiation to perform mode-aware sound propagation. We introduce diffraction kernels of rigid objects, which encapsulate the sound diffraction behaviors of individual objects in the free space and are then used to simulate plausible diffraction effects using an interactive path tracing algorithm. Finally, we present a novel perceptual driven metric that can be used to accelerate the computation of late reverberation to enable plausible simulation of reverberation with a low runtime overhead. We highlight the benefits of our novel propagation algorithms in different scenarios.


Saboo, Rohit R. (2011)

“Atlas Diffeomorphisms via Object Models”
Under the direction of Stephen M. Pizer

I propose tackling the problem of segmenting several closely-spaced objects from 3D medical images using a hybrid of two segmentation components: one model-based and one image-based. A major contribution implements the first component by diffeomorphically mapping a fully segmented atlas image to a partially segmented image of a patient(target) while preserving the correspondence that is inferred from the partial segmentation of the target. The mapping is produced by solving the steady-state heat flow equation where the temperature is a coordinate vector and corresponding points have the same temperature. Objects carried over from the atlas into the target serve as reasonable initial segmentations and can be further refined by a model-based segmentation method. Good quality segmentations are added to the list of the initial partial segmentations, and the process is repeated. The image-based component provides shape models of quasi-tubular objects and statistics on those models. Whereas, medial models were previously only developed for slab-shaped objects, this contribution provides an approximately medial means to stably represent nearly tubular objects. I test a variety of components of my method on segmenting objects from 3D CT scans of the head and neck as required by radiotherapy treatment planning.

Sanders, Sean (2017)

Techniques for the Analysis of Modern Web Page Traffic Using Anonymized TCP/IP Headers

Under the direction of Jasleen Kaur

Electronic copy available

Analysis of traces of network traffic is a methodology that has been widely adopted for studying the Web for several decades. However, due to recent privacy legislation and increasing adoption of traffic encryption, often only anonymized TCP/IP headers are accessible in traffic traces. For traffic traces to remain useful for analysis, techniques must be developed to glean insight using this limited header information. This dissertation evaluates approaches for classifying individual web page downloads — referred to as web page classification — when only anonymized TCP/IP headers are available.

The context in which web page classification is defined and evaluated in this dissertation is different from prior traffic classification methods in three ways.  First, the impact of diversity in client platforms (browsers, operating systems, device type, and vantage point) on network traffic is explicitly considered. Second, the challenge of overlapping traffic from multiple web pages is explicitly considered and demultiplexing approaches are evaluated (web page segmentation). And lastly, unlike prior work on traffic classification, four orthogonal labeling schemes are considered (genre-based, device-based, navigation-based, and video streaming-based) — these are of value in several web-related applications, including privacy analysis, user behavior modeling, traffic forecasting, and potentially behavioral ad-targeting.

We conduct evaluations using large scale collections of both synthetically generated data, as well as browsing data from real users. Our analysis shows that the client platform choice has a statistically significant, and at times surprising, impact on web traffic. It also shows that change point detection methods, a new class of segmentation approach, outperform existing idle time-based methods. Overall, this work establishes that web page classification performance can be improved by: (i) incorporating client platform differences in the feature selection and training methodology, and (ii) utilizing better performing web page segmentation approaches.

This dissertation research increases the awareness in the community on the challenges associated with the analysis of modern web traffic. It shows and advocates for considering real-world factors, such as client platform diversity and overlapping traffic from multiple streams, when developing and evaluating traffic analysis techniques.

Sawyer, Jeanne C. (1990)

“A Reference and Planning Model for Library Online Public Access Catalogs”
Under the direction of Stephen F. Weiss
Department of Computer Science Technical Report TR90-043
Electronic copy available

This dissertation explores the problem of how library catalogs can be connected or combined, providing library users with access to materials beyond the local library collection while maintaining local autonomy and ease of implementation. The first step in solving the problem was to identify the fundamental ways library systems can be connected, and to determine the characteristics of each. The Reference Model for Online Public Access Catalogs (OPAC Model) does this by categorizing the basic architectures into three models: centralized, distributed, and stand-alone. The reference model prides a way to classify a system according to its architecture and identifies the basic characteristics that a system must have. The distributed model (DLN Model) is explored in depth because it is the least well understood of the models and because the characteristics of distributed systems must be standardized if such systems are to be connected effectively. Whereas the OPAC Model defines the system architectures, the Library Systems Architecture Planning Model (LSAP Model) provides a tool for choosing among them. The system characteristics defined in the reference model are included to meet real-world needs, such as providing access to another library’s holdings or preserving local autonomy. The LSAP Model follows from the Reference Model by making explicit the connections between a set of system characteristics and a set of environmental characteristics. The concepts included in the Reference Model are new and untested, especially for the distributed architecture. Therefore a case study of the Triangle Research Libraries Network’s system was included in the dissertation specifically to demonstrate that:

  • the reference model can be implemented, and
  • the implementation is reasonable and is an appropriate choice for that environment

Verifying the LSAP Model was then necessary to demonstrate that the planning model works, i.e., that the Model accurately reflects expert judgements of appropriate choice of system architecture. In addition, verification of the LSAP Model further validates the Reference Model since the LSAP Model is built from the architectures delineated in the Reference Model. If those architectures were inappropriately defined, the LSAP Model could not work properly.

Schissler, Carl (2017)

Efficient Interactive Sound Propagation in Dynamic Environments

Under the direction of Dinesh Manocha

Electronic copy available

The physical phenomenon of sound is ubiquitous in our everyday life and is an important component of immersion in interactive virtual reality applications. Sound propagation involves modeling how sound is emitted from a source, interacts with the environment, and is received by a listener. Previous techniques for computing interactive sound propagation in dynamic scenes are based on geometric algorithms such as ray tracing. However, the performance and quality of these algorithms is strongly dependent on the number of rays traced. In addition, it is difficult to acquire acoustic material properties. It is also challenging to efficiently compute spatial sound effects from the output of ray tracing-based sound propagation. These problems lead to increased latency and less plausible sound in dynamic interactive environments.

              In this dissertation, we propose three approaches with the goal of addressing these challenges. First, we present an approach that utilizes temporal coherence in the sound field to reuse computation from previous simulation time steps. Secondly, we present a framework for the automatic acquisition of acoustic material properties using visual and audio measurements of real-world environments. Finally, we propose efficient techniques for computing directional spatial sound for sound propagation with low latency using head-related transfer functions (HRTF). 

              We have evaluated both the performance and subjective impact of these techniques on a variety of complex dynamic indoor and outdoor environments and observe an order-of-magnitude speedup over previous approaches. The accuracy of our approaches has been validated against real-world measurements and previous methods. The proposed techniques enable interactive simulation of sound propagation in complex multi-source dynamic environments.

Schmitt, Charles P. (1999)

“Recognizing Moving Objects: A Neural Model of Temporal Binding in Human Vision”
Under the direction of Jonathan A. Marshall

A visual object is recognizable only if its parts (elements) are correctly identified and integrated into a perceptual whole. Element integration requires some means of identifying, or labeling, which elements, visible at possibly different moments in time, are associated with which perceptual objects–a problem known as the temporal binding problem. this problem is non-trivial for biological systems because the elements of an object are represented by distributed populations of neurons, because information transfer rates are limited by neural spiking rates, and because the visual world is often filled with multiple objects of behavioral significance. This dissertation presents a neural model, derived from an examination of psychophysical and neurobiological findings, for solving the temporal binding problem. The neural mechanisms used to solve the binding problem are based upon a serial, attention-based mechanism for constructing and updating internal object representations, an association network for linking together the shape and motion properties of objects, and modulatory pathways that enforce Gestalt grouping principles. The proposed neural mechanisms are used to construct a neural network model. The model is computationally simulated on image sequences from psychophysical experiments on multielement tracking, object review, and object integration. The simulation results are shown to be quantitatively consistent with experimental results on human subjects. In particular, the simulations show that the effects of experimental manipulations (e.g., varying interstimulus interval, varying number of objects) can be explained as resulting from the proposed neural mechanisms. Finally, the model is extended to address the problem of grouping in relative motion perception. A novel neural architecture and new approaches to solving the binding problem are presented that allow multiple sets of moving elements to be grouped into separate objects. This dissertation presents several contributions to vision research. First, a model of binding in human vision is presented that shows how elements can be grouped together into multiple perceptual objects, even if the elements are visible at different moments. Second, the model provides explanations for human performance in several psychophysical experiments. Finally, the model shows how preattentive grouping can occur in parallel in motion perception.

Seeger, Adam (2004)

“Surface Reconstruction From AFM and SEM Images”
Under the direction of Russell M. Taylor II
Electronic copy available

Current methods for surface reconstruction from AFM images do not enable one to incorporate constraints from other types of data. Current methods for surface reconstruction from SEM images are either unsuitable for nanometer scale shapes or limited to shapes described by a small number of parameters. I have developed a new approach to surface reconstruction from combination AFM/SEM images that overcomes these limitations. A dilation model is used to model AFM image formation and a filter bank model is used to model SEM image formation. I construct noise models for both AFM and SEM images from real data. The image formation models including the noise descriptions are used to construct an objective function expressing the probability of observed images given a hypothetical surface reconstruction. The surface is modeled as a sum of Gaussian basis functions and I derive a formula to estimate the gradient of the objective function in the surface parameter space. The conjugate gradient method is used to optimize the surface parameters. My thesis is that this algorithm is more general and accurate than existing methods and that the gradient-based optimization based on my formula enables one to compute a reconstruction is a few hours. This thesis is demonstrated by applying the algorithm to real and synthetic examples.

Sewall, Jason (2010)

“Efficient, Scalable Traffic and Compressible Fluid Simulations Using Hyperbolic Models”
Under the direction of Ming Lin
Electronic copy available

Virtual worlds that are filled with the phenomena we experience in the real world can be more visually compelling, but it is challenging to create animations of complex, nonlinear phenomena efficiently. This thesis presents novel techniques for efficiently generating animations of compressible fluids and traffic flow for making virtual worlds more lifelike. I introduce simulation methods designed to recreate the motion of coupled gas and elastic bodies, shockwaves in compressible gases, and traffic flows on large, varied road networks. These phenomena can all be described with mathematical models classified as hyperbolic — essentially, models where the speed of information propagation is bounded. This property leads to computational schemes with very local data access patterns that favor parallel computation. I demonstrate how the application of hyperbolic models to the aforementioned phenomena can lead to techniques for physically plausible animations that are efficient and scalable on modern multi-processor architectures. Animations of gas dynamics, from curling smoke to sonic booms, are visually exciting – particularly when such fluids interact with suspended objects. However, existing computational models of fluids in computer graphics are unsuitable for properly describing compressible gas flows. I present a method based on a truly compressible model of gases to simulate two-way coupling between gases and elastic bodies on simplicial meshes that can handle large-scale simulation domains in a fast and scalable manner. Computational models of fluids used so far in graphics are similarly inappropriate for describing supersonic gas dynamics because they assume the presence of smooth solutions. The shockwaves associated with these phenomena have great visual appeal but are notoriously difficult to handle numerically. I present a technique for the simulation of explosive gas phenomena that addresses the challenges found in animation — namely stability, efficiency, and generality. I also demonstrate how this method is able to achieve near-linear scaling on modern many-core architectures. Automobile traffic is an ubiquitous feature of modern life, and no virtual description of a populated urban area is complete without it. I present a technique for traffic animation that leverages the efficiency of a hyperbolic continuum model for traffic flow with a discrete representation that allows for visual depiction and fine control, and I demonstrate how this approach is able to outperform agent-based models for traffic simulation. I further show the flexibility of hyperbolic models by coupling discrete, agent-based vehicle simulation with a continuum model of traffic. This hybrid technique can capture the interaction between arbitrarily arranged continuum and discrete lanes as well as dynamically transition between the two disparate models of representation. By exploiting the inherent locality and descriptiveness of hyperbolic models for natural and man-made phenomena, the methods presented in this dissertation open new possibilities for the efficient creation of physically-based animations for virtual worlds. I show their effectiveness by computing shockwave propagation in gases and simulating traffic on large road networks; these schemes are at least an order of magnitude faster than the closest existing techniques.

Shan, Liang (2014)

“Automatic Localized Analysis of Longitudinal Cartilage Changes”
Under the direction of Marc Niethammer

Osteoarthritis (OA) is the most common form of arthritis; it is characterized by the loss of cartilage. Automatic quantitative methods are needed to screen large image databases to assess changes in cartilage morphology. This dissertation presents an automatic analysis method to quantitatively analyze longitudinal cartilage changes from knee magnetic resonance (MR) images.

A novel robust automatic cartilage segmentation method is proposed to overcome the limitations of existing cartilage segmentation methods. The dissertation presents a new and general convex three-label segmentation approach to ensure the separation of touching objects, i.e., femoral and tibial cartilage. Anisotropic spatial regularization is introduced to avoid over-regularization by isotropic regularization on thin objects. Temporal regularization is further incorporated to encourage temporally-consistent segmentations across time points for longitudinal data.

The state-of-the-art analysis of cartilage changes relies on the subdivision of cartilage, which is coarse and purely geometric whereas cartilage loss is a local thinning process and exhibits spatial nonuniformity. A statistical analysis method is proposed to study localized longitudinal cartilage thickness changes by establishing spatial correspondences across time and between subjects. The method is general and can be applied to nonuniform morphological changes in other diseases.

Shan, Yen-Ping (1990)

“MoDE: An Object-Oriented User Interface Development Environment Based on the Concept of Mode”
Under the direction of John B. Smith
Department of Computer Science Technical Report TR90-028
Electronic copy available

This thesis explores a particular concept of mode that can provide a unified conceptual framework for user interfaces and can lead to an effective implementation environment for developing a rich variety of user interfaces. This research has addressed several importation limitations faced by most user interface management systems (UIMSs). These include:

  • Lack of generality.
  • Little support for creating and managing the connections between user interfaces and their underlying applications.
  • Lack of support beyond the coding phase.

The major results of the research are the following: A new user interface development environment, called the Mode Development Environment (MoDE), was developed. MoDE accommodates an orthogonal design that decouples the user interface components from each other, thereby increasing their reusability and overall system generality. A new connection model was developed that allows strong separation between the user interface and the application without limiting the communication between them. MoDE supports the creation and management of both components and connections through direct manipulation. New concepts and UIMS capabilities were developed to provide support beyond the coding stage. To support design, a particular concept of mode was developed to help decompose the interface into components. To support testing and maintenance, MoDE enables the user to run and interface, suspend it at any point, and inspect and change it.

Shannon, Karen P. (1992)

“Tool Integration Via Fine-Grained Data Management”
Under the direction of Richard Snodgrass

Software development tools have long been used to increase the productivity of software engineers. However, the use of individual, isolated tools offers only a partial solution to the complexity of software development; a more complete solution requires the integration of tools into a software development environment (SDE). This research describes a mechanism to integrate tools in a large scale SDE. SDE tools are integrated by sharing data, hence data structure communication is the key to our approach to tool integration. We focus on large scale SDEs because the increase in the number of tools and the increase in size and complexity of individual tools and shared data complicate the interface among the tools. We focus on fine grained data in part because the performance requirements concerning fine grained data are most difficult to meet. We identify four additional characteristics we wish to preserve in large scale SDEs:extensibility or ease of modification, flexibility or ease of configuration, integrity or ease of maintaining consistency, and efficiency at both development and execution time. In general, it is impossible to preserve all characteristics simultaneously. However, our approach exploits different requirements at various points during the development of the SDE. We examine a spectrum of approaches to tool integration incorporating and generalizing those previously proposed by others. Implementation techniques are discussed that allow tools to be moved individually along this spectrum with relative ease as the SDE evolves. Finally, we discuss specific facilities of a meta-environment, a specific environment tailored to the development of SDEs, that allows the SDE developer to precisely control the development tradeoffs of the SDE, and we examine how the runtime system is used to effect integration. We show that it is possible to initially sacrifice SDE execution time efficiency to achieve SDE development time efficiency, extensibility, and flexibility without sacrificing integrity. We also explain how to shift the SDE, with no changes to the source code, to many intermediate points along the coupling spectrum eventually reaching high execution time efficiency at the production stage.

Shriram, Alok (2007)

“Design of a Scalable Available Bandwidth Information Infrastructure”
Under the direction of Jasleen Kaur
Electronic copy available

Several applications such as peer-to-peer file sharing and grid computing are being instantiated using overlay networks over the Internet. These applications typically involve transferring large amounts of data between overlay nodes for the purpose of data processing, large-scale scientific computation or content distribution. In order to improve the data transfer rates of these applications, several overlay services such as overlay routing and source selection are being defined. A key piece of information needed by the above services is an estimate of the Available Bandwidth (AB) of all the paths of the overlay network. In this dissertation, we focus on the design of an infrastructure—referred to as an AB Information Infrastructure (ABII)—that can monitor an overlay and provide this information. The key requirements from an ABII are that it provide accurate and up-to-date AB information while imposing a low monitoring overhead on the network. We use a two-step approach to design an ABII: Select a tool to measure the AB on a single end-to-end path: Several AB estimation tools (ABET) have been designed in the recent past to measure the end-to-end AB on a given path; unfortunately, it is not clear which tool has high accuracy, low run-time, and low overhead. Specifically, existing ABET evaluations are not comprehensive in the set of tools as well as network conditions evaluated. Furthermore, no past evaluation considers the impact of systemic biases as well as impact of temporal aspects of AB estimation. In this thesis we address these issues by systematically evaluating the impact of systemic, temporal, and algorithmic aspects of ABET design. Design a scheme to scalably infer AB on all N2 paths of an overlay: In order to estimate the AB on all paths of an overlay network, we first recognize that measuring the AB on all N2 paths has a fairly high overhead. In this thesis we propose and evaluate three alternate approaches that measure AB only on a subset of paths and use these to infer the AB on the remaining paths—the number of measurements needed ranges from O(N) to O(NvN). We validate our approaches on the PlanetLab experimental test-bed and show that these consistently out perform the state-of-the-art in terms of accuracy, while imposing a similar overhead on the network

Silbermann, Frank S. K. (1989)

“A Denotational Semantics Approach to Functional and Logic Programming”
Under the direction of Bharadwaj Jayaraman
Department of Computer Science Technical Report TR89-030
Electronic copy available

This dissertation addresses the problem of incorporating into lazy higher-order functional programming the relational programming capability of Horn logic. The language design is based on set abstraction, a feature whose denotational semantics has until now not been rigorously defined. A novel approach is taken in constructing an operational semantics directly from the denotational description. The main results of the dissertation are:

  1. Relative set abstraction can combine lazy higher-order functional programming with not only first-order Horn logic, but also with a useful subset of higher order Horn logic. Sets, as well as functions, can be treated as first-class objects.
  2. Angelic powerdomains provide the semantic foundation for relative set abstraction
  3. The computation rule appropriate for this language is a modified parallel outermost, rather than the more familiar left-most rule.
  4. Optimizations incorporating ideas from narrowing and resolution greatly improve the efficiency of the interpreter, while maintaining correctness.
Singh, Abhishek (2007)

“Co-scheduling Real-time Tasks and Non Real-time Tasks Using Empirical Probability Distribution of Real-time Execution Requirements”
Under the direction of Kevin Jeffay
Electronic copy available

We present a novel co-scheduling algorithm for real-time (RT) and non real-time, response time sensitive (TS) tasks.Previous co-scheduling algorithms focused on providing isolation to the tasks without considering the impact of RT tasks on the response times of the TS tasks. To best utilize the available processing capacity, the number of jobs qualifying for acceptable performance should be maximized. A good scheduling algorithm would reduce the deadline overrun times for soft real-time tasks and the response times for the TS tasks, while meeting deadline guarantees for the RT tasks. We propose a Stochastic Processor Sharing (SPS) algorithm that uses the empirical probability distribution of execution times of the RT tasks to schedule the RT tasks such that their maximum expected processor share at any instant is minimized. We show theoretically and empirically that SPS provideds significant performance benefits in terms of reducing response times of TS jobs over current co-scheduling algorithms.

Singh, Darshan (2015)

Annotation-Free Detection of Splicing Differences Between RNA-Seq Samples

Under the direction of Jan Prins

Electronic copy available

Alternative splicing expands the diversity of RNA transcripts in eukaryotic cells. Through the mechanism of alternative splicing, 95\% of human genes yield multiple transcript isoforms, with 10 or more transcripts isoforms typical.

Alternative splicing plays an important role in tissue-specific differentiation, and its misregulation can result in disease. To understand these processes, there is a need for methods to identify splicing differences between cell culture or tissue samples using high-throughput RNA sequencing (RNA-seq).

We define differential transcription between two samples as the difference in the relative abundance of the transcript isoforms present in the samples, and develop a method to detect differential transcription between two RNA-seq datasets without knowledge of the underlying transcript isoforms. This avoids the problem of relying on a generally incomplete catalog of transcript isoforms for abundance estimation (e.g RSEM), or the problem of transcript inference (e.g. Cufflinks), both of which can be unreliable. 

We introduce the Flow Difference Metric (FDM) to identify differential transcription between RNA-seq datasets. The method is based on a splice graph representation of the RNA-seq data that summarizes the alignment of RNA-seq reads to a reference genome. The method detects differential transcription that significantly exceeds technical variation in the RNA-seq data and provides the locus of such differences. 

Various ways to use FDM are illustrated on biological questions posed by collaborators with different experimental designs. We have detected differential transcription 1) between two cell lines, 2) between different stages of cell differentiation, 3) between two cell types across multiple individuals 4) between two classes of large groups of individuals; and 5) between different stages of cell cycle. We also detected association of differential transcription with cancer subtypes. All these contributions are significant in the effort to understand the mechanisms of cancers and to develop therapies.

Sinha, Sudipta N. (2008)

“Silhouettes for Calibration and Reconstruction from Multiple Views”
Under the direction of Marc Pollefeys
Electronic copy available

In this dissertation, I study how silhouettes extracted from images and video can help with two fundamental problems of computer vision – namely multi-view camera calibration and 3D surface reconstruction from multiple images. First I present an automatic method for calibrating a network of cameras that works by analyzing only the motion of silhouettes in the multiple video streams. This is particularly useful for automatic reconstruction of a dynamic event using a camera network in a situation where pre-calibration of the cameras is impractical or even impossible. Our key contribution is a novel RANSAC-based algorithm that simultaneously computes the epipolar geometry and synchronization of a pair of cameras, from only the silhouettes of moving objects. The approach starts by independently computing the epipolar geometry and synchronization for various pairs of cameras in the network. In the next stage, the calibration and synchronization of the complete network is recovered. The effectiveness of our approach is demonstrated by remotely calibrating many multi-view datasets acquired by researchers in the community. In the second part of the dissertation, I address some shortcomings of existing volumetric multi-view stereo approaches. First I propose an improved multi-view stereo formulation that allows for robust and accurate fusion of the silhouette and stereo cues. I show that it is possible to enforce exact silhouette constraints within the graph-cut optimization step of the volumetric graph-cut stereo algorithm. Hence the reconstructed surface can be guaranteed to be consistent with the original silhouettes. I also describe an alternate multi-view stereo formulation involving an adaptive graph construction which addresses the high memory and computational overhead of the underlying approach. The proposed method does not need any initialization and is not restricted to a specific surface topology. Using the method, accurate and detailed 3D models have been reconstructed from high-resolution images.

Skarbez, Richard (2016)

“Plausibility Illusion in Virtual Environments”
Under the direction of Mary Whitton & Fred Brooks

Historically, research into subjective user experience in virtual environments has focused on presence, the feeling of “being there” in the virtual environment. Recently, Professor Mel Slater proposed that in addition to this feeling of being in the virtual space, researchers also need to consider the subjective feeling that the events depicted in the virtual environment appear real. He coined the terms Place Illusion (PI) and Plausibility Illusion(Psi), respectively, to refer to these subjective feelings.

There exists a substantial amount of previous research applicable to PI, but very little regarding Psi. This dissertation fleshes out the concept of Plausibility Illusion by introducing new terminology, and reports on several experiments investigating the factors and effects of Psi. I demonstrate that Psi can be detected using existing presence measures, including questionnaires and physiological metrics. Of particular interest in these results is that factors contributing to Plausibility Illusion affected heart rate, with inconsistent behavior of the virtual environment leading to increased heart rate. I also demonstrate that study participants’ individual differences affected how they interacted with a virtual environment, leading to different levels of Plausibility Illusion and, therefore, presence. I further demonstrate that, among the factors tested, the virtual body is the most important factor contributing to users’ feelings of Plausibility Illusion, and that the coherence of the virtual scenario is the second most important factor. This shows it is feasible to determine a rank ordering of factors that affect users’ sense of Plausibility Illusion in virtual environments, offering guidance to creators and developers.

Smith, Bruce T. (1972)

“Logic Programming on an FFP Machine”
Under the direction of David A. Plaisted

In this dissertation, I describe a method of implementing logic programming systems, such as the programming language Prolog, on an FFP Machine, a parallel computer architecture designed by Professor Gyula A. Mago. The production system execution model used in this method allows for the parallel execution of logic programs while avoiding problems with the strictness property of the FP and FFP programming languages, which FFP Machines normally execute. As a part of this work, I have developed several new parallel algorithms for solving symbolic problems on an FFP Machine. I present these algorithms and their analysis, along the way to showing how they fit into the overall goal of implementing logic programming systems.

Smith, F. Donelson (1978)

“Models of Multiprocessing for Transaction-Oriented Computer Systems”
Under the direction of Frederick P. Brooks, Jr.

Multiprocessing can be used effectively in computer systems which process transactions. Blaauw’s distinction among computer architecture, implementation, and realization is used in creating models which relate performance to the equipment used in a processor. With these models the computer architecture is held constant (System/360) and three implementations are studied (360/40, 360/50, and 360/65). Implementation-dependent but realization-independent measures are used to eliminate technology differences. Performance is measured in instructions per datapath beat (I/B); equipment is measured in circuits and bits. The results show that one eight-byte-wide datapath yields more performance for the amount of equipment invested than multiple two- or four-byte-wide datapaths. These results are based on three instruction mixes, including one representing a control program used for processing transactions (OS/MVT with TCAM). Multiprocessor configurations consisting of N identical processors and M identical storage modules are then considered. An index of “effectiveness”, E, is defined as the ratio of I/B for an N-datapath multiprocessor to N times I/B for one datapath. This ratio measures the loss in processing power due to contention for main storage. Modeling results show that I/B is a linear function of N in the range (2,16) when M is equal to N or N/2. E, the slope of the linear performance function, ranges from 0.55 to 0.95 for various cases (0.76 to 0.83 for M=N and the control program instruction mix). One solution to the main-storage contention problem is to use a private cache storage for each datapath. Cache hit ratios from address traces of an OS/MVT/TCAM system are used to model a multiprocessor with a private cache added to each datapath. The results show that I/B remains a linear function of N. E is, however, much nearer to 1.0 (greater than 0.95 for a cache size of 8K bytes). Inter-cache address broadcasting has little effect on performance even when implemented with a single bus. Finally, a case study of contention for critical regions in multiprocessor software is presented. A simulation model representing multiple concurrent processes (based on the internal subtask structure of TCAM) and monitors (based on Hoare’s definitions) is used. Flow of control among processes and monitors and their execution path lengths, are based on the address traces of OS/MVT/TCAM. Results from this model show that transaction throughput and response time for an N-datapath multiprocessor compare quite closely with the results for a single processor with equal power. This performance is obtained even though the simulated processes spend 30% of their path length in some monitor’s critical region. The model also demonstrates some of the pitfalls in programming for multiprocessing. Care must be taken to create enough concurrent processes, particularly those which constitute a large portion of a transaction’s path length. Critical regions in heavily used monitors (e.g., I/O interrupt handlers) should be split into smaller regions if bottlenecks occur.

Smith, Jason McColm (2005)

“SPQR: Formal Foundations and Practical Support for the Automated Detection of Design Patterns From Source Code”
Under the direction of P. David Stotts
Electronic copy available

Maintenance costs currently comprise the majority of total costs in producing software. While object-oriented techniques and languages appear to have assisted in the production of new code, there is little evidence to support the theory that they have helped lower the high cost of maintenance. In this dissertation, I describe the current problem and provide a system ultimately aimed at reducing this cost. The System for Pattern Query and Recognition, or SPQR, consists of: the rho-calculus, a formal foundation for conceptual relationships in object-oriented systems; a suite of Elemental Design Patterns that capture the fundamentals of object-oriented programming and their expressions in the rho-calculus; an XML Schema, the Pattern/Object Markup Language, or POML, to provide a concrete method for expressing the formalisms in a practical manner; an example mapping from the C++ programming language to POML; an implementation which ties the above components together into a practical tool that detects instances of design patterns directly from source code using the Otter automated theorem prover. I will discuss each of the components of the system in turn, and relate them to previous research in the area, as well as provide a number of future research directions. Using the results of SPQR, a system can be more easily documented and understood. The major contribution of SPQR is the flexible detection of design patterns using the formalisms of rho-calculus instead of static structural cues. Building on SPQR, I propose: a suite of metrics utilizing the Minimum Description Length principle that capture the salient conceptual features of source code as expressed in design patterns nomenclature as a method for measuring comprehensibility of code; an approach for mapping these metrics to cost-based metrics from current management principles. This combination should prove to be effective in facilitating communication between technical and managerial concerns in a manner that allows for the most efficient allocation of resources during maintenance of software systems.

Snape, Jamie R. (2012)

“Smooth and Collision-Free Navigation for Multiple Mobile Robots and Video Game Characters”
Under the direction of Dinesh Manocha
Electronic copy available

The navigation of multiple mobile robots or virtual agents through environments containing static and dynamic obstacles to specified goal locations is an important problem in mobile robotics, many video games, and simulated environments. Moreover, technological advances in mobile robot hardware and video games consoles have allowed increasing numbers of mobile robots or virtual agents to navigate shared environments simultaneously. However, coordinating the navigation of large groups of mobile robots or virtual agents remains a difficult task. Kinematic and dynamic constraints and the effects of sensor and actuator uncertainty exaggerate the challenge of navigating multiple physical mobile robots, and video games players demand plausible motion and an ever increasing visual fidelity of virtual agents without sacrificing frame rate. We present new methods for navigating multiple mobile robots or virtual agents through shared environments, each using formulations based on velocity obstacles. These include algorithms that allow navigation through environments in two-dimensional or three-dimensional workspaces containing both static and dynamic obstacles without collisions or oscillations. Each mobile robot or virtual agent senses its surroundings and acts independently, without central coordination or intercommunication with its neighbors, implicitly assuming the neighbors use the same navigation strategy based on the notion of reciprocity. We use the position, velocity, and physical extent of neighboring mobile robots or virtual agents to compute their future trajectories to avoid collisions locally and show that, in principle, it is possible to theoretically guarantee that the motion of each mobile robot or virtual agent is smooth. Moreover, we demonstrate direct, collision-free, and oscillation-free navigation in experiments using physical iRobot Create mobile robots, simulations of multiple differential-drive robots or simple-airplanes, and video games levels containing hundreds of virtual agents.

Snow, Kevin (2014)

Identifying Code Injection and Reuse Payloads in Memory Error Exploits

Under the direction of Fabian Monrose

Electronic copy available

Today’s most widely exploited applications are the web browsers and document readers we use every day. The immediate goal of these attacks is to compromise target systems by executing a snippet of malicious code in the context of the exploited application. Technical tactics used to achieve this can be classified as either code injection — wherein malicious instructions are directly injected into the vulnerable program — or code reuse, where bits of existing program code are pieced together to form malicious logic. In this thesis, I present a new code reuse strategy that bypasses existing and up-and-coming mitigations, and two methods for detecting attacks by identifying the presence of code injection or reuse payloads.

Fine-grained address space layout randomization efficiently scrambles program code, limiting one’s ability to predict the location of useful instructions to construct a code reuse payload. To expose the inadequacy of this exploit mitigation, a technique for ‘just-in-time’ exploitation is developed. This new technique maps memory on-the-fly and compiles a code reuse payload at runtime to ensure it works in a randomized application. The attack also works in face of all other widely deployed mitigations, as demonstrated with a proof-of-concept attack against Internet Explorer 10 in Windows 8. This motivates the need for detection of such exploits rather than solely relying on prevention.

Two new techniques are presented for detecting attacks by identifying the presence of a payload. Code reuse payloads are identified by first taking a memory snapshot of the target application, then statically profiling the memory for chains of code pointers that reuse code to implement malicious logic. Code injection payloads are identified with runtime heuristics by leveraging hardware virtualization for efficient sandboxed execution of all buffers in memory.

Employing both detection methods together to scan program memory takes about a second and produces no false positives or negatives provided that the given exploit is functional and triggered in the target application version. Compared to other strategies, such as the use of signatures, this approach requires relatively little effort spent on maintenance over time and is capable of detecting never before seen attacks. Moving forward, one could use these contributions to form the basis of a unique and effective network intrusion detection system (NIDS) to augment existing systems.

Srinivasan, Anand (2003)

“Efficient and Flexible Fair Scheduling of Real-time Tasks on Multiprocessors”
Under the direction of James H. Anderson
Electronic copy available

Proportionate fair (Pfair) scheduling is the only known way to optimally schedule periodic real-time task systems on multiprocessors in an on-line manner. Under Pfair scheduling, the execution of each task is broken into a sequence of quantum-length subtasks that must execute within intervals of approximately-equal lengths. This scheduling policy results in allocations that mimic those of an ideal “fluid” scheduler, and in periodic task systems, ensures that all deadlines are met. Though Pfair scheduling algorithms hold much promise, prior to our work, research on this topic was limited in that only static systems consisting of synchronous periodic tasks were considered. My dissertation thesis is that the Pfair scheduling framework for the on-line scheduling of real-time tasks on multiprocessors can be made more flexible by allowing the underlying task model to be more general than the periodic model and by allowing dynamic task behaviors. Further, this flexibility can be efficiently achieved. Towards the goal of improving the efficiency of Pfair scheduling algorithms, we develop the PD2 Pfair algorithm, which is the most efficient optimal Pfair scheduling algorithm devised to date. Through a series of counterexaples, we show that it is unlikely that a more efficient optimal Pfair algorithm exists. We also introduce the cocept of ERfair scheduling, which is a work-conserving extension of Pfair scheduling. In addition, we study the non-optimal earliest-pseudo-deadline-first (EPDF) Pfair algorithm, which is more efficient than PD2, and presen several scenarios under which it is preferable to PD2. We address the flexibility issue by developing the intra-sporadic (IS) task model and by considering the scheduling of dynamic task systems. The well-known sporadic model generalizes the periodic model by allowing jobs to be released late. The IS model generalizes this notion further by allowing late as well as early subtask releases. Such a generalization is useful for modeling applications in which the instantaneous rate of releases differs greatly from the average rate of releases (e.g., an application that receives packets over a network). We prove that PD2 is optimal for scheduling static IS task systems on multiprocessors. In dynamic task systems, tasks are allowed to join and leave, i.e., the set of tasks is allowed to change. This flexibility also allows us to model systems in which the weights of tasks may change. We present sufficient conditions under which joins and leaves can occur under PD2 without causing missed deadlines. Further, we show that these conditions are tight. Finally, we also provide schemes for multiplexing the scheduling of aperiodic tasks and real-time IS tasks. These approaches aim at improving the response times of aperiodic tasks while ensuring that the real-time IS tasks meet their deadlines. We also provide bounds on aperiodic response times under these schemes; these bounds can be used to obtain admission-control tests for aperiodic tasks with deadlines.

Steinhurst, Joshua Eli (2007)

“Practical Photon Mapping in Hardware”
Under the direction of Anselmo Lastra
Electronic copy available

Photon mapping is a popular global illumination algorithm that can reproduce a wide range of visual effects including indirect illumination, color bleeding and caustics on complex diffuse, glossy, and specular surfaces modeled using arbitrary geometric primitives. However, the large amount of computation and tremendous amount of memory bandwidth, terabytes per second, required makes photon mapping prohibitively expensive for interactive applications. In this dissertation I present three techniques that work together to reduce the bandwidth requirements of photon mapping by over an order of magnitude. These are combined in a hardware architecture that can provide interactive performance on moderatelysized indirectly-illuminated scenes using a pre-computed photon map.

  1. The computations of the naive photon map algorithm are efficiently reordered, generating exactly the same image, but with an order of magnitude less bandwidth due to an easily cacheable sequence of memory accesses.
  2. The irradiance caching algorithm is modified to allow fine-grain parallel execution by removing the sequential dependency between pixels. The bandwidth requirements of scenes with diffuse surfaces and low geometric complexity is reduced by an additional 40% or more.
  3. Generating final gather rays in proportion to both the incident radiance and the reflectance functions requires fewer final gather rays for images of the same quality. Combined Importance Sampling is simple to implement, cheap to compute, compatible with query reordering, and can reduce bandwidth requirements by an order of magnitude.

Functional simulation of a practical and scalable hardware architecture based on these three techniques shows that an implementation that would fit within a host workstation will achieve interactive rates. This architecture is therefore a candidate for the next generation of graphics hardware.

Sterling, Auston (2019)

Audio-Material Modeling and Reconstruction for Multimodal Interaction

Under the direction of Ming C. Lin

Electronic copy available

Interactive virtual environments are common for creating training simulations, games, and social applications. These virtual environments are designed to create a sense of presence in the environment: a sensation that its user is truly in another location. To maintain presence, interactions with virtual objects should engage multiple senses. Furthermore, the sensory input should be consistent, e.g., a virtual bowl that visually appears plastic should also sound like plastic when dropped on the floor. In this dissertation, I propose methods to improve the perceptual realism of simulated sounds of impacted objects and ensure cohesion between those sounds and the input obtained from other senses. Recreating the impact sound of a real-world object requires an accurate estimate of that object’s material parameters. The material parameters that affect generated sound—collectively forming the audio-material—include the material damping parameters. I propose and evaluate models for material damping and use those models to estimate material damping parameters of real-world objects. I also consider how interaction with virtual objects can be made more cohesive between the senses of sight, hearing, and touch. First, I present a method for modeling the damping properties of impact sounds using generalized proportional damping. Generalized proportional damping is applied to both perform impact sound synthesis and estimate more expressive material damping parameters from recorded impact sounds. Next, I present a method for estimating material damping parameters in the presence of confounding factors and with no knowledge of the object’s shape. A probabilistic damping model captures various external effects to produce robust damping parameter estimates. I also present a method for cohesive multimodal interaction with textured surfaces. Texture maps serve as a single unified representation of mesoscopic detail for the purposes of visual rendering, sound synthesis, and rigid-body simulation. Finally, I present a method for geometry and material classification using multimodal audio-visual input. Using this method, a real-world scene can be scanned and virtually reconstructed while accurately modeling both the visual appearances and audio-material properties of each object. 

Stetten, George D. (2000)

“Automated Identification and Measurement of Cardiac Anatomy via Statistical Analysis of Medial Primitives” (Biomedical Engineering, UNC-Chapel Hill)
Under the direction of Stephen M. Pizer
Electronic copy available

Identification and measurement of objects in 3D images can be automatic, rapid and stable, based on local shape properties derived statistically from populations of medial primitives sought throughout the image space. These shape properties are measured at medial locations within the object and include scale, orientation, endness, and medial dimensionality. Medial dimensionality is a local shape property differentiating sphere, cylinder, and slab, with intermediate dimensionality also possible. Endness is a property found at the cap of a cylinder or the edge of a slab. A model of the cardiac left ventricle during systole is constructed as a large dark cylinder with an apical cap at one end, terminated at the other end by a thin bright slab-like mitral valve. Such a model, containing medial shape properties at just a few locations, along with the relative distances and orientations between them, is intuitive and robust and permits automated detection of the left ventricular axis in vivo using Real-Time Three Dimensional (RT3D) echocardiography. The statistical nature of these shape properties allows their extraction even in the presence of noise and permits statistical geometric measurements without exact delineation of boundaries, as demonstrated in determining the volume of balloons and of in vivo left ventricles in RT3D scans. The inherent high speed of the method is appropriate for real-time clinical use.

Stone, Donald L. (1995)

“Managing the Effect of Delay Jitter on the Display of Live Continuous Media”
Under the direction of Kevin Jeffay
Electronic copy available

This dissertation addresses the problem of displaying live continuous media (e.g., digital audio and video) with low latency in the presence of delay jitter, where delay jitter is defined as variation in processing and transmission delay. Display in the presence of delay jitter requires a tradeoff between two goals: displaying frames with low latency and displaying every frame. Applications must choose a display latency that balances these goals. The driving problem for my work is workstation-based videoconferencing using conventional data networks. I propose a two-part approach. First, delay jitter at the source and destination should be controlled, leaving network transmission as the only uncontrolled source. Second, the remaining delay jitter should be accommodated by dynamically adjusting display latency in response to observed delay jitter. My thesis is that this approach is sufficient to support the low-latency display of continuous media transmitted over campus-sized networks. Delay jitter at the source and destination is controlled by implementing the application as a real-time system. The key problem addressed is that of showing that frames are processed with bounded delay. The analysis framework required to demonstrate this property includes a new formal model of real-time systems and a set of techniques for representing continuous media applications in the model. The remaining delay jitter is accommodated using a new policy called queue monitoring that manages the queue of frames waiting to be displayed. This policy adapts to delay jitter by increasing display latency in response to long delays and by decreasing display latency when the length of the display queue remains stable over a long interval. The policy is evaluated with an empirical study in which the application was executed in a variety of network environments. The study shows that queue monitoring performs better than a policy that statically chooses a display latency or an adaptive policy that simply increases display latency to accommodate the longest observed delay. Overall, the study shows that my approach results in good quality display of continuous media transmitted over campus-sized networks that do not support communication with bounded delay jitter.

Stough, Joshua V. (2008)

“Clustering and Shifting of Regional Appearance for Deformable Model Segmentation”
Under the direction of Stephen M. Pizer
Electronic copy available

Automated medical image segmentation is a challenging task that benefits from the use of effective image appearance models. An appearance model describes the grey-level intensity information relative to the object being segmented. Previous models that compare the target against a single template image or that assume a very small-scale correspondence fail to capture the variability seen in the target cases. In this dissertation I present novel appearance models to address these deficiencies, and I show their efficacy in segmentation via deformable models. The models developed here use clustering and shifting of the object-relative appearance to capture the true variability in appearance. They all learn their parameters from training sets of previously-segmented images. The first model uses clustering on cross-boundary intensity profiles in the training set to determine profile types, and then builds a template of optimal types that reflects the various edge characteristics seen around the boundary. The second model uses clustering on local regional image descriptors to determine large-scale regions relative to the boundary. The method then partitions the object boundary according to region type and captures the intensity variability per region type. The third and fourth models allow shifting of the image model on the boundary to reflect knowledge of the variable regional conformations seen in training. I evaluate the appearance models by considering their efficacy in segmentation of the kidney, bladder, and prostate in abdominal and male pelvis CT. I compare the automatically generated segmentations using these models against expert manual segmentations of the target cases and against automatically generated segmentations using previous models.

Styner, Martin A. (2001)

“Combined Boundary-Medial Shape Description of Variable Biological Shapes”
Under the direction of Guido Gerig
Electronic copy available

This dissertation describes a novel shape description scheme that incorporates variability of an object population into the generation of a characteristic 3D shape model. Knowledge about the biological variability of anatomical objects is essential for statistical shape analysis and discrimination between healthy and pathological structures. The proposed shape representation is based on a fine-scale spherical harmonics (SPHARM) description and a coarse-scale m-rep description. The SPHARM description describes the boundary as a weighted series of spherical harmonics. The correspondence on the boundary is defined by a first-order ellipsoid normalized parameterization. The medial m-rep description is composed of a net of medial primitives with fixed graph properties. A m-rep model is computed automatically from the shape space of a training population of SPHARM objects. Pruned 3D Voronoi skeletons are used to determine a common medial branching topology in a stable way. An intrinsic coordinate system and an implicit correspondence between objects are defined on the medial manifold. My novel representation scheme describes shape and shape changes in a meaningful and intuitive manner. Several experimental studies of shape asymmetry and shape similarity in biological structures demonstrate the power of the new representation to describe global and local form. The clinical importance of shape measurements is shown in the presented applications. The contributions made in this dissertation include the development of a novel automatic pruning scheme for 3D Voronoi skeletons. My experiments showed that only a small number of skeletal sheets are necessary to describe families of even quite complex objects. This work is also the first to compute a common medial branching topology of an object population, which deals with the sensitivity of the branching topology to small shape variations. The sensitivity of the medial descriptions to small boundary perturbations, a fundamental problem of any skeletonization technique, is approached with a new sampling technique.

Sud, Avneesh (2006)

“Efficient Computation of Discrete Voronoi Diagram and Homotopy-Preserving Simplified Medial Axis of a 3D Polyhedron”
Under the direction of Dinesh Manocha
Electronic copy available

The Voronoi diagram is a fundamental geometric data structure and has been well studied in computational geometry and related areas. A Voronoi diagram defined using the Euclidean distance metric is also closely related to the Blum medial axis, a well known skeletal representation. Voronoi diagrams and medial axes have been shown useful for many 3D computations and operations, including proximity queries, motion planning, mesh generation, finite element analysis, and shape analysis. However, their application to complex 3D polyhedral and deformable models has been limited. This is due to the difficulty of computing exact Voronoi diagrams in an efficient and reliable manner. In this dissertation, we bridge this gap by presenting efficient algorithms to compute discrete Voronoi diagrams and simplified medial axes of 3D polyhedral models with geometric and topological guarantees. We apply these algorithms to complex 3D models and use them to perform interactive proximity queries, motion planning and skeletal computations. We present three new results. First, we describe an algorithm to compute 3D distance fields of geometric models by using a linear factorization of Euclidean distance vectors. This formulation maps directly to the linearly interpolating graphics rasterization hardware and enables us to compute distance fields of complex 3D models at interactive rates. We also use clamping and culling algorithms based on properties of Voronoi diagrams to accelerate this computation. We introduce surface distance maps, which are a compact distance vector field representation based on a mesh parameterization of triangulated two-manifolds, and use them to perform proximity computations. Our second main result is an adaptive sampling algorithm to compute an approximate Voronoi diagram that is homotopy equivalent to the exact Voronoi diagram and preserves topological features. We use this algorithm to compute a homotopy-preserving simplified medial axis of complex 3D models. Our third result is a unified approach to perform different proximity queries among multiple deformable models using second order discrete Voronoi diagrams. We introduce a new query called N-body distance query and show that different proximity queries, including collision detection, separation distance and penetration depth can be performed based on Nbody distance query. We compute the second order discrete Voronoi diagram using graphics hardware and use distance bounds to overcome the sampling errors and perform conservative computations. We have applied these queries to various deformable simulations and observed up to an order of magnitude improvement over prior algorithms.

Surles, Mark C. (1992)

“Techniques for Interactive Manipulation of Graphical Protein Models”
Under the direction of Frederick P. Brooks, Jr.
Department of Computer Science Technical Report TR92-016
Electronic copy available

This thesis describes a graphics modeling system, called Sculpt, that maintains physically-valid protein properties while a user interactively moves atoms in a protein model. Sculpt models strong properties such as bond lengths and angles with rigid constraints and models weak properties such as near-neighbor interactions with potential energies. Sculpt continually satisfies the constraints and maintains a local energy minimum throughout user interaction. On a Silicon Graphics 240-GTX, Sculpt maintains 1.5 updates per second on a molecular model with 355 atoms (1065 variables, 1027 constraints, and 3450 potential energies). Performance decreases linearlywith increased molecule size. Three techniques yield interactive performance: a constrained minimization algorithm with linear complexity in problem size, coarse-grain parallelism, and variable reduction that replaces model segments with rigid bodies. The thesis presents a Lagrange multiplier method that finds a constrained minimum and achieves linear computational complexity for articulated figures whose spine contains many more joints than any attached limb (e.g. reptiles, mammals, and proteins). The method computes the Jacobian matrix of the constraint functions, multiplies it by its transpose, and solves the resulting system of equations. A sort of the Jacobian at program initialization yields a constant, band-diagonal pattern of nonzeros. Multiplication and solution of band-diagonal matrices require computation that increases linearly with problem size. One or two iterations of this algorithm typically find a constrained minimum in this application. The number of functions and variables can be reduced by the use of rigid bodies. A user can specify that a rigid object with few variables replace large segments of a model that should not change. For example, a user can twist a backbone into a helix and then freeze the helix by replacing its atoms and bonds with a cylinder of rigid shape but movable position and orientation. Two improvements over existing interactive protein modeling systems have been observed in modeling sessions with Sculpt. First, time-consuming model correction is avoided by maintaining a physically-valid model throughout a modeling session. Second, additional cues about model properties can arise when a chemist interactively guides a folding simulation rather than viewing a cine loop from a pre-computed simulation.


Takahashi, Tetsuya (2019)

Efficient Particle-based Viscous Fluid Simulation with Video-Guided Real-to-Virtual Parameter Transfer

Under the direction of Ming Lin

Electronic copy available

Viscous fluids, such as honey, molten chocolate, and paint are common materials frequently seen in our daily life. These viscous fluids exhibit characteristic behaviors different from those generated by inviscid fluids. Capturing and understanding such dynamics have been required over decades for various applications including virtual reality and mechanical engineering. Although recent research made significant advances in simulating complicated dynamics of viscous fluids, still many important challenges are left to be addressed. In this dissertation, I present novel techniques to more efficiently and accurately simulate viscous fluid dynamics with Lagrangian particles and propose a parameter identification framework to facilitate the tedious parameter tuning steps for viscous materials. I demonstrate that how my approach can improve the performance in a wide range of scenarios, and how my parameter identification framework helps the parameter tuning for viscous fluids. In fluid simulation, enforcing appropriate boundary conditions for free surfaces and solid boundaries is essential, and failure for enforcing the appropriate boundary conditions negatively affects the robustness and efficiency of the fluid solver. To address this issue, I propose a new boundary handling approach for incompressible Smoothed Particle Hydrodynamics (SPH) solver by enforcing appropriate boundary conditions based on connectivity analysis for simulation particles. This approach significantly improves the robustness and capability of the fluid solver and allows for taking larger time steps, leading to the performance gain.

To simulate realistic fluids, enforcing the incompressibility of fluids is important, and various techniques have been proposed for this. However, previously proposed techniques based on Jacobi method or Conjugate Gradient does not scale well, and the computational cost increases superlinearly with respect to the simulation resolution. I propose a new multilevel solver for particle-based fluids. This approach constructs the hierarchy of simulation particles and grids while minimizing the discrepancies over the levels. This multilevel solver significantly improves the efficiency of the fluid simulation and achieves the nearly linear scaling, unlike previous approaches. Viscous fluids exhibit characteristic behaviors, such as coiling and buckling phenomena, which are not observed for inviscid fluids. Thus, it is necessary to develop a specialized solver that can capture such behaviors. I propose a stable and efficient particle-based solver for simulating highly viscous fluids. This solver improves the robustness compared to previous approaches using implicit integration with the full form of viscosity. Consequently, this approach enables the simulation to take larger time steps, leading to improved performance.

While various approaches have been proposed to generate rotational behaviors of viscous fluids, few attentions have been paid for how viscous fluids interact with solid objects. I propose a new two-way fluid-solid coupling method for viscous fluids via the unified minimization for implicit viscosity integration and interplay between viscous fluids and solid objects based on the variational principle. This method enables more accurate and robust simulations for the interplay between viscous fluids and solid objects, as compared to previous approaches. Although physically-based approaches have made it possible to simulate a wide range of materials over the decades, one challenge for the physically-based simulation is that it is not obvious how to choose appropriate material parameters to generate desirable behaviors of simulated materials. I propose a parameter identification framework that helps to tune material parameters for viscous fluids with example video data captured from real world fluid phenomena. This framework identifies viscosity parameters for the real viscous fluids while estimating the hidden variables for the fluids, and enables the parameter transfer from the real world to virtual environment.

Talley, Terry M. (1997)

“A Transmission Control Framework for Continuous Media”
Under the direction of Kevin Jeffay
Electronic copy available

Desktop video conferencing allows people to simulate face-to-face conversations by integrating real-time two-way audio and video with the computer system. Unfortunately, the quality of video conferences carried over current networks such as the Internet is often inadequate for effective communication. Network congestion can cause video conferences to experience high latencies and poor fidelity. We claim that in many cases we can sustain low-latency, high-fidelity conferences over current networks even when the networks are highly congested if we carefully manage the transmission of the audio and video streams at the endpoints of the conference. Network congestion is caused by two distinct types of network constraints: capacity constraints and access constraints. Capacity constraints limit the bit rate that can be supported by the network. Access constraints limit the message rate that can be supported by the network. We claim conferences can heuristically identify the type of network constraint causing congestion and reduce the effects of the congestion by carefully selecting the bit and message rates associated with each of the conference media streams. We explain and empirically demonstrate why addressing capacity and access constraints requires two complementary transmission adaptations: scalingand packaging. Scaling increases or decreases the bit rate associated with a media stream by controlling the generation and compression of the media stream. Packaging increases or decreases the conference message rate by controlling the type and number of media data units, or frames, placed into each message. We describe a transmission control framework that shows how scaling and packaging can be used to preserve conference quality when the network has capacity constraints, access constraints, or a combination of capacity and access constraints. We develop a transmission control algorithm based on this framework and demonstrate that the algorithm can deliver low-latency, high-fidelity conferences even on heavily congested networks. We also show that the algorithm delivers conferences with lower latency and higher fidelity than those delivered by non-adaptive transmission algorithms or by algorithms that only scale the video bit rate.

Taylor, Micah (2014)

“Interactive Sound Propagation for Massive Multi-User and Dynamic Virtual Enrironments”
Under the direction of Dinesh Manocha

Hearing sound waves is an important sense and it is known that rendering sound effects can enhance the level of immersion in virtual environments. Modeling sound waves is a complex problem, requiring vast computing resources to solve accurately. Prior methods are restricted to static scenes or limited acoustic effects. In this thesis, we present methods to improve the quality and performance of interactive geometric sound propagation in dynamic scenes and precomputation algorithms for acoustic propagation in enormous multi-user virtual environments.

We present a method for finding edge diffraction propagation paths on arbitrary 3D scenes for dynamic sources and receivers. Using this algorithm, we present a unified framework for interactive simulation of specular reflections, diffuse reflections, diffraction scattering, and reverberation effects. We also define a guidance algorithm for ray tracing that responds to dynamic environments and reorders queries to minimize simulation time. Our approach works well on modern GPUs and can achieve more than an order of magnitude performance improvement over prior methods.

Modern multi-user virtual environments support many types of client devices, and current phones and mobile devices may lack the resources to run acoustic simulations. In order to allow such devices the benefits of sound simulation, we have developed a precomputation algorithm that efficiently computes and stores acoustic data on a server in the cloud. Using novel algorithms, the server can render enhanced spatial audio in scenes spanning several square kilometers for hundreds of clients in realtime. Our method provides the benefits of immersive audio to collaborative telephony, video games, and multi-user virtual environments.

Taylor II, Russell M. (1994)

“The Nanomanipulator: A Virtual-Reality Interface to a Scanning Tunneling Microscope”
Under the direction of Frederick P. Brooks, Jr.
Electronic copy available

We have developed a virtual-reality interface to a scanning tunneling microscope (STM); the resulting system is called theNanomanipulator. The user interface comprises a stereoscopic color head-mounted display, a force-feedback remote manipulator master station, and a high-performance graphics computer. It provides the illusion of a surface floating in space in front of the user. The user’s hand gestures are translated into commands that are sent to the STM in real time; the returned video and haptic signals allow the user to see and to feel the surface topography and to control the timing and location of voltage pulses applied between the tip of the STM probe and the sample under study. My thesis is that a virtual-reality interface is a powerful and effective user interface to an STM–allowing qualitatively different types of experiments to be performed. The success of our investigations using this system demonstrates the validity of the thesis. We have used the Nanomanipulator to examine various surfaces and to perform surface modification experiments. This investigation has led to new insight into the meaning of certain surface features and into the mechanisms by which voltage pulses change the tip and sample. These insights were the direct results of the real-time visualization and the more interactive nature of our system compared to standard methods. The key to the success of the Nanomanipulator system is that it provides an intuitive two-way interface to the instrument. Raw data from an STM is not in a format easily understood by a scientist, and the Etch-a-Sketch type of controls required for positioning an STM tip are neither natural nor familiar to a user. The Nanomanipulator system acts as a translator between the instrument and the scientist, allowing the scientist to concentrate on interacting with the surface under study rather than on the computer interface or the STM itself. This system seeks to put the scientists on the surface, in control, while the experiment is happening–thus turning the STM from a remote, batch surface modifier into a real-time, user-guided surface modifier.

Taylor, Teryl (2016)

“Using Context to Improve Network-Based Malware Detection”
Under the direction of Fabian Monrose

Today, our computers are routinely compromised while performing seemingly innocuous activities like reading articles on trusted websites (e.g., the NY Times).  These compromises are perpetrated via complex interactions involving the advertising networks that monetize these sites. Web-based compromises such as exploit kits are similar to any other scam — the attacker wants to lure an unsuspecting client into a trap to steal private information, or resources — generating 10s of millions of dollars annually. Exploit kits are web-based services specifically designed to capitalize on vulnerabilities in unsuspecting client computers in order to install malware without a user’s knowledge.  Sadly, it only takes a single successful infection to ruin a user’s financial life, or lead to corporate breaches that result in millions of dollars of expense and loss of customer trust.

Exploit kits use a myriad of techniques to obfuscate each attack instance, making current network-based defenses such as signature-based network intrusion detection systems far less effective than in years past. Dynamic analysis or honeyclient analysis on these exploits plays a key role in identifying new attacks for signature generation, but provides no means of inspecting end-user traffic on the network to identify attacks in real time.  As a result, defenses designed to stop such malfeasance often arrive too late or not at all resulting in high false positive and false negative (error) rates. In order to deal with these drawbacks, three new detection approaches are presented.

To deal with the issue of a high number of errors, a new technique for detecting exploit kit interactions on a network is proposed.  The technique capitalizes on the fact that an exploit kit leads its potential victim through a process of exploitation by forcing the browser to download multiple web resources from malicious servers. This process has an inherent structure that can be captured in HTTP traffic and used to significantly reduce error rates.  The approach organizes HTTP traffic into tree-like data structures, and, using a scalable index of exploit kit traces as samples, models the detection process as a subtree similarity search problem.  The technique is evaluated on 3,800 hours of web traffic on a large enterprise network, and results show that it reduces false positive rates by four orders of magnitude over current state-of-the-art approaches.

While utilizing structure can vastly improve detection rates over current approaches, it does not go far enough in helping defenders detect new, previously unseen attacks.  As a result,  a new framework that applies dynamic honeyclient analysis directly on network traffic at scale is proposed.  The framework captures and stores a configurable window of reassembled HTTP objects network wide, uses lightweight content rendering to establish the chain of requests leading up to a suspicious event, then serves the initial response content back to the honeyclient in an isolated network.  The framework is evaluated on a diverse collection of exploit kits as they evolve over a 1 year period.  The empirical evaluation suggests that the approach offers significant operational value, and a single honeyclient can support a campus deployment of thousands of users.

While the above approaches attempt to detect exploit kits before they have a chance to infect the client, they cannot protect a client that has already been infected. The final technique detects signs of post infection behavior by intrusions that abuse the domain name system (DNS) to make contact with an attacker.   Contemporary detection approaches utilize the structure of a domain name and require hundreds of DNS messages to detect such malware.   As a result, these detection mechanisms cannot detect malware in a timely manner and are susceptible to high error rates.   The final technique, based on sequential hypothesis testing, uses the DNS message patterns of a subset of DNS traffic to detect malware in as little as three DNS messages, and with orders of magnitude reduction in error rates.

The results of this work can make a significant operational impact on network security analysis, and open several exciting future directions for network security research.

Terrell, Jeffrey S. (2009)

“Passive, automatic detection of network server performance anomalies in large networks”
Under the direction of Kevin Jeffay
Electronic copy available

Network management in a large organization often involves, whether explicitly or implicitly, the responsibility for ensuring the availability and responsiveness of network resources attached to the network, such as servers and printers. For better or worse, users often think of the services they rely on, such as web sites and email, as part of the network. Although tools exist for ensuring the availability of the servers running these services, ensuring their performance is a more difficult problem. In this dissertation, I introduce a novel approach to managing the performance of servers within a large network broadly and cheaply. I continuously monitor the border link of an enterprise network, building for each inbound connection an abstract model of the application-level dialog contained therein without affecting the operation of the server or the quality of its service in any way. The measurement occurs in real-time, on commodity hardware. The method is capable of keeping up with traffic rates bursting to 600 megabits per second. The model, originally developed for realistic traffic generation, includes a measurement of the server response time for each request/response exchange, which is the fundamental unit of performance I use. I then aggregate days of the response times for a particular server into distributions. Over the course of many days, I use these distributions to define a profile of what the typical response time distribution is for that server, using methods originally developed for medical image analysis. New distributions of response times are then compared to the profile, and their dissimilarity determines the degree to which the new distribution is considered anomalous. I have applied this method to monitoring the performance of servers on the UNC campus. I have tested three months of continuous measurements for anomalies, for over two hundred UNC servers. I found that most of the servers saw at least one anomaly, although for many servers the anomalies were minor. I found seven servers that had severe anomalies corresponding to real performance issues. These performance issues were found without any involvement of UNC network managers, although I have verified two of the issues by speaking with network managers. I have investigated each of these issues using the contextual and structural information measured at the border link in order to diagnose the cause of the issue (to the extent possible given the purely passive measurement approach). I found a variety of causes: overloaded servers from increased demand, heavy-hitting clients, authentication problems, changes in the nature of the request traffic, server misconfigurations, etc. Furthermore, information about the structure of connections proved valuable to diagnosing the issues.

Terriberry, Timothy B. (2006)

“Continuous Medial Models in Two-Sample Statistics of Shape”
Under the direction of Guido Gerig
Electronic copy available

In questions of statistical shape analysis, the foremost is how such shapes should be represented. The number of parameters required for a given accuracy and the types of deformation they can express directly influence the quality and type of statistical inferences one can make. One example is a medial model, which represents a solid object using a skeleton of a lower dimension and naturally expresses intuitive changes such as “bending”, “twisting”, and “thickening”. In this dissertation I develop a new three-dimensional medial model that allows continuous interpolation of the medial surface and provides a map back and forth between the boundary and its medial axis. It is the first such model to support branching, allowing the representation of a much wider class of objects than previously possible using continuous medial methods. A measure defined on the medial surface then allows one to write integrals over the boundary and the object interior in medial coordinates, enabling the expression of important object properties in an object-relative coordinate system. I show how these properties can be used to optimize correspondence during model construction. This improved correspondence reduces variability due to how the model is parameterized which could potentially mask a true shape change effect. Finally, I develop a method for performing global and local hypothesis testing between two groups of shapes. This method is capable of handling the nonlinear spaces the shapes live in and is well defined even in the high-dimension, low-sample size case. It naturally reduces to several well-known statistical tests in the linear and univariate cases.

Thall, Andrew L. (2004)

“Deformable Solid Modeling via Medial Sampling and Displacement Subdivision”
Under the direction of Stephen M. Pizer
Electronic copy available

Discrete m-reps use tolerance-based, sampled medial skeleta as an underlying framework for boundaries defined by displaced subdivision surfaces. They provide local and global shape information, combining the strengths of multiscale skeletal modeling with the multi-resolution, deformation and shading properties afforded by subdivision surfaces. Hierarchically linked medial figures allow figural, object-based deformation, and their stability of object representation with respect to boundary perturbation shows advantages of tolerance-based medial representations over Blum axis and Voronoi-based skeletal models. M-rep models provide new approaches to traditional computer graphics modeling, to physically based modeling and simulation, and to image-analysis, segmentation and display, by combining local and object-level deformability and by explicitly including object-scale, tolerance and hierarchical level-of-detail. Sampled medial representations combine the solid modeling capabilities of constructive solid geometry with the flexibility of traditional b-reps, to which they add multiscale medial and boundary deformations parameterized by an object-based coordinate system. This thesis research encompassed conceptual development on discrete m-reps and their implementation for MIDAG (the Medical Image Display and Analysis Group) at UNC-Chapel Hill. Prototype and application code was created to support the following: medial atoms in 3D that included a quaternion frame to establish a local coordinate system; data structures for medial mesh topologies; a new algorithm for interpolating Catmull-Clark subdivision surfaces for m-rep boundaries; a medially based coordinate system parameterizing the m-rep boundary, interior, and local exterior; displacement texturing and displacement meshing of m-rep boundaries; methods of medially based deformation; figure/subfigure blending by implicit surface methods or (with Qiong Han) using remeshing of subdivision boundaries; and C++ code libraries for m-rep modeling in geometric design and image-segmentation applications. Along with discussion of these achievements, this document also includes discussions of current m-rep applications and of design-methodology issues for m-rep-based modeling systems.

Thomas, Teresa A. (1988)

“The Semantics of an FP Language with Infinite Objects”
Under the direction of Donald F. Stanat
Department of Computer Science Technical Report TR88-022
Electronic copy available

We describe an extension of Backus’ FP (Functional Programming) languages to include infinite objects and discuss the semantic and mathematical issues surrounding the change. The extended languages, called SFP (Stream Functional Programming) Languages, satisfy the following desiderata:

  • The domain for FP is “embedded” in the new domain.
  • The new domain contains infinite objects, both those infinite in length and those infinitely nested.
  • The new language is not substantially different in syntax from Backus’ language.
  • The primitive functions of an FP language extend in an intuitively satisfying way to continuous functions on the new domain.
  • The functional style of FP programming is preserved.
  • The algebra of FP programs survives with few changes.

SFP differs from FP in that the domain for SFP contains infinite objects, approximations to complete objects, and top, used to denote an error. Approximations to complete objects include bottom and prefixes, where a prefix is a sequence that can be extended on the right. SFP uses a parallel outermost evaluation rule. Any monotonic function on the FP domain can be extended to a continuous function on the SFP domain. We describe a domain of objects, a collection of functions, and two semantics, one denotational and one operational, for SFP. We show that the two semantics define nearly the same language. The definitions of the primitive functions and functional forms, according to both the operational and denotational semantics, are given, as well as example programs.

Tolle, Donald M. (1981)

“Coordination of Computation in a Binary Tree of Processors: An Architectural Proposal”
Under the direction of Gyula A. Mago

A design is proposed for a cellular computer consisting of many processors, of two kinds, connected in the form of a binary tree. The design is inspired by and closely related to one recently presented by Mago. Both machines are intended for the highly parallel execution of the Formal Functional Programming (FFP) languages of Backus. These are high-level, general-purpose languages especially well suited for expressing parallel algorithms. Both machines, storage permitting, take advantage of all the parallelism expressed in a single program and can execute many programs simultaneously. Both machines, furthermore, decompose each primitive FFP operator into more elementary operations, many of which may be executable concurrently. In comparison to Mago’s machine, the computer described here offers the possibility of a higher degree of concurrency below the FFP language level, has more powerful basic operations, and often requires less storage for the same computation. The individual processors of the computer are more complex than Mago’s, but are still well suited for implementation in VLSI. The machine proposed here uses the syntactic structure of an FFP expression to guide the embedding of a syntactic network of nodes in the binary tree of machine cells. Execution of the FFP program is accomplished through operations performed by the embedded network of nodes. As new FFP expressions are produced during execution of the FFP program, corresponding new syntactic networks are automatically and dynamically embedded and used for further execution. A “Syntax Tree Language” (STL) for programming the embedded syntactic networks is specified in some detail. Several examples of the computational potential of the machine are presented, including primitive operators for sorting, for matrix multiplication, and for solving simultaneous linear equations.

Tuck, Russell (1990)

“Porta-SIMD: An Optimally Portable SIMD Programming Language” (Duke University)
Under the direction of Frederick P. Brooks, Jr.
(No UNC-Chapel Hill library copy)

Existing programming languages contain architectural assumptions which limit their portability. I submit optimal portability, a new concept which solves this language design problem. Optimal portability makes it possible to design languages which are portable across various sets of diverse architectures. SIMD (Single-Instruction stream, Multiple-Data stream) computers represent an important and very diverse set of architectures for which to demonstrate optimal portability. Porta-SIMD (pronounced “porta-simm’d”) is the first optimally portable language for SIMD computers. It was designed and implemented to demonstrate that optimal portability is a useful and achievable standard for language design. An optimally portable language allows each program to specify the architectural features it requires. The language then enables the compiled program to exploit exactly those features, and to run on all architectures that provide them. An architecture’s features are those it can implement with a constant-bounded number of operations. This definition optimal portability ensures reasonable execution efficiency, and identifies architectural differences relevant to algorithm selection. An optimally portable language for a set of architectures must accommodate all the features found in the members of that set. There was no suitable taxonomy to identify the features of SIMD architectures. Therefore, the taxonomy created and used in the design of Porta-SIMD is presented. Porta-SIMD is an optimally portable, full-featured, SIMD language. It provides dynamic allocation of parallel data with dynamically determined sizes. Generic subroutines which operate on any size of data may also be written in Porta-SIMD. Some important commercial SIMD languages do not provide these features. A prototype implementation of Porta-SIMD has been developed as a set of #include files and libraries used with an ordinary C++ compiler. This approach has allowed more rapid prototyping and language experimentation than a custom compiler would have, but modestly constrained the language’s syntax. The result is a very portable but only moderately efficient implementation. Porta-SIMD has been implemented for the Connection Machine 2, for Pixel-Planes 4 and 5, and for ordinary sequential machines. Optimal portability is an important new concept for developing portable languages which can handle architectural diversity. Porta-SIMD demonstrates its usefulness with SIMD computers.

Turk, Gregory (1992)

“Texturing Surfaces Using Reaction-Diffusion”
Under the direction of Henry Fuchs
Department of Computer Science Technical Report TR94-035
Electronic copy available

This dissertation introduces a new method of creating computer graphics textures that is based on simulating a biological model of pattern formation known as reaction-diffusion. Applied mathematicians and biologists have shown how simple reaction-diffusion systems can create patterns of spots or stripes. Here we demonstrate that the range of patterns created by reaction-diffusion can be greatly expanded by cascading two or more reaction-diffusion systems. Cascaded systems can produce such complex patterns as the clusters of spots found on leopards and the mixture of spots and stripes found on certain squirrels. This dissertation also presents a method for simulating reaction-diffusion systems directly on the surface of any given polygonal model. This is done by creating a mesh for simulation that is specifically fit to a particular model. Such a mesh is created by first randomly distributing points over the surface of the model and causing these points to repel one another so that they are evenly spaced over the surface. Then a mesh cell is created around each point by finding the Voronoi region for each point within a local planar approximation to the surface. Reaction-diffusion systems can then be simulated on this mesh of cells. The chemical concentrations resulting from such a simulation can be converted to color values to create a texture. Textures created by simulation on a mesh do not have the problems of texture distortion or seams between patches that are artifacts of some other methods of texturing. Two methods of rendering these synthetic textures are presented. The first method uses a new surface of triangles that closely matches the original model, but whose vertices are taken from the cells of the simulation mesh. These vertices are assigned colors based on the simulation values, and this re-tiled model can be rapidly displayed on a graphics workstation. A higher-quality image can be created by computing each pixel’s color value using a weighted average of the chemical concentration at nearby mesh points. Using a smooth cubic weighting function gives a satisfactory reconstruction of the underlying function specified by the values at the mesh points. Several low-pass filtered versions of the texture are used to avoid aliasing when the textured object covers a small portion of the screen. The proper color value at a pixel is found by selecting from the appropriately filtered level.


Vallidis, Nicholas Michael (2002)

“Design of a Scalable Available Bandwidth Information Infrastructure”
Under the direction of Gary Bishop
Electronic copy available

Tracking systems determine the position and/or orientation of a target object, and are used for many different purposes in various fields of work. My focus is tracking systems in virtual environments. While the primary use of tracking for virtual environments is to track the head position and orientation to set viewing parameters, another use is body tracking–the determination of the positions of the hands and feet of a user. The latter use is the goal for WHISPER. The largest problem faced by body-tracking systems is emitter/sensor occlusion. The great range of motion that human beings are capable of makes it nearly impossible to place emitter/sensor pairs such that there is always a clear line of sight between the two. Existing systems either ignore this issue, use an algorithmic approach to compensate (e.g., using motion prediction and kinematic constraints to “ride out” occlusions), or use a technology that does not suffer from occlusion problems (e.g., magnetic or mechanical tracking devices). WHISPER uses the final approach. In this dissertation I present WHISPER as a solution to the body-tracking problem. WHISPER is an acoustic tracking system that uses a wide bandwidth signal to take advantage of low frequency sound’s ability to diffract around objects. Previous acoustic systems suffered from low update rates and were not very robust of environmental noise. I apply spread spectrum concepts to acoustic tracking in order to overcome these problems and allow simultaneous tracking of multiple targets using Code Division Multiple Access. The fundamental approach is to recursively track the correlation between a transmitted and received version of a pseudo-random wide-band acoustic signal. The offset of the maximum correlation value corresponds to the delay, which corresponds to the distance between the microphone and speaker. Correlation is computationally expensive, but WHISPER reduces the computation necessary by restricting the delay search space using a Kalman filter to predict the current delay of the incoming pseudo-noise sequence. Further reductions in computation expense are accomplished by reusing results from previous iterations of the algorithm.

Varadhan, Gokul (2005)

“Accurate Sampling-Based Algorithms for Surface Extraction and Motion Planning”
Under the direction of Dinesh Manocha
Electronic copy available

Boolean operations, Minkowski sum evaluation, configuration space computation, and motion planning are fundamental problems in solid modeling and robotics. Their applications include computer-aided design, numerically-controlled machining, tolerance verification, packing, assembly planning, and dynamic simulation. Prior algorithms for solving these problems can be classified into exact and approximate approaches. The exact approaches are difficult to implement and are prone to robustness problems. Current approximate approaches may not solve these problems accurately. Our work aims to bridge this gap between exact and approximate approaches. We present a sampling-based approach to solve these geometric problems. Our approach relies on computing a volumetric grid in space using a sampling condition. If the grid satisfies the sampling condition, our algorithm can provide geometric and topological guarantees on the output. We classify the geometric problems into two classes. The first class includes surface extraction problems such as Boolean operations, Minkowski sum evaluation, and configuration space computation. We compute an approximate boundary of the final solid defined using these geometric operations. Our algorithm computes an approximation that is guaranteed to be topologically equivalent to the exact surface and bounds the approximation error using two-sided Hausdorff error. We demonstrate the performance of our approach for the following applications: Boolean operations on complex polyhedral models and low degree algebraic primitives, model simplification and remeshing of polygonal models, Minkowski sums and offsets of complex polyhedral models, and configuration space computation for low degrees of freedom objects. The second class of problems is motion planning of rigid or articulated robots translating or rotating among stationary obstacles. We present an algorithm for complete motion planning, i.e., finding a path if one exists and reporting a failure otherwise. Our algorithm performs deterministic sampling to compute a roadmap that captures the connectivity of free space. We demonstrate the performance of our algorithm on challenging environments with narrow passages and no collision-free paths.

Varshney, Amitabh (1994)

“Hierarchical Geometric Approximations”
Under the direction of Frederick P. Brooks, Jr.
Department of Computer Science Technical Report TR94-050
Electronic copy available

This dissertation explores some techniques for automatic approximation of geometric objects. My thesis is that using and extending concepts from computational geometry can help us in devising efficient and parallelizable algorithms for automatically constructing useful detail hierarchies for geometric objects. We have demonstrated this by developing new algorithms for two kinds of geometric approximation problems that have been motivated by a single driving problem–the efficient computation and display of smooth solvent-accessible molecular surfaces. The applications of these detail hierarchies are in biochemistry and computer graphics. The smooth solvent-accessible surface of a molecule is useful in studying the structure and interactions of proteins, in particular for attacking the protein-substrate docking problem. We have developed a parallel linear-time algorithm for computing molecular surfaces. Molecular surfaces are equivalent to the weighted alpha-hulls. Thus our work is potentially useful in the application areas of alpha-hulls which include astronomy and surface modeling, besides biochemistry. We have defined the concept of interface surfaces and developed efficient algorithms for computation of surfaces at the interface of two or more molecular units. Interface surfaces are useful for visualizing the inter and intra-molecular interfaces and for characterizing the fit, or complementarity, of molecular interfaces. We have developed an algorithm for simplification of polygonal meshes. The simplified polygonal mesh has the following properties: (a) every point on it is within a user-specifiable distance (epsilon) from the input mesh, (b) it is topologically consistent with the input mesh (i.e. both have the same genus), (c) its vertices are a subset of the vertices of the input mesh, and (d) it is within a computable factor in complexity (in terms of number of faces) of the optimal mesh that satisfies (a), (b), and (c) (computing the optimal mesh is known to be NP-hard). We have accomplished this by transforming our problem to the set-partitioning problem.

Vicory, Jared (2016)

“Learned Shape Deformation and Regional Texture-Based Appearance for TRUS Prostate Segmentation”
Under the direction of Stephen Pizer

Transferring identified regions of interest (ROIs) from planning-time MRI images to the trans-rectal ultrasound (TRUS) images used to guide prostate biopsy is difficult because of the large difference in appearance between the two modalities as well as the deformation of the prostate’s shape caused by the TRUS transducer.  This dissertation describes methods for addressing these difficulties by both estimating a patient’s prostate shape after the transducer is applied and then locating it in the TRUS image using skeletal models (s-reps) of prostate shapes.

First, I introduce a geometrically-based method for interpolating discretely sampled s-reps into continuous objects.  This interpolation is important for many tasks involving s-reps, including fitting them to new objects as well as the later applications described in this dissertation.  This method is shown to be accurate for ellipsoids where an analytical solution is known.

Next, I create a method for estimating a probability distribution on the difference between two shapes.  Because s-reps live in a high-dimensional curved space,  I use Principal Nested Spheres (PNS) to transform these representations to instead live in a flat space where standard techniques can be applied.  This method is shown effective both on synthetic data as well as for modeling the deformation caused by the TRUS transducer to the prostate.

In cases where appearance is described via a large number of parameters, such as intensity combined with multiple texture features, it is computationally beneficial to be able to turn these large tuples of descriptors into a scalar value.  Using the inherent localization properties of s-reps, I develop a method for using regionally-trained classifiers to turn appearance tuples into the probability that the appearance tuple in question came from inside the prostate boundary.  This method is shown to be able to accurately discern inside appearances from outside appearances over a large majority of the prostate boundary.

Finally, I combine these techniques into a deformable model-based segmentation framework to segment the prostate in TRUS.  By applying the learned mean deformation to a patient’s prostate and then deforming it so that voxels with high probability of coming from the prostate’s interior are also in the model’s interior, I am able to generate prostate segmentations which are comparable to state of the art methods.

Vittayakorn, Sirion (2016)

Visual Attribute Discovery and Analysis from Web-Data

Under the direction of Tamara L. Berg

Electronic copy available

Visual attributes or visual appearances are important for understanding object appearance or describing objects. For the object classification or recognition, the algorithm needs to infer the visual attributes of object to compare, categorize or recognize the objects. Importantly, visual attributes are crucial when facing with an unknown object. Although an unknown object cannot be identified, it can be described by the visual attributes. Since different object categories usually share some common attributes (e.g. many animals have four legs, a tail and fur), explicitly modeling attributes not only allows previously learnt attributes to be transferred to a novel category but also reduces the number of training samples for the new category which can be important when the number of training samples are limited.

The objective of this dissertation is to exploit the weakly-supervised data from the web to construct computational methodologies to discover and analyses visual attributes both across time and domain. The dissertation first presents an automatic approach to learn visual attributes from the open-world vocabulary on the Web using convolutional neural network. This dissertation tries to understand visual attributes in terms of perception inside deep neural networks by focusing on the analysis of neural activations to identify the degree of being visually perceptible, namely visualness of the attributes. Moreover, the approach exploits the layered structure of the deep model to determine the semantic depth of the attributes. 

Then the dissertation introduces a system to recognize and analyze the visual concepts across time or visual trends. The dissertation explores several deep neural networks for estimating when objects were made together with the analyses of the neural activations and their entropy to gain insights into the deep network. To utilize the dating historical object frameworks in the real-world applications, the dating frameworks are applied to analyze the influence of vintage fashion on runway collections as well as the influence of fashion on runway collections and on street fashion.

Finally, the dissertation introduces an approach to recognize and transfer the visual concepts across domain in the realistic manner. Given two input images from two different domains: 1) shopping image, and 2) scene image, the dissertation proposes the generative adversarial network for transferring the product pixels from the shopping image to the scene image such that: 1) the output image looks realistic and 2) the visual attributes of the product are preserved.


Walker II, John Q. (1991)

“Automated Analysis of Computer-Generated Software Usage Protocols: An Exploratory Study”
Under the direction of John B. Smith
Electronic copy available

Highly-interactive computer software can potentially help users think and work more effectively. To realize this potential, software developers should understand the cognitive processes involved in the tasks being performed and the ways users interact with the software to accomplish the tasks. Gathering data about software usage–called protocols–is costly in several ways, including preparing representative test scenarios, finding suitable subjects, training personnel to administer the tests and record the protocols, and collecting and coding the protocols. Similarly, analyzing protocols can be tedious, often done manually by skilled researchers. Because of their high costs, protocol studies frequently consist of a few subjects tested while performing synthetic tasks in an artificial setting. The value of these studies is limited both for software developers and researchers in human-computer interaction. This paper describes a method used to collect and analyze the protocols of a large number of subjects performing tasks in a naturalistic setting. An interactive computer program was developed as a testbed for this study. It contained an automatic tracker that unobtrusively collected protocol records of users’ interactions with the program. Users’ strategies in working with this program were modeled as a formal grammar, and a parser was devised, based on the grammar, to analyze protocol records produced by the program. Users’ behaviors and strategies of working with the program were examined and characterized, based upon the parsed protocol data. A graphical structure editor was created as the testbed for this study; it assists in expository writing, such as technical journal articles, with special emphasis on the exploratory and organizational phases of the writing process. This paper discusses lessons learned in devising the grammar to model users’ writing sessions with the editor, in building and refining the parser, and in analyzing the protocol records for 112 sessions collected from 29 subjects.

Wang, Jeremy (2013)

“Analysis and Visualization of Local Phylogenetic Structure within Species”
Under the direction of Leonard McMillan

While it is interesting to examine the evolutionary history and phylogenetic relationship between species, for example, in a sort of “tree of life”, there is also a great deal to be learned from examining population structure and relationships within species. A careful description of phylogenetic relationships within species provides insights into causes of phenotypic variation, including disease susceptibility. The better we are able to understand the patterns of genotypic variation within species, the better these populations may be used as models to identify causative variants and possible therapies, for example through targeted genome-wide association studies (GWAS). My thesis describes a model of local phylogenetic structure, how it can be effectively derived under various circumstances, and useful applications and visualizations of this model to aid genetic studies. I introduce a method for discovering phylogenetic structure among individuals of a population by partitioning the genome into a minimal set of intervals within which there is no evidence of recombination. I describe two extensions of this basic method. The first allows it to be applied to heterozygous, in addition to homozygous, genotypes and the second makes it more robust to errors in the source genotypes. I demonstrate the predictive power of my local phylogeny model using a novel method for genome-wide genotype imputation. This imputation method achieves very high accuracy–on the order of the accuracy rate in the sequencing technology–by imputing genotypes in regions of shared inheritance based on my local phylogenies. Comparative genomic analysis within species can be greatly aided by appropriate visualization and analysis tools. I developed a framework for web-based visualization and analysis of multiple individuals within a species, with my model of local phylogeny providing the underlying structure. I will describe the utility of these tools and the applications for which they have found widespread use.

Wang, Jih-Fang (1990)

“A Real-time Optical 6D Tracker for Head-mounted Display Systems”
Under the direction of Henry Fuchs
Electronic copy available

Significant advance has been made towards realistic synthesis and display of three-dimensional objects using computers during the past two decades. However, the interaction between human and computer-generated scenes remains largely remote through devices such as keyboards, mice, joysticks, etc. Head-mounted display provides a mechanism for much more realistic visualizing and interacting with computer-generated 3D scenes through the hand-eye-body coordination exercised daily. Head-mounted display systems require that the position and orientation of the user’s head be tracked in real time with high accuracy in a large working environment. Current 6D positional tracking devices (3 translational and 3 rotational parameters) fail to satisfy these requirements. In this dissertation, a new system for real-time, six-dimensional position tracking is introduced, studied, and documented. This system adopts an inside-out tracking paradigm. The working environment is a room in which the ceiling is lined with a regular pattern of infrared LED beacons which are flashing (invisible to the human eyes) under the system’s control. Three cameras are mounted on a helmet which the user wears. Each camera uses a lateral effect photodiode as the recording surface. The 2D image positions of the flashing beacons inside the field of view of the cameras are recorded and reported in real time. The measured 2D image positions and the known 3D positions of beacons are used to compute the position of the camera assembly in space. We have designed an iterative algorithm to estimate the 6D position of the camera assembly in space. This algorithm is a generalized version of the Church’s method, and allows for multiple cameras with nonconvergent nodal points. Several equations are formulated to predict the system’s error analytically, and the system’s error is quantitatively studied and compared with the theoretical prediction. The requirements of accuracy, speed, adequate working volume, light weight and small size of the tracker are addressed. A novel approach to calibrate the positions of beacons automatically, is also proposed. A prototype was designed and constructed to demonstrate the integration and coordination of all essential components in the new tracker. This prototype uses off-the-shelf components and can be easily duplicated. Our results indicate that the new system significantly outperforms other existing systems. The new tracker prototype provides about 25 updates per second, and registers 0.1-degree rotational movements and 2- millimeter translational movements. The future full system will have a working volume about 1,000 cubic feet (10 ft on each side), and will provide more than 200 updates per second with a lag of less than 5 milliseconds by running on a faster processor. We aim at developing a new tracking system which offers a large working volume, high accuracy in the estimated head position, fast update rate, and immunity to the electro-magnetic interference of the environment. Such a system should find applications in many 6D position-reporting and input tasks.

Wang, Ke (2018)

Towards Efficient 3D Reconstructions from High-Resolution Satellite Imagery

Under the direction of Jan-Michael Frahm

Electronic copy available

Recent years have witnessed the rapid growth of commercial imaging satellites. Compared with other imaging products, modern satellite images are captured at high resolution and with multiple spectral bands, thus provide unique viewing angles, global coverage, and frequent updates of the Earth surfaces. With automated processing and intelligent analysis algorithms, satellite images enable global-scale 3D and semantic modeling applications.

                This thesis explores uncovering 3D reconstructions from satellite images at different levels: geometric, semantic, and parametric models. However, reconstructing satellite imagery is particularly challenging for the following reasons: 1) Satellite images typically contain an enormous amount of raw pixels. Efficient algorithms are needed to avoid the substantial computational overhead. 2) the ground sampling distances of satellite images are rather low. Objects, such as buildings, appear visually small and cluttered, thus posing challenges for 3D modeling. 3) satellite images usually have complicated camera models and inaccurate camera calibrations. Rational polynomial coefficients (RPC) camera models, although widely used, need to be appropriately handled to enable high-quality reconstructions.

                To obtain geometric reconstructions efficiently, we propose an edge-aware interpolation based algorithm to compute point clouds from satellite image pairs. Initial 2D pixel matches are filtered and triangulated to compensate the RPC calibration errors. Noisy dense correspondences can then be estimated by interpolating the inlier matches in an edge-aware manner. After refining the correspondence map with a fast bilateral solver, we can obtain dense 3D point clouds via triangulation.

             Pixel-wise semantic maps of satellite images are usually noisy. Thus, we propose to aggregate multiple corresponding observations of the same pixel to obtain high-quality semantic models. Instead of just leveraging geometric reconstructions to provide such correspondences, we perform geometric modeling and semantic reasoning in a joint Markov Random Field. Our experiments show that both tasks can benefit from the joint inference.

             Last but not least, we propose a novel deep learning based approach to perform single-view parametric reconstructions from satellite imagery. By parametrizing buildings as 3D cuboids, our method simultaneously localizes building instances visible in the image and estimates their corresponding cuboid models. Utilizing vectorized GIS maps and Lidar as supervision, we upsample CNN features to detect small but cluttered building instances. Besides, we estimate building contours through a separate network to avoid overlapping building cuboids.

Wang, Qian (2013)

“Registration in Large-Scale Population of Brain MR Images”
Under the direction of Dinggang Shen

Non-rigid image registration is fundamentally important in analyzing large-scale population of medical images, in particular, T1-weighted brain MR data. Conventional pairwise registration methods involve only two images, as the moving subject image is deformed towards the space of the template for the maximization of their in-between similarity. The population information, however, is mostly ignored, while individual images in the population are independently registered with the arbitrarily designated template. On the contrary, this dissertation investigates the contributions from the entire population to image registration.

First, the population provides guidance to the pairwise registration of a certain subject and the template. That is, the subject shares similar deformation field with another intermediate image in the population, if the subject and the intermediate image are similar in appearances. The guidance of the intermediate image  properly selected from the population is thus beneficial to the subject, in that the deformation of the intermediate image initiates the optimization of the subject deformation field.

Second, all images in the population can be registered towards the common space of the population via the groupwise manner. Groupwise registration differs from the traditional pairwise design since no template is pre-determined. Instead, all images agglomerate to the common space of the population simultaneously. Moreover, the common space is revealed spontaneously during image, without introducing any unwelcomed bias towards following analyses and applications.

This dissertation shows that population information contributes to both pairwise registration and groupwise registration. In particular, to utilize the guidance from the intermediate images in the population, the pairwise registration of a specific subject is more robust and accurate compared to the direct registration of the subject under consideration with the template. Meanwhile, in terms of groupwise registration, all images in the population are aligned more accurately in the common space, even though the complexity of the problem increases substantially.

Wang, Weibo (2015)

“Detect Copy Number Variations from Read-Depth of High-Throughput Sequencing Data”
Under the direction of Wei Wang

Copy-number variation (CNV) is a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize CNVs. High-throughput sequencing (HTS) technologies promise to revolutionize CNV detection but present substantial analytic challenges. This dissertation investigates improving the CNV detection using HTS data mainly from the following aspects.

  • It is observed that various sources of experimental biases in HTS confound read-depth estimation, and bias correction has not been adequately addressed by existing methods. This dissertation presents a novel read-depth-based method, GENSENG, which uses a Hidden Markov Model (HMM) and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS datasets, it is concluded that GENSENG outperforms peer read-depth-based CNV-detection algorithms.
  • It is conceivable that allele-specific reads from HTS data could be leveraged to both enhance CNV detection as well as produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. This dissertation presents an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, extensive experiments have been conducted including simulations, generated empirical data using existing WGS and WES datasets and validated predicted CNVs using an independent methodology. It is concluded that AS-GENSENG not only predicts accurate ASCN calls, but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data.
  • GENSENG employs a negative binomial regression, which is a special case of generalized linear model (GLM) to account for confounding factors such as mappability or GC content. In such a GLM, the sample size is the number of windows spanning the whole genome. The huge number of genomic windows leads to high computational cost. It is proposed in this dissertation an efficient randomized algorithm to infer regression coefficients of GLM in GENSENG, which is called R-GENSENG. Instead of using all the genomic windows, R-GENSENG uses weighed sampling of a subset of genomic windows, and thus it is computationally much more efficient than GENSENG. In addition, R-GENSENG provides accurate estimate of CNVs, and more efficient estimates than randomly sampling a subset of genomic windows.


Wang, Xueyi (2008)

“Exploring RNA and Protein 3D structures By Geometric Algorithms”
Under the direction of Jack Snoeyink
Electronic copy available

Many problems in RNA and protein structures are related with their specific geometric properties. I investigate these geometric properties and explore three different ways that geometric algorithms can help to the study of the structures. *Determine accurate structures.* Accurate details in RNA structures are important for understanding RNA function, but most existing RNA backbone conformations show serious steric clashes. I developed a program RNABC using forward kinematics and conjugate gradient methods that searches for alternative clash-free conformations with acceptable geometry. Two tests show that RNABC improves backbone conformations for most problem suites in S-motifs and for many of the worst problem suites identified by the Richardson lab. *Display structure commonalities.* Structure alignment commonly uses root mean squared distance (RMSD) to measure structure similarity. I extend RMSD to weighted RMSD for multiple structures and show that using wRMSD with multiplicative weights implies the average is a consensus structure. I develop a near-linear iterative algorithm to converge to a local minimum of wRMSD and a heuristic algorithm to reduce the effect of outliers and find structurally conserved regions. *Distinguish local structural features.* Identifying common motifs is one way to further our understanding of the structure and function of molecules. I apply a graph database mining technique to identify RNA tertiary motifs. Tests show that this method can identify most known RNA tertiary motifs in these families and suggest candidates for novel tertiary motifs.

Ward, Bryan (2016)

Sharing Non-Processor Resources in Multiprocessor Real-Time Systems

Under the direction of Jim Anderson

Electronic copy available

Computing devices are increasingly being leveraged in cyber-physical systems, in which computing devices sense, control, and interact with the physical world. Associated with many such real-world interactions are strict timing constraints, which if unsatisfied, can lead to catastrophic consequences. Modern examples of such timing constraints are prevalent in automotive systems, such as airbag controllers, anti-lock brakes, and new autonomous features. In all of these examples, a failure to correctly respond to an event in a timely fashion could lead to a crash, damage, injury and even loss of life. Systems with imperative timing constraints are called real-time systems, and are broadly the subject of this dissertation.

Much previous work on real-time systems and scheduling theory assumes that computing tasks are \emph{independent}, \ie, the only resource they share is the platform upon which they are executed. In practice, however, tasks share many resources, ranging from more overt resources such as shared memory objects, to less overt ones, including data buses and other hardware and I/O devices. Accesses to some such resources must be synchronized to ensure safety, i.e., logical correctness, while other resources may exhibit better run-time performance if accesses are explicitly synchronized. The goal of this dissertation is to develop new synchronization algorithms and associated analysis techniques that can be used to synchronize access to many classes of resources, while improving the overall resource utilization, specifically as measured by real-time schedulability.

Towards that goal, the Real-Time Nested Locking Protocol (RNLP), the first multiprocessor real-time locking protocol that supports lock nesting or fine-grained locking is proposed and analyzed. Furthermore, the RNLP is extended to support reader/writer locking, as well as k-exclusion locking. All presented RNLP variants are proven optimal. Furthermore, experimental results demonstrate the schedulability-related benefits of the RNLP.

Additionally, three new synchronization algorithms are presented, which are specifically motivated by the need to manage shared hardware resources to improve real-time predictability. Furthermore, two new classes of shared resources are defined, and the first synchronization algorithms for them are proposed. To analyze these new algorithms, a novel new analysis technique called idleness analysis is presented, which can be used to incorporate the effects of blocking into schedulability analysis.

Ward, Kelly Anne (2005)

“Modeling Hair Using Levels-of-Detail”
Under the direction of Ming C. Lin

Modeling hair is important for creating realistic virtual humans in various applications. A human head typically contains over 100,000 strands of hair, each strand being extremely thin and thereby generating an intricate hair volume. Due to its complex nature, hair simulation, including the reproduction of interactions both among the hair strands and between the hair and the avatar, is computationally overwhelming. The rendering of hair is similarly challenging, particularly as a result of the shadows caused by hair self-occlusions. Consequently, many interactive applications in practice today are forced to overlook several complex features of hair in order to attain a desired performance. By simplifying the hair volume, these applications often compromise the visual quality of the hair. Moreover, they typically contain a considerable amount of unnecessary computation allocated towards strands of hair that have minimal significance to such applications. In this thesis, I introduce the use of levels of detail for modeling hair. Levels of detail enable a simulation to allocate the majority of computational resources towards modeling those strands of hair with the most significance to the application at hand. The methods I discuss are based on the use of three discrete hair representations: strips, clusters and strands. Each representation provides a different level of visual fidelity and performance speed for simulating and rendering hair. The visibility, motion and viewing distance of the hair, as well as the user’s interaction, are considered in identifying those groups of hair with the greatest significance to the application. The techniques I present then accelerate the simulation and rendering of the hair strands with the lowest importance to the application, thereby accelerating the overall modeling of the hair. Moreover, in this dissertation I offer several techniques for dynamically changing the physical structure, behavior and appearance of hair as water or styling products are applied to it. These styling methods are then coupled with the level of detail framework to allow users to interactively style virtual hair. Wei, Lei (2013) “Privacy-Preserving Regular-Expression Evaluation on Encrypted Data”
Under the direction of Mike Reiter Motivated by the need to outsource file storage to untrusted clouds while still permitting controlled use of that data by authorized third parties, in this dissertation we present a family of protocols by which a client can evaluate a regular expression on an encrypted file stored at a server (the cloud), once authorized to do so by the file owner. We present a protocol that provably protects the privacy of the regular expression and the file contents from a malicious server and the privacy of the file contents (except for the evaluation result) from an honest-but-curious client. We then extend this protocol in two primary directions. In one direction, we develop a strengthened protocol that enables the client to detect any misbehavior of the server; in particular, the client can verify that the result of its regular-expression evaluation is based on the authentic file stored there by the data owner, and in this sense the file and evaluation result are authenticated to the client. The second direction in which we extend our initial protocol is motivated by the vast adoption of resource-constrained mobile devices, and the fact that our protocols involve relatively intensive client-server interaction and computation on the searching client. We therefore investigate an alternative in which the client (e.g., via her mobile device) can submit her encrypted regular expression to a partially trusted proxy, which then interacts with the server hosting the encrypted data and reports the encrypted evaluation result to the client. Neither the search query nor the result is revealed to an honest-but-curious proxy or malicious server during the process. We demonstrate the practicality of the protocol by prototyping a system to perform regular-expression searches on encrypted emails and evaluate its performance using a real-world email dataset.

Weigle, Christopher Charles (2006)

“Displays for Exploration and Comparison of Nested or Intersecting Surfaces”
Under the direction of Russell M. Taylor II
Department of Computer Science Technical Report TR06-026
Electronic copy available

The surfaces of real-world objects almost never intersect, so the human visual system is ill prepared to deal with this rare case. However, the comparison of two similar models or approximations of the same surface can require simultaneous estimation of individual global shape, estimation of point or feature correspondences, and local comparisons of shape and distance between the two surfaces. A key supposition of this work is that these relationships between intersecting surfaces, especially the local relationships, are best understood when the surfaces are displayed such that they do intersect. For instance, the relationships between radiation iso-dose levels and healthy and tumorous tissue is best studied in context with all intersections clearly shown. This dissertation presents new visualization techniques for general layered surfaces, and intersecting surfaces in particular, designed for scientists with problems that require such display. The techniques are enabled by a union/intersection refactoring of intersecting surfaces that converts them into nested surfaces, which are more easily treated for visualization. The techniques are aimed at exploratory visualization, where accurate performance of a variety of tasks is desirable, not just the best technique for one particular task. User studies, utilizing tasks selected based on interviews with scientists, are used to evaluate the effectiveness of the new techniques, and to compare them to some existing, common techniques. The studies show that participants performed the user study tasks more accurately with the new techniques than with the existing techniques.

Weigle, Michele Aylene Clark (2003)

“Investigating the Use of Synchronized Clocks in TCP Congestion Control”
Under the direction of Kevin Jeffay
Electronic copy available

In TCP Reno, the most common implementation of TCP, segment loss is the sole indicator of network congestion. TCP Reno only adjusts to congestion when segment loss has been detected in the network, thus, TCP Reno’s congestion control is tied to its error recovery mechanism. My dissertation thesis is that precise knowledge of one-way transit times (OTTs) can be used to improve the performance of TCP congestion control. Performance is measured in terms of network-level metrics, including packet loss and average queue sizes at congested links, and in terms of application-level metrics, including HTTP response times and throughput per HTTP response. A connection’s forward path OTT is the amount of time it takes a packet to traverse all links from the sender to the receiver, including both propagation and queuing delays. Queues in routers build up before they overflow, resulting in increased OTTs. If all senders directly measure changes in OTTs and back off when the OTT indicates that congestion is occurring, congestion could be alleviated. I introduce a variant of TCP, called Sync-TCP, which uses synchronized clocks to gather a connection’s OTT data. I use Sync-TCP as a platform for investigating techniques for detecting and responding to changes in OTTs. This dissertation makes the following contributions:

  • a method for measuring a flow’s OTT and returning this exact timing information to the sender
  • comparison of several methods for using OTTs to detect congestion
  • Sync-TCP — a family of end-to-end congestion control mechanisms based on using OTTs for congestion detection
  • study of standards-track TCP congestion control and error recovery mechanisms in the context of HTTP traffic

I will show that Sync-TCP provides lower packet loss, lower queue sizes, lower HTTP response times, and higher throughput per HTTP response than TCP Reno. Additionally, I will show that Sync-TCP offers performance comparable to that achieved by using router-based congestion control mechanisms. If all flows use Sync-TCP to react to increases in queuing delay, congestion could be alleviated quickly. This would result in overall shorter queues, faster response for interactive applications, and a more efficient use of network resources.

Wei, Lei (2013)

“Privacy-Preserving Regular-Expression Evaluation on Encrypted Data”
Under the direction of Mike Reiter

Motivated by the need to outsource file storage to untrusted clouds while still permitting controlled use of that data by authorized third parties, in this dissertation we present a family of protocols by which a client can evaluate a regular expression on an encrypted file stored at a server (the cloud), once authorized to do so by the file owner. We present a protocol that provably protects the privacy of the regular expression and the file contents from a malicious server and the privacy of the file contents (except for the evaluation result) from an honest-but-curious       client.  We then extend this protocol in two primary directions.  In one direction, we develop a strengthened protocol that enables the client to detect any misbehavior of the server; in particular, the client can verify that the result of its regular-expression evaluation is based on the authentic file stored there by the data owner, and in this sense the file and evaluation result are authenticated to the client.

The second direction in which we extend our initial protocol is motivated by the vast adoption of resource-constrained mobile devices, and the fact that our protocols involve relatively intensive client-server interaction and computation on the searching client.  We therefore investigate an alternative in which the client (e.g., via her mobile device) can submit her encrypted regular expression to a partially trusted proxy, which then interacts with the server hosting the encrypted data and reports the encrypted evaluation result to the client. Neither the search query nor the result is revealed to an honest-but-curious proxy or malicious server during the process. We demonstrate the practicality of the protocol by prototyping a system to perform regular- expression searches on encrypted emails and evaluate its performance using a real-world email dataset.

Welch, Gregory F. (1997)

“SCAAT: Incremental Tracking with Incomplete Information”
Under the direction of Gary Bishop
Department of Computer Science Technical Report TR96-051
Electronic copy available

The Kalman filter provides a powerful mathematical framework within which ax minimum mean-square-error estimate of a user’s position and orientation can be tracked using a sequence of single sensor observations, as opposed to groups of observations. We refer to this new approach as single-constraint-at-a-time or SCAAT tracking. The method improves accuracy by properly assimilating sequential observations, filtering sensor measurements, and by concurrently autocalibrating mechanical or electrical devices. The method facilitates user motion prediction, multisensor data fusion, and in systems where the observations are only available sequentially it provides estimates at a higher rate and with lower latency than a multiple-constraint approach. Improved accuracy is realized primarily for three reasons. First, the method avoids mathematically treating truly sequential observations as if they were simultaneous. Second, because each estimate is based on the observation of an individual device, perceived error (statistically unusual estimates) can be more directly attributed to the corresponding device. This can be used for concurrent autocalibration which can be elegantly incorporated into the existing Kalman filter. Third, the Kalman filter inherently addresses the effects of noisy device measurements. Beyond accuracy, the method nicely facilitates motion prediction because the Kalman filter already incorporates a model of the user’s dynamics, and because it provides smoothed estimates of the user state, including potentially unmeasured elements. Finally, in systems where the observations are only available sequentially, the method can be used to weave together information from individual devices in a very flexible manner, producing a new estimate as soon as each individual observation becomes available, thus facilitating multisensor data fusion and improving the estimate rates and latencies. The most significant aspect of this work is the introduction and exploration of the SCAAT approach to 3D tracking for virtual environments. However I also believe that this work may prove to be of interest to the larger scientific and engineering community in addressing a more general class of tracking and estimation problems.

Welch, Joshua (2017)

Computational Methods for Inferring Transcriptome Dynamics

Under the direction of Jan Prins

Electronic copy available

The sequencing of the human genome paved the way for a new type of medicine, in which a molecular-level, cell-by-cell understanding of the genomic control system informs diagnosis and treatment. A key experimental approach for achieving such understanding is measuring gene expression dynamics across a range of cell types and biological conditions. The raw outputs of these experiments are millions of short DNA sequences, and computational methods are required to draw scientific conclusions from such experimental data.

In this dissertation, I present computational methods to address some of the challenges involved in inferring dynamic transcriptome changes. My work focuses two types of challenges: (1) discovering important biological variation within a population of single cells and (2) robustly extracting information from sequencing reads.

Three of the methods are designed to identify biologically relevant differences among a heterogenous mixture of cells. SingleSplice uses a statistical model to detect true biological variation in alternative splicing within a population of single cells. SLICER elucidates transcriptome changes during a sequential biological process by positing the process as a nonlinear manifold embedded in high-dimensional gene expression space. MATCHER uses manifold alignment to infer what multiple types of single cell measurements obtained from different individual cells would look like if they were performed simultaneously on the same cell. These methods gave insight into several important biological systems, including embryonic stem cells and cardiac fibroblasts undergoing reprogramming.

To enable study of the pseudogene ceRNA effect, I developed a computational method for robustly computing pseudogene expression levels in the presence of high sequence similarity that confounds sequencing read alignment. AppEnD, an algorithm for detecting untemplated additions, allowed the study of transcript modifications during RNA degradation.

Welsh, Catherine Elizabeth (2014)

“Computational Tools to Aid the Design and Development of a Genetic Reference Population”
Under the direction of Leonard McMillan

Model organisms are important tools used in biological and medical research. A key component of a genetics model organism is a known and reproducible genome. In the early 1900s, geneticists developed methods for fixing genomes by inbreeding. First generation genetic models used inbreeding to create disease models from animals with spontaneous or stimulated mutations.

Recently, geneticists have begun to develop a second generation of models which better represent the human population in terms of diversity. One such model is the Collaborative Cross (CC), which is a mouse model derived from 8 founders. I have been involved in developing the CC since its early stages. In particular, I am interested in speeding up the inbreeding process, since it currently takes an average of thirty-six generations to achieve complete fixation.

To speed up the inbreeding process, I developed a simulator that replicates the breeding process and tested various breeding strategies before applying them to a CC. To apply the simulation techniques to live mice, a fast, low-cost way to monitor their genomes at each generation was needed. As a result, two genotyping arrays were designed, a first generation array with 7,851 markers called MUGA and a second generation array called MegaMUGA with 77,800 markers. Both arrays were designed specifically to be maximally informative for the CC population. Using these genotyping arrays, one can determine from which of the eight CC founders each part of a developing mouse lines genome is inherited. I refer to these as haplotype reconstructions, and they are used as the input into my simulations as well as various other monitoring tools. To determine the accuracy of these haplotype reconstructions, I used DNA sequencing data for three samples which were also genotyped, and compared the haplotype reconstructions from the DNA sequencing data to solutions from the genotyping array data.


Wendt, Jeremy (2010)

“Real-Walking Models Improve Walking-In-Place Systems”
Under the direction of Dr. Fred P. Brooks
Electronic copy available

Many Virtual Environment (VE) systems require a walking interface to travel through the virtual world. Real Walking the preferred interface is only feasible if the virtual world is smaller than the real-world tracked space. When the to-be-explored virtual world is larger than the real-world tracked space, Walking-In-Place (WIP) systems are frequently used: The user’s in-place steps tell the WIP system how to move the virtual viewpoint through the virtual world. When the system-generated forward motions do not match the user’s intended motions, the user becomes frustrated and the VE experience degrades. This dissertation presents two Real-Walking-based models that enable WIP systems to generate walking-like speeds. The ?rst (GUD WIP) calculates in-place step frequency even when only a portion of a step has completed. The second (The Forward Walking Model) measures a users step-frequency-to-walk-speed function in real-time from head-track data alone. The two models combined enable per-user-calibrated Real-Walking-like speeds from in-place gestures. This dissertation also presents two user studies that demonstrate that WIP systems which employ these models are better than a previous speed-focused WIP system.

Westover, Lee A. (1991)

“SPLATTING: A Parallel, Feed-Forward Volume Rendering Algorithm”
Under the direction of Turner Whitted
Department of Computer Science Technical Report TR91-029
Electronic copy available

Volume rendering is the generation of images from discrete samples of volume data. The volume data is sampled in at least three dimensions and comes in three basic classes: the rectilinear mesh–for example, a stack of computed tomography scans; the curvilinear mesh–for example, computational fluid dynamic data sets of the flow of air over an airplane wing; and the unstructured mesh–for example, a collection of ozone density readings at multiple elevations from a set of collection stations in the United States. Previous methods coerced the volumetric data into line and surface primitives that were viewed on conventional computer graphics displays. This coercion press has two fundamental flaws: viewers are never sure whether they are viewing a feature of the data or an artifact of the coercion process; and the insertion of a geometric modeling procedure into the middle of the display pipeline hampers interactive viewing. New direct rendering approaches that operate on the original data are replacing coercion approaches. These new methods, which avoid the artifacts introduced by conventional graphics primitives, fall into two basic categories: feed-backward methods that attempt to map the image plane onto the data, and feed-forward methods that attempt to map each volume element onto the image plane. This thesis presents a feed-forward algorithm, called splatting, that directly renders rectilinear volume meshes. The method achieves interactive speed through parallel execution, successive refinement, table-driven shading, and table-driven filtering. The method achieves high image quality by paying careful attention to signal processing principles during the process of reconstructing a continuous volume from the sampled input. This thesis’ major contribution to computer graphics is the splatting algorithm. It is a naturally parallel algorithm that adheres well to the requirements imposed by signal processing theory. The algorithm has uncommon features. First, it can render volumes as either clouds or surfaces by changing the shading functions. Second, it can smoothly trade rendering time for image quality at several stages of the rendering pipeline. In addition this thesis presents a theoretical framework for volume rendering.

Whitaker, Ross T. (1993)

“Geometry-Limited Diffusion”
Under the direction of Stephen M. Pizer
Department of Computer Science Technical Report TR94-037
Electronic copy available

This paper addresses the problem of image segmentation and suggests that a number of interesting visual features can be specified as the boundaries of segments. It proposes a segmentation strategy that utilizes the local homogeneity of local image structure. Stable differential measurements are made via a variable-conductance diffusion process, sometimes called anisotropic diffusion. Anisotropic diffusion has been shown to preserve important image structure while reducing unwanted noise. A multi-scale approach to variable-conductance diffusion is described. This technique combines information at a range of scales and provides a means of obtaining precise boundaries of “large-scale” objects in the presence of noise. These ideas generalize to incorporate multi-valued images and higher-order geometric structure. The result is a framework for constructing image segmentations based on the homogeneity of a set of descriptors. When applied to local image geometry, this method offers a means of breaking images into geometric patches that have interesting visual features as boundaries. Several examples of this framework are presented. While the diffusion of intensity provides a means of detecting edges, the diffusion of first-order information produces ridges and valleys. A combination of first-and second-order information provides a means of detecting the “middles” as well as the edges of objects. A proof of the geometric invariance under orthogonal group transformations is given. These same methods generalize to include systems of frequency-sensitive diffusion processes. These systems are useful for characterizing patches based on texture. An empirical investigation of these nonlinear diffusion processes shows they are stable in the presence of noise and offer distinct advantages over conventional linear smoothing. This analysis derives from a novel method for evaluating low-level computer-vision algorithms. The method uses ensembles of stochastically generated images to compute a set of statistical metrics which quantify the accuracy and reliability of a process. This evaluation offers a means of comparing different algorithms and exploring the space of free parameters associated with a single algorithm.

White, Andrew (2015)

“Practical Analaysis of Encrypted Network Traffic”
Under the direction of Fabian Monrose

The growing use of encryption in network communications is an undoubted boon for user privacy. However, the limitations of real-world encryption schemes are still not well understood, and new side- channel attacks against encrypted communications are disclosed every year. Furthermore, encrypted network communications, by preventing inspection of packet contents, represent a significant challenge from a network security perspective: our existing infrastructure relies on such inspection for threat detection. Both problems are exacerbated by the increasing prevalence of encrypted traffic: recent estimates suggest that 65\% or more of downstream Internet traffic will be encrypted by the end of 2016. This work addresses these problems by expanding our understanding of the properties and characteristics of encrypted network traffic and exploring new, specialized techniques for the handling of encrypted traffic by network monitoring systems.

We first demonstrate that opaque traffic, of which encrypted traffic is a subset, can be identified in real- time and how this ability can be leveraged to improve the capabilities of existing IDS systems. To do so, we evaluate and compare multiple methods for rapid identification of opaque packets, ultimately pinpointing a simple hypothesis test (which can be implemented on an FPGA) as an efficient and effective detector of such traffic. In our experiments, using this technique to “winnow”, or filter, opaque packets from the traffic load presented to an IDS system significantly increased the throughput of the system, allowing the identification of many more potential threats than the same system without winnowing.

Second, we show that side channels in encrypted VoIP traffic enable the reconstruction of approximate transcripts of conversations. Our approach leverages techniques from linguistics, machine learning, natural language processing, and machine translation to accomplish this task despite the limited information leaked by such side channels. Our ability to do so underscores both the potential threat to user privacy which such side channels represent and the degree to which this threat has been underestimated.

Finally, we propose and demonstrate the effectiveness of a new paradigm for identifying HTTP resources retrieved over encrypted connections. Our experiments demonstrate how the predominant paradigm from prior work fails to accurately represent real-world situations and how our proposed approach offers significant advantages, including the ability to infer partial information, in comparison. We believe these results represent both an enhanced threat to user privacy and an opportunity for network monitors and analysts to improve their own capabilities with respect to encrypted traffic.

David Wilkie (2015)

“Simulating, Reconstructing, & Routing Metropolis-Scale Traffic”
Under the direction of Ming Lin

Few phenomena are more ubiquitous than traffic, and few are more significant economically, socially, or environmentally. The vast, world-spanning road network enables the daily commutes of billions of people and makes us mobile in a way our ancestors would have envied. And yet, few systems perform so poorly so often. Gridlock and traffic jams cost 2.9 billion gallons of wasted fuel and costs over 121 billion dollars every year in the U.S. alone.

One promising approach to improving the reliability and efficiency of traffic systems is to incorporate computational techniques fully into the system, transforming the traffic systems of today into cyber-physical systems. However, creating a truly cyber-physical traffic system will require overcoming many substantial challenges. The state of traffic at any given time is unknown for the majority of the road network. The dynamics of traffic are complex, noisy, and dependent on drivers’ decisions. The domain of the system, the real-world road network, has no suitable representation for high-detail simulation. And there is no known solution for improving the efficiency and reliability of the system. In this dissertation, I propose techniques that combine simulation and data-driven techniques to solve these challenges and enable detailed state estimation, forward simulation, and optimization.

First, I propose an efficient method for creating a road network representation from a GIS database to construct a geometrically and topologically consistent 3D model to be used in high-detail, real-time traffic simulation, interactive visualization, and autonomous vehicle navigation.  The resulting model representation provides important road features for traffic simulations, including ramps, highways, overpasses, legal merge zones, and intersections with arbitrary states, and it is independent of the simulation methodologies.

Second, to estimate and communicate traffic conditions, I propose a fast technique to reconstruct traffic flows from in-road sensor measurements or user-specified data for interactive 3D visualization and communication applications. My algorithm estimates the full state of the traffic flow from sparse sensor measurements using a statistical inference method and a continuum traffic model. This estimated state then drives an agent-based traffic simulator to produce a 3D animation of vehicle traffic that statistically matches the original traffic conditions.

Third, to improve the efficiency of the traffic system, I propose a novel approach that takes advantage of mobile devices, such as cellular phones or embedded systems in cars, to form an interactive, participatory network of vehicles that plan their travel routes based on the current traffic conditions and existing routes planned by the network of participants, thereby making a more informed travel decision for each participating user. The premise of this approach is that a route, or plan, for a vehicle is also a prediction of where the car will travel. If routes are created for a sizable percentage of the total vehicle population, an estimate for the overall traffic pattern is attainable. Taking planned routes into account as predictions allows the entire traffic route planning system to better distribute vehicles and minimize traffic congestion.

For each of these problems, my work is motivated by the idea of fully integrating traffic simulation, as a model for the complex dynamics of real world traffic, with emerging data sources, including real-time sensor and public survey data.

Williams Jr., E. Hollins (1981)

“Analysis of FFP Programs for Parallel Associative Searching”
Under the direction of Donald F. Stanat

The formal functional programming (FFP) languages proposed by Backus are capable of expressing unbounded parallelism. Mago has designed a cellular computer that can efficiently execute FFP programs and, within limits of machine size, accommodate the unbounded parallelism. In this work, several analysis techniques are presented and used to predict the execution time and storage requirements of FFP associative searching algorithms on the Mago machine. Both decomposable searching and closest point problems are investigated. Brute force, semi-parallel, and cell methods are presented for solving the decomposable searching problems. If the initial program expression is suitably placed in memory, analysis of the brute force algorithms yields complexity results of C(k) time and O(kn) space. These bounds are shown to be asymptotically optimal with respect to the problem and the machine. Brute force, semi-parallel, and divide-and-conquer solutions are presented for solving the closest point problems. Analyses of the semi- parallel algorithms yield complexity results of O(kn) time and space, which are shown to be asymptotically optimal. Estimates of execution times of fast associative searching algorithms on a hypothetical sequential machine are compared to estimates of the execution times for the same problem on the Mago machine. The results indicate that the Mago machine will perform faster on files of moderate to large size. Suggestions are given for new operators that would reduce the execution time and storage requirements of FFP programs.

Willams, Thomas V. (1982)

“A Man-Machine Interface for Interpreting Electron Density Maps”
Under the direction of Frederick P. Brooks, Jr.

I have designed and implemented a tool for biochemists to use for interpreting electron density maps of crystallized proteins. My work has been concentrated on the representation for electron density maps and on the man-machine interface. Interpreting an electron density map is a difficult pattern recognition problem requiring specialized knowledge of protein structure and global views of the map. The tool is an interactive graphics system in which the human makes strategy decisions and does global pattern recognition, naming and pointing to recognized features in the map. These features belong to a hierarchy of objects natural to the problem. The computer does local, anchored pattern recognition for indicated features, displays the map in a ridge line representation using color to encode the map’s interpretation, and automatically builds a molecular model as the user identifies residues. A ridge line representation for maps was chosen because of the close correspondence of ridge lines to stick-figure models of molecules, because of the relatively few line segments required to display them, and because of the ease with which the density threshold can be changed in real time. Three different sets of people have interpreted electron density maps using this tool. A computer scientist (myself) interpreted a 2.5 A map of Staphyloccal nuclease in 26 hours. This was the first map I had ever interpreted. A highly-skilled professional biochemist interpreted a 3.0 A map of cytochrome b5 in 9 hours. A group of three biochemistry graduate students and post-doctoral fellows interpreted a 2.8 A map of cytochrome c550 in 22 hours. These three successful interpretations done in such relatively short times have shown that the system design is a good and useful one for this application. The contributions of this work to computer science are 1) a documented example of a good man-machine interface, 2) a detailed discussion of the major design decisions that I made, and 3) a demonstration of the usefulness of a ridge line representation for a scalar function of three variables.

Wilson, Andrew Thomas (2002)

“Spatially Encoded Image-Space Simplifications for Interactive Walkthrough”
Under the direction of Dinesh Manocha
Electronic copy available

Many interesting geometric environments contain more primitives than standard rendering techniques can handle at interactive rates. Sample-based rendering acceleration methods such as the use of impostors for distant geometry can be considered simplification techniques in that they replace primitives with a representation that contains less information but is less expensive to render. In this dissertation we address two problems related to the construction, representation, and rendering of image-based simplifications. First, we present an incremental algorithm for generating such samples based on estimates of the visibility error within a region. We use the Voronoi diagram of existing sample locations to find possible new viewpoints and an approximate hardware-accelerated visibility measure to evaluate each candidate. Second, we present spatial representations for databases of samples that exploit image-space and object-space coherence to reduce both storage overhead and runtime rendering cost. The image portion of a database of samples is represented using spatial video encoding, a generalization of standard MPEG2 video compression that works in a 3D space of images instead of a 1D temporal sequence. Spatial video encoding results in an average compression ratio of 48:1 within a database of over 22,000 images. We represent the geometric portion of our samples as a set of incremental textured depth meshes organized using a spanning tree over the set of sample viewpoints. The view-dependent nature of textured depth meshes is exploited during geometric simplification to further reduce storage and rendering costs. By removing redundant points from the database of samples, we realize a 2:1 savings in storage space and nearly 6:1 savings in preprocessing time over the expense of processing all points in all samples. The spatial encodings constructed from groups of samples are used to replace geometry far away from the user?s viewpoint at runtime. Nearby geometry is not altered. We achieve a 10-15x improvement in frame rate over static geometric levels of detail with little loss in image fidelity using a model of a coal-fired power plant containing 12.7 million triangles. Moreover, our approach lessens the severity of reconstruction artifacts present in previous methods such as textured depth meshes.

Wright, William V. (1972)

“An Interactive Computer Graphics System for Molecular Studies”
Under the direction of Frederick P. Brooks, Jr.

The purpose was to investigate the feasibility and usefulness of an interactive computer system for a single field of scientific research. Such a system is a priori attractive only for fields of research which include a nontrivial mathematical model and in which the questions of interest cannot be answered purely by mechanical evaluations of this model. The approach was to select a client (a scientist with a suitable problem), to design an interactive system for investigating his problem, to build a useful subset of this system, to have the client use the system in a typical research environment, and to analyze the events of this entire procedure for principles to guide the design of other similar systems and of more general ones. The investigation of the conformations and interactions of protein molecules was chosen as the research field. Desiderata of the system design were to minimize the number and difficulty of actions that the user must execute to specify an operation on his model, and to develop ways to minimize the cost of implementation. A principle of design, brought to light by the study, was that the language used to specify operations on the model must be allowed to evolve as the system is used. That is, it must be an extensible language. The system developed during this study was found to be a useful tool for the investigation of protein structures. It is superior to the conventional technique of building physical models in that a model can be built and modified more quickly, it is more precise, and the internal potential energy of the structure can be computed easily and quickly. A computer model, however, is less visualizable than a physical model. The concept of an extensible system proved to be workable and to have many advantages, even though the current mechanism for extension imposed a substantial delay between the conception of a new function and its implementation. The design of a mechanism for on-line extension of the system is proposed. The extensible nature of the system also minimizes the implementation costs. Initially only a basic system is needed, and thereafter, it is extended only as required by the progress of the research for which it is being used. Moreover, the current system is structured so that parts of it can serve as the foundation of a similar system for another field of research, so the implementation costs of later systems can be lower.

Wu, Changchang (2011)

“Geometry-driven Feature Detection for Improving Geometric Reconstruction”
Under the direction of Jan-Michael Frahm

Matching images taken from different viewpoints is a fundamental step for many computer vision applications including 3D reconstruction, scene recognition, virtual reality, robot localization, etc. The typical approaches detect feature keypoints based on local properties to achieve robustness to viewpoint changes, and establish correspondences between keypoints to recover the 3D geometry or determine the similarity between images. The invariance to viewpoint changes is challenged by the complexity of the distortions introduced through the projective transformations; the matching of typical local features is also inefficient due to the lack of local geometric information. This thesis explores the feature detection based on geometric information for improved projective invariance and semantic meaning. The main novel research contributions of this thesis are as follows. A projective invariant feature detection method by exploiting 3D structures that can be recovered from simpler stereo matching; an efficient 3D matching scheme to handle very large viewpoint changes, the proposed feature and its rich geometric dimension. A high-level feature detector that robustly extracts repetitive structures in urban scene, providing strong semantics for features and allowing efficient wide-baseline matching; a novel single-view reconstruction to densely recover the 3D geometry of the repetition-based features.


Xiao, Qiuyu (2019)

Efficient Data Protection by Noising, Masking, and Metering

Under the direction of Michael Reiter

Electronic copy available

Protecting data secrecy is an important design goal of computing systems. Conventional techniques like access control mechanisms and cryptography are widely deployed, and yet security breaches and data leakages still occur. There are several challenges. First, sensitivity of the system data is not always easy to decide. Second, trustworthiness is not a constant property of the system components and users. Third, a system’s functional requirements can be at odds with its data protection requirements. In this dissertation, we show that efficient data protection can be achieved by noising, masking, or metering sensitive data. Specifically, three practical problems are addressed in the dissertation—storage side-channel attacks in Linux, server anonymity violations in web sessions, and data theft by malicious insiders. To mitigate storage side-channel attacks, we introduce a differentially private system, dpprocfs, which injects noise into side-channel vectors and also reestablishes invariants on the noised outputs. Our evaluations show that dpprocfs mitigates known storage side channels while preserving the utility of the proc filesystem for monitoring and diagnosis. To enforce server anonymity, we introduce a cloud service, PoPSiCl, which masks server identifiers, including DNS names and IP addresses, with personalized pseudonyms. PoPSiCl can defend against both passive and active network attackers with minimal impact to web-browsing performance. To prevent data theft from insiders, we introduce a system, Snowman, which restricts the user to access data only remotely and accurately meters the sensitive data output to the user by conducting taint analysis in a replica of the application execution without slowing the interactive user session.

Xu, Hao (2012)

“How To Efficiently Implement an OSHL-Based Automatic Theorem Prover”

Ordered Semantic Hyper-linking (OSHL) is a general-purpose instance-based first-order automated theorem proving algorithm. Although OSHL has many useful properties, previous implementations of OSHL were not very efficient. The implementation of such a theorem prover differs from other more traditional programs in that a lot of its subroutines are more mathematical than procedural. The low performance of previous implementations prevents us from evaluating how the proof strategy used in OSHL matches up against other theorem proving strategies. This dissertation addresses this problem on three levels. First, an abstract, generalized version genOSHL is defined which captures the essential features of OSHL and for which the soundness and completeness are proved. This gives genOSHL the flexibility to be tweaked while still preserving soundness and completeness. A type inference algorithm is introduced which allows genOSHL to possibly reduce its search space while still preserving the soundness and completeness. Second, incOSHL, a specialized version of genOSHL, which differs from the original OSHL algorithm, is defined by specializing genOSHL. Its soundness of completeness follows from that of genOSHL. Third, an embedded programming language called STACK EL, which allows managing program states and their dependencies on global mutable data, is designed and implemented. STACK EL allows our prover to generate instances incrementally. We also study the performance of our incremental theorem prover that implements incOSHL.
Under the direction of Wei Wang.

Xu, Yi (2016)

“Toward Robust Video Event Detection and Retrieval Under Adversarial Constraints”
Under the direction of Jan-Michael Frahm & Fabian Monrose

The continuous stream of videos uploaded and shared on the Internet has been leveraged by computer vision researchers for a myriad of detection and retrieval tasks. However, the existing state-of-the-art event detection and retrieval techniques fail to deal with several real-world challenges (e.g., low resolution and noise) under adversary constraints. This dissertation focuses on these challenges in five realistic scenarios (CAPTCHA decoding, face liveness detection, reconstructing typed input on mobile devices, video confirmation attack, and content-based copy detection) and demonstrates practical methods to address the problem of robustness and efficiency within video event detection and retrieval systems.

Specifically, I propose an automated approach which can decode moving-image object recognition (MIOR) CAPTCHAs. I showed that not only are there inherent weaknesses in current MIOR CAPTCHA designs, but that several obvious countermeasures (e.g. extending the length of the codeword) are not viable.  More importantly, my work highlights the fact that the choice of underlying hard problem selected by current MIOR CAPTCHA designs falls into a solvable subclass of computer vision problems.

For face liveness detection, I introduce a novel approach to bypass modern face authentication systems. More specifically, by leveraging a handful of pictures of the target user taken from social media, I show how to create realistic 3D facial models using virtual reality (VR) systems. My framework incorporates the ability to perform animations, tricking liveness detectors to believe that the 3D model is a real human face. I demonstrate that such VR-based attack points to weaknesses in camera-based authentication systems.

For reconstructing typed input on mobile devices, I proposed a method that successfully transcribes the text typed on a keyboard by exploiting video of the user typing, even from significant distances and from repeated reflections. This feat allows us to reconstruct typed input from the image of a mobile phone’s screen on a user’s eyeball as reflected through a nearby mirror, extending the privacy threat to include situations where the adversary is located around a corner from the user.

Lastly, to assess the viability of a video confirmation attack, I exploit the emanations of changes in light to reveal the programs being watched. I leverage the key insight that the observable emanations of a display (e.g., TV monitor) during presentation of the viewing content induces a distinctive flicker pattern that can be exploited by an adversary. My proposed approach works successfully with observations of light effusions through the windows from faraway. Moreover, I extended my proposed temporal feature to content-based copy detection, my large-scale evaluation on real-world data shows that I can successfully detect infringing content from movies and sports clips with high accuracy at an average time expense of merely 5.3 seconds, outperforming the state of the art by an order of magnitude.


Yakowenko, William J. (1999)

“Propositional Theorem Proving by Semantic Tree Trimming for Hardware Verification”
Under the direction of David A. Plaisted
Electronic copy available

The present work describes a new algorithm for testing the satisfiability of statements in propositional logic. It was designed to efficiently handle the most obvious kinds of pathological cases for the Davis-Putnam algorithm. Its performance is compared with a very efficient implementation of Davis-Putnam on a large number of problems and it is shown to be superior. A recently-developed version of Davis-Putnam with a related algorithmic enhancement is better still, but it is conjectured that the same enhancement can apply to the present work, with a similar boost in performance.

Yang, Hua (2008)

“Differential Tracking through Sampling and Linearizing the Local Appearance Manifold”
Under the direction of Greg Welch
Electronic copy available

Recovering motion information from input camera image sequences is a classic problem of computer vision. Conventional approaches estimate motion from either dense optical flow or sparse feature correspondences identified across successive image frames. Among other things, performance depends on the accuracy of the feature detection, which can be problematic in scenes that exhibit view-dependent geometric or photometric behaviors such as occlusion, semitransparancy, specularity and curved reflections. Beyond feature measurements, researchers have also developed approaches that directly utilize appearance (intensity) measurements. Such appearance-based approaches eliminate the need for feature extraction and avoid the difficulty of identifying correspondences. However the simplicity of on-line processing of image features is usually traded for complexity in off-line modeling of the appearance function. Because the appearance function is typically very nonlinear, learning it usually requires an impractically large number of training samples. I will present a novel appearance-based framework that can be used to estimate rigid motion in a manner that is computationally simple and does not require global modeling of the appearance function. The basic idea is as follows. An n-pixel image can be considered as a point in an n-dimensional appearance space. When an object in the scene or the camera moves, the image point moves along a low-dimensional appearance manifold. While globally nonlinear, the appearance manifold can be locally linearized using a small number of nearby image samples. This linear approximation of the local appearance manifold defines a mapping between the images and the underlying motion parameters, allowing the motion estimation to be formulated as solving a linear system. I will address three key issues related to motion estimation: how to acquire local appearance samples, how to derive a local linear approximation given appearance samples, and whether the linear approximation is sufficiently close to the real local appearance manifold. In addition I will present a novel approach to motion segmentation that utilizes the same appearance-based framework to classify individual image pixels into groups associated with different underlying rigid motions.

Yan, Jingyu (2009)

“Articulated Non-Rigid Shape, Motion and Kinematic Chain Recovery from Video”
Under the direction of Marc Pollefeys
Electronic copy available

Shape and motion recovery is an essential task for computer vision. One particular interest is to recover articulated shape and motion, e.g. human motion, from images or video. A system that can capture and recover articulated shape and motion has a wide range of applications in medical study, sports analysis and animation, etc. In this dissertation, we propose a factorization-based approach to recover the shape, motion and kinematic chain of an articulated object with non-rigid parts. Articulated motion with non-rigid parts is a good approximation to human motion, e.g. human body motion with non-rigid facial motion. In our approach, we model the articulated non-rigid motion using a set of intersecting motion subspaces. A motion subspace can model the motion of either a rigid or non-rigid articulated part. The intersection of the motion subspaces of two linked parts models the motion subspace of an articulated joint or axis. Our approach consists of a motion segmentation algorithm that segments articulated motions and an algorithm that automatically builds the kinematic chain of an articulated object based on motion segmentation. The shape and motion recovery is done via the factorization method for rigid or non-rigid parts of the object. We test our approach through synthetic and real experiments and demonstrate how to recover articulated structure and motion with non-rigid parts via a single-view camera without prior knowledge of its kinematic structure.

Yang, Kecheng (2018)

Motion Planning for Multi-Agent Systems with Physics-Based and Behavior Constraints

Under the direction of Dinesh Manocha

Electronic copy available

Man-made entities and humans rely on movement as an essential form of interaction with the world.

Whether it is an autonomous vehicle navigating crowded roadways or a simulated pedestrian traversing a virtual world, each entity must compute safe, effective paths to achieve their goals. In addition, these entities, termed agents, are subject to unique physical and behavioral limitations within their environment. For example, vehicles have a finite physical turning radius and must obey behavioral constraints such as traffic signals and rules of the road. Effective motion-planning algorithms for diverse agents must account for these physical and behavioral constraints.

        In this dissertation, we present novel motion-planning algorithms that capture aspects of the behavioral and physics-based constraints for virtual agents. We describe representational approaches to capture specific physical constraints on the various agents and propose abstractions to model behavior constraints affecting them. We then describe algorithms to plan motions for agents who are subject to the modeled constraints. 

        First, we describe a biomechanically accurate elliptical representation for virtual pedestrians; we also describe human-like movement constraints corresponding to shoulder-turning and side-stepping in dense scenes. We detail a novel motion-planning algorithm extending velocity obstacles to generate paths for hundreds of elliptical agents at interactive rates. Next, we describe a model to encode dynamics and traffic-like behavior constraints for autonomous vehicles in urban and highway environments. We describe a motion-planning algorithm to generate safe, high-speed avoidance maneuvers using a novel optimization function and modified velocity-obstacle formulation, and we present a simulation framework to evaluate driving strategies. Next, we present an approach to incorporating high-level reasoning to model the motions and behaviors of virtual agents in terms of verbal interactions with other agents or avatars. Our approach leverages natural language interaction to reduce uncertainty and generate effective plans. Finally, we describe an application of our techniques to simulate pedestrian behaviors for gathering simulated data about boarding, deplaning, and evacuating an aircraft.

Yang, Kecheng (2018)

“Real-Time Scheduling on Asymmetric Multiprocessor Platforms”

Under the direction of Jim Anderson

Electronic copy available

Real-time scheduling analysis is crucial for time-critical systems, in which provable timing guarantees are more important than observed raw performance. Techniques for real-time scheduling analysis initially targeted uniprocessor platforms but have since evolved to encompass multiprocessor platforms. However, work directed at multiprocessors has largely focused on symmetric platforms, in which every processor is identical. Today, it is common for a multiprocessor to include heterogeneous processing elements, as this offers advantages with respect to size, weight, and power (SWaP) limitations. As a result, realizing modern real-time systems on asymmetric multiprocessor platforms is an inevitable trend. Unfortunately, principles and mechanisms regarding real-time scheduling on such platforms are relatively lacking. 

The goal of this dissertation is to enrich such principles and mechanisms, by bridging existing analysis for symmetric multiprocessor platforms to asymmetric ones and by developing new techniques that are unique for asymmetric multiprocessor platforms. The specific contributions are threefold.

First, for a platform consisting of processors that differ with respect to processing speeds only, this dissertation shows that the preemptive global earliest-deadline-first (G-EDF) scheduler is optimal for scheduling soft-real-time (SRT) task systems. Furthermore, it shows that semi-partitioned scheduling, which is a hybrid of conventional global and partitioned scheduling approaches, can be applied to optimally schedule both hard-real-time (HRT) and SRT task systems.

Second, on platforms that consist of processors with different functionalities, tasks that belong to different functionalities may process the same source data consecutively and therefore have producer/consumer relationships among them, which are represented by directed acyclic graphs (DAGs). End-to-end response-time bounds for such DAGs are derived in this dissertation under a G-EDF-based scheduling approach, and it is shown that such bounds can be improved by a linear-programming-based deadline-setting technique. 

Third, processor virtualization can lead a symmetric physical platform to be asymmetric. In fact, for a designated virtual-platform capacity, there exist an infinite number of allocation schemes for virtual processors and a choice must be made. In this dissertation, a particular asymmetric virtual-processor allocation scheme, called minimum-parallelism (MP) form, is shown to dominate all other schemes including symmetric ones.

Yang, Ruigang (2003)

“View-Dependent Pixel Coloring – A Physically-Based Approach for 2D View Synthesis”
Under the direction of Gregory Welch
Electronic copy available

The basic goal of traditional computer graphics is to generate 2D images of a synthetic scene represented by a 3D analytical model. When it comes to real scenes however, one usually does not have a 3D model. If however one has access to 2D images of the scene gathered from a few cameras, one can use view synthesis techniques to generate 2D images from various viewing angles between and around the cameras. In this dissertation I introduce a fully automatic, physically-based framework for view synthesis that I call View-dependent Pixel Coloring (VDPC). VDPC uses a hybrid approach that estimates the most likely color for every picture element of an image from the desired view, while simultaneously estimating a view-dependent 3D model of the scene. By taking into account a variety of factors including object occlusions, surface geometry and materials, and lighting, VDPC has produced superior results under some very challenging conditions, in particular, in the presence of textureless regions and specular highlights, conditions that cause conventional approaches to fail. In addition, VDPC can be implemented on commodity graphics hardware under certain simplifying assumptions. The basic idea is to use texture-mapping functions to warp the input images to the desired view point, and use programmable pixel rendering functions to decide the most consistent color for each pixel in the output image. By exploiting the fast speed and tremendous amount of parallelism inherent in today’s graphics board, one can achieve real-time, on-line view synthesis of a dynamic scene.

Yang, Shan (2017)

Image-Based Non-Rigid Body Mechanical Property Recover

Under the direction of Ming Ling

Electronic copy available

Material property has great importance in surgical simulation and virtual reality. The mechanical properties of the human soft tissue, are important to characterize the tissue deformation of each patient. Studies have shown that the tissue stiffness described by the tissue properties may indicate abnormal pathological process. For virtual reality applications such as virtual try-on, garment material capturing is of great importance.

In this thesis, I present novel approaches for automatically estimating the material properties of soft bodies from two sets of images, taken before and after deformation or from a video which records the motion of the deformable body. I use a coupled simulation-optimization-identification framework to deform one soft body at its original, non-deformed state to match the deformed geometry of the same object in its deformed state. The optimal set of material parameters is thereby determined by minimizing the error metric function. This method can simultaneously recover the elasticity parameters of multiple regions of soft bodies using Finite Element Method-based simulation (of either linear or nonlinear materials undergoing large deformation) and particle-swarm optimization methods. I demonstrate this approach on real-time interaction with virtual organs in patient-specific surgical simulation, using parameters acquired from low-resolution medical images.

Further, I extend the material recovery framework on cloth material identification. Most recent garment capturing techniques rely on acquiring multiple views of clothing, which may not always be readily available, especially in the case of preexisting photographs from the web. As an alternative, I propose a wrinkle density-based method that is able to compute the garment material type from a single photograph and a learning-based method which estimates the material property of a piece of cloth from a video. I demonstrate the effectiveness of my algorithm by re-purposing the reconstructed garments for virtual try-on and garment transfer applications and for cloth animation for digital characters.

Yang, Xiao (2017)

Uncertainty Quantification, Image Synthesis and Deformation Prediction for Image Registration

Under the direction of Marc Niethammer

Electronic copy available

Image registration is essential for medical image analysis to provide spatial correspondences. It is a difficult problem due to the modeling complexity of image appearance and the computational complexity of the deformable registration models. Thus, several techniques are needed: Uncertainty measurements of the high-dimensional parameter space of the registration methods for the evaluation of the registration result; Registration methods for registering healthy medical images to pathological images with large appearance changes; Fast registration prediction techniques for uni-modal and multi-modal images.

This dissertation addresses these problems and makes the following contributions: 1) A framework for uncertainty quantification of image registration results is proposed. The proposed method for uncertainty quantification utilizes a low-rank Hessian approximation to evaluate the variance/covariance of the variational Gaussian distribution of the registration parameters. The method requires significantly less storage and computation time than computing the Hessian via finite difference while achieving excellent approximation accuracy, facilitating the computation of the variational approximation; 2) An image synthesis deep network for pathological image registration is developed. The network transforms a pathological image into a ‘quasi-normal’ image, making registrations more accurate; 3) A patch-based deep learning framework for registration parameter prediction using image appearances only is created. The network is capable of accurately predicting the initial momentum for the Large Deformation Diffeomorphic Metric Mapping (LDDMM) model for both uni-modal and multi-modal registration problems, while increasing the registration speed by at least an order of magnitude compared with optimization-based approaches and maintaining the theoretical properties of LDDMM.

Applications of the methods include 1) Uncertainty quantification of LDDMM for 2D and 3D medical image registrations, which could be used for uncertainty-based image smoothing and subsequent analysis; 2) Quasi-normal image synthesis for the registration of brain images with tumors with potential extensions to other image registration problems with pathologies and 3) deformation prediction for various brain datasets and T1w/T2w magnetic resonance images (MRI), which could be incorporated into other medical image analysis tasks such as fast multi-atlas image segmentation, fast geodesic image regression, fast atlas construction and fast user-interactive registration refinement.

Yeh, Hengchin (2014)

“Adaptive Modeling of Details for Physically-Based Sound Synthesis and Propagation”
Under the direction of Ming Lin

In order to create an immersive virtual world, it is crucial to incorporate a realistic aural experience that complements the visual sense. Physically-based sound simulation is a method to achieve this goal and automatically provides audio-visual correspondence. It simulates the physical process of sound: the pressure variations of a medium originated from some vibrating surface (sound synthesis), propagating as waves in space and reaching human ears (sound propagation). The perceived realism of simulated sounds depends on the accuracy of the computation methods and the computational resource available, and oftentimes it is not feasible to use  the most accurate technique for all simulation targets. I propose techniques that model the general sense of sounds and their details separately and adaptively to balance the realism and computational costs of sound simulations.

For synthesizing liquid sounds, I present a novel approach that generate sounds due to the vibration of resonating bubbles. My approach uses three levels of bubble modeling to control the trade-offs between quality and efficiency: statistical generation from liquid surface configuration, explicitly tracking of spherical bubbles, and decomposition of non-spherical bubbles to spherical harmonics. For synthesizing rigid-body contact sounds, I propose to improve the realism in two levels using example recordings: first, material parameters that preserve the inherent quality of  the recorded material are estimated; then extra details from the example recording that are not fully captured  by the material parameters are computed and added.
For simulating sound propagation in large, complex scenes, I present a novel hybrid approach that couples numerical and geometric  acoustic techniques. By decomposing the spatial domain of a scene and applying
the more accurate and expensive numerical acoustic techniques only in limited regions, a user is able to allocate computation resources on where it matters most.

Yin, Qianwen (2017)

TCP Rapid:  From Theory to Practice

Under the direction of Jasleen Kaur

Electronic copy available

Delay and bandwidth-based alternatives to TCP congestion-control have been around for nearly three decades and have seen a recent surge in interest.   However, such designs have faced significant resistance in being deployed on a wide-scale across the Internet—this has been mostly due to serious concerns about noise in delay measurements, pacing inter-packet gaps, and required changes to the standard TCP stack. With the advent of high-speed networking, some of these concerns become even more significant.

This thesis considers Rapid, a recent proposal for ultra-high speed congestion control, which perhaps stretches each of these challenges to the greatest extent. Rapid adopts a framework of continuous fine-scale bandwidth probing and rate adapting. It requires finely-controlled inter-packet gaps, high-precision timestamping of received packets, and reliance on fine-scale changes in inter-packet gaps. While simulation-based evaluations of Rapid show that it has outstanding performance gains along several important dimensions, these will not translate to the real-world unless the above challenges are addressed.

This thesis identifies the key challenges TCP Rapid faces on real high-speed networks, including deployability in standard protocol stacks, precise inter-packet gap creation, achieving robust bandwidth estimation in the presence of noise, and a stability/adaptability trade-off. A Linux implementation of Rapid is designed and developed after carefully considering each of these challenges. The evaluations on a 10Gbps testbed confirm that the implementation can indeed achieve the claimed performance gains, and that it would not have been possible unless each of the above challenges was addressed.

Yoo, Terry S. (1996)

“Image Geometry Through Multiscale Statistics”
Under the direction of Stephen M. Pizer
Electronic copy available

This study in the statistics of scale space begins with an analysis of noise propagation of multiscale differential operators for image analysis. It also presents methods for computing multiscale central moments that characterize the probability distribution of local intensities. Directional operators for sampling oriented local central moments are also computed and principal statistical directions extracted, reflecting local image geometry. These multiscale statistical models are generalized for use with multivalued data. The absolute error in normalized multiscale differential invariants due to spatially uncorrelated noise is shown to vary non-monotonically across order of differentiation. Instead the absolute error decreases between zeroth and first order measurements and increases thereafter with increasing order of differentiation, remaining less than the initial error until the third or fourth order derivatives are taken. Statistical invariants given by isotropic and directional sampling operators of varying scale are used to generate local central moments of intensity that capture information about the local probability distribution of intensities at a pixel location under an assumption of piecewise ergodicity. Through canonical analysis of a matrix of second moments, directional sampling provides principal statistical directions that reflect local image geometry, and this allows the removal of biases introduced by image structure. Multiscale image statistics can thus be made invariant to spatial rotation and translation as well as linear functions of intensity. These new methods provide a principled means for processing multivalued images based on normalization by local covariances. They also provide a basis for choosing control parameters in variable conductance diffusion.

Yoon, Sung-Eui (2005

“Interactive Visualization and Collision Detection using Dynamic Simplification and Cache-Coherent Layouts”
Under the direction of Dinesh Manocha
Electronic copy available

Recent advances in model acquisition, computer-aided design, and simulation technologies have resulted in massive databases of complex geometric models consisting of more than tens or hundreds of millions of triangles. In spite of the rapid progress in the performance of CPUs and graphics processing units (GPUs), it may not be possible to visualize or perform collision detection between massive models at interactive rates on commodity hardware. In this thesis, we present dynamic simplification and cache-coherent layout algorithms for interactive visualization and collision detection between large models, in order to bridge the gap between the performance of commodity hardware and high model complexity. Firstly, we present a novel dynamic simplification algorithm that efficiently handles massive models for view-dependent rendering while alleviating discontinuity problems such as visual poppings that arise when switching between different levels of detail (LODs). We describe an out-of-core construction algorithm for hierarchical simplification of massive models that cannot fit into main memory. We also apply dynamic simplification to collision detection and introduce a new conservative distance metric to perform fast and conservative collision detection between massive models. Our approach is conservative in that it may overestimate the set of colliding primitives, but never misses any collisions. Secondly, we present novel cache-oblivious layout algorithms for polygonal meshes and hierarchies to minimize the expected number of cache misses for a wide variety of applications. Our layout algorithms only assume that runtime applications have random, but cache-coherent access patterns. However, we do not require any knowledge of cache parameters such as block and cache sizes. We demonstrate benefits of our layout algorithm on three different applications including view-dependent rendering, collision detection, and isosurface extraction. We have implemented our algorithms on a desktop PC and are able to achieve significant improvements over previous approaches and obtain interactive performance (more than 10 frames per second) on view-dependent rendering and collision detection between massive and complex models.

Yu, Licheng (2019)

“Locate, Tell, Answer and Act: Connecting Vision, Language and Beyond”

Under the direction of Tamara Berg

Electronic copy available

Work in computer vision and natural language processing involving images and text has been experiencing explosive growth in recent years, with a particular boost from neural network revolution. Our intelligent agents are now able to do language grounding (locate), image captioning (tell), visual question answering (answer) and embodied AI task (act). These skills together make human-robot interaction in a very natural way, e.g., robot could accurately comprehend human’s spoken command and make desired actions. This talk will cover the most representative work during my PhD study, including referring expression generation and comprehension, and embodied question answering, as well as some thought on the future work connecting vision, language and beyond.

Yushkevich, Paul A. (2003)

“Statistical Shape Characterization Using the Medial Representation”
Under the direction of Stephen M. Pizer
Electronic copy available

The goal of the research presented in this dissertation is to improve the clinical understanding of processes that act the shape of anatomical structures. Schizophrenia is an example of such a process: it is known to act the shape of the hippocampus, but the precise nature of the morphological changes that it causes is not fully understood. This dissertation introduces novel statistical shape characterization methodology that can improve the understanding of shape-altering biological processes by (i) identifying the regions of the acted objects where these processes are most significantly manifested and (ii) expressing the ects of these processes in intuitive geometric terms. The following three new techniques are described and evaluated in this dissertation. 1. In an approach motivated by human form perception, the shape characterization problem is divided into a coarse-to-fine hierarchy of sub-problems that analyze shape at diㄦent locations and levels of detail, making it pssible to compare the ects of shapealtering processes on diㄦent object regions. Statistical features are base on the medial (skeletal) object representation, which can be used to decompose objects into simple components called figures and to measure the bending and widening of the figures. Such features make it possible to express shape variability in terms of bending and widening. 2. A new algorithm that identifies regions of biological objects that are most relevant for shape-based classification is developed. In the schizophrenia application, the algorithm is used to find the hippocampus locations most relevant for classification between schizophrenia patients and matched healthy controls. The algorithm fuses shape heuristics with existing feature selection methodology, ectively reducing the inherently combinatorial search space of the latter. 3. Biological objects in 3D and 2D are described using a novel medial representation that models medial loci and boundaries using continuous manifolds. The continuous medial representation is used in the deformable templates framework to segment objects in medical images. The representation allows arbitrary sampling that is needed by the hierarchical shape characterization method.


Zarling, Raymond L. (1976)

“Numerical Solution of Nearly Decomposable Queuing Networks”
Under the direction of Victor L. Wallace

The applicability of an aggregation technique of Simon and Ando to the problem of finding the equilibrium probability vector of a finite, nearly decomposable Markov chain is investigated. The method involves using the solution of a nearly equal, decomposable system together with a limited knowledge of the small couplings between classes of the original chain to approximate the solution. Of central concern is the utility of the technique for an analysis of computer system models for specific levels of coupling, rather than for asymptotic analysis as the coupling tends to zero. The technique of aggregation is found to produce a result which is approximately correct with an error proportional to the size of the coupling in the original system. The constant of proportionality is exhibited and found to be large enough to prohibit use of the technique in many cases. The error is also found to depend in a critical way on the particular choice of decomposable matrix used to approximate the given system. A way is presented of choosing this decomposable matrix so that it yields an exact solution to the original problem. A condition number for the problem of aggregation is defined in connection with this error analysis. This number is shown to be a principal component of the constant of proportionality. It is large for a class of problems which were previously known to offer poor numerical properties, and some new classes of poorly conditioned problems are demonstrated. This refutes earlier conjectures that the aggregation algorithm could perform poorly only when some of the aggregates were themselves nearly decomposable. The condition number is also used to bound the rate of convergence of Wallace and Rosenberg’s “Recursive Queue Analyzer”, an iterative algorithm for finding equilibrium probability vectors of large chains. An optimal choice for a parameter of this algorithm is thereby demonstrated

Zhang, Hansong (1998)

“Effective Occlusion Culling for the Interactive Display of Arbitrary Models”
Under the direction of Dinesh Manocha
Electronic copy available

As an advanced form of visibility culling, occlusion culling detects hidden objects and prevents them from being rendered. An occlusion-culling algorithm that can effectively accelerate interactive graphics must simultaneously satisfy the following criteria:

  • Generality. It should be applicable to arbitrary models, not limited to architectural models or models with many large, polygonal occluders.
  • Significant Speed-up. It should not only be able to cull away large portions of a model, but do so fast enough to accelerate rendering.
  • Portability and Ease of Implementation. It should contain as few assumptions as possible on special hardware support. It must also be robust (i.e. insensitive to floating-point errors).

Based on proper problem decomposition and efficient representations of cumulative occlusion, this dissertation presents algorithms that satisfy all three of the criteria listed above. Occlusion culling is decomposed into two sub-problems-in order for an object to be occluded by the occluders, its screen-space projection must be inside the cumulative projection of the occluders, and it must not occlude any visible parts of the occluders. These two necessary conditions are verified by the overlap tests and the depth tests, respectively. The cumulative projection and the depth of the occluders are represented separately to support these tests. Hierarchical occlusion maps represent the cumulative projection to multiple resolutions. The overlap tests are performed hierarchically through the pyramid. The multi-resolution representation supports such unique features as aggressive approximate culling (i.e. culling away barely-visible objects), and leads to the concept of levels of visibility. Two depth representations, the depth estimation buffer and the no-background Z-buffer, have been developed to store the depth information for the occluders. The former conservatively estimates the far boundary of the occluders; the latter is derived from a conventional Z-buffer and captures the near boundary. A framework for a two-pass variation of our algorithms is presented. Based on the framework, a system has been implemented on current graphics workstations. Testing of the system on a variety of models (from 300,000 to 15 Million polygons) has demonstrated the effectiveness of our algorithms for the interactive display of arbitrary models.

Zhang, Jingdan (2009)

“Object Detection and Segmentation using Discriminative Learning”
Under the direction of Leonard McMillan
Electronic copy available

Prior knowledge of shape and appearance plays an importance role in constraining solutions of object detection and segmentation. A promising approach for incorporating prior knowledge is to learn it directly from expert annotations using machine-learning techniques. Previously generative learning approaches or energy minimization methods have been used to achieve this goal. I propose a series of discriminative learning algorithms based on boosting principles to learn prior knowledge. For object detection, I present a learning procedure called Probabilistic Boosting Network (PBN) for real-time object detection and pose estimation. PBN integrates evidence from two building blocks, namely a multiclass classifier for pose estimation and a detection cascade for object detection. I implement PBN using a graph-structured network that alternates between the two tasks of object detection and pose estimation in order to reject negative cases as quickly as possible. Compared with previous approaches, PBN achieves higher accuracy in object localization and pose estimation with a noticeable reduction in computation. For object segmentation, I present a comparative study on how to apply three discriminative learning approaches – classification, regression, and ranking – to deformable shape segmentation that is I directly learn a fitting function to characterize the relationship between shape and image appearance. By casting the segmentation into a discriminative learning framework, the target fitting function can be steered to possess a desired shape for ease of optimization yet better characterize the relationship between shape and appearance. To address the high-dimensional learning challenge present in discriminative learning, I propose the multilevel and efficient sampling approaches. The experimental results demonstrate that the discriminative models outperform generative ones and energy minimization methods by a large margin.

Zhang, Jinghe (2013)

“Uncertainty-Driven Adaptive Estimation with Applications in Electrical Power Systems”
Under the direction of Gregory F. Welch
Electronic copy available

From electrical power systems to meteorology, large-scale state-space monitoring and forecasting methods are fundamental and critical. Such problem domains pose challenges from both computational and signal processing perspectives, as they typically comprise a large number of elements, and processes that are highly dynamic and complex (e.g., severe nonlinearity, discontinuities, and uncertainties). This makes it especially challenging to achieve real-time operations and control. For decades, researchers have developed methods and technology to improve the accuracy and efficiency of such large-scale state-space estimation. Some have devoted their efforts to hardware advances–developing advanced devices with higher data precision and update frequency. I have focused on methods for enhancing and optimizing the state estimation performance. As uncertainties are inevitable in any state estimation process, uncertainty analysis can provide valuable and informative guidance for on-line, predictive, or retroactive analysis. My research focuses primarily on three areas:

  1. Grid Sensor Placement. I present a method that combines off-line steady-state uncertainty and topology analysis for optimal sensor placement throughout the grid network.
  2. Filter Computation Adaptation. I present a method that utilizes on-line state uncertainty analysis to choose the best measurement subsets from the available (large-scale) measurement data. This allows systems to adapt to dynamically available computational resources.
  3. Adaptive and Robust Estimation. I present a method with a novel on-line measurement uncertainty analysis that can distinguish between suboptimal/incorrect system modeling and/or erroneous measurements, weighting the system model and measurements appropriately in realtime as part of the normal estimation process.

We seek to bridge the disciplinary boundaries between Computer Science and Power Systems Engineering, by introducing methods that leverage both existing and new techniques. While these methods are developed in the context of electrical power systems, they should generalize to other large-scale scientific and engineering applications.

Zhang, Liangjun (2009)

“Efficient Motion Planning using Generalized Penetration Depth Computation”
Under the direction of Dinesh Manocha
Electronic copy available

Motion planning is a fundamental problem in robotics and also arises in other applications including virtual prototyping, navigation, animation and computational structural biology. It has been extensively studied for more than three decades, though most practical algorithms are based on randomized sampling. In this dissertation, we address two main issues that arise with respect to these algorithms: (1) there are no good practical approaches to check for path non-existence even for low degree-of-freedom (DOF) robots; (2) the performance of sampling-based planners can degrade if the free space of a robot has narrow passages. In order to develop effective algorithms to deal with these problems, we use the concept of penetration depth (PD) computation. By quantifying the extent of the intersection between overlapping models (e.g. a robot and an obstacle), PD can provide a distance measure for the configuration space obstacle (C-obstacle). We extend the prior notion of translational PD to generalized PD, which takes into account translational as well as rotational motion to separate two overlapping models. Moreover, we formulate generalized PD computation based on appropriate model-dependent metrics and present two algorithms based on convex decomposition and local optimization. We highlight the efficiency and robustness of our PD algorithms on many complex 3D models. Based on generalized PD computation, we present the first set of practical algorithms for low DOF complete motion planning. Moreover, we use generalized PD computation to develop a retraction-based planner to effectively generate samples in narrow passages for rigid robots. The effectiveness of the resulting planner is shown by alpha puzzle benchmark and part disassembly benchmarks in virtual prototyping.

Zhang, Qi (2009)

“Mining Massive Scientific Sequence Data using Block-wise Decomposition Methods”
Under the direction of Wei Wang
Electronic copy available

The recent proliferation of high throughput technologies catalyzes the transformation of traditional science to data-driven science. The unprecedented growth of scientific datasets leads to the phenomenon of data rich but information poor, demanding for revolutionary knowledge discovery techniques to assist and accelerate scientific studies. In this dissertation, I present efficient data mining algorithms for knowledge discovery on two types of emerging large-scale sequence-based scientific datasets: 1) static sequence data generated from SNP diversity arrays for genomic studies, and 2) dynamic sequence data collected in streaming and sensor network systems for environmental studies. The massive, noisy nature of the SNP arrays and the distributive, online nature of sensor network data pose great challenges for knowledge discovery in terms of scalability, robustness, and efficiency. Despite the different characteristics and scientific application areas, the SNP arrays and the streaming and sensor data, as sequences of ordered observations, can be efficiently mined using algorithms based on block-wise decomposition methods. High-resolution Single Nucleotide Polymorphism (SNP) maps offer the most comprehensive resources of genetic variation and enable biologists to greatly accelerate the discovery of the genetic determinants of human diseases. In this dissertation, I proposed models and algorithms to efficiently infer genetic variation structure from the massive SNP panels of recombinant sequences resulting from meiotic recombination, one of the principal evolutionary forces shaping the genetic variations present in current populations. I introduced the Minimum Segmentation Problem (MSP) and the Minimum Mosaic Problem (MMP) to obtain the fine-scale analysis of the genetic variation in terms of the segmentation structure on a single recombinant strain as well as the mosaic structure on a panel of recombinant strains. Efficient algorithms based on block-wise decomposition are presented to solve MSP and MMP on genome-wide large-scale panels with biological noise such as point mutations, gene conversions, and genotyping errors. Large-scale deployment of wireless sensor networks to observe the natural environments, monitoring habitat and wild populations offers environmental and ecology scientists the access to enormous volumes of data collected from physically-dispersed locations in a continuous fashion. In this dissertation, I proposed efficient algorithms using block-wise synopsis construction to capture the data distribution online for the dynamic sequence data collected in the sensor network and streaming systems including clustering analysis and order-statistics computation, which is critical for real-time monitoring, anomaly detection, and other domain specific analysis.

Zhang, Yang (2019)

Efficient, Locality-Maintaining Namespace Operations in a Write-Optimized File System

Under the direction of Don Porter

Electronic copy available

There is a long-standing trade-off between good locality (fast directory traversals) and efficient namespace operations (efficient file or directory renames) in file systems. Traditional inode-based file systems have good rename performance but can fail to maintain locality, especially in the face of file system aging. On the other hand, full-path-indexed file systems ensure locality, however, renaming a directory needs to update all related full-paths, which is usually implemented as an expensive operation. No existing file system has both good locality and efficient namespace operations.

This dissertation describes a new file system design that has both good locality and efficient namespace operations. In particular, we describe a novel synthesis of write-optimization, full-path indexing, and operations on data structures. By directly manipulating the data structure, a full-path-indexed file system can efficiently update related full-paths in a rename. Moreover, with the technique, a full-path-indexed file system can clone a directory without traversing the directory.

We implement this technique in BetrFS, a full-path-indexed, write-optimized, local file system for Linux. Compared to ext4, the widely used inode-based file system in Linux, the new version of BetrFS traverses the Linux source directory 9.47x faster and renames the same directory 1.09x faster. Meanwhile, the new version of BetrFS clones a directory faster than state-of-the-art file systems that support clones, such as Btrfs and XFS.

Zhang, Yinqian (2014)

“Cache-based Side-Channel Attacks in Multi-Tenant Clouds and Their Countermeasures”
Under the direction of Mike Reiter

Cloud computing is gaining traction due to the business agility, resource scalability and operational efficiency that it enables. However, the murkiness of the security assurances offered by public clouds to its tenants is one of the major impediments to enterprise adoption of cloud computing. This dissertation explores one of the major design flaws in modern public clouds, namely insufficient isolation among cloud tenants as evidenced by the cloud’s inability to prevent side-channel attacks between co-located tenants, in both Infrastructure-as-a-Service (IaaS) clouds and Platform-as-a-Service (PaaS) clouds. Specifically, we demonstrate that one virtual machine can successfully extract cryptographic private keys from another VM co-located on the same physical machine using a cache-based side-channel attack, which calls into question the established belief that the security isolation provided by modern virtualization technologies remains adequate under the new threat model in multi-tenant public IaaS clouds. We have also demonstrated in commercial PaaS clouds that cache-based side channels can penetrate container-based isolation by extracting sensitive information from the execution paths of the victim applications, thereby subverting their security. Finally, we devise two defensive techniques for the IaaS setting, which can be adopted by cloud tenants immediately on modern cloud platforms without extra help from cloud providers, to address side-channel threats: (1) for tenants requiring a high degree of security and physical isolation, a tool to facilitate cloud auditing of such isolation; and (2) for tenants who use multi-tenant cloud services, an operating-system-level defense to defend against cache-based side-channel threats on their own.

Zhang, Zhaojun (2014)

“Efficient Computational Genetics Methods for Multiparent Lines”
Under the direction of Wei Wang & William Valdar

Multiparent lines are genetic populations bred in a controlled manner from a finite number of known founders. They represent experimental resources that are of potentially great value for understanding the genetic basis of complex diseases. An important new experimental technology that can be applied to multiparent lines, namely high-throughput sequencing, generates an immense amount of data and provides unprecedented opportunities to study genetics at an ultra-high resolution. However, to take advantage of such massive data, several computational challenges have to be addressed. These include RNA-Seq assembly and quantification, QTL mapping, and haplotype effect estimation. In order to tackle these problems, which are highly connected to each other, I have devised a series of methods: GeneScissors is a novel method to detect errors caused by multiple alignments in the RNA-Seq; RNA-Skim can rapidly quantify RNA-Seq data while still provide reliable results; HTreeQA is designed as a phylogeny based QTL mapping method for genotypes with heterozygous sites; and Diploffect estimates founder effects with statistically valid interval estimates in multiparent lines. These methods are extensively evaluated on both simulated and real data. These studies demonstrate that they can make data analysis of multiparent lines more effective and efficient and produce results are more accurate and trustworthy than a number of existing alternative methods.

Zhang, Feng (2015)

“Spatio-Temporal Registration in Augmented Reality”
Under the direction of Greg Welch

The overarching goal of Augmented Reality (AR) is to provide users with the illusion that virtual and real objects coexist indistinguishably in the same space. An effective persistent illusion requires accurate registration between the real and the virtual objects, registration that is spatially and temporally coherent. However, visible misregistration can be caused by many inherent error sources, such as errors in calibration, tracking, modeling, and system delay.

This dissertation focuses on new methods that could be considered part of “the last mile” of spatio-temporal registration in AR: closed-loop spatial registration and low-latency temporal registration:

  1. For spatial registration, the primary insight is that calibration, tracking and modeling are means to an end—the ultimate goal is registration. In this spirit I present a novel pixel-wise closed-loop registration approach that can automatically minimize registration errors using a reference model comprised of the real scene model and the desired virtual augmentations. Registration errors are minimized in both global world space via camera pose refinement, and local screen space via pixel-wise adjustments. This approach is presented in the context of Video See-Through AR (VST-AR) and projector-based Spatial AR (SAR), where registration results are measurable using a commodity color camera.
  1. For temporal registration, the primary insight is that the real-virtual relationships are evolving throughout the tracking, rendering, scanout and display steps, and registration can be improved by leveraging fine-grained processing and display mechanisms. In this spirit I introduce a general end-to-end system pipeline with low latency, and propose an algorithm for minimizing latency in displays (DLP™ DMD projectors in particular). This approach is presented in the context of Optical See-Through AR (OST-AR), where system delay is the most detrimental source of error.

I also discuss future steps that may further improve spatio-temporal registration. Particularly, I discuss possibilities for using custom virtual or physical-virtual fiducials for closed-loop registration in SAR. The custom fiducials can be designed to elicit desirable optical signals that directly indicate any error in the relative pose between the physical and projected virtual objects.

Zheng, Enliang (2016)

“Toward 3D Reconstruction of Static and Dynamic Objects”
Under the direction of Jan-Michael Frahm
Electronic copy available

The goal of image-based 3D reconstruction is to build up a spatial understanding of the world from a collection of images. For applications that seek to model generic real-world scenes, it is important that the reconstruction methods used are able to characterize both static scene elements (e.g. trees and buildings) as well as dynamic objects (e.g. cars and pedestrians). However, due to many inherent ambiguities in the reconstruction problem, recovering this 3D information with accuracy, robustness, and efficiency is a considerable challenge. To advance the research frontier for image-based 3D modeling, this dissertation focuses on three challenging problems in static scene and dynamic object reconstruction.

We first target the problem of static scene depthmap estimation from crowd-sourced datasets (i.e. photos collected from the Internet). While achieving high-quality depthmaps using images taken under a   controlled environment is already a difficult task, heterogeneous crowd-sourced data presents a unique set of challenges for multi-view depth estimation, including varying illumination and occasional occlusions. We propose a depthmap estimation method that demonstrates high accuracy, robustness, and scalability on a large number of photos collected from the Internet.

Compared to static scene reconstruction, the problem of dynamic object reconstruction from monocular images is fundamentally ambiguous when not imposing any additional assumptions. This is because having only a single observation of an object is insufficient for valid 3D triangulation, which typically requires concurrent observations of the object from multiple viewpoints. Assuming that dynamic objects of the same class (e.g. all the pedestrians walking on a sidewalk) move in a common path in the real world, we develop a method that estimates the 3D positions of the dynamic objects from unstructured monocular images. Experiments on both synthetic and real datasets illustrate the solvability of the problem and the effectiveness of our approach.

Finally, we address the problem of dynamic object reconstruction from a set of unsynchronized videos capturing the same dynamic event, instead of using monocular images as input. This problem is of great interest because, due to the increased availability of portable capture devices, captures using multiple unsynchronized videos are common in the real world. To resolve the difficulty that arises from non- concurrent captures and unknown temporal overlap among video streams, we propose a self-expressive dictionary learning framework, where the dictionary entries are defined

as the collection of temporally varying structures. The experimental results show that this novel approach effectively solves the previously unsolved problem.

Zheng, Yu (2014)
“Locomotion Generation and Balance Control of Humanoid Robots Considering External Dynamics and Constraints”
Under the direction of Ming Lin

Building a capable robot to service and assist people is one of the ultimate goals in humanoid robotics. To realize this goal, a humanoid robot first needs to be able to perform some fundamental locomotion tasks, such as balancing and walking which is insufficient for a robot to be useful. A humanoid robot should also possess the ability to make use of the object in the environment to generate dynamic motions and improve its mobility. Also, since humanoid robots are expected to work and live closely with humans, having human-like motions is important for them to be human-friendly.

This dissertation addresses my work on endowing humanoid robots with the ability to manipulate objects by locomotion or to imitate human motions. First, as a representative case of generating dynamic motions, a biped humanoid robot is required to balance and walk on a cylinder that can roll freely on the ground. This task is difficult even for humans. I introduce a control method for a humanoid robot to execute this challenging task. Its effectiveness has been verified by full-body dynamics simulation and hardware experiments on the Sarcos humanoid robot. Second, I present a method for real-time control of humanoid robots to track human motions while maintaining balance. It consists of a standard proportional-derivative controller that computes the desired acceleration to track the given reference motion and an optimization problem that computes the optimal joint torques and contact forces to realize the desired acceleration, considering the full-body dynamics of the robot and strict constraints on contact forces.

Through this work, I demonstrate that considering object dynamics or constraints in the environment in the design of a controller enables humanoid robots to achieve additional locomotion tasks, such as manipulating a dynamic object or tracking given reference motions, while maintaining balance.

Zherong, Pan (2019)

Efficient Motion Planning for Deformable Objects with High Degrees of Freedom

Under the direction of Dinesh Manocha

Electronic copy available

Many robotics and graphics applications need the ability to plan a motion by interacting with complex environmental objects, including solids, sands, plants, and fluids. A key aspect of these deformable objects is that they are high-DOF, which implies that they can move or change shapes in a large number of independent ways subject to physics-based constraints. In many applications, users impose high-level goals on the movements of high-DOF objects, and planning algorithms need to model their motions and determine the optimal control actions to satisfy their goals.

In this thesis, we propose several planning algorithms for high-DOF objects. Our algorithms can improve the scalability considerably and can plan motions for different types of objects, including elastically deformable objects, free-surface flows, and Eulerian fluids. We show that the salient deformations of elastically deformable objects lie in a low-dimensional nonlinear space. By embedding the configuration space in the subspace, our motion planning algorithm can achieve over 100 times speedup over prior sampling-based planning algorithms. For free surface flows such as liquids, we utilize properties of the planning tasks and machine learning techniques to identify low-dimensional latent spaces to accelerate the motion planning computation. For Eulerian fluids without free surfaces, we present a scalable planning algorithm based on novel numerical techniques. We show that the numerical discretization scheme exhibits strong regularity, which allows us to accelerate optimization-based motion planning algorithms using a hierarchical data structure and we can achieve 3-10 times speedup. Finally, for high-DOF objects with many frictional contacts with the environment, we propose a contact dynamic model that can handle contacts without expensive combinatorial optimization.

We illustrate the benefits of our high-DOF planning algorithms for three applications. First, we can plan contact-rich motion trajectories of general elastically deformable robots. Second, we can achieve realtime performance when planning for a robot arm to transfer liquids between containers. Finally, our method enables a more intuitive user interface. We allow animation editors to modify animations using an offline motion planner to generate controlled fluid animations.

Zhou, Yipin (2019)

Learning Beyond-pixel Mappings from Internet Videos

Under the direction of Tamara L. Berg

Electronic copy available

There have been significant advancements in algorithms to recognize or localize visual contents for both images and videos. These tasks infer the information that is directly visible within the images or video frames (predicting what’s in the frame). While human-level visual understanding could be much more than that, because human also have the insights about the information ‘beyond the frame’. In other word, human is able to reasonably infer the information that is not visible from the current scenes, such as predicting future events, and we expect the computational models could own the same capabilities one day.

Learning beyond-pixel mappings can be a broad concept, we carefully define and formulate the problems as specific and subdivided tasks from different aspects. Under this context, what beyond-pixel mapping do is to infer the information of broader special or temporal context, or even the information from other modalities like text or sound. We first present a computational framework to learn the mappings between the short event video clips and their intrinsic temporal sequence (which one usually happens first). Then we keep exploring the follow-up direction by directly predicting future. Specifically we utilize the generative models to predict the depictions of the object in its future state. Next, we explore the similar generation task to generate the video frames of the target person with unseen poses guided by a random person. Finally, we propose a framework to learn the mappings between input video frames and it’s counterpart in sound domain. The main contribution of this dissertation lies in exploring beyond-pixel mappings from variant aspects to add a piece of knowledge to the next-generation AI platforms.

Zhu, Yunshan (1998)

“Efficient First-Order Semantic Deduction Techniques”
Under the direction of David A. Plaisted

Mathematical logic formalizes the process of mathematical reasoning. For centuries, it has been the dream of mathematicians to do mathematical reasoning mechanically. In the TPTP library, one finds thousands of problems from various domains of mathematics such as group theory, number theory, set theory, etc. Many of these problems can now be solved with state of the art automatic theorem provers. Theorem proving also has applications to artificial intelligence and formal verification. As a formal method, theorem proving has been used to verify the correctness of various hardware and software designs. In this thesis, we propose a novel first-order theorem proving strategy–ordered semantic hyper linking (OSHL). OSHL is an instance-based theorem proving strategy. It proves first-order unsatisfiability by generating instances of first-order clauses and proving the set of instances to be propositionally unsatisfiable. OSHL can use semantics, i.e. domain information, to guide its search, just as a mathematician will study a diagram before diving deep into the proof of a geometry problem. OSHL allows a general form of semantics which is represented as a ground decision procedure on Herbrand terms. Term rewriting and narrowing are used in OSHL to handle equations. Theorem prover OSHL represents a novel combination of efficient propositional decision procedures, semantic guidance and term rewriting. We apply OSHL to planning problems. We analyze the complexity of ordered semantic hyper linking using a novel concept of theorem proving complexity measure. We compare the complexity with those of common theorem proving strategies. We show that OSHL is both practically and asymptotically efficient.

Zimmerman, John B. (1985)

“The Effectiveness of Adaptive Contrast Enhancement”
Under the direction of Stephen M. Pizer

A significant problem in the display of digital images has been the proper choice of a display mapping which will allow the observer to utilize well the information in the image. Recently, contrast enhancement mappings which adapt to local information in the image have been developed. In this work, the effectiveness of an adaptive contrast enhancement method, Pizer’s Adaptive Histogram Equalization (AHE), was compared to global contrast enhancement for a specific task by the use of formal observer studies with medical images. Observers were allowed to choose the parameters for linear intensity windowing of the data to compare to images automatically prepared using Adaptive Histogram Equalization. The conclusions reached in this work were:

  • There was no significant difference in diagnostic performance using AHE and interactive windowing.
  • Windowing performed relatively better in the mediastinum than in the lungs.
  • There was no significant improvement over time in the observers’ performance using AHE.
  • The performances of the observers were roughly equivalent.

An image quality measure was also developed, based upon models of processing within the visual system that are edge sensitive, to allow the evaluation of contrast enhancement mappings without the use of observer studies. Preliminary tests with this image quality measure showed that it was able to detect appropriate features for contrast chances in obvious targets, but a complete evaluation using subtle contrast changes found that the measure is insufficiently sensitive to allow the comparison of different contrast enhancement modalities. It was concluded that limiting factors affecting the measure’s sensitivity included the intrinsic variation in normal image structure and spatial and intensity quantization artifacts. However, humans were able to reliably detect the targets used in the experiment. This work suggests that 1) the use of adaptive contrast enhancement can be effective for the display of medical images and 2) modeling the eye’s detection of contrast as an edge-sensitive process will require further evaluation to determine if such models are useful.

Zimmons, Paul Michael (2004)

“The Influence of Lighting Quality on Presence and Task Performance in Virtual Environments”
Under the direction of Frederick P. Brooks, Jr.
Electronic copy available

This dissertation describes three experiments that were conducted to explore the influence of lighting in virtual environments. The first experiment (Pit Experiment), involving 55 participants, took place in a stressful, virtual pit environment. The purpose of the experiment was to determine if the level of lighting quality and degree of texture resolution increased the participants’ sense of presence as measured by physiological responses. Findings indicated that as participants moved from a low-stress environment to an adjacent high-stress environment, there were significant increases in all physiological measurements; and the experiment did not discriminate between conditions. In the second experiment (Gallery Experiment), 63 participants experienced a non-stressful virtual gallery. This experiment studied the influence of lighting quality, position, and intensity on movement and attention. Participants occupied spaces lit with higher intensities for longer periods of time and gazed longer at objects that were displayed under higher lighting contrast. This experiment successfully utilized a new technique, attention mapping, for measuring behavior in a three-dimensional virtual environment. Attention mapping provides an objective record of viewing times which can be used to examine and compare the relative importance of different components in the environment. Experiment 3 (Knot Experiment) utilized 101 participants to investigate the influence of three lighting models (ambient, local, and global) on object recognition accuracy and speed. Participants looked at an object rendered with one lighting model and then searched for that object among distractor objects rendered with the same or different lighting model. Accuracy scores were significantly lower when there were larger differences in the lighting model between the search object and searched set of objects. Search objects rendered in global or ambient illumination took significantly longer to identify than those rendered in a local illumination model.