Customize3DForm

Introducing human subjectivity to objective objects using a text-based generative pipeline

In collaboration with Wei-Chun Cheng

 

1.Introduction

Recent advances in technology have transformed the interaction between humans and machines into complex, highly collaborative relationships. To improve machine contributions in this partnership, we suggest a dynamic approach that leverages machines to improve human design decisions and develop communication interfaces that enhance the design process during the early stages of defining form, space, and size.

This paper uses an undergraduate core design studio project, Door, Window, and Stair, as the testing 3D design space. It is a conceptual exercise in which students design a cultural artifact by extracting 2D and 3D parti diagrams and transforming them into 3D space with physical materials such as wood, plexiglass, and stick assemblages. The exercise teaches students how to translate descriptions into forms and explore the potential qualities of different spatial configurations.

Figure 1. Door Window Stair Perspective Vignette. Image credits: Christina Graydon

Figure 2. Door Window Stair Model. Image credits: Erin Lee


2.State of the Art: 3D AI and Generative Algorithm

Current generative 3D AI mostly iterates design options by generating new designs through the randomization of rules, parameters, or datasets within a predefined design space. We instead view the generative process as evolutionary. Present methods often fail to preserve essential aspects of a particular model and therefore do not meet designer expectations. Hence, this paper proposes a method for customizing and regenerating specific parts of a 3D model rather than the entire structure. This approach not only provides a new perspective on AI's role in design but also challenges the conventional use of AI merely for optimization. Instead, it promotes AI as a tool to enhance creativity and support varied human-centric solutions.

 

3.Methodology

The methodology has three steps. First, generate a dataset of 3D geometries from a control-based generative parametric model defined by specific design criteria. Second, collect human feedback as text labels using eye tracking and a web-based interface with clickable grid cells (Figure 3).

Figure 3. Left: eye-tracking heatmap. Right: clickable grid cells.

The second step combines two data-gathering techniques that together capture two levels of human reaction to the text prompt and the 3D form, providing an in-depth examination of the generated form. Third, resize the original dataset using a cluster-then-label (CL) approach. CL uses an unsupervised clustering algorithm, the Self-Organizing Map (SOM), to autonomously label unlabeled neighboring forms based on the formal features used to train the unsupervised model, creating a pool of design options to explore (Figure 4).

Figure 4. The pipeline of Customize3DForm. User operations are colored blue to differentiate them from the computational process. The workflow involves two main parts: data collection and data matching.

3.1.Generation of Design Form and Data Curation

To increase the diversity of the elements produced by the parametric generator, a five-layer randomness system was implemented using the Populate 3D component in Grasshopper. Each layer introduces a specific level of control, with fixed seed numbers applied to ensure consistent element spawning locations across layers. The layers account for different variables such as the placement, rotation, and transparency of planks. Ten unique designs were generated with this process to validate the proposal (Figure 5).
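The role of the fixed seeds can be illustrated outside Grasshopper. The following is a minimal Python sketch, not the Grasshopper definition used in this project: the layer names, seed values, value ranges, and the `generate_design` helper are all hypothetical, and only three layers are shown. It demonstrates that giving each layer its own seeded generator keeps spawning locations stable when another layer changes.

```python
import random

# Hypothetical layer names and seed values. The text names placement,
# rotation, and transparency as examples of what the layers control.
LAYER_SEEDS = {"placement": 11, "rotation": 23, "transparency": 37}


def generate_design(design_id, plank_count=8, size=(4.0, 4.0, 10.0),
                    seeds=LAYER_SEEDS):
    """Return one design as a list of plank dictionaries."""
    # One independent generator per layer: changing the seed of one layer
    # leaves the values drawn by every other layer untouched.
    rng = {name: random.Random(seed * 1000 + design_id)
           for name, seed in seeds.items()}
    planks = []
    for _ in range(plank_count):
        planks.append({
            "position": tuple(rng["placement"].uniform(0.0, s) for s in size),
            "rotation_deg": rng["rotation"].choice([0, 90]),
            "transparent": rng["transparency"].random() < 0.3,
        })
    return planks


# Ten designs, mirroring the number generated in the paper.
designs = [generate_design(i) for i in range(10)]
```

Re-running `generate_design` with the same `design_id` reproduces the same planks, and changing only the rotation seed alters rotations while plank positions stay fixed.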

Figure 5. Parametric-Based Generated Design Options.

Each design is dissected into parts with cellular bounding boxes, which slice the model and create a template for the web interface for human feedback collection (Figure 6).

Figure 6. Cellular Box: the 3D form is unfolded into the Top, Left, and Front viewports to locate the labeled form. The highlighted part is T0L4F6.

The division resolution is user-defined; in this paper, 4 × 4 × 10 cellular boxes are demonstrated. Consequently, each design is divided into 160 cells, and the full dataset of 3D elements contains 1,369 parts (Figure 7).
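The slicing step can be sketched as a simple spatial binning. The Python below is an illustrative assumption rather than the project's implementation: it treats a design as a list of points, assumes the 4 × 4 × 10 resolution maps to the x, y, and z axes, and uses the hypothetical helpers `cell_of` and `slice_into_parts`. It records only cells that contain geometry; how the original pipeline treats empty cells is not stated in the text.

```python
from collections import defaultdict

RESOLUTION = (4, 4, 10)  # cells along x, y, z (axis order is an assumption)


def cell_of(point, bbox_min, bbox_max, resolution=RESOLUTION):
    """Return the (i, j, k) index of the cell containing a point in the box."""
    index = []
    for p, lo, hi, n in zip(point, bbox_min, bbox_max, resolution):
        t = (p - lo) / (hi - lo)               # normalise to [0, 1]
        index.append(min(int(t * n), n - 1))   # points on the upper face go to the last cell
    return tuple(index)


def slice_into_parts(points, bbox_min, bbox_max, resolution=RESOLUTION):
    """Group points by cell; only occupied cells appear in the result."""
    parts = defaultdict(list)
    for point in points:
        parts[cell_of(point, bbox_min, bbox_max, resolution)].append(point)
    return dict(parts)
```

For a bounding box from (0, 0, 0) to (4, 4, 10), the point (1.5, 2.0, 9.99) falls in cell (1, 2, 9).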

Figure 7. 1,369 forms sliced from 10 generated designs.

3.2.Web-Based Interface Design and Text Prompt Definition

The web interface contains four viewports for examining the design options: Top, Left, Front, and Perspective. Each viewport is divided into 20 × 12 grid cells that record the designer’s feedback on the text prompt questions shown at the bottom of the web interface. A timer records when each cell is clicked, which is used to observe the confidence of the designer’s feedback. Eye tracking was performed with the GazePoint GP3, one of the research-standard screen-based eye-tracking systems.
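The click-recording logic can be sketched as follows. This is a Python illustration of the idea, not the web interface's actual code: the `ClickRecorder` class, the pixel dimensions, and the reading of 20 × 12 as 20 columns by 12 rows are assumptions.

```python
import time

GRID_COLS, GRID_ROWS = 20, 12  # assumed to mean 20 columns by 12 rows


class ClickRecorder:
    """Maps viewport clicks to grid cells and timestamps each one."""

    def __init__(self, width_px, height_px, clock=time.monotonic):
        self.width_px = width_px
        self.height_px = height_px
        self.clock = clock
        self.start = clock()   # the timer starts when the prompt is shown
        self.events = []

    def record(self, x_px, y_px, label):
        # Convert pixel coordinates to a (column, row) cell index,
        # clamping clicks on the far edge into the last cell.
        col = min(int(x_px / self.width_px * GRID_COLS), GRID_COLS - 1)
        row = min(int(y_px / self.height_px * GRID_ROWS), GRID_ROWS - 1)
        event = {
            "cell": (col, row),
            "label": label,
            "elapsed_s": self.clock() - self.start,
        }
        self.events.append(event)
        return event
```

The elapsed time per click is the quantity that would be inspected when judging how quickly, and thus how confidently, a designer responded.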

Figure 8. Web Interface, Eye-tracking monitor, and Data collection station.

To evaluate the models, contrasting design words were chosen as text labels describing both the functional utility, such as “Strong - Weak,” and the aesthetic appeal, such as “Usual - Unusual,” of the 3D assemblies (Figure 9). This diverse vocabulary opens the potential to create designs that transcend conventional norms.

Figure 9. Example of description text data collection.

3.3.Rapid Autonomous Labeling with Self-Organizing Map Algorithm

To reduce the workload of having students label all the 3D parts, we employed a cluster-then-label approach, mapping labels to all similar parts within a defined cluster. We used an unsupervised machine learning algorithm called the Self-Organizing Map (SOM), which combines clustering and dimensionality reduction. SOM transforms the data from a high-dimensional to a low-dimensional space (usually two or three dimensions) while preserving the organization of the original high-dimensional space. Data points that are similar in the high-dimensional space therefore remain close in the low-dimensional space and fall within the same cluster. The low-dimensional space is represented by a 2D grid with a fixed number of points, known as a map, which allows high-dimensional data to be visualized (Figure 10). SOM enhances transparency by clustering similar data points, reducing dimensionality, visualizing design spaces, and identifying outliers, which helps designers better understand complex relationships and make more informed design decisions.
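The cluster-then-label idea can be sketched with a small, dependency-free SOM. This is an illustrative Python sketch, not the implementation used in this project: the learning-rate and neighbourhood schedules, the function names, and the rule that a cluster takes the first label it receives are all assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the map node whose weights are closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(x, w))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows=10, cols=10, epochs=20, seed=0):
    """Train a SOM on a list of feature vectors; returns weights[row][col]."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):   # visit samples in random order
            frac = step / steps
            lr = 0.5 * (1.0 - frac)                          # decaying learning rate
            radius = max(rows, cols) / 2 * (1.0 - frac) + 1  # shrinking neighbourhood
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2 * radius ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])  # pull node toward the sample
            step += 1
    return weights


def propagate_labels(weights, features, known_labels):
    """Copy each known label to every part that lands in the same map node."""
    cluster_label = {}
    for idx, label in known_labels.items():
        cluster_label.setdefault(best_matching_unit(weights, features[idx]), label)
    result = {}
    for i, f in enumerate(features):
        node = best_matching_unit(weights, f)
        if node in cluster_label:
            result[i] = cluster_label[node]
    return result
```

Here `known_labels` maps part indices to labels collected from users; parts that fall in a node with no labeled member remain unlabeled.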

To cluster the 3D parts based on their formal characteristics, we calculated the Higher Order Statistics (HOS) on the nodes of each 3D part. HOS is a statistical method that captures the invariances of the design features at a higher level. This approach helps identify and cluster parts with similar formal qualities, facilitating the efficient assignment of text labels to the 3D parts within each cluster, as seen in Figure 10. Each cluster contains multiple forms that are organized by the similarity of the observed features (Figure 11).
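As an illustration of what such features might look like, the sketch below computes third- and fourth-order standardized moments (skewness and excess kurtosis) of the node coordinates along each axis. The text does not state which statistics were used, so this choice, along with the `hos_features` name, is an assumption. These moments are unchanged when a part is translated, which is one kind of invariance such features can provide.

```python
def hos_features(nodes):
    """Return [skew_x, kurt_x, skew_y, kurt_y, skew_z, kurt_z] for a part.

    `nodes` is a list of (x, y, z) tuples. Axes with zero variance
    contribute 0.0 for both statistics.
    """
    features = []
    n = len(nodes)
    for axis in range(3):
        values = [node[axis] for node in nodes]
        mean = sum(values) / n
        m2 = sum((v - mean) ** 2 for v in values) / n   # variance
        if m2 == 0:
            features.extend([0.0, 0.0])
            continue
        m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
        m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
        features.append(m3 / m2 ** 1.5)      # skewness
        features.append(m4 / m2 ** 2 - 3.0)  # excess kurtosis
    return features
```

Each part then becomes a six-dimensional feature vector that a clustering algorithm can consume.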

Figure 10. SOM clusters in a 10 × 10 grid with text labels.

Figure 11. Number of forms in each cluster.

After the autonomous labeling with SOM, some parts can carry contradictory labels. We reconciled these contradictions in the qualitative user inputs with a straightforward label occurrence analysis.
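One simple form such an occurrence analysis could take is a count of each label in a contrasting pair, keeping the more frequent one. The text does not specify the rule, so the majority vote, the tie handling, and the `resolve_label` name in this Python sketch are assumptions.

```python
from collections import Counter


def resolve_label(votes, pair=("Strong", "Weak")):
    """Pick the more frequent label of a contrasting pair, or None on a tie."""
    counts = Counter(v for v in votes if v in pair)
    first, second = counts[pair[0]], counts[pair[1]]
    if first == second:
        return None  # tie or no votes: leave the part unresolved
    return pair[0] if first > second else pair[1]
```

For example, a part labeled "Strong" twice and "Weak" once would keep "Strong", while an even split would be left for further review.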

3.4.Iterating Design with Search Engine-Based Generator

Data matching requires four layers of data to be linked and converted: the parameters of the associated form, the eye-tracking data with labels, the clickable cells with labels, and the cellular grid index. Cross-matching these four layers creates a library of forms that users can search with subjective terms. Customize3DForm can then generate design proposals from descriptions the user provides. In this paper, we present a UI control panel that performs this search and form generation from natural-language text input alone (Figure 12).
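The search over the labeled library can be sketched as a linear scan that ranks parts by how many query terms match their labels. This Python sketch is a simplified assumption of how such a lookup could work: the `search_parts` function, the record layout, and all part identifiers other than T0L4F6 are hypothetical.

```python
def search_parts(library, query):
    """Return part ids ranked by the number of query terms matching their labels."""
    terms = {t.lower() for t in query.split()}
    scored = []
    for part in library:              # linear scan over every part
        labels = {label.lower() for label in part["labels"]}
        score = len(terms & labels)
        if score:
            scored.append((score, part["id"]))
    # Highest score first; ties broken alphabetically by id.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [part_id for _, part_id in scored]


# Hypothetical library entries for illustration only.
library = [
    {"id": "T0L4F6", "labels": ["Strong", "Unusual"]},
    {"id": "T1L2F3", "labels": ["Weak", "Unusual"]},
    {"id": "T2L0F1", "labels": ["Usual"]},
]
matches = search_parts(library, "strong unusual")
```

With these entries, the query "strong unusual" ranks T0L4F6 first because both terms match, followed by T1L2F3.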

Figure 12. Design Interface

 

4.Results and Discussion

This project showcased an approach to integrating a text-guided 3D generator to customize parts of an initial 3D model design, a method based on subjective human input, and a series of AI methods to guide the design output. Figure 12 shows the final result of the proposed 3D generator. However, there were some limitations:

a) Insufficient text labels, with occasional contradictions.

b) The generative algorithm uses a brute-force search method, which could be made more efficient and easier to extend.

Despite these limitations, we identify some solutions for future research. First, by diversifying the training data, we can enhance the accuracy of the algorithm and train deep-learning generative models that are more efficient and sophisticated. Additionally, creating an interface to collect more comprehensive data, including verbal descriptions of the 3D models, would allow researchers to better understand how humans perceive space and further improve the accuracy and usability of our process.

 

5.Conclusion

Customize3DForm is a powerful tool for architects and designers to get quick and personalized solutions tailored to their needs and preferences. A challenge is that inexperienced designers may not be able to fully evaluate every aspect of a generated design. To address this, we propose an iterative feedback loop where designers can focus on specific design elements and refine their design through multiple interactions. By breaking down the evaluation into manageable, focused tasks, students can provide meaningful feedback without needing to assess the entire design in depth.

Additionally, we propose incorporating peer review to broaden feedback collection and discussion to foster a learning environment beneficial for design education.

For future research, conducting user studies or case examples demonstrating how designers interact with the system would validate its effectiveness. Including quantitative or qualitative measures regarding the system’s impact on the design process could further strengthen the evaluation.

Looking ahead, we aim to fully streamline the workflow into Rhino Grasshopper. For example, users could type "create a curved surface" to generate a 3D form with a curved surface. This development would make the tool easier to use, enabling users to generate complex 3D forms through simple, conversational interactions.

