Dictate3D - An AI-Powered 3D Editor

Dictate3D is a 3D application that uses AI to turn verbal instructions into real-time 3D scene changes. It offers a dynamic, interactive platform for manipulating 3D environments: creating, deleting, and moving objects, interpreting user intent, and generating color codes. Dictate3D blends web development, 3D graphics, AI, and machine learning into a distinctive user experience for 3D editing tools.

Overview

Dictate3D is an innovative 3D editor that uses artificial intelligence to interpret and execute user commands for manipulating 3D environments. Built with ThreeJS, React Three Fiber, Next.js, and Flask, and powered by machine learning models such as BERT and feed-forward neural networks, Dictate3D transforms verbal instructions into real-time 3D scene changes.

Technologies Used

The frontend of Dictate3D was developed with React Three Fiber, which provides a dynamic, declarative interface to the 3D scene, and Next.js, which improves performance and SEO through server-side rendering. The visualization itself is powered by ThreeJS, a popular JavaScript library for creating and displaying animated 3D graphics in a web browser.

On the backend, we utilized Flask to host and manage the AI models that interpret user commands. Docker was used to containerize the application, ensuring a consistent environment between development and deployment, which greatly simplifies the workflow and enhances the application's reliability.
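Containerizing a Flask backend like this can be done with a short Dockerfile; the sketch below is illustrative only, and the file names (`app.py`, `requirements.txt`) and port are assumptions rather than the project's actual layout.

```dockerfile
# Minimal sketch of a Dockerfile for a Flask backend serving ML models.
# File names and port are assumptions for illustration.
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and serialized model weights
COPY . .

EXPOSE 5000
CMD ["python", "app.py"]
```

Because the image pins the Python version and bakes in the dependencies, the same container runs identically in development and production, which is the consistency benefit described above.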

Features

Dictate3D offers the following key features:

  • 3D Object Manipulation: Users can interact with the 3D environment through verbal commands such as "move this cube", "delete this", or "add 10 cubes", which are converted into actual 3D manipulations in real time.

  • Intent Recognition: A BERT model, trained on a large corpus of 3D commands, understands and classifies the user's intent from their verbal instructions.

  • Color Code Generation: A feed-forward neural network is used to generate color codes, enabling more variety and customization in the 3D scene.

  • Number Interpretation: The Python word2num library is used to convert verbal numerical inputs into actual numbers that can be used in the 3D manipulation commands.
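To make the number-interpretation step concrete, here is a minimal, self-contained stand-in for that feature (the project itself uses the word2num library; this dictionary-based parser is an illustrative sketch, not the library's implementation). It converts a spoken number phrase like "twenty five" into an integer that a command such as "add twenty five cubes" can use:

```python
# Illustrative stand-in for word2num-style parsing: maps simple
# English number phrases (0-99) to integers.
UNITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
         "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
         "ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
         "fourteen": 14, "fifteen": 15, "sixteen": 16,
         "seventeen": 17, "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}

def words_to_number(phrase: str) -> int:
    """Parse a simple number phrase such as 'twenty five' into an int."""
    total = 0
    for word in phrase.lower().replace("-", " ").split():
        if word in UNITS:
            total += UNITS[word]
        elif word in TENS:
            total += TENS[word]
        else:
            raise ValueError(f"unrecognized number word: {word!r}")
    return total
```

In the real pipeline the parsed integer would then be attached to the recognized intent, e.g. a CREATE command carrying a count.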

Challenges and Solutions

The main challenge was integrating the different AI models and making them work together smoothly to interpret and execute user commands. We overcame this by carefully designing the data pipelines and testing extensively to ensure the accuracy and responsiveness of the command interpretation. Docker was instrumental in streamlining the deployment process and ensuring environment consistency.
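The overall shape of that pipeline can be sketched as follows. This is a simplified illustration of the flow described above, not the project's actual code: the keyword lookup stands in for the BERT classifier, and the intent labels and scene representation are assumptions.

```python
# Simplified sketch of the command pipeline: intent classification
# feeds a dispatcher that updates the scene state.

def classify_intent(command: str) -> str:
    # Stand-in for the BERT classifier: a keyword lookup for illustration.
    for keyword, intent in [("add", "CREATE"),
                            ("delete", "DELETE"),
                            ("move", "MOVE")]:
        if keyword in command.lower():
            return intent
    return "UNKNOWN"

def execute(command: str, scene: list) -> list:
    """Dispatch a classified command against the current scene."""
    intent = classify_intent(command)
    if intent == "CREATE":
        return scene + [{"type": "cube"}]
    if intent == "DELETE" and scene:
        return scene[:-1]
    return scene

scene = []
scene = execute("add a cube", scene)
scene = execute("add a cube", scene)
scene = execute("delete this", scene)
print(len(scene))  # objects remaining after create, create, delete
```

Keeping each stage a pure function of its inputs, as here, is what makes the pipeline easy to test stage by stage.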

Impact

The end product is a powerful 3D editor that simplifies the user's interaction with 3D environments by accepting natural language commands. Dictate3D showcases our skills in web development, 3D graphics, AI, and machine learning, offering a unique blend of these technologies to enhance user experience and push the boundaries of what's possible in 3D editing tools.