Our Exclusive AI Dataset
Comprehensive Text-to-Video Dataset for AI Training and Multimedia Applications
This dataset, designed to fuel text-to-video AI systems, contains 63 entries selected from a larger collection of 1,000 data points. It is organized into six video categories: explanatory, everyday scenes, storytelling, documentary, user-generated content (UGC), and animation. The dataset is divided into two subsets (70% for training, 30% for testing) and includes files detailing attributes such as scene description, actions, language, and duration. It is intended primarily for multimedia search, automatic video creation, and video analysis, with applications in fields such as education, advertising, and sociological analysis.
1. Objective (Text-to-Video Dataset)
This dataset is designed to train text-to-video AI models. The full collection contains 1,000 entries; this sample includes 63.
2. Data Structure
2.1. Video Categories
- Explanatory/Educational Videos: Detailed descriptions of concepts or processes.
- Everyday Scene Videos: Captures daily life moments with rich visual and interactive details.
- Storytelling/Narrative Videos: Stories with dialogues or narrations tied to visible events.
- Thematic/Documentary Videos: In-depth narration explaining visual elements.
- User-Generated Content (UGC): Videos from platforms with simple captions or subtitles.
- Animated/Synthetic Videos: Precisely described actions and scenes, often with scripts or subtitles.
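For programmatic filtering, the six categories above could be represented as an enumeration. This is a minimal Python sketch; the member names and label strings are my own mapping of the categories, since the exact labels used in `data.csv` are not specified here:

```python
from enum import Enum

class VideoCategory(Enum):
    """The six content categories described in this documentation.

    The label strings are illustrative, not taken from data.csv.
    """
    EXPLANATORY = "Explanatory/Educational"
    EVERYDAY_SCENE = "Everyday Scene"
    STORYTELLING = "Storytelling/Narrative"
    DOCUMENTARY = "Thematic/Documentary"
    UGC = "User-Generated Content"
    ANIMATED = "Animated/Synthetic"

# Example: look up a category by its label.
category = VideoCategory("Explanatory/Educational")
```

An enum like this makes category values typo-proof when slicing the dataset by content type.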
2.2. Technical Composition
- data.csv: Complete dataset.
- readme.md: Dataset usage guide.
- Markdown Documentation: An extended version of the readme file.
- Train and Test Folders:
  - train.csv: 70% of the data, for training.
  - test.csv: 30% of the data, for testing.
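The 70/30 split can be reproduced roughly as follows. This is a sketch using only the standard library; the actual method used to produce `train.csv` and `test.csv` (random seed, stratification by category) is not documented, so treat it as illustrative:

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle rows and split them into train/test subsets."""
    shuffled = rows[:]  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# With the 63-entry sample, a 70/30 split yields 44 training rows
# and 19 test rows.
rows = list(range(63))  # stand-in for the 63 CSV records
train, test = train_test_split(rows)
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing models trained on the same subset.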
3. Field Descriptions
| Field | Description |
|---|---|
| Title | Title of the video. |
| Text Description | Narrative describing the scene or event. |
| Described Actions | Key actions described. |
| Emotions or Tone | Ambiance or emotions in the description. |
| Language | Language of the description. |
| Duration | Total video duration. |
| Content Category | Video classification. |
| Location | Scene setting or location. |
| Audio Presence | Indicates whether the video has audio. |
| Entities | Visible objects, animals, or people. |
| Source | Video origin (e.g., recorded, generated). |
| Creation Date | Recording or creation date. |
| Tags/Keywords | Keywords for easier search. |
| URL | Link to access the video. |
| Weather Conditions | Relevant weather information. |
| Time of Day | Morning, afternoon, evening, or night. |
| Channel | Channel that owns the video. |
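A single row could be modeled as a dataclass with one attribute per documented field. This is a sketch; the Python-style attribute names are my own mapping of the column headers, not names taken from `data.csv`, and the duration format is assumed:

```python
from dataclasses import dataclass, field

@dataclass
class VideoRecord:
    """One dataset row, one attribute per documented field."""
    title: str
    text_description: str
    described_actions: str
    emotions_or_tone: str
    language: str
    duration: str  # e.g. "00:02:35"; the exact format is not specified
    content_category: str
    location: str
    audio_presence: bool
    entities: list[str] = field(default_factory=list)
    source: str = ""
    creation_date: str = ""
    tags: list[str] = field(default_factory=list)
    url: str = ""
    weather_conditions: str = ""
    time_of_day: str = ""
    channel: str = ""

# Invented example row, not taken from the dataset.
record = VideoRecord(
    "Sunrise jog", "A runner crosses a quiet park at dawn.",
    "running", "calm", "English", "00:01:10",
    "Everyday Scene", "park", True,
)
```

Typed records like this catch missing or mistyped fields earlier than raw CSV dictionaries do.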
4. Use Cases
- Multimedia Search: Enhance text-based search and video indexing.
- Automated Content Creation: Generate educational or UGC videos from text.
- Video Analysis: Detect emotions, objects, and visual elements in scenes.
- Personalized Content: Tailor videos for ads or virtual assistants.
- Specialized Applications: Create educational content, VR/AR media, or conduct social/psychological studies.
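As an illustration of the multimedia-search use case, the Tags/Keywords field supports simple text-based retrieval. This is a minimal sketch over in-memory records; the sample entries are invented, not taken from the dataset:

```python
def search_by_keyword(records, query):
    """Return records whose tags contain the query (case-insensitive)."""
    q = query.lower()
    return [r for r in records if any(q in tag.lower() for tag in r["tags"])]

# Invented sample records shaped like the Tags/Keywords field.
records = [
    {"title": "Morning market", "tags": ["market", "street", "food"]},
    {"title": "Cartoon short", "tags": ["animation", "comedy"]},
]
hits = search_by_keyword(records, "anim")
```

A production system would index tags rather than scan linearly, but the data model is the same.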
Technical Documentation for the Text-to-Video Dataset “VidData”
About Databoost
Databoost, registered in the United States, is an international company with offices and subsidiaries in Madagascar. Through this global structure, we provide superior quality solutions by combining American expertise and local Malagasy talent. We emphasize flexibility, creativity, and efficiency, with a commitment to serving our clients on a global scale while remaining deeply rooted in local realities.
Copyright © Databoost, 2024. All rights reserved