Our Exclusive AI Dataset
Technical Documentation for the Text-to-Video Dataset “VidData”
This dataset contains 1006 annotated videos of everyday scenes, used for training and evaluating AI models in video generation and recognition. It is structured to meet the needs of Text-to-Video models and motion analysis.
2. Dataset Specifications
2.1. Generation Criteria
- Maximum video duration: 10 seconds
- Video themes:
  - Walking
  - Exercising
  - Writing
  - Shopping
  - Sleeping
  - Meditating
  - Working
  - Studying
  - Driving
  - Washing
  - Gardening
  - Calling
  - Listening
  - Organizing
  - Planning
  - Relaxing
  - Teaching
- Video size: 512×512 pixels
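The size and duration criteria above can be expressed as a small check. This is a minimal sketch: the function name and constants are illustrative and not part of the dataset tooling, and the width, height, and duration values are assumed to come from whatever tool is used to inspect the clip.

```python
# Limits taken from the generation criteria listed above.
MAX_DURATION_SECONDS = 10.0
EXPECTED_SIZE = (512, 512)  # (width, height) in pixels


def meets_generation_criteria(width, height, duration_seconds):
    """Return True if a clip matches the stated size and duration limits.

    Hypothetical helper: the caller supplies the measurements.
    """
    size_ok = (width, height) == EXPECTED_SIZE
    duration_ok = 0 < duration_seconds <= MAX_DURATION_SECONDS
    return size_ok and duration_ok
```

For instance, `meets_generation_criteria(512, 512, 6.5)` accepts a 6.5-second clip, while a 12-second clip of the same size is rejected.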
2.2. Dataset Organization
The dataset follows this structure:
    VidData
    ├─ data
    │  └─ train
    │     └─ VidData.csv
    ├─ video
    │  ├─ ---_iRTHryQ_13_0to241.mp4
    │  ├─ ---agFLYkbY_7_0to303.mp4
    │  └─ --0ETtekpw0_2_18to486.mp4
    └─ readme.md
- data/train/: Contains CSV files with metadata associated with the videos.
- video/: Contains the video files.
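As an illustration of this layout, the sketch below builds the paths to the metadata file and to an individual video. It assumes that `data/` and `video/` both sit directly under the `VidData` root; the helper names are hypothetical.

```python
from pathlib import Path


def dataset_paths(root):
    """Return the assumed locations of the metadata CSV and the video folder."""
    root = Path(root)
    return {
        "metadata": root / "data" / "train" / "VidData.csv",
        "videos": root / "video",  # assumption: video/ is a sibling of data/
    }


def video_path(root, file_name):
    """Build the path to one video file from its file name."""
    return dataset_paths(root)["videos"] / file_name
```

Calling `video_path("VidData", "---_iRTHryQ_13_0to241.mp4")` would then point at the first video shown in the tree.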
3. Data Structure
The dataset is stored as a CSV file and includes the following columns:
Column | Type | Description |
---|---|---|
video | string | Video file name |
caption | string | Textual description of the video |
temporal consistency score | float64 | Temporal consistency score |
fps | float64 | Frames per second |
frame | int64 | Number of frames in the video |
seconds | float64 | Video duration in seconds |
motion score | float64 | Motion score |
camera motion | string | Type of camera motion (e.g., pan_left) |
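One way to make sure a loaded file matches this schema is to check the header and cast each column to its listed type. The sketch below is an assumption-laden example: it uses the underscore-style column names that appear in the sample entry (for example `video_name`), which may differ from the actual CSV header, and `load_metadata` is a hypothetical helper.

```python
import io

import pandas as pd

# Assumed header names (underscore style); adjust them to the real CSV header.
EXPECTED_DTYPES = {
    "video_name": "string",
    "caption": "string",
    "temporal_consistency_score": "float64",
    "fps": "float64",
    "frames": "int64",
    "duration_seconds": "float64",
    "motion_score": "float64",
    "camera_motion": "string",
}


def load_metadata(csv_source):
    """Read the metadata CSV and enforce the documented column types."""
    df = pd.read_csv(csv_source)
    missing = [name for name in EXPECTED_DTYPES if name not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return df.astype(EXPECTED_DTYPES)
```

`csv_source` can be a file path or any file-like object, so the same function works on `data/train/VidData.csv` and on an in-memory `io.StringIO` used for testing.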
4. Libraries Used
4.1. Library Examples
Here are some example libraries that can be used when analyzing this data:
- OpenCV: Video manipulation and processing (reading, writing, frame extraction, contour detection, filtering, etc.).
- Scikit-Image: Calculating the Structural Similarity Index (SSIM) for image quality evaluation and various image transformations (segmentation, filtering, etc.).
- NumPy: Efficient manipulation of matrices and arrays, essential for calculations on images and videos.
- Pandas: Managing and structuring metadata associated with videos (e.g., file names, timestamps, annotations).
- Matplotlib/Seaborn: Visualizing analysis results as graphs.
4.2. Installing Dependencies
Follow the instructions below to install the required libraries:
- Create a requirements.txt file and add the following:

      opencv-python==4.8.1.78  # Video manipulation and processing
      scikit-image==0.22.0     # SSIM calculation and image transformations
      numpy==1.26.2            # Efficient manipulation of matrices and arrays
      pandas==2.1.4            # Managing and structuring metadata
      matplotlib==3.8.2        # Visualizing analysis results
      seaborn==0.12.2          # Advanced visualization with enhanced graphics

- Run the command:

      pip install -r requirements.txt

Note: Only include the libraries you need in requirements.txt.
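After installation, it can be useful to confirm that the installed versions match the pinned ones. The following is a minimal sketch using only the standard library; `parse_pins` and `check_pins` are hypothetical helper names, and only `name==version` lines are handled.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(text):
    """Extract {package: pinned_version} from requirements-style text."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, pinned = line.split("==", 1)
            pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins):
    """Map each package to (pinned, installed, matches)."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        report[name] = (pinned, installed, installed == pinned)
    return report
```

A typical use would be `check_pins(parse_pins(open("requirements.txt").read()))`, then printing any entry whose last element is `False`.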
5. Using the Dataset
5.1. Primary Applications
5.1.1. Text-to-Video Generation
- Train models to generate videos from textual input.
- Benchmark performance by comparing generated videos against dataset entries.
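For text-to-video training, the captions and video files need to be paired. The sketch below shows one possible way to do this with the standard library. It assumes the CSV header uses `video_name` and `caption`, as in the sample entry, and `iter_caption_video_pairs` is a hypothetical helper rather than part of the dataset.

```python
import csv
from pathlib import Path


def iter_caption_video_pairs(csv_path, video_dir):
    """Yield (caption, video_path) for every row whose video file exists."""
    video_dir = Path(video_dir)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            path = video_dir / row["video_name"]
            if path.is_file():  # skip rows whose video is missing on disk
                yield row["caption"], path
```

Skipping missing files silently is a design choice made here for brevity; a stricter pipeline might log or raise instead.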
5.1.2. Video Description Models
- Evaluate models designed to generate textual descriptions from videos.
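A generated description can be scored against the dataset caption in many ways. The sketch below uses a simple token-overlap F1 as a stand-in; it is not a metric prescribed by this dataset, and established captioning metrics may be more appropriate for real evaluations. The function names are illustrative.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def caption_f1(generated, reference):
    """Token-overlap F1 between a generated caption and a reference caption."""
    gen = Counter(tokenize(generated))
    ref = Counter(tokenize(reference))
    overlap = sum((gen & ref).values())  # tokens shared by both captions
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Identical captions score 1.0 and captions with no words in common score 0.0; everything else falls in between.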
5.1.3. Temporal Consistency Analysis
- Test how well models maintain smoothness and coherence across frames in generated videos.
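As one possible proxy for temporal consistency, the mean SSIM between consecutive frames can be computed with scikit-image. This is an assumption for illustration only: the source does not state how the dataset's temporal consistency score is calculated. Frames are assumed to be 2D grayscale `uint8` arrays of equal shape.

```python
import numpy as np
from skimage.metrics import structural_similarity


def mean_consecutive_ssim(frames):
    """Average SSIM over each pair of neighbouring frames.

    Illustrative proxy only; not necessarily the dataset's own score.
    """
    if len(frames) < 2:
        raise ValueError("At least two frames are required.")
    scores = [
        structural_similarity(previous, current, data_range=255)
        for previous, current in zip(frames, frames[1:])
    ]
    return float(np.mean(scores))
```

A perfectly static clip yields a value close to 1, while frames of unrelated noise yield a value near 0. Frames could be obtained, for example, by reading a video with OpenCV and converting each frame to grayscale.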
5.2. Example Workflow
Load the dataset using Python:
    import pandas as pd

    # Load the metadata (adjust the path to where VidData.csv is stored)
    dataset = pd.read_csv('VidData.csv')
    print(dataset.head())

    # Access video metadata:
    video = dataset.iloc[0]  # First entry
    print(f"Video Name: {video['video_name']}")
    print(f"Caption: {video['caption']}")
    print(f"Duration: {video['duration_seconds']} seconds")

    # Filter videos based on motion:
    high_motion_videos = dataset[dataset['motion_score'] > 1.0]
    print(high_motion_videos)
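The same metadata can also be summarized per camera motion type. The sketch below assumes the `camera_motion` and `motion_score` column names used in the workflow above; `motion_by_camera` is a hypothetical helper.

```python
import pandas as pd


def motion_by_camera(dataset):
    """Count videos and average the motion score per camera motion type."""
    summary = dataset.groupby("camera_motion")["motion_score"].agg(["count", "mean"])
    return summary.sort_values("mean", ascending=False)
```

Passing the loaded `dataset` DataFrame returns one row per camera motion label, ordered from the highest to the lowest average motion score.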
6. File Format
The dataset is delivered in CSV format, with each row representing a video and each column holding one of its metadata fields.
7. Sample Entry
video_name | caption | temporal_consistency_score | fps | frames | duration_seconds | motion_score | camera_motion |
---|---|---|---|---|---|---|---|
E_1.mp4 | The video shows a soccer player kicking a soccer ball. | 0.948826 | 30 | 195 | 6.5 | 0.826522 | 1.105807 |
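The numeric fields in an entry can be cross-checked against each other: the frame count divided by the frame rate should give the duration. A minimal sketch, with a hypothetical helper name and an arbitrary tolerance:

```python
import math


def duration_is_consistent(frames, fps, duration_seconds, tolerance=0.05):
    """Check that frames / fps matches the stated duration within a tolerance."""
    if fps <= 0:
        return False
    return math.isclose(frames / fps, duration_seconds, abs_tol=tolerance)
```

For the sample entry, 195 frames at 30 fps gives 195 / 30 = 6.5 seconds, which agrees with the listed duration.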
8. Contact
For inquiries, please contact:
- Email: info@databoost.us
- Website: databoost.us
9. Hugging Face
The complete dataset, including the videos, is available from our dataset publication on Hugging Face, linked below.
Comprehensive Text-to-Video Dataset for AI Training and Multimedia Applications
Info
Databoost, registered in the United States, is an international company with offices and subsidiaries in Madagascar. Through this global structure, we provide superior quality solutions by combining American expertise and local Malagasy talent. We emphasize flexibility, creativity, and efficiency, with a commitment to serving our clients on a global scale while remaining deeply rooted in local realities.
Copyright © Databoost, 2024. All rights reserved