User:Magnus Manske/File handling
This is a proposal. Feel free to edit!
Several issues with current Wikimedia file storage:
- NFS (misc problems)
- Different engines for different extensions
- Temporary files are kept forever (never cleaned up)
- (keep adding)
Proposed solution:
- Central MySQL database for handling all file storage (temporary and permanent) for all projects
- Can handle multiple storage types (could even have the same file in multiple storage engines during move to new storage engine)
- Scales (add DB slaves if necessary)
- file_timestamp could record the time of the last request (only actually updated for 1 in 100 requests, to keep writes down), so that old, unused (temporary) files can be nuked
- file_key would be the file name for uploaded files, or the text inside the math tag, or the md5sum of the timeline text, or...
- file_location would be a file_storage-specific key (NFS path, cloud ID,...)
- width and height can be stored (thumbnails), but are optional
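The file_key and timestamp ideas in the list above could be sketched roughly as follows. This is only an illustration: the helper names (make_file_key, should_touch, current_timestamp) and the YYYYMMDDHHMMSS UTC format are assumptions, not part of the proposal, and key rules for types the list leaves open ("or...") are deliberately not invented.

```python
import hashlib
import random
import time

SAMPLE_RATE = 100  # update the timestamp for about 1 in 100 requests


def make_file_key(file_type: str, source: str) -> bytes:
    """Derive file_key for a request (hypothetical helper)."""
    if file_type in ("file", "math"):
        # Uploaded file name, or the text inside the math tag, used as-is.
        return source.encode("utf-8")
    if file_type == "timeline":
        # md5sum of the timeline text, stored as 32 hex characters.
        return hashlib.md5(source.encode("utf-8")).hexdigest().encode("ascii")
    # The proposal leaves other types open, so no rule is guessed here.
    raise ValueError(f"no file_key rule defined for type {file_type!r}")


def should_touch(rng=random) -> bool:
    """Return True for roughly 1 in SAMPLE_RATE calls."""
    return rng.randrange(SAMPLE_RATE) == 0


def current_timestamp(now=None) -> str:
    """14-character UTC timestamp (assumed format), fits VARCHAR(14)."""
    return time.strftime("%Y%m%d%H%M%S", time.gmtime(now))
```

A request handler would call should_touch() on every hit and write current_timestamp() to the row only when it returns True. The stored time can therefore lag behind the real last request, which should not matter as long as the purge cutoff is much longer than the typical gap between sampled updates.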
Proposed DB tables:
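CREATE TABLE file (
  file_id INTEGER PRIMARY KEY AUTO_INCREMENT,
  file_project INTEGER, -- refers to project.project_id
  file_type SET ('file','thumbnail','math','timeline'),
  file_key MEDIUMBLOB, -- file name, math tag text, md5sum of timeline text, ...
  file_storage SET ('nfs','cloud','vapor'), -- SET allows several engines at once during a move
  file_location MEDIUMBLOB, -- storage-specific key (NFS path, cloud ID, ...)
  file_width INTEGER DEFAULT NULL, -- optional, for thumbnails
  file_height INTEGER DEFAULT NULL, -- optional, for thumbnails
  file_timestamp VARCHAR(14) -- time of last (sampled) request
) engine=InnoDB;
CREATE TABLE project (
  project_id INTEGER PRIMARY KEY AUTO_INCREMENT,
  project_name VARCHAR(255)
) engine=InnoDB;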
Add indices!
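As a starting point for the indices, here is a rough sketch of two candidates and of the cleanup they would serve. Everything specific in it is an assumption: an in-memory SQLite database stands in for the central MySQL server, the index names are made up, and treating thumbnail, math and timeline rows as "temporary" is only a guess at what the proposal intends.

```python
import sqlite3

# Assumption: which file types count as temporary.
TEMPORARY_TYPES = ("thumbnail", "math", "timeline")

# In-memory SQLite as a stand-in for MySQL; SET columns become TEXT here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file (
    file_id INTEGER PRIMARY KEY,
    file_project INTEGER,
    file_type TEXT,
    file_key BLOB,
    file_storage TEXT,
    file_location BLOB,
    file_width INTEGER DEFAULT NULL,
    file_height INTEGER DEFAULT NULL,
    file_timestamp VARCHAR(14)
);
-- Hypothetical index for lookups by project, type and key.
-- MySQL needs a prefix length on BLOB columns, e.g. file_key(255).
CREATE INDEX file_lookup ON file (file_project, file_type, file_key);
-- Hypothetical index for purging by last-request time.
CREATE INDEX file_age ON file (file_type, file_timestamp);
""")


def purge_stale(conn, cutoff):
    """Delete rows of temporary types whose timestamp is older than cutoff.

    Only the database rows are removed; a real cleanup would also have to
    delete the data behind file_location from its storage engine.
    """
    placeholders = ",".join("?" * len(TEMPORARY_TYPES))
    cur = conn.execute(
        f"DELETE FROM file WHERE file_type IN ({placeholders})"
        " AND file_timestamp < ?",
        (*TEMPORARY_TYPES, cutoff),
    )
    conn.commit()
    return cur.rowcount


# Made-up sample rows: one permanent upload, one stale and one fresh math file.
conn.executemany(
    "INSERT INTO file (file_project, file_type, file_key, file_storage,"
    " file_location, file_timestamp) VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "file", b"Example.jpg", "nfs", b"/hypothetical/Example.jpg", "20090101000000"),
        (1, "math", b"x^2", "nfs", b"/hypothetical/math/1", "20090101000000"),
        (1, "math", b"y^2", "nfs", b"/hypothetical/math/2", "20100601000000"),
    ],
)
removed = purge_stale(conn, "20100101000000")
```

With these sample rows only the stale math entry is removed; the upload is kept despite its old timestamp because its type is not treated as temporary. Comparing the 14-character timestamps as strings works because the assumed YYYYMMDDHHMMSS format sorts chronologically.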