DuranRafid/MachineTranslation
Data\data.zip :- Original unprocessed raw data. This is the only file that you put in place manually. All the following data files should be generated by data.py. Do not upload this file during submission, as you already gave it to me.
Data\Train\Under_10_min_training\data.zip :- A subset of the training data on which a 5-epoch training run takes less than 10 min.
Data\Train\Under_90_min_tuning\data.zip :- A subset of the training data on which a 10-epoch training run takes less than 90 min. This subset should be used for each hyperparameter combination during tuning.
Data\Train\Best_hyperparameter_80_percent\data.zip :- 80 percent training data. This should be used for training with the optimal hyperparameter settings. The learned model must be saved so that it can be used separately with the test data.
Data\Validation\3_samples\data.zip :- A 3-sample set for validation.
Data\Validation\Validation_10_percent\data.zip :- 10 percent validation data. This should be used to evaluate each hyperparameter combination during tuning.
Data\Test\Test_10_percent\data.zip :- 10 percent test data. This should be used to evaluate the performance of your saved model.

tuning_results.txt :- Performance for each hyperparameter combination during tuning.
hyperparameter.txt :- Optimal hyperparameters after tuning.
model.h5 :- Your saved model in HDF5 format.
Results.docx :- Tuning and test results in table format.

script.bat :- Installs any dependencies for your program.
data.py :- All data preprocessing code.
train.py :- All training and model saving code.
tune.py :- All tuning, hyperparameter search and validation code. This calls the training module from train.py.
test.py :- All model loading and testing code.
Lib\ :- Any other code or libraries you need.
Tmp\ :- Created at runtime. All temporary data should reside in this folder. Deleted at the end of execution.

Execution order is given below. Assume that the current directory is your project folder. The following script is for Windows. On Linux, the python command does not have the .exe extension.
Command-line arguments on the same line as the python command are input files. The files listed on the lines after it are output files.

################################################################################
md Tmp
script.bat
python.exe data.py .\Data\data.zip
    .\Data\Train\Best_hyperparameter_80_percent\
    .\Data\Validation\Validation_10_percent\
    .\Data\Test\Test_10_percent\
    .\Data\Train\Under_10_min_training\
    .\Data\Train\Under_90_min_tuning\
    .\Data\Validation\3_samples\
python.exe tune.py .\Data\Train\Under_90_min_tuning\data.zip .\Data\Validation\Validation_10_percent\data.zip
    .\tuning_results.txt
    .\hyperparameter.txt
python.exe train.py .\Data\Train\Best_hyperparameter_80_percent\data.zip .\hyperparameter.txt
    .\model.h5
python.exe test.py .\Data\Test\Test_10_percent\data.zip .\model.h5
rd Tmp /s /q
################################################################################
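
Since the script above is written for Windows, the same execution order could also be driven from Python so that it runs unchanged on Linux. The following is only a rough sketch under some assumptions: the names `STEPS`, `build_commands` and `run_pipeline` are hypothetical and not part of this repository, each script is assumed to receive only its input files as arguments (as described above), and the Windows-only `script.bat` dependency step is skipped.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# (script, input arguments, expected outputs), transcribed from the
# execution order above. Forward slashes work on both Windows and Linux.
STEPS = [
    ("data.py", ["Data/data.zip"], [
        "Data/Train/Best_hyperparameter_80_percent",
        "Data/Validation/Validation_10_percent",
        "Data/Test/Test_10_percent",
        "Data/Train/Under_10_min_training",
        "Data/Train/Under_90_min_tuning",
        "Data/Validation/3_samples",
    ]),
    ("tune.py", ["Data/Train/Under_90_min_tuning/data.zip",
                 "Data/Validation/Validation_10_percent/data.zip"],
     ["tuning_results.txt", "hyperparameter.txt"]),
    ("train.py", ["Data/Train/Best_hyperparameter_80_percent/data.zip",
                  "hyperparameter.txt"],
     ["model.h5"]),
    ("test.py", ["Data/Test/Test_10_percent/data.zip", "model.h5"], []),
]


def build_commands(project_dir, python=sys.executable):
    """Return one argument list per step: interpreter, script, input files."""
    root = Path(project_dir).resolve()
    return [
        [python, str(root / script)] + [str(root / p) for p in inputs]
        for script, inputs, _outputs in STEPS
    ]


def run_pipeline(project_dir):
    """Run every step in order, checking that each step produced its outputs."""
    root = Path(project_dir).resolve()
    tmp = root / "Tmp"
    tmp.mkdir(exist_ok=True)  # equivalent of "md Tmp"
    try:
        for cmd, (_script, _inputs, outputs) in zip(build_commands(root), STEPS):
            subprocess.run(cmd, cwd=root, check=True)  # stop on the first failure
            missing = [p for p in outputs if not (root / p).exists()]
            if missing:
                raise FileNotFoundError(f"{cmd[1]} did not produce: {missing}")
    finally:
        shutil.rmtree(tmp, ignore_errors=True)  # equivalent of "rd Tmp /s /q"
```

Calling `run_pipeline(".")` from the project folder would then mirror the batch script. Using `sys.executable` avoids hard-coding `python.exe` or `python`.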
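
The 80 / 10 / 10 percent train, validation and test split that data.py is expected to produce could look roughly like the sketch below. This is an illustration only, not the repository's actual preprocessing: the function name `split_pairs`, the representation of the data as a list of (source, target) sentence pairs, and the fixed shuffle seed are all assumptions.

```python
import random


def split_pairs(pairs, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle the pairs reproducibly and cut them into train / validation / test."""
    items = list(pairs)
    random.Random(seed).shuffle(items)  # fixed seed so the split is repeatable
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains, about 10 percent
    return train, val, test
```

The smaller sets named in the file list (the 3-sample validation set and the time-limited training subsets) could then be taken as slices, for example `val[:3]`. How large the time-limited subsets need to be depends on the model and hardware, so their sizes have to be found by timing a training run.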