DuranRafid/MachineTranslation

Data\data.zip :- Original unprocessed raw data. This is the only file that you place manually. All the files listed below should be generated by data.py. Do not upload this file with your submission, as you have already given it to me.
Data\Train\Under_10_min_training\data.zip :- A subset of the training data on which 5 epochs of training take less than 10 minutes.
Data\Train\Under_90_min_tuning\data.zip :- A subset of the training data on which 10 epochs of training take less than 90 minutes. This subset should be used for each hyperparameter combination during tuning.
Data\Train\Best_hyperparameter_80_percent\data.zip :- 80 percent of the data, for training. This should be used for training with the optimal hyperparameter settings. The learned model must be saved so that it can be used separately with the test data.
Data\Validation\3_samples\data.zip :- A 3-sample set for validation.
Data\Validation\Validation_10_percent\data.zip :- 10 percent of the data, for validation. This should be used to evaluate each hyperparameter combination during tuning.
Data\Test\Test_10_percent\data.zip :- 10 percent of the data, for testing. This should be used to evaluate the performance of your saved model.
tuning_results.txt :- Performance for each hyperparameter combination during tuning
hyperparameter.txt :- Optimal hyperparameters after tuning
model.h5 :- Your saved model in HDF5 format
Results.docx :- Tuning and test results in table format. 
script.bat :- Installs any dependencies for your program.
data.py :- All data preprocessing code
train.py :- All training and model saving code
tune.py :- All tuning, hyperparameter search and validation code. This will call training module from train.py
test.py :- All model loading and testing code
Lib\ :- Any other code or libraries you need
Tmp\ :- Created at runtime. All temporary data should reside in this folder. Deleted at the end of execution.
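
The 80/10/10 split that data.py is expected to produce can be sketched as follows. This is only an illustration under stated assumptions: the function name split_indices, the fixed seed, and the idea of splitting by shuffled sample index are hypothetical and are not taken from the repository's data.py.

```python
import random


def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts.

    Hypothetical helper: the real data.py may split the data differently.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible

    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The smaller subsets (Under_10_min_training, Under_90_min_tuning, 3_samples) could be taken as prefixes of the corresponding index lists, with their sizes chosen by timing a short training run. That sizing step is not shown here.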

The execution order is given below. Assume that the current directory is your project folder.
The following script is for Windows. On Linux, the python command does not have the .exe extension. Command-line arguments on the same line as a python command are its input files. The paths listed on the lines after each command are the output files it produces.
###############################################################################################################################################################
md Tmp
script.bat
python.exe data.py .\Data\data.zip 
.\Data\Train\Best_hyperparameter_80_percent\ 
.\Data\Validation\Validation_10_percent\ 
.\Data\Test\Test_10_percent\ 
.\Data\Train\Under_10_min_training\ 
.\Data\Train\Under_90_min_tuning\ 
.\Data\Validation\3_samples\
python.exe tune.py .\Data\Train\Under_90_min_tuning\data.zip .\Data\Validation\Validation_10_percent\data.zip 
.\tuning_results.txt 
.\hyperparameter.txt
python.exe train.py .\Data\Train\Best_hyperparameter_80_percent\data.zip .\hyperparameter.txt
.\model.h5
python.exe test.py .\Data\Test\Test_10_percent\data.zip .\model.h5
rd Tmp /s /q
###############################################################################################################################################################
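
The tuning step in the script above (tune.py evaluates each hyperparameter combination and reports the best one) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the names grid_search and dummy_train_and_score, the hyperparameters learning_rate and batch_size, and the convention that a higher score is better. The real tune.py would call the training module from train.py instead of the stand-in scoring function.

```python
import itertools


def grid_search(train_and_score, grid):
    """Try every combination in `grid`; return all results and the best one.

    `train_and_score` is a hypothetical callable that trains on the tuning
    subset and returns a validation score (higher is assumed to be better).
    """
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)
        results.append((params, score))
    best_params, best_score = max(results, key=lambda item: item[1])
    return results, best_params, best_score


def dummy_train_and_score(params):
    # Stand-in for real training: a made-up score, not a real metric.
    return -abs(params["learning_rate"] - 0.01) - abs(params["batch_size"] - 64) / 1000


grid = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [32, 64]}
results, best_params, best_score = grid_search(dummy_train_and_score, grid)
print(len(results), best_params)
```

In this layout, the full results list would be written to tuning_results.txt and best_params to hyperparameter.txt. The on-disk format of those files is not specified here, so it is left out of the sketch.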





  
