# Introduction

This is a PyTorch implementation of a sequence-to-sequence learning toolkit for the i-machine-think project. This repository is a fork of the pytorch-seq2seq library developed by IBM, but has diverged substantially from it after heavy development. For the original implementation, visit [https://github.com/IBM/pytorch-seq2seq](https://github.com/IBM/pytorch-seq2seq).

# Requirements

This library runs with PyTorch 0.3.0. We refer to the [PyTorch website](http://pytorch.org/) to install the right version for your environment. To install the additional requirements (including numpy and torchtext), run:

`pip install -r requirements.txt`

# Quickstart

There are three command-line tools available:

* `train_model.py`
* `evaluate.py`
* `infer.py`

## Training

The script `train_model.py` can be used to train a new model, resume the training of an existing model from a checkpoint, or retrain an existing model from a checkpoint. E.g. to train a model from scratch:

```
# Train a simple model with embedding size 128 and hidden layer size 256
python train_model.py --train $train_path --dev $dev_path --output_dir $expt_dir --embedding_size 128 --hidden_size 256 --rnn_cell gru --epoch 20
```

Several options are available from the command line, including changing the optimizer and batch size, using attention/bidirectionality, and using teacher forcing. For a complete overview, use the *help* function of the script.

## Evaluation and inference

The scripts `infer.py` and `evaluate.py` can be used to run an existing model (loaded from a checkpoint) in inference mode, and to evaluate a model on a test set, respectively. E.g.:

```
# Use the model stored in $checkpoint_path in inference mode
python infer.py --checkpoint_path $checkpoint_path

# Evaluate a trained model stored in $checkpoint_path
python evaluate.py --checkpoint_path $checkpoint_path --test_data $test_data
```

## Example script

The script `example.sh` illustrates the usage of all three tools: it uses the toy data from the test directory (containing a 'reverse' dataset, in which the translation of any sequence of numbers is its inverse), trains a model on this data using `train_model.py`, evaluates this model using `evaluate.py`, and then runs `infer.py` to generate outputs. Once training is complete, you will be prompted to enter a new sequence to translate, and the model will print out its prediction (use Ctrl-C to terminate). Try the example below (a sketch for generating similar toy data yourself appears at the end of this README):

```
Input:           1 3 5 7 9
Expected output: 9 7 5 3 1 EOS
```

## Checkpoints

During training, the top *k* models are stored in a folder which is named using the accuracy and loss of the model on the development set. Currently, *k* is set to 5. The folder contains the model, the source and target vocabulary, and the trainer states. A sketch of loading such a checkpoint programmatically appears at the end of this README.

# Contributing

We welcome pull requests for the library. Please run both the unit tests and the integration test before committing:

`python -m unittest discover`

`sh integration_test.sh`
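# Generating toy data

If you want to create your own data for the 'reverse' task used by `example.sh`, a sketch along the following lines should work. Note that the tab-separated `source<TAB>target` file format is an assumption (it is the convention used by the upstream IBM toy scripts), and the file names are hypothetical; compare with the toy files in the test directory before relying on this.

```python
import random

def generate_reverse_data(path, num_examples=1000, max_len=10, vocab_size=10):
    """Write toy examples in which the target is the reversed source.

    The tab-separated "source<TAB>target" line format is an assumption
    borrowed from the upstream IBM toy scripts; check the toy data in
    the test directory for the format this fork actually expects.
    """
    with open(path, "w") as f:
        for _ in range(num_examples):
            length = random.randint(1, max_len)
            source = [str(random.randint(0, vocab_size - 1)) for _ in range(length)]
            target = reversed(source)
            f.write(" ".join(source) + "\t" + " ".join(target) + "\n")

generate_reverse_data("train.txt")                  # hypothetical training file
generate_reverse_data("dev.txt", num_examples=100)  # hypothetical dev file
```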
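# Loading checkpoints programmatically

Checkpoints can also be used outside of `infer.py`. The sketch below assumes that this fork still exposes the upstream IBM pytorch-seq2seq classes `Checkpoint` (in `seq2seq.util.checkpoint`) and `Predictor` (in `seq2seq.evaluator`); since the code has diverged substantially from upstream, verify these module paths and names against the source before use.

```python
from seq2seq.evaluator import Predictor          # assumed upstream module path
from seq2seq.util.checkpoint import Checkpoint   # assumed upstream module path

# Hypothetical path to a checkpoint folder produced during training.
checkpoint_path = "experiment/checkpoints/best_model"

# A checkpoint bundles the model, both vocabularies, and the trainer state
# (see the 'Checkpoints' section above).
checkpoint = Checkpoint.load(checkpoint_path)
model = checkpoint.model
input_vocab = checkpoint.input_vocab
output_vocab = checkpoint.output_vocab

predictor = Predictor(model, input_vocab, output_vocab)

# On the toy 'reverse' task this should print the input sequence inverted,
# followed by an end-of-sequence marker.
print(predictor.predict("1 3 5 7 9".split()))
```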