Dataset

We use torchtext to manage data loading and processing. For more information about torchtext, see https://github.com/pytorch/text.

Fields

class machine.dataset.fields.SourceField(**kwargs)

Wrapper class of torchtext.data.Field that forces batch_first and include_lengths to be True.

Variables:	eos_id – index of the end-of-sentence symbol.
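As a rough illustration of what the two forced options imply, the following pure-Python sketch pads a batch so that the batch dimension comes first and returns the original sequence lengths alongside it. It does not use torchtext; the helper name pad_batch_first and the "<pad>" token are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple

PAD = "<pad>"  # hypothetical padding token, chosen for this sketch


def pad_batch_first(
    batch: Sequence[Sequence[str]], pad_token: str = PAD
) -> Tuple[List[List[str]], List[int]]:
    """Pad every sequence to the longest one.

    Returns (padded, lengths): padded has one row per example
    (batch dimension first), and lengths holds the unpadded length
    of each row, mirroring the idea behind include_lengths=True.
    """
    lengths = [len(seq) for seq in batch]
    max_len = max(lengths, default=0)
    padded = [list(seq) + [pad_token] * (max_len - len(seq)) for seq in batch]
    return padded, lengths


batch = [["jump", "twice"], ["walk"], ["turn", "left", "twice"]]
padded, lengths = pad_batch_first(batch)
print(padded)
print(lengths)
```

Keeping the lengths next to the padded batch lets downstream code (for example, a recurrent encoder) ignore the padding positions.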

build_vocab(*args, **kwargs)

Construct the Vocab object for this field from one or more datasets.

Parameters:
	Positional arguments – Dataset objects or other iterable data sources from which to construct the Vocab object that represents the set of possible values for this field. If a Dataset object is provided, all columns corresponding to this field are used; individual columns can also be provided directly.
	Remaining keyword arguments – Passed to the constructor of Vocab.
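The idea of building one vocabulary from several data sources, and then resolving an index such as eos_id from it, can be sketched without torchtext as follows. This is a simplified stand-in, not the Vocab implementation: the function build_vocab_sketch, its specials and min_freq parameters, and the frequency-then-alphabetical ordering are assumptions made for this example.

```python
from collections import Counter
from itertools import chain
from typing import Dict, Iterable, Sequence


def build_vocab_sketch(
    *sources: Iterable[Sequence[str]],
    specials: Sequence[str] = ("<unk>", "<pad>"),
    min_freq: int = 1,
) -> Dict[str, int]:
    """Map tokens to indices using counts pooled over all sources.

    Each source is an iterable of token sequences. Special tokens get
    the lowest indices; remaining tokens are ordered by descending
    frequency, with ties broken alphabetically.
    """
    # Flatten sources -> sequences -> tokens, then count the tokens.
    counts = Counter(chain.from_iterable(chain.from_iterable(sources)))
    stoi = {tok: idx for idx, tok in enumerate(specials)}
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq and tok not in stoi:
            stoi[tok] = len(stoi)
    return stoi


train = [["jump", "twice"], ["walk"]]
dev = [["jump", "left"]]
stoi = build_vocab_sketch(train, dev, specials=("<unk>", "<pad>", "<eos>"))
eos_id = stoi["<eos>"]  # analogous to looking up eos_id after build_vocab
print(stoi, eos_id)
```

Passing both train and dev pools their token counts, which is the behaviour the positional arguments above describe for multiple datasets.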

class machine.dataset.fields.TargetField(include_eos=True, **kwargs)

Wrapper class of torchtext.data.Field that forces batch_first to be True and, in the preprocessing step, prepends <sos> and appends <eos> to each sequence.

Variables:
	sos_id – index of the start-of-sentence symbol.
	eos_id – index of the end-of-sentence symbol.
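The preprocessing described for TargetField can be pictured with a minimal pure-Python sketch. The helper wrap_target is hypothetical, and the behaviour shown for include_eos=False (the <eos> symbol is simply not appended) is an assumption inferred from the parameter name rather than something stated here.

```python
from typing import List, Sequence

SYM_SOS = "<sos>"
SYM_EOS = "<eos>"


def wrap_target(tokens: Sequence[str], include_eos: bool = True) -> List[str]:
    """Prepend <sos> and, if include_eos is set, append <eos>."""
    tail = [SYM_EOS] if include_eos else []
    return [SYM_SOS] + list(tokens) + tail


wrapped = wrap_target(["I_JUMP", "I_JUMP"])
no_eos = wrap_target(["I_JUMP"], include_eos=False)
print(wrapped)
print(no_eos)
```

Once a vocabulary has been built over such wrapped sequences, sos_id and eos_id would be the indices that the vocabulary assigns to these two symbols.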

build_vocab(*args, **kwargs)

Construct the Vocab object for this field from one or more datasets.

Parameters:
	Positional arguments – Dataset objects or other iterable data sources from which to construct the Vocab object that represents the set of possible values for this field. If a Dataset object is provided, all columns corresponding to this field are used; individual columns can also be provided directly.
	Remaining keyword arguments – Passed to the constructor of Vocab.