
word_to_id

WordtoId

Bases: NumpyOp

Converts words to their corresponding ids using a mapper function or a dictionary.
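As a minimal sketch of the two accepted forms of `mapping`, the pure-Python snippet below mirrors the dispatch the op performs, without depending on FastEstimator or NumPy. The names `convert_to_id`, `vocab`, and `lookup_fn` are hypothetical and exist only for illustration; the real op additionally wraps the result in a NumPy array.

```python
from typing import Callable, Dict, List, Optional, Union

Mapping = Union[Dict[str, int], Callable[[List[str]], List[int]]]


def convert_to_id(mapping: Mapping, tokens: List[str]) -> List[Optional[int]]:
    """Map tokens to ids with either a callable or a dictionary (illustrative helper)."""
    if callable(mapping):
        # A callable receives the whole token list at once.
        return list(mapping(tokens))
    # A dictionary is consulted once per token.
    return [mapping.get(token) for token in tokens]


# Hypothetical vocabulary used only for this example.
vocab = {"the": 1, "cat": 2, "sat": 3}


def lookup_fn(tokens: List[str]) -> List[int]:
    # Hypothetical mapper function built on the same vocabulary.
    return [vocab[token] for token in tokens]


ids_from_dict = convert_to_id(vocab, ["the", "cat", "sat"])
ids_from_fn = convert_to_id(lookup_fn, ["sat", "the"])
print(ids_from_dict)  # [1, 2, 3]
print(ids_from_fn)    # [3, 1]
```

Either `vocab` or `lookup_fn` would satisfy the type expected by the `mapping` parameter described below.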

Parameters:

mapping
    Type: Union[Dict[str, int], Callable[[List[str]], List[int]]]
    Description: A mapper function or a dictionary that maps words to ids.
    Default: required

inputs
    Type: Union[str, Iterable[str]]
    Description: Key(s) of the sequences to be converted to ids.
    Default: required

outputs
    Type: Union[str, Iterable[str]]
    Description: Key(s) under which the converted id sequences are stored.
    Default: required

mode
    Type: Union[None, str, Iterable[str]]
    Description: What mode(s) to execute this Op in. For example, "train", "eval", "test", or "infer". To execute regardless of mode, pass None. To execute in all modes except for a particular one, you can pass an argument like "!infer" or "!train".
    Default: None

ds_id
    Type: Union[None, str, Iterable[str]]
    Description: What dataset id(s) to execute this Op in. To execute regardless of ds_id, pass None. To execute in all ds_ids except for a particular one, you can pass an argument like "!ds1".
    Default: None
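The selection rules described for `mode` and `ds_id` can be sketched as a small predicate. This is only an illustration of the documented semantics, not FastEstimator's internal logic: the helper name `op_runs_in` is hypothetical, and the handling of a specification that mixes negated and plain entries is an assumption.

```python
from typing import Iterable, Set, Union

Spec = Union[None, str, Iterable[str]]


def op_runs_in(spec: Spec, current: str) -> bool:
    """Return True if an op configured with `spec` should run in `current` (illustrative only)."""
    if spec is None:
        # None means "execute regardless of mode / ds_id".
        return True
    entries: Set[str] = {spec} if isinstance(spec, str) else set(spec)
    negated = {entry[1:] for entry in entries if entry.startswith("!")}
    allowed = {entry for entry in entries if not entry.startswith("!")}
    if negated:
        # Assumption: negated entries mean "everything except these".
        return current not in negated
    return current in allowed


print(op_runs_in(None, "train"))              # True
print(op_runs_in("!infer", "infer"))          # False
print(op_runs_in(["train", "eval"], "test"))  # False
print(op_runs_in("!ds1", "ds2"))              # True
```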
Source code in fastestimator/fastestimator/op/numpyop/univariate/word_to_id.py
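from typing import Any, Callable, Dict, Iterable, List, Union

import numpy as np

from fastestimator.op.numpyop.numpyop import NumpyOp
from fastestimator.util.traceability_util import traceable


@traceable()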
class WordtoId(NumpyOp):
    """Converts words to their corresponding ids using a mapper function or a dictionary.

    Args:
        mapping: A mapper function or a dictionary that maps words to ids.
        inputs: Key(s) of the sequences to be converted to ids.
        outputs: Key(s) under which the converted id sequences are stored.
        mode: What mode(s) to execute this Op in. For example, "train", "eval", "test", or "infer". To execute
            regardless of mode, pass None. To execute in all modes except for a particular one, you can pass an argument
            like "!infer" or "!train".
        ds_id: What dataset id(s) to execute this Op in. To execute regardless of ds_id, pass None. To execute in all
            ds_ids except for a particular one, you can pass an argument like "!ds1".
    """
    def __init__(self,
                 mapping: Union[Dict[str, int], Callable[[List[str]], List[int]]],
                 inputs: Union[str, Iterable[str]],
                 outputs: Union[str, Iterable[str]],
                 mode: Union[None, str, Iterable[str]] = None,
                 ds_id: Union[None, str, Iterable[str]] = None) -> None:
        super().__init__(inputs=inputs, outputs=outputs, mode=mode, ds_id=ds_id)
        self.in_list, self.out_list = True, True
        assert callable(mapping) or isinstance(mapping, dict), \
            "Incorrect data type provided for `mapping`. Please provide a function or a dictionary."
        self.mapping = mapping

    def forward(self, data: List[List[str]], state: Dict[str, Any]) -> List[np.ndarray]:
        return [self._convert_to_id(elem) for elem in data]

    def _convert_to_id(self, data: List[str]) -> np.ndarray:
        """Map the tokens to ids using the mapper function or the lookup dictionary.

        Args:
            data: Input list of tokens.

        Returns:
            Array of token ids.
        """
        if callable(self.mapping):
            # A mapper function receives the whole token list at once.
            data = self.mapping(data)
        else:
            # Dictionary lookup; tokens missing from the dictionary map to None.
            data = [self.mapping.get(token) for token in data]
        return np.array(data)
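Because the dictionary branch of `_convert_to_id` uses `dict.get`, a word that is absent from the dictionary is mapped to `None` rather than raising an error. One possible way to avoid `None` values, sketched here in pure Python, is to wrap the vocabulary in a mapper function with an explicit fallback id. The helper `with_unknown_id`, the `"<unk>"` entry, and the sample vocabulary are all hypothetical.

```python
from typing import Callable, Dict, List


def with_unknown_id(vocab: Dict[str, int], unk_id: int) -> Callable[[List[str]], List[int]]:
    """Wrap a vocabulary in a mapper function that never returns None (hypothetical helper)."""
    def mapper(tokens: List[str]) -> List[int]:
        # Fall back to unk_id for any token missing from the vocabulary.
        return [vocab.get(token, unk_id) for token in tokens]
    return mapper


# Hypothetical vocabulary with a reserved id for unknown words.
vocab = {"<unk>": 0, "the": 1, "cat": 2}
tokens = ["the", "dog"]

# Plain dictionary lookup, as in the dictionary branch: the unknown word becomes None.
plain = [vocab.get(token) for token in tokens]

# Callable mapping with an explicit fallback id.
safe = with_unknown_id(vocab, unk_id=vocab["<unk>"])(tokens)
print(plain)  # [1, None]
print(safe)   # [1, 0]
```

A function built this way matches the `Callable[[List[str]], List[int]]` form and could be passed as `mapping` in place of the dictionary.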