convert

convert(model, operator, inplace=True, weight_layers=[], activation_layers=[], input=False, log=True, excluded_weight_layer_indexes=[], excluded_activation_layer_indexes=[], include=None, exclude=None, order='post')

Automatically convert a model to a new model with its weights and activations transformed by the operator, e.g. prune or quantize.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| `model` | `nn.Module` | input network module | *required* |
| `operator` | `Union[PruneLayer, QuantizeLayer]` | operator used to transform the weights and activations | *required* |
| `inplace` | `bool` | whether to mutate the original module. Defaults to True. | `True` |
| `weight_layers` | `Sequence[Type[nn.Module]]` | layer types whose weights the operator is applied to. Defaults to []. | `[]` |
| `activation_layers` | `Sequence[Type[nn.Module]]` | layer types whose output activations the operator is applied to. Defaults to []. | `[]` |
| `input` | `bool` | whether to apply the operator to the model input. Defaults to False. | `False` |
| `log` | `bool` | whether to print the conversion log. Defaults to True. | `True` |
| `excluded_weight_layer_indexes` | `Sequence[Tuple[Type[nn.Module], Sequence[int]]]` | indexes of layers to exclude from weight transformation. Defaults to []. | `[]` |
| `excluded_activation_layer_indexes` | `Sequence[Tuple[Type[nn.Module], Sequence[int]]]` | indexes of layers to exclude from activation transformation. Defaults to []. | `[]` |
| `include` | `Union[str, List[str]]` | used to select a subnetwork to convert. For example, when include="transition", `convert` will only visit layers whose module paths include "transition". When more than one item is provided, a layer's module path must include all of them to be converted. Defaults to None, which means the entire network is traversed. | `None` |
| `exclude` | `Union[str, List[str]]` | used to exclude a subnetwork from conversion. For example, when exclude="transition", `convert` will only visit layers whose module paths do NOT include "transition". When more than one item is provided, a layer whose module path includes any of them will not be converted. Defaults to None, which means no module is excluded. | `None` |
| `order` | `str` | whether to insert the operator after ("post") or before ("pre") the activation layers. Available choices: "post", "pre". Defaults to "post". | `'post'` |

Returns:

| Type | Description |
| ---- | ----------- |
| `nn.Module` | converted module |
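
A minimal usage sketch, assuming qsparse's `prune` operator (the `sparsity=0.5` argument is illustrative; see the `prune` documentation for the full option set):

```python
import torch.nn as nn

from qsparse import convert, prune

net = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.ReLU(),
    nn.Conv2d(16, 32, 3), nn.ReLU(),
)

# Prune the weights of both Conv2d layers and the outputs of both
# ReLU layers; with inplace=True (the default), `net` is mutated.
net = convert(
    net,
    prune(sparsity=0.5),  # illustrative operator arguments
    weight_layers=[nn.Conv2d],
    activation_layers=[nn.ReLU],
)
```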

Source code in qsparse/convert.py

```python
def convert(  # noqa: C901
    model: nn.Module,
    operator: Union[PruneLayer, QuantizeLayer],
    inplace: bool = True,
    weight_layers: Sequence[Type[nn.Module]] = [],
    activation_layers: Sequence[Type[nn.Module]] = [],
    input: bool = False,
    log: bool = True,
    excluded_weight_layer_indexes: Sequence[Tuple[Type[nn.Module], Sequence[int]]] = [],
    excluded_activation_layer_indexes: Sequence[
        Tuple[Type[nn.Module], Sequence[int]]
    ] = [],
    include: Optional[Union[str, List[str]]] = None,
    exclude: Optional[Union[str, List[str]]] = None,
    order: str = "post",
) -> nn.Module:
    """Automatically convert a model to a new model with its weights and
    activations transformed by the operator, e.g. [prune][qsparse.sparse.prune] or [quantize][qsparse.quantize.quantize].

    Args:
        model (nn.Module): input network module
        operator (Union[PruneLayer, QuantizeLayer]): operator used to transform the weights and activations.
        inplace (bool, optional): whether to mutate the original module. Defaults to True.
        weight_layers (Sequence[Type[nn.Module]], optional): layer types whose weights the operator is applied to. Defaults to [].
        activation_layers (Sequence[Type[nn.Module]], optional): layer types whose output activations the operator is applied to. Defaults to [].
        input (bool, optional): whether to apply the operator to the model input. Defaults to False.
        log (bool, optional): whether to print the conversion log. Defaults to True.
        excluded_weight_layer_indexes (Sequence[Tuple[Type[nn.Module], Sequence[int]]], optional): indexes of layers to exclude from weight transformation. Defaults to [].
        excluded_activation_layer_indexes (Sequence[Tuple[Type[nn.Module], Sequence[int]]], optional): indexes of layers to exclude from activation transformation. Defaults to [].
        include (Union[str, List[str]], optional): used to select a subnetwork to convert. For example, when include="transition", `convert` will only visit layers whose module paths include "transition". When more than one item is provided, a layer's module path must include all of them to be converted. Defaults to None, which means the entire network is traversed.
        exclude (Union[str, List[str]], optional): used to exclude a subnetwork from conversion. For example, when exclude="transition", `convert` will only visit layers whose module paths do NOT include "transition". When more than one item is provided, a layer whose module path includes any of them will not be converted. Defaults to None, which means no module is excluded.
        order (str, optional): whether to insert the operator after ("post") or before ("pre") the activation layers. Available choices: "post", "pre". Defaults to "post".

    Returns:
        nn.Module: converted module
    """
    assert isinstance(
        operator, (PruneLayer, QuantizeLayer)
    ), "`operator` does not belong to (PruneLayer, QuantizeLayer)"
    assert order in ["pre", "post"], "`order` must be either 'pre' or 'post'"

    filter = include or []
    if isinstance(filter, str):
        filter = [filter]

    exclude = exclude or []
    if isinstance(exclude, str):
        exclude = [exclude]

    def met_exclude_condition(module_path: str):
        return any([s in module_path for s in exclude])

    def met_filter_condition(module_path: str):
        return all([s in module_path for s in filter])

    if (len(weight_layers) + len(activation_layers)) == 0:
        warnings.warn(
            "No weight or activation layers specified, nothing will be converted."
        )

    def _print(msg):
        if log:
            logging.info(msg)

    def mstr(m) -> str:  # readable type name of a layer, skipping qsparse wrappers
        if isinstance(m, nn.Sequential):
            for c in m.children():
                if not isinstance(c, (QuantizeLayer, PruneLayer)):
                    return mstr(c)
        elif isinstance(m, nn.Module):
            return m.__class__.__name__
        else:
            return m.__name__

    def is_container(m: nn.Module) -> bool:
        if len(m._modules) == 0:
            return False
        else:
            for k in [
                "quantize",
                "prune",
                "quantize_bias",
            ]:  # a leaf module injected with qsparse layers
                if hasattr(m, k):
                    return False
            return True

    def apply_operator(layer: Optional[nn.Module] = None) -> nn.Module:
        # wraps `layer` with a fresh copy of the operator, or returns a standalone copy
        if layer is not None:
            if isinstance(operator, QuantizeLayer):
                return quantize(layer, **copy_nn_module_on_demand(operator._kwargs))
            else:
                return prune(layer, **copy_nn_module_on_demand(operator._kwargs))
        else:
            return copy.deepcopy(operator)

    if not inplace:
        model = copy.deepcopy(model)

    def count_occurrence(
        mod: nn.Module, layer_types: Sequence[Type[nn.Module]]
    ) -> Mapping[str, int]:
        def _count_occurrence(m: nn.Module, layer_type, scope):
            total = 0
            for name, layer in m.named_children():
                cur_scope = f"{scope}.{name}"
                if met_exclude_condition(cur_scope):
                    continue
                if not is_container(layer):
                    if mstr(layer) == mstr(layer_type) and met_filter_condition(
                        cur_scope
                    ):
                        total += 1
                else:
                    total += _count_occurrence(layer, layer_type, cur_scope)
            return total

        return {mstr(l): _count_occurrence(mod, l, "") for l in layer_types}

    def excluded_layer_indexes_to_dict(
        layer_indexes, layers_count
    ) -> Mapping[str, Sequence[int]]:
        return defaultdict(
            list,
            {
                mstr(cls): [
                    l if l >= 0 else (l + layers_count[mstr(cls)]) for l in indexes
                ]
                for cls, indexes in layer_indexes
            },
        )

    weight_counter = {mstr(c): 0 for c in weight_layers}
    weight_total_layers_counts = count_occurrence(model, weight_layers)
    excluded_weight_layer_indexes = excluded_layer_indexes_to_dict(
        excluded_weight_layer_indexes, weight_total_layers_counts
    )

    activation_counter = {mstr(c): 0 for c in activation_layers}
    activation_total_layers_counts = count_occurrence(model, activation_layers)
    excluded_activation_layer_indexes = excluded_layer_indexes_to_dict(
        excluded_activation_layer_indexes, activation_total_layers_counts
    )

    operator_name = str(operator).lower()
    for c in ["(", ")", "layer"]:
        operator_name = operator_name.replace(c, "")
    operator_name = f"`{operator_name}`"

    def _convert_weight(mod: nn.Module, scope: str = "") -> nn.Module:
        reassign = {}
        for name, m in mod.named_children():
            modified = False
            cur_scope = f"{scope}.{name}"
            if met_exclude_condition(cur_scope):
                continue
            if not is_container(m):
                if mstr(m) in weight_counter:
                    if (
                        weight_counter[mstr(m)]
                        not in excluded_weight_layer_indexes[mstr(m)]
                    ) and met_filter_condition(cur_scope):
                        _print(f"Apply {operator_name} on the {cur_scope} weight")
                        m = apply_operator(m)
                        modified = True
                    else:
                        _print(f"Exclude {cur_scope} weight")
                    weight_counter[mstr(m)] += 1
                if modified:
                    reassign[name] = m
            else:
                _convert_weight(m, cur_scope)

        for key, value in reassign.items():
            mod._modules[key] = value
        return mod

    def _convert_activation(mod: nn.Module, scope: str = "") -> nn.Module:
        reassign = {}
        for name, m in mod.named_children():
            origin_m = m
            cur_scope = f"{scope}.{name}"
            if met_exclude_condition(cur_scope):
                continue
            if (not is_container(m)) or hasattr(m, "_qsparse_conversion"):
                layer_type = mstr(m)
                if layer_type in activation_counter:
                    if (
                        activation_counter[layer_type]
                        not in excluded_activation_layer_indexes[layer_type]
                    ) and met_filter_condition(cur_scope):
                        _print(f"Apply {operator_name} on the {cur_scope} activation")
                        if order == "post":
                            m = nn.Sequential(m, apply_operator())
                        else:
                            m = nn.Sequential(apply_operator(), m)
                        setattr(m, "_qsparse_conversion", True)
                    else:
                        _print(f"Exclude {cur_scope} activation")
                    activation_counter[layer_type] += 1
                if origin_m is not m:
                    reassign[name] = m
            else:
                _convert_activation(m, cur_scope)

        for key, value in reassign.items():
            mod._modules[key] = value
        return mod

    def apply_to_input(mod):
        return nn.Sequential(apply_operator(), mod)

    if model is nn_module(model):  # model is not wrapped, e.g. by nn.DataParallel
        model = _convert_weight(model)
        model = _convert_activation(model)
        if input:
            model = apply_to_input(model)
    else:
        model.module = _convert_weight(model.module)
        model.module = _convert_activation(model.module)
        if input:
            model.module = apply_to_input(model.module)
    auto_name_prune_quantize_layers(nn_module(model))
    return model
```
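
The filtering options compose. A hedged sketch, assuming a torchvision DenseNet (chosen because its module paths contain "transition") and an illustrative `quantize(bits=8)` operator:

```python
import torch.nn as nn
from torchvision.models import densenet121

from qsparse import convert, quantize

net = densenet121()

# Quantize Conv2d weights everywhere except the first and last Conv2d
# (negative indexes count from the end of the matched layers) and any
# layer whose module path contains "transition".
net = convert(
    net,
    quantize(bits=8),  # illustrative operator arguments
    weight_layers=[nn.Conv2d],
    excluded_weight_layer_indexes=[(nn.Conv2d, [0, -1])],
    exclude="transition",
)
```

When `activation_layers` is given, `order="pre"` inserts the operator before each matched activation layer instead of after it.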