Saving and loading a Keras model with a custom layer that has additional attributes

I created a custom layer, DenseWithMask, that is a subclass of Dense. It has a few additional attributes, including one that I named edge_mask. This code works fine:

new_layer = DenseWithMask(10)
print(new_layer.edge_mask)

However, if I build a model using DenseWithMask, then save it and load it again, the layers no longer have the edge_mask attribute. (All the other attributes that I added are missing as well.)

Here is an example (the code for DenseWithMask is at the end of this post):

import tensorflow as tf
from DenseWithMask import DenseWithMask

# make a model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    DenseWithMask(128, activation='relu'),
    DenseWithMask(10)])
model.compile(optimizer='adam', 
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# try accessing edge_mask
print('edge_mask:',model.layers[1].edge_mask)

# save model
model.save('model_with_custom_layers')

# load model
model2 = tf.keras.models.load_model('model_with_custom_layers', 
     custom_objects={"DenseWithMask": DenseWithMask})

# try accessing attributes of custom layer
model2.layers[1].edge_mask

This returns:

edge_mask: tf.Tensor(
[[ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 ...
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]], shape=(784, 128), dtype=bool)
INFO:tensorflow:Assets written to: model_with_custom_layers\assets

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-32e4901279b3> in <module>
     23 
     24     # try accessing attributes of custom layer
---> 25     model2.layers[1].edge_mask

AttributeError: 'DenseWithMask' object has no attribute 'edge_mask'

How can I make this work? The lines that fixed the error in this thread are already in my code.

Below is the code for DenseWithMask. I have included the whole class, but I expect that only __init__ and get_config matter for my problem.

(In get_config, I convert the attributes that are TensorFlow constant arrays to NumPy arrays, because save has trouble writing TensorFlow constant arrays to a JSON file. I expect this to cause problems later on, but my current problem does not seem to depend on it: I also tried not converting the TensorFlow constant arrays in get_config and building the model via model2=from_config(model.get_config()), and that did not produce a model with edge_mask as an attribute either.)

################################################################################
# Define a keras layer class that allows for permanent pruning
################################################################################

# imports  copied from keras.layers.core
from tensorflow.python.eager import context
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import tensor_shape
from tensorflow.python.keras import activations
from tensorflow.python.keras import backend as K
from tensorflow.python.keras import constraints
from tensorflow.python.keras import initializers
from tensorflow.python.keras import regularizers
from tensorflow.python.keras.engine.base_layer import Layer
from tensorflow.python.keras.engine.input_spec import InputSpec
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import gen_math_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn
from tensorflow.python.ops import sparse_ops
from tensorflow.python.ops import standard_ops

# other imports
import numpy as np
import tensorflow as tf
#from tensorflow.keras.layers import *

################################################################################

# The following class is a copy of Dense in keras. I marked all the lines that 
# I added or changed

class DenseWithMask(Layer):
    """Dense layer but with optional permanent masking of units or edges."""

    def __init__(self, 
                 units,
                 unit_mask=None, # NEW
                 edge_mask=None, # NEW
                 activation=None,
                 use_bias=True,
                 kernel_initializer='glorot_uniform',
                 bias_initializer='zeros',
                 kernel_regularizer=None,
                 bias_regularizer=None,
                 activity_regularizer=None,
                 kernel_constraint=None,
                 bias_constraint=None,
                 **kwargs):
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)

        super(DenseWithMask, self).__init__( # changed 'Dense' to 'DenseWithMask'
              activity_regularizer=regularizers.get(activity_regularizer)) #, **kwargs)

        self.units = int(units) if not isinstance(units, int) else units 
        # NEW: add unit_mask to class attributes
        self.unit_mask = unit_mask 
        # NEW: add edge_mask to class attributes
        self.edge_mask = edge_mask 
        # NEW: add unit_mask_indices to class attributes
        self.unit_mask_indices = None 
        # NEW: add edge_mask_indices to class attributes
        self.edge_mask_indices = None 
        self.activation = activations.get(activation)
        self.use_bias = use_bias
        self.kernel_initializer = initializers.get(kernel_initializer)
        self.bias_initializer = initializers.get(bias_initializer)
        self.kernel_regularizer = regularizers.get(kernel_regularizer)
        self.bias_regularizer = regularizers.get(bias_regularizer)
        self.kernel_constraint = constraints.get(kernel_constraint)
        self.bias_constraint = constraints.get(bias_constraint)

        self.supports_masking = True
        self.input_spec = InputSpec(min_ndim=2)

        super(DenseWithMask, self).__init__(**kwargs)


    def get_config(self):
        config = {
            'class_name': 'DenseWithMask',
            'units': self.units,
            'unit_mask': self.unit_mask.numpy(), #NEW: added unit_mask to config
            'unit_mask_indices': self.unit_mask_indices.numpy(), # NEW
            'edge_mask': self.edge_mask.numpy(), #NEW: added edge_mask to config
            'edge_mask_indices': self.edge_mask_indices.numpy(), # NEW
            'activation': activations.serialize(self.activation), 
            'use_bias': self.use_bias,
            'kernel_initializer': initializers.serialize(self.kernel_initializer), 
            'bias_initializer': initializers.serialize(self.bias_initializer),
            'kernel_regularizer': regularizers.serialize(self.kernel_regularizer), 
            'bias_regularizer': regularizers.serialize(self.bias_regularizer),
            'activity_regularizer':
                 regularizers.serialize(self.activity_regularizer),
            'kernel_constraint': constraints.serialize(self.kernel_constraint), 
            'bias_constraint': constraints.serialize(self.bias_constraint)
            }
        base_config = super(DenseWithMask, self).get_config() 

        return dict(list(base_config.items()) + list(config.items()))


    def build(self, input_shape): 
        dtype = dtypes.as_dtype(self.dtype or K.floatx())
        if not (dtype.is_floating or dtype.is_complex):
            raise TypeError('Unable to build `Dense` layer with non-floating point '
                            'dtype %s' % (dtype,))
        input_shape = tensor_shape.TensorShape(input_shape)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError('The last dimension of the inputs to `Dense` '
                             'should be defined. Found `None`.')
        last_dim = tensor_shape.dimension_value(input_shape[-1])
        self.input_spec = InputSpec(min_ndim=2, axes={-1: last_dim})

        # NEW: update masks in build
        if self.unit_mask is None:
            self.unit_mask = np.ones(self.units, dtype=bool)
        else:
            # check if previously set mask matches number of units
            if not len(self.unit_mask) == self.units:
                raise ValueError('Length of unit_mask must be equal to number of units.')

        if self.edge_mask is None:
            self.edge_mask = np.ones((last_dim, self.units), dtype=bool)
        else:
            # check if previously set mask matches input dimensions
            if not self.edge_mask.shape == (last_dim, self.units):
                raise ValueError('Dimensions of edge_mask must be equal to (last_input_dim, units).')

        # NEW: incorporate unit_mask info into edge_mask
        self.edge_mask = self.edge_mask * self.unit_mask

        # NEW: incorporate edge_mask info into unit_mask
        self.unit_mask = self.unit_mask * np.any(self.edge_mask, axis=0)

        # NEW: update mask indices
        self.edge_mask_indices = np.stack(self.edge_mask.nonzero()).T
        self.edge_mask_indices = np.array(self.edge_mask_indices, dtype='int64')
        self.unit_mask_indices = self.unit_mask.nonzero()[0]
        # need to convert indices for bias to 2d array for sparse tensor
        self.unit_mask_indices = np.stack([np.zeros(len(self.unit_mask_indices)),
                                       self.unit_mask_indices]).T
        self.unit_mask_indices = np.array(self.unit_mask_indices, dtype='int64')

        # NEW: turn all new attributes into tensorflow constants
        self.unit_mask = tf.constant(self.unit_mask)
        self.edge_mask = tf.constant(self.edge_mask)
        self.unit_mask_indices = tf.constant(self.unit_mask_indices)
        self.edge_mask_indices = tf.constant(self.edge_mask_indices)

        self.kernel = self.add_weight(
            'kernel',
            # NEW: trainable kernel weights may be fewer than before;
            # they are now stored as 1D array,
            shape=[int(np.sum(self.edge_mask))], 
            initializer=self.kernel_initializer,
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint,
            dtype=self.dtype,
            trainable=True)
        if self.use_bias:
            self.bias = self.add_weight(
                'bias',
                #NEW: trainable biases may be fewer than number of units
                shape=[int(np.sum(self.unit_mask))], 
                initializer=self.bias_initializer,
                regularizer=self.bias_regularizer,
                constraint=self.bias_constraint,
                dtype=self.dtype,
                trainable=True)
        else:
            self.bias = None
        self.built = True


    def rebuild(self, edge_mask=None, unit_mask=None): 
        # NEW: This is a new function for rebuilding a DenseWithMask layer with a 
        # new edge mask and/or unit mask

        # if none are given, get default values for masks from layer
        if edge_mask is None:
            edge_mask = self.edge_mask.numpy()
        if unit_mask is None:
            unit_mask = self.unit_mask.numpy()

        # incorporate unit_mask info into edge_mask
        edge_mask = edge_mask * unit_mask

        # incorporate edge_mask info into unit_mask
        unit_mask = unit_mask * np.any(edge_mask, axis=0)

        # NOW: get new arrays of trainable weights for layer
        # The stuff below contains slow, redundant lines but 
        # those might become handy when adding default values for
        # new edges and nodes(?)

        # create old kernel
        kernel_old = np.zeros_like(self.edge_mask, dtype=float)
        kernel_old[self.edge_mask.numpy()]=self.kernel.numpy()

        # create new kernel
        kernel_new = np.zeros_like(edge_mask, dtype=float)
        for i in range(len(kernel_new)):
            for j in range(len(kernel_new[0])):
                if edge_mask[i,j]:
                    kernel_new[i,j] = kernel_old[i,j]

        # create initializer for new kernel
        vals = list(kernel_new[edge_mask])
        initK = tf.compat.v1.keras.initializers.Constant(value=vals, 
                                                         verify_shape=False)

        # save new edge mask
        self.edge_mask = edge_mask

        # build new kernel
        self.kernel = self.add_weight(
            'kernel',
            shape=[int(np.sum(self.edge_mask))], 
            initializer=initK,
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint,
            dtype=self.dtype,
            trainable=True)

        # create old bias list
        bias_old = np.zeros_like(self.unit_mask, dtype=float)
        bias_old[self.unit_mask.numpy()] = self.bias.numpy()

        # create new bias list
        bias_new = np.zeros_like(unit_mask, dtype=float)
        for i in range(len(bias_new)):
            if unit_mask[i]:
                bias_new[i] = bias_old[i]

        # create initializer for new biases
        vals = list(bias_new[unit_mask])
        initB = tf.compat.v1.keras.initializers.Constant(value=vals, 
                                                         verify_shape=False)

        # save new unit mask
        self.unit_mask = unit_mask

        # build new biases
        self.bias = self.add_weight(
            'bias',
            shape=[int(np.sum(self.unit_mask))], 
            initializer=initB,
            regularizer=self.bias_regularizer,
            constraint=self.bias_constraint,
            dtype=self.dtype,
            trainable=True)

        # update mask indices
        self.edge_mask_indices = np.stack(self.edge_mask.nonzero()).T
        self.edge_mask_indices = np.array(self.edge_mask_indices, dtype='int64')
        self.unit_mask_indices = self.unit_mask.nonzero()[0]
        # need to convert indices for bias to 2d array for sparse tensor
        self.unit_mask_indices = np.stack([np.zeros(len(self.unit_mask_indices)),
                                           self.unit_mask_indices]).T
        self.unit_mask_indices = np.array(self.unit_mask_indices, dtype='int64')

        # turn all new attributes into tensorflow constants
        self.unit_mask = tf.constant(self.unit_mask)
        self.edge_mask = tf.constant(self.edge_mask)
        self.unit_mask_indices = tf.constant(self.unit_mask_indices)
        self.edge_mask_indices = tf.constant(self.edge_mask_indices)


    def call(self, inputs): 

        # NEW: create a kernel (2D numpy array) from 1D list of trainable kernel weights
        kernel = tf.SparseTensor(indices=self.edge_mask_indices, 
                                 values=self.kernel, 
                                 dense_shape=self.edge_mask.shape)
        kernel = tf.sparse.to_dense(kernel)

        # NEW: create a bias vector (1D numpy array) from 1D list of trainable biases
        bias = tf.SparseTensor(indices=self.unit_mask_indices, values=self.bias, 
                               dense_shape=[1,self.unit_mask.shape[0]])
        bias = tf.squeeze(tf.sparse.to_dense(bias))

        rank = inputs.shape.rank
        if rank is not None and rank > 2:
            # Broadcasting is required for the inputs.
            outputs = standard_ops.tensordot(inputs, kernel, [[rank - 1], [0]]) # NEW: self.kernel -> kernel
            # Reshape the output back to the original ndim of the input.
            if not context.executing_eagerly():
                shape = inputs.shape.as_list()
                output_shape = shape[:-1] + [self.units]
                outputs.set_shape(output_shape)
        else:
            inputs = math_ops.cast(inputs, self._compute_dtype)
            if K.is_sparse(inputs):
                outputs = sparse_ops.sparse_tensor_dense_matmul(inputs, kernel) # NEW: self.kernel -> kernel
            else:
                outputs = gen_math_ops.mat_mul(inputs, kernel) # NEW: self.kernel -> kernel

        if self.use_bias:
            outputs = nn.bias_add(outputs, bias) # NEW: self.bias -> bias

        if self.activation is not None:
            return self.activation(outputs)  

        return outputs


    def compute_output_shape(self, input_shape):
        input_shape = tensor_shape.TensorShape(input_shape)
        input_shape = input_shape.with_rank_at_least(2)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError(
                'The innermost dimension of input_shape must be defined, but saw: %s'
                % input_shape)

        return input_shape[:-1].concatenate(self.units)

Alice Schwarze 25.07.2020

comment

Hi, this is not actually a solution to your problem, so I am leaving it as a comment: people have recently found that the new tensorflow-keras binding after tf v2 does not update the weights of custom layers such as yours. You may want to switch to pytorch. Just a heads-up. - 25.07.2020

comment

Thanks @aak! Do you have a link to where people discussed this issue? My weights seem to be updating fine. - Alice Schwarze 25.07.2020

Answers (1)

I cannot reproduce your error with the code you provided. I do get a few other errors, which can be fixed as follows:

class_name / unit_mask_indices / edge_mask_indices should not be in the dictionary returned by get_config, since they are not __init__ parameters
unit_mask / edge_mask should be converted to NumPy arrays in __init__ if they are not None
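
To illustrate why both points matter, here is a minimal sketch that does not use TensorFlow at all. It assumes that the default Keras from_config simply calls cls(**config), so every key returned by get_config has to match an __init__ parameter. MaskedLayerStandIn is a hypothetical stand-in class, not part of Keras, and the plain json module is used here instead of Keras' own saving code.

```python
import json

import numpy as np


class MaskedLayerStandIn:
    """Hypothetical stand-in for a Keras layer (not a real Keras class)."""

    def __init__(self, units, unit_mask=None, edge_mask=None):
        self.units = int(units)
        # Accept lists (as read back from JSON) or arrays; keep None as None.
        self.unit_mask = None if unit_mask is None else np.array(unit_mask, dtype=bool)
        self.edge_mask = None if edge_mask is None else np.array(edge_mask, dtype=bool)

    def get_config(self):
        # Only keys that are also __init__ parameters, with JSON-friendly values.
        return {
            "units": self.units,
            "unit_mask": None if self.unit_mask is None else self.unit_mask.tolist(),
            "edge_mask": None if self.edge_mask is None else self.edge_mask.tolist(),
        }

    @classmethod
    def from_config(cls, config):
        # Assumed to mirror the default Keras behaviour of calling cls(**config).
        return cls(**config)


layer = MaskedLayerStandIn(2, unit_mask=[True, False])

# Round trip through plain JSON text and back into a new instance.
config_text = json.dumps(layer.get_config())
clone = MaskedLayerStandIn.from_config(json.loads(config_text))

# A key without a matching __init__ parameter makes the constructor fail.
bad_config = dict(layer.get_config(), unit_mask_indices=[[0, 0]])
try:
    MaskedLayerStandIn.from_config(bad_config)
    extra_key_rejected = False
except TypeError:
    extra_key_rejected = True
```

In this sketch the masks come back from JSON as nested lists, which is why __init__ converts them with np.array before any array operations are applied to them.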

Here is working code (based on the code you provided):

import tensorflow as tf

################################################################################
# Define a keras layer class that allows for permanent pruning
################################################################################

# imports  copied from keras.layers.core
from tensorflow.python.eager import context
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import tensor_shape
from tensorflow.python.keras import activations
from tensorflow.python.keras import backend as K
from tensorflow.python.keras import constraints
from tensorflow.python.keras import initializers
from tensorflow.python.keras import regularizers
from tensorflow.python.keras.engine.base_layer import Layer
from tensorflow.python.keras.engine.input_spec import InputSpec
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import gen_math_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn
from tensorflow.python.ops import sparse_ops
from tensorflow.python.ops import standard_ops

# other imports
import numpy as np
import tensorflow as tf
#from tensorflow.keras.layers import *

################################################################################

# The following class is a copy of Dense in keras. I marked all the lines that 
# I added or changed

class DenseWithMask(Layer):
    """Dense layer but with optional permanent masking of units or edges."""
    def __init__(
        self, 
        units,
        unit_mask=None, # NEW
        edge_mask=None, # NEW
        activation=None,
        use_bias=True,
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',
        kernel_regularizer=None,
        bias_regularizer=None,
        activity_regularizer=None,
        kernel_constraint=None,
        bias_constraint=None,
        **kwargs
    ):
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)

        super(DenseWithMask, self).__init__( # changed 'Dense' to 'DenseWithMask'
              activity_regularizer=regularizers.get(activity_regularizer)) #, **kwargs)

        self.units = int(units) if not isinstance(units, int) else units 
        # NEW: add unit_mask to class attributes
        self.unit_mask = np.array(unit_mask) if unit_mask is not None else None
        # NEW: add edge_mask to class attributes
        self.edge_mask = np.array(edge_mask) if edge_mask is not None else None
        # NEW: add unit_mask_indices to class attributes
        self.unit_mask_indices = None 
        # NEW: add edge_mask_indices to class attributes
        self.edge_mask_indices = None 
        self.activation = activations.get(activation)
        self.use_bias = use_bias
        self.kernel_initializer = initializers.get(kernel_initializer)
        self.bias_initializer = initializers.get(bias_initializer)
        self.kernel_regularizer = regularizers.get(kernel_regularizer)
        self.bias_regularizer = regularizers.get(bias_regularizer)
        self.kernel_constraint = constraints.get(kernel_constraint)
        self.bias_constraint = constraints.get(bias_constraint)

        self.supports_masking = True
        self.input_spec = InputSpec(min_ndim=2)

        super(DenseWithMask, self).__init__(**kwargs)


    def get_config(self):
        config = {
            'units': self.units,
            'unit_mask': self.unit_mask.numpy(), #NEW: added unit_mask to config
            'edge_mask': self.edge_mask.numpy(), #NEW: added edge_mask to config
            'activation': activations.serialize(self.activation), 
            'use_bias': self.use_bias,
            'kernel_initializer': initializers.serialize(self.kernel_initializer), 
            'bias_initializer': initializers.serialize(self.bias_initializer),
            'kernel_regularizer': regularizers.serialize(self.kernel_regularizer), 
            'bias_regularizer': regularizers.serialize(self.bias_regularizer),
            'activity_regularizer':
                 regularizers.serialize(self.activity_regularizer),
            'kernel_constraint': constraints.serialize(self.kernel_constraint), 
            'bias_constraint': constraints.serialize(self.bias_constraint)
            }
        base_config = super(DenseWithMask, self).get_config() 

        return dict(list(base_config.items()) + list(config.items()))


    def build(self, input_shape): 
        dtype = dtypes.as_dtype(self.dtype or K.floatx())
        if not (dtype.is_floating or dtype.is_complex):
            raise TypeError('Unable to build `Dense` layer with non-floating point '
                            'dtype %s' % (dtype,))
        input_shape = tensor_shape.TensorShape(input_shape)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError('The last dimension of the inputs to `Dense` '
                             'should be defined. Found `None`.')
        last_dim = tensor_shape.dimension_value(input_shape[-1])
        self.input_spec = InputSpec(min_ndim=2, axes={-1: last_dim})

        # NEW: update masks in build
        if self.unit_mask is None:
            self.unit_mask = np.ones(self.units, dtype=bool)
        else:
            # check if previously set mask matches number of units
            if not len(self.unit_mask) == self.units:
                raise ValueError('Length of unit_mask must be equal to number of units.')

        if self.edge_mask is None:
            self.edge_mask = np.ones((last_dim, self.units), dtype=bool)
        else:
            # check if previously set mask matches input dimensions
            if not self.edge_mask.shape == (last_dim, self.units):
                raise ValueError('Dimensions of edge_mask must be equal to (last_input_dim, units).')

        # NEW: incorporate unit_mask info into edge_mask
        self.edge_mask = self.edge_mask * self.unit_mask

        # NEW: incorporate edge_mask info into unit_mask
        self.unit_mask = self.unit_mask * np.any(self.edge_mask, axis=0)

        # NEW: update mask indices
        self.edge_mask_indices = np.stack(self.edge_mask.nonzero()).T
        self.edge_mask_indices = np.array(self.edge_mask_indices, dtype='int64')
        self.unit_mask_indices = self.unit_mask.nonzero()[0]
        # need to convert indices for bias to 2d array for sparse tensor
        self.unit_mask_indices = np.stack([np.zeros(len(self.unit_mask_indices)),
                                       self.unit_mask_indices]).T
        self.unit_mask_indices = np.array(self.unit_mask_indices, dtype='int64')

        # NEW: turn all new attributes into tensorflow constants
        self.unit_mask = tf.constant(self.unit_mask)
        self.edge_mask = tf.constant(self.edge_mask)
        self.unit_mask_indices = tf.constant(self.unit_mask_indices)
        self.edge_mask_indices = tf.constant(self.edge_mask_indices)

        self.kernel = self.add_weight(
            'kernel',
            # NEW: trainable kernel weights may be fewer than before;
            # they are now stored as 1D array,
            shape=[int(np.sum(self.edge_mask))], 
            initializer=self.kernel_initializer,
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint,
            dtype=self.dtype,
            trainable=True)
        if self.use_bias:
            self.bias = self.add_weight(
                'bias',
                #NEW: trainable biases may be fewer than number of units
                shape=[int(np.sum(self.unit_mask))], 
                initializer=self.bias_initializer,
                regularizer=self.bias_regularizer,
                constraint=self.bias_constraint,
                dtype=self.dtype,
                trainable=True)
        else:
            self.bias = None
        self.built = True


    def rebuild(self, edge_mask=None, unit_mask=None): 
        # NEW: This is a new function for rebuilding a DenseWithMask layer with a 
        # new edge mask and/or unit mask

        # if none are given, get default values for masks from layer
        if edge_mask is None:
            edge_mask = self.edge_mask.numpy()
        if unit_mask is None:
            unit_mask = self.unit_mask.numpy()

        # incorporate unit_mask info into edge_mask
        edge_mask = edge_mask * unit_mask

        # incorporate edge_mask info into unit_mask
        unit_mask = unit_mask * np.any(edge_mask, axis=0)

        # NOW: get new arrays of trainable weights for layer
        # The stuff below contains slow, redundant lines but 
        # those might become handy when adding default values for
        # new edges and nodes(?)

        # create old kernel
        kernel_old = np.zeros_like(self.edge_mask, dtype=float)
        kernel_old[self.edge_mask.numpy()]=self.kernel.numpy()

        # create new kernel
        kernel_new = np.zeros_like(edge_mask, dtype=float)
        for i in range(len(kernel_new)):
            for j in range(len(kernel_new[0])):
                if edge_mask[i,j]:
                    kernel_new[i,j] = kernel_old[i,j]

        # create initializer for new kernel
        vals = list(kernel_new[edge_mask])
        initK = tf.compat.v1.keras.initializers.Constant(value=vals, 
                                                         verify_shape=False)

        # save new edge mask
        self.edge_mask = edge_mask

        # build new kernel
        self.kernel = self.add_weight(
            'kernel',
            shape=[int(np.sum(self.edge_mask))], 
            initializer=initK,
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint,
            dtype=self.dtype,
            trainable=True)

        # create old bias list
        bias_old = np.zeros_like(self.unit_mask, dtype=float)
        bias_old[self.unit_mask.numpy()] = self.bias.numpy()

        # create new bias list
        bias_new = np.zeros_like(unit_mask, dtype=float)
        for i in range(len(bias_new)):
            if unit_mask[i]:
                bias_new[i] = bias_old[i]

        # create initializer for new biases
        vals = list(bias_new[unit_mask])
        initB = tf.compat.v1.keras.initializers.Constant(value=vals, 
                                                         verify_shape=False)

        # save new unit mask
        self.unit_mask = unit_mask

        # build new biases
        self.bias = self.add_weight(
            'bias',
            shape=[int(np.sum(self.unit_mask))], 
            initializer=initB,
            regularizer=self.bias_regularizer,
            constraint=self.bias_constraint,
            dtype=self.dtype,
            trainable=True)

        # update mask indices
        self.edge_mask_indices = np.stack(self.edge_mask.nonzero()).T
        self.edge_mask_indices = np.array(self.edge_mask_indices, dtype='int64')
        self.unit_mask_indices = self.unit_mask.nonzero()[0]
        # need to convert indices for bias to 2d array for sparse tensor
        self.unit_mask_indices = np.stack([np.zeros(len(self.unit_mask_indices)),
                                           self.unit_mask_indices]).T
        self.unit_mask_indices = np.array(self.unit_mask_indices, dtype='int64')

        # turn all new attributes into tensorflow constants
        self.unit_mask = tf.constant(self.unit_mask)
        self.edge_mask = tf.constant(self.edge_mask)
        self.unit_mask_indices = tf.constant(self.unit_mask_indices)
        self.edge_mask_indices = tf.constant(self.edge_mask_indices)


    def call(self, inputs): 

        # NEW: create a kernel (2D numpy array) from 1D list of trainable kernel weights
        kernel = tf.SparseTensor(indices=self.edge_mask_indices, 
                                 values=self.kernel, 
                                 dense_shape=self.edge_mask.shape)
        kernel = tf.sparse.to_dense(kernel)

        # NEW: create a bias vector (1D numpy array) from 1D list of trainable biases
        bias = tf.SparseTensor(indices=self.unit_mask_indices, values=self.bias, 
                               dense_shape=[1,self.unit_mask.shape[0]])
        bias = tf.squeeze(tf.sparse.to_dense(bias))

        rank = inputs.shape.rank
        if rank is not None and rank > 2:
            # Broadcasting is required for the inputs.
            outputs = standard_ops.tensordot(inputs, kernel, [[rank - 1], [0]]) # NEW: self.kernel -> kernel
            # Reshape the output back to the original ndim of the input.
            if not context.executing_eagerly():
                shape = inputs.shape.as_list()
                output_shape = shape[:-1] + [self.units]
                outputs.set_shape(output_shape)
        else:
            inputs = math_ops.cast(inputs, self._compute_dtype)
            if K.is_sparse(inputs):
                outputs = sparse_ops.sparse_tensor_dense_matmul(inputs, kernel) # NEW: self.kernel -> kernel
            else:
                outputs = gen_math_ops.mat_mul(inputs, kernel) # NEW: self.kernel -> kernel

        if self.use_bias:
            outputs = nn.bias_add(outputs, bias) # NEW: self.bias -> bias

        if self.activation is not None:
            return self.activation(outputs)  

        return outputs


    def compute_output_shape(self, input_shape):
        input_shape = tensor_shape.TensorShape(input_shape)
        input_shape = input_shape.with_rank_at_least(2)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError(
                'The innermost dimension of input_shape must be defined, but saw: %s'
                % input_shape)

        return input_shape[:-1].concatenate(self.units)


def main():

    # make a model
    model = tf.keras.models.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            DenseWithMask(128, activation='relu'),
            DenseWithMask(10)
        ]
    )
    model.compile(
        optimizer='adam', 
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy']
    )

    # try accessing edge_mask
    print('edge_mask:', model.layers[1].edge_mask)

    # save model
    model.save('model_with_custom_layers')

    # load model
    model2 = tf.keras.models.load_model('model_with_custom_layers',  custom_objects={"DenseWithMask": DenseWithMask})

    # try accessing attributes of custom layer
    print("edge_mask of loaded model:", model2.layers[1].edge_mask)

if __name__ == "__main__":
    main()

And the corresponding output:

edge_mask: tf.Tensor(
[[ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 ...
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]], shape=(784, 128), dtype=bool)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
edge_mask of loaded model: tf.Tensor(
[[ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 ...
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]], shape=(784, 128), dtype=bool)

As for your code, you can simplify it a bit:

As you mentioned, your custom DenseWithMask class is an extended version of TensorFlow's Dense class, so you can rely on inheritance (at least in __init__ and get_config; I did not check all of your methods).

import numpy as np  # needed for np.array in __init__
import tensorflow as tf

class DenseWithMask(tf.keras.layers.Dense):
    """
    Dense layer but with optional permanent masking of units or edges.
    """
    def __init__(
        self, 
        units,
        unit_mask=None, # NEW
        edge_mask=None, # NEW
        **kwargs
    ):
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)
        
        self.unit_mask = np.array(unit_mask) if unit_mask is not None else None
        self.edge_mask = np.array(edge_mask) if edge_mask is not None else None
        self.unit_mask_indices = None
        self.edge_mask_indices = None

        super().__init__(units=units, **kwargs)

    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "edge_mask": self.edge_mask.numpy(),
                "unit_mask": self.unit_mask.numpy()
            }
        )

        return config
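One caveat about get_config above: calling .numpy() only works once the masks are eager tensors. Below is a minimal sketch of a more defensive conversion, assuming the masks may still be None or plain NumPy arrays when get_config is called. The helper names mask_to_config and mask_from_config are hypothetical and not part of Keras. The JSON round trip only imitates what happens to a layer config on save and load.

```python
import json

import numpy as np


def mask_to_config(mask):
    """Convert a mask (None, list, NumPy array or eager tensor) to nested lists."""
    if mask is None:
        return None
    if hasattr(mask, "numpy"):  # e.g. an eager tf.Tensor
        mask = mask.numpy()
    return np.asarray(mask).tolist()


def mask_from_config(value):
    """Rebuild a boolean NumPy mask from the value stored in the config."""
    return None if value is None else np.array(value, dtype=bool)


# Imitate a save/load cycle of the layer config through JSON.
edge_mask = np.array([[True, False], [True, True]])
config = {
    "edge_mask": mask_to_config(edge_mask),
    "unit_mask": mask_to_config(None),
}
restored = json.loads(json.dumps(config))

print(mask_from_config(restored["edge_mask"]))
print(restored["unit_mask"])
```

In the layer itself, get_config could store mask_to_config(self.edge_mask) instead of self.edge_mask.numpy(), and __init__ already rebuilds the array with np.array(edge_mask).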

When you define custom callable objects (e.g. layers, metrics, optimizers), instead of passing a custom_objects mapping to load_model, you can use a utility function provided by TensorFlow to do this automatically: tf.keras.utils.register_keras_serializable

import tensorflow as tf

@tf.keras.utils.register_keras_serializable()
class DenseWithMask(tf.keras.layers.Dense):
   ....

Then, when you save and load the model, you can skip the custom_objects mapping:

# make a model
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        DenseWithMask(128, activation='relu'),
        DenseWithMask(10)
    ]
)
model.compile(
    optimizer='adam', 
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)

# try accessing edge_mask
print('edge_mask:', model.layers[1].edge_mask)

# save model
model.save('model_with_custom_layers.h5')

# load model
model2 = tf.keras.models.load_model('model_with_custom_layers.h5')

# try accessing attributes of custom layer
print("edge_mask of loaded model:", model2.layers[1].edge_mask)
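To give an intuition for why the custom_objects mapping can be dropped, here is a toy sketch of a name-to-class registry in plain Python. It is only an analogy and not the Keras implementation: register_serializable, save_layer, load_layer and ToyDenseWithMask are all hypothetical names. The one detail borrowed from Keras is that the real decorator registers a class under the default package "Custom", i.e. as "Custom>ClassName".

```python
# Toy registry mapping a registered name to a class (hypothetical, not Keras internals).
_REGISTRY = {}


def register_serializable(package="Custom"):
    """Class decorator that records the class under '<package>><ClassName>'."""
    def decorator(cls):
        _REGISTRY[f"{package}>{cls.__name__}"] = cls
        return cls
    return decorator


@register_serializable()
class ToyDenseWithMask:
    """Stand-in for a layer: only stores its constructor arguments."""

    def __init__(self, units, edge_mask=None):
        self.units = units
        self.edge_mask = edge_mask

    def get_config(self):
        return {"units": self.units, "edge_mask": self.edge_mask}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


def save_layer(layer):
    """Store the registered name next to the config."""
    name = next(k for k, v in _REGISTRY.items() if v is type(layer))
    return {"class_name": name, "config": layer.get_config()}


def load_layer(saved):
    """Look the class up by name, so the caller passes no class mapping."""
    cls = _REGISTRY[saved["class_name"]]
    return cls.from_config(saved["config"])


saved = save_layer(ToyDenseWithMask(10, edge_mask=[[True, False]]))
restored = load_layer(saved)
print(saved["class_name"])
print(restored.edge_mask)
```

The loader finds the class through the registry, which is the role custom_objects otherwise plays. The extra attributes survive only because get_config includes them and the constructor accepts them back.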

Note: for this second point, I only managed to get it working with the h5 file format, but it should also be possible with the pb format.

M. Perier--Dulhoste 26.11.2020
