I'm currently trying to improve the performance of a CycleGAN model that has a couple of downsampling and upsampling layers combined with 6 ResNet blocks as the bottleneck.
I replaced the standard convolutions with a depthwise separable convolution implementation (shown under "What I have tried" below). So far I've swapped every possible Conv1d and Conv2d layer with these, which reduced the parameter count from 22,059,265 to 7,906,710 and gave a performance boost.
So my question is: is it good practice to replace all conv layers with depthwise separable convolutions and cut this many parameters, or is it better to replace only the bottleneck layers? Will it hurt the accuracy?
What I have tried:
I used the following implementation:

    import torch.nn as nn

    class DepthwiseSeparableConv1d(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_size, stride, padding, bias=False):
            super().__init__()
            # depthwise conv: one filter per channel (groups=in_channels), carries the stride
            self.d = nn.Conv1d(in_channels, in_channels, kernel_size=kernel_size,
                               stride=stride, padding=padding, groups=in_channels, bias=bias)
            # pointwise 1x1 conv mixes channels; stride stays 1 so the input isn't downsampled twice
            self.p = nn.Conv1d(in_channels, out_channels, kernel_size=1, bias=bias)

        def forward(self, x):
            out = self.d(x)
            out = self.p(out)
            return out

(Note: `in` is a reserved word in Python, so the arguments are renamed here, and the stride is applied only in the depthwise conv; passing it to the pointwise conv as well would downsample the input a second time.)
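To see where the reduction comes from: a standard Conv1d with C_in input channels, C_out output channels, and kernel size k has C_in·C_out·k weights (ignoring bias), while the depthwise+pointwise pair has only C_in·k + C_in·C_out. A quick back-of-the-envelope check (plain Python; the channel and kernel numbers below are illustrative, not taken from your model):

```python
def conv1d_params(c_in, c_out, k):
    # standard Conv1d weight count (bias=False)
    return c_in * c_out * k

def dsc1d_params(c_in, c_out, k):
    # depthwise (c_in filters of size k) + pointwise (1x1 channel mixing)
    return c_in * k + c_in * c_out

c_in, c_out, k = 256, 256, 3
std = conv1d_params(c_in, c_out, k)   # 196608
dsc = dsc1d_params(c_in, c_out, k)    # 66304
print(std, dsc, round(std / dsc, 2))  # roughly a 3x reduction for k=3
```

For small kernels the saving is dominated by the C_in·C_out term, which is why swapping every conv (not just the bottleneck) cuts the total so dramatically.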