

news 2025/9/30 23:47:23

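This post is a learning log from the 365天深度学习训练营 (365-Day Deep Learning Training Camp).
Original author: K同学啊 | tutoring and custom projects available

0. Summary

Data import and processing: this exercise does not use a dataset bundled with torchvision, so the raw data has to be prepared by hand: load the images, check how the classes are distributed, and define transforms for type conversion and related preprocessing.

Splitting the dataset: after dividing the data into a training set and a test set, load each one with DataLoader() from torch.utils.data and check the batch dimensions.

Model construction: DenseNet-121 + SE module.

Hyperparameters: before training, define the loss function, the learning rate (optionally a dynamic schedule), and an optimizer built on that learning rate, for example SGD (stochastic gradient descent), which updates the parameters during training to minimize the loss.

Training function: it takes four arguments: the prepared DataLoader(), the model, the loss function, and the optimizer. Inside, loss and accuracy start at 0. The loop then pulls one batch at a time from the DataLoader(), feeds it to the model to get predictions, and computes the loss with the loss function. Backpropagation and the optimizer step follow. Gradients can be zeroed either before backpropagation or after the optimizer step; by convention it is done before backpropagation.

Test function: compared with the training function it drops the optimizer, taking only the DataLoader(), the model, and the loss function. Per batch it needs no gradient zeroing, backpropagation, or optimizer step; everything else matches the training function.

Training process: set the number of epochs (each epoch is one pass over the whole dataset) and initialize four empty lists for the training and test accuracy and loss. Call model.train() and then the training function to get accuracy and loss; call model.eval() and then the test function to get accuracy and loss. Append the results to the lists and print them, which gives the accuracy and loss after every epoch.

Visualizing the results.

Saving, loading, and using the model: in PyTorch, torch.save(model.state_dict(), 'model.pth') is the usual way to save the parameters, and model.load_state_dict(torch.load('model.pth')) loads them.

Points to improve: keep the model and the data on the same device (both on GPU or both on CPU). Do not leave num_classes at its default of 1000; set it from the actual dataset, and pass it when instantiating the model. When testing the model, the input shape is (3, 224, 224), where 3 is the number of channels. This differs from the TensorFlow order (224, 224, 3), which matters when porting code.

    import torch
    import torch.nn as nn
    import torchvision
    from torchvision import datasets, transforms
    from torch.utils.data import DataLoader
    import torchvision.models as models
    import torch.nn.functional as F
    from collections import OrderedDict
    import os, PIL, pathlib
    import matplotlib.pyplot as plt
    import warnings

    warnings.filterwarnings('ignore')               # suppress warnings
    plt.rcParams['font.sans-serif'] = ['SimHei']    # render Chinese labels correctly
    plt.rcParams['axes.unicode_minus'] = False      # render the minus sign correctly
    plt.rcParams['figure.dpi'] = 100                # figure resolution

1. Setting up the GPU

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    device

    device(type='cuda')

2. Importing and processing the data

    # Inspect how the data is organized
    path_dir = './data/mpox_recognize/'
    path_dir = pathlib.Path(path_dir)

    paths = list(path_dir.glob('*'))
    # classNames = [str(path).split('\\')[-1] for path in paths]  # Windows-only alternative
    classNames = [path.parts[-1] for path in paths]
    classNames

    ['Monkeypox', 'Others']

    # Define the transforms and build the dataset
    train_transforms = transforms.Compose([
        transforms.Resize([224, 224]),          # resize every image to the same size
        transforms.RandomHorizontalFlip(),      # random horizontal flip
        transforms.ToTensor(),                  # PIL Image or numpy.ndarray -> tensor scaled to [0, 1]
        transforms.Normalize(                   # per-channel standardization, which helps convergence
            mean=[0.485, 0.456, 0.406],         # these mean/std values are the commonly used ImageNet statistics
            std=[0.229, 0.224, 0.225])
    ])

    test_transforms = transforms.Compose([
        transforms.Resize([224, 224]),
        transforms.ToTensor(),
        transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225])
    ])

    # Note: the whole dataset is built with train_transforms, so the random flip
    # also applies to the test split; test_transforms is defined but not used here.
    total_data = datasets.ImageFolder('./data/mpox_recognize/', transform=train_transforms)
    total_data

    Dataset ImageFolder
        Number of datapoints: 2142
        Root location: ./data/mpox_recognize/
        StandardTransform
    Transform: Compose(
                   Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=True)
                   RandomHorizontalFlip(p=0.5)
                   ToTensor()
                   Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
               )

    total_data.class_to_idx

    {'Monkeypox': 0, 'Others': 1}

3. 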
Splitting the dataset

    # Split into training (80%) and test (20%) sets
    train_size = int(len(total_data) * 0.8)
    test_size = len(total_data) - train_size
    train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
    train_dataset, test_dataset

    (<torch.utils.data.dataset.Subset at 0x18230109120>,
     <torch.utils.data.dataset.Subset at 0x182300d2cb0>)

    # Define the DataLoaders used to load the two splits
    batch_size = 32

    train_dl = torch.utils.data.DataLoader(
        train_dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=1
    )
    test_dl = torch.utils.data.DataLoader(
        test_dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=1
    )

    # Inspect the shape of one batch
    for X, y in test_dl:
        print('Shape of X [N,C,H,W]: ', X.shape)
        print('Shape of y: ', y.shape, y.dtype)
        break

    Shape of X [N,C,H,W]:  torch.Size([32, 3, 224, 224])
    Shape of y:  torch.Size([32]) torch.int64

4. Building the model

SE module: code walkthrough

Squeeze: nn.AdaptiveAvgPool2d(1) performs global average pooling. It pools the spatial dimensions (H x W) of the input tensor down to 1x1, keeping the mean of each channel.

Excitation: the pooled output passes through two fully connected layers, fc1 and fc2. fc1 outputs filter_sq features and is followed by ReLU. fc2 maps back to one value per channel, and a Sigmoid squashes each value into [0, 1] so that it can act as the weight of that channel.

Scale: the input feature map is multiplied channel by channel by these weights, and the reweighted feature map is returned.

Example run: the code creates a SqueezeExcitationLayer instance and tests it with an input tensor of shape (1, 32, 32, 32). The output shape equals the input shape, because the SE module only reweights channels and leaves the spatial dimensions alone.

    # First attempt, kept for reference. fc1 is sized for 1 input feature instead of
    # the channel count, so the forward pass fails with a shape mismatch.
    # import torch
    # import torch.nn as nn
    # import torch.nn.functional as F
    #
    # class SqueezeExcitationLayer(nn.Module):
    #     def __init__(self, filter_sq):
    #         # filter_sq is the number of output features of the first FC layer in Excitation
    #         super(SqueezeExcitationLayer, self).__init__()
    #         self.filter_sq = filter_sq
    #         self.global_avg_pool = nn.AdaptiveAvgPool2d(1)  # equivalent to global average pooling
    #         self.fc1 = nn.Linear(1, filter_sq)              # in_features is 1 (wrong), out_features is filter_sq
    #         self.relu = nn.ReLU()
    #         self.fc2 = nn.Linear(filter_sq, 1)              # a single output value (wrong)
    #         self.sigmoid = nn.Sigmoid()
    #
    #     def forward(self, x):
    #         # Squeeze
    #         squeeze = self.global_avg_pool(x)                # Shape: (batch_size, channels, 1, 1)
    #         squeeze = squeeze.view(squeeze.size(0), -1)      # flatten to (batch_size, channels)
    #
    #         # Excitation
    #         excitation = self.fc1(squeeze)                   # Shape: (batch_size, filter_sq)
    #         excitation = self.relu(excitation)
    #         excitation = self.fc2(excitation)                # Shape: (batch_size, 1)
    #         excitation = self.sigmoid(excitation)            # Shape: (batch_size, 1)
    #
    #         # Reshape back to match input dimensions for element-wise multiplication
    #         excitation = excitation.view(excitation.size(0), excitation.size(1), 1, 1)
    #
    #         # Scale input with excitation weights
    #         scale = x * excitation                           # Element-wise multiplication
    #         return scale
    #
    # # Example: create an SE layer and pass a dummy input through it
    # SE = SqueezeExcitationLayer(16)
    # inputs = torch.zeros((1, 32, 32, 32))   # (batch_size, channels, height, width)
    # output = SE(inputs)                     # forward pass
    # print(output.shape)                     # output shape

    class SqueezeExcitationLayer(nn.Module):
        def __init__(self, num_input_features, filter_sq):
            super(SqueezeExcitationLayer, self).__init__()
            self.filter_sq = filter_sq
            self.global_avg_pool = nn.AdaptiveAvgPool2d(1)           # equivalent to global average pooling
            self.fc1 = nn.Linear(num_input_features, filter_sq)      # channels -> filter_sq
            self.relu = nn.ReLU()
            self.fc2 = nn.Linear(filter_sq, num_input_features)      # filter_sq -> channels
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            # Squeeze
            squeeze = self.global_avg_pool(x)                # Shape: (batch_size, channels, 1, 1)
            squeeze = squeeze.view(squeeze.size(0), -1)      # flatten to (batch_size, channels)

            # Excitation
            excitation = self.fc1(squeeze)                   # Shape: (batch_size, filter_sq)
            excitation = self.relu(excitation)
            excitation = self.fc2(excitation)                # Shape: (batch_size, num_input_features)
            excitation = self.sigmoid(excitation)            # Shape: (batch_size, num_input_features)

            # Reshape back to match input dimensions for element-wise multiplication
            excitation = excitation.view(excitation.size(0), excitation.size(1), 1, 1)  # (batch_size, channels, 1, 1)

            # Scale input with excitation weights
            scale = x * excitation                           # Element-wise multiplication
            return scale

    # When calling the SE module, make sure the arguments match the input
    inputs = torch.zeros((1, 32, 32, 32))    # example input; mind where the channel axis is
    # Only needed when the tensor is laid out as (batch_size, height, width, channels);
    # the shape is unchanged here because every dimension is 32.
    inputs = inputs.permute(0, 3, 1, 2)

    se = SqueezeExcitationLayer(32, 16)      # 32 input channels, filter_sq = 16
    output = se(inputs)
    print(output.shape)

    torch.Size([1, 32, 32, 32])

RuntimeError: mat1 and 
mat2 shapes cannot be multiplied (1x32 and 1x16)

This error occurs because, in the fc1 layer of the SE module, the shape of the input does not match the shape of the weight matrix. In a fully connected layer this usually means the input size is not what the layer expects.

Root cause: in SqueezeExcitationLayer, global average pooling produces a tensor of shape (batch_size, channels, 1, 1), which is then flattened to (batch_size, channels) and passed to fc1. The number of input features of fc1 must therefore equal the number of input channels. In the first attempt it was 1, so the 1x32 input could not be multiplied with the 1x16 weight.

Fix: size fc1 with the correct number of input features, num_input_features, which is the channel count of the input tensor.

Key changes:

Input layout: when calling SqueezeExcitationLayer, the input tensor must be (batch_size, channels, height, width), because PyTorch image tensors normally use that layout and not (batch_size, height, width, channels).

fc1 input: the number of input features of fc1 matches the channel count (num_input_features) passed in when SqueezeExcitationLayer is constructed.

With these changes the forward pass runs and the output has the expected shape.

Improved DenseNet

To add the SE (Squeeze-and-Excitation) module to the existing DenseNet code, each _DenseLayer is modified so that its output passes through an SE module. The SE module learns how important each channel is and rescales the channels accordingly.

Steps:

Add an SE module to _DenseLayer, so that the new features of every dense layer are reweighted by it.

Pass the SE parameter from the DenseNet constructor down to every _DenseLayer it instantiates.

    # Original DenseNet without SE, kept for reference:
    # class _DenseLayer(nn.Sequential):
    #     """Basic unit of DenseBlock (using bottleneck layer)"""
    #     def __init__(self, num_input_features, growth_rate, bn_size, drop_rate):
    #         super(_DenseLayer, self).__init__()
    #         self.add_module("norm1", nn.BatchNorm2d(num_input_features))
    #         self.add_module("relu1", nn.ReLU(inplace=True))
    #         self.add_module("conv1", nn.Conv2d(num_input_features, bn_size*growth_rate,
    #                                            kernel_size=1, stride=1, bias=False))
    #         self.add_module("norm2", nn.BatchNorm2d(bn_size*growth_rate))
    #         self.add_module("relu2", nn.ReLU(inplace=True))
    #         self.add_module("conv2", nn.Conv2d(bn_size*growth_rate, growth_rate,
    #                                            kernel_size=3, stride=1, padding=1, bias=False))
    #         self.drop_rate = drop_rate
    #
    #     def forward(self, x):
    #         new_features = super(_DenseLayer, self).forward(x)
    #         if self.drop_rate > 0:
    #             new_features = F.dropout(new_features, p=self.drop_rate, training=self.training)
    #         return torch.cat([x, new_features], 1)
    #
    # class _DenseBlock(nn.Sequential):
    #     """DenseBlock"""
    #     def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate):
    #         super(_DenseBlock, self).__init__()
    #         for i in range(num_layers):
    #             layer = _DenseLayer(num_input_features + i*growth_rate, growth_rate, bn_size,
    #                                 drop_rate)
    #             self.add_module("denselayer%d" % (i+1,), layer)
    #
    # class _Transition(nn.Sequential):
    #     """Transition layer between two adjacent DenseBlock"""
    #     def __init__(self, num_input_feature, num_output_features):
    #         super(_Transition, self).__init__()
    #         self.add_module("norm", nn.BatchNorm2d(num_input_feature))
    #         self.add_module("relu", nn.ReLU(inplace=True))
    #         self.add_module("conv", nn.Conv2d(num_input_feature, num_output_features,
    #                                           kernel_size=1, stride=1, bias=False))
    #         self.add_module("pool", nn.AvgPool2d(2, stride=2))
    #
    # class DenseNet(nn.Module):
    #     "DenseNet-BC model"
    #     def __init__(self, growth_rate=32, block_config=(6, 12, 24, 16), num_init_features=64,
    #                  bn_size=4, compression_rate=0.5, drop_rate=0, num_classes=1000):
    #         """
    #         :param growth_rate: (int) number of filters used in DenseLayer, `k` in the paper
    #         :param block_config: (list of 4 ints) number of layers in each DenseBlock
    #         :param num_init_features: (int) number of filters in the first Conv2d
    #         :param bn_size: (int) the factor using in the bottleneck layer
    #         :param compression_rate: (float) the compression rate used in Transition Layer
    #         :param drop_rate: (float) the drop rate after each DenseLayer
    #         :param num_classes: (int) number of classes for classification
    #         """
    #         super(DenseNet, self).__init__()
    #         # first Conv2d
    #         self.features = nn.Sequential(OrderedDict([
    #             ("conv0", nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False)),
    #             ("norm0", nn.BatchNorm2d(num_init_features)),
    #             ("relu0", nn.ReLU(inplace=True)),
    #             ("pool0", nn.MaxPool2d(3, stride=2, padding=1))
    #         ]))
    #
    #         # DenseBlock
    #         num_features = num_init_features
    #         for i, num_layers in enumerate(block_config):
    #             block = _DenseBlock(num_layers, num_features, bn_size, growth_rate, drop_rate)
    #             self.features.add_module("denseblock%d" % (i + 1), block)
    #             num_features += num_layers*growth_rate
    #             if i != len(block_config) - 1:
    #                 transition = _Transition(num_features, int(num_features*compression_rate))
    #                 self.features.add_module("transition%d" % (i + 1), transition)
    #                 num_features = int(num_features * compression_rate)
    #
    #         # final bn+ReLU
    #         self.features.add_module("norm5", nn.BatchNorm2d(num_features))
    #         self.features.add_module("relu5", nn.ReLU(inplace=True))
    #
    #         # classification layer
    #         self.classifier = nn.Linear(num_features, num_classes)
    #
    #         # params initialization
    #         for m in self.modules():
    #             if isinstance(m, nn.Conv2d):
    #                 nn.init.kaiming_normal_(m.weight)
    #             elif isinstance(m, nn.BatchNorm2d):
    #                 nn.init.constant_(m.bias, 0)
    #                 nn.init.constant_(m.weight, 1)
    #             elif isinstance(m, nn.Linear):
    #                 nn.init.constant_(m.bias, 0)
    #
    #     def forward(self, x):
    #         features = self.features(x)
    #         out = F.avg_pool2d(features, 7, stride=1).view(features.size(0), -1)
    #         out = self.classifier(out)
    #         return out

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from collections import OrderedDict

    class SqueezeExcitationLayer(nn.Module):
        def __init__(self, num_input_features, filter_sq):
            super(SqueezeExcitationLayer, self).__init__()
            self.filter_sq = filter_sq
            self.global_avg_pool = nn.AdaptiveAvgPool2d(1)           # equivalent to global average pooling
            self.fc1 = nn.Linear(num_input_features, filter_sq)      # channels -> filter_sq
            self.relu = nn.ReLU()
            self.fc2 = nn.Linear(filter_sq, num_input_features)      # filter_sq -> channels
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            # Squeeze
            squeeze = self.global_avg_pool(x)                # Shape: (batch_size, channels, 1, 1)
            squeeze = squeeze.view(squeeze.size(0), -1)      # flatten to (batch_size, channels)

            # Excitation
            excitation = self.fc1(squeeze)                   # Shape: (batch_size, filter_sq)
            excitation = self.relu(excitation)
            excitation = self.fc2(excitation)                # Shape: (batch_size, num_input_features)
            excitation = self.sigmoid(excitation)            # Shape: (batch_size, num_input_features)

            # Reshape back to match input dimensions for element-wise multiplication
            excitation = excitation.view(excitation.size(0), excitation.size(1), 1, 1)  # (batch_size, channels, 1, 1)

            # Scale input with excitation weights
            scale = x * excitation                           # Element-wise multiplication
            return scale

    class _DenseLayer(nn.Sequential):
        """Basic unit of DenseBlock (using bottleneck layer)"""
        def __init__(self, num_input_features, growth_rate, bn_size, drop_rate, se_filter_sq=16):
            super(_DenseLayer, self).__init__()
            self.add_module("norm1", nn.BatchNorm2d(num_input_features))
            self.add_module("relu1", nn.ReLU(inplace=True))
            self.add_module("conv1", nn.Conv2d(num_input_features, bn_size*growth_rate,
                                               kernel_size=1, stride=1, bias=False))
            self.add_module("norm2", nn.BatchNorm2d(bn_size*growth_rate))
            self.add_module("relu2", nn.ReLU(inplace=True))
            self.add_module("conv2", nn.Conv2d(bn_size*growth_rate, growth_rate,
                                               kernel_size=3, stride=1, padding=1, bias=False))
            # Add the SE module. Assigning it as an attribute registers it as the last
            # submodule of this nn.Sequential, after conv2.
            self.se = SqueezeExcitationLayer(growth_rate, se_filter_sq)
            self.drop_rate = drop_rate

        def forward(self, x):
            # nn.Sequential.forward runs every registered submodule in order, so the SE
            # module is already applied here as the final step. Calling self.se again
            # afterwards would reweight the features a second time, so that call is omitted.
            new_features = super(_DenseLayer, self).forward(x)
            if self.drop_rate > 0:
                new_features = F.dropout(new_features, p=self.drop_rate, training=self.training)
            return torch.cat([x, new_features], 1)

    class _DenseBlock(nn.Sequential):
        """DenseBlock"""
        def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate, se_filter_sq=16):
            super(_DenseBlock, self).__init__()
            for i in range(num_layers):
                layer = _DenseLayer(num_input_features + i*growth_rate, growth_rate, bn_size,
                                    drop_rate, se_filter_sq)
                self.add_module("denselayer%d" % (i+1,), layer)

    class _Transition(nn.Sequential):
        """Transition layer between two adjacent DenseBlock"""
        def __init__(self, num_input_feature, num_output_features):
            super(_Transition, self).__init__()
            self.add_module("norm", nn.BatchNorm2d(num_input_feature))
            self.add_module("relu", nn.ReLU(inplace=True))
            self.add_module("conv", nn.Conv2d(num_input_feature, num_output_features,
                                              kernel_size=1, stride=1, bias=False))
            self.add_module("pool", nn.AvgPool2d(2, stride=2))

    class DenseNet(nn.Module):
        "DenseNet-BC model"
        def __init__(self, growth_rate=32, block_config=(6, 12, 24, 16), num_init_features=64,
                     bn_size=4, compression_rate=0.5, drop_rate=0, num_classes=1000, se_filter_sq=16):
            """
            :param growth_rate: (int) number of filters used in DenseLayer, `k` in the paper
            :param block_config: (list of 4 ints) number of layers in each DenseBlock
            :param num_init_features: (int) number of filters in the first Conv2d
            :param bn_size: (int) the factor using in the bottleneck layer
            :param compression_rate: (float) the compression rate used in Transition Layer
            :param drop_rate: (float) the drop rate after each DenseLayer
            :param num_classes: (int) number of classes for classification
            :param se_filter_sq: (int) the number of filters used in SE module's fully connected layer
            """
            super(DenseNet, self).__init__()
            # first Conv2d
            self.features = nn.Sequential(OrderedDict([
                ("conv0", nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False)),
                ("norm0", nn.BatchNorm2d(num_init_features)),
                ("relu0", nn.ReLU(inplace=True)),
                ("pool0", nn.MaxPool2d(3, stride=2, padding=1))
            ]))

            # DenseBlock
            num_features = num_init_features
            for i, num_layers in enumerate(block_config):
                block = _DenseBlock(num_layers, num_features, bn_size, growth_rate, drop_rate, se_filter_sq)
                self.features.add_module("denseblock%d" % (i + 1), block)
                num_features += num_layers * growth_rate
                if i != len(block_config) - 1:
                    transition = _Transition(num_features, int(num_features * compression_rate))
                    self.features.add_module("transition%d" % (i + 1), transition)
                    num_features = int(num_features * compression_rate)

            # final bn+ReLU
            self.features.add_module("norm5", nn.BatchNorm2d(num_features))
            self.features.add_module("relu5", nn.ReLU(inplace=True))

            # classification layer
            self.classifier = nn.Linear(num_features, num_classes)

            # params initialization
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    nn.init.kaiming_normal_(m.weight)
                elif isinstance(m, nn.BatchNorm2d):
                    nn.init.constant_(m.bias, 0)
                    nn.init.constant_(m.weight, 1)
                elif isinstance(m, nn.Linear):
                    nn.init.constant_(m.bias, 0)

        def forward(self, x):
            features = self.features(x)
            out = F.avg_pool2d(features, 7, stride=1).view(features.size(0), -1)
            out = self.classifier(out)
            return out

Code walkthrough

SqueezeExcitationLayer: implements the SE module. It produces one weight per input channel from global average pooling followed by two fully connected layers.

_DenseLayer: an SE module is added at the end of every dense layer. The new features produced by conv2 are multiplied channel by channel by the SE weights before they are concatenated with the layer input.

_DenseBlock: every _DenseLayer inside a DenseBlock includes the SE module.

DenseNet: the se_filter_sq parameter of the DenseNet class controls the size of the fully connected layers inside the SE modules.

This lets the DenseNet use SE modules to strengthen its representations.

    # Earlier instantiation, without the SE parameter:
    # densenet121 = DenseNet(num_init_features=64,  # init_channel=64,
    #                        growth_rate=32,
    #                        block_config=(6, 12, 24, 16),
    #                        num_classes=len(classNames))
    # model = densenet121.to(device)
    # model

    # Now, instantiate and use the model
    se_filter_sq = 16  # bottleneck width of the SE module; adjust as needed

    densenet121 = DenseNet(num_init_features=64,  # init_channel=64,
                           growth_rate=32,
                           block_config=(6, 12, 24, 16),
                           num_classes=len(classNames),   # set from the classification task
                           se_filter_sq=se_filter_sq      # passed down to the SE modules
                           )

    model = densenet121.to(device)  # move the model to the selected device
    model

DenseNet((features): Sequential((conv0): Conv2d(3, 64, kernel_size(7, 7), stride(2, 2), padding(3, 3), biasFalse)(norm0): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu0): ReLU(inplaceTrue)(pool0): MaxPool2d(kernel_size3, stride2, padding1, dilation1, ceil_modeFalse)(denseblock1): _DenseBlock((denselayer1): _DenseLayer((norm1): BatchNorm2d(64, eps1e-05, momentum0.1, 
affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(64, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer2): _DenseLayer((norm1): BatchNorm2d(96, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(96, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer3): _DenseLayer((norm1): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(128, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer4): _DenseLayer((norm1): BatchNorm2d(160, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(160, 128, kernel_size(1, 1), stride(1, 1), 
biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer5): _DenseLayer((norm1): BatchNorm2d(192, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(192, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer6): _DenseLayer((norm1): BatchNorm2d(224, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(224, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid())))(transition1): _Transition((norm): BatchNorm2d(256, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu): ReLU(inplaceTrue)(conv): Conv2d(256, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(pool): AvgPool2d(kernel_size2, stride2, padding0))(denseblock2): _DenseBlock((denselayer1): _DenseLayer((norm1): 
BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(128, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer2): _DenseLayer((norm1): BatchNorm2d(160, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(160, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer3): _DenseLayer((norm1): BatchNorm2d(192, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(192, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer4): _DenseLayer((norm1): BatchNorm2d(224, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(224, 128, 
kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer5): _DenseLayer((norm1): BatchNorm2d(256, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(256, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer6): _DenseLayer((norm1): BatchNorm2d(288, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(288, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer7): _DenseLayer((norm1): BatchNorm2d(320, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(320, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, 
track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer8): _DenseLayer((norm1): BatchNorm2d(352, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(352, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer9): _DenseLayer((norm1): BatchNorm2d(384, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(384, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer10): _DenseLayer((norm1): BatchNorm2d(416, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(416, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), 
biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer11): _DenseLayer((norm1): BatchNorm2d(448, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(448, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer12): _DenseLayer((norm1): BatchNorm2d(480, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(480, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid())))(transition2): _Transition((norm): BatchNorm2d(512, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu): ReLU(inplaceTrue)(conv): Conv2d(512, 256, kernel_size(1, 1), stride(1, 1), biasFalse)(pool): AvgPool2d(kernel_size2, stride2, padding0))(denseblock3): _DenseBlock((denselayer1): _DenseLayer((norm1): BatchNorm2d(256, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(256, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, 
eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer2): _DenseLayer((norm1): BatchNorm2d(288, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(288, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer3): _DenseLayer((norm1): BatchNorm2d(320, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(320, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer4): _DenseLayer((norm1): BatchNorm2d(352, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(352, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 
3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer5): _DenseLayer((norm1): BatchNorm2d(384, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(384, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer6): _DenseLayer((norm1): BatchNorm2d(416, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(416, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer7): _DenseLayer((norm1): BatchNorm2d(448, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(448, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): 
AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer8): _DenseLayer((norm1): BatchNorm2d(480, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(480, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer9): _DenseLayer((norm1): BatchNorm2d(512, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(512, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer10): _DenseLayer((norm1): BatchNorm2d(544, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(544, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): 
Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer11): _DenseLayer((norm1): BatchNorm2d(576, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(576, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer12): _DenseLayer((norm1): BatchNorm2d(608, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(608, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer13): _DenseLayer((norm1): BatchNorm2d(640, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(640, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer14): _DenseLayer((norm1): BatchNorm2d(672, 
eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(672, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer15): _DenseLayer((norm1): BatchNorm2d(704, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(704, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer16): _DenseLayer((norm1): BatchNorm2d(736, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(736, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer17): _DenseLayer((norm1): BatchNorm2d(768, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(768, 128, 
kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer18): _DenseLayer((norm1): BatchNorm2d(800, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(800, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer19): _DenseLayer((norm1): BatchNorm2d(832, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(832, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer20): _DenseLayer((norm1): BatchNorm2d(864, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(864, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, 
track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer21): _DenseLayer((norm1): BatchNorm2d(896, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(896, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer22): _DenseLayer((norm1): BatchNorm2d(928, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(928, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer23): _DenseLayer((norm1): BatchNorm2d(960, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(960, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), 
biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer24): _DenseLayer((norm1): BatchNorm2d(992, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(992, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid())))(transition3): _Transition((norm): BatchNorm2d(1024, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu): ReLU(inplaceTrue)(conv): Conv2d(1024, 512, kernel_size(1, 1), stride(1, 1), biasFalse)(pool): AvgPool2d(kernel_size2, stride2, padding0))(denseblock4): _DenseBlock((denselayer1): _DenseLayer((norm1): BatchNorm2d(512, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(512, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer2): _DenseLayer((norm1): BatchNorm2d(544, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(544, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, 
eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer3): _DenseLayer((norm1): BatchNorm2d(576, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(576, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer4): _DenseLayer((norm1): BatchNorm2d(608, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(608, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer5): _DenseLayer((norm1): BatchNorm2d(640, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(640, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 
3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer6): _DenseLayer((norm1): BatchNorm2d(672, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(672, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer7): _DenseLayer((norm1): BatchNorm2d(704, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(704, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer8): _DenseLayer((norm1): BatchNorm2d(736, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(736, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): 
AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer9): _DenseLayer((norm1): BatchNorm2d(768, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(768, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer10): _DenseLayer((norm1): BatchNorm2d(800, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(800, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer11): _DenseLayer((norm1): BatchNorm2d(832, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(832, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): 
Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer12): _DenseLayer((norm1): BatchNorm2d(864, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(864, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer13): _DenseLayer((norm1): BatchNorm2d(896, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(896, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer14): _DenseLayer((norm1): BatchNorm2d(928, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu1): ReLU(inplaceTrue)(conv1): Conv2d(928, 128, kernel_size(1, 1), stride(1, 1), biasFalse)(norm2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(relu2): ReLU(inplaceTrue)(conv2): Conv2d(128, 32, kernel_size(3, 3), stride(1, 1), padding(1, 1), biasFalse)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size1)(fc1): Linear(in_features32, out_features16, biasTrue)(relu): ReLU()(fc2): Linear(in_features16, out_features32, biasTrue)(sigmoid): Sigmoid()))(denselayer15): _DenseLayer((norm1): BatchNorm2d(960, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(relu1): ReLU(inplace=True)(conv1): Conv2d(960, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(relu2): ReLU(inplace=True)(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size=1)(fc1): Linear(in_features=32, out_features=16, bias=True)(relu): ReLU()(fc2): Linear(in_features=16, out_features=32, bias=True)(sigmoid): Sigmoid()))
(denselayer16): _DenseLayer((norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(relu1): ReLU(inplace=True)(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(relu2): ReLU(inplace=True)(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(se): SqueezeExcitationLayer((global_avg_pool): AdaptiveAvgPool2d(output_size=1)(fc1): Linear(in_features=32, out_features=16, bias=True)(relu): ReLU()(fc2): Linear(in_features=16, out_features=32, bias=True)(sigmoid): Sigmoid()))
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu5): ReLU(inplace=True)
)
(classifier): Linear(in_features=1024, out_features=2, bias=True)
)

Explanation of se_filter_sq: the se_filter_sq parameter passed to the model controls the size of the fully connected layers inside each SE module. In the printout above, fc1 maps 32 features to 16 and fc2 maps 16 back to 32. Adjust this value as your experiments require. The remaining parameters (num_init_features, growth_rate, block_config, num_classes, and so on) can likewise be adjusted to your needs. With this change, the model correctly includes the SE module and should run as expected.

# View the model details
import torchsummary as summary
summary.summary(model, (3, 224, 224))

----------------------------------------------------------------
Layer (type)  Output Shape  Param #
================================================================
Conv2d-1  [-1, 64, 112, 112]  9,408
BatchNorm2d-2  [-1, 64, 112, 112]  128
ReLU-3  [-1, 64, 112, 112]  0
MaxPool2d-4  [-1, 64, 56, 56]  0
BatchNorm2d-5  [-1, 64, 56, 56]  128
ReLU-6  [-1, 64, 56, 56]  0
Conv2d-7  [-1, 128, 56, 56]  8,192
BatchNorm2d-8  [-1, 128, 56, 56]  256
ReLU-9  [-1, 128, 56, 56]  0
Conv2d-10  [-1, 32, 56, 56]  36,864
AdaptiveAvgPool2d-11  [-1, 32, 1, 1]  0
Linear-12  [-1, 16]  528
ReLU-13  [-1, 16]  0
Linear-14  [-1, 32]  544
Sigmoid-15  [-1, 32]  0
SqueezeExcitationLayer-16  [-1, 32, 56, 56]  0
AdaptiveAvgPool2d-17  [-1, 32, 1, 1]  0
Linear-18  [-1, 16]  528
ReLU-19  [-1, 16]  0
Linear-20  [-1, 32]  544
Sigmoid-21  [-1, 32]  0
SqueezeExcitationLayer-22  [-1, 32, 56, 56]  0
BatchNorm2d-23  [-1, 96, 56, 56]  192
ReLU-24  [-1, 96, 56, 56]  0
Conv2d-25  [-1, 128, 56, 56]  12,288
BatchNorm2d-26  [-1, 128, 56, 56]  256
ReLU-27  [-1, 128, 56, 56]  0
Conv2d-28  [-1, 32, 56, 56]  36,864
AdaptiveAvgPool2d-29  [-1, 32, 1, 1]  0
Linear-30  [-1, 16]  528
ReLU-31  [-1, 16]  0
Linear-32  [-1, 32]  544
Sigmoid-33  [-1, 32]  0
SqueezeExcitationLayer-34  [-1, 32, 56, 56]  0
AdaptiveAvgPool2d-35  [-1, 32, 1, 1]  0
Linear-36  [-1, 16]  528
ReLU-37  [-1, 16]  0
Linear-38  [-1, 32]  544
Sigmoid-39  [-1, 32]  0
SqueezeExcitationLayer-40  [-1, 32, 56, 56]  0
BatchNorm2d-41  [-1, 128, 56, 56]  256
ReLU-42  [-1, 128, 56, 56]  0
Conv2d-43  [-1, 128, 56, 56]  16,384
BatchNorm2d-44  [-1, 128, 56, 56]  256
ReLU-45  [-1, 128, 56, 56]  0
Conv2d-46  [-1, 32, 56, 56]  36,864
AdaptiveAvgPool2d-47  [-1, 32, 1, 1]  0
Linear-48  [-1, 16]  528
ReLU-49  [-1, 16]  0
Linear-50  [-1, 32]  544
Sigmoid-51  [-1, 32]  0
SqueezeExcitationLayer-52  [-1, 32, 56, 56]  0
AdaptiveAvgPool2d-53  [-1, 32, 1, 1]  0
Linear-54  [-1, 16]  528
ReLU-55  [-1, 16]  0
Linear-56  [-1, 32]  544
Sigmoid-57  [-1, 32]  0
SqueezeExcitationLayer-58  [-1, 32, 56, 56]  0
BatchNorm2d-59  [-1, 160, 56, 56]  320
ReLU-60  [-1, 160, 56, 56]  0
Conv2d-61  [-1, 128, 56, 56]  20,480
BatchNorm2d-62  [-1, 128, 56, 56]  256
ReLU-63  [-1, 128, 56, 56]  0
Conv2d-64  [-1, 32, 56, 56]  36,864
AdaptiveAvgPool2d-65  [-1, 32, 1, 1]  0
Linear-66  [-1, 16]  528
ReLU-67  [-1, 16]  0
Linear-68  [-1, 32]  544
Sigmoid-69  [-1, 32]  0
SqueezeExcitationLayer-70  [-1, 32, 56, 56]  0
AdaptiveAvgPool2d-71  [-1, 32, 1, 1]  0
Linear-72  [-1, 16]  528
ReLU-73  [-1, 16]  0
Linear-74  [-1, 32]  544
Sigmoid-75  [-1, 32]  0
SqueezeExcitationLayer-76  [-1, 32, 56, 56]  0
BatchNorm2d-77  [-1, 192, 56, 56]  384
ReLU-78  [-1, 192, 56, 56]  0
Conv2d-79  [-1, 128, 56, 56]  24,576
BatchNorm2d-80  [-1, 128, 56, 56]  256
ReLU-81  [-1, 128, 56, 56]  0
Conv2d-82  [-1, 32, 56, 56]  36,864
AdaptiveAvgPool2d-83  [-1, 32, 1, 1]  0
Linear-84  [-1, 16]  528
ReLU-85  [-1, 16]  0
Linear-86  [-1, 32]  544
Sigmoid-87  [-1, 32]  0
SqueezeExcitationLayer-88  [-1, 32, 56, 56]  0
AdaptiveAvgPool2d-89  [-1, 32, 1, 1]  0
Linear-90  [-1, 16]  528
ReLU-91  [-1, 16]  0
Linear-92  [-1, 32]  544
Sigmoid-93  [-1, 32]  0
SqueezeExcitationLayer-94  [-1, 32, 56, 56]  0
BatchNorm2d-95  [-1, 224, 56, 56]  448
ReLU-96  [-1, 224, 56, 56]  0
Conv2d-97  [-1, 128, 56, 56]  28,672
BatchNorm2d-98  [-1, 128, 56, 56]  256
ReLU-99  [-1, 128, 56, 56]  0
Conv2d-100  [-1, 32, 56, 56]  36,864
AdaptiveAvgPool2d-101  [-1, 32, 1, 1]  0
Linear-102  [-1, 16]  528
ReLU-103  [-1, 16]  0
Linear-104  [-1, 32]  544
Sigmoid-105  [-1, 32]  0
SqueezeExcitationLayer-106  [-1, 32, 56, 56]  0
AdaptiveAvgPool2d-107  [-1, 32, 1, 1]  0
Linear-108  [-1, 16]  528
ReLU-109  [-1, 16]  0
Linear-110  [-1, 32]  544
Sigmoid-111  [-1, 32]  0
SqueezeExcitationLayer-112  [-1, 32, 56, 56]  0
BatchNorm2d-113  [-1, 256, 56, 56]  512
ReLU-114  [-1, 256, 56, 56]  0
Conv2d-115  [-1, 128, 56, 56]  32,768
AvgPool2d-116  [-1, 128, 28, 28]  0
BatchNorm2d-117  [-1, 128, 28, 28]  256
ReLU-118  [-1, 128, 28, 28]  0
Conv2d-119  [-1, 128, 28, 28]  16,384
BatchNorm2d-120  [-1, 128, 28, 28]  256
ReLU-121  [-1, 128, 28, 28]  0
Conv2d-122  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-123  [-1, 32, 1, 1]  0
Linear-124  [-1, 16]  528
ReLU-125  [-1, 16]  0
Linear-126  [-1, 32]  544
Sigmoid-127  [-1, 32]  0
SqueezeExcitationLayer-128  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-129  [-1, 32, 1, 1]  0
Linear-130  [-1, 16]  528
ReLU-131  [-1, 16]  0
Linear-132  [-1, 32]  544
Sigmoid-133  [-1, 32]  0
SqueezeExcitationLayer-134  [-1, 32, 28, 28]  0
BatchNorm2d-135  [-1, 160, 28, 28]  320
ReLU-136  [-1, 160, 28, 28]  0
Conv2d-137  [-1, 128, 28, 28]  20,480
BatchNorm2d-138  [-1, 128, 28, 28]  256
ReLU-139  [-1, 128, 28, 28]  0
Conv2d-140  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-141  [-1, 32, 1, 1]  0
Linear-142  [-1, 16]  528
ReLU-143  [-1, 16]  0
Linear-144  [-1, 32]  544
Sigmoid-145  [-1, 32]  0
SqueezeExcitationLayer-146  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-147  [-1, 32, 1, 1]  0
Linear-148  [-1, 16]  528
ReLU-149  [-1, 16]  0
Linear-150  [-1, 32]  544
Sigmoid-151  [-1, 32]  0
SqueezeExcitationLayer-152  [-1, 32, 28, 28]  0
BatchNorm2d-153  [-1, 192, 28, 28]  384
ReLU-154  [-1, 192, 28, 28]  0
Conv2d-155  [-1, 128, 28, 28]  24,576
BatchNorm2d-156  [-1, 128, 28, 28]  256
ReLU-157  [-1, 128, 28, 28]  0
Conv2d-158  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-159  [-1, 32, 1, 1]  0
Linear-160  [-1, 16]  528
ReLU-161  [-1, 16]  0
Linear-162  [-1, 32]  544
Sigmoid-163  [-1, 32]  0
SqueezeExcitationLayer-164  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-165  [-1, 32, 1, 1]  0
Linear-166  [-1, 16]  528
ReLU-167  [-1, 16]  0
Linear-168  [-1, 32]  544
Sigmoid-169  [-1, 32]  0
SqueezeExcitationLayer-170  [-1, 32, 28, 28]  0
BatchNorm2d-171  [-1, 224, 28, 28]  448
ReLU-172  [-1, 224, 28, 28]  0
Conv2d-173  [-1, 128, 28, 28]  28,672
BatchNorm2d-174  [-1, 128, 28, 28]  256
ReLU-175  [-1, 128, 28, 28]  0
Conv2d-176  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-177  [-1, 32, 1, 1]  0
Linear-178  [-1, 16]  528
ReLU-179  [-1, 16]  0
Linear-180  [-1, 32]  544
Sigmoid-181  [-1, 32]  0
SqueezeExcitationLayer-182  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-183  [-1, 32, 1, 1]  0
Linear-184  [-1, 16]  528
ReLU-185  [-1, 16]  0
Linear-186  [-1, 32]  544
Sigmoid-187  [-1, 32]  0
SqueezeExcitationLayer-188  [-1, 32, 28, 28]  0
BatchNorm2d-189  [-1, 256, 28, 28]  512
ReLU-190  [-1, 256, 28, 28]  0
Conv2d-191  [-1, 128, 28, 28]  32,768
BatchNorm2d-192  [-1, 128, 28, 28]  256
ReLU-193  [-1, 128, 28, 28]  0
Conv2d-194  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-195  [-1, 32, 1, 1]  0
Linear-196  [-1, 16]  528
ReLU-197  [-1, 16]  0
Linear-198  [-1, 32]  544
Sigmoid-199  [-1, 32]  0
SqueezeExcitationLayer-200  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-201  [-1, 32, 1, 1]  0
Linear-202  [-1, 16]  528
ReLU-203  [-1, 16]  0
Linear-204  [-1, 32]  544
Sigmoid-205  [-1, 32]  0
SqueezeExcitationLayer-206  [-1, 32, 28, 28]  0
BatchNorm2d-207  [-1, 288, 28, 28]  576
ReLU-208  [-1, 288, 28, 28]  0
Conv2d-209  [-1, 128, 28, 28]  36,864
BatchNorm2d-210  [-1, 128, 28, 28]  256
ReLU-211  [-1, 128, 28, 28]  0
Conv2d-212  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-213  [-1, 32, 1, 1]  0
Linear-214  [-1, 16]  528
ReLU-215  [-1, 16]  0
Linear-216  [-1, 32]  544
Sigmoid-217  [-1, 32]  0
SqueezeExcitationLayer-218  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-219  [-1, 32, 1, 1]  0
Linear-220  [-1, 16]  528
ReLU-221  [-1, 16]  0
Linear-222  [-1, 32]  544
Sigmoid-223  [-1, 32]  0
SqueezeExcitationLayer-224  [-1, 32, 28, 28]  0
BatchNorm2d-225  [-1, 320, 28, 28]  640
ReLU-226  [-1, 320, 28, 28]  0
Conv2d-227  [-1, 128, 28, 28]  40,960
BatchNorm2d-228  [-1, 128, 28, 28]  256
ReLU-229  [-1, 128, 28, 28]  0
Conv2d-230  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-231  [-1, 32, 1, 1]  0
Linear-232  [-1, 16]  528
ReLU-233  [-1, 16]  0
Linear-234  [-1, 32]  544
Sigmoid-235  [-1, 32]  0
SqueezeExcitationLayer-236  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-237  [-1, 32, 1, 1]  0
Linear-238  [-1, 16]  528
ReLU-239  [-1, 16]  0
Linear-240  [-1, 32]  544
Sigmoid-241  [-1, 32]  0
SqueezeExcitationLayer-242  [-1, 32, 28, 28]  0
BatchNorm2d-243  [-1, 352, 28, 28]  704
ReLU-244  [-1, 352, 28, 28]  0
Conv2d-245  [-1, 128, 28, 28]  45,056
BatchNorm2d-246  [-1, 128, 28, 28]  256
ReLU-247  [-1, 128, 28, 28]  0
Conv2d-248  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-249  [-1, 32, 1, 1]  0
Linear-250  [-1, 16]  528
ReLU-251  [-1, 16]  0
Linear-252  [-1, 32]  544
Sigmoid-253  [-1, 32]  0
SqueezeExcitationLayer-254  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-255  [-1, 32, 1, 1]  0
Linear-256  [-1, 16]  528
ReLU-257  [-1, 16]  0
Linear-258  [-1, 32]  544
Sigmoid-259  [-1, 32]  0
SqueezeExcitationLayer-260  [-1, 32, 28, 28]  0
BatchNorm2d-261  [-1, 384, 28, 28]  768
ReLU-262  [-1, 384, 28, 28]  0
Conv2d-263  [-1, 128, 28, 28]  49,152
BatchNorm2d-264  [-1, 128, 28, 28]  256
ReLU-265  [-1, 128, 28, 28]  0
Conv2d-266  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-267  [-1, 32, 1, 1]  0
Linear-268  [-1, 16]  528
ReLU-269  [-1, 16]  0
Linear-270  [-1, 32]  544
Sigmoid-271  [-1, 32]  0
SqueezeExcitationLayer-272  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-273  [-1, 32, 1, 1]  0
Linear-274  [-1, 16]  528
ReLU-275  [-1, 16]  0
Linear-276  [-1, 32]  544
Sigmoid-277  [-1, 32]  0
SqueezeExcitationLayer-278  [-1, 32, 28, 28]  0
BatchNorm2d-279  [-1, 416, 28, 28]  832
ReLU-280  [-1, 416, 28, 28]  0
Conv2d-281  [-1, 128, 28, 28]  53,248
BatchNorm2d-282  [-1, 128, 28, 28]  256
ReLU-283  [-1, 128, 28, 28]  0
Conv2d-284  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-285  [-1, 32, 1, 1]  0
Linear-286  [-1, 16]  528
ReLU-287  [-1, 16]  0
Linear-288  [-1, 32]  544
Sigmoid-289  [-1, 32]  0
SqueezeExcitationLayer-290  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-291  [-1, 32, 1, 1]  0
Linear-292  [-1, 16]  528
ReLU-293  [-1, 16]  0
Linear-294  [-1, 32]  544
Sigmoid-295  [-1, 32]  0
SqueezeExcitationLayer-296  [-1, 32, 28, 28]  0
BatchNorm2d-297  [-1, 448, 28, 28]  896
ReLU-298  [-1, 448, 28, 28]  0
Conv2d-299  [-1, 128, 28, 28]  57,344
BatchNorm2d-300  [-1, 128, 28, 28]  256
ReLU-301  [-1, 128, 28, 28]  0
Conv2d-302  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-303  [-1, 32, 1, 1]  0
Linear-304  [-1, 16]  528
ReLU-305  [-1, 16]  0
Linear-306  [-1, 32]  544
Sigmoid-307  [-1, 32]  0
SqueezeExcitationLayer-308  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-309  [-1, 32, 1, 1]  0
Linear-310  [-1, 16]  528
ReLU-311  [-1, 16]  0
Linear-312  [-1, 32]  544
Sigmoid-313  [-1, 32]  0
SqueezeExcitationLayer-314  [-1, 32, 28, 28]  0
BatchNorm2d-315  [-1, 480, 28, 28]  960
ReLU-316  [-1, 480, 28, 28]  0
Conv2d-317  [-1, 128, 28, 28]  61,440
BatchNorm2d-318  [-1, 128, 28, 28]  256
ReLU-319  [-1, 128, 28, 28]  0
Conv2d-320  [-1, 32, 28, 28]  36,864
AdaptiveAvgPool2d-321  [-1, 32, 1, 1]  0
Linear-322  [-1, 16]  528
ReLU-323  [-1, 16]  0
Linear-324  [-1, 32]  544
Sigmoid-325  [-1, 32]  0
SqueezeExcitationLayer-326  [-1, 32, 28, 28]  0
AdaptiveAvgPool2d-327  [-1, 32, 1, 1]  0
Linear-328  [-1, 16]  528
ReLU-329  [-1, 16]  0
Linear-330  [-1, 32]  544
Sigmoid-331  [-1, 32]  0
SqueezeExcitationLayer-332  [-1, 32, 28, 28]  0
BatchNorm2d-333  [-1, 512, 28, 28]  1,024
ReLU-334  [-1, 512, 28, 28]  0
Conv2d-335  [-1, 256, 28, 28]  131,072
AvgPool2d-336  [-1, 256, 14, 14]  0
BatchNorm2d-337  [-1, 256, 14, 14]  512
ReLU-338  [-1, 256, 14, 14]  0
Conv2d-339  [-1, 128, 14, 14]  32,768
BatchNorm2d-340  [-1, 128, 14, 14]  256
ReLU-341  [-1, 128, 14, 14]  0
Conv2d-342  [-1, 32, 14, 14]  36,864
AdaptiveAvgPool2d-343  [-1, 32, 1, 1]  0
Linear-344  [-1, 16]  528
ReLU-345  [-1, 16]  0
Linear-346  [-1, 32]  544
Sigmoid-347  [-1, 32]  0
SqueezeExcitationLayer-348  [-1, 32, 14, 14]  0
AdaptiveAvgPool2d-349  [-1, 32, 1, 1]  0
Linear-350  [-1, 16]  528
ReLU-351  [-1, 16]  0
Linear-352  [-1, 32]  544
Sigmoid-353  [-1, 32]  0
SqueezeExcitationLayer-354  [-1, 32, 14, 14]  0
BatchNorm2d-355  [-1, 288, 14, 14]  576
ReLU-356  [-1, 288, 14, 14]  0
Conv2d-357  [-1, 128, 14, 14]  36,864
BatchNorm2d-358  [-1, 128, 14, 14]  256
ReLU-359  [-1, 128, 14, 14]  0
Conv2d-360  [-1, 32, 14, 14]  36,864
AdaptiveAvgPool2d-361  [-1, 32, 1, 1]  0
Linear-362  [-1, 16]  528
ReLU-363  [-1, 16]  0
Linear-364  [-1, 32]  544
Sigmoid-365  [-1, 32]  0
SqueezeExcitationLayer-366  [-1, 32, 14, 14]  0
AdaptiveAvgPool2d-367  [-1, 32, 1, 1]  0
Linear-368  [-1, 16]  528
ReLU-369  [-1, 16]  0
Linear-370  [-1, 32]  544
Sigmoid-371  [-1, 32]  0
SqueezeExcitationLayer-372  [-1, 32, 14, 14]  0
BatchNorm2d-373  [-1, 320, 14, 14]  640
ReLU-374  [-1, 320, 14, 14]  0
Conv2d-375  [-1, 128, 14, 14]  40,960
BatchNorm2d-376  [-1, 128, 14, 14]  256
ReLU-377  [-1, 128, 14, 14]  0
Conv2d-378  [-1, 32, 14, 14]  36,864
AdaptiveAvgPool2d-379  [-1, 32, 1, 1]  0
Linear-380  [-1, 16]  528
ReLU-381  [-1, 16]  0
Linear-382  [-1, 32]  544
Sigmoid-383  [-1, 32]  0
SqueezeExcitationLayer-384  [-1, 32, 14, 14]  0
AdaptiveAvgPool2d-385  [-1, 32, 1, 1]  0
Linear-386  [-1, 16]  528
ReLU-387  [-1, 16]  0
Linear-388  [-1, 32]  544
Sigmoid-389  [-1, 32]  0
SqueezeExcitationLayer-390  [-1, 32, 14, 14]  0
BatchNorm2d-391  [-1, 352, 14, 14]  704
ReLU-392  [-1, 352, 14, 14]  0
Conv2d-393  [-1, 128, 14, 14]  45,056
BatchNorm2d-394  [-1, 128, 14, 14]  256
ReLU-395  [-1, 128, 14, 14]  0
Conv2d-396  [-1, 32, 14, 14]  36,864
AdaptiveAvgPool2d-397  [-1, 32, 1, 1]  0
Linear-398  [-1, 16]  528
ReLU-399  [-1, 16]  0
Linear-400  [-1, 32]  544
Sigmoid-401  [-1, 32]  0
SqueezeExcitationLayer-402  [-1, 32, 14, 14]  0
AdaptiveAvgPool2d-403  [-1, 32, 1, 1]  0
Linear-404  [-1, 16]  528
ReLU-405  [-1, 16]  0
Linear-406  [-1, 32]  544
Sigmoid-407  [-1, 32]  0
SqueezeExcitationLayer-408  [-1, 32, 14, 14]  0
BatchNorm2d-409 
[-1, 384, 14, 14] 768ReLU-410 [-1, 384, 14, 14] 0Conv2d-411 [-1, 128, 14, 14] 49,152BatchNorm2d-412 [-1, 128, 14, 14] 256ReLU-413 [-1, 128, 14, 14] 0Conv2d-414 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-415 [-1, 32, 1, 1] 0Linear-416 [-1, 16] 528ReLU-417 [-1, 16] 0Linear-418 [-1, 32] 544Sigmoid-419 [-1, 32] 0 SqueezeExcitationLayer-420 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-421 [-1, 32, 1, 1] 0Linear-422 [-1, 16] 528ReLU-423 [-1, 16] 0Linear-424 [-1, 32] 544Sigmoid-425 [-1, 32] 0 SqueezeExcitationLayer-426 [-1, 32, 14, 14] 0BatchNorm2d-427 [-1, 416, 14, 14] 832ReLU-428 [-1, 416, 14, 14] 0Conv2d-429 [-1, 128, 14, 14] 53,248BatchNorm2d-430 [-1, 128, 14, 14] 256ReLU-431 [-1, 128, 14, 14] 0Conv2d-432 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-433 [-1, 32, 1, 1] 0Linear-434 [-1, 16] 528ReLU-435 [-1, 16] 0Linear-436 [-1, 32] 544Sigmoid-437 [-1, 32] 0 SqueezeExcitationLayer-438 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-439 [-1, 32, 1, 1] 0Linear-440 [-1, 16] 528ReLU-441 [-1, 16] 0Linear-442 [-1, 32] 544Sigmoid-443 [-1, 32] 0 SqueezeExcitationLayer-444 [-1, 32, 14, 14] 0BatchNorm2d-445 [-1, 448, 14, 14] 896ReLU-446 [-1, 448, 14, 14] 0Conv2d-447 [-1, 128, 14, 14] 57,344BatchNorm2d-448 [-1, 128, 14, 14] 256ReLU-449 [-1, 128, 14, 14] 0Conv2d-450 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-451 [-1, 32, 1, 1] 0Linear-452 [-1, 16] 528ReLU-453 [-1, 16] 0Linear-454 [-1, 32] 544Sigmoid-455 [-1, 32] 0 SqueezeExcitationLayer-456 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-457 [-1, 32, 1, 1] 0Linear-458 [-1, 16] 528ReLU-459 [-1, 16] 0Linear-460 [-1, 32] 544Sigmoid-461 [-1, 32] 0 SqueezeExcitationLayer-462 [-1, 32, 14, 14] 0BatchNorm2d-463 [-1, 480, 14, 14] 960ReLU-464 [-1, 480, 14, 14] 0Conv2d-465 [-1, 128, 14, 14] 61,440BatchNorm2d-466 [-1, 128, 14, 14] 256ReLU-467 [-1, 128, 14, 14] 0Conv2d-468 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-469 [-1, 32, 1, 1] 0Linear-470 [-1, 16] 528ReLU-471 [-1, 16] 0Linear-472 [-1, 32] 544Sigmoid-473 [-1, 32] 0 SqueezeExcitationLayer-474 [-1, 32, 14, 14] 0 
AdaptiveAvgPool2d-475 [-1, 32, 1, 1] 0Linear-476 [-1, 16] 528ReLU-477 [-1, 16] 0Linear-478 [-1, 32] 544Sigmoid-479 [-1, 32] 0 SqueezeExcitationLayer-480 [-1, 32, 14, 14] 0BatchNorm2d-481 [-1, 512, 14, 14] 1,024ReLU-482 [-1, 512, 14, 14] 0Conv2d-483 [-1, 128, 14, 14] 65,536BatchNorm2d-484 [-1, 128, 14, 14] 256ReLU-485 [-1, 128, 14, 14] 0Conv2d-486 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-487 [-1, 32, 1, 1] 0Linear-488 [-1, 16] 528ReLU-489 [-1, 16] 0Linear-490 [-1, 32] 544Sigmoid-491 [-1, 32] 0 SqueezeExcitationLayer-492 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-493 [-1, 32, 1, 1] 0Linear-494 [-1, 16] 528ReLU-495 [-1, 16] 0Linear-496 [-1, 32] 544Sigmoid-497 [-1, 32] 0 SqueezeExcitationLayer-498 [-1, 32, 14, 14] 0BatchNorm2d-499 [-1, 544, 14, 14] 1,088ReLU-500 [-1, 544, 14, 14] 0Conv2d-501 [-1, 128, 14, 14] 69,632BatchNorm2d-502 [-1, 128, 14, 14] 256ReLU-503 [-1, 128, 14, 14] 0Conv2d-504 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-505 [-1, 32, 1, 1] 0Linear-506 [-1, 16] 528ReLU-507 [-1, 16] 0Linear-508 [-1, 32] 544Sigmoid-509 [-1, 32] 0 SqueezeExcitationLayer-510 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-511 [-1, 32, 1, 1] 0Linear-512 [-1, 16] 528ReLU-513 [-1, 16] 0Linear-514 [-1, 32] 544Sigmoid-515 [-1, 32] 0 SqueezeExcitationLayer-516 [-1, 32, 14, 14] 0BatchNorm2d-517 [-1, 576, 14, 14] 1,152ReLU-518 [-1, 576, 14, 14] 0Conv2d-519 [-1, 128, 14, 14] 73,728BatchNorm2d-520 [-1, 128, 14, 14] 256ReLU-521 [-1, 128, 14, 14] 0Conv2d-522 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-523 [-1, 32, 1, 1] 0Linear-524 [-1, 16] 528ReLU-525 [-1, 16] 0Linear-526 [-1, 32] 544Sigmoid-527 [-1, 32] 0 SqueezeExcitationLayer-528 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-529 [-1, 32, 1, 1] 0Linear-530 [-1, 16] 528ReLU-531 [-1, 16] 0Linear-532 [-1, 32] 544Sigmoid-533 [-1, 32] 0 SqueezeExcitationLayer-534 [-1, 32, 14, 14] 0BatchNorm2d-535 [-1, 608, 14, 14] 1,216ReLU-536 [-1, 608, 14, 14] 0Conv2d-537 [-1, 128, 14, 14] 77,824BatchNorm2d-538 [-1, 128, 14, 14] 256ReLU-539 [-1, 128, 14, 14] 0Conv2d-540 [-1, 32, 14, 
14] 36,864 AdaptiveAvgPool2d-541 [-1, 32, 1, 1] 0Linear-542 [-1, 16] 528ReLU-543 [-1, 16] 0Linear-544 [-1, 32] 544Sigmoid-545 [-1, 32] 0 SqueezeExcitationLayer-546 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-547 [-1, 32, 1, 1] 0Linear-548 [-1, 16] 528ReLU-549 [-1, 16] 0Linear-550 [-1, 32] 544Sigmoid-551 [-1, 32] 0 SqueezeExcitationLayer-552 [-1, 32, 14, 14] 0BatchNorm2d-553 [-1, 640, 14, 14] 1,280ReLU-554 [-1, 640, 14, 14] 0Conv2d-555 [-1, 128, 14, 14] 81,920BatchNorm2d-556 [-1, 128, 14, 14] 256ReLU-557 [-1, 128, 14, 14] 0Conv2d-558 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-559 [-1, 32, 1, 1] 0Linear-560 [-1, 16] 528ReLU-561 [-1, 16] 0Linear-562 [-1, 32] 544Sigmoid-563 [-1, 32] 0 SqueezeExcitationLayer-564 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-565 [-1, 32, 1, 1] 0Linear-566 [-1, 16] 528ReLU-567 [-1, 16] 0Linear-568 [-1, 32] 544Sigmoid-569 [-1, 32] 0 SqueezeExcitationLayer-570 [-1, 32, 14, 14] 0BatchNorm2d-571 [-1, 672, 14, 14] 1,344ReLU-572 [-1, 672, 14, 14] 0Conv2d-573 [-1, 128, 14, 14] 86,016BatchNorm2d-574 [-1, 128, 14, 14] 256ReLU-575 [-1, 128, 14, 14] 0Conv2d-576 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-577 [-1, 32, 1, 1] 0Linear-578 [-1, 16] 528ReLU-579 [-1, 16] 0Linear-580 [-1, 32] 544Sigmoid-581 [-1, 32] 0 SqueezeExcitationLayer-582 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-583 [-1, 32, 1, 1] 0Linear-584 [-1, 16] 528ReLU-585 [-1, 16] 0Linear-586 [-1, 32] 544Sigmoid-587 [-1, 32] 0 SqueezeExcitationLayer-588 [-1, 32, 14, 14] 0BatchNorm2d-589 [-1, 704, 14, 14] 1,408ReLU-590 [-1, 704, 14, 14] 0Conv2d-591 [-1, 128, 14, 14] 90,112BatchNorm2d-592 [-1, 128, 14, 14] 256ReLU-593 [-1, 128, 14, 14] 0Conv2d-594 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-595 [-1, 32, 1, 1] 0Linear-596 [-1, 16] 528ReLU-597 [-1, 16] 0Linear-598 [-1, 32] 544Sigmoid-599 [-1, 32] 0 SqueezeExcitationLayer-600 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-601 [-1, 32, 1, 1] 0Linear-602 [-1, 16] 528ReLU-603 [-1, 16] 0Linear-604 [-1, 32] 544Sigmoid-605 [-1, 32] 0 SqueezeExcitationLayer-606 [-1, 32, 14, 14] 
0BatchNorm2d-607 [-1, 736, 14, 14] 1,472ReLU-608 [-1, 736, 14, 14] 0Conv2d-609 [-1, 128, 14, 14] 94,208BatchNorm2d-610 [-1, 128, 14, 14] 256ReLU-611 [-1, 128, 14, 14] 0Conv2d-612 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-613 [-1, 32, 1, 1] 0Linear-614 [-1, 16] 528ReLU-615 [-1, 16] 0Linear-616 [-1, 32] 544Sigmoid-617 [-1, 32] 0 SqueezeExcitationLayer-618 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-619 [-1, 32, 1, 1] 0Linear-620 [-1, 16] 528ReLU-621 [-1, 16] 0Linear-622 [-1, 32] 544Sigmoid-623 [-1, 32] 0 SqueezeExcitationLayer-624 [-1, 32, 14, 14] 0BatchNorm2d-625 [-1, 768, 14, 14] 1,536ReLU-626 [-1, 768, 14, 14] 0Conv2d-627 [-1, 128, 14, 14] 98,304BatchNorm2d-628 [-1, 128, 14, 14] 256ReLU-629 [-1, 128, 14, 14] 0Conv2d-630 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-631 [-1, 32, 1, 1] 0Linear-632 [-1, 16] 528ReLU-633 [-1, 16] 0Linear-634 [-1, 32] 544Sigmoid-635 [-1, 32] 0 SqueezeExcitationLayer-636 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-637 [-1, 32, 1, 1] 0Linear-638 [-1, 16] 528ReLU-639 [-1, 16] 0Linear-640 [-1, 32] 544Sigmoid-641 [-1, 32] 0 SqueezeExcitationLayer-642 [-1, 32, 14, 14] 0BatchNorm2d-643 [-1, 800, 14, 14] 1,600ReLU-644 [-1, 800, 14, 14] 0Conv2d-645 [-1, 128, 14, 14] 102,400BatchNorm2d-646 [-1, 128, 14, 14] 256ReLU-647 [-1, 128, 14, 14] 0Conv2d-648 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-649 [-1, 32, 1, 1] 0Linear-650 [-1, 16] 528ReLU-651 [-1, 16] 0Linear-652 [-1, 32] 544Sigmoid-653 [-1, 32] 0 SqueezeExcitationLayer-654 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-655 [-1, 32, 1, 1] 0Linear-656 [-1, 16] 528ReLU-657 [-1, 16] 0Linear-658 [-1, 32] 544Sigmoid-659 [-1, 32] 0 SqueezeExcitationLayer-660 [-1, 32, 14, 14] 0BatchNorm2d-661 [-1, 832, 14, 14] 1,664ReLU-662 [-1, 832, 14, 14] 0Conv2d-663 [-1, 128, 14, 14] 106,496BatchNorm2d-664 [-1, 128, 14, 14] 256ReLU-665 [-1, 128, 14, 14] 0Conv2d-666 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-667 [-1, 32, 1, 1] 0Linear-668 [-1, 16] 528ReLU-669 [-1, 16] 0Linear-670 [-1, 32] 544Sigmoid-671 [-1, 32] 0 SqueezeExcitationLayer-672 [-1, 
32, 14, 14] 0 AdaptiveAvgPool2d-673 [-1, 32, 1, 1] 0Linear-674 [-1, 16] 528ReLU-675 [-1, 16] 0Linear-676 [-1, 32] 544Sigmoid-677 [-1, 32] 0 SqueezeExcitationLayer-678 [-1, 32, 14, 14] 0BatchNorm2d-679 [-1, 864, 14, 14] 1,728ReLU-680 [-1, 864, 14, 14] 0Conv2d-681 [-1, 128, 14, 14] 110,592BatchNorm2d-682 [-1, 128, 14, 14] 256ReLU-683 [-1, 128, 14, 14] 0Conv2d-684 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-685 [-1, 32, 1, 1] 0Linear-686 [-1, 16] 528ReLU-687 [-1, 16] 0Linear-688 [-1, 32] 544Sigmoid-689 [-1, 32] 0 SqueezeExcitationLayer-690 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-691 [-1, 32, 1, 1] 0Linear-692 [-1, 16] 528ReLU-693 [-1, 16] 0Linear-694 [-1, 32] 544Sigmoid-695 [-1, 32] 0 SqueezeExcitationLayer-696 [-1, 32, 14, 14] 0BatchNorm2d-697 [-1, 896, 14, 14] 1,792ReLU-698 [-1, 896, 14, 14] 0Conv2d-699 [-1, 128, 14, 14] 114,688BatchNorm2d-700 [-1, 128, 14, 14] 256ReLU-701 [-1, 128, 14, 14] 0Conv2d-702 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-703 [-1, 32, 1, 1] 0Linear-704 [-1, 16] 528ReLU-705 [-1, 16] 0Linear-706 [-1, 32] 544Sigmoid-707 [-1, 32] 0 SqueezeExcitationLayer-708 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-709 [-1, 32, 1, 1] 0Linear-710 [-1, 16] 528ReLU-711 [-1, 16] 0Linear-712 [-1, 32] 544Sigmoid-713 [-1, 32] 0 SqueezeExcitationLayer-714 [-1, 32, 14, 14] 0BatchNorm2d-715 [-1, 928, 14, 14] 1,856ReLU-716 [-1, 928, 14, 14] 0Conv2d-717 [-1, 128, 14, 14] 118,784BatchNorm2d-718 [-1, 128, 14, 14] 256ReLU-719 [-1, 128, 14, 14] 0Conv2d-720 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-721 [-1, 32, 1, 1] 0Linear-722 [-1, 16] 528ReLU-723 [-1, 16] 0Linear-724 [-1, 32] 544Sigmoid-725 [-1, 32] 0 SqueezeExcitationLayer-726 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-727 [-1, 32, 1, 1] 0Linear-728 [-1, 16] 528ReLU-729 [-1, 16] 0Linear-730 [-1, 32] 544Sigmoid-731 [-1, 32] 0 SqueezeExcitationLayer-732 [-1, 32, 14, 14] 0BatchNorm2d-733 [-1, 960, 14, 14] 1,920ReLU-734 [-1, 960, 14, 14] 0Conv2d-735 [-1, 128, 14, 14] 122,880BatchNorm2d-736 [-1, 128, 14, 14] 256ReLU-737 [-1, 128, 14, 14] 
0Conv2d-738 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-739 [-1, 32, 1, 1] 0Linear-740 [-1, 16] 528ReLU-741 [-1, 16] 0Linear-742 [-1, 32] 544Sigmoid-743 [-1, 32] 0 SqueezeExcitationLayer-744 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-745 [-1, 32, 1, 1] 0Linear-746 [-1, 16] 528ReLU-747 [-1, 16] 0Linear-748 [-1, 32] 544Sigmoid-749 [-1, 32] 0 SqueezeExcitationLayer-750 [-1, 32, 14, 14] 0BatchNorm2d-751 [-1, 992, 14, 14] 1,984ReLU-752 [-1, 992, 14, 14] 0Conv2d-753 [-1, 128, 14, 14] 126,976BatchNorm2d-754 [-1, 128, 14, 14] 256ReLU-755 [-1, 128, 14, 14] 0Conv2d-756 [-1, 32, 14, 14] 36,864 AdaptiveAvgPool2d-757 [-1, 32, 1, 1] 0Linear-758 [-1, 16] 528ReLU-759 [-1, 16] 0Linear-760 [-1, 32] 544Sigmoid-761 [-1, 32] 0 SqueezeExcitationLayer-762 [-1, 32, 14, 14] 0 AdaptiveAvgPool2d-763 [-1, 32, 1, 1] 0Linear-764 [-1, 16] 528ReLU-765 [-1, 16] 0Linear-766 [-1, 32] 544Sigmoid-767 [-1, 32] 0 SqueezeExcitationLayer-768 [-1, 32, 14, 14] 0BatchNorm2d-769 [-1, 1024, 14, 14] 2,048ReLU-770 [-1, 1024, 14, 14] 0Conv2d-771 [-1, 512, 14, 14] 524,288AvgPool2d-772 [-1, 512, 7, 7] 0BatchNorm2d-773 [-1, 512, 7, 7] 1,024ReLU-774 [-1, 512, 7, 7] 0Conv2d-775 [-1, 128, 7, 7] 65,536BatchNorm2d-776 [-1, 128, 7, 7] 256ReLU-777 [-1, 128, 7, 7] 0Conv2d-778 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-779 [-1, 32, 1, 1] 0Linear-780 [-1, 16] 528ReLU-781 [-1, 16] 0Linear-782 [-1, 32] 544Sigmoid-783 [-1, 32] 0 SqueezeExcitationLayer-784 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-785 [-1, 32, 1, 1] 0Linear-786 [-1, 16] 528ReLU-787 [-1, 16] 0Linear-788 [-1, 32] 544Sigmoid-789 [-1, 32] 0 SqueezeExcitationLayer-790 [-1, 32, 7, 7] 0BatchNorm2d-791 [-1, 544, 7, 7] 1,088ReLU-792 [-1, 544, 7, 7] 0Conv2d-793 [-1, 128, 7, 7] 69,632BatchNorm2d-794 [-1, 128, 7, 7] 256ReLU-795 [-1, 128, 7, 7] 0Conv2d-796 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-797 [-1, 32, 1, 1] 0Linear-798 [-1, 16] 528ReLU-799 [-1, 16] 0Linear-800 [-1, 32] 544Sigmoid-801 [-1, 32] 0 SqueezeExcitationLayer-802 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-803 [-1, 32, 1, 1] 0Linear-804 
[-1, 16] 528ReLU-805 [-1, 16] 0Linear-806 [-1, 32] 544Sigmoid-807 [-1, 32] 0 SqueezeExcitationLayer-808 [-1, 32, 7, 7] 0BatchNorm2d-809 [-1, 576, 7, 7] 1,152ReLU-810 [-1, 576, 7, 7] 0Conv2d-811 [-1, 128, 7, 7] 73,728BatchNorm2d-812 [-1, 128, 7, 7] 256ReLU-813 [-1, 128, 7, 7] 0Conv2d-814 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-815 [-1, 32, 1, 1] 0Linear-816 [-1, 16] 528ReLU-817 [-1, 16] 0Linear-818 [-1, 32] 544Sigmoid-819 [-1, 32] 0 SqueezeExcitationLayer-820 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-821 [-1, 32, 1, 1] 0Linear-822 [-1, 16] 528ReLU-823 [-1, 16] 0Linear-824 [-1, 32] 544Sigmoid-825 [-1, 32] 0 SqueezeExcitationLayer-826 [-1, 32, 7, 7] 0BatchNorm2d-827 [-1, 608, 7, 7] 1,216ReLU-828 [-1, 608, 7, 7] 0Conv2d-829 [-1, 128, 7, 7] 77,824BatchNorm2d-830 [-1, 128, 7, 7] 256ReLU-831 [-1, 128, 7, 7] 0Conv2d-832 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-833 [-1, 32, 1, 1] 0Linear-834 [-1, 16] 528ReLU-835 [-1, 16] 0Linear-836 [-1, 32] 544Sigmoid-837 [-1, 32] 0 SqueezeExcitationLayer-838 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-839 [-1, 32, 1, 1] 0Linear-840 [-1, 16] 528ReLU-841 [-1, 16] 0Linear-842 [-1, 32] 544Sigmoid-843 [-1, 32] 0 SqueezeExcitationLayer-844 [-1, 32, 7, 7] 0BatchNorm2d-845 [-1, 640, 7, 7] 1,280ReLU-846 [-1, 640, 7, 7] 0Conv2d-847 [-1, 128, 7, 7] 81,920BatchNorm2d-848 [-1, 128, 7, 7] 256ReLU-849 [-1, 128, 7, 7] 0Conv2d-850 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-851 [-1, 32, 1, 1] 0Linear-852 [-1, 16] 528ReLU-853 [-1, 16] 0Linear-854 [-1, 32] 544Sigmoid-855 [-1, 32] 0 SqueezeExcitationLayer-856 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-857 [-1, 32, 1, 1] 0Linear-858 [-1, 16] 528ReLU-859 [-1, 16] 0Linear-860 [-1, 32] 544Sigmoid-861 [-1, 32] 0 SqueezeExcitationLayer-862 [-1, 32, 7, 7] 0BatchNorm2d-863 [-1, 672, 7, 7] 1,344ReLU-864 [-1, 672, 7, 7] 0Conv2d-865 [-1, 128, 7, 7] 86,016BatchNorm2d-866 [-1, 128, 7, 7] 256ReLU-867 [-1, 128, 7, 7] 0Conv2d-868 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-869 [-1, 32, 1, 1] 0Linear-870 [-1, 16] 528ReLU-871 [-1, 16] 0Linear-872 [-1, 32] 
544Sigmoid-873 [-1, 32] 0 SqueezeExcitationLayer-874 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-875 [-1, 32, 1, 1] 0Linear-876 [-1, 16] 528ReLU-877 [-1, 16] 0Linear-878 [-1, 32] 544Sigmoid-879 [-1, 32] 0 SqueezeExcitationLayer-880 [-1, 32, 7, 7] 0BatchNorm2d-881 [-1, 704, 7, 7] 1,408ReLU-882 [-1, 704, 7, 7] 0Conv2d-883 [-1, 128, 7, 7] 90,112BatchNorm2d-884 [-1, 128, 7, 7] 256ReLU-885 [-1, 128, 7, 7] 0Conv2d-886 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-887 [-1, 32, 1, 1] 0Linear-888 [-1, 16] 528ReLU-889 [-1, 16] 0Linear-890 [-1, 32] 544Sigmoid-891 [-1, 32] 0 SqueezeExcitationLayer-892 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-893 [-1, 32, 1, 1] 0Linear-894 [-1, 16] 528ReLU-895 [-1, 16] 0Linear-896 [-1, 32] 544Sigmoid-897 [-1, 32] 0 SqueezeExcitationLayer-898 [-1, 32, 7, 7] 0BatchNorm2d-899 [-1, 736, 7, 7] 1,472ReLU-900 [-1, 736, 7, 7] 0Conv2d-901 [-1, 128, 7, 7] 94,208BatchNorm2d-902 [-1, 128, 7, 7] 256ReLU-903 [-1, 128, 7, 7] 0Conv2d-904 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-905 [-1, 32, 1, 1] 0Linear-906 [-1, 16] 528ReLU-907 [-1, 16] 0Linear-908 [-1, 32] 544Sigmoid-909 [-1, 32] 0 SqueezeExcitationLayer-910 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-911 [-1, 32, 1, 1] 0Linear-912 [-1, 16] 528ReLU-913 [-1, 16] 0Linear-914 [-1, 32] 544Sigmoid-915 [-1, 32] 0 SqueezeExcitationLayer-916 [-1, 32, 7, 7] 0BatchNorm2d-917 [-1, 768, 7, 7] 1,536ReLU-918 [-1, 768, 7, 7] 0Conv2d-919 [-1, 128, 7, 7] 98,304BatchNorm2d-920 [-1, 128, 7, 7] 256ReLU-921 [-1, 128, 7, 7] 0Conv2d-922 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-923 [-1, 32, 1, 1] 0Linear-924 [-1, 16] 528ReLU-925 [-1, 16] 0Linear-926 [-1, 32] 544Sigmoid-927 [-1, 32] 0 SqueezeExcitationLayer-928 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-929 [-1, 32, 1, 1] 0Linear-930 [-1, 16] 528ReLU-931 [-1, 16] 0Linear-932 [-1, 32] 544Sigmoid-933 [-1, 32] 0 SqueezeExcitationLayer-934 [-1, 32, 7, 7] 0BatchNorm2d-935 [-1, 800, 7, 7] 1,600ReLU-936 [-1, 800, 7, 7] 0Conv2d-937 [-1, 128, 7, 7] 102,400BatchNorm2d-938 [-1, 128, 7, 7] 256ReLU-939 [-1, 128, 7, 7] 0Conv2d-940 [-1, 
32, 7, 7] 36,864 AdaptiveAvgPool2d-941 [-1, 32, 1, 1] 0Linear-942 [-1, 16] 528ReLU-943 [-1, 16] 0Linear-944 [-1, 32] 544Sigmoid-945 [-1, 32] 0 SqueezeExcitationLayer-946 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-947 [-1, 32, 1, 1] 0Linear-948 [-1, 16] 528ReLU-949 [-1, 16] 0Linear-950 [-1, 32] 544Sigmoid-951 [-1, 32] 0 SqueezeExcitationLayer-952 [-1, 32, 7, 7] 0BatchNorm2d-953 [-1, 832, 7, 7] 1,664ReLU-954 [-1, 832, 7, 7] 0Conv2d-955 [-1, 128, 7, 7] 106,496BatchNorm2d-956 [-1, 128, 7, 7] 256ReLU-957 [-1, 128, 7, 7] 0Conv2d-958 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-959 [-1, 32, 1, 1] 0Linear-960 [-1, 16] 528ReLU-961 [-1, 16] 0Linear-962 [-1, 32] 544Sigmoid-963 [-1, 32] 0 SqueezeExcitationLayer-964 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-965 [-1, 32, 1, 1] 0Linear-966 [-1, 16] 528ReLU-967 [-1, 16] 0Linear-968 [-1, 32] 544Sigmoid-969 [-1, 32] 0 SqueezeExcitationLayer-970 [-1, 32, 7, 7] 0BatchNorm2d-971 [-1, 864, 7, 7] 1,728ReLU-972 [-1, 864, 7, 7] 0Conv2d-973 [-1, 128, 7, 7] 110,592BatchNorm2d-974 [-1, 128, 7, 7] 256ReLU-975 [-1, 128, 7, 7] 0Conv2d-976 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-977 [-1, 32, 1, 1] 0Linear-978 [-1, 16] 528ReLU-979 [-1, 16] 0Linear-980 [-1, 32] 544Sigmoid-981 [-1, 32] 0 SqueezeExcitationLayer-982 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-983 [-1, 32, 1, 1] 0Linear-984 [-1, 16] 528ReLU-985 [-1, 16] 0Linear-986 [-1, 32] 544Sigmoid-987 [-1, 32] 0 SqueezeExcitationLayer-988 [-1, 32, 7, 7] 0BatchNorm2d-989 [-1, 896, 7, 7] 1,792ReLU-990 [-1, 896, 7, 7] 0Conv2d-991 [-1, 128, 7, 7] 114,688BatchNorm2d-992 [-1, 128, 7, 7] 256ReLU-993 [-1, 128, 7, 7] 0Conv2d-994 [-1, 32, 7, 7] 36,864 AdaptiveAvgPool2d-995 [-1, 32, 1, 1] 0Linear-996 [-1, 16] 528ReLU-997 [-1, 16] 0Linear-998 [-1, 32] 544Sigmoid-999 [-1, 32] 0 SqueezeExcitationLayer-1000 [-1, 32, 7, 7] 0 AdaptiveAvgPool2d-1001 [-1, 32, 1, 1] 0Linear-1002 [-1, 16] 528ReLU-1003 [-1, 16] 0Linear-1004 [-1, 32] 544Sigmoid-1005 [-1, 32] 0 SqueezeExcitationLayer-1006 [-1, 32, 7, 7] 0BatchNorm2d-1007 [-1, 928, 7, 7] 
1,856
ReLU-1008 [-1, 928, 7, 7] 0
Conv2d-1009 [-1, 128, 7, 7] 118,784
BatchNorm2d-1010 [-1, 128, 7, 7] 256
ReLU-1011 [-1, 128, 7, 7] 0
Conv2d-1012 [-1, 32, 7, 7] 36,864
AdaptiveAvgPool2d-1013 [-1, 32, 1, 1] 0
Linear-1014 [-1, 16] 528
ReLU-1015 [-1, 16] 0
Linear-1016 [-1, 32] 544
Sigmoid-1017 [-1, 32] 0
SqueezeExcitationLayer-1018 [-1, 32, 7, 7] 0
AdaptiveAvgPool2d-1019 [-1, 32, 1, 1] 0
Linear-1020 [-1, 16] 528
ReLU-1021 [-1, 16] 0
Linear-1022 [-1, 32] 544
Sigmoid-1023 [-1, 32] 0
SqueezeExcitationLayer-1024 [-1, 32, 7, 7] 0
BatchNorm2d-1025 [-1, 960, 7, 7] 1,920
ReLU-1026 [-1, 960, 7, 7] 0
Conv2d-1027 [-1, 128, 7, 7] 122,880
BatchNorm2d-1028 [-1, 128, 7, 7] 256
ReLU-1029 [-1, 128, 7, 7] 0
Conv2d-1030 [-1, 32, 7, 7] 36,864
AdaptiveAvgPool2d-1031 [-1, 32, 1, 1] 0
Linear-1032 [-1, 16] 528
ReLU-1033 [-1, 16] 0
Linear-1034 [-1, 32] 544
Sigmoid-1035 [-1, 32] 0
SqueezeExcitationLayer-1036 [-1, 32, 7, 7] 0
AdaptiveAvgPool2d-1037 [-1, 32, 1, 1] 0
Linear-1038 [-1, 16] 528
ReLU-1039 [-1, 16] 0
Linear-1040 [-1, 32] 544
Sigmoid-1041 [-1, 32] 0
SqueezeExcitationLayer-1042 [-1, 32, 7, 7] 0
BatchNorm2d-1043 [-1, 992, 7, 7] 1,984
ReLU-1044 [-1, 992, 7, 7] 0
Conv2d-1045 [-1, 128, 7, 7] 126,976
BatchNorm2d-1046 [-1, 128, 7, 7] 256
ReLU-1047 [-1, 128, 7, 7] 0
Conv2d-1048 [-1, 32, 7, 7] 36,864
AdaptiveAvgPool2d-1049 [-1, 32, 1, 1] 0
Linear-1050 [-1, 16] 528
ReLU-1051 [-1, 16] 0
Linear-1052 [-1, 32] 544
Sigmoid-1053 [-1, 32] 0
SqueezeExcitationLayer-1054 [-1, 32, 7, 7] 0
AdaptiveAvgPool2d-1055 [-1, 32, 1, 1] 0
Linear-1056 [-1, 16] 528
ReLU-1057 [-1, 16] 0
Linear-1058 [-1, 32] 544
Sigmoid-1059 [-1, 32] 0
SqueezeExcitationLayer-1060 [-1, 32, 7, 7] 0
BatchNorm2d-1061 [-1, 1024, 7, 7] 2,048
ReLU-1062 [-1, 1024, 7, 7] 0
Linear-1063 [-1, 2] 2,050
Total params: 7,080,258
Trainable params: 7,080,258
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 311.15
Params size (MB): 27.01
Estimated Total Size (MB): 338.73
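The per-layer parameter counts in the summary can be checked by hand. The following is a minimal sketch in plain Python, not part of the original code. The helper names (`linear_params`, `conv_params`, `bn_params`) are made up for this illustration, and it assumes, as the printed counts suggest, that the convolutions carry no bias term while the linear layers do.

```python
def linear_params(in_features, out_features, bias=True):
    """Weights plus optional bias of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def conv_params(in_channels, out_channels, kernel_size, bias=False):
    """Weights plus optional bias of a square 2D convolution."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def bn_params(channels):
    """BatchNorm2d learns one scale and one shift per channel."""
    return 2 * channels


# SE block on 32 channels, reduced to 16 in the middle
se_squeeze = linear_params(32, 16)      # Linear [-1, 16] row
se_excite = linear_params(16, 32)       # Linear [-1, 32] row

# Last dense layer of the final block: 992 input channels
norm = bn_params(992)                   # BatchNorm2d-1043
bottleneck = conv_params(992, 128, 1)   # Conv2d-1045, 1x1 kernel
conv3x3 = conv_params(128, 32, 3)       # Conv2d-1048, 3x3 kernel

# Classifier head: 1024 features to 2 classes
classifier = linear_params(1024, 2)     # Linear-1063

print(se_squeeze, se_excite, norm, bottleneck, conv3x3, classifier)
```

The values agree with the 528, 544, 1,984, 126,976, 36,864 and 2,050 entries in the summary, which is a quick way to confirm that the head was built for 2 classes rather than the default 1000.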
----------------------------------------------------------------

5. Set the hyperparameters: define the loss function, the learning rate, and the optimizer built on that learning rate

# loss_fn = nn.CrossEntropyLoss()  # create the loss function

# learn_rate = 1e-3  # initial learning rate
# def adjust_learning_rate(optimizer, epoch, start_lr):
#     # every two epochs, decay the learning rate to 0.92 of its previous value
#     lr = start_lr * (0.92 ** (epoch // 2))
#     for param_group in optimizer.param_groups:
#         param_group['lr'] = lr

# optimizer = torch.optim.Adam(model.parameters(), lr=learn_rate)

# Example using the official scheduler interface
loss_fn = nn.CrossEntropyLoss()

learn_rate = 1e-4
lambda1 = lambda epoch: (0.92 ** (epoch // 2))

optimizer = torch.optim.Adam(model.parameters(), lr=learn_rate)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda1)  # choose the adjustment method

6. Training function

# Training function
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)  # size of the training set
    num_batches = len(dataloader)   # number of batches

    train_loss, train_acc = 0, 0

    for X, y in dataloader:
        X, y = X.to(device), y.to(device)

        # compute the prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # accumulate accuracy and loss
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()

    train_acc /= size
    train_loss /= num_batches

    return train_acc, train_loss

7. Test function

# Test function
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)

    test_acc, test_loss = 0, 0

    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)

            # compute the loss
            pred = model(X)
            loss = loss_fn(pred, y)

            test_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
            test_loss += loss.item()

    test_acc /= size
    test_loss /= num_batches

    return test_acc, test_loss

8. 
Training

import copy

epochs = 40

train_acc = []
train_loss = []
test_acc = []
test_loss = []

best_acc = 0.0

for epoch in range(epochs):
    # update the learning rate (only when using the custom schedule)
    # adjust_learning_rate(optimizer, epoch, learn_rate)

    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    scheduler.step()  # update the learning rate (when using the official scheduler)

    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    # keep a copy of the best model so far in best_model
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        best_model = copy.deepcopy(model)

    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)

    # read the current learning rate
    lr = optimizer.state_dict()['param_groups'][0]['lr']

    template = ('Epoch:{:2d},Train_acc:{:.1f}%,Train_loss:{:.3f},Test_acc:{:.1f}%,Test_loss:{:.3f},Lr:{:.2E}')
    print(template.format(epoch + 1, epoch_train_acc * 100, epoch_train_loss, epoch_test_acc * 100, epoch_test_loss, lr))

print('Done')

Epoch: 1,Train_acc:65.3%,Train_loss:0.633,Test_acc:68.5%,Test_loss:0.595,Lr:1.00E-04
Epoch: 2,Train_acc:70.1%,Train_loss:0.578,Test_acc:71.3%,Test_loss:0.562,Lr:9.20E-05
Epoch: 3,Train_acc:72.3%,Train_loss:0.542,Test_acc:75.5%,Test_loss:0.510,Lr:9.20E-05
Epoch: 4,Train_acc:74.0%,Train_loss:0.502,Test_acc:80.0%,Test_loss:0.457,Lr:8.46E-05
Epoch: 5,Train_acc:76.6%,Train_loss:0.474,Test_acc:76.7%,Test_loss:0.488,Lr:8.46E-05
Epoch: 6,Train_acc:79.3%,Train_loss:0.434,Test_acc:79.0%,Test_loss:0.440,Lr:7.79E-05
Epoch: 7,Train_acc:80.9%,Train_loss:0.423,Test_acc:83.0%,Test_loss:0.397,Lr:7.79E-05
Epoch: 8,Train_acc:83.3%,Train_loss:0.375,Test_acc:78.1%,Test_loss:0.433,Lr:7.16E-05
Epoch: 9,Train_acc:83.6%,Train_loss:0.360,Test_acc:82.5%,Test_loss:0.374,Lr:7.16E-05
Epoch:10,Train_acc:84.8%,Train_loss:0.333,Test_acc:88.3%,Test_loss:0.320,Lr:6.59E-05
Epoch:11,Train_acc:88.1%,Train_loss:0.294,Test_acc:87.4%,Test_loss:0.337,Lr:6.59E-05
Epoch:12,Train_acc:87.3%,Train_loss:0.293,Test_acc:84.6%,Test_loss:0.364,Lr:6.06E-05
Epoch:13,Train_acc:89.1%,Train_loss:0.257,Test_acc:88.6%,Test_loss:0.269,Lr:6.06E-05
Epoch:14,Train_acc:90.3%,Train_loss:0.238,Test_acc:84.6%,Test_loss:0.356,Lr:5.58E-05
Epoch:15,Train_acc:91.2%,Train_loss:0.210,Test_acc:84.4%,Test_loss:0.328,Lr:5.58E-05
Epoch:16,Train_acc:91.8%,Train_loss:0.202,Test_acc:89.3%,Test_loss:0.279,Lr:5.13E-05
Epoch:17,Train_acc:93.3%,Train_loss:0.165,Test_acc:89.3%,Test_loss:0.277,Lr:5.13E-05
Epoch:18,Train_acc:93.5%,Train_loss:0.168,Test_acc:89.5%,Test_loss:0.324,Lr:4.72E-05
Epoch:19,Train_acc:93.7%,Train_loss:0.173,Test_acc:87.9%,Test_loss:0.293,Lr:4.72E-05
Epoch:20,Train_acc:93.8%,Train_loss:0.156,Test_acc:90.7%,Test_loss:0.249,Lr:4.34E-05
Epoch:21,Train_acc:95.2%,Train_loss:0.122,Test_acc:89.3%,Test_loss:0.266,Lr:4.34E-05
Epoch:22,Train_acc:96.2%,Train_loss:0.123,Test_acc:90.7%,Test_loss:0.270,Lr:4.00E-05
Epoch:23,Train_acc:95.9%,Train_loss:0.124,Test_acc:89.5%,Test_loss:0.290,Lr:4.00E-05
Epoch:24,Train_acc:96.0%,Train_loss:0.118,Test_acc:91.4%,Test_loss:0.296,Lr:3.68E-05
Epoch:25,Train_acc:95.2%,Train_loss:0.131,Test_acc:91.4%,Test_loss:0.248,Lr:3.68E-05
Epoch:26,Train_acc:95.7%,Train_loss:0.113,Test_acc:90.4%,Test_loss:0.306,Lr:3.38E-05
Epoch:27,Train_acc:97.6%,Train_loss:0.077,Test_acc:93.7%,Test_loss:0.226,Lr:3.38E-05
Epoch:28,Train_acc:96.6%,Train_loss:0.089,Test_acc:91.8%,Test_loss:0.286,Lr:3.11E-05
Epoch:29,Train_acc:97.3%,Train_loss:0.084,Test_acc:92.8%,Test_loss:0.243,Lr:3.11E-05
Epoch:30,Train_acc:96.6%,Train_loss:0.093,Test_acc:91.8%,Test_loss:0.227,Lr:2.86E-05
Epoch:31,Train_acc:97.4%,Train_loss:0.075,Test_acc:93.7%,Test_loss:0.236,Lr:2.86E-05
Epoch:32,Train_acc:97.6%,Train_loss:0.073,Test_acc:92.1%,Test_loss:0.246,Lr:2.63E-05
Epoch:33,Train_acc:97.8%,Train_loss:0.066,Test_acc:93.0%,Test_loss:0.223,Lr:2.63E-05
Epoch:34,Train_acc:98.4%,Train_loss:0.053,Test_acc:92.1%,Test_loss:0.265,Lr:2.42E-05
Epoch:35,Train_acc:98.4%,Train_loss:0.056,Test_acc:91.6%,Test_loss:0.250,Lr:2.42E-05
Epoch:36,Train_acc:98.2%,Train_loss:0.062,Test_acc:92.5%,Test_loss:0.301,Lr:2.23E-05
Epoch:37,Train_acc:97.6%,Train_loss:0.068,Test_acc:93.5%,Test_loss:0.236,Lr:2.23E-05
Epoch:38,Train_acc:98.1%,Train_loss:0.049,Test_acc:91.8%,Test_loss:0.244,Lr:2.05E-05
Epoch:39,Train_acc:98.9%,Train_loss:0.043,Test_acc:94.2%,Test_loss:0.216,Lr:2.05E-05
Epoch:40,Train_acc:98.7%,Train_loss:0.045,Test_acc:92.8%,Test_loss:0.245,Lr:1.89E-05
Done

9. Visualizing the results

epochs_range = range(epochs)

plt.figure(figsize=(12, 3))

plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')  # label corrected: this curve is the training loss
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='lower right')
plt.title('Training and Validation Loss')
plt.show()

10. Saving the model

# Save the custom model
# Save only the state dict
# Note: this stores the weights of the final epoch; passing best_model.state_dict()
# instead would store the best checkpoint kept during training.
torch.save(model.state_dict(), './模型参数/J5_densenet121SE_model_state_dict.pth')

# Define a model to load the parameters into
best_model = DenseNet(num_init_features=64,  # init_channel=64
                      growth_rate=32,
                      block_config=(6, 12, 24, 16),
                      num_classes=len(classNames),  # set to the number of classes in your task
                      se_filter_sq=se_filter_sq     # pass the SE module parameter
                      ).to(device)

best_model.load_state_dict(torch.load('./模型参数/J5_densenet121SE_model_state_dict.pth'))  # load the state dict into the model

<All keys matched successfully>

11. 
Predicting with the trained model

# Predict a single image at a given path
from PIL import Image
import torchvision.transforms as transforms

classes = list(total_data.class_to_idx)

def predict_one_image(image_path, model, transform, classes):
    test_img = Image.open(image_path).convert('RGB')
    # plt.imshow(test_img)  # show the image to be predicted

    test_img = transform(test_img)
    img = test_img.to(device).unsqueeze(0)

    model.eval()
    output = model(img)
    print(output)  # inspect the raw model output

    _, pred = torch.max(output, 1)
    pred_class = classes[pred]
    print(f'Predicted class: {pred_class}')

# Predict one image from the training set
predict_one_image(image_path='./data/mpox_recognize/Monkeypox/M01_01_04.jpg',
                  model=model,
                  transform=test_transforms,
                  classes=classes)

tensor([[ 2.6228, -3.6656]], device='cuda:0', grad_fn=<AddmmBackward0>)
Predicted class: Monkeypox

classes

['Monkeypox', 'Others']
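The tensor printed by the prediction call holds two raw logits, not probabilities. As a rough illustration in plain Python (not from the original post; `softmax` is a helper written here only for this example), the logits can be converted into class probabilities as follows:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the printed model output
logits = [2.6228, -3.6656]
classes = ['Monkeypox', 'Others']

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)

for name, p in zip(classes, probs):
    print(f'{name}: {p:.4f}')
print('Predicted class:', classes[best])
```

Inside the PyTorch code the same conversion would be `torch.softmax(output, dim=1)`. The `grad_fn` in the printed tensor also shows that gradients were tracked during prediction; wrapping the forward pass in `with torch.no_grad():`, as the test function does, would avoid that overhead.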

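The `Lr` column in the training log follows directly from the `LambdaLR` setup in section 5: after k calls to `scheduler.step()`, the learning rate equals the base rate 1e-4 multiplied by 0.92 ** (k // 2). A small sketch in plain Python (the function name `lr_after_steps` is made up for this example) reproduces the logged values without running any training:

```python
def lr_after_steps(base_lr, steps, factor=0.92, every=2):
    """Learning rate after `steps` scheduler steps with a step-wise decay."""
    return base_lr * factor ** (steps // every)


# The log prints the rate after scheduler.step(), so epoch k uses k steps
for k in (1, 2, 4, 40):
    print(f'Epoch:{k:2d}, Lr:{lr_after_steps(1e-4, k):.2E}')
```

Because of the integer division, the rate only changes on every second epoch, which is why consecutive pairs of log lines share the same `Lr` value.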